CN111507181B - Correction method and device for bill image and computer equipment - Google Patents

Correction method and device for bill image and computer equipment

Info

Publication number
CN111507181B
CN111507181B CN202010164109.4A CN202010164109A CN111507181B CN 111507181 B CN111507181 B CN 111507181B CN 202010164109 A CN202010164109 A CN 202010164109A CN 111507181 B CN111507181 B CN 111507181B
Authority
CN
China
Prior art keywords
matrix
distortion
semantic segmentation
bill picture
bill
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010164109.4A
Other languages
Chinese (zh)
Other versions
CN111507181A (en
Inventor
周军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202010164109.4A priority Critical patent/CN111507181B/en
Publication of CN111507181A publication Critical patent/CN111507181A/en
Application granted granted Critical
Publication of CN111507181B publication Critical patent/CN111507181B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/80 Geometric correction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Character Input (AREA)
  • Image Processing (AREA)

Abstract

The application discloses a method, a device, and computer equipment for correcting a bill image, and relates to the field of image processing. It addresses the problem that bill deformation is currently corrected with low accuracy, which in turn degrades subsequent bill text detection and recognition. The method comprises the following steps: acquiring a sample bill picture of the same bill type as the bill picture to be corrected; performing data processing on the sample bill picture to obtain a corresponding distortion matrix; creating a semantic segmentation model based on a semantic segmentation algorithm, and training the semantic segmentation model with the processed sample bill picture and the corresponding distortion matrix until the model meets a preset standard; inputting the bill picture to be corrected into the semantic segmentation model that meets the preset standard to obtain a correction matrix; and correcting the bill picture to be corrected using the correction matrix. The application is suitable for automatic correction of bill pictures.

Description

Correction method and device for bill image and computer equipment
Technical Field
The present invention relates to the field of image processing, and in particular, to a method and apparatus for correcting a bill image, and a computer device.
Background
In recent years, automatic recognition and micro-storage of document images have become a research hotspot. In industries such as banking, finance and taxation, and securities, electronic imaging systems for financial documents have emerged, and these systems usually take scanned document images as input. During scanning, improper placement, folded paper, and other factors can leave the scanned image tilted or folded to some degree, which makes the subsequent layout analysis much harder. Correcting the bill image during image preprocessing is therefore an important step.
In the traditional claims industry, information on medical bills is usually extracted by manual entry. In recent years, with the rise of machine learning and deep learning, OCR (Optical Character Recognition) technology has been increasingly applied to extracting information from medical bills. The main steps of medical bill OCR are: image preprocessing -> text detection -> text recognition -> field partitioning -> field post-processing. The quality of the preprocessing strongly affects the later detection and recognition. In the preprocessing stage, most existing methods apply an affine or perspective transformation based on four manually chosen points on the invoice: top left, bottom left, top right, and bottom right.
Affine and perspective transformations treat the distortion and folding of a picture as a global deformation. In many scenes, however, the distortion and folding are only local deformations, so correcting the picture with an affine or perspective transformation yields low correction accuracy and reduces the accuracy of bill text detection and recognition.
Disclosure of Invention
In view of this, the application provides a method, a device, and computer equipment for correcting a bill image, which address the problem that bill deformation is currently corrected with low accuracy, which in turn degrades subsequent bill text detection and recognition.
According to one aspect of the present application, there is provided a method of correcting a bill image, the method comprising:
acquiring a sample bill picture with the same bill type as the bill picture to be corrected;
performing data processing on the sample bill picture to obtain a corresponding distortion matrix;
creating a semantic segmentation model based on a semantic segmentation algorithm, and training the semantic segmentation model by using the processed sample bill picture and the corresponding distortion matrix so that the semantic segmentation model accords with a preset standard;
inputting the bill picture to be corrected into a semantic segmentation model conforming to the preset standard, and obtaining a correction matrix;
and correcting the bill picture to be corrected by using the correction matrix.
According to another aspect of the present application, there is provided an apparatus for correcting a bill image, the apparatus comprising:
the acquisition module is used for acquiring sample bill pictures with the same bill type as the bill pictures to be corrected;
the processing module is used for carrying out data processing on the sample bill pictures to obtain corresponding distortion matrixes;
the training module is used for creating a semantic segmentation model based on a semantic segmentation algorithm, and training the semantic segmentation model by using the processed sample bill picture and the corresponding distortion matrix so as to enable the semantic segmentation model to accord with a preset standard;
the input module is used for inputting the bill picture to be corrected into a semantic segmentation model conforming to the preset standard to obtain a correction matrix;
and the correction module is used for correcting the bill picture to be corrected by using the correction matrix.
According to yet another aspect of the present application, there is provided a non-volatile readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of correcting a bill image described above.
According to still another aspect of the present application, there is provided a computer device including a non-volatile readable storage medium, a processor, and a computer program stored on the non-volatile readable storage medium and executable on the processor, the processor implementing the method of correcting a bill image when executing the program.
With the above technical solution, and in contrast to current bill correction based on affine or perspective transformation, the correction method, device, and computer equipment for a bill image provided by the application simulate in advance the various distortion and folding effects that bill images exhibit in real scenes. Sample bill pictures of the same bill type as the bill picture to be corrected are screened out, and a semantic segmentation model is built and trained with the sample bill pictures and their corresponding distortion matrices. The successfully trained model then outputs the distortion matrix of the bill picture to be corrected. Based on that matrix, the remap function of OpenCV determines the correction coordinates of each pixel point in the bill picture to be corrected, and the picture is corrected and reconstructed by moving each pixel point to its correction coordinates. The application can automatically correct the image at the pixel level according to the characteristics of the input bill, applies to correction scenes with various local and non-local deformations, and thereby helps ensure correction accuracy and improve correction efficiency.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the present application. In the drawings:
FIG. 1 is a schematic flow chart of a method for correcting a bill image according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of another method for correcting a bill image according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of sample bill picture data processing according to an embodiment of the present application;
FIG. 4 is a schematic flow chart of the correction processing of a bill picture to be corrected according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a bill image correction device according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of another bill image correction device according to an embodiment of the present application.
Detailed Description
The present application will be described in detail hereinafter with reference to the accompanying drawings in conjunction with embodiments. It should be noted that, without conflict, the embodiments and features of the embodiments in the present application may be combined with each other.
This embodiment addresses the problem that bill deformation is currently corrected with low accuracy, which in turn degrades subsequent bill text detection and recognition. An embodiment of the application provides a method for correcting a bill image. As shown in FIG. 1, the method comprises the following steps:
101. Acquiring a sample bill picture with the same bill type as the bill picture to be corrected.
In a specific application scenario, to ensure the training accuracy of the semantic segmentation model, the sample bill pictures can be selected from bill pictures of the bill type to be corrected. Each selected sample should have no offset pixel points and a clear, complete page.
102. Performing data processing on the sample bill picture to obtain a corresponding distortion matrix.
In a specific application scenario, to train the model fully and ensure training accuracy, a large number of bill pictures with different forms of offset are needed as training samples. Specifically, these samples can be generated in advance through data processing: distortion and deformation effects are added to simulate the various deformations of bill images in real scenes, and a distortion matrix is obtained that consists of the original pixel point coordinates of the sample bill picture and the corresponding deformed pixel point coordinates.
103. Creating a semantic segmentation model based on a semantic segmentation algorithm, and training the semantic segmentation model by using the processed sample bill picture and the corresponding distortion matrix so as to enable the semantic segmentation model to accord with a preset standard.
The preset standard is that the error between the result predicted by the semantic segmentation model and the real result is smaller than a preset threshold. The threshold value can be set according to the actual application scenario; the smaller the threshold, the higher the precision of the trained semantic segmentation model.
104. Inputting the bill picture to be corrected into a semantic segmentation model which accords with a preset standard, and obtaining a correction matrix.
The picture to be corrected is a bill picture with certain deformation and needing image correction.
105. Correcting the bill picture to be corrected by using the correction matrix.
The correction matrix contains the current position coordinates of each pixel point in the bill picture to be corrected and the corresponding correction coordinates, where the correction coordinates are the real coordinate position of that pixel point. Correcting the bill picture with the correction matrix means replacing the current position coordinates of each pixel point with the correction coordinates in the matrix, so that each offset pixel point is restored to its real position.
With the correction method described above, the various distortion and folding effects that bill images exhibit in real scenes can be simulated in advance. Sample bill pictures of the same bill type as the bill picture to be corrected are screened out, and a semantic segmentation model is built and trained with the sample bill pictures and their corresponding distortion matrices. The successfully trained model then outputs the distortion matrix of the bill picture to be corrected. Based on that matrix, the remap function of OpenCV determines the correction coordinates of each pixel point in the bill picture to be corrected, and the picture is corrected and reconstructed by moving each pixel point to its correction coordinates. The application can automatically correct the image at the pixel level according to the characteristics of the input bill, applies to correction scenes with various local and non-local deformations, and thereby helps ensure correction accuracy and improve correction efficiency.
Further, as a refinement and extension of the specific implementation of the foregoing embodiment, in order to fully describe the specific implementation process in this embodiment, another method for correcting a bill image is provided, as shown in fig. 2, where the method includes:
201. Acquiring a sample bill picture with the same bill type as the bill picture to be corrected.
For example, if the bill type of the bill picture to be corrected is a medical clinic charging bill, the sample bill pictures should preferably also be medical clinic charging bills, to ensure the training accuracy of the semantic segmentation model.
202. Randomly selecting a plurality of distortion starting points from the sample bill picture.
In a specific application scenario, to create a distorted bill image sample, the pixel point at each distortion starting point is moved to a corresponding position according to a preset distortion effect. Several different distortion starting points can be selected for the same sample bill picture at the same time.
203. Performing distortion processing on the sample bill picture at each distortion starting point based on a distortion formula, to obtain the first distortion matrices of the distortion starting points in different distortion states.
The corresponding distortion formula may be: $w_1 = 1 - d^{\alpha}$, where $w_1$ is the preset distortion effect, $\alpha$ is a constant equal to 2, and $d$ is the Euclidean distance between the distortion starting point and its offset point after the distortion is applied. The Euclidean distance between the offset point and the distortion starting point is calculated as:
$$d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$
where $(x_1, y_1)$ is the offset point and $(x_2, y_2)$ is the distortion starting point.
For this embodiment, in a specific application scenario, to perform distortion processing on the sample bill picture and obtain the first distortion matrix, step 203 may specifically include: calculating, according to the distortion formula, the first offset point corresponding to a distortion starting point under the preset distortion effect; moving the pixel point at the distortion starting point to the first offset point; and constructing a first distortion matrix that contains the position coordinates of the distortion starting point and of the first offset point.
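The two quantities used in step 203 can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, and dividing the distance by a `scale` value so that `d` falls in [0, 1] is an assumption made here to keep the weight between 0 and 1 (the text does not state how `d` is normalised).

```python
import math

ALPHA = 2  # the constant value given for alpha in the text


def euclidean_distance(p1, p2):
    """Distance between an offset point (x1, y1) and a starting point (x2, y2)."""
    (x1, y1), (x2, y2) = p1, p2
    return math.hypot(x1 - x2, y1 - y2)


def normalised_distance(p1, p2, scale):
    """ASSUMPTION: divide by a scale (e.g. the image diagonal) so d lies in [0, 1]."""
    return euclidean_distance(p1, p2) / scale


def distortion_weight(d, alpha=ALPHA):
    """Distortion effect w1 = 1 - d**alpha."""
    return 1.0 - d ** alpha


# Usage: an offset point 5 pixels from the starting point, scale of 10 pixels.
d = normalised_distance((3, 4), (0, 0), scale=10.0)
w1 = distortion_weight(d)
```

With these inputs the weight falls off quadratically as the offset point moves away from the starting point, reaching 0 when the normalised distance is 1.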
204. Filling the missing pixel points of the distorted sample bill picture by linear interpolation, to obtain a first sample bill picture that conforms to a preset size.
In a specific application scenario, after the pixels are moved, a number of black pixel points appear in the image. To keep the image complete, these missing pixel points can be filled by linear interpolation. Specifically, for each pixel point whose pixel value is 0 (0 represents black), the average of the pixel values of the 9 points around it is taken as its pixel value. After all missing pixel points are filled, the bill picture is preprocessed, mainly by scaling and normalization, into the specified input size, which yields the first sample bill picture.
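The filling rule can be sketched with NumPy as follows. The sketch assumes a single-channel (grayscale) image and reads "9 points around the point" as the 3×3 window centred on the missing pixel, including the pixel itself; both are assumptions, and the function name is hypothetical.

```python
import numpy as np


def fill_missing_pixels(image):
    """Replace zero-valued pixels with the mean of their 3x3 neighbourhood.

    ASSUMPTION: `image` is a 2-D grayscale array and the 9 points are the
    3x3 window centred on the pixel (the missing pixel contributes a 0).
    """
    img = image.astype(np.float64)
    h, w = img.shape
    # Repeat border pixels so that every pixel has a full 3x3 window.
    padded = np.pad(img, 1, mode="edge")
    window_sum = np.zeros_like(img)
    for dy in range(3):
        for dx in range(3):
            window_sum += padded[dy:dy + h, dx:dx + w]
    mean9 = window_sum / 9.0
    filled = img.copy()
    missing = img == 0  # 0 represents black, i.e. a missing pixel
    filled[missing] = mean9[missing]
    return filled


# Usage: a 3x3 patch with one missing pixel in the centre.
patch = np.full((3, 3), 90, dtype=np.uint8)
patch[1, 1] = 0
result = fill_missing_pixels(patch)
```

Because the window mean is computed from the original image, the result does not depend on the order in which missing pixels are visited.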
205. Randomly selecting a plurality of folding deformation starting points from the sample bill picture.
In a specific application scenario, to create a folded bill image sample, the pixel point at each folding deformation starting point is moved to a corresponding position according to a preset folding effect. Several different folding deformation starting points can be selected for the same sample bill picture at the same time.
206. Performing folding processing on the sample bill picture at each folding deformation starting point based on a folding formula, to obtain the second distortion matrices of the folding deformation starting points in different folding states.
The corresponding folding formula may be: $w_2 = \alpha / (d^2 + \alpha)$, where $w_2$ is the preset folding effect, $\alpha$ is a constant equal to 2, and $d$ is the Euclidean distance between the folding deformation starting point and its offset point after the folding deformation is applied. The Euclidean distance between the offset point and the folding deformation starting point is calculated as:
$$d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$
where $(x_1, y_1)$ is the offset point and $(x_2, y_2)$ is the folding deformation starting point.
For this embodiment, in a specific application scenario, to perform folding processing on the sample bill picture and obtain the second distortion matrix, step 206 may specifically include: calculating, according to the folding formula, the second offset point corresponding to a folding deformation starting point under the preset folding effect; moving the pixel point at the folding deformation starting point to the second offset point; and constructing a second distortion matrix that contains the position coordinates of the folding deformation starting point and of the second offset point.
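One way to turn the folding formula into a per-pixel distortion matrix is sketched below. It reads the formula as w2 = α / (d² + α) and makes several assumptions that the text does not spell out: each pixel is displaced by `w2` times a chosen displacement vector, `d` is measured from the pixel to the starting point, and `d` is normalised by the image diagonal. All names are hypothetical.

```python
import numpy as np


def fold_weight(d, alpha=2.0):
    """Folding effect w2 = alpha / (d**2 + alpha)."""
    return alpha / (d ** 2 + alpha)


def build_distortion_matrix(height, width, start, displacement, alpha=2.0):
    """Return an array of shape (height, width, 2) with deformed (x, y) coordinates.

    ASSUMPTIONS: `start` is the (x, y) folding deformation starting point,
    `displacement` is the (dx, dy) shift applied at that point, and every
    other pixel is shifted by w2 * displacement.
    """
    ys, xs = np.mgrid[0:height, 0:width].astype(np.float64)
    sx, sy = start
    diagonal = np.hypot(height, width)
    d = np.hypot(xs - sx, ys - sy) / diagonal  # normalised distance
    w2 = fold_weight(d, alpha)
    dx, dy = displacement
    return np.stack([xs + w2 * dx, ys + w2 * dy], axis=-1)


# Usage: a 4x5 picture folded around the point (x=2, y=1).
matrix = build_distortion_matrix(4, 5, start=(2, 1), displacement=(1.0, 0.5))
```

The starting point itself receives the full displacement (its weight is 1), and the shift decays smoothly for pixels farther away.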
207. Filling the missing pixel points of the folded sample bill picture by linear interpolation, to obtain a second sample bill picture that conforms to the preset size.
In a specific application scenario, after the pixels are moved, a number of black pixel points appear in the image. To keep the image complete, these missing pixel points can be filled by linear interpolation. Specifically, for each pixel point whose pixel value is 0 (0 represents black), the average of the pixel values of the 9 points around it is taken as its pixel value. After all missing pixel points are filled, the bill picture is preprocessed, mainly by scaling and normalization, into the specified input size, which yields the second sample bill picture.
Correspondingly, in a specific scenario, the flow of data processing on a sample bill picture is shown in FIG. 3. After the sample bill picture is obtained, it is subjected to distortion and folding processing to produce a deformed bill picture. The missing pixel points of the deformed bill picture are then filled by linear interpolation, and the picture is processed into the specified input size.
208. Configuring labels for the first sample bill picture and the second sample bill picture, where the labels correspond to the first distortion matrix and the second distortion matrix.
In a specific application scenario, a training set and a validation set can be created from the first and second sample bill pictures. The label of each bill picture has size w × h × 2, where w and h are the width and height of the picture, and 2 stands for the x and y coordinates, on the input image, of each pixel point of the distorted image.
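A small sketch of the label layout and the training/validation split follows. The split fraction, the random seed, and the function names are assumptions for illustration. Note that NumPy stores images row-first, so a label holding two coordinates per pixel has array shape (h, w, 2).

```python
import random

import numpy as np


def make_label(distortion_matrix, width, height):
    """Validate and return a label holding an (x, y) pair for every pixel."""
    label = np.asarray(distortion_matrix, dtype=np.float32)
    if label.shape != (height, width, 2):
        raise ValueError(f"expected shape {(height, width, 2)}, got {label.shape}")
    return label


def split_samples(samples, val_fraction=0.2, seed=0):
    """Shuffle (picture, label) pairs and split them into training and validation sets.

    ASSUMPTION: the 80/20 split is illustrative; the text gives no ratio.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


# Usage: ten dummy samples identified by index.
label = make_label(np.zeros((4, 6, 2)), width=6, height=4)
train_set, val_set = split_samples(range(10))
```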
209. Inputting the first sample bill picture and the second sample bill picture after label configuration into a semantic segmentation model, and respectively obtaining a corresponding first prediction distortion matrix and a corresponding second prediction distortion matrix.
For this embodiment, in a specific application scenario, the semantic segmentation model can be trained with a first or second sample bill picture from the training set as input and the first or second distortion matrix in the corresponding label as the target output. After training for a certain period, a first or second sample bill picture from the validation set is used as input to obtain the first and second predicted distortion matrices output by the semantic segmentation model. The training state of the model is then verified by comparing the first and second predicted distortion matrices with the first and second distortion matrices stored in the labels of the input bill pictures.
210. A first loss function between the first predicted warp matrix and the first warp matrix and a second loss function between the second predicted warp matrix and the second warp matrix are calculated.
For the embodiment, in a specific application scenario, the predicted distortion matrix and the distortion matrix corresponding to the configuration label may be compared through a loss function, and a loss value is obtained, so as to determine an error between the network output result and the pre-labeled data. The calculation formula is as follows:
Figure BDA0002406799030000081
Figure BDA0002406799030000082
where $y_{i,j}$ and the symbol shown in Figure BDA0002406799030000083 are, respectively, the pixel value predicted by the semantic segmentation model at coordinates (i, j) and the pixel value stored in the configured label.
211. If the first loss function and the second loss function are both smaller than the preset threshold, determining that the semantic segmentation model meets the preset standard.
The preset threshold can be set according to the actual application scenario; the smaller the threshold, the higher the precision of the trained semantic segmentation model.
212. If the first loss function or the second loss function is greater than or equal to the preset threshold, correcting the output of the semantic segmentation model under training by using the first distortion matrix and the second distortion matrix, so that the semantic segmentation model meets the preset standard.
In a specific application scenario, if a loss value greater than or equal to the preset threshold exists among the first and second loss functions, or the proportion of loss values greater than or equal to the preset threshold exceeds a preset proportion, the semantic segmentation model is judged not to meet the preset standard. The output of the model under training is then corrected with the first and second distortion matrices until it reaches the preset standard.
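The threshold check in steps 211 and 212 can be sketched as below. The loss formulas themselves are only available as images in this text, so a mean absolute error is used here purely as a stand-in assumption; the function names and the `max_fail_ratio` parameter are likewise hypothetical.

```python
import numpy as np


def mean_abs_error(predicted, label):
    """Stand-in loss (ASSUMPTION): mean absolute difference between two matrices."""
    predicted = np.asarray(predicted, dtype=np.float64)
    label = np.asarray(label, dtype=np.float64)
    return float(np.mean(np.abs(predicted - label)))


def meets_preset_standard(losses, threshold, max_fail_ratio=0.0):
    """Return True when the share of losses >= threshold does not exceed max_fail_ratio.

    With max_fail_ratio=0.0 every loss must be below the threshold.
    """
    losses = list(losses)
    failing = sum(1 for loss in losses if loss >= threshold)
    return failing / len(losses) <= max_fail_ratio


# Usage: one loss for the distorted sample, one for the folded sample.
first_loss = mean_abs_error([[1.0, 2.0]], [[1.0, 4.0]])
second_loss = mean_abs_error([[0.0, 0.0]], [[0.0, 0.2]])
ok = meets_preset_standard([first_loss, second_loss], threshold=0.5)
```

When the check fails, training continues against the labelled distortion matrices until the losses fall below the threshold.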
213. Adjusting the bill picture to be corrected so that it conforms to the preset size.
In a specific application scenario, before the bill picture to be corrected is input into the semantic segmentation model, it must be preprocessed, mainly by proportional scaling and normalization, into the specified input size, so that it has the same format as the sample bill pictures.
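A possible preprocessing routine is sketched below using NumPy only. Nearest-neighbour resampling, zero padding up to the target size, and division by 255 for normalization are all assumptions chosen for brevity; the text only names proportional scaling and normalization.

```python
import numpy as np


def preprocess(image, target_h, target_w):
    """Scale an image proportionally into a target_h x target_w canvas, values in [0, 1].

    ASSUMPTIONS: nearest-neighbour resampling, zero padding on the bottom and
    right, and 8-bit input normalised by 255.
    """
    h, w = image.shape[:2]
    scale = min(target_h / h, target_w / w)  # one factor for both axes
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    # Map each output row/column back to the nearest source row/column.
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    resized = image[rows][:, cols]
    canvas = np.zeros((target_h, target_w) + image.shape[2:], dtype=np.float32)
    canvas[:new_h, :new_w] = resized.astype(np.float32) / 255.0
    return canvas


# Usage: a 4x8 white picture fitted into a 4x4 model input.
picture = np.full((4, 8), 255, dtype=np.uint8)
model_input = preprocess(picture, 4, 4)
```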
214. Inputting the bill picture to be corrected into a semantic segmentation model which accords with a preset standard, and obtaining a correction matrix.
In a specific application scenario, since the semantic segmentation model is trained, after the bill picture to be corrected is input into the semantic segmentation model, the correction matrix output by the semantic segmentation model can be obtained, and the correction matrix in the step of the embodiment is equivalent to the distortion matrix in the training step.
215. Extracting, from the correction matrix, the current position coordinates and the correction position coordinates of all pixel points contained in the bill picture to be corrected.
For this embodiment, in a specific application scenario, the correction matrix may contain the current position coordinates and the correction position coordinates of each pixel point in the bill picture to be corrected. If the current position coordinates of a pixel point match its correction position coordinates, the pixel point is judged not to be shifted and needs no correction. If the current position coordinates of a pixel point do not match its correction position coordinates, the pixel point is judged to be shifted and needs correction.
216. Determining the pixel points whose current position coordinates do not match the correction position coordinates as the pixel points to be corrected.
For example, suppose the current position coordinates of pixel point a are (x1, y1) and its correction position coordinates are (x2, y2). If (x1, y1) and (x2, y2) do not coincide, pixel point a is determined to be a pixel point to be corrected. All pixel points to be corrected in the bill picture can be extracted with this coordinate matching method.
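The coordinate matching can be expressed as a boolean mask over the picture. The sketch assumes that the correction matrix stores the correction (x, y) for each pixel and that the current coordinates are implied by the array index; the tolerance parameter and function name are also assumptions.

```python
import numpy as np


def pixels_to_correct(correction_matrix, tol=0.5):
    """Return a (h, w) mask that is True where current and correction coordinates differ.

    ASSUMPTION: correction_matrix[y, x] holds the correction (x, y) of the
    pixel currently located at column x, row y.
    """
    h, w = correction_matrix.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    current = np.stack([xs, ys], axis=-1).astype(np.float64)
    diff = np.abs(correction_matrix - current)
    return np.any(diff > tol, axis=-1)


# Usage: an identity matrix in which one pixel is marked as shifted.
ys, xs = np.mgrid[0:3, 0:4]
matrix = np.stack([xs, ys], axis=-1).astype(np.float64)
matrix[1, 2] = [3.0, 1.0]  # pixel at (x=2, y=1) belongs at (x=3, y=1)
mask = pixels_to_correct(matrix)
```

Pixels where the mask is False already sit at their real positions and can be skipped.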
217. Replacing the current position coordinates with the correction position coordinates so that each pixel point to be corrected moves to its correction position, thereby correcting the bill picture to be corrected.
For this embodiment, after all pixel points to be corrected have been determined in step 216, they can be moved in turn to their correction positions to obtain the corrected bill picture.
In a specific application scenario, the flow of correcting a bill picture is shown in FIG. 4. When the bill picture to be corrected is acquired, its size is first adjusted so that it has the same format and size as the sample bill pictures. The processed picture is then input into the trained semantic segmentation model, which outputs a correction matrix of size w × h × 2. Finally, the bill picture is reconstructed according to the correction position coordinates of each pixel point in the correction matrix, which achieves pixel-level correction of the bill picture.
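The reconstruction step can be sketched as a coordinate lookup. The text names the remap function of OpenCV; the NumPy function below only imitates its nearest-neighbour behaviour so that the example has no OpenCV dependency. It assumes the matrix gives, for every output pixel, the (x, y) position to sample in the picture to be corrected, and it fills out-of-range lookups with 0. The function name is hypothetical.

```python
import numpy as np


def remap_nearest(image, coord_map):
    """Build the corrected picture by sampling `image` at the mapped coordinates.

    ASSUMPTION: coord_map[y, x] = (source_x, source_y) for output pixel (x, y);
    coordinates are rounded to the nearest pixel, out-of-range lookups give 0.
    """
    h, w = coord_map.shape[:2]
    src_x = np.rint(coord_map[..., 0]).astype(int)
    src_y = np.rint(coord_map[..., 1]).astype(int)
    inside = (
        (src_x >= 0) & (src_x < image.shape[1])
        & (src_y >= 0) & (src_y < image.shape[0])
    )
    out = np.zeros((h, w) + image.shape[2:], dtype=image.dtype)
    out[inside] = image[src_y[inside], src_x[inside]]
    return out


# Usage: shift a 3x4 picture one pixel to the left.
picture = np.arange(12, dtype=np.uint8).reshape(3, 4)
ys, xs = np.mgrid[0:3, 0:4]
shift_map = np.stack([xs + 1, ys], axis=-1).astype(np.float32)
corrected = remap_nearest(picture, shift_map)
```

A production pipeline would more likely call OpenCV's remap with an interpolation mode, which also handles sub-pixel coordinates.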
According to the correction method of the bill image, after standard sample bill pictures are screened, data processing is carried out on the sample bill pictures based on a distortion formula and a folding formula, a first distortion matrix and a second distortion matrix are obtained, after linear interpolation operation is carried out on the sample bill pictures subjected to the distortion processing, the first sample bill pictures and the second sample bill pictures are obtained, then the distortion matrix and the corresponding first sample bill pictures or the second sample bill pictures are utilized to train and correct a semantic segmentation model, when the semantic segmentation model is judged to be in accordance with a preset standard, the preprocessed bill pictures to be corrected can be input into the semantic segmentation model, the correction matrix is obtained, and then pixels to be corrected in the bill pictures to be corrected are adjusted based on correction position coordinates in the correction matrix, so that correction processing of the bill pictures to be corrected is further achieved. The method and the device can automatically correct the image at the pixel level according to the characteristics of the input bill, can be applied to correction scenes of various local and non-local deformations, can further ensure correction accuracy, and improve correction efficiency. In addition, in the correction process, besides inputting the bill pictures to be corrected, no human participation is needed, so that the dependence on manual operation can be reduced, the automation efficiency is improved, and the error rate can be reduced.
Further, as an embodiment of the method shown in fig. 1 and fig. 2, an embodiment of the present application provides a correction device for a bill image, as shown in fig. 5, where the device includes: an acquisition module 31, a processing module 32, a training module 33, an input module 34, and a correction module 35.
The acquisition module 31 is configured to acquire a sample bill picture of the same bill type as the bill picture to be corrected;
the processing module 32 is configured to perform data processing on the sample bill picture, and obtain a corresponding distortion matrix;
the training module 33 is configured to create a semantic segmentation model based on a semantic segmentation algorithm, and train the semantic segmentation model by using the processed sample bill picture and the corresponding distortion matrix, so that the semantic segmentation model meets a preset standard;
the input module 34 is used for inputting the bill picture to be corrected into a semantic segmentation model conforming to a preset standard, and obtaining a correction matrix;
and the correction module 35 is used for correcting the bill picture to be corrected by using the correction matrix.
In a specific application scenario, in order to implement data processing on the sample bill picture and obtain a corresponding distortion matrix, the processing module 32 is specifically configured to randomly select a plurality of distortion starting points from the sample bill picture; performing twisting processing on the sample bill pictures at the twisting starting points based on a twisting formula so as to obtain first twisting matrixes of the twisting starting points corresponding to different twisting states; performing linear interpolation filling of missing pixel points on the sample bill picture subjected to the distortion treatment to obtain a first sample bill picture conforming to a preset size; randomly selecting a plurality of folding deformation starting points from the sample bill pictures; folding the sample bill pictures at the folding deformation starting points based on a folding formula so as to obtain second distortion matrixes of the folding deformation starting points corresponding to different folding states; and carrying out linear interpolation filling of missing pixel points on the folded sample bill picture to obtain a second sample bill picture which accords with the preset size.
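The linear interpolation filling of missing pixel points can be illustrated as follows. This is a sketch only: it assumes a grayscale picture, a boolean mask marking which pixels received a value after the distortion, and row-wise 1-D interpolation. The source does not state the interpolation direction or the data layout, and `fill_missing_rows` is a hypothetical helper name.

```python
import numpy as np

def fill_missing_rows(image: np.ndarray, filled: np.ndarray) -> np.ndarray:
    """Fill unfilled pixels of a grayscale image by linear interpolation along each row.

    image  : H x W array of pixel values.
    filled : H x W boolean array, True where the pixel already has a valid value.
    """
    out = image.astype(float).copy()
    cols = np.arange(image.shape[1])
    for y in range(image.shape[0]):
        known = filled[y]
        # Rows with no known pixel (or no missing pixel) are left unchanged.
        if known.any() and not known.all():
            out[y, ~known] = np.interp(cols[~known], cols[known], out[y, known])
    return out
```

Missing pixels outside the range of known columns take the nearest known value, because `np.interp` holds the end values constant.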
Correspondingly, the warping formula is: $w_1 = 1 - d^{\alpha}$, where $w_1$ is the preset distortion effect, $\alpha$ is a constant equal to 2, and $d$ is the Euclidean distance between the distortion starting point and its offset point after the distortion is performed; in order to perform a twisting process on the sample bill picture based on the twisting formula at the twisting start point, obtain a first twisting matrix corresponding to each twisting start point in different twisting states, the processing module 32 is specifically configured to calculate a first offset point corresponding to the twisting start point under a preset twisting effect according to the twisting formula; moving the pixel point at the distortion start point to a first offset point; a first warping matrix is constructed that includes warp origin position coordinates and first offset point position coordinates.
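The three steps above can be sketched as follows, reading the warping formula as w1 = 1 - d^α with α = 2. Solving for d gives d = (1 - w1)^(1/α), which is the distance of the first offset point from the starting point for a preset effect w1. The direction of the offset (here an angle) and the unit of d are not given in the source and are assumptions, as are all function names.

```python
import numpy as np

ALPHA = 2.0  # constant value stated in the description

def warp_effect(d: float, alpha: float = ALPHA) -> float:
    """Warping formula w1 = 1 - d**alpha."""
    return 1.0 - d ** alpha

def offset_distance(w1: float, alpha: float = ALPHA) -> float:
    """Invert the warping formula: distance d that yields the preset effect w1 (w1 <= 1)."""
    return (1.0 - w1) ** (1.0 / alpha)

def first_offset_point(start, w1: float, angle: float) -> np.ndarray:
    """Offset point at distance d from `start`, along an assumed direction `angle` (radians)."""
    d = offset_distance(w1)
    return np.asarray(start, dtype=float) + d * np.array([np.cos(angle), np.sin(angle)])

def first_warp_matrix(starts, offsets) -> np.ndarray:
    """One row per starting point: (start_x, start_y, offset_x, offset_y)."""
    return np.hstack([np.asarray(starts, dtype=float), np.asarray(offsets, dtype=float)])
```

For example, a preset effect of 0.75 corresponds to d = 0.5, so a starting point at (10, 20) displaced along angle 0 moves to (10.5, 20).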
In a specific application scenario, the folding formula is: $w_2 = \alpha / (d^{2} + \alpha)$, where $w_2$ is the preset folding effect, $\alpha$ is a constant equal to 2, and $d$ is the Euclidean distance between the folding deformation starting point and its offset point after the folding deformation; in order to perform folding processing on the sample bill picture based on the folding formula at the folding deformation starting point, obtain second distortion matrixes corresponding to different folding states of each folding deformation starting point, the processing module 32 is specifically configured to calculate a second offset point corresponding to the folding deformation starting point under a preset folding effect according to the folding formula; moving the pixel point at the folding deformation starting point to a second offset point; a second warping matrix including fold deformation starting point position coordinates and second offset point position coordinates is constructed.
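A corresponding sketch for the folding case follows. It reads the folding formula as w2 = α / (d² + α) with α = 2, which is a reconstruction of the garbled expression in the text and should be treated as an assumption. Solving for d gives d = sqrt(α / w2 - α) for 0 < w2 <= 1. The offset direction and the function names are likewise illustrative.

```python
import numpy as np

ALPHA = 2.0  # constant value stated in the description

def fold_effect(d: float, alpha: float = ALPHA) -> float:
    """Folding formula w2 = alpha / (d**2 + alpha) (assumed reading)."""
    return alpha / (d ** 2 + alpha)

def fold_offset_distance(w2: float, alpha: float = ALPHA) -> float:
    """Invert the folding formula: distance d that yields the preset effect w2 (0 < w2 <= 1)."""
    return float(np.sqrt(alpha / w2 - alpha))

def second_warp_matrix(starts, w2: float, direction) -> np.ndarray:
    """Rows of (start_x, start_y, offset_x, offset_y) for starting points moved along `direction`."""
    starts = np.asarray(starts, dtype=float)
    unit = np.asarray(direction, dtype=float)
    unit = unit / np.linalg.norm(unit)
    offsets = starts + fold_offset_distance(w2) * unit
    return np.hstack([starts, offsets])
```

Under this reading the folding effect decays more slowly with distance than the warping effect: it equals 1 at d = 0 and stays positive for every d, whereas 1 - d^α reaches zero at d = 1.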
Correspondingly, in order to train a semantic segmentation model that meets the preset standard, the training module 33 is specifically configured to: configure labels for the first sample bill picture and the second sample bill picture, where the labels correspond to the first warping matrix and the second warping matrix; input the labeled first sample bill picture and second sample bill picture into the semantic segmentation model to obtain a corresponding first predicted distortion matrix and second predicted distortion matrix, respectively; calculate a first loss function between the first predicted warp matrix and the first warp matrix, and a second loss function between the second predicted warp matrix and the second warp matrix; if both the first loss function and the second loss function are smaller than the preset threshold, judge that the semantic segmentation model meets the preset standard; and if either the first loss function or the second loss function is greater than or equal to the preset threshold, continue training by correcting the output of the semantic segmentation model with the first distortion matrix and the second distortion matrix until the semantic segmentation model meets the preset standard.
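The acceptance test on the two loss values can be sketched as below. The source does not name the loss function, so mean squared error is used here purely as an assumption; `warp_loss` and `meets_preset_standard` are hypothetical names.

```python
import numpy as np

def warp_loss(predicted, target) -> float:
    """Mean squared error between a predicted and a ground-truth distortion matrix (assumed loss)."""
    predicted = np.asarray(predicted, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((predicted - target) ** 2))

def meets_preset_standard(pred1, true1, pred2, true2, threshold: float) -> bool:
    """True only if both the first and the second loss are below the preset threshold."""
    return warp_loss(pred1, true1) < threshold and warp_loss(pred2, true2) < threshold
```

If the check returns False, training continues with the first and second distortion matrices as supervision until both losses fall below the threshold.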
In a specific application scenario, in order to make the bill picture to be rectified and the sample bill picture for training the semantic segmentation model belong to the same format size, as shown in fig. 6, the device further includes: an adjustment module 36.
The adjustment module 36 is configured to adjust the bill picture to be corrected so that the bill picture to be corrected conforms to a preset size.
Correspondingly, in order to obtain the correction matrix corresponding to the bill picture to be corrected, the input module 34 is specifically configured to input the preprocessed bill picture to be corrected into a semantic segmentation model meeting a preset standard, so as to obtain the correction matrix.
In a specific application scenario, in order to correct the bill picture to be corrected by using the correction matrix, the correction module 35 is specifically configured to extract, from the correction matrix, a current position coordinate and a correction position coordinate of each pixel point included in the bill picture to be corrected; determining the pixel points of which the current position coordinates are not matched with the correction position coordinates as pixel points to be corrected; and replacing the current position coordinate by the correction position coordinate so as to enable the pixel point to be corrected to move to the correction position, thereby further realizing correction processing of the bill picture to be corrected.
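The correction step can be sketched as a forward mapping of pixels. The sketch assumes the correction matrix has shape h x w x 2 and stores, for the pixel currently at (y, x), its corrected (row, column); this layout, the rounding to integer positions, and the name `apply_correction` are assumptions rather than details given in the source.

```python
import numpy as np

def apply_correction(image: np.ndarray, correction: np.ndarray):
    """Move every pixel to its corrected position.

    Returns the corrected image and a boolean mask of the pixels that had to move,
    i.e. those whose current coordinates do not match the corrected coordinates.
    """
    h, w = image.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    target = np.rint(correction).astype(int)
    # Keep corrected positions inside the picture.
    ty = np.clip(target[..., 0], 0, h - 1)
    tx = np.clip(target[..., 1], 0, w - 1)
    to_correct = (ty != ys) | (tx != xs)
    out = np.zeros_like(image)
    out[ty, tx] = image  # pixels that do not move are mapped onto themselves
    return out, to_correct
```

Positions that no pixel is mapped to remain zero in this sketch and would need to be filled afterwards, for example by interpolation.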
It should be noted that, other corresponding descriptions of each functional unit related to the correction device for ticket images provided in this embodiment may refer to corresponding descriptions in fig. 1 to 2, and are not described herein again.
Based on the above method shown in fig. 1 and fig. 2, correspondingly, the embodiment of the application further provides a storage medium, on which a computer program is stored, which when executed by a processor, implements the above method for correcting the bill image shown in fig. 1 and fig. 2.
Based on such understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.), and includes several instructions for causing a computer device (may be a personal computer, a server, or a network device, etc.) to perform the method of each implementation scenario of the present application.
Based on the methods shown in fig. 1 and fig. 2 and the virtual device embodiments shown in fig. 5 and fig. 6, in order to achieve the above objects, the embodiments of the present application further provide a computer device, which may specifically be a personal computer, a server, a network device, etc., where the entity device includes a storage medium and a processor; a storage medium storing a computer program; and a processor for executing a computer program to implement the bill image correction method as shown in fig. 1 and 2.
Optionally, the computer device may also include a user interface, a network interface, a camera, Radio Frequency (RF) circuitry, sensors, audio circuitry, a WI-FI module, and the like. The user interface may include a display screen (Display) and an input unit such as a keyboard (Keyboard), and may optionally also include a USB interface, a card reader interface, etc. The network interface may optionally include a standard wired interface, a wireless interface (e.g., a Bluetooth interface or a WI-FI interface), etc.
It will be appreciated by those skilled in the art that the computer device structure provided in this embodiment is not limited to this physical device, and may include more or fewer components, or may combine certain components, or may be arranged in different components.
The non-volatile readable storage medium may also include an operating system, a network communication module, etc. The operating system is a program that manages the hardware and software resources of the physical device used for bill image correction, and supports the execution of information processing programs and other software and/or programs. The network communication module is used for realizing communication among components within the non-volatile readable storage medium, as well as communication with other hardware and software in the physical device.
Through the description of the above embodiments, it can be clearly understood by those skilled in the art that the present application may be implemented by software together with a necessary general-purpose hardware platform. After standard sample bill pictures are screened out, data processing is performed on the sample bill pictures based on a distortion formula and a folding formula to obtain a first distortion matrix and a second distortion matrix. Linear interpolation is then applied to the distorted sample bill pictures to obtain a first sample bill picture and a second sample bill picture, and the semantic segmentation model is trained and corrected using each distortion matrix together with its corresponding first or second sample bill picture. When the semantic segmentation model is determined to meet the preset standard, the preprocessed bill picture to be corrected is input into the semantic segmentation model to obtain a correction matrix, and the pixels to be corrected are adjusted based on the corrected position coordinates in the correction matrix, thereby correcting the bill picture. The method and the device can automatically correct the image at the pixel level according to the characteristics of the input bill, can be applied to correction scenes involving various local and non-local deformations, and can ensure correction accuracy while improving correction efficiency. In addition, apart from inputting the bill picture to be corrected, the correction process requires no human participation, which reduces the dependence on manual operation, improves automation efficiency, and can reduce the error rate.
Those skilled in the art will appreciate that the drawings are merely schematic illustrations of one preferred implementation scenario, and that the modules or flows in the drawings are not necessarily required to practice the present application. Those skilled in the art will also appreciate that the modules of an apparatus in an implementation scenario may be distributed in the apparatus as described for that scenario, or may be relocated, with corresponding changes, to one or more apparatuses different from that scenario. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules.
The foregoing application serial numbers are merely for description, and do not represent advantages or disadvantages of the implementation scenario. The foregoing disclosure is merely a few specific implementations of the present application, but the present application is not limited thereto and any variations that can be considered by a person skilled in the art shall fall within the protection scope of the present application.

Claims (8)

1. A method for correcting a ticket image, comprising:
acquiring a sample bill picture with the same bill type as the bill picture to be corrected;
performing data processing on the sample bill picture to obtain a corresponding distortion matrix;
creating a semantic segmentation model based on a semantic segmentation algorithm, and training the semantic segmentation model by using the processed sample bill picture and the corresponding distortion matrix so that the semantic segmentation model accords with a preset standard;
inputting the bill picture to be corrected into a semantic segmentation model conforming to the preset standard, and obtaining a correction matrix;
correcting the bill picture to be corrected by using the correction matrix;
the data processing is carried out on the sample bill picture to obtain a corresponding distortion matrix, and the method specifically comprises the following steps:
randomly selecting a plurality of distortion starting points from the sample bill picture;
performing twisting processing on the sample bill pictures at the twisting starting points based on a twisting formula so as to obtain first twisting matrixes of the twisting starting points corresponding to different twisting states;
performing linear interpolation filling of missing pixel points on the sample bill picture subjected to the distortion treatment to obtain a first sample bill picture conforming to a preset size;
randomly selecting a plurality of folding deformation starting points from the sample bill picture;
folding the sample bill pictures at the folding deformation starting points based on a folding formula so as to obtain second distortion matrixes of the folding deformation starting points corresponding to different folding states;
performing linear interpolation filling of missing pixel points on the folded sample bill picture to obtain a second sample bill picture conforming to the preset size;
the semantic segmentation model is created based on a semantic segmentation algorithm, and is trained by using the processed sample bill pictures and the corresponding distortion matrix, so that the semantic segmentation model meets the preset standard, and the method specifically comprises the following steps:
configuring labels for the first sample bill picture and the second sample bill picture, wherein the labels correspond to the first distortion matrix and the second distortion matrix;
inputting the first sample bill picture and the second sample bill picture after label configuration into a semantic segmentation model, and respectively obtaining a corresponding first prediction distortion matrix and a corresponding second prediction distortion matrix;
calculating a first loss function between the first predicted warp matrix and the first warp matrix and a second loss function between the second predicted warp matrix and the second warp matrix;
if the first loss function and the second loss function are smaller than the preset threshold, judging that the semantic segmentation model meets a preset standard;
and if the first loss function or the second loss function which is larger than or equal to the preset threshold exists, correcting and training an output result of the semantic segmentation model by using the first distortion matrix and the second distortion matrix so as to enable the semantic segmentation model to meet the preset standard.
2. The method of claim 1, wherein the warping formula is:
$w_1 = 1 - d^{\alpha}$
wherein $w_1$ is the preset distortion effect, $\alpha$ is a constant equal to 2, and $d$ is the Euclidean distance between the distortion starting point and its offset point after the distortion is performed on the distortion starting point;
and performing a twisting process on the sample bill picture based on a twisting formula at the twisting start point so as to obtain a first twisting matrix corresponding to different twisting states of each twisting start point, wherein the method specifically comprises the following steps:
calculating a first offset point corresponding to the distortion starting point under the preset distortion effect according to the distortion formula;
moving the pixel point at the distortion starting point to the first offset point;
a first warping matrix including the warp starting point position coordinates and the first offset point position coordinates is constructed.
3. The method of claim 1, wherein the folding formula is:
$w_2 = \alpha / (d^{2} + \alpha)$
wherein $w_2$ is the preset folding effect, $\alpha$ is a constant equal to 2, and $d$ is the Euclidean distance between the folding deformation starting point and its offset point after the folding deformation;
folding the sample bill picture based on a folding formula at the folding deformation starting point so as to obtain second distortion matrixes of the folding deformation starting point corresponding to different folding states, wherein the method specifically comprises the following steps of:
calculating a second offset point corresponding to the folding deformation starting point under the preset folding effect according to the folding formula;
moving the pixel point at the folding deformation starting point to the second offset point;
and constructing a second distortion matrix comprising the position coordinates of the folding deformation starting point and the position coordinates of the second offset point.
4. The method according to claim 1, wherein before the inputting the bill picture to be corrected into the semantic segmentation model meeting the preset standard to obtain the correction matrix, the method specifically further comprises:
adjusting the bill picture to be corrected so as to enable the bill picture to be corrected to accord with a preset size;
inputting the bill picture to be corrected into a semantic segmentation model conforming to the preset standard, and obtaining a correction matrix specifically comprises the following steps:
inputting the preprocessed bill picture to be corrected into a semantic segmentation model conforming to the preset standard, and obtaining a correction matrix.
5. The method according to claim 4, wherein the correcting the ticket picture to be corrected by using the correction matrix specifically comprises:
extracting current position coordinates and correction position coordinates of all pixel points contained in the bill picture to be corrected from the correction matrix;
determining the pixel points of which the current position coordinates are not matched with the correction position coordinates as pixel points to be corrected;
and replacing the current position coordinate by the correction position coordinate so that the pixel point to be corrected moves to the correction position, and further realizing correction processing of the bill picture to be corrected.
6. A ticket image correction device, comprising:
the acquisition module is used for acquiring sample bill pictures with the same bill type as the bill pictures to be corrected;
the processing module is used for carrying out data processing on the sample bill picture to obtain a corresponding distortion matrix, which specifically comprises the following steps:
randomly selecting a plurality of distortion starting points from the sample bill picture;
performing twisting processing on the sample bill pictures at the twisting starting points based on a twisting formula so as to obtain first twisting matrixes of the twisting starting points corresponding to different twisting states;
performing linear interpolation filling of missing pixel points on the sample bill picture subjected to the distortion treatment to obtain a first sample bill picture conforming to a preset size;
randomly selecting a plurality of folding deformation starting points from the sample bill picture;
folding the sample bill pictures at the folding deformation starting points based on a folding formula so as to obtain second distortion matrixes of the folding deformation starting points corresponding to different folding states;
performing linear interpolation filling of missing pixel points on the folded sample bill picture to obtain a second sample bill picture conforming to the preset size;
the training module is used for creating a semantic segmentation model based on a semantic segmentation algorithm, and training the semantic segmentation model by using the processed sample bill picture and the corresponding distortion matrix so that the semantic segmentation model accords with a preset standard, which specifically comprises the following steps:
configuring labels for the first sample bill picture and the second sample bill picture, wherein the labels correspond to the first distortion matrix and the second distortion matrix;
inputting the first sample bill picture and the second sample bill picture after label configuration into a semantic segmentation model, and respectively obtaining a corresponding first prediction distortion matrix and a corresponding second prediction distortion matrix;
calculating a first loss function between the first predicted warp matrix and the first warp matrix and a second loss function between the second predicted warp matrix and the second warp matrix;
if the first loss function and the second loss function are smaller than the preset threshold, judging that the semantic segmentation model meets a preset standard;
if the first loss function or the second loss function which is larger than or equal to the preset threshold value exists, correcting and training an output result of the semantic segmentation model by using the first distortion matrix and the second distortion matrix so that the semantic segmentation model accords with the preset standard;
the input module is used for inputting the bill picture to be corrected into a semantic segmentation model conforming to the preset standard to obtain a correction matrix;
and the correction module is used for correcting the bill picture to be corrected by using the correction matrix.
7. A non-transitory readable storage medium having stored thereon a computer program, characterized in that the program when executed by a processor implements the ticket image correction method of any one of claims 1 to 5.
8. A computer device comprising a non-volatile readable storage medium, a processor and a computer program stored on the non-volatile readable storage medium and executable on the processor, characterized in that the processor implements the method of rectifying a ticket image according to any of claims 1 to 5 when executing the program.
CN202010164109.4A 2020-03-11 2020-03-11 Correction method and device for bill image and computer equipment Active CN111507181B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010164109.4A CN111507181B (en) 2020-03-11 2020-03-11 Correction method and device for bill image and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010164109.4A CN111507181B (en) 2020-03-11 2020-03-11 Correction method and device for bill image and computer equipment

Publications (2)

Publication Number Publication Date
CN111507181A CN111507181A (en) 2020-08-07
CN111507181B true CN111507181B (en) 2023-05-26

Family

ID=71871553

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010164109.4A Active CN111507181B (en) 2020-03-11 2020-03-11 Correction method and device for bill image and computer equipment

Country Status (1)

Country Link
CN (1) CN111507181B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112597998A (en) * 2021-01-07 2021-04-02 天津师范大学 Deep learning-based distorted image correction method and device and storage medium
CN112767270B (en) * 2021-01-19 2022-07-15 中国科学技术大学 Fold document image correction system
CN113011249B (en) * 2021-01-29 2024-05-28 招商银行股份有限公司 Bill auditing method, device, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107292823A (en) * 2017-08-20 2017-10-24 平安科技(深圳)有限公司 Electronic installation, the method for invoice classification and computer-readable recording medium
CN109214382A (en) * 2018-07-16 2019-01-15 顺丰科技有限公司 A kind of billing information recognizer, equipment and storage medium based on CRNN
CN109886257A (en) * 2019-01-30 2019-06-14 四川长虹电器股份有限公司 Using the method for deep learning correction invoice picture segmentation result in a kind of OCR system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8774556B2 (en) * 2011-11-30 2014-07-08 Microsoft Corporation Perspective correction using a reflection

Also Published As

Publication number Publication date
CN111507181A (en) 2020-08-07


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant