CN113723392A

CN113723392A - Document image quality evaluation method and device, computer equipment and storage medium

Info

Publication number: CN113723392A
Application number: CN202111149187.8A
Authority: CN
Inventors: 姜天昌; 李波; 勇妍; 吴光军
Original assignee: Glodon Co Ltd
Current assignee: Glodon Co Ltd
Priority date: 2021-09-29
Filing date: 2021-09-29
Publication date: 2021-11-30

Abstract

The invention provides a method and a device for evaluating the quality of a document image, computer equipment and a computer readable storage medium, wherein the method comprises the following steps: acquiring a document image uploaded by a user; preprocessing the document image to obtain a target image which only contains the document and has a regular shape; evaluating the definition of the target image by using a definition evaluation model and outputting a definition score; the definition comprises the combination of any of illumination definition, printing font definition, handwriting font definition and seal definition; and evaluating the quality of the document image according to the definition score. The invention automatically scores the document image through a machine learning model in the artificial intelligence technology, thereby achieving the aim of accurately and rapidly carrying out quantitative quality evaluation.

Description

Document image quality evaluation method and device, computer equipment and storage medium

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to a method and a device for evaluating quality of a document image, computer equipment and a storage medium.

Background

With the development of the industrial internet, the auditing, counting, processing and similar handling of various documents are completed online. For example, a transportation bill, as an important certificate in the logistics transportation process, needs to be uploaded to a logistics management platform for unified examination and management. At present, a user usually collects image information of various documents through a mobile phone, a tablet, a high-speed scanner, a scanner or other devices, and then uploads the collected image information to a management platform or a server from a mobile client (e.g., iOS) or a PC client.

In the image acquisition process, non-standard operation, the diversity of equipment, a complex surrounding environment and similar factors often leave the uploaded document image with low definition, so that it is difficult for a computer to identify. In the prior art, the quality of the document image uploaded by a user therefore needs to be evaluated by manual judgment in order to reduce invalid document images. However, manual evaluation often suffers from non-uniform evaluation standards, inaccurate evaluation results and low evaluation efficiency, so that the image quality evaluation link cannot achieve the expected effect.

Disclosure of Invention

The invention aims to provide a technical scheme capable of accurately and quickly automatically evaluating the document image quality so as to solve the problems in the prior art.

In order to achieve the purpose, the invention provides a method for evaluating the quality of a document image, which comprises the following steps:

acquiring a document image uploaded by a user;

preprocessing the document image to obtain a target image which only contains the document and has a regular shape;

evaluating the definition of the target image by using a definition evaluation model and outputting a definition score; the definition comprises the combination of any of illumination definition, printing font definition, handwriting font definition and seal definition;

and evaluating the quality of the document image according to the definition score.

According to the method for evaluating the quality of the document image, the step of preprocessing the document image to obtain the target image which only contains the document and has a regular shape comprises the following steps:

segmenting the document image by using an image segmentation algorithm to determine a target area in the document image;

adjusting the vertex position of the target area to enable the target area to form a regular rectangle;

and outputting the image in the target area after adjustment as the target image.

According to the quality evaluation method of the document image, the definition evaluation model comprises a light reflection recognition model, and the step of evaluating the definition of the target image and outputting the definition score by using the definition evaluation model comprises the following steps:

and inputting the target image into the reflection recognition model to output the reflection probability score of the target image.

According to the quality evaluation method of the document image provided by the invention, the definition evaluation model further comprises a printing font recognition model and a handwriting font recognition model, and the step of evaluating the definition of the target image and outputting the definition score by using the definition evaluation model comprises the following steps:

recognizing all font strings in the target image by using an OCR recognition method;

dividing all the font strings into printing font strings and handwriting font strings by utilizing a classification algorithm;

inputting the printing font string into a printing font recognition model to output a printing font definition score of the target image;

and inputting the handwritten font string into a handwritten font recognition model so as to output the handwritten font definition score of the target image.

According to the quality evaluation method of the document image, the printing font string and the handwriting font string both comprise a plurality of characters, the printing font definition score is obtained after weighted calculation according to a plurality of single printing font definition scores, and the handwriting font definition score is obtained after weighted calculation according to a plurality of single handwriting font definition scores.

According to the quality evaluation method of the document image provided by the invention, the definition evaluation model further comprises a seal recognition model, and the step of evaluating the definition of the target image by using the definition evaluation model and outputting the definition score comprises the following steps:

segmenting the target image by utilizing an image segmentation algorithm to obtain a stamp image only containing the stamp content;

and inputting the seal image into the seal identification model to output the seal definition score of the target image.

According to the quality evaluation method of the document image provided by the invention, the step of evaluating the definition of the target image by using the definition evaluation model and outputting the definition score further comprises the following steps:

and weighting and summing the reflection probability score, the printing font definition score, the handwriting font definition score and the seal definition score to obtain a comprehensive definition score of the target image.

In order to achieve the above object, the present invention further provides a document image quality evaluation device, including:

the document acquisition module is used for acquiring a document image uploaded by a user;

the preprocessing module is suitable for preprocessing the document image to obtain a target image which only contains the document and has a regular shape;

the definition grading module is suitable for evaluating the definition of the target image by using a definition evaluation model and outputting a definition grade; the definition comprises the combination of any of illumination definition, printing font definition, handwriting font definition and seal definition;

and the quality evaluation module is suitable for evaluating the quality of the document image according to the definition score.

To achieve the above object, the present invention further provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the above method when executing the computer program.

To achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of the above method.

The method, the device, the computer equipment and the readable storage medium for evaluating the quality of the document image automatically score the document image through machine learning in artificial intelligence technology, thereby achieving accurate and rapid quantitative quality evaluation. According to the invention, the document image is first preprocessed, so that interference factors in the document image are removed and the identification degree of the document image is improved. Then, using the definition evaluation model trained by machine learning, the definition of the document image can be scored from different dimensions such as light reflection probability, printing font definition, handwriting font definition and seal definition. Finally, the quality evaluation is carried out based on the definition scores of the different dimensions. Scoring definition with the definition evaluation model helps ensure that the scoring is objective, standardized and accurate, and different weights can be set for the definition scores of the different dimensions according to different application scenarios, so that the method has a wider application range and higher flexibility.

Drawings

FIG. 1 is a flowchart of a method for evaluating quality of a document image according to a first embodiment of the present invention;

FIG. 2 is a schematic flow diagram of the pre-processing of document images in accordance with an embodiment of the present invention;

FIG. 3 is a schematic flow chart of evaluating font sharpness in accordance with an embodiment of the present invention;

FIG. 4 is a schematic flow chart of quality assessment based on sharpness in accordance with one embodiment of the present invention;

FIG. 5 is a schematic diagram of a program module of a document image quality evaluation apparatus according to a first embodiment of the present invention;

FIG. 6 is a schematic hardware configuration diagram of a document image quality evaluation apparatus according to a first embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Example one

Referring to fig. 1, the present embodiment provides a method for evaluating quality of a document image, including the following steps:

S100, acquiring a document image uploaded by a user. The document in this embodiment may include any form of documentary material, such as an invoice, a logistics waybill, a bill of lading, a packing slip, and the like; the present invention is not limited in this respect. The document image may be a photograph or image obtained by any image capture device, such as a mobile phone, a camera, a tablet, etc.

S200, preprocessing the document image to obtain a target image which only contains the document and has a regular shape.

The purpose of preprocessing in the step is to remove invalid information in the document image and improve the identification degree of the document image. Any image preprocessing technology in the prior art can be used for preprocessing the document image, such as image noise reduction, image filtering, image segmentation and the like, and the step is not limited.

S300, evaluating the definition of the target image by using a definition evaluation model and outputting a definition score; the definition comprises the combination of any of illumination definition, printing font definition, handwriting font definition and seal definition.

The definition evaluation model in this embodiment is an artificial neural network model obtained through machine learning training based on big data technology. Specifically, different neural network structures such as a BP neural network, a deep neural network, a convolutional neural network, and a recurrent neural network may be used, which is not limited in this embodiment.

The sharpness evaluation model in this embodiment may include a plurality of models, each of which is used to evaluate sharpness of the target image from different dimensions. Such as an illumination sharpness model for evaluating whether there is light reflection, a print font sharpness model for evaluating print font sharpness, a handwriting font sharpness model for evaluating handwriting font sharpness, and a stamp sharpness model for evaluating stamp sharpness. Each model correspondingly outputs scores of different dimensions for the input target image.

The training processes for the different sharpness models described above are similar, differing only in the choice of training samples. Taking the illumination sharpness model as an example, the training samples include a plurality of training images and a reflection probability score corresponding to each training image, where the reflection probability score may be manually evaluated in advance based on a specific criterion. The training images are taken as input data and the corresponding reflection probability scores as output data to train the weight values of the hidden variables in the illumination sharpness model until the model converges, at which point the training process ends. The weight values of the hidden variables in the illumination sharpness model are then fixed and no longer adjusted, and for any image input subsequently, the illumination sharpness model outputs a reflection probability score corresponding to that image. The training process of the other sharpness models is similar to the above process and is not described again in this embodiment.
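The train-until-convergence procedure described above can be illustrated with a deliberately tiny stand-in. The sketch below is not the neural network of this embodiment: it assumes each training image has already been reduced to a single hypothetical feature (for example, the fraction of over-exposed pixels) and fits a one-unit logistic model to manually assigned reflection probability scores. The function names, learning rate and tolerance are illustrative assumptions.

```python
import math


def predict(w: float, b: float, feature: float) -> float:
    """Reflection probability score in (0, 1) for one feature value."""
    return 1.0 / (1.0 + math.exp(-(w * feature + b)))


def train_until_converged(samples, lr=0.5, tol=1e-9, max_epochs=20000):
    """Fit (w, b) by gradient descent on mean squared error.

    samples: list of (feature, manually_assigned_score) pairs, scores in [0, 1].
    Training stops once the loss changes by less than `tol` between epochs,
    which plays the role of "the model tends to converge" in the text.
    """
    w, b = 0.0, 0.0
    prev_loss = float("inf")
    n = len(samples)
    for _ in range(max_epochs):
        grad_w = grad_b = loss = 0.0
        for feature, target in samples:
            p = predict(w, b, feature)
            err = p - target
            loss += err * err
            # Chain rule: d(err^2)/dz = 2 * err * p * (1 - p), z = w*x + b.
            g = 2.0 * err * p * (1.0 - p)
            grad_w += g * feature
            grad_b += g
        loss /= n
        if abs(prev_loss - loss) < tol:
            break  # converged: the weights are frozen from here on
        prev_loss = loss
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b
```

After training, the returned `(w, b)` pair is kept fixed and only `predict` is called for new inputs, mirroring the fixed weight values described above.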

S400, evaluating the quality of the document image according to the definition score.

As described above, in the case where the sharpness evaluation model includes a plurality of models, a plurality of sharpness scores may be obtained accordingly. Different weights can be set for the different sharpness scores based on practical needs, and a comprehensive sharpness score is finally obtained by weighted summation. In this way, the quality of the document image can be assessed according to the comprehensive sharpness score. For example, the comprehensive sharpness scores can be divided into different data intervals, each of which corresponds to a quality grade, such as excellent, good, medium or poor. This converts a quantitative score into a qualitative grade and makes it convenient to apply uniform subsequent processing per quality grade, such as deleting document images whose quality grade is poor.
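The weighted summation and interval-based grading can be sketched as follows. The 0 to 100 scale, the grade boundaries and the dimension names are assumptions made for illustration; the embodiment does not fix any of them. The sketch also assumes every score is oriented so that a higher value means a clearer image, and it divides by the total weight so that weights need not sum to one.

```python
from bisect import bisect_right

# Hypothetical interval boundaries on a 0-100 scale (not from the source):
# [0, 60) poor, [60, 75) medium, [75, 90) good, [90, 100] excellent.
GRADE_BOUNDS = [60.0, 75.0, 90.0]
GRADE_LABELS = ["poor", "medium", "good", "excellent"]


def composite_score(scores: dict, weights: dict) -> float:
    """Weighted combination of per-dimension sharpness scores."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(scores[name] * weights[name] for name in scores)
    return weighted / total_weight


def quality_grade(score: float) -> str:
    """Map a composite score to the quality grade of its interval."""
    return GRADE_LABELS[bisect_right(GRADE_BOUNDS, score)]
```

Changing `GRADE_BOUNDS` or the weights per application scenario changes the grading without touching the scoring models themselves.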

FIG. 2 is a schematic flow chart illustrating preprocessing of document images according to a first embodiment of the present invention. As shown in fig. 2, step S200 includes:

S210, segmenting the document image by using an image segmentation algorithm to determine a target area in the document image.

In one example, the image segmentation algorithm may be implemented by a UNET model. The UNET model is an improvement on the fully convolutional network (FCN), and a trained UNET model can segment a target object to be recognized from an input image. In this embodiment, the UNET model is used to segment a target area containing only document content from a document image. For example, the document image often contains some background information, such as the desktop on which the document is placed; this step determines the boundary between the document content and the background information, so as to obtain the target area.

S220, adjusting the vertex position of the target area to enable the target area to form a regular rectangle.

This step handles the case where the target area is irregular because of a tilted shooting angle. For example, when the plane of the camera is not parallel to the plane of the document, the target area in the document image appears as a trapezoid, a rhombus or a similar non-rectangular shape, which easily interferes with the subsequent identification process. In this step, the shape of the target area is adjusted by moving its vertices. For example, the initial first position coordinates of the four vertices of the target area are obtained, and second position coordinates of the four vertices are calculated through a geometric function, so that the target area forms a regular rectangle.

S230, outputting the adjusted image in the target area as the target image. The target image thus better conforms to the actual geometric shape of the document, which helps improve the accuracy of quality evaluation.
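One possible form of the "geometric function" in S220 is sketched below. It is an assumption, not the method fixed by this embodiment: the four detected vertices are ordered, and the rectangle's width and height are taken as the longer of each pair of opposite edges. Remapping the pixels onto that rectangle (for example with a perspective transform from an image library) is a separate step that is not shown.

```python
import math


def order_vertices(points):
    """Order four (x, y) points as top-left, top-right, bottom-right, bottom-left.

    In image coordinates (y grows downward) the top-left corner has the
    smallest x + y and the bottom-right the largest; the top-right corner
    has the largest x - y and the bottom-left the smallest.
    """
    tl = min(points, key=lambda p: p[0] + p[1])
    br = max(points, key=lambda p: p[0] + p[1])
    tr = max(points, key=lambda p: p[0] - p[1])
    bl = min(points, key=lambda p: p[0] - p[1])
    return tl, tr, br, bl


def rectified_vertices(points):
    """Second position coordinates: an axis-aligned rectangle at the origin."""
    tl, tr, br, bl = order_vertices(points)
    width = max(math.dist(tl, tr), math.dist(bl, br))
    height = max(math.dist(tl, bl), math.dist(tr, br))
    return [(0.0, 0.0), (width, 0.0), (width, height), (0.0, height)]
```

The ordering heuristic assumes the document is only moderately rotated; a document rotated by close to 45 degrees would need a more robust ordering.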

Those skilled in the art understand that font definition has a significant impact on the definition of a document image. In one example, the present application performs font sharpness evaluation from both printed and handwritten fonts, respectively. Fig. 3 is a schematic flowchart illustrating the evaluation of font definition according to the present embodiment, and as shown in fig. 3, step S300 includes:

S310, recognizing all font strings in the target image by using an OCR recognition method.

Optical Character Recognition (OCR) technology recognizes the font strings in a target image line by line, where each font string refers to a continuous run of characters recognized within one rectangular box. Those skilled in the art understand that two characters are identified as belonging to two separate font strings when other non-character elements, such as images or white space, lie between them. Therefore, a plurality of font strings can generally be acquired from a target image by an OCR recognition method.
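The rule that intervening white space splits characters into separate font strings can be illustrated with a small sketch. It assumes an OCR engine has already produced, for one text line, each character with the left and right x-coordinates of its box; the tuple layout and the `max_gap` threshold are hypothetical and not part of this embodiment.

```python
def group_into_strings(char_boxes, max_gap):
    """Group the characters of one text line into font strings.

    char_boxes: list of (char, x_left, x_right) tuples for a single line.
    A new string starts whenever the horizontal gap to the previous
    character exceeds max_gap (i.e. white space or a non-text element
    lies between them).
    """
    strings = []
    current = ""
    prev_right = None
    for char, left, right in sorted(char_boxes, key=lambda box: box[1]):
        if prev_right is not None and left - prev_right > max_gap:
            strings.append(current)
            current = ""
        current += char
        prev_right = right
    if current:
        strings.append(current)
    return strings
```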

S320, dividing all the font strings into printing font strings and handwriting font strings by utilizing a classification algorithm.

The plurality of font strings may be classified by any existing classification algorithm, so that each font string is assigned to either the print font strings or the handwriting font strings. Existing classification algorithms include the naive Bayes classifier (NBC), logistic regression (LR), decision tree, support vector machine (SVM) and artificial neural network (ANN) algorithms, among others, which is not limited in this embodiment.
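Because the classifier itself is interchangeable, step S320 can be sketched as a partition that takes the classifier as a parameter. The labels `"print"` and `"handwriting"`, the dictionary layout and the `toy_classifier` rule are all hypothetical; a real system would plug in a trained model of any of the kinds listed above.

```python
def split_font_strings(font_strings, classify):
    """Partition font strings into (printed, handwritten) using `classify`.

    classify: any callable returning "print" or "handwriting" for one item.
    """
    printed, handwritten = [], []
    for item in font_strings:
        label = classify(item)
        if label == "print":
            printed.append(item)
        elif label == "handwriting":
            handwritten.append(item)
        else:
            raise ValueError(f"unexpected label: {label!r}")
    return printed, handwritten


def toy_classifier(item):
    # Hypothetical stand-in rule, not from the source: irregular stroke
    # width is treated as a sign of handwriting.
    return "handwriting" if item["stroke_width_variance"] > 1.0 else "print"
```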

S330, inputting the printing font string into a printing font recognition model so as to output the printing font definition score of the target image.

S340, inputting the handwritten font string into a handwritten font recognition model so as to output the handwritten font definition score of the target image.

Steps S330 and S340 obtain a print font clarity score and a handwritten font clarity score through the print font recognition model and the handwritten font recognition model, respectively. It should be noted that, since the print font strings and the handwritten font strings each include a plurality of font strings, the font clarity score output by each font recognition model is a weighted average. For example, when ten print font strings are input into the print font recognition model, the model first calculates the individual score of each print font string, then performs a weighted average of the ten individual scores according to preset weight values, and finally outputs the average print font definition score corresponding to the ten print font strings. The weight value may be determined according to the number of characters included in the font string; for example, the more characters a string contains, the larger its weight value. The same weighted average calculation is performed on the handwritten font definition score output by the handwritten font recognition model, which is not illustrated here.
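The weighted average over individual string scores can be written out directly. The sketch below takes the weighting option mentioned above, where a string's weight is its character count; the per-string scores are assumed to come from the font recognition model, and the function name is illustrative.

```python
def weighted_font_score(string_scores):
    """Average per-string sharpness scores, weighted by character count.

    string_scores: list of (text, score) pairs, one per font string.
    """
    total_chars = sum(len(text) for text, _ in string_scores)
    if total_chars == 0:
        raise ValueError("no characters to score")
    weighted_sum = sum(len(text) * score for text, score in string_scores)
    return weighted_sum / total_chars
```

For instance, a four-character string scored 90 and a two-character string scored 60 average to (4 × 90 + 2 × 60) / 6 = 80, so the longer string dominates the result.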

By classifying the printed fonts and the handwritten fonts and respectively outputting the definition scores, different scoring standards can be correspondingly adopted for different fonts, so that the accuracy and the reliability of the font definition score are improved.

In one example, the sharpness evaluation model further includes a stamp identification model, so step S300 further includes:

S350, segmenting the target image by utilizing an image segmentation algorithm to obtain a seal image containing only the seal content. The seal image can be segmented from the target image by any existing image segmentation algorithm, such as a UNET model. The UNET model can be obtained through big data training, so that it can identify the seal in any input image and segment out the seal image.

S360, inputting the seal image into the seal recognition model to output the seal definition score of the target image. The seal recognition model is similar to the illumination definition model, the printing font definition model and the handwriting font definition model; it evaluates the definition of the target image in the seal definition dimension, thereby obtaining the seal definition score.

Further, on the basis of obtaining definition scores of different dimensions of the target image, weighting and summing the light reflection probability score, the printing font definition score, the handwriting font definition score and the seal definition score to obtain a comprehensive definition score of the target image. The weight values of all the definition scores can be set according to different document characteristics, for example, for documents with more printing fonts, the weight of the definition score of the printing fonts is set to be higher than that of the definition score of the handwriting fonts; and for documents with more handwritten fonts, setting the weight of the definition score of the handwritten fonts to be higher than the weight of the definition score of the printing fonts and the like. Therefore, the comprehensive definition score can better meet the requirements of users, and the accuracy of image quality evaluation is improved.

FIG. 4 is a schematic flow chart of quality assessment based on sharpness, according to an embodiment of the present invention. As shown in fig. 4, in this embodiment, firstly, image input is performed, the input image is detected and cut (which is equivalent to performing a preprocessing process on a document image), and definition recognition is further performed from multiple dimensions, including reflection detection (i.e., illumination definition detection), print font definition recognition, handwritten font definition recognition, and stamp definition recognition. And obtaining a comprehensive definition score based on the dimensions, and finally evaluating the image quality according to the relation between the comprehensive definition score and a preset threshold value.
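The flow of FIG. 4 can be sketched end to end as a function that wires the stages together. Everything passed in is a placeholder: `preprocess` stands for the detect-and-cut stage, each entry of `scorers` for one trained sharpness model, and the rule "acceptable when the composite score is at least the threshold" is an assumed reading of the threshold comparison, which the embodiment does not spell out.

```python
def evaluate_document(image, preprocess, scorers, weights, threshold):
    """Run preprocessing, per-dimension scoring, weighted sum and thresholding.

    preprocess: callable(image) -> target image (hypothetical stage).
    scorers:    dict name -> callable(target image) -> score (hypothetical models).
    weights:    dict name -> weight for the weighted summation.
    threshold:  preset value the composite score is compared against.
    """
    target = preprocess(image)
    scores = {name: scorer(target) for name, scorer in scorers.items()}
    composite = sum(weights[name] * value for name, value in scores.items())
    return {
        "scores": scores,
        "composite": composite,
        # Assumed convention: higher composite score means better quality.
        "acceptable": composite >= threshold,
    }
```

Keeping the stages as parameters means a dimension such as seal definition can be added or dropped by editing `scorers` and `weights` only.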

With continued reference to FIG. 5, a document image quality evaluation apparatus is shown. In this embodiment, the document image quality evaluation apparatus 50 may include or be divided into one or more program modules, which are stored in a storage medium and executed by one or more processors to implement the above-mentioned document image quality evaluation method. A program module, as referred to herein, is a series of computer program instruction segments capable of performing a specific function, and is better suited than the program itself for describing how the document image quality evaluation apparatus 50 executes in the storage medium. The functions of the program modules of this embodiment are described below:

the document acquisition module 51 is used for acquiring a document image uploaded by a user;

a preprocessing module 52, adapted to preprocess the document image to obtain a target image with a regular shape and containing only the document;

a sharpness scoring module 53 adapted to evaluate sharpness of the target image using a sharpness evaluation model and output a sharpness score; the definition comprises the combination of any of illumination definition, printing font definition, handwriting font definition and seal definition;

and the quality evaluation module 54 is suitable for evaluating the quality of the document image according to the definition score.

The embodiment also provides a computer device capable of executing programs, such as a smart phone, a tablet computer, a notebook computer, a desktop computer, a rack server, a blade server, a tower server or a cabinet server (including an independent server or a server cluster composed of a plurality of servers). The computer device 60 of the present embodiment includes at least, but is not limited to, a memory 61 and a processor 62, which may be communicatively coupled to each other via a system bus, as shown in FIG. 6. It is noted that FIG. 6 only shows a computer device 60 with components 61-62, but it is to be understood that not all of the shown components are required, and that more or fewer components may be implemented instead.

In the present embodiment, the memory 61 (i.e., a readable storage medium) includes a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 61 may be an internal storage unit of the computer device 60, such as a hard disk or a memory of the computer device 60. In other embodiments, the memory 61 may also be an external storage device of the computer device 60, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), or the like, provided on the computer device 60. Of course, the memory 61 may also include both internal and external storage devices of the computer device 60. In this embodiment, the memory 61 is generally used for storing an operating system and various application software installed in the computer device 60, such as the program code of the document image quality evaluation apparatus 50 in the first embodiment. Further, the memory 61 may also be used to temporarily store various types of data that have been output or are to be output.

Processor 62 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 62 is typically used to control the overall operation of the computer device 60. In this embodiment, the processor 62 is configured to execute the program code stored in the memory 61 or process data, for example, execute the document image quality evaluation apparatus 50, so as to implement the document image quality evaluation method according to the first embodiment.

The present embodiment also provides a computer-readable storage medium, such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, an app store, etc., on which a computer program is stored that, when executed by a processor, implements the corresponding functions. The computer-readable storage medium of this embodiment is used for storing the document image quality evaluation apparatus 50, which, when executed by a processor, implements the document image quality evaluation method according to the first embodiment.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.

It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable medium, and when executed, the program includes one or a combination of the steps of the method embodiments.

In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example" or "some examples" or the like are intended to mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner.

The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims

1. A method for evaluating the quality of a document image is characterized by comprising the following steps:

acquiring a document image uploaded by a user;

2. The method of claim 1, wherein the step of pre-processing the document image to obtain a target image of regular shape containing only the document comprises:

3. The document image quality assessment method according to claim 1 or 2, wherein the sharpness assessment model comprises a reflection recognition model, and the step of assessing the sharpness of the target image and outputting a sharpness score using the sharpness assessment model comprises:

4. A document image quality assessment method according to claim 3, wherein said sharpness assessment model further comprises a print font recognition model and a handwriting font recognition model, and said step of assessing the sharpness of said target image using the sharpness assessment model and outputting a sharpness score comprises:

5. The document image quality assessment method according to claim 4, wherein the print font string and the handwritten font string each comprise a plurality of strings, the print font sharpness score is obtained by weighted calculation based on a plurality of single print font sharpness scores, and the handwritten font sharpness score is obtained by weighted calculation based on a plurality of single handwritten font sharpness scores.

6. A document image quality assessment method according to claim 5, wherein said sharpness assessment model further comprises a stamp recognition model, and said step of assessing the sharpness of said target image using the sharpness assessment model and outputting a sharpness score comprises:

7. The document image quality assessment method according to claim 5, wherein the step of assessing the sharpness of the target image using a sharpness assessment model and outputting a sharpness score further comprises:

8. An apparatus for evaluating the quality of a document image, comprising:

9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 7 are implemented by the processor when executing the computer program.

10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.