CN113409278B - Image quality detection method, device, equipment and medium - Google Patents

Image quality detection method, device, equipment and medium

Info

Publication number
CN113409278B
Authority
CN
China
Prior art keywords
image
processed
quality
text
determining
Prior art date
Legal status
Active
Application number
CN202110694356.XA
Other languages
Chinese (zh)
Other versions
CN113409278A (en)
Inventor
游锦成
Current Assignee
Ping An Health Insurance Company of China Ltd
Original Assignee
Ping An Health Insurance Company of China Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Health Insurance Company of China Ltd
Priority to CN202110694356.XA
Publication of CN113409278A
Application granted
Publication of CN113409278B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/0002 - Inspection of images, e.g. flaw detection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 - Subject of image; Context of image processing
    • G06T 2207/30168 - Image quality inspection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 - Subject of image; Context of image processing
    • G06T 2207/30176 - Document

Landscapes

  • Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the field of artificial intelligence and to the field of blockchain, and discloses an image quality detection method, device, equipment and medium. The method comprises the following steps: acquiring an image to be processed; positioning the text in the image to be processed by using a text positioning model to determine the text regions in the image to be processed; performing Sobel operator convolution processing on the text regions in the image to be processed to obtain a target convolution matrix of each text region; acquiring the region information of each text region according to the pixel values of the pixels in the target convolution matrix and the distribution intervals of those pixel values; and finally determining the quality detection result of the image to be processed according to the region information of the text regions and a preset quality condition. According to the invention, the interference of image background information can be reduced, the accuracy of the quality detection algorithm is improved, and the accuracy of the determined image quality is further improved.

Description

Image quality detection method, device, equipment and medium
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to a method, an apparatus, a device, and a medium for detecting image quality.
Background
Optical character recognition (Optical Character Recognition, OCR) refers to the process of analyzing and recognizing an image file of text material to obtain its text information. The text material image is typically acquired by a scanner, camera or the like and stored as an image file, and OCR software reads and analyzes the image file and extracts the character strings in it by character recognition. However, OCR performance is easily affected by image quality: if the image to be recognized is not clear enough, or is even severely blurred, the subsequent positioning and recognition of the text in the image are likely to fail, causing the whole text recognition task to fail. Therefore, quality control of the image before character recognition is important.
In existing image quality detection methods, the whole image is generally filtered with an edge detection operator, and the obtained filtering result is then simply summed, or its variance is calculated, to give a quality score; alternatively, image quality detection is performed by a deep-learning quality detection model. However, both the edge detection operator and the deep-learning model are easily interfered with by non-text background information in the image, so the accuracy of the obtained image quality is not high.
Disclosure of Invention
The invention provides an image quality detection method, device, equipment and medium, which are used to solve the problem in existing image quality detection that the detection is easily interfered with by non-text background information in the image, so that the accuracy of the obtained image quality is low.
Provided is an image quality detection method including:
Acquiring an image to be processed, and positioning characters in the image to be processed by adopting a character positioning model so as to determine a character area in the image to be processed;
Performing Sobel operator convolution processing on a character area in the image to be processed to obtain a target convolution matrix of the character area;
Acquiring the area information of the text area according to the pixel values of the pixels in the target convolution matrix and the distribution interval of the pixel values;
and determining a quality detection result of the image to be processed according to the area information of each text area and a preset quality condition.
There is provided an image quality detecting apparatus including:
The positioning module is used for acquiring an image to be processed, and positioning characters in the image to be processed by adopting a character positioning model so as to determine a character area in the image to be processed;
the processing module is used for carrying out Sobel operator convolution processing on the text region in the image to be processed to obtain a target convolution matrix of the text region;
the calculation module is used for acquiring the area information of the text area according to the pixel values of the pixels in the target convolution matrix and the distribution interval of the pixel values;
and the determining module is used for determining the quality detection result of the image to be processed according to the area information of each text area and the preset quality condition.
There is provided a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the above image quality detection method when executing the computer program.
There is provided a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above-described image quality detection method.
In one scheme provided by the above image quality detection method, device, equipment and medium, an image to be processed is acquired, and a text positioning model is used to position the text in the image to be processed so as to determine the text regions in the image to be processed. Sobel operator convolution processing is then performed on the text regions in the image to be processed to obtain a target convolution matrix of each text region, the region information of each text region is acquired according to the pixel values of the pixels in the target convolution matrix and the distribution intervals of those pixel values, and finally the quality detection result of the image to be processed is determined according to the region information of the text regions and a preset quality condition. According to the invention, Sobel operator convolution processing is performed on each text region and the pixel values of the pixels in the sobel result are then processed, so that the edges of the text regions, where the change is obvious, are strengthened in the quality score, while interfering edges in the background, where the degree of change is smaller, are weakened. The interference of image background information can thus be reduced, the precision of the quality detection algorithm is improved, and the accuracy of the image quality is further improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic view of an application environment of an image quality detection method according to an embodiment of the present invention;
FIG. 2 is a flow chart of an image quality detecting method according to an embodiment of the invention;
FIG. 3 is a flowchart illustrating an implementation of step S20 in FIG. 2;
FIG. 4 is a flowchart illustrating an implementation of step S21 in FIG. 3;
FIG. 5 is a flow chart of an image quality detecting method according to an embodiment of the invention;
FIG. 6 is a flowchart illustrating an implementation of step S40 in FIG. 2;
FIG. 7 is a flowchart of step S41 in FIG. 6;
FIG. 8 is a flowchart illustrating an implementation of step S42 in FIG. 6;
FIG. 9 is a flow chart illustrating the acquisition of quality score threshold values according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of an image quality detecting apparatus according to an embodiment of the present invention;
FIG. 11 is a schematic diagram of a computer device according to an embodiment of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The image quality detection method provided by the embodiment of the invention can be applied to an application environment as shown in fig. 1, in which a client communicates with a server through a network. The server acquires the image to be processed sent by a user through the client, and uses a text positioning model to position the text in the image to be processed so as to determine the text regions in the image to be processed. Sobel operator convolution processing is then performed on the text regions in the image to be processed to obtain a target convolution matrix of each text region, the region information of each text region is acquired according to the pixel values of the pixels in the target convolution matrix and the distribution intervals of those pixel values, and finally the quality detection result of the image to be processed is determined according to the region information of the text regions and a preset quality condition. Sobel operator convolution processing is performed on each text region and the pixel values of the pixels in the sobel result are then processed, so that the edges of the text regions, where the change is obvious, are strengthened in the quality score, while interfering edges in the background, where the degree of change is smaller, are weakened. The interference of image background information can thus be reduced, the accuracy of the quality detection algorithm is improved, the accuracy of the image quality is further improved, the degree of intelligence of image quality detection is increased, and the detection efficiency is improved.
The image quality detection method comprises the steps of storing an image to be processed, area information of each text area, target quality score of the image to be processed and other related data in a database of a server, and directly storing acquired, generated and used data in the database when the image quality detection method is executed, so that the image quality detection method is convenient for subsequent use.
The database in this embodiment is stored in a blockchain network and is used for storing the data used and generated in the image quality detection method, such as the image to be processed, the region information of each text region, the target quality score of the image to be processed and other related data. The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain (Blockchain) is essentially a decentralized database, a chain of data blocks generated in association using cryptographic methods, each of which contains the information of a batch of network transactions and is used for verifying the validity (anti-counterfeiting) of that information and generating the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like. Deploying the database in a blockchain can improve the security of the data storage.
The client, also called the user terminal, refers to the program that corresponds to the server and provides local services for the user. The client may be, but is not limited to, various personal computers, notebook computers, smartphones, tablet computers and portable wearable devices. The server may be implemented as a stand-alone server or as a server cluster formed by a plurality of servers.
In one embodiment, as shown in fig. 2, an image quality detection method is provided, and the method is applied to the server in fig. 1, and the method includes the following steps:
s10: and acquiring an image to be processed, and positioning characters in the image to be processed by adopting a character positioning model so as to determine a character area in the image to be processed.
Acquiring an image to be processed, wherein the image to be processed is an image which needs to be subjected to image quality detection to determine whether the image quality is qualified, and after the image to be processed is acquired, firstly adopting a text positioning model based on deep learning to position text parts in the image to be processed so as to determine text areas in the image to be processed. The text areas are independent and separated text areas, and the text parts in the image to be processed are positioned, namely all the separated text areas in the image to be processed are positioned.
The character positioning model is a deep learning model for positioning characters in an image to be processed. After the image to be processed is input into the text positioning model, the text positioning model identifies all the texts in the image to be processed, and then determines the intervals between all the texts, and if the interval between two adjacent texts is larger than the preset interval (such as two characters), the edges of the two texts are one edge of the text region to which each of the two texts belongs, so as to divide all the text regions. Each text region in the image to be processed can be displayed by a highlighted text box so as to determine the number of text regions later.
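For illustration only, the gap-based splitting rule described above can be sketched in Python as a simplified, single-line case; it assumes the text positioning model returns per-word bounding boxes as (x1, y1, x2, y2) tuples, and the function name, box format and the two-character gap threshold are assumptions rather than details disclosed for the model:

from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2)

def split_into_text_regions(word_boxes: List[Box], gap_chars: float = 2.0) -> List[Box]:
    # Merge adjacent word boxes on a line into regions; start a new region when the
    # horizontal gap exceeds roughly gap_chars average character widths
    # (the "two characters" preset interval mentioned above).
    if not word_boxes:
        return []
    boxes = sorted(word_boxes, key=lambda b: b[0])
    char_w = sum(b[3] - b[1] for b in boxes) / len(boxes)  # rough character size: box height as a proxy
    regions, cur = [], list(boxes[0])
    for x1, y1, x2, y2 in boxes[1:]:
        if x1 - cur[2] > gap_chars * char_w:   # gap wider than the preset interval: close the region
            regions.append(tuple(cur))
            cur = [x1, y1, x2, y2]
        else:                                  # otherwise extend the current region
            cur = [min(cur[0], x1), min(cur[1], y1), max(cur[2], x2), max(cur[3], y2)]
    regions.append(tuple(cur))
    return regions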
S20: and carrying out the Sobel operator convolution processing on the text region in the image to be processed to obtain a target convolution matrix of the text region.
After characters in an image to be processed are positioned by adopting a character positioning model to determine character areas in the image to be processed, performing sobel convolution processing on each character area, performing quality detection on horizontal edges and vertical edges of the character areas to obtain sobel convolution matrixes of the character areas in multiple directions, and then performing summarization calculation on sobel convolution data in multiple directions to obtain target convolution matrixes of each character area.
S30: and acquiring the area information of the text area according to the pixel values of the pixels in the target convolution matrix and the distribution interval of the pixel values.
After the target convolution matrix of each text region is obtained, weighting and summing the pixels in the target convolution matrix according to the pixel values of the pixels and the distribution intervals of the pixel values in the target convolution matrix of the text region to determine the quality score of the text region, and using the quality score of the text region to represent the region information of the text region. In other embodiments, the area information of the text area may also be represented by other information, which is not described herein.
Specifically, the distribution situation of the pixel values of each pixel in the target convolution matrix is counted, the distribution interval of the pixel value of each pixel is determined, then the number of the pixel values in different distribution intervals is determined, namely the number of the pixels in different distribution intervals is determined, finally, the number of the pixels in different distribution intervals is weighted and summed, and the weighted and summed result is used as the quality score of the text region. Wherein each distribution section is a preset, continuous and separate section.
The calculation formula of the quality score of the text region is as follows:
score = k1×m1 + k2×m2 + k3×m3 + ... + kN×mN;
Wherein kn is the coefficient corresponding to the n-th distribution interval, mn is the number of pixels whose pixel values fall in the n-th distribution interval, n = 1, 2, ..., N (N being the number of distribution intervals), and score is the quality score of the text region.
For example, the distribution intervals may be [0, 0.1), [0.1, 0.2), [0.2, 0.3), [0.3, 0.4), [0.4, 0.5) and [0.5, +∞), and the coefficients corresponding to these intervals are, in turn, 10⁻⁵, 10⁻⁴, 10⁻³, 10⁻², 10⁻¹ and 10⁰. In the target convolution matrix of a text region, if the number of pixels with pixel values in [0, 0.1) is a, the number in [0.1, 0.2) is b, the number in [0.2, 0.3) is c, the number in [0.3, 0.4) is d, the number in [0.4, 0.5) is e, and the number in [0.5, +∞) is f, then the quality score of the text region is:
score = 10⁻⁵×a + 10⁻⁴×b + 10⁻³×c + 10⁻²×d + 10⁻¹×e + f.
In this embodiment, the distribution intervals [0, 0.1), [0.1, 0.2), [0.2, 0.3), [0.3, 0.4), [0.4, 0.5), [0.5, +∞) and the corresponding coefficients 10⁻⁵, 10⁻⁴, 10⁻³, 10⁻², 10⁻¹, 10⁰ are merely exemplary; in other embodiments, each distribution interval may be another continuous interval and the coefficient corresponding to each interval may take another value, which is not described herein.
And weighting and summing the pixel values of all pixels in the target convolution matrix corresponding to all the character areas and the distribution intervals of the pixel values, so that the quality score of each character area can be obtained to represent the area information of the character area.
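As a minimal, non-limiting sketch of the interval-counting score in step S30, assuming the example intervals and coefficients above and a target convolution matrix res_sobel whose values lie in [0, 1] (the function name is illustrative only):

import numpy as np

INTERVALS = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, np.inf]  # edges of [0, 0.1), ..., [0.5, +inf)
COEFFS = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1.0]         # k1, ..., k6 from the example

def region_quality_score(res_sobel: np.ndarray) -> float:
    # Count how many values of the target convolution matrix fall in each interval (m1, ..., m6),
    # then take the weighted sum: score = sum(kn * mn).
    counts, _ = np.histogram(res_sobel, bins=INTERVALS)
    return float(np.dot(COEFFS, counts))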
S40: and determining a quality detection result of the image to be processed according to the area information of each text area and the preset quality condition.
After the quality detection is carried out on each text region to determine the region information of each text region, the quality of the image to be processed is determined according to the region information of each text region, and then the quality detection result of the image to be processed is determined according to the quality of the image to be processed and the preset quality condition, wherein the quality detection result comprises passing quality detection and failing quality detection. The preset quality condition is a preset condition for evaluating whether the quality of the image to be processed is qualified or not.
The quality of the image to be processed is determined according to the area information of each text area, for example, the median in the quality scores of all text areas is used as the target quality score of the image to be processed to represent the quality of the image to be processed, and the representative quality score median is adopted, so that the quality requirement of the image to be processed can be relaxed, the processing speed can be improved, and the time cost can be reduced; the minimum value in the quality scores of all the text areas can be used as the target quality score of the image to be processed, so that the quality requirement of the image to be processed is improved, the quality auditing strength of the image is enhanced, and the text in the image can be accurately identified; the position of the light source can be determined according to the quality score of each text region, the target quality score of the image to be processed is determined according to the quality score of the text region far from the position of the light source, the possibility of errors is reduced, and the accuracy of the target quality score is improved.
After determining the target quality score of the image to be processed according to the area information of each text area, namely determining the quality of the image to be processed, determining the quality detection result of the image to be processed according to the target quality score and the preset quality condition, wherein the higher the target quality score of the image to be processed is, the clearer the image to be processed is, and when the target quality score of the image to be processed is within a certain quality score range or is higher than a certain threshold value, the better the quality of the image to be processed is, and the image to be processed is an available clear image, namely, the target quality score meets the preset quality condition, and determining that the quality detection result of the image to be processed passes the quality detection; when the target quality score of the image to be processed is not in a certain quality score range or the target quality score of the image to be processed is lower than a certain threshold, the image to be processed is poor in quality and is an unavailable fuzzy image, namely the target quality score does not meet a preset quality condition, and the quality detection result of the image to be processed is determined to be failed in quality detection.
In the embodiment, a character area in an image to be processed is determined by acquiring the image to be processed and positioning characters in the image to be processed by adopting a character positioning model, then Sobel operator convolution processing is carried out on the character area in the image to be processed to acquire a target convolution matrix of the character area, then area information of the character area is acquired according to pixel values of pixels in the target convolution matrix and distribution intervals of the pixel values, and finally a quality detection result of the image to be processed is determined according to the area information of the character area and preset quality conditions; the sobel operator convolution processing is carried out on each text region, and then the weighted summation is carried out on the pixel quantity of each pixel value in different regions in the sobel result, so that the edges of the text region with obvious change are strengthened in the quality score, the interference edges in the background with smaller change degree are weakened, the interference of the image background information can be reduced, the precision of a quality detection algorithm is improved, and the accuracy of the image quality is further improved.
In one embodiment, as shown in fig. 3, in step S20, the sobel operator convolution processing is performed on the text region in the image to be processed, to obtain the target convolution matrix of the text region, which specifically includes the following steps:
S21: and processing the image to be processed according to the positions of the text areas to obtain single-channel images corresponding to the text areas.
If the number of the text areas in the image to be processed is greater than or equal to the preset number, the image to be processed accords with the set text area number requirement, and firstly, the image to be processed is required to be processed according to the positions of the text areas so as to obtain single-channel images corresponding to the text areas.
For example, after a plurality of text regions in an image to be processed are determined, the image to be processed is divided into images corresponding to the text regions according to highlighted text boxes of the text regions. Then, the image corresponding to each text region is converted into a single-channel image (such as a gray level image) so as to detect the image quality of the image corresponding to each text region.
S22: and performing Sobel operator convolution processing on the single-channel image corresponding to each text region to obtain the Sobel operator convolution matrixes of the single-channel image corresponding to the text region in multiple directions.
After obtaining single-channel images corresponding to a plurality of text areas, performing sobel operator (sobel) convolution processing on the single-channel images corresponding to the text areas to obtain sobel operator convolution matrixes of the single-channel images corresponding to the text areas in a plurality of directions.
Specifically, the sobel operator is carried out on the single-channel image corresponding to each text region in the x direction and the y direction respectively, and the convolution result of the edge pixels in the single-channel image corresponding to each text region is set to be 0, so that the sobel operator convolution matrix of the single-channel image corresponding to each text region in the x direction and the y direction is obtained.
Wherein the convolution kernel in the x direction is the standard Sobel horizontal kernel [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], and the convolution kernel in the y direction is the standard Sobel vertical kernel [[-1, -2, -1], [0, 0, 0], [1, 2, 1]].
After the Sobel operator convolution processing is performed on the single-channel image corresponding to each text region, the Sobel operator convolution matrix of the single-channel image in the x direction and the Sobel operator convolution matrix of the single-channel image in the y direction are obtained.
S23: and carrying out root mean square calculation on the sobel operator convolution data in a plurality of directions to obtain a target convolution matrix of the text region.
After the sobel operator convolution matrix of the single-channel image corresponding to the text region in a plurality of directions is obtained, the sobel operator convolution data in the plurality of directions is subjected to root mean square calculation, and the target convolution matrix of the text region is obtained.
For example, after obtaining the sobel operator convolution matrices of the single-channel image corresponding to each text region in the x and y directions, root mean square calculation is performed on the sobel operator convolution data in the x and y directions to obtain the target convolution matrix of the text region. The calculation formula of the target convolution matrix of each text region is as follows:
res_sobel = √((res_sobel_h² + res_sobel_v²) / 2), computed element by element;
Wherein res_sobel is the target convolution matrix of the text region, res_sobel_h is the sobel operator convolution matrix of the corresponding single-channel image in the x direction, and res_sobel_v is the sobel operator convolution matrix of the corresponding single-channel image in the y direction.
The target convolution matrix res_sobel of the text region is a matrix formed by pixels, and each pixel corresponds to a numerical value.
Before the sobel operator (sobel) convolution processing is performed on the single-channel image corresponding to each text region, in order to facilitate the calculation, the pixel value range of each single-channel image is compressed from [0, 255] to [0, 1], that is, all pixel values in the single-channel image are divided by 255. Dividing by 255 converts the integer pixel values into floating-point values, which facilitates the processing, makes the numerical distribution of the pixels in res_sobel more convenient to count later, reduces the error of res_sobel, and keeps the boundaries of the distribution intervals small.
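A minimal sketch of steps S22-S23, under the assumption that OpenCV's built-in Sobel filter can stand in for the operator described above and that the region image is an 8-bit single-channel array (function and variable names are illustrative):

import cv2
import numpy as np

def region_target_matrix(gray_region: np.ndarray) -> np.ndarray:
    img = gray_region.astype(np.float64) / 255.0        # compress pixel values from [0, 255] to [0, 1]
    sobel_h = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)  # sobel convolution in the x direction
    sobel_v = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)  # sobel convolution in the y direction
    for m in (sobel_h, sobel_v):                         # set the convolution result of edge pixels to 0
        m[0, :] = m[-1, :] = 0.0
        m[:, 0] = m[:, -1] = 0.0
    # Element-wise root mean square of the two directional results, giving res_sobel.
    return np.sqrt((sobel_h ** 2 + sobel_v ** 2) / 2.0)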
In this embodiment, the image to be processed is processed according to the positions of the text regions to obtain the single-channel images corresponding to the text regions; sobel operator convolution processing is then performed on the single-channel image corresponding to each text region to obtain the sobel operator convolution matrices of that single-channel image in a plurality of directions; root mean square calculation is then performed on the sobel operator convolution data in the plurality of directions to obtain the target convolution matrix of the text region; finally, the pixels in the target convolution matrix are weighted and summed according to their pixel values and the distribution intervals of the pixel values to determine the region information of the text region. The specific steps of performing quality detection on each text region to determine its region information are thus refined, which provides a basis for the subsequent calculation.
In an embodiment, as shown in fig. 4, in step S21, the image to be processed is processed according to the positions of the text regions to obtain single-channel images corresponding to the text regions, which specifically includes the following steps:
s211: determining whether the image to be processed is a multi-channel image;
s212: if the image to be processed is a multi-channel image, converting the image to be processed into a single-channel image to be processed;
S213: dividing the image to be processed of the single channel according to the position of each text region to obtain a single channel image corresponding to each text region.
Since most of the images are color images, i.e. multi-channel images, the color images are composed of R, G, B channels, each channel is a numerical matrix, and for the following algorithm processing, the multi-channel color images need to be converted into single-channel images.
Therefore, when the image to be processed is processed according to the position of each text region, it is required to determine whether the image to be processed is a multi-channel image, that is, whether the image corresponding to the text region is a multi-channel image, and if the image to be processed is a single-channel image, that is, the image corresponding to the text region is a single-channel image, it means that the image can be directly processed without conversion. If the image to be processed is determined to be a multi-channel image, the image to be processed needs to be converted into a single-channel image to be processed for subsequent processing.
After converting the image to be processed into a single-channel image to be processed, dividing the single-channel image to be processed according to the positions of the text areas to obtain single-channel images corresponding to the text areas.
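The conversion-and-cropping flow of steps S211-S213 can be sketched as follows, assuming an OpenCV BGR colour image and text-region positions given as (x1, y1, x2, y2) boxes; both assumptions and the function name are illustrative:

import cv2
import numpy as np

def crop_single_channel_regions(image: np.ndarray, boxes) -> list:
    # S211/S212: if the image to be processed is a multi-channel (colour) image,
    # convert it to a single-channel image; otherwise use it directly.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) if image.ndim == 3 else image
    # S213: divide the single-channel image according to the position of each text region.
    return [gray[y1:y2, x1:x2] for (x1, y1, x2, y2) in boxes]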
In this embodiment, by determining whether the image to be processed is a multi-channel image, if the image to be processed is a multi-channel image, the image to be processed is converted into a single-channel image to be processed, and the single-channel image to be processed is divided according to the positions of each text region, so as to obtain a single-channel image corresponding to each text region, and the specific process of processing the image to be processed according to the positions of each text region to obtain single-channel images corresponding to a plurality of text regions is clarified, so that the multi-channel image is converted into the single-channel image, thereby facilitating the subsequent image quality detection, accelerating the subsequent image quality detection speed, and improving the image quality detection efficiency.
In one embodiment, as shown in fig. 5, after determining the text region in the image to be processed, before step S20, that is, before performing the sobel operator convolution processing on the text region in the image to be processed, the method specifically includes the following steps:
S11: and determining whether the number of the text areas in the image to be processed is smaller than a preset number.
After determining the text regions in the image to be processed, before performing the sobel operator convolution processing on the text regions in the image to be processed, it is required to determine whether the number of the text regions in the image to be processed is smaller than a preset number. Because the text positioning model can not position text areas in the blurred image, if the image to be processed is the blurred image, the text areas positioned by the text positioning model are fewer or almost none. Therefore, in order to reduce the subsequent calculation amount and the time cost caused by processing the text regions, it is necessary to determine whether the number of text regions in the image to be processed is smaller than the preset number, so as to determine whether the image to be processed needs to be read for further quality detection according to the determination result.
S12: and if the number of the text areas in the image to be processed is greater than or equal to the preset number, performing the Sobel operator convolution processing on the text areas in the image to be processed.
After determining whether the number of the text regions in the image to be processed is smaller than the preset number, if the number of text regions is greater than or equal to the preset number, the image to be processed meets the set requirement on the number of text regions, text recognition can be performed on it, and, if its quality is qualified, the image to be processed can be determined to be a clear image. At this time, the sobel operator convolution processing needs to be performed on the text regions in the image to be processed to determine the region information of each text region.
S13: if the number of the text areas in the image to be processed is smaller than the preset number, determining that the quality detection result of the image to be processed is not passing the quality detection.
If the number of the text regions in the image to be processed is smaller than the preset number, the image to be processed does not meet the set requirement on the number of text regions, text recognition is difficult to perform on it, and the quality of the image to be processed is poor; the image to be processed is therefore determined to be a blurred image, and its quality detection result is determined to be failing the quality detection. In this way, very blurred pictures can be filtered out, the subsequent calculation steps are reduced, and the image quality detection speed is increased.
The preset number is required to be determined according to the type of the image to be processed, namely, according to the service scene setting to which the image to be processed belongs, and different service scenes have different text area setting numbers, namely, the preset number values in different scenes are different.
For example, if the image to be processed is an invoice image, then in the invoice OCR recognition scene all the separate text regions on the invoice image are positioned. After each text region in the invoice is determined, if the number of text regions in the invoice image is less than 10 (i.e. the preset number for the invoice OCR recognition scene is 10), it is determined that the invoice image does not meet the set requirement on the number of text regions and text recognition cannot be performed on it; the invoice image quality is poor, the invoice is determined to be a blurred invoice, the invoice image is judged to be unavailable, and a new invoice image needs to be uploaded. If the image to be processed is an identity card image, all the separate text regions on the identity card image are positioned in the identity card OCR recognition scene; since there is less text on an identity card, the number of corresponding text regions is smaller, and the preset number for the front image of the identity card may be 4. After each text region in the front image of the identity card is determined, if the number of text regions in the front image is smaller than 4, it is determined that the front image does not meet the set requirement on the number of text regions and text recognition cannot be performed on it; the quality of the front image is poor, the front image is determined to be a blurred image, it is judged to be unavailable, and a new front image of the identity card needs to be uploaded.
In this embodiment, the service scene to which the image to be processed belongs is an invoice OCR recognition scene and an identity card OCR recognition scene, which are only described as examples, and in other embodiments, the service scene to which the image to be processed belongs may also be other scenes, which are not described herein.
In this embodiment, the preset number in the invoice OCR recognition scene is 10, and the preset number in the identification card OCR recognition scene is 4, which is only an exemplary illustration, and in other embodiments, the preset number may be other, which is not described herein.
In this embodiment, before performing convolution processing on the text regions in the image to be processed by using the sobel operator, determining whether the number of the text regions in the image to be processed is smaller than a preset number, and if the number of the text regions in the image to be processed is greater than or equal to the preset number, performing convolution processing on the text regions in the image to be processed by using the sobel operator; if the number of the text areas in the image to be processed is smaller than the preset number, determining that the quality detection result of the image to be processed does not pass the quality detection, and judging the number of the text areas to be processed can be used for primarily judging the image quality of the image to be processed, so that the subsequent calculation steps are reduced, and the image quality detection speed is increased.
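As an illustrative sketch of the gate in steps S11-S13, assuming the per-scene minimum region counts from the examples above (the values and scene keys are examples, not fixed by the method):

MIN_TEXT_REGIONS = {"invoice": 10, "id_card_front": 4}  # assumed per-scene preset numbers

def passes_region_count_check(num_text_regions: int, scene: str) -> bool:
    # S12/S13: fail quality detection outright when too few text regions were located;
    # otherwise continue with the Sobel-based scoring.
    return num_text_regions >= MIN_TEXT_REGIONS.get(scene, 1)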
In one embodiment, as shown in fig. 6, in step S40, a quality detection result of the image to be processed is determined according to the area information of each text area and the preset quality condition, and the method specifically includes the following steps:
S41: and determining the quality of the image to be processed according to the area information of the character areas and the Euclidean distance between each character area and the light source.
The position of the light source can be determined according to the area information of each text area, then the Euclidean distance between each text area and the light source is determined, and then the target quality score of the image to be processed is determined according to the area information of the text area and the Euclidean distance between each text area and the light source, so that the quality of the image to be processed is represented by adopting the target quality score.
For example, a text region having the largest euclidean distance to the light source may be determined as a target region, that is, a text region farthest from the light source is a target region, and a quality score of the target region is taken as a target quality score of the image to be processed; the plurality of text regions having the largest euclidean distance from the light source may be selected as the plurality of target text regions, and then the median of the quality scores of the plurality of target text regions, or the average of the quality scores of the plurality of target text regions, may be used as the target quality score of the image to be processed.
It can be understood that the further the Euclidean distance between the text region and the light source is, the lower the quality score of the text region is, and the target quality score of the image to be processed is determined according to the Euclidean distance between each text region and the light source, so that the possibility of calculating errors of the quality score of part of text regions can be reduced, the errors are further reduced, and the accuracy of the target quality score is improved, namely the accuracy of the quality of the image to be processed is improved.
In other embodiments, the median in the region information of all text regions is also used as a target quality score of the image to be processed to characterize the quality of the image to be processed. After the quality score of each text region is determined, the median in the quality scores of all the text regions can be used as the target quality score of the image to be processed, and the representative median in the quality score can be used for relaxing the quality requirement of the image to be processed, improving the processing speed and reducing the time cost.
S42: and determining a quality detection result of the image to be processed according to the quality of the image to be processed and a preset quality condition.
After determining the quality of the image to be processed according to the area information of the text areas and the Euclidean distance between each text area and the light source, acquiring a preset quality condition, determining whether the quality of the image to be processed meets the preset quality condition, and if so, determining that the image to be processed passes the quality detection, namely that the quality detection result is the passing quality detection; if the quality detection result does not meet the quality detection requirement, determining that the image to be processed fails the quality detection, namely, determining that the quality detection result is that the image to be processed fails the quality detection.
In this embodiment, the median in the area information of all the text areas is used as the target quality score of the image to be processed; or determining the quality of the image to be processed according to the area information of the character areas and the Euclidean distance between each character area and the light source, and determining the quality detection result of the image to be processed according to the quality of the image to be processed and the preset quality condition, so that the step of determining the quality detection result of the image to be processed according to the area information of each character area and the preset quality condition is clarified, the accuracy of the quality of the image to be processed is improved, and the accuracy of the quality detection result is further improved.
In one embodiment, as shown in fig. 7, in step S41, the quality of the image to be processed is determined according to the area information of the text areas and the euclidean distance between each text area and the light source, and the method specifically includes the following steps:
s411: and determining the position of the light source according to the area information of each text area.
In this embodiment, the area information is taken as an example of the quality score, and after the quality score of each text area is obtained, a light source matrix formed by all text areas is constructed according to the quality score of each text area to determine the position of the light source, and generally, the farther the text area is from the light source, the lower the quality score of the text area is. This step may be obtained by calculation from associated software.
S412: and determining the Euclidean distance between each character area and the light source according to the position of each character area and the position of the light source.
After the position of the light source is determined according to the area information of each character area, the Euclidean distance between each character area and the light source is determined according to the position of each character area and the position of the light source.
S413: and taking the text area with the Euclidean distance smaller than the preset distance from the light source as a target text area.
After determining the Euclidean distance between each text region and the light source, the text regions whose Euclidean distance is smaller than the preset distance are determined as target text regions, so as to obtain a plurality of target text regions.
S414: and determining the quality of the image to be processed according to the area information of the plurality of target text areas.
After the plurality of target text regions are obtained, the median of the quality scores of the plurality of target text regions can be used as the target quality score of the image to be processed, which relaxes the requirement on the target quality score of the image to be processed, and the quality detection result of the image to be processed is then determined. Alternatively, the maximum value of the quality scores of the plurality of target text regions can be used as the target quality score of the image to be processed, so as to raise the threshold of quality detection.
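A heavily hedged sketch of steps S411-S414 is given below. The text leaves the light-source position to associated software; purely as an assumption for illustration, the centre of the highest-scoring region is used as a stand-in for that position, and all names and the choice of the median as the aggregate are illustrative:

import numpy as np

def image_quality_from_regions(centres: np.ndarray, scores: np.ndarray, max_dist: float) -> float:
    # Assumed stand-in for S411: take the centre of the highest-scoring region as the light-source position.
    light_pos = centres[int(np.argmax(scores))]
    # S412: Euclidean distance from each text region to the light source.
    dists = np.linalg.norm(centres - light_pos, axis=1)
    # S413: keep the regions whose distance is smaller than the preset distance.
    target = scores[dists < max_dist]
    # S414: use the median of the target regions' quality scores as the image's target quality score.
    return float(np.median(target)) if target.size else float(np.median(scores))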
In this embodiment, the position of the light source is determined by determining the area information of each text area, then the euclidean distance between each text area and the light source is determined according to the position of each text area and the position of the light source, then the text area with the euclidean distance smaller than the preset distance with the light source is used as the target text area, and finally the quality of the image to be processed is determined according to the area information of a plurality of target text areas, so that the specific process of determining the quality of the image to be processed according to the area information of the text areas and the euclidean distance between each text area and the light source is clarified, and the condition that the target quality score is inaccurate due to the occurrence of errors of the area information of part of text areas can be reduced.
In one embodiment, as shown in fig. 8, in step S42, a quality detection result of the image to be processed is determined according to the quality of the image to be processed and a preset quality condition, and the method specifically includes the following steps:
S421: and determining a preset quality condition according to the type of the image to be processed.
After the image to be processed is acquired, a preset quality condition for evaluating the quality of the image to be processed needs to be determined according to the type of the image to be processed. In the embodiment, the quality scoring threshold value is used for representing the preset quality condition, so that the quality scoring method is visual and convenient, reduces the information processing complexity in the quality detection process, and further improves the quality detection speed.
Different types of images have different quality score thresholds. For example, if the image to be processed is an invoice image, the quality score threshold of the image to be processed is a quality score threshold corresponding to the invoice image; and if the image to be processed is the identity card image, the quality grading threshold of the image to be processed is the quality grading threshold corresponding to the identity card image.
The quality score threshold is a predetermined score threshold, and may be an empirical value set manually, or may be determined according to quality scores of images of the same type.
S422: and determining whether the quality of the image to be processed meets a preset quality condition.
After determining the quality of the image to be processed, it is necessary to determine whether the quality of the image to be processed satisfies the preset quality condition. When the target quality score characterizes the quality of the image to be processed and the quality score threshold characterizes the preset quality condition, this means determining whether the target quality score of the image to be processed is larger than the quality score threshold, so that whether the image to be processed meets the preset quality condition is determined according to the comparison between the target quality score and the quality score threshold.
S423: if the quality of the processed image meets the preset quality condition, determining that the quality detection result of the image to be processed is passing the quality detection.
After determining whether the target quality score of the image to be processed is greater than the quality score threshold, if the target quality score of the image to be processed is greater than the quality score threshold, the target quality score of the image to be processed is higher, the quality of the image to be processed is better, the image to be processed is determined to be a clear image, the image to be processed is available, the image to be processed meets the preset quality condition, and the quality detection result of the image to be processed is determined to pass the quality detection.
S424: if the quality of the processed image meets the preset quality condition, determining that the quality detection result of the image to be processed is not passing the quality detection.
After determining whether the target quality score of the image to be processed is greater than the quality score threshold, if the target quality score is smaller than or equal to the quality score threshold, this means that the target quality score of the image to be processed is low and its quality is poor; the image to be processed is determined to be a blurred image and is not available, it does not meet the preset quality condition, it is determined that the image to be processed does not pass the quality detection, and a new image needs to be uploaded.
In this embodiment, a preset quality condition is determined according to the type of the image to be processed, and whether the quality of the image to be processed meets the preset quality condition is determined; if the quality of the image to be processed meets the preset quality condition, the quality detection result of the image to be processed is determined to be passing the quality detection, and if it does not meet the preset quality condition, the quality detection result is determined to be not passing the quality detection. This defines the specific process of determining the quality detection result of the image to be processed according to the quality of the image to be processed and the preset quality condition; different preset quality conditions are set for different types of images, and whether the quality of an image meets the requirement is then determined according to the corresponding preset quality condition, which further improves the accuracy of the quality detection result of the image.
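For illustration, the comparison in steps S421-S424 reduces to a simple per-type threshold check; the threshold values in the table below are placeholder assumptions only, the real thresholds being determined per image type as described in the following embodiment:

QUALITY_SCORE_THRESHOLDS = {"invoice": 50.0, "id_card_front": 30.0}  # assumed placeholder values

def passes_quality_detection(target_score: float, image_type: str) -> bool:
    # S422-S424: pass when the target quality score exceeds the type-specific quality score threshold.
    return target_score > QUALITY_SCORE_THRESHOLDS.get(image_type, 0.0)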
In an embodiment, as shown in fig. 9, before determining the preset quality condition of the image to be processed according to the type of the image to be processed, a quality score threshold of the image to be processed of different types needs to be determined to characterize the preset quality condition by the quality score threshold, where the quality score threshold of the image to be processed is determined by:
s01: a plurality of historical images with different quality labels are acquired.
And obtaining a plurality of historical images of the same type, and manually labeling the plurality of historical images of the same type to mark the plurality of historical images of the same type with quality labels, so as to obtain a plurality of historical images with different quality labels. The type of the historical image is the same as the type of the image to be processed, and the quality label on the historical image comprises a clear image and a blurred image.
S02: and determining the target quality score of each historical image according to the quality score of the text region in each historical image.
After a plurality of history images with different quality labels are acquired, the quality score of the text region in each history image is determined according to the processing procedure described above, and then the target quality score of each history image is determined.
S03: and determining the image quality of each historical image under a plurality of different preset values according to the target quality score of each historical image and the magnitude of the preset value.
After the target quality score of each history image is obtained, a plurality of preset values are set, and the target quality score of each history image is compared with each preset value one by one to determine the image quality of each history image under a plurality of different preset values. Wherein the image quality includes sharpness and blur.
For example, a preset value is selected from the plurality of preset values, and the target quality score of each historical image is compared with that preset value; if the target quality score of a historical image is smaller than or equal to the preset value, the quality of that historical image is determined to be blurred, and if the target quality score is larger than the preset value, its quality is determined to be clear. Each preset value is then selected in turn and the process is repeated, so that the image quality of each historical image under each preset value is determined.
The plurality of different preset values may be a series of consecutive values determined according to the number of the history images. For example, if the number of history images is 500, the preset values may be taken from 0 up to 500 in steps of 0.1, that is 0, 0.1, 0.2, 0.3, ..., 499.8, 499.9, 500, and each of the resulting 5001 values is used as a preset value in turn.
In this embodiment, the values of the preset values are only exemplary, and in other embodiments, the preset values may be other continuous values.
S04: and determining a quality scoring threshold value in a plurality of different preset values according to the matching condition of the image quality of each historical image under different preset values and the quality label.
After determining the image quality of each historical image under each preset value, determining a quality scoring threshold value in a plurality of different preset values according to the matching condition of the image quality of each historical image under each preset value and the quality label.
Specifically, the image quality of each historical image under each preset value and the matching quantity of the quality labels of each historical image are determined, so that the matching quantity of the image quality of each historical image under different preset values and the quality labels is obtained, and one preset value with the largest matching quantity is selected as a quality scoring threshold.
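The selection of the quality score threshold described in steps S01 to S04 can be illustrated with a short Python sketch. It is an illustration only, under assumptions not stated in this embodiment: the target quality scores and quality labels are taken to be plain lists, and the candidate preset values are generated in steps of 0.1 as in the 500-image example above.

import numpy as np

def select_quality_score_threshold(target_scores, quality_labels, step=0.1):
    # Candidate preset values run from 0 up to the number of historical images
    # in steps of `step`, mirroring the 500-image example above.
    scores = np.asarray(target_scores, dtype=float)
    labels = np.asarray(quality_labels)
    candidates = np.arange(0.0, len(scores) + step, step)

    best_threshold, best_matches = float(candidates[0]), -1
    for preset in candidates:
        # Image quality under this preset value: clear if the target quality
        # score is larger than the preset value, blurred otherwise.
        predicted = np.where(scores > preset, "clear", "blurred")
        matches = int(np.sum(predicted == labels))
        if matches > best_matches:
            best_threshold, best_matches = float(preset), matches
    return best_threshold

Because every candidate value is tested against the manual labels, the returned preset value is the one whose clear/blurred split agrees with the largest number of labelled historical images, which is the matching criterion used in step S04.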
In this embodiment, a plurality of historical images with different quality labels are obtained, where the quality labels include clear image and blurred image and the type of the historical images is the same as the type of the image to be processed. A target quality score of each historical image is then determined according to the quality score of the text region in each historical image. Next, the image quality of each historical image under a plurality of different preset values is determined according to the target quality score of each historical image and each preset value, where the image quality is either clear or blurred. Finally, the quality score threshold is selected from the plurality of different preset values according to how well the image quality of each historical image under different preset values matches the quality labels. This defines a specific process for determining the quality score threshold of the image to be processed; because the threshold is chosen according to the agreement between the image quality of each historical image and its quality label, the accuracy of the quality score threshold is improved, which in turn improves the accuracy of the image quality determined according to that threshold.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an execution order; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
In one embodiment, an image quality detection apparatus is provided, which corresponds one-to-one to the image quality detection method in the above embodiments. As shown in fig. 10, the image quality detection apparatus includes a positioning module 101, a processing module 102, a calculating module 103, and a determining module 104. The functional modules are described in detail as follows:
The positioning module 101 is configured to obtain an image to be processed, and position characters in the image to be processed by using a character positioning model to determine a character area in the image to be processed;
the processing module 102 is configured to perform sobel operator convolution processing on a text region in the image to be processed, and obtain a target convolution matrix of the text region;
a calculation module 103, configured to obtain area information of the text area according to a pixel value of a pixel in the target convolution matrix and a distribution interval of the pixel value;
And the determining module 104 is configured to determine a quality detection result of the image to be processed according to the area information of each text area and a preset quality condition.
Further, the processing module 102 is specifically configured to:
processing the image to be processed according to the positions of the text areas to obtain a plurality of single-channel images corresponding to the text areas;
Performing Sobel operator convolution processing on the single-channel image corresponding to each text region to obtain a Sobel operator convolution matrix of the single-channel image corresponding to the text region in multiple directions;
And carrying out root mean square calculation on the Sobel operator convolution data in the directions to obtain a target convolution matrix of the text region.
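As a hedged illustration of this processing path, the sketch below applies the Sobel operator to a single-channel text region in the horizontal and vertical directions and combines the two directional convolution matrices by a per-pixel root mean square. The choice of exactly two directions and the OpenCV 3x3 kernel size are assumptions, since the embodiment only speaks of multiple directions.

import cv2
import numpy as np

def target_convolution_matrix(region):
    # Sobel convolution of a single-channel text region in the horizontal and
    # vertical directions (assumed here to be the "multiple directions").
    grad_x = cv2.Sobel(region, cv2.CV_64F, 1, 0, ksize=3)
    grad_y = cv2.Sobel(region, cv2.CV_64F, 0, 1, ksize=3)
    # Per-pixel root mean square of the two directional convolution matrices.
    return np.sqrt((grad_x ** 2 + grad_y ** 2) / 2.0)

Using a per-pixel root mean square keeps the result the same shape as the region, so the later statistics over pixel values and their distribution intervals can be computed directly on the returned matrix.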
Further, the processing module 102 is specifically further configured to:
determining whether the image to be processed is a multi-channel image;
If the image to be processed is the multi-channel image, converting the image to be processed into a single-channel image to be processed;
Dividing the image to be processed of the single channel according to the position of each text region to obtain a single channel image corresponding to each text region.
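A minimal sketch of this branch is given below, assuming the image is an OpenCV BGR array and that the text positioning model returns each region as an (x, y, w, h) box; these names are illustrative only.

import cv2

def single_channel_regions(image, text_boxes):
    # Convert a 3-channel (BGR) image to a single channel; a single-channel
    # image is passed through unchanged.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) if image.ndim == 3 else image
    # Crop one single-channel sub-image per text region box.
    return [gray[y:y + h, x:x + w] for (x, y, w, h) in text_boxes]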
Further, the determining module 104 is specifically configured to:
determining the quality of the image to be processed according to the area information of the character areas and the Euclidean distance between each character area and the light source;
And determining a quality detection result of the image to be processed according to the quality of the image to be processed and the preset quality condition.
Further, the determining module 104 is specifically further configured to:
determining the position of a light source according to the area information of each text area;
determining the Euclidean distance between each text region and the light source according to the position of each text region and the position of the light source;
taking the text areas whose Euclidean distance to the light source is smaller than a preset distance as target text areas;
and determining the quality of the image to be processed according to the area information of the plurality of target text areas.
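The distance-based filtering can be sketched as follows. The region centres, the estimated light source position, and the preset distance are assumed inputs; how the light source position is derived from the area information is not detailed in this section, so the sketch simply takes it as given.

import numpy as np

def target_text_regions(region_centers, light_source_pos, preset_distance):
    # Keep the indices of the text regions whose Euclidean distance to the
    # light source is smaller than the preset distance.
    light = np.asarray(light_source_pos, dtype=float)
    targets = []
    for idx, center in enumerate(region_centers):
        distance = float(np.linalg.norm(np.asarray(center, dtype=float) - light))
        if distance < preset_distance:
            targets.append(idx)
    return targets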
Further, the determining module 104 is specifically further configured to:
Determining the preset quality condition according to the type of the image to be processed;
determining whether the quality of the image to be processed meets the preset quality condition;
If the quality of the image to be processed meets the preset quality condition, determining that the quality detection result of the image to be processed is passing quality detection;
If the quality of the image to be processed does not meet the preset quality condition, determining that the quality detection result of the image to be processed is not passing the quality detection.
Further, before the text region in the image to be processed is processed by using the Sobel operator, the determining module 104 is specifically configured to:
determining whether the number of the text areas in the image to be processed is smaller than a preset number;
if the number of the text areas in the image to be processed is greater than or equal to the preset number, a Sobel operator is adopted to process the text areas in the image to be processed;
If the number of the text areas in the image to be processed is smaller than the preset number, determining that the quality detection result of the image to be processed is not passing the quality detection.
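A compact sketch of this pre-check is given below, with `preset_number` standing in for the preset number of text regions (an assumed parameter name).

def precheck_text_region_count(text_boxes, preset_number):
    # Run the Sobel processing only when the image contains at least the
    # preset number of text regions; otherwise it fails quality detection.
    return "run_sobel" if len(text_boxes) >= preset_number else "not_passing"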
For specific limitations of the image quality detection apparatus, reference may be made to the above limitations of the image quality detection method, and details are not repeated here. The respective modules in the above image quality detection apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor of the computer device in the form of hardware, or may be stored in a memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a server, and the internal structure of which may be as shown in fig. 11. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing data such as images to be processed, text areas, quality scores, quality score thresholds and the like. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement an image quality detection method.
In one embodiment, a computer device is provided comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the steps of when executing the computer program:
Acquiring an image to be processed, and positioning characters in the image to be processed by adopting a character positioning model so as to determine a character area in the image to be processed;
Performing Sobel operator convolution processing on a character area in the image to be processed to obtain a target convolution matrix of the character area;
Acquiring the area information of the text area according to the pixel values of the pixels in the target convolution matrix and the distribution interval of the pixel values;
and determining a quality detection result of the image to be processed according to the area information of each text area and a preset quality condition.
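The step of obtaining area information from the pixel values and their distribution intervals is only summarised in this section, so the sketch below should be read as one plausible interpretation rather than the patented rule: it treats the mean edge response of the region, together with the share of pixels falling into an assumed high-value interval of the target convolution matrix, as the region's area information.

import numpy as np

def region_area_information(target_matrix, high_value=100.0):
    # One plausible reading of "area information": the overall edge response
    # of the region plus the share of pixels whose value falls in an assumed
    # high-value interval of the target convolution matrix.
    values = target_matrix.ravel()
    return {
        "mean_response": float(values.mean()),
        "strong_ratio": float(np.count_nonzero(values >= high_value)) / values.size,
    }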
In one embodiment, a computer readable storage medium is provided having a computer program stored thereon, which when executed by a processor, performs the steps of:
Acquiring an image to be processed, and positioning characters in the image to be processed by adopting a character positioning model so as to determine a character area in the image to be processed;
Performing Sobel operator convolution processing on a character area in the image to be processed to obtain a target convolution matrix of the character area;
Acquiring the area information of the text area according to the pixel values of the pixels in the target convolution matrix and the distribution interval of the pixel values;
and determining a quality detection result of the image to be processed according to the area information of each text area and a preset quality condition.
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program stored in a non-volatile computer readable storage medium; when executed, the program may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. The non-volatile memory may include Read-Only Memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. The volatile memory may include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above division of functional units and modules is used as an example for illustration. In practical applications, the above functions may be allocated to different functional units and modules as needed, that is, the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the functions described above.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims (8)

1. An image quality detection method, comprising:
Acquiring an image to be processed, and positioning characters in the image to be processed by adopting a character positioning model so as to determine a character area in the image to be processed;
Performing Sobel operator convolution processing on a character area in the image to be processed to obtain a target convolution matrix of the character area;
Acquiring the area information of the text area according to the pixel values of the pixels in the target convolution matrix and the distribution interval of the pixel values;
determining the quality of the image to be processed according to the area information of the character areas and the Euclidean distance between each character area and the light source;
determining a quality detection result of the image to be processed according to the quality of the image to be processed and a preset quality condition;
Performing sobel operator convolution processing on a text region in the image to be processed to obtain a target convolution matrix of the text region, including:
processing the image to be processed according to the positions of the text areas to obtain a plurality of single-channel images corresponding to the text areas;
Performing Sobel operator convolution processing on the single-channel image corresponding to each text region to obtain a Sobel operator convolution matrix of the single-channel image corresponding to the text region in multiple directions;
And carrying out root mean square calculation on the Sobel operator convolution data in the directions to obtain a target convolution matrix of the text region.
2. The method for detecting image quality according to claim 1, wherein said processing the image to be processed according to the position of each text region to obtain a plurality of single-channel images corresponding to the text regions comprises:
determining whether the image to be processed is a multi-channel image;
If the image to be processed is the multi-channel image, converting the image to be processed into a single-channel image to be processed;
Dividing the image to be processed of the single channel according to the position of each text region to obtain a single channel image corresponding to each text region.
3. The image quality detecting method according to claim 1, wherein the determining the quality of the image to be processed based on the area information of the text areas, the euclidean distance of each of the text areas from the light source comprises:
determining the position of a light source according to the area information of each text area;
determining the Euclidean distance between each text region and the light source according to the position of each text region and the position of the light source;
taking the text area with the Euclidean distance smaller than the preset distance with the light source as a target text area;
and determining the quality of the image to be processed according to the area information of the plurality of target text areas.
4. The image quality detection method according to claim 1, wherein the determining the quality detection result of the image to be processed according to the quality of the image to be processed and the preset quality condition includes:
determining a preset quality condition of the image to be processed according to the type of the image to be processed;
determining whether the quality of the image to be processed meets the preset quality condition;
If the quality of the image to be processed meets the preset quality condition, determining that the quality detection result of the image to be processed is passing quality detection;
If the quality of the image to be processed does not meet the preset quality condition, determining that the quality detection result of the image to be processed is not passing the quality detection.
5. The method for detecting image quality according to any one of claims 1 to 4, wherein before the text region in the image to be processed is processed by using the sobel operator, the method further comprises:
determining whether the number of the text areas in the image to be processed is smaller than a preset number;
If the number of the text areas in the image to be processed is greater than or equal to the preset number, performing Sobel operator convolution processing on the text areas in the image to be processed;
If the number of the text areas in the image to be processed is smaller than the preset number, determining that the quality detection result of the image to be processed is not passing the quality detection.
6. An image quality detecting apparatus, comprising:
The positioning module is used for acquiring an image to be processed, and positioning characters in the image to be processed by adopting a character positioning model so as to determine a character area in the image to be processed;
the processing module is used for carrying out Sobel operator convolution processing on the text region in the image to be processed to obtain a target convolution matrix of the text region;
the calculation module is used for acquiring the area information of the text area according to the pixel values of the pixels in the target convolution matrix and the distribution interval of the pixel values;
the determining module is used for determining the quality of the image to be processed according to the area information of the character areas and the Euclidean distance between each character area and the light source; determining a quality detection result of the image to be processed according to the quality of the image to be processed and a preset quality condition;
Performing sobel operator convolution processing on a text region in the image to be processed to obtain a target convolution matrix of the text region, including:
processing the image to be processed according to the positions of the text areas to obtain a plurality of single-channel images corresponding to the text areas;
Performing Sobel operator convolution processing on the single-channel image corresponding to each text region to obtain a Sobel operator convolution matrix of the single-channel image corresponding to the text region in multiple directions;
And carrying out root mean square calculation on the Sobel operator convolution data in the directions to obtain a target convolution matrix of the text region.
7. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the image quality detection method according to any one of claims 1 to 5 when the computer program is executed.
8. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the image quality detection method according to any one of claims 1 to 5.
CN202110694356.XA 2021-06-22 2021-06-22 Image quality detection method, device, equipment and medium Active CN113409278B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110694356.XA CN113409278B (en) 2021-06-22 2021-06-22 Image quality detection method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110694356.XA CN113409278B (en) 2021-06-22 2021-06-22 Image quality detection method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN113409278A CN113409278A (en) 2021-09-17
CN113409278B true CN113409278B (en) 2024-04-26

Family

ID=77682523

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110694356.XA Active CN113409278B (en) 2021-06-22 2021-06-22 Image quality detection method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN113409278B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116797248B (en) * 2023-08-22 2024-01-30 厦门瞳景智能科技有限公司 Data traceability management method and system based on block chain

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005057496A (en) * 2003-08-04 2005-03-03 Sharp Corp Image processor, image forming apparatus equipped with the image processor, image processing method, image processing program, and computer-readable recording medium
JP2007156918A (en) * 2005-12-06 2007-06-21 Sharp Corp Character recognition apparatus, character recognition method, character recognition program, and recording medium
CN103679169A (en) * 2012-09-10 2014-03-26 华为技术有限公司 Method and device for extracting image features
CN109492643A (en) * 2018-10-11 2019-03-19 平安科技(深圳)有限公司 Certificate recognition methods, device, computer equipment and storage medium based on OCR
CN110728272A (en) * 2019-09-02 2020-01-24 平安普惠企业管理有限公司 Method for inputting certificate information based on OCR and related device
CN111680690A (en) * 2020-04-26 2020-09-18 泰康保险集团股份有限公司 Character recognition method and device
CN111986155A (en) * 2020-07-20 2020-11-24 广州润普网络科技有限公司 Universal scene character recognition method and device
WO2020233270A1 (en) * 2019-05-20 2020-11-26 京东方科技集团股份有限公司 Bill analyzing method and analyzing apparatus, computer device and medium
CN112668640A (en) * 2020-12-28 2021-04-16 泰康保险集团股份有限公司 Text image quality evaluation method, device, equipment and medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006253800A (en) * 2005-03-08 2006-09-21 Fuji Xerox Co Ltd Image processing apparatus, image processing method and program

Also Published As

Publication number Publication date
CN113409278A (en) 2021-09-17

Similar Documents

Publication Publication Date Title
CN109492643B (en) Certificate identification method and device based on OCR, computer equipment and storage medium
AU2020200058B2 (en) Image quality assessment and improvement for performing optical character recognition
WO2021027336A1 (en) Authentication method and apparatus based on seal and signature, and computer device
CN109993201B (en) Image processing method, device and readable storage medium
CN112329779B (en) Method and related device for improving certificate identification accuracy based on mask
US11599784B2 (en) Signal processing device, signal processing method, and computer program product
CN111985469B (en) Method and device for recognizing characters in image and electronic equipment
CN111210402A (en) Face image quality scoring method and device, computer equipment and storage medium
CN110942067A (en) Text recognition method and device, computer equipment and storage medium
CN110969154A (en) Text recognition method and device, computer equipment and storage medium
CN113011450B (en) Training method, training device, recognition method and recognition system for glaucoma recognition
CN111199186A (en) Image quality scoring model training method, device, equipment and storage medium
CN113409278B (en) Image quality detection method, device, equipment and medium
CN112669412A (en) Certificate picture generation method, device, equipment and storage medium
CN112232336A (en) Certificate identification method, device, equipment and storage medium
CN113780365B (en) Sample generation method and device
CN113283388B (en) Training method, device, equipment and storage medium of living body face detection model
CN112508000A (en) Method and equipment for generating OCR image recognition model training data
CN115439850A (en) Image-text character recognition method, device, equipment and storage medium based on examination sheet
CN113870280A (en) Methods, devices and media for predicting karyotype classes of cell-based antibodies
CN112926603A (en) Music score recognition method, device, equipment and storage medium
CN112364856A (en) Method and device for identifying copied image, computer equipment and storage medium
CN111179226A (en) Visual field map identification method and device and computer storage medium
CN114943865B (en) Target detection sample optimization method based on artificial intelligence and related equipment
CN117636363A (en) Document image quality scoring method, device, computer equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant