CN114399495A - Image definition calculation method, device, equipment and storage medium - Google Patents


Info

Publication number
CN114399495A
CN114399495A (application CN202210044215.8A)
Authority
CN
China
Prior art keywords
image
calculation
recognized
region
definition
Prior art date
Legal status
Pending
Application number
CN202210044215.8A
Other languages
Chinese (zh)
Inventor
陈昊
Current Assignee
Ping An Puhui Enterprise Management Co Ltd
Original Assignee
Ping An Puhui Enterprise Management Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Puhui Enterprise Management Co Ltd filed Critical Ping An Puhui Enterprise Management Co Ltd
Priority to CN202210044215.8A
Publication of CN114399495A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/0002: Inspection of images, e.g. flaw detection
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10024: Color image
    • G06T 2207/30: Subject of image; Context of image processing
    • G06T 2207/30168: Image quality inspection

Landscapes

  • Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The application provides an image sharpness calculation method, apparatus, and device, and a computer-readable storage medium. The method includes: acquiring an image to be recognized; inputting the image to be recognized into an error prediction submodel of a sharpness calculation model to obtain a feature map and an error map; inputting the image to be recognized into a smooth region calculation submodel to obtain a smooth region; inputting the image to be recognized into a saliency prediction submodel to obtain a salient region; performing nonlinear processing on the feature map to obtain a first reference map; superposing the salient region and the smooth region to obtain a second reference map; and determining the sharpness of the image to be recognized according to the error map, the first reference map, and the second reference map based on a sharpness calculation network. By calculating the error map, the first reference map, and the second reference map, sharpness can be computed for local parts of the image, which improves the accuracy of the sharpness calculation. The application also relates to blockchain technology: the sharpness calculation model can be stored in a blockchain.

Description

Image definition calculation method, device, equipment and storage medium
Technical Field
The present application relates to the field of computer image processing technologies, and in particular, to a method, an apparatus, a device, and a computer readable storage medium for calculating image sharpness.
Background
When performing character recognition or other feature recognition on an image, the sharpness of the input image strongly affects the recognition result, so the sharpness of the input image needs to be assessed in advance to avoid recognition errors. Moreover, most existing image sharpness assessment methods rely on the global information of the image and may overlook local blurring, which leads to large errors in the computed sharpness.
Disclosure of Invention
The application mainly aims to provide a method, a device and equipment for calculating image definition and a computer readable storage medium, and aims to improve the calculation accuracy of the image definition.
In a first aspect, the present application provides a method for calculating image sharpness, including the steps of:
acquiring an image to be identified;
inputting the image to be recognized into an error prediction submodel of a definition calculation model to obtain a feature map and an error map output by the error prediction submodel, wherein the feature map is used for indicating features on the image to be recognized, and the error map is used for indicating the distortion degree of the feature map;
inputting the image to be identified into a smooth region calculation submodel of a definition calculation model to obtain a smooth region output by the smooth region calculation submodel, wherein the difference between the average gray value and the median gray value of the smooth region is less than or equal to a preset gray threshold;
inputting the image to be recognized into a significance prediction submodel of a definition calculation model to obtain a significance region output by the significance prediction submodel, wherein the RGB value corresponding to a pixel in the significance region is greater than a preset RGB value;
carrying out nonlinear processing on the characteristic diagram to obtain a first reference diagram;
overlapping the salient region and the smooth region to obtain a second reference image;
and determining the definition of the image to be recognized according to the error map, the first reference map and the second reference map based on a definition computing network of the definition computing model.
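The seven steps of the method above can be sketched as one pipeline. The sketch below is illustrative only: the function names are hypothetical, the sub-models are stand-in callables, and the ReLU nonlinearity in step five and the elementwise-maximum superposition in step six are assumptions, since the claim does not fix either operation.

```python
def compute_sharpness(image, error_submodel, smooth_submodel,
                      saliency_submodel, score_net):
    """Hypothetical sketch of the claimed pipeline (names are illustrative)."""
    feature_map, error_map = error_submodel(image)         # error prediction submodel
    smooth_region = smooth_submodel(image)                 # smooth region calculation submodel
    salient_region = saliency_submodel(image)              # saliency prediction submodel
    first_ref = [max(0.0, f) for f in feature_map]         # nonlinear processing (ReLU assumed)
    second_ref = [max(s, m) for s, m in                    # superposition (elementwise max assumed)
                  zip(salient_region, smooth_region)]
    return score_net(error_map, first_ref, second_ref)     # sharpness calculation network
```

Any concrete sub-models with matching input/output shapes can be plugged into the five callable slots.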
In a second aspect, the present application further provides an image sharpness calculation apparatus, including:
the image acquisition module is used for acquiring an image to be identified;
the error prediction module is used for inputting the image to be recognized into an error prediction submodel of the definition calculation model to obtain a characteristic graph and an error graph output by the error prediction submodel;
the smooth region calculation module is used for inputting the image to be identified into a smooth region calculation submodel of the definition calculation model to obtain a smooth region output by the smooth region calculation submodel;
the saliency prediction module is used for inputting the image to be recognized into a saliency prediction submodel of a definition calculation model to obtain a saliency region output by the saliency prediction submodel;
the first reference image calculation module is used for carrying out nonlinear processing on the characteristic image to obtain a first reference image;
the second reference image calculation module is used for performing superposition processing on the salient region and the smooth region to obtain a second reference image;
and the definition calculating module is used for determining the definition of the image to be recognized according to the error map, the first reference map and the second reference map based on a definition calculating network of the definition calculating model.
In a third aspect, the present application further provides a computer device, which includes a processor, a memory, and a computer program stored on the memory and executable by the processor, wherein the computer program, when executed by the processor, implements the steps of the image sharpness calculating method as described above.
In a fourth aspect, the present application further provides a computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method for calculating image sharpness as described above.
The application provides an image sharpness calculation method, apparatus, device, and computer-readable storage medium. The method processes the image to be recognized with a sharpness calculation model to obtain a feature map of the image, the distortion degree of that feature map, a smooth region of the image with uniform gray values, and a salient region of the image. Local features at different positions and with different characteristics can thus be obtained, so that sharpness is calculated for local parts of the image, which improves the accuracy of the sharpness calculation.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic flowchart of a method for calculating image sharpness according to an embodiment of the present application;
FIG. 2 is a schematic block diagram of a sharpness computation model provided in an embodiment of the present application;
fig. 3 is a schematic block diagram of an image sharpness calculating apparatus provided in an embodiment of the present application;
fig. 4 is a block diagram illustrating a structure of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The flow diagrams depicted in the figures are merely illustrative and do not necessarily include all of the elements and operations/steps, nor do they necessarily have to be performed in the order depicted. For example, some operations/steps may be decomposed, combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
The embodiment of the application provides a method and a device for calculating image definition, computer equipment and a computer readable storage medium. The image definition calculating method can be applied to terminal equipment, and the terminal equipment can be electronic equipment such as a tablet computer, a notebook computer and a desktop computer. The method can also be applied to a server, which can be an independent server, or a cloud server providing basic cloud computing services such as cloud service, a cloud database, cloud computing, a cloud function, cloud storage, Network service, cloud communication, middleware service, domain name service, security service, Content Delivery Network (CDN), big data and artificial intelligence platform, and the like.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
Referring to fig. 1 and fig. 2, fig. 1 is a schematic flow chart of a method for calculating image sharpness according to an embodiment of the present application, and fig. 2 is a schematic block diagram of a sharpness calculation model according to an embodiment of the present application, where sharpness of an image to be recognized is calculated by using each sub-model and network in the sharpness calculation model to obtain sharpness of the image to be recognized.
The sharpness calculation model can be stored in a blockchain, so that when a computer device needs to use the model, it broadcasts a request to the blockchain and receives the sharpness calculation model from the blockchain, and then calculates the sharpness of the image to be recognized. The blockchain referred to in this application is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks associated by cryptographic methods, where each data block contains the information of a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
As shown in fig. 1, the method of calculating the image sharpness includes steps S101 to S107.
And step S101, acquiring an image to be identified.
For example, the image to be recognized may be an image acquired by an image acquisition device, where the image acquisition device may be a shooting device or a scanning device; the image acquisition device may capture, for example, newspapers, documents, business cards, letters, and the like to obtain an image to be recognized.
It can be understood that the image acquisition device is communicatively connected to the computer device, so that the acquired image can be transmitted to the computer device, which then performs the sharpness calculation on the image.
It can be understood that the image to be recognized may be used for a text recognition operation, that is, for extracting the text in the image from the acquired image. For example, in the financial field, the text of a document is extracted in order to audit the document. If the sharpness of the acquired image is insufficient, the text extraction operation may be seriously affected: the extracted text may be wrong, or the text may not be extracted at all. The sharpness of the image to be recognized therefore needs to be determined first, so that text recognition can be performed reliably. For example, if the sharpness of the image is determined not to meet the requirement, a shooting instruction is sent so that the image acquisition device captures the image again, reducing inaccurate text recognition caused by insufficient sharpness.
Step S102, inputting an image to be recognized into an error prediction submodel of a definition calculation model to obtain a feature map and an error map output by the error prediction submodel, wherein the feature map is used for indicating features on the image to be recognized, and the error map is used for indicating the distortion degree of the feature map.
Illustratively, the image to be recognized is input into the definition calculation model to obtain the definition of the image to be recognized, so that whether character recognition and other operations are performed on the image to be recognized is judged according to the definition of the image to be recognized. As can be understood, if the definition of the image to be recognized is greater than or equal to the preset definition threshold, the image to be recognized can be subjected to character recognition through a subsequent character recognition model; and if the definition of the image to be recognized is smaller than the preset definition threshold, outputting indicating information for indicating that the image to be recognized is a fuzzy image, and not inputting the image to be recognized into a subsequent character recognition model for character recognition.
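The gating logic described above can be sketched as below; the threshold value and the function name are illustrative stand-ins, since the application does not fix a concrete threshold.

```python
SHARPNESS_THRESHOLD = 0.5  # illustrative value; the application fixes no number

def route_image(sharpness):
    """Decide what to do with an image given its computed sharpness."""
    if sharpness >= SHARPNESS_THRESHOLD:
        return "run_text_recognition"
    # Below threshold: flag as blurry and ask the capture device to reshoot.
    return "request_recapture"
```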
It should be noted that, in the present application, only the definition calculation method of the image to be recognized is described, and the obtained definition may be used to determine whether the image to be recognized is suitable for performing character recognition or other operations, which is not limited in the present application.
Illustratively, the sharpness calculation model comprises an error prediction submodel, wherein the error prediction submodel may be configured to extract features of the image to be recognized and determine a distortion degree of the extracted feature map, and it can be understood that, after the image to be recognized is input into the error prediction submodel, a feature map and an error map corresponding to the image to be recognized output by the error prediction submodel may be obtained.
For example, the feature map may be used to characterize features in the image to be recognized, such as text or logos in the image.
For example, the error map may be obtained by performing objective error calculation on the image to be recognized, and is used to characterize the distortion degree of the feature map. It can be understood that the higher the distortion degree of the feature map, the higher the pixel values in the error map.
In some embodiments, inputting the image to be recognized into an error prediction submodel of a sharpness calculation model to obtain a feature map and an error map output by the error prediction submodel, including:
based on a feature extraction network in the error prediction submodel, carrying out feature extraction processing on the image to be recognized to obtain a feature map;
and carrying out convolution processing on the characteristic graph based on a convolution network in the error prediction submodel to obtain an error graph.
Illustratively, the error prediction submodel includes a feature extraction network and a convolution network, and it can be understood that the feature extraction network is used for performing feature extraction processing on the image to be recognized to obtain a feature map of the image to be recognized. And the convolution network performs convolution processing on the feature map obtained by the feature extraction network so as to obtain an error map through the feature map.
For example, the feature extraction network includes 8n convolutional layers, where n is a positive integer greater than or equal to 1. When n equals 1, the activation functions of the 8 convolutional layers may be the same, all using the rectified linear unit (ReLU), but the parameters of the layers are not identical. For example, the kernels of all 8 convolutional layers are 3 × 3, but the unit batch size of the first convolutional layer is 1 (B: 1) and its feature channel size is 3 × 48; the unit batch size of the second convolutional layer is 1 (B: 1), its max-pool stride is 2 × 2 (S: 2 × 2), and its feature channel size is 48 × 48; the feature channel size of the third convolutional layer is 48 × 64; the feature channel size of the fourth convolutional layer is 64 × 64, with a max-pool stride of 2 × 2 (S: 2 × 2); the feature channel size of the fifth convolutional layer is 64 × 64; the feature channel size of the sixth convolutional layer is 64 × 128; the feature channel size of the seventh convolutional layer is 128 × 128; and the feature channel size of the eighth convolutional layer is 128 × 1.
Exemplarily, feature extraction processing is performed on the image to be recognized through each convolution layer in the feature extraction network, so as to obtain a feature map corresponding to the image to be recognized.
Illustratively, the feature maps output by the feature extraction network are input into the convolutional network, which linearly combines them to obtain the error map. The convolutional network may, for example, consist of convolutional layers with a 1 × 1 kernel size, a 128 × 1 feature channel size, and linear activation functions. The resulting error map may be 1/4 of the size of the image to be recognized along each side; it can be understood that the size of the error map is determined by the total max-pool stride applied in the network.
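Assuming the 3 × 3 convolutions use "same" padding (so only the two 2 × 2 max-pool steps with stride 2 change the spatial size — a padding choice the text does not state), the error map size can be derived as follows:

```python
def error_map_size(height, width, pool_strides=(2, 2)):
    """Spatial size after the feature extractor: each stride-2 max pool
    halves height and width, so two pools give 1/4 along each side."""
    for stride in pool_strides:
        height, width = height // stride, width // stride
    return height, width
```

For a 256 × 256 input this yields a 64 × 64 error map, i.e. 1/4 of the input along each side.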
It can be understood that, through the error prediction submodel, the feature map and the error map of the image to be recognized can be obtained, so as to analyze the definition of the image to be recognized through the feature map and the error map of the image to be recognized.
Step S103, inputting the image to be identified into a smooth region calculation submodel of the definition calculation model to obtain a smooth region output by the smooth region calculation submodel, wherein the difference between the average gray value and the median gray value of the smooth region is less than or equal to a preset gray threshold.
For example, the image to be recognized is input into the smooth region calculation submodel of the sharpness calculation model to obtain the smooth region it outputs. In general, many operations can blur an image; for example, severe distortion may be introduced into some images while a data set is created. When the reference image, which may be the originally input image such as the image to be recognized, is itself unclear, it is difficult to determine whether a blurred region is caused by distortion.
When an image is severely distorted, the error map obtained through the above steps contains higher-frequency components, while the distorted image itself loses its high-frequency details. Blurred and severely distorted images can produce large portions of the image with nearly the same pixel range, which can be treated as a uniform region. An error prediction submodel built from multiple convolutional layers is therefore likely to fail to predict such uniform regions of the image to be recognized.
The smooth region output by the smooth region calculation submodel makes it possible to predict the uniform regions that the error prediction submodel cannot, which improves the accuracy of the sharpness calculation result for the image to be recognized.
It can be understood that the uniform region in the image may be considered that the gray values in the region are relatively close, so that when the feature extraction of the image is performed, the feature map cannot well represent the feature corresponding to the region, and the region can be determined by the smooth region calculation sub-model.
For example, when the difference between the mean value of the gray-scale values corresponding to all the pixels in a certain region and the median of the gray-scale values corresponding to all the pixels in the region is less than or equal to the preset gray-scale threshold, the region may be considered as a uniform region.
For example, suppose region A includes pixels a, b, c, d, and e with gray values 0.9, 0.4, 0.3, 0.4, and 1. The average gray value of region A is then 0.6, the median is 0.4, and their difference is 0.2. When the preset gray threshold is 0.1, region A does not belong to the smooth region; when the preset gray threshold is 0.3, region A belongs to the smooth region.
The area, the pixels, and the gray scale values corresponding to the pixels are examples, and the number of pixels in the area, the gray scale values corresponding to the pixels, and the preset gray scale threshold are not limited.
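The mean/median criterion of step S103 can be written directly; the helper name below is illustrative, and the sample region values are chosen to reproduce the stated average of 0.6 and median of 0.4.

```python
from statistics import mean, median

def is_smooth_region(gray_values, gray_threshold):
    """A region is smooth when |mean - median| of its gray values is
    at most the preset gray threshold (the criterion of step S103)."""
    return abs(mean(gray_values) - median(gray_values)) <= gray_threshold

region_a = [0.9, 0.4, 0.3, 0.4, 1.0]  # mean 0.6, median 0.4, difference 0.2
```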
In some embodiments, the inputting the image to be recognized into the smooth region calculation submodel of the sharpness calculation model to obtain the smooth region output by the smooth region calculation submodel includes: based on a gray processing network in the smooth region calculation sub-model, carrying out gray processing on the image to be recognized to obtain a gray image corresponding to the image to be recognized; based on a filter network in the smooth region calculation sub-model, carrying out gray value adjustment and size conversion processing on the gray image to obtain a filter image; and performing region calculation on the filtered image based on a region calculation network in the smooth region calculation sub-model to obtain a smooth region.
Illustratively, the image to be recognized is subjected to gray processing through a gray processing network, and it is understood that the image to be recognized may be an RGB image, and the RGB image is converted into a gray image.
For example, the RGB values of each pixel in the image to be recognized may be converted into a gray scale value, so as to obtain a gray scale image corresponding to the image to be recognized.
It is understood that the RGB values corresponding to the pixels may be converted into the gray values by one of a floating point algorithm, an integer method, a shift method, an average value method, and a green color only method.
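Three of the listed conversion methods can be sketched per pixel as below; the weights used for the floating-point (luminosity) method are the common ITU-R BT.601 coefficients, which the text does not specify, so they are an assumption.

```python
def rgb_to_gray(r, g, b, method="average"):
    """Convert one RGB pixel to a gray value using a named method."""
    if method == "average":
        return (r + g + b) / 3.0
    if method == "luminosity":  # "floating point algorithm" (assumed weights)
        return 0.299 * r + 0.587 * g + 0.114 * b
    if method == "green_only":
        return float(g)
    raise ValueError("unknown method: " + method)
```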
For example, the grayscale image is subjected to a grayscale adjustment process in the filter network, and the grayscale adjustment process may be, for example, a grayscale normalization process.
It can be understood that the gray value normalization process may adjust gray values from the range [0, 255] to the range [-1, 1].
For example, a mapping relationship is set that normalizes gray values from the range [0, 255] to the range [-1, 1]: when the gray value of a pixel is 255, the normalized value is 1; when the gray value of a pixel is 128, the normalized value is approximately 0.
It should be noted that the normalization value of the gray-scale value is only an exemplary illustration, and the value of the present application is not limited, and a corresponding relationship may be set according to an actual situation to determine the value of the gray-scale value of the pixel.
It is to be understood that the above-described gray-scale adjustment process is only an exemplary example, and the specific operation of the gray-scale adjustment of the present application is not limited.
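The normalization mapping described above is a single affine function; the sketch below assumes the linear mapping implied by the two sample points (255 maps to 1, and 128 maps to approximately 0).

```python
def normalize_gray(g):
    """Linearly map a gray value in [0, 255] onto [-1, 1]."""
    return g / 255.0 * 2.0 - 1.0
```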
For example, the gray-scale image after the gray-scale value adjustment is subjected to a size conversion process, such as a scaling process, to obtain the gray-scale image after the gray-scale value adjustment and the size conversion process.
Illustratively, the gray image after gray-value adjustment and size conversion is subtracted from the gray image, and the absolute value is taken, giving the filtered image as shown in the following formula:
I_dn = |I_dg - I_dl|
wherein I_dn indicates the filtered image, I_dg indicates the grayscale image, and I_dl indicates the grayscale image after gray-value adjustment and size conversion.
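The filter step can be sketched in pure Python on a single image row; block-average downscaling followed by nearest-neighbour upscaling stands in for the unspecified size-conversion operation, so this low-pass choice is an assumption.

```python
def low_pass(row, factor=2):
    """Downscale by block-averaging, then upscale by repetition (I_dl)."""
    down = [sum(row[i:i + factor]) / factor
            for i in range(0, len(row), factor)]
    up = []
    for v in down:
        up.extend([v] * factor)
    return up[:len(row)]

def filtered_row(gray_row):
    """I_dn = |I_dg - I_dl| on one row of the grayscale image."""
    return [abs(g - l) for g, l in zip(gray_row, low_pass(gray_row))]
```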
Illustratively, the region calculation is performed on the filtered image to obtain the smooth region.
In some embodiments, the performing a region calculation on the filtered image to obtain a smooth region includes: performing distortion prediction calculation on the filtered image based on a preset distortion calculation function to obtain a distorted image; and calculating a formula based on a preset smooth region parameter, and intercepting the distorted image according to a gray value corresponding to the distorted image and the smooth region parameter to obtain a smooth region.
For example, the preset distortion calculation function may be as follows:
R = 2 / (1 + exp(-α · I_dn)) - 1
wherein R is the resulting distorted image, α is a model parameter (a constant), I_dn indicates the filtered image, and the exp-based mapping keeps the computed result non-negative.
Illustratively, the constant α may be adjusted through training of the model and/or in response to a user operation, where α is related to the saturation characteristic of the distorted image (the whiteness of the distorted image).
For example, distortion prediction calculation may be performed on the filtered image by the above formula to avoid the situation that a uniform region of the image to be recognized cannot be predicted.
Illustratively, in order to minimize unnecessary influence of the smooth region on the objective error map, the gray-scale values of pixels in the distorted image are averaged by a preset smooth region parameter calculation formula to obtain the smooth region.
For example, the preset smooth region parameter calculation formula may be as follows:
R_n = R / ((1 / (H_R · W_R)) · Σ_{i=1}^{H_R} Σ_{j=1}^{W_R} R_{i,j})
wherein R_n is the smooth region, R is the distorted image, H_R and W_R are the preset shape parameters of the smooth region, and R_{i,j} indicates the pixel in the i-th row and j-th column of the distorted image.
As can be understood, when the gray value of a pixel in the filtered image (I_dn) is 0, the corresponding position in the smooth region returns a zero value, because the smooth region is derived from the filtered image (I_dn). The smooth region therefore cannot represent the complete image to be recognized. For this reason, the salient region predicted by the saliency prediction submodel is used to adjust the gray values of the pixels at the corresponding spatial positions of the smooth region (R_n), which improves the accuracy of the sharpness calculation for the image to be recognized.
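The two formulas of this subsection can be sketched per pixel. The sigmoid-style form of the distortion function is a reconstruction from the text's description (the α parameter, the exp term, and the non-negative output) and should be treated as an assumption, as should normalizing by the spatial mean.

```python
import math

def distortion_map(filtered, alpha=1.0):
    """R: map each filtered-image value through a sigmoid-style function
    whose output is non-negative for non-negative inputs (assumed form)."""
    return [2.0 / (1.0 + math.exp(-alpha * v)) - 1.0 for v in filtered]

def smooth_region_map(distorted):
    """R_n: divide R by its spatial mean over the H_R x W_R pixels."""
    mean_value = sum(distorted) / len(distorted)
    if mean_value == 0:
        return [0.0] * len(distorted)  # all-zero filtered image -> zero region
    return [v / mean_value for v in distorted]
```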
And step S104, inputting the image to be recognized into the significance prediction submodel of the definition calculation model to obtain a significance region output by the significance prediction submodel, wherein the RGB value corresponding to the pixel in the significance region is larger than the preset RGB value.
Illustratively, the saliency prediction is performed on the image to be recognized based on the saliency prediction submodel of the definition calculation model to obtain a saliency region corresponding to the image to be recognized.
It can be understood that the salient region indicates the more salient portion of the image to be recognized and may be characterized by RGB values: when the RGB values of a pixel in the image are greater than the preset RGB values, the pixel is considered a salient pixel, and when a plurality of salient pixels form a region, that region is the salient region corresponding to the salient portion.
In some embodiments, inputting the image to be recognized into the saliency prediction submodel of the sharpness calculation model to obtain a saliency region output by the saliency prediction submodel, includes: based on a back projection network in the significance prediction sub-model, carrying out histogram back projection processing on the image to be recognized to obtain a back projection image corresponding to the image to be recognized; and on the basis of an image processing network in the significance prediction submodel, sequentially carrying out normalization and equalization processing on the back projection images to obtain a significant region.
Illustratively, a back-projected image corresponding to the image to be recognized is obtained by means of the histogram back projection technique. It can be understood that the back-projected image obtained by histogram back projection processing indicates a probability value for each position of the image to be recognized, so that the position of an object (text, character or other feature) in the image can be determined from this probability map.
For example, a histogram of the mean-RGB map is calculated with a bin interval of 0.5 and back-projected onto the original image to obtain the back-projected image (the mean-RGB map and the resulting back-projected image are defined by formulas that appear only as figure images in the original).
it is understood that the RGB mean and the size of the image are exemplary, and the RGB mean and the size of the back projection image are not limited in the present application.
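As a hedged illustration of the histogram back projection step, the following numpy sketch maps each pixel of a grayscale image to the normalized histogram probability of the bin it falls into; the function name, bin edges, and model histogram are hypothetical and not taken from the patent:

```python
import numpy as np

def back_project(image, model_hist, bin_edges):
    """Replace each pixel with the probability of its histogram bin."""
    probs = model_hist / max(model_hist.sum(), 1e-12)   # normalize histogram
    idx = np.clip(np.digitize(image, bin_edges) - 1, 0, len(probs) - 1)
    return probs[idx]                                   # same shape as image

image = np.array([[0.2, 0.8],
                  [0.5, 0.5]])
bin_edges = np.array([0.0, 0.5, 1.0])   # assumed bin interval of 0.5
model_hist = np.array([1.0, 3.0])       # hypothetical model histogram
back_projected = back_project(image, model_hist, bin_edges)
```

Each value in `back_projected` is then the probability that the corresponding pixel belongs to the modeled region, which is what the probability map above conveys.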
Illustratively, normalization and equalization processing are sequentially performed on the obtained reverse projection images to obtain a salient region corresponding to the image to be identified.
In some embodiments, the image processing network further includes a foreground extraction layer, a multi-scale pyramid layer, a hue back projection layer, a mean shift layer, and a contrast equalization layer, and the normalization and equalization processing is performed on the back projection image in sequence to obtain a significant region, including:
based on the foreground extraction layer, extracting a foreground image from the back projection image to obtain a foreground image and a background image of the back projection image; based on the multi-scale pyramid layer, filtering the foreground image according to the similarity of pixels in the foreground image to obtain a foreground smooth image; based on the hue back projection layer, carrying out back projection processing on the foreground smooth image according to a preset hue and saturation histogram to obtain a back projection image corresponding to the foreground smooth image; based on the mean shift layer, performing mean shift processing on the back projection image corresponding to the foreground smooth image according to a preset mean value; and based on the contrast equalization layer, performing contrast equalization processing on the back projection image subjected to the mean shift processing according to a preset contrast histogram to obtain a significant region.
It can be understood that the process of obtaining a foreground smoothed image can be regarded as a normalization process.
For example, in the foreground extraction layer, the foreground of the back-projected image may be extracted through a preset RGB-value bounding box, where the preset RGB-value bounding box may cover the RGB values of one or more image regions. For example, in the central region of the image, if the RGB value of the preset bounding box is 0.1, a region with RGB values greater than this value is determined to be the foreground image; in this way, the foreground image and the background image are separated by the preset RGB-value bounding box.
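A minimal numpy sketch of this thresholded foreground/background split; the threshold value and interface are assumptions for illustration only:

```python
import numpy as np

def split_foreground(back_proj, threshold=0.1):
    """Pixels above the threshold form the foreground, the rest the background."""
    mask = back_proj > threshold
    foreground = np.where(mask, back_proj, 0.0)
    background = np.where(mask, 0.0, back_proj)
    return foreground, background

bp = np.array([[0.05, 0.20],
               [0.30, 0.00]])
fg, bg = split_foreground(bp, threshold=0.1)
# fg and bg partition bp: every pixel lands in exactly one of the two images
```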
Illustratively, the similarity of pixels in the foreground image is determined through the multi-scale pyramid layer. It can be understood that pixel similarity can be characterized by the RGB values of the pixels: pixels with the same or similar RGB values are considered pixels with high similarity.
According to the similarity of the pixels in the foreground image, a plurality of level maps are output, where the pixels on a map of the same level can indicate two regions of similar size or two objects of similar size. These pixels are grouped together according to preset pixel values, and the plurality of level maps are then superimposed to obtain the foreground smooth image.
Illustratively, in the hue back projection layer, a preset hue and saturation histogram is used to perform back projection of the hue and saturation histogram on a foreground smooth image, so as to obtain a back projection image corresponding to the foreground smooth image. It will be appreciated that the specific operations may refer to the back projection operation of the preceding steps and will not be repeated here.
Illustratively, after back projection is performed through the hue and saturation histograms, a back projection image corresponding to the foreground smooth image is obtained, and the back projection image corresponding to the foreground smooth image is input into the mean shift layer for mean shift processing.
Mean shift is an iterative statistical algorithm. Specifically, a center point is chosen in the map, the mean of the offset vectors from the center point to all points within a certain range of it is calculated to obtain the shift mean, and the center point is then moved to the position of the shift mean; repeating this movement completes the mean shift processing.
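The iteration just described can be sketched as follows; the radius, tolerance, and point set are illustrative assumptions rather than values from the patent:

```python
import numpy as np

def mean_shift(points, start, radius=1.0, tol=1e-3, max_iter=100):
    """Move a center point to the mean of its in-range neighbors until converged."""
    center = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        dists = np.linalg.norm(points - center, axis=1)
        neighbors = points[dists <= radius]
        if len(neighbors) == 0:
            break                          # no points in range: stop
        shifted = neighbors.mean(axis=0)   # the "shift mean"
        done = np.linalg.norm(shifted - center) < tol
        center = shifted                   # move the center to the shift mean
        if done:
            break
    return center

points = np.array([[5.0, 5.0], [5.2, 4.9], [4.8, 5.1], [5.1, 5.0]])
center = mean_shift(points, start=[4.0, 4.0], radius=3.0)
# the center converges onto the centroid of the cluster
```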
Illustratively, after the mean shift processing is completed, the processed image is input into the contrast equalizing layer.
In the contrast equalization layer, contrast equalization is performed on the back-projected image corresponding to the foreground smooth image through a preset contrast histogram to enhance the contrast of the image, and the salient region is obtained by inverting the contrast.
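As a hedged sketch, the contrast equalization step can be implemented as classic histogram equalization on a grayscale image in [0, 1]; the bin count and interface are assumptions, since the patent's preset contrast histogram is not specified:

```python
import numpy as np

def equalize_contrast(img, bins=256):
    """Histogram equalization: remap intensities through the normalized CDF."""
    hist, _ = np.histogram(img, bins=bins, range=(0.0, 1.0))
    cdf = hist.cumsum().astype(float)
    cdf /= cdf[-1]                                      # normalize CDF to [0, 1]
    idx = np.clip((img * (bins - 1)).astype(int), 0, bins - 1)
    return cdf[idx]

img = np.array([[0.0, 1.0],
                [0.5, 0.5]])
equalized = equalize_contrast(img)
```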
Illustratively, the accuracy of the definition calculation for the image to be recognized can be effectively improved through the prediction of the salient region.
Step S105, carrying out nonlinear processing on the characteristic diagram to obtain a first reference diagram;
illustratively, the feature map output from the error prediction submodel is input into a nonlinear regression network of the sharpness calculation model to perform nonlinear processing on the feature map, resulting in a first reference map. It will be appreciated that the first reference map may be used to calculate the sharpness of the image to be identified.
Illustratively, in a nonlinear regression network, the feature map is subjected to global average pooling and full-join processing to obtain a first reference map.
Illustratively, the nonlinear regression network includes a Global Average Pool (GAP) layer, configured to perform global average pooling on the feature map, and input the feature map into three fully-connected layers after processing, where the first two fully-connected layers have 128 feature channels, and the last fully-connected layer has 64 feature channels, so as to complete fully-connected processing to obtain the first reference map.
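A minimal numpy sketch of this global average pooling and fully connected head; the input channel count (256), weight scales, and ReLU activations on the hidden layers are assumptions not stated in the text:

```python
import numpy as np

def global_average_pool(feature_map):
    """Collapse a (C, H, W) feature map into a C-dimensional vector."""
    return feature_map.mean(axis=(1, 2))

def dense(x, weights, bias, relu=True):
    """One fully connected layer, with an assumed ReLU on hidden layers."""
    y = weights @ x + bias
    return np.maximum(y, 0.0) if relu else y

rng = np.random.default_rng(0)
v = global_average_pool(rng.standard_normal((256, 7, 7)))  # C = 256 assumed
w1, b1 = 0.01 * rng.standard_normal((128, 256)), np.zeros(128)
w2, b2 = 0.01 * rng.standard_normal((128, 128)), np.zeros(128)
w3, b3 = 0.01 * rng.standard_normal((64, 128)), np.zeros(64)
# 128 -> 128 -> 64 feature channels, as described in the text
first_reference = dense(dense(dense(v, w1, b1), w2, b2), w3, b3, relu=False)
```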
And S106, overlapping the salient region and the smooth region to obtain a second reference image.
Illustratively, the salient region output by the saliency prediction submodel and the smooth region output by the smooth region calculation submodel are combined to obtain the second reference map.
Illustratively, the salient region and the smooth region may be subjected to an overlay process to obtain the second reference map. The superimposing process may be, for example, adding the gray value corresponding to each pixel in the salient region to the gray value corresponding to each pixel in the smooth region to complete the superimposing process, so as to obtain the second reference map.
In other embodiments, a saliency vector matrix corresponding to the saliency region and a gray level vector matrix corresponding to the smooth region may be further determined, a second reference vector matrix is obtained by using the saliency vector matrix and the gray level vector matrix, and the second reference vector matrix is subjected to inverse vectorization to obtain a second reference map, as shown in the following formula:
Figure BDA0003471486830000121
the matrix A is used for indicating a saliency vector matrix corresponding to the saliency region, the matrix B is used for indicating a gray level vector matrix, and the matrix obtained by adding is used for indicating a second reference vector matrix.
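The vectorize, add, and inverse-vectorize step described above can be sketched as follows; the 2x3 size and values are purely illustrative:

```python
import numpy as np

h, w = 2, 3
A = np.arange(6, dtype=float)      # saliency vector (flattened salient region)
B = np.ones(6)                     # gray-level vector (flattened smooth region)
second_ref_vec = A + B             # element-wise addition of the two matrices
second_reference_map = second_ref_vec.reshape(h, w)  # inverse vectorization
```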
Illustratively, the definition of the image to be recognized is calculated through the obtained second reference image, so that the definition calculation accuracy of the image to be recognized can be improved.
And S107, determining the definition of the image to be recognized according to the error map, the first reference map and the second reference map based on the definition computing network of the definition computing model.
Illustratively, the error map, the first reference map and the second reference map are processed by the definition calculation network of the definition calculation model to obtain the definition of the image to be recognized.
Illustratively, the definition of the image to be recognized can be represented by a numerical value, and a numerical value for representing the definition of the image to be recognized can be obtained by calculating the error map, the first reference map and the second reference map, so as to determine the definition of the image to be recognized.
In some embodiments, the determining the sharpness of the image to be recognized according to the error map, the first reference map and the second reference map by the sharpness calculation network based on the sharpness calculation model includes: determining a first vector matrix corresponding to the first reference image, a second vector matrix corresponding to the second reference image and a third vector matrix corresponding to the error image based on a matrix calculation layer of the definition calculation network; multiplying the first vector matrix, the second vector matrix and the third vector matrix based on a matrix multiplication layer of the definition calculation network to obtain a fourth vector matrix; and determining the definition of the image to be recognized according to the fourth vector matrix.
Specifically, the definition calculation network includes a matrix calculation layer and a matrix multiplication layer, where the matrix calculation layer calculates a vector matrix corresponding to each map, and the matrix multiplication layer multiplies these vector matrices to obtain the fourth vector matrix.
It can be understood that the definition corresponding to the image to be recognized can be obtained by calculating the modulus of the fourth vector matrix, and the definition corresponding to each region of the image to be recognized can also be obtained by outputting the fourth vector matrix, so that the definition of the image to be recognized is determined.
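One plausible reading of the matrix multiplication layer and the modulus step is sketched below; the element-wise product and the Frobenius norm are assumptions, since the patent does not pin down either operation:

```python
import numpy as np

def definition_score(first_ref, second_ref, error_map):
    """Combine the three matrices element-wise, then take a single modulus."""
    fourth = first_ref * second_ref * error_map  # assumed element-wise product
    return float(np.linalg.norm(fourth))         # Frobenius norm as "modulus"

m = np.ones((2, 2))
score = definition_score(m, 2.0 * m, 3.0 * m)    # fourth matrix is all 6s
```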
In some embodiments, in the process of determining the first reference map, a plurality of data sets corresponding to the feature map are obtained through the nonlinear processing, and the score is predicted from these data sets, where some data, such as MOS or DMOS values, can be regarded as true scores. A loss value can be calculated through a regression loss function, and the definition of the image to be recognized is calculated through the loss value, where the regression loss function is as follows:
L(Id) = ‖H(V; ΘH) − Sgt‖₂²
wherein L is used to indicate the regression loss function, Id is used to indicate the feature map, H(V) is used to indicate the sharpness function with argument V, ΘH is used to indicate the parameters of the fully connected layers in the nonlinear network, Sgt is the true score, the subscript 2 denotes the 2-norm, and the superscript 2 denotes a squaring calculation.
Here V is the output of the Global Average Pooling (GAP) layer, which is mathematically a function of the output of the error prediction submodel and the model parameters ΘF, as shown by the following equation:
V=GAP(F(Idn;ΘF))
wherein GAP is used to indicate the function used in the Global Average Pooling (GAP) layer (not limited herein), F is used to indicate the function used in the error prediction submodel, Idn is used to indicate the filtered image, and ΘF is used to indicate the parameters of the definition calculation model, as described above.
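The squared 2-norm loss above can be computed directly; here a precomputed predicted score stands in for H(V; ΘH):

```python
import numpy as np

def regression_loss(predicted, ground_truth):
    """Squared 2-norm ||predicted - S_gt||_2^2, matching the formula above."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(ground_truth, dtype=float)
    return float(np.sum(diff ** 2))

loss = regression_loss([3.0, 4.0], [0.0, 0.0])   # 3^2 + 4^2 = 25
```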
It is understood that the fourth vector matrix may be obtained by calculating the calculated loss value with the first vector matrix, the second vector matrix, and the third vector matrix.
For another example, the loss value may also be used to adjust parameters of the sharpness calculation model to improve the calculation accuracy of the sharpness calculation model.
The method for calculating the image definition provided by the embodiment comprises the steps of obtaining an image to be identified; inputting the image to be identified into an error prediction submodel of a definition calculation model to obtain a characteristic graph and an error graph output by the error prediction submodel; inputting the image to be identified into a smooth region calculation submodel of a definition calculation model to obtain a smooth region output by the smooth region calculation submodel; inputting the image to be identified into a significance prediction submodel of a definition calculation model to obtain a significance region output by the significance prediction submodel; carrying out nonlinear processing on the characteristic diagram to obtain a first reference diagram; overlapping the salient region and the smooth region to obtain a second reference image; and determining the definition of the image to be recognized according to the error map, the first reference map and the second reference map based on a definition computing network of the definition computing model. The characteristic image of the image to be recognized, the distortion degree of the obtained characteristic image, the smooth area with uniform gray scale value in the image to be recognized and the obvious area in the image to be recognized are obtained, so that the local characteristics of different positions and different characteristics of the image can be obtained, the definition can be calculated according to the local characteristics of the image, and the calculation accuracy of the definition of the image is improved.
Referring to fig. 3, fig. 3 is a schematic diagram of an image sharpness calculating apparatus according to an embodiment of the present application, where the image sharpness calculating apparatus may be configured in a server or a terminal, and is used to execute the foregoing image sharpness calculating method.
As shown in fig. 3, the image sharpness calculating apparatus includes: the image processing apparatus includes an image acquisition module 110, an error prediction module 120, a smooth region calculation module 130, a saliency prediction module 140, a first reference map calculation module 150, a second reference map calculation module 160, and a sharpness calculation module 170.
And an image obtaining module 110, configured to obtain an image to be identified.
And the error prediction module 120 is configured to input the image to be recognized into an error prediction submodel of a sharpness calculation model, to obtain a feature map and an error map output by the error prediction submodel, where the feature map is used to indicate features on the image to be recognized, and the error map is used to indicate a distortion degree of the feature map.
The smooth region calculation module 130 is configured to input the image to be recognized into a smooth region calculation sub-model of the sharpness calculation model, to obtain a smooth region output by the smooth region calculation sub-model, where a difference between a gray average value and a gray median value of the smooth region is less than or equal to a preset gray threshold.
The saliency prediction module 140 is configured to input the image to be recognized into a saliency prediction sub-model of a sharpness calculation model to obtain a saliency region output by the saliency prediction sub-model, where an RGB value corresponding to a pixel in the saliency region is greater than a preset RGB value.
And the first reference map calculation module 150 is configured to perform nonlinear processing on the feature map to obtain a first reference map.
A second reference map calculating module 160, configured to perform superposition processing on the salient region and the smooth region to obtain a second reference map.
A definition calculating module 170, configured to determine the definition of the image to be recognized according to the error map, the first reference map, and the second reference map based on a definition calculating network of the definition calculating model.
Illustratively, the error prediction module 120 further includes a feature extraction sub-module and a convolution processing sub-module.
And the feature extraction submodule is used for extracting features of the image to be identified based on a feature extraction network in the error prediction submodel to obtain a feature map.
And the convolution processing submodule is used for performing convolution processing on the characteristic diagram based on a convolution network in the error prediction submodel to obtain an error diagram, and the error diagram is used for representing the distortion degree of the characteristic diagram.
Illustratively, the smooth region calculation module 130 further includes a gray processing sub-module, a filtering sub-module, and a region calculation network sub-module.
And the gray processing submodule is used for carrying out gray processing on the image to be recognized based on a gray processing network in the smooth region calculation submodel to obtain a gray image corresponding to the image to be recognized.
And the filtering submodule is used for carrying out gray value adjustment and size conversion processing on the gray image based on a filtering network in the smooth region calculation submodel to obtain a filtering image.
And the region calculation network submodule is used for performing region calculation on the filtering image based on a region calculation network in the smooth region calculation submodel to obtain a smooth region.
Illustratively, the region calculation network sub-module further comprises a distortion calculation sub-module and a region parameter calculation sub-module.
And the distortion calculation submodule is used for carrying out distortion prediction calculation on the filtering image based on a preset distortion calculation function to obtain a distortion image.
And the area parameter calculation submodule is used for intercepting the distorted image according to the gray value corresponding to the distorted image and the smooth area parameter based on a preset smooth area parameter calculation formula to obtain a smooth area.
Illustratively, the saliency prediction module 140 further comprises a back projection processing sub-module, an image processing sub-module.
And the back projection processing submodule is used for carrying out histogram back projection processing on the image to be identified based on a back projection network in the significance prediction submodel to obtain a back projection image corresponding to the image to be identified.
And the image processing submodule is used for carrying out normalization and equalization processing on the back projection image based on an image processing network in the significance prediction submodel to obtain a significant region.
Illustratively, the image processing sub-module further comprises a foreground extraction module, a foreground smooth image determination module, a tone back projection processing sub-module, a mean shift processing sub-module, and an equalization processing sub-module.
And the foreground extraction module is used for extracting the foreground of the back projection image based on a foreground extraction layer in an image processing network to obtain a foreground image and a background image of the back projection image.
And the foreground smooth image determining module is used for carrying out filtering processing on the foreground image according to the similarity of pixels in the foreground image based on the multi-scale pyramid layer in the image processing network to obtain the foreground smooth image.
And the tone back projection processing submodule is used for carrying out back projection processing on the foreground smooth image according to a preset tone and saturation histogram on the basis of a tone back projection layer in the image processing network to obtain a back projection image corresponding to the foreground smooth image.
And the mean shift processing submodule is used for carrying out mean shift processing on the back projection image corresponding to the foreground smooth image according to a preset mean value based on a mean shift layer in the image processing network.
And the equalization processing submodule is used for carrying out contrast equalization processing on the back projection image subjected to the mean shift processing according to a preset contrast histogram based on a contrast equalization layer in the image processing network so as to obtain a significant region.
Illustratively, the sharpness calculation module 170 further includes a matrix calculation sub-module, a matrix multiplication sub-module, and a sharpness determination sub-module.
And the matrix calculation submodule is used for determining a first vector matrix corresponding to the first reference map, a second vector matrix corresponding to the second reference map and a third vector matrix corresponding to the error map based on a matrix calculation layer of the definition calculation network.
And the matrix multiplication submodule is used for multiplying the first vector matrix, the second vector matrix and the third vector matrix based on a matrix multiplication layer of the definition calculation network to obtain a fourth vector matrix.
And the definition determining submodule is used for determining the definition of the image to be identified according to the fourth vector matrix.
It should be noted that, as will be clear to those skilled in the art, for convenience and brevity of description, the specific working processes of the apparatus, the modules and the units described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The methods of the present application are operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The above-described methods and apparatuses may be implemented, for example, in the form of a computer program that can be run on a computer device as shown in fig. 4.
Referring to fig. 4, fig. 4 is a schematic block diagram of a computer device according to an embodiment of the present disclosure. The computer device may be a server or a terminal.
As shown in fig. 4, the computer device includes a processor, a memory, and a network interface connected by a system bus, wherein the memory may include a storage medium and an internal memory.
The storage medium may store an operating system and a computer program. The computer program includes program instructions that, when executed, cause a processor to perform any of the methods of calculating image sharpness.
The processor is used for providing calculation and control capability and supporting the operation of the whole computer equipment.
The internal memory provides an environment for running a computer program in the storage medium, and the computer program, when executed by the processor, causes the processor to perform any one of the methods for calculating the image sharpness.
The network interface is used for network communication, such as sending assigned tasks and the like. Those skilled in the art will appreciate that the architecture shown in fig. 4 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
It should be understood that the Processor may be a Central Processing Unit (CPU), and the Processor may be other general purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components, etc. Wherein a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
Wherein, in one embodiment, the processor is configured to execute a computer program stored in the memory to implement the steps of:
acquiring an image to be identified;
inputting the image to be identified into an error prediction submodel of a definition calculation model to obtain a characteristic graph and an error graph output by the error prediction submodel;
inputting the image to be identified into a smooth region calculation submodel of a definition calculation model to obtain a smooth region output by the smooth region calculation submodel;
inputting the image to be identified into a significance prediction submodel of a definition calculation model to obtain a significance region output by the significance prediction submodel;
carrying out nonlinear processing on the characteristic diagram to obtain a first reference diagram;
overlapping the salient region and the smooth region to obtain a second reference image;
and determining the definition of the image to be recognized according to the error map, the first reference map and the second reference map based on a definition computing network of the definition computing model.
In one embodiment, when the processor is used for inputting the image to be recognized into an error prediction submodel of a sharpness calculation model to obtain a feature map and an error map output by the error prediction submodel, the processor is used for realizing:
based on a feature extraction network in the error prediction submodel, carrying out feature extraction processing on the image to be recognized to obtain a feature map;
and carrying out convolution processing on the characteristic graph based on a convolution network in the error prediction submodel to obtain an error graph, wherein the error graph is used for representing the distortion degree of the characteristic graph.
In one embodiment, when the processor inputs the image to be recognized into the smooth region calculation submodel of the sharpness calculation model to obtain the smooth region output by the smooth region calculation submodel, the processor is configured to implement:
based on a gray processing network in the smooth region calculation sub-model, carrying out gray processing on the image to be recognized to obtain a gray image corresponding to the image to be recognized;
based on the filter network in the smooth region calculation submodel, carrying out gray value adjustment and size conversion processing on the gray image to obtain a filter image;
and performing region calculation on the filtering image based on a region calculation network in the smooth region calculation sub-model to obtain a smooth region.
In one embodiment, when the processor performs the region calculation on the filtered image to obtain the smooth region, the processor is configured to perform:
based on a preset distortion calculation function, performing distortion prediction calculation on the filtered image to obtain a distorted image;
and calculating a formula based on a preset smooth region parameter, and intercepting the distorted image according to a gray value corresponding to the distorted image and the smooth region parameter to obtain a smooth region.
In one embodiment, when the processor inputs the image to be recognized into the saliency prediction submodel of the definition calculation model to obtain the salient region output by the saliency prediction submodel, the processor is configured to implement:
based on a back projection network in the significance prediction submodel, performing histogram back projection processing on the image to be recognized to obtain a back projection image corresponding to the image to be recognized;
and normalizing and equalizing the back projection image based on an image processing network in the significance prediction submodel to obtain a significant region.
In one embodiment, when the processor, based on the image processing network in the saliency prediction submodel, sequentially performs normalization and equalization processing on the back-projected image to obtain the salient region, the processor is further configured to implement:
based on a foreground extraction layer in an image processing network, carrying out foreground extraction on the back projection image to obtain a foreground image and a background image of the back projection image;
based on a multi-scale pyramid layer in an image processing network, filtering the foreground image according to the similarity of pixels in the foreground image to obtain a foreground smooth image;
based on a hue back-projection layer in an image processing network, carrying out back-projection processing on the foreground smooth image according to a preset hue and saturation histogram to obtain a back-projection image corresponding to the foreground smooth image;
based on a mean shift layer in the image processing network, carrying out mean shift processing on a back projection image corresponding to the foreground smooth image according to a preset mean value;
and based on a contrast equalizing layer in the image processing network, carrying out contrast equalizing processing on the back projection image after the mean shift processing according to a preset contrast histogram to obtain a salient region.
In one embodiment, the processor, when implementing a sharpness computation network based on the sharpness computation model, determines the sharpness of the image to be recognized from the error map, the first reference map, and the second reference map, is configured to implement:
determining a first vector matrix corresponding to the first reference map, a second vector matrix corresponding to the second reference map and a third vector matrix corresponding to the error map based on a matrix calculation layer of the sharpness calculation network;
multiplying the first vector matrix, the second vector matrix and the third vector matrix based on a matrix multiplication layer of the definition calculation network to obtain a fourth vector matrix;
and determining the definition of the image to be identified according to the fourth vector matrix.
It should be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the image definition calculation described above may refer to the corresponding process in the foregoing image definition calculation method embodiment, and details are not described herein again.
Embodiments of the present application further provide a computer-readable storage medium storing a computer program, the computer program comprising program instructions; for the method implemented when the program instructions are executed, reference may be made to the embodiments of the image sharpness calculation method of the present application.
The computer-readable storage medium may be an internal storage unit of the computer device described in the foregoing embodiments, such as a hard disk or a memory of the computer device. It may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the computer device.
It is to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments. While the invention has been described with reference to specific embodiments, the scope of the invention is not limited thereto, and those skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the invention. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
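The overall flow of the disclosed method can be summarized in a runnable sketch. Every sub-model below is replaced by a trivial stand-in (gradient magnitude for the feature map, global-mean thresholds for the smooth and salient regions, tanh for the nonlinear processing); these stand-ins, the 0.1 and 0.6 thresholds, and the final mean pooling are illustrative assumptions, not the disclosed models:

```python
import numpy as np

def compute_sharpness(image):
    """End-to-end sketch of the claimed pipeline with trivial stand-ins."""
    gray = image.mean(axis=-1)                        # image to be recognized -> gray
    gy, gx = np.gradient(gray)                        # "feature map": gradient magnitude
    feature_map = np.hypot(gx, gy)
    error_map = np.abs(feature_map - feature_map.mean())        # distortion proxy
    smooth = (np.abs(gray - gray.mean()) <= 0.1).astype(float)  # smooth region stand-in
    salient = (gray > 0.6).astype(float)              # salient region stand-in
    first_ref = np.tanh(feature_map)                  # nonlinear processing of features
    second_ref = np.clip(salient + smooth, 0.0, 1.0)  # superimpose the two regions
    fourth = first_ref * second_ref * error_map       # element-wise combination
    return float(fourth.mean())                       # pool into a sharpness value
```

The structure mirrors claim 1: one error branch, two region branches, two reference maps, and a final combination network.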

Claims (10)

1. A method for calculating image sharpness, characterized by comprising the following steps:
acquiring an image to be recognized;
inputting the image to be recognized into an error prediction submodel of a sharpness calculation model to obtain a feature map and an error map output by the error prediction submodel, wherein the feature map is used for indicating features of the image to be recognized, and the error map is used for indicating the degree of distortion of the feature map;
inputting the image to be recognized into a smooth region calculation submodel of the sharpness calculation model to obtain a smooth region output by the smooth region calculation submodel, wherein the difference between the mean gray value and the median gray value of the smooth region is less than or equal to a preset gray threshold;
inputting the image to be recognized into a saliency prediction submodel of the sharpness calculation model to obtain a salient region output by the saliency prediction submodel, wherein the RGB values corresponding to the pixels in the salient region are greater than preset RGB values;
performing nonlinear processing on the feature map to obtain a first reference map;
superimposing the salient region and the smooth region to obtain a second reference map;
and determining the sharpness of the image to be recognized from the error map, the first reference map, and the second reference map based on a sharpness calculation network of the sharpness calculation model.
2. The method for calculating image sharpness according to claim 1, wherein the inputting the image to be recognized into an error prediction submodel of a sharpness calculation model to obtain a feature map and an error map output by the error prediction submodel comprises:
based on a feature extraction network in the error prediction submodel, performing feature extraction processing on the image to be recognized to obtain the feature map;
and based on a convolution network in the error prediction submodel, performing convolution processing on the feature map to obtain the error map.
3. The method for calculating image sharpness according to claim 1, wherein the inputting the image to be recognized into a smooth region calculation submodel of the sharpness calculation model to obtain a smooth region output by the smooth region calculation submodel comprises:
based on a gray processing network in the smooth region calculation submodel, performing gray processing on the image to be recognized to obtain a grayscale image corresponding to the image to be recognized;
based on a filtering network in the smooth region calculation submodel, performing gray value adjustment and size conversion processing on the grayscale image to obtain a filtered image;
and based on a smooth region calculation network in the smooth region calculation submodel, performing region calculation on the filtered image to obtain the smooth region.
4. The method for calculating image sharpness according to claim 3, wherein the performing region calculation on the filtered image to obtain the smooth region comprises:
based on a preset distortion calculation function, performing distortion prediction calculation on the filtered image to obtain a distorted image;
and based on a preset smooth region parameter calculation formula, cropping the distorted image according to the gray values corresponding to the distorted image and the smooth region parameter to obtain the smooth region.
5. The method for calculating image sharpness according to any one of claims 1 to 4, wherein the inputting the image to be recognized into a saliency prediction submodel of the sharpness calculation model to obtain a salient region output by the saliency prediction submodel comprises:
based on a back-projection network in the saliency prediction submodel, performing histogram back-projection processing on the image to be recognized to obtain a back-projection image corresponding to the image to be recognized;
and based on an image processing network in the saliency prediction submodel, sequentially performing normalization and equalization processing on the back-projection image to obtain the salient region.
6. The method of claim 5, wherein the sequentially performing normalization and equalization processing on the back-projection image based on the image processing network in the saliency prediction submodel to obtain the salient region comprises:
based on a foreground extraction layer in the image processing network, performing foreground extraction on the back-projection image to obtain a foreground image and a background image of the back-projection image;
based on a multi-scale pyramid layer in the image processing network, filtering the foreground image according to the similarity of the pixels in the foreground image to obtain a foreground smooth image;
based on a hue back-projection layer in the image processing network, performing back-projection processing on the foreground smooth image according to a preset hue-saturation histogram to obtain a back-projection image corresponding to the foreground smooth image;
based on a mean shift layer in the image processing network, performing mean shift processing on the back-projection image corresponding to the foreground smooth image according to a preset mean value;
and based on a contrast equalization layer in the image processing network, performing contrast equalization processing on the back-projection image after the mean shift processing according to a preset contrast histogram to obtain the salient region.
7. The method for calculating image sharpness according to any one of claims 1 to 4, wherein the determining the sharpness of the image to be recognized from the error map, the first reference map, and the second reference map based on the sharpness calculation network of the sharpness calculation model comprises:
based on a matrix calculation layer of the sharpness calculation network, determining a first vector matrix corresponding to the first reference map, a second vector matrix corresponding to the second reference map, and a third vector matrix corresponding to the error map;
based on a matrix multiplication layer of the sharpness calculation network, multiplying the first vector matrix, the second vector matrix, and the third vector matrix to obtain a fourth vector matrix;
and determining the sharpness of the image to be recognized according to the fourth vector matrix.
8. A device for calculating image sharpness, characterized in that the device comprises:
an image acquisition module, configured to acquire an image to be recognized;
an error prediction module, configured to input the image to be recognized into an error prediction submodel of a sharpness calculation model to obtain a feature map and an error map output by the error prediction submodel, wherein the feature map is used for indicating features of the image to be recognized, and the error map is used for indicating the degree of distortion of the feature map;
a smooth region calculation module, configured to input the image to be recognized into a smooth region calculation submodel of the sharpness calculation model to obtain a smooth region output by the smooth region calculation submodel, wherein the difference between the mean gray value and the median gray value of the smooth region is less than or equal to a preset gray threshold;
a saliency prediction module, configured to input the image to be recognized into a saliency prediction submodel of the sharpness calculation model to obtain a salient region output by the saliency prediction submodel, wherein the RGB values corresponding to the pixels in the salient region are greater than preset RGB values;
a first reference map calculation module, configured to perform nonlinear processing on the feature map to obtain a first reference map;
a second reference map calculation module, configured to superimpose the salient region and the smooth region to obtain a second reference map;
and a sharpness calculation module, configured to determine the sharpness of the image to be recognized from the error map, the first reference map, and the second reference map based on a sharpness calculation network of the sharpness calculation model.
9. A computer device, characterized in that the computer device comprises a processor, a memory, and a computer program stored in the memory and executable by the processor, wherein the computer program, when executed by the processor, implements the steps of the method for calculating image sharpness of any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, wherein the computer program, when executed by a processor, implements the steps of the method for calculating image sharpness of any one of claims 1 to 7.
CN202210044215.8A 2022-01-14 2022-01-14 Image definition calculation method, device, equipment and storage medium Pending CN114399495A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210044215.8A CN114399495A (en) 2022-01-14 2022-01-14 Image definition calculation method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114399495A true CN114399495A (en) 2022-04-26

Family

ID=81230971

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210044215.8A Pending CN114399495A (en) 2022-01-14 2022-01-14 Image definition calculation method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114399495A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115174807A (en) * 2022-06-28 2022-10-11 上海艾为电子技术股份有限公司 Anti-shake detection method and device, terminal equipment and readable storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5257116A (en) * 1989-12-25 1993-10-26 Fuji Xerox Co., Ltd. High definition image generating system for image processing apparatus
KR20110049570A (en) * 2009-11-05 2011-05-12 홍익대학교 산학협력단 Image enhancement method using neural network model based on edge component classification
KR101084719B1 (en) * 2010-06-25 2011-11-22 (주)퓨처아이스 Intelligent smoke detection system using image processing and computational intelligence
CN102521592A (en) * 2011-11-30 2012-06-27 苏州大学 Multi-feature fusion salient region extracting method based on non-clear region inhibition
US20150086127A1 (en) * 2013-09-20 2015-03-26 Samsung Electronics Co., Ltd Method and image capturing device for generating artificially defocused blurred image
CN107578395A (en) * 2017-08-31 2018-01-12 中国地质大学(武汉) The image quality evaluating method that a kind of view-based access control model perceives
CN110428412A (en) * 2019-07-31 2019-11-08 北京奇艺世纪科技有限公司 The evaluation of picture quality and model generating method, device, equipment and storage medium
KR20210009258A (en) * 2019-07-16 2021-01-26 연세대학교 산학협력단 Method and apparatus for image quality assessment
CN112308799A (en) * 2020-11-05 2021-02-02 山东交通学院 Offshore road complex environment visibility optimization screen display method based on multiple sensors

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG Shangpeng: "Image dust-haze sharpening algorithm based on distortion statistical feature extraction", Computer Simulation, No. 06, 15 June 2020 (2020-06-15) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination