CN114742774A - No-reference image quality evaluation method and system fusing local and global features - Google Patents
- Publication number: CN114742774A (application CN202210326356.9A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30168—Image quality inspection
Abstract
The invention relates to a no-reference image quality evaluation method and system fusing local and global features, wherein the method comprises the following steps: step S1: performing data preprocessing on the images in a distorted image data set, and dividing the data set into a training set and a test set; step S2: constructing a global and local image coordinate extraction module; step S3: constructing a time attention mechanism module; step S4: constructing a no-reference image quality evaluation network fusing local and global image features, and training it with the training set; step S5: inputting the images in the test set into the trained no-reference image quality evaluation network model, and outputting the corresponding image quality evaluation results. The method and the system help improve the accuracy of no-reference image quality evaluation.
Description
Technical Field
The invention belongs to the field of image processing and computer vision, and particularly relates to a no-reference image quality evaluation method and system fusing local and global features.
Background
With the advent of the mobile age, billions of pictures are produced every day across social media applications, most of them taken by non-professional users in varied outdoor environments. Unlike pictures taken by professional photographers, pictures taken by ordinary users tend to suffer from underexposure or overexposure, low visibility, motion blur, ghosting, and other distortions. Owing to technical and hardware limitations, distortions of varying degree are inevitably introduced into an image during capture, and further distortion inevitably arises during compression, processing, transmission, and display, producing varying degrees of quality degradation. High-quality images, on the other hand, both improve the viewing experience and benefit many computer vision algorithms. Measuring the quality of an image, and whether it meets the requirements of a specific application, is therefore the goal of image quality evaluation. Moreover, quality evaluation results can serve as auxiliary reference information for image restoration and enhancement techniques, and can provide a feasible way to design and optimize advanced image/video processing algorithms, so image quality evaluation methods are greatly needed.
Conventional no-reference image quality evaluation methods rely on hand-crafted features, and most attempt to detect particular types of distortion, such as blurring, blocking artifacts, and various forms of noise. For example, image blur has been evaluated with edge-analysis methods and transform-domain methods; image noise with filter-based methods, wavelet-transform-based methods, and other transform-domain methods; and blocking artifacts with block-boundary and transform-domain methods. There are also general-purpose no-reference quality assessment methods that do not target specific distortion types; these usually cast the no-reference quality assessment problem as classification or regression, with the classifier or regressor trained on specific features. Hand-crafted features have their limitations, however, because different types of image content have different characteristics, which strongly affect the quality assessment score.
Research on no-reference image quality evaluation has now entered the deep learning era, and compared with hand-crafted features, the features extracted by a neural network are stronger and better suited to image quality evaluation. Problems remain, however, in applying neural networks to image quality evaluation. First, the cropping or scaling of a picture in the preprocessing stage of training affects its quality and introduces error into the evaluation result. Second, the evaluation of a picture depends not only on the whole picture but also on its local regions. Further study of no-reference image quality evaluation methods is therefore necessary.
Disclosure of Invention
The invention aims to provide a no-reference image quality evaluation method and system fusing local and global features, which help improve the accuracy of no-reference image quality evaluation.
In order to achieve the purpose, the invention adopts the technical scheme that: a no-reference image quality evaluation method fusing local and global features comprises the following steps:
step S1: dividing a distorted image data set into a training set and a test set, and performing data preprocessing on images in the data set;
step S2: constructing a global and local image coordinate extraction module;
step S3: constructing a time attention mechanism module;
step S4: constructing a no-reference image quality evaluation network fusing local and global image characteristics, and training the no-reference image quality evaluation network by adopting a training set;
step S5: and inputting the images in the test set into a trained non-reference image quality evaluation network model, and outputting a corresponding image quality evaluation result.
Further, the step S1 specifically includes the following steps:
step S11: matching the images in the distorted image data set with the corresponding labels;
step S12: dividing images in the distorted image data set into a training set and a test set according to a set proportion;
step S13: all images to be trained in the training set are scaled to a fixed size H × W;
step S14: carrying out uniform random overturning operation on the image processed in the step S13 to enhance the data;
step S15: the normalization process is performed on the image processed in step S14 and the images in the test set.
Further, in step S2, the global and local image coordinate extraction module performs the operation of extracting global or local image coordinates as follows: dividing an image of size H × W into n² disjoint global or local images of size h × w, where h = H/n and w = W/n, then recording the coordinates, in the original image, of the top-left and bottom-right pixels of each global or local image, where n is a set parameter: when n = 1 the global image is extracted, and when n > 1 local images are extracted;

the global and local image coordinate extraction module repeats the global or local image coordinate extraction operation N times, the parameter n being i on the i-th execution, obtaining the top-left and bottom-right coordinates of the global or local images at N image scales; the top-left and bottom-right coordinates at each image scale are concatenated to obtain the top-left and bottom-right coordinate vectors (x_l, y_l, x_r, y_r), whose dimension is Q, where Q = 1 + 2² + 3² + ... + N².
Further, the step S3 specifically includes the following steps:
step S31: the input of the time attention mechanism module is F_in, with dimensions C × h_x × w_x; first changing the dimensions of the input feature F_in to Q × c × h_x × w_x to obtain the feature F_reshape, where c = C/Q and Q is the number of local and global images;

step S32: feeding F_reshape of step S31 sequentially into a spatial pooling layer and a channel pooling layer; F_reshape is first input into the spatial pooling layer to obtain the output F_spatial, with dimensions Q × c × 1 × 1, the calculation formula of F_spatial being:

F_spatial = Maxpool(F_reshape) + Avgpool(F_reshape)

where Maxpool(·) denotes the spatial max-pooling layer with stride 1 and Avgpool(·) denotes the spatial average-pooling layer with stride 1;

F_spatial is then input into the channel pooling layer to obtain the output F_channel, with dimensions Q × 1 × 1 × 1, the calculation formula of F_channel being:

F_channel = Conv_1×1(Concat(CMaxpool(F_spatial), CAvgpool(F_spatial)))

where CMaxpool(·) denotes the channel max-pooling layer with stride 1, CAvgpool(·) denotes the channel average-pooling layer with stride 1, Concat(·) denotes concatenation of features along a new dimension, and Conv_1×1(·) denotes a convolution layer with kernel size 1 × 1 used for dimensionality reduction;

step S33: changing the dimension of F_channel of step S32 from Q × 1 × 1 × 1 to Q by a Reshape operation, then inputting F_channel into two fully-connected layers; an attention mechanism is adopted so that the model learns the importance of the different global and local images of the image, determining which of the local and global images have the greater influence on the quality evaluation of the whole image; the values are mapped to (0, 1) by a sigmoid function to obtain the feature weights w_time; changing the dimension of w_time from Q to Q × 1 × 1 × 1 by a Reshape operation, then using the feature weights as guiding weights for the local and global images, i.e., the initially input image feature F_in is multiplied by the weights w_time and added to F_in to obtain the final output of the time attention mechanism module, F_time, with dimensions C × h_x × w_x, the calculation formulas being:

w_time = Sigmoid(MLP_1(Reshape_1(F_channel)))
F_time = F_in + (F_in × Reshape_2(w_time));
further, the step S4 specifically includes the following steps:
step S41: establishing a backbone network on the basis of one of the image classification networks including ResNet50 and ResNet101, and removing its last layer to serve as the feature extraction network;

step S42: inputting a batch of images in the training set into the feature extraction network of step S41, the feature extraction network outputting image features F_backbone with dimensions C × h_x × w_x, where C is the number of channels of the image features; meanwhile, inputting the images into the global and local image coordinate extraction module to obtain the top-left and bottom-right coordinates of the local and global images;

step S43: since the same image yields a global image and local images of different sizes, while feature dimensions are required to be identical in the neural-network batch processing stage, inputting the output image feature F_backbone, together with the top-left and bottom-right coordinates of the corresponding local and global images obtained by the global and local image coordinate extraction module, into the region-of-interest correction module, obtaining local and global image features F_align of identical dimensions C × poolsize × poolsize, where poolsize is the size of the image feature;

step S44: inputting the feature F_align output in step S43 into the time attention mechanism module constructed in step S3 to obtain the output F_time of the time attention mechanism, then inputting F_time into a bidirectional gated recurrent unit network to simulate the sequential viewing of local regions when human beings evaluate image quality, obtaining an output F_bigru with dimensions Q × C;

step S45: for the output F_bigru of step S44, first changing its dimension from Q × C to P, where P = Q × C, by a Reshape operation, then inputting F_bigru into the last two fully-connected layers to obtain the final image quality evaluation score F_out, of dimension 1, representing the quality score of the picture, the calculation formula being:

F_out = MLP_2(Reshape_3(F_bigru));
step S46: calculating the loss function of the no-reference image quality evaluation network fusing local and global image features as:

L = (1/m) Σ_{i=1}^{m} (y_i − ŷ_i)²

where m is the number of samples, y_i denotes the true quality score of the image, and ŷ_i denotes the quality score of the image obtained by the no-reference image quality evaluation network fusing local and global image features; the true quality score of each global image and each local image is the same as that of the image to which it belongs;
step S47: and (4) repeating the steps S42 to S46 by taking batches as units until the loss value calculated in the step S46 converges and tends to be stable, storing the network parameters, and finishing the training process of the non-reference image quality evaluation network fusing the local and global image characteristics.
Further, in step S5, the images in the test set are input to the trained no-reference image quality evaluation network model, and the quality scores corresponding to the images are output.
The invention also provides a no-reference image quality evaluation system fusing local and global features, comprising a memory, a processor, and computer program instructions stored on the memory and executable by the processor, the computer program instructions, when executed by the processor, implementing the steps of the method described above.
Compared with the prior art, the invention has the following beneficial effects: the method performs no operations on the input picture that would affect its quality, preserving its details and proportions; it simulates the behavior of a person evaluating picture quality, focusing attention sequentially on the regions that most influence picture quality; and it effectively exploits the local and global features of the picture, improving the accuracy of no-reference image quality evaluation. The invention therefore has strong practicability and broad application prospects.
Drawings
FIG. 1 is a flow chart of a method implementation of an embodiment of the present invention.
Fig. 2 is a schematic structural diagram of a non-reference image quality evaluation network that fuses local and global image features in an embodiment of the present invention.
FIG. 3 is a schematic diagram of a time attention mechanism module according to an embodiment of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
As shown in fig. 1, the present embodiment provides a no-reference image quality evaluation method fusing local and global features, comprising the following steps:
step S1: and dividing the distorted image data set into a training set and a test set, and performing data preprocessing on the images in the data set.
In this embodiment, the step S1 specifically includes the following steps:
step S11: and carrying out pairing processing on the images in the distorted image data set and the corresponding labels.
Step S12: and dividing the images in the distorted image data set into a training set and a test set according to a set proportion.
Step S13: all the images to be trained in the training set are scaled to a fixed size H W.
Step S14: and (4) performing uniform random flipping operation on the image processed in the step (S13) to enhance the data.
Step S15: the normalization process is performed on the image processed in step S14 and the images in the test set.
Step S2: and constructing a global and local image coordinate extraction module.
Specifically, the global and local image coordinate extraction module performs the operation of extracting global or local image coordinates as follows: an image of size H × W is divided into n² disjoint global or local images of size h × w, where h = H/n and w = W/n, and the coordinates, in the original image, of the top-left and bottom-right pixels of each global or local image are then recorded. Here n is a set parameter: when n = 1 the global image is extracted, and when n > 1 local images are extracted.

The global and local image coordinate extraction module repeats the global or local image coordinate extraction operation N times, the parameter n being i on the i-th execution, obtaining the top-left and bottom-right coordinates of the global or local images at N image scales. The top-left and bottom-right coordinates at each image scale are concatenated to obtain the top-left and bottom-right coordinate vectors (x_l, y_l, x_r, y_r), whose dimension is Q, where Q = 1 + 2² + 3² + ... + N².
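A minimal sketch of this coordinate extraction, assuming H and W are divisible by each n and integer pixel coordinates with an inclusive bottom-right corner:

```python
def extract_coords(H, W, N):
    """Top-left / bottom-right pixel coordinates of the n x n grids, n = 1..N."""
    coords = []
    for n in range(1, N + 1):        # n = 1 yields the single global image
        h, w = H // n, W // n        # assumed h = H/n, w = W/n
        for i in range(n):
            for j in range(n):
                xl, yl = j * w, i * h
                xr, yr = xl + w - 1, yl + h - 1   # inclusive corner
                coords.append((xl, yl, xr, yr))
    return coords                    # length Q = 1 + 2^2 + ... + N^2
```

For example, with H = W = 224 and N = 3, the function returns Q = 14 coordinate tuples, the first being the whole image.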
Step S3: as shown in fig. 3, a time attention mechanism module is constructed.
In this embodiment, the step S3 specifically includes the following steps:
Step S31: the input of the time attention mechanism module is F_in, with dimensions C × h_x × w_x. The input feature F_in is first reshaped to dimensions Q × c × h_x × w_x to obtain the feature F_reshape, where c = C/Q and Q is the number of local and global images.

Step S32: F_reshape of step S31 is fed sequentially into a spatial pooling layer and a channel pooling layer. F_reshape is first input into the spatial pooling layer to obtain the output F_spatial, with dimensions Q × c × 1 × 1, the calculation formula of F_spatial being:

F_spatial = Maxpool(F_reshape) + Avgpool(F_reshape)

where Maxpool(·) denotes the spatial max-pooling layer with stride 1 and Avgpool(·) denotes the spatial average-pooling layer with stride 1.

F_spatial is then input into the channel pooling layer to obtain the output F_channel, with dimensions Q × 1 × 1 × 1, the calculation formula of F_channel being:

F_channel = Conv_1×1(Concat(CMaxpool(F_spatial), CAvgpool(F_spatial)))

where CMaxpool(·) denotes the channel max-pooling layer with stride 1, CAvgpool(·) denotes the channel average-pooling layer with stride 1, Concat(·) denotes concatenation of features along a new dimension, and Conv_1×1(·) denotes a convolution layer with kernel size 1 × 1 used for dimensionality reduction.

Step S33: F_channel of step S32 has its dimension changed from Q × 1 × 1 × 1 to Q by a Reshape operation (denoted Reshape_1), and is then input into two fully-connected layers (denoted MLP_1). With this attention mechanism the model learns the importance of the different global and local images of the image, determining which of the local and global images have the greater influence on the quality evaluation of the whole image. The values are mapped to (0, 1) by a sigmoid function to obtain the feature weights w_time. The dimension of w_time is changed from Q to Q × 1 × 1 × 1 by a Reshape operation (denoted Reshape_2), and the feature weights are then used as guiding weights for the local and global images: the initially input image feature F_in is multiplied by the weights w_time and added to F_in, giving the final output of the time attention mechanism module, F_time, with dimensions C × h_x × w_x. The calculation formulas are:

w_time = Sigmoid(MLP_1(Reshape_1(F_channel)))
F_time = F_in + (F_in × Reshape_2(w_time)).
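The data flow of steps S31–S33 can be sketched in NumPy under simplifying assumptions: the spatial pooling is taken as global max plus global average pooling, the 1 × 1 convolution is stood in for by a fixed 2-to-1 linear mix, and W1, W2 and the tanh nonlinearity between the two fully-connected layers are illustrative stand-ins, not learned parameters from the patent:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def time_attention(F_in, Q, W1, W2):
    """Sketch of the time attention module.

    F_in : (C, hx, wx) feature map, with C divisible by Q.
    W1, W2 : illustrative weights for the two fully-connected layers.
    """
    C, hx, wx = F_in.shape
    c = C // Q
    F_reshape = F_in.reshape(Q, c, hx, wx)            # (Q, c, hx, wx)
    # Spatial pooling: global max + global average over (hx, wx) -> (Q, c)
    F_spatial = F_reshape.max(axis=(2, 3)) + F_reshape.mean(axis=(2, 3))
    # Channel pooling: max + average over channels, concatenated, with the
    # 1x1 convolution modelled as a fixed linear mix (an assumption) -> (Q,)
    stacked = np.stack([F_spatial.max(axis=1), F_spatial.mean(axis=1)], axis=1)
    F_channel = stacked @ np.array([0.5, 0.5])
    # Two FC layers (tanh in between is an assumption) + sigmoid -> one
    # weight per global/local image, mapped into (0, 1)
    w_time = sigmoid(np.tanh(F_channel @ W1) @ W2)    # (Q,)
    # Broadcast each weight over its c channels; residual connection
    w_full = np.repeat(w_time, c).reshape(C, 1, 1)
    return F_in + F_in * w_full                       # (C, hx, wx)
```

The output keeps the input's C × h_x × w_x shape, since each of the Q sub-image weights only rescales its own block of c channels before the residual addition.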
step S4: as shown in fig. 2, a no-reference image quality evaluation network fusing local and global image features is constructed, and the no-reference image quality evaluation network is trained by using a training set.
In this embodiment, the step S4 specifically includes the following steps:
step S41: and establishing a backbone network based on one of the image classification networks comprising ResNet50 and ResNet101, and removing the last layer of the backbone network to serve as a feature extraction network.
Step S42: a batch of images in the training set is input into the feature extraction network of step S41, and the feature extraction network outputs image features F_backbone with dimensions C × h_x × w_x, where C is the number of channels of the image features; meanwhile, the images are input into the global and local image coordinate extraction module to obtain the top-left and bottom-right coordinates of the local and global images.
Step S43: because the same image has a global image and a local image with different sizes, and the feature dimensions are required to be the same in the neural network batch processing stage, the image feature F to be outputbackboneAnd the corresponding coordinates Of the upper left corner and the lower right corner Of the local image and the global image obtained by the global and local image coordinate extraction module are input into a Region Of Interest correction module (Region Of Interest Align) together, so as to obtain the local and global image features F with the same dimensionalignAnd its dimension is C × poolsize × poolsize, where poolsize is the size of the image feature.
Step S44: feature F output in step S43alignInput to the time attention mechanism module constructed in step S3, and output F of the time attention mechanism is obtainedtimeThen F is puttimeInputting the image data into a Bidirectional Gate control circulation Unit (BiGRU) network to simulate the local sequence of the image when human beings evaluate the image quality, and obtaining an output FbigruIts dimension is QXC.
Step S45: for output F of step S44bigruFirst, a Reshape operation (denoted as Reshape) is employed3) Changing its dimension from Q × C to P, P being Q × C, and then changing FbigruInput into the last two fully-connected layers (denoted as MLP)2) Thereby obtaining a final image quality evaluation score FoutThe dimension is 1, the quality score of the picture is represented, and the calculation formula is as follows:
Fout=MLP2(Reshape3(Fbigru))。
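Step S45 amounts to a flatten followed by two fully-connected layers; a sketch with illustrative weights (W_a, W_b, their shapes, and the ReLU between the layers are assumptions):

```python
import numpy as np

def quality_score(F_bigru, W_a, W_b):
    """Flatten the (Q, C) BiGRU output to a P = Q*C vector and map it
    through two fully-connected layers to a scalar quality score.
    W_a, W_b and the intermediate ReLU are illustrative assumptions."""
    p = F_bigru.reshape(-1)               # Reshape_3: (Q, C) -> (P,)
    hidden = np.maximum(p @ W_a, 0.0)     # first FC layer + ReLU
    return float(hidden @ W_b)            # second FC layer -> scalar F_out
```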
step S46: calculating the loss function of the non-reference image quality evaluation network fusing local and global image characteristics as follows:
wherein m is the number of samples, yiRepresenting the true quality score of the image,representing the quality score of the image obtained by a non-reference image quality evaluation network fusing local and global image characteristics; the real quality scores of each global image and each local image are the same as the real quality scores of the images to which the global images and the local images belong.
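Taking the batch loss to be the mean squared error over the m samples — an assumed instantiation consistent with the definitions of y_i and m above — it can be computed as:

```python
import numpy as np

def mse_loss(y_true, y_pred):
    """Mean squared error over the m samples of a batch (assumed loss form)."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    return float(((y_true - y_pred) ** 2).mean())
```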
Step S47: and (4) repeating the steps S42 to S46 by taking batches as units until the loss value calculated in the step S46 converges and tends to be stable, storing the network parameters, and finishing the training process of the non-reference image quality evaluation network fusing the local and global image characteristics.
Step S5: and inputting the images in the test set into a trained non-reference image quality evaluation network model, and outputting the quality scores corresponding to the images.
The embodiment provides a no-reference image quality evaluation system fusing local and global features, which comprises a memory, a processor and computer program instructions stored on the memory and capable of being executed by the processor, wherein when the computer program instructions are executed by the processor, the above-mentioned method steps can be realized.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing describes preferred embodiments of the present invention; the invention is not limited to these embodiments, and other and further embodiments may be devised without departing from its basic scope, which is determined by the claims that follow. Any simple modification, equivalent change, or adaptation of the above embodiments in accordance with the technical essence of the present invention falls within the protection scope of the technical solution of the present invention.
Claims (7)
1. A no-reference image quality evaluation method fusing local and global features is characterized by comprising the following steps:
step S1: dividing a distorted image data set into a training set and a test set, and performing data preprocessing on images in the data set;
step S2: constructing a global and local image coordinate extraction module;
step S3: constructing a time attention mechanism module;
step S4: constructing a no-reference image quality evaluation network fusing local and global image features, and training the no-reference image quality evaluation network on the training set;
step S5: inputting the images in the test set into the trained no-reference image quality evaluation network model, and outputting the corresponding image quality evaluation results.
2. The no-reference image quality evaluation method fusing local and global features according to claim 1, wherein step S1 specifically comprises the following steps:
step S11: matching the images in the distorted image data set with the corresponding labels;
step S12: dividing images in the distorted image data set into a training set and a test set according to a set proportion;
step S13: all images to be trained in the training set are scaled to a fixed size H multiplied by W;
step S14: performing a random flipping operation (with uniform probability) on the images processed in step S13 for data augmentation;
step S15: the normalization process is performed on the image processed in step S14 and the images in the test set.
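The preprocessing in steps S11–S15 can be outlined as follows. This is an illustrative sketch, not the patented implementation: the split ratio, normalization constants, and function names are assumptions, and resizing to H × W would normally be done with an image library rather than the placeholder flip shown here.

```python
import random

def split_dataset(samples, train_ratio=0.8, seed=0):
    """Steps S11-S12: samples are (image, label) pairs already matched to
    their labels; shuffle and split them by a set ratio."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]   # training set, test set

def random_flip(pixels, width, seed=None):
    """Step S14: horizontally flip a flattened H*W image with probability 0.5."""
    rng = random.Random(seed)
    if rng.random() < 0.5:
        rows = [pixels[i:i + width] for i in range(0, len(pixels), width)]
        pixels = [p for row in rows for p in reversed(row)]
    return pixels

def normalize(pixels, mean=0.5, std=0.5):
    """Step S15: normalize pixel values (assumed already scaled to [0, 1])."""
    return [(p - mean) / std for p in pixels]
```

Applied to the test set, only `normalize` is used, matching step S15.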
3. The no-reference image quality evaluation method fusing local and global features according to claim 1, wherein in step S2, the global and local image coordinate extraction module obtains global or local image coordinates as follows: an image of size H × W is divided into n² disjoint global or local images of size h × w, where h = H/n and w = W/n; the coordinates in the original image of the upper-left and lower-right corner pixels of each global or local image are then recorded; n is a set parameter: when n = 1 the global image is extracted, and when n > 1 local images are extracted;
the global and local image coordinate extraction module repeats this coordinate extraction operation N times, with the parameter n set to i on the i-th execution, obtaining the upper-left and lower-right corner coordinates of the global or local images at N image scales; the upper-left and lower-right corner coordinates at each image scale are concatenated to obtain upper-left and lower-right coordinate vectors (x_l, y_l, x_r, y_r) of dimension Q, where Q = 1 + 2² + 3² + … + N².
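A minimal sketch of the multi-scale coordinate extraction in claim 3 (function and variable names are our own, and integer division is assumed where H or W is not divisible by n): for each n from 1 to N the image is divided into n² disjoint patches, and the upper-left and lower-right corner coordinates are collected, giving Q = 1 + 2² + … + N² tuples in total.

```python
def extract_patch_coords(H, W, N):
    """Return (x_l, y_l, x_r, y_r) for every global/local patch at scales n = 1..N.

    n = 1 yields the single global image; n > 1 yields n*n disjoint local
    images of size (H // n) x (W // n).
    """
    coords = []
    for n in range(1, N + 1):
        h, w = H // n, W // n
        for row in range(n):
            for col in range(n):
                coords.append((col * w, row * h, (col + 1) * w, (row + 1) * h))
    return coords
```

For N = 3 this yields Q = 1 + 4 + 9 = 14 coordinate tuples, matching the formula in the claim.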
4. The no-reference image quality evaluation method fusing local and global features according to claim 3, wherein step S3 specifically comprises the following steps:
step S31: the input of the time attention mechanism module is F_in, with dimension C × h_x × w_x; the input feature F_in is first reshaped to dimension Q × c × h_x × w_x, obtaining a feature F_reshape, where c = C/Q and Q is the number of local and global images;
step S32: F_reshape from step S31 is fed sequentially into a spatial pooling layer and a channel pooling layer; F_reshape is input into the spatial pooling layer to obtain an output F_spatial with dimension Q × c × 1 × 1, calculated as:
F_spatial = Maxpool(F_reshape) + Avgpool(F_reshape)
where Maxpool(·) denotes a spatial max-pooling layer with stride 1 and Avgpool(·) denotes a spatial average-pooling layer with stride 1;
F_spatial is then input into the channel pooling layer to obtain an output F_channel with dimension Q × 1 × 1 × 1, calculated as:
F_channel = Conv_1×1(Concat(CMaxpool(F_spatial), CAvgpool(F_spatial)))
where CMaxpool(·) denotes a channel max-pooling layer with stride 1, CAvgpool(·) denotes a channel average-pooling layer with stride 1, Concat(·) denotes concatenation of features along a new dimension, and Conv_1×1(·) denotes a dimensionality-reducing convolution layer with kernel size 1 × 1;
step S33: F_channel from step S32 has its dimension changed from Q × 1 × 1 × 1 to Q by a Reshape operation; F_channel is then input into two fully-connected layers, and the attention mechanism lets the model learn the importance of the different global or local images, i.e. which of the local and global images have a larger influence on the quality evaluation of the overall image; the values are mapped into (0, 1) by a sigmoid function to obtain a feature weight w_time; w_time has its dimension changed from Q to Q × 1 by a Reshape operation and is then used as a guiding weight for the local and global images, that is, the initially input image feature F_in is multiplied by the weight w_time and added to F_in, giving the final output F_time of the time attention mechanism module, with dimension C × h_x × w_x; the calculation formulas are:
w_time = Sigmoid(MLP_1(Reshape_1(F_channel)))
F_time = F_in + (F_in × Reshape_2(w_time)).
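The tensor flow of the time attention module (steps S31–S33) can be sketched in NumPy as follows. The 1 × 1 convolution and the two fully-connected layers are represented by random weight matrices standing in for learned parameters, and a ReLU between the two FC layers is an assumption; the sketch is meant only to show the shapes and the pooling/weighting arithmetic, not a trained module.

```python
import numpy as np

def time_attention(F_in, Q, rng):
    """Illustrative forward pass of the time attention module.

    F_in: features of shape (C, h_x, w_x); Q: number of global/local images.
    """
    C, hx, wx = F_in.shape
    c = C // Q
    F_reshape = F_in.reshape(Q, c, hx, wx)                      # step S31

    # Step S32: spatial pooling, max + average over (h_x, w_x) -> (Q, c)
    F_spatial = F_reshape.max(axis=(2, 3)) + F_reshape.mean(axis=(2, 3))

    # Channel pooling: concatenate channel-max and channel-mean, then a
    # 1x1 "convolution" (a random 2 -> 1 linear map here) -> (Q,)
    pooled = np.stack([F_spatial.max(axis=1), F_spatial.mean(axis=1)], axis=1)
    F_channel = (pooled @ rng.standard_normal((2, 1))).reshape(Q)

    # Step S33: two fully-connected layers + sigmoid -> per-image weights
    W1, W2 = rng.standard_normal((Q, Q)), rng.standard_normal((Q, Q))
    hidden = np.maximum(F_channel @ W1, 0.0)
    w_time = 1.0 / (1.0 + np.exp(-(hidden @ W2)))               # in (0, 1)

    # Broadcast each image's weight over its c channels; residual connection
    w_full = np.repeat(w_time, c).reshape(C, 1, 1)
    return F_in + F_in * w_full                                 # (C, h_x, w_x)
```

Note the residual form F_in + F_in · w_time, matching the last formula of claim 4: the output keeps the input's dimension C × h_x × w_x.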
5. The no-reference image quality evaluation method fusing local and global features according to claim 4, wherein step S4 specifically comprises the following steps:
step S41: a backbone network is established on the basis of one of the image classification networks ResNet50 and ResNet101, and the last layer of the backbone network is removed to obtain a feature extraction network;
step S42: a batch of images from the training set is input into the feature extraction network of step S41, and the feature extraction network outputs image features F_backbone with dimension C × h_x × w_x, where C is the number of channels of the image features; meanwhile, the images are input into the global and local image coordinate extraction module to obtain the upper-left and lower-right corner coordinates of the local and global images;
step S43: because the same image yields a global image and local images of different sizes, while feature dimensions must be identical in the batch-processing stage of a neural network, the output image feature F_backbone and the upper-left and lower-right corner coordinates of the corresponding local and global images obtained by the global and local image coordinate extraction module are input together into a region-of-interest correction module, obtaining local and global image features F_align of the same dimension C × poolsize × poolsize, where poolsize is the size of the image feature;
step S44: the feature F_align output in step S43 is input into the time attention mechanism module constructed in step S3 to obtain the output F_time of the time attention mechanism; F_time is then input into a bidirectional gated recurrent unit network to simulate the sequential local viewing of an image when human beings evaluate image quality, obtaining an output F_bigru with dimension Q × C;
step S45: the output F_bigru of step S44 first has its dimension changed from Q × C to P by a Reshape operation, where P = Q × C; F_bigru is then input into the last two fully-connected layers to obtain the final image quality evaluation score F_out, whose dimension is 1 and which represents the quality score of the picture; the calculation formula is:
F_out = MLP_2(Reshape_3(F_bigru))
step S46: the loss function of the no-reference image quality evaluation network fusing local and global image features is calculated as:
loss = (1/m) · Σ_{i=1}^{m} (y_i − ŷ_i)²
where m is the number of samples, y_i represents the true quality score of the i-th image, and ŷ_i represents the quality score of the image obtained by the no-reference image quality evaluation network fusing local and global image features; the true quality score of each global and local image is the same as the true quality score of the image to which it belongs;
step S47: steps S42 to S46 are repeated batch by batch until the loss value calculated in step S46 converges and becomes stable; the network parameters are then saved, completing the training process of the no-reference image quality evaluation network fusing local and global image features.
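Step S46 compares the predicted quality scores of a batch of m samples with their ground-truth scores. Assuming a mean-squared-error formulation (a standard choice for quality-score regression; the patent's exact formula is not reproduced here), a minimal sketch:

```python
def mse_loss(y_true, y_pred):
    """Mean squared error between true and predicted quality scores
    over a batch of m samples."""
    m = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / m
```

As claim 5 states, each global or local patch inherits the true quality score of the image it was cut from, so `y_true` repeats the parent image's score across its Q patches.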
6. The no-reference image quality evaluation method fusing local and global features according to claim 5, wherein in step S5, the images in the test set are input into the trained no-reference image quality evaluation network model, and the quality scores corresponding to the images are output.
7. A no-reference image quality assessment system fusing local and global features, comprising a memory, a processor, and computer program instructions stored on the memory and executable by the processor, wherein the computer program instructions, when executed by the processor, implement the method steps of any one of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210326356.9A CN114742774A (en) | 2022-03-30 | 2022-03-30 | No-reference image quality evaluation method and system fusing local and global features |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114742774A true CN114742774A (en) | 2022-07-12 |
Family
ID=82280475
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210326356.9A Pending CN114742774A (en) | 2022-03-30 | 2022-03-30 | No-reference image quality evaluation method and system fusing local and global features |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114742774A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115830028A (en) * | 2023-02-20 | 2023-03-21 | 阿里巴巴达摩院(杭州)科技有限公司 | Image evaluation method, device, system and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160148074A1 (en) * | 2014-11-26 | 2016-05-26 | Captricity, Inc. | Analyzing content of digital images |
CN111192200A (en) * | 2020-01-02 | 2020-05-22 | 南京邮电大学 | Image super-resolution reconstruction method based on fusion attention mechanism residual error network |
CN112085102A (en) * | 2020-09-10 | 2020-12-15 | 西安电子科技大学 | No-reference video quality evaluation method based on three-dimensional space-time characteristic decomposition |
WO2021134871A1 (en) * | 2019-12-30 | 2021-07-08 | 深圳市爱协生科技有限公司 | Forensics method for synthesized face image based on local binary pattern and deep learning |
CN113888501A (en) * | 2021-09-29 | 2022-01-04 | 西安理工大学 | Non-reference image quality evaluation method based on attention positioning network |
Non-Patent Citations (2)
Title |
---|
牛玉贞 (Niu Yuzhen): "No-Reference Image Quality Assessment Based on Multi-Scale Convolutional Neural Networks", Intelligent Computing: 2019 Computing Conference, 17 July 2019 (2019-07-17) *
牛玉贞 (Niu Yuzhen): "No-Reference Screen Content Image Quality Assessment Based on Multi-Scale Features", Journal of Chinese Computer Systems (小型微型计算机系统), 28 February 2022 (2022-02-28) *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |