WO2022145769A1

WO2022145769A1 - Method and apparatus for calculating image quality through image classification

Info

Publication number: WO2022145769A1
Application number: PCT/KR2021/018219
Authority: WO
Inventors: 송철환
Original assignee: 오드컨셉 주식회사
Priority date: 2021-01-04
Filing date: 2021-12-03
Publication date: 2022-07-07
Also published as: KR20220098504A

Abstract

The present invention relates to a method and apparatus for calculating an image quality index and one objective of the present invention is to calculate a quality index of an image by varying the weight according to the type of image. To this end, the present invention comprises: step A of receiving at least one query image from a user terminal; step B of generating a feature vector by applying a first neural network model to the query image; step C of identifying a label of the query image on the basis of the feature vector and calculating a first quality index; and step D of calculating a third quality index of the query image by using the label and the first quality index.

Description

Image quality calculation method and apparatus through image classification

The present invention relates to a method and apparatus for calculating image quality through image classification, and more particularly, to a method and apparatus for calculating an image quality index by adjusting a weight of an image quality operation based on a label for an image.

As the demand for multimedia services such as images and video increases and portable multimedia devices are universally distributed, the need for processing and analysis technologies for vast amounts of multimedia data is growing. Among them, image filtering technology plays an important role in image processing and analysis technology.

The image filtering technology may use the quality index for each image, but the conventional technology for calculating the quality index has a limitation that the quality determined by the computer and the user may be different depending on the type of image. For example, the computer determines that the quality of the general photo and the pictorial photo is the same, but the user may recognize that the pictorial photo is of higher quality than the general photo.

An object of the present invention is to solve the above-mentioned problem, and to calculate the quality index of an image by varying the weight according to the type of the image.

To achieve this object, the present invention comprises step A of receiving at least one query image from a user terminal, step B of generating a feature vector by applying a first neural network model to the query image, step C of identifying a label of the query image based on the feature vector and calculating a first quality index, and step D of calculating a third quality index of the query image using the label and the first quality index.

In addition, the present invention provides an apparatus comprising a query image input module that receives at least one query image from a user terminal, an image analysis module that generates a feature vector by applying a first neural network model to the query image, identifies a label of the query image based on the feature vector, and calculates a first quality index, and a quality index calculation module that calculates a third quality index of the query image using the label and the first quality index.

According to the present invention as described above, it is possible to calculate the quality index of the image by changing the weight according to the type of the image. Through this, the gap between the image quality judged by the computer and the image quality judged by humans can be narrowed.

FIG. 1 is a block diagram showing the configuration of an image quality calculating device according to an embodiment of the present invention;

FIG. 2 is a flowchart for explaining an image quality calculation method according to an embodiment of the present invention;

FIG. 3 is a flowchart illustrating a process of training a neural network model for image quality calculation according to an embodiment of the present invention.

The above-described objects, features and advantages will be described below in detail with reference to the accompanying drawings, and accordingly, those of ordinary skill in the art to which the present invention pertains will be able to easily implement the technical idea of the present invention. In describing the present invention, if it is determined that a detailed description of a known technology related to the present invention may unnecessarily obscure the gist of the present invention, the detailed description will be omitted.

In the drawings, the same reference numerals are used to indicate the same or similar elements, and all combinations described in the specification and claims may be combined in any manner. Unless otherwise provided, it is to be understood that references to the singular may include one or more, and may also include plural expressions.

The terminology used herein is for the purpose of describing specific exemplary embodiments only and is not intended to be limiting. As used herein, singular expressions may also be intended to include plural meanings unless the sentence clearly indicates otherwise. The term "and/or" includes any and all combinations of the items listed therewith. The terms "comprises", "comprising", "including", "having", and the like are inclusive, so that these terms specify the presence of the described features, integers, steps, operations, elements, and/or components and do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The steps, processes, and acts of the methods described herein should not be construed as necessarily being performed in the specific order discussed or exemplified, unless an order of performance is specifically stated. It should also be understood that additional or alternative steps may be used.

In addition, each of the components may be implemented as a hardware processor, the above components may be integrated into one hardware processor, or the above components may be combined with each other and implemented as a plurality of hardware processors.

Hereinafter, preferred embodiments according to the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram for explaining the configuration of an image quality calculating apparatus according to an embodiment of the present invention.

Referring to FIG. 1, the image quality calculating device 10 may receive at least one query image and calculate a quality index for the query image. The image quality calculating device 10 may analyze the query image to identify a label corresponding to the query image, and calculate a quality index by changing a weight according to the identified label.

The image quality calculation device 10 may include a query image input module 100, an image analysis module 200, and a quality index calculation module 300. The image quality calculation device 10 may process operations of the query image input module 100, the image analysis module 200, and the quality index calculation module 300 through at least one processor.

The query image input module 100 may receive at least one query image from the user terminal.

The image analysis module 200 may identify a label corresponding to the received query image. Specifically, the image analysis module 200 may classify the query image by extracting a feature vector from the query image and identifying a label of the query image based on the feature vector.

In order to identify a label corresponding to a query image in the image analysis module 200 according to an embodiment of the present invention, the first neural network model will be used. The first neural network model is trained based on machine learning.

The first neural network model according to an embodiment of the present invention will be based on a convolutional neural network (CNN). Convolutional neural networks are a class of multilayer perceptrons designed to use minimal preprocessing. A convolutional neural network consists of one or several convolutional layers and general artificial neural network layers on top of it, and additionally utilizes weights and pooling layers. Thanks to this structure, the convolutional neural network can fully utilize the input data of the two-dimensional structure.

The image analysis module 200 may include an encoder 230 and a decoder 260 as the first neural network model based on CNN is used. The encoder 230 may generate a feature vector representing detailed features in the received query image, and the decoder 260 may reconstruct data from the feature vector using a deconvolution layer.

The encoder 230 of the image analysis module 200 according to an embodiment of the present invention may be created by combining a convolution layer, an activation function layer (ReLU layer), a dropout layer, and a max-pooling layer.

The encoder 230 may use a conventional method such as a Scale Invariant Feature Transform (SIFT) algorithm to extract a feature vector of the received query image.

The decoder 260 may be generated by combining an upsampling layer, a deconvolution layer, a sigmoid layer, and a dropout layer.

The decoder 260 may identify a label corresponding to the query image based on the feature vector corresponding to the query image, and further calculate a first quality index of the query image.

The decoder 260 may normalize the feature vector by applying a softmax function to the feature vector of the query image. The softmax function is a function that provides normalization of the output value so that it can classify the output value used in the artificial neural network. The decoder 260 may apply a softmax function to classify the label of the query image based on the feature vector.

The decoder 260 may identify a label corresponding to the query image using a result of applying the softmax function to the feature vector. A label according to an embodiment of the present invention may include a portrait pictorial, an object (animal/landscape) pictorial, a cover photo, a portrait interview photo, a general photo, a general photo + text, a captured photo, a low-quality photo, and the like. Labels may be further added or deleted according to the administrator's settings.

For example, when an interview photo of Mr. A is input as a query image, a feature vector is generated for the query image, and the decoder 260 may apply the softmax function to it to obtain a result such as {(portrait pictorial, 13%), (object (animal/landscape) pictorial, 2%), (cover photo, 7%), (portrait interview photo, 54%), (general photo, 17%), (general photo + text, 3%), (captured photo, 3%), (low-quality photo, 1%)}.

The decoder 260 will select the label with the highest probability as the label corresponding to the query image.
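The normalization and label selection described above can be illustrated with a minimal Python sketch. The raw scores are hypothetical values chosen only for illustration, since the specification does not disclose actual decoder outputs, and the names `softmax` and `identify_label` are assumptions rather than identifiers from the patent.

```python
import math

# Label names as listed in the description.
LABELS = [
    "portrait pictorial",
    "object (animal/landscape) pictorial",
    "cover photo",
    "portrait interview photo",
    "general photo",
    "general photo + text",
    "captured photo",
    "low-quality photo",
]

def softmax(scores):
    """Normalize raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def identify_label(scores, labels=LABELS):
    """Return (label, probability) for the highest softmax probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Hypothetical raw scores for an interview photo (not taken from the patent).
scores = [1.2, -0.7, 0.6, 2.6, 1.5, -0.3, -0.3, -1.4]
label, prob = identify_label(scores)
print(label)
```

Because softmax preserves the ordering of the scores, the selected label is simply the one with the largest raw score; the normalization matters when the probabilities themselves are reported, as in the example above.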

Also, the decoder 260 may further calculate the first quality index of the query image based on the feature vector. The quality index is a value indicating the degree of quality of the query image, and the decoder 260 according to an embodiment of the present invention will use a conventional method in calculating the first quality index.

The quality index calculation module 300 may calculate a final quality index (third quality index) based on the label corresponding to the query image identified by the image analysis module 200 and the first quality index. The quality index calculation module 300 may calculate the third quality index by using the first quality index and the second quality index according to the label of the query image.

The quality index calculation module 300 may vary the second quality index according to the label of the query image. The quality index calculation module 300 according to an embodiment of the present invention may vary the range of the second quality index according to the priority of a label preset by the user, and select a second quality index within that range.

For example, suppose the user sets the priority of the labels to portrait pictorial >= object (animal/landscape) pictorial > cover photo >= portrait interview photo > general photo > general photo + text > captured photo > low-quality photo. In this case, the range of the second quality index may be set to (4, 5) for a portrait pictorial, (3.7, 4.7) for an object (animal/landscape) pictorial, (3.5, 4) for a cover photo, (3, 4) for a portrait interview photo, (2.5, 3.5) for a general photo, (2, 3) for a general photo + text, (1, 2) for a captured photo, and (0, 1) for a low-quality photo.
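The per-label ranges in this example can be sketched as a simple lookup table in Python. The ranges are copied from the example, keyed by the label names listed earlier in the description. The clamping helper `clamp_second_index` is an assumption added for illustration: the specification states only that a value within the range is selected, not how the range is enforced.

```python
# (low, high) range of the second quality index per label, from the example.
SECOND_INDEX_RANGES = {
    "portrait pictorial": (4.0, 5.0),
    "object (animal/landscape) pictorial": (3.7, 4.7),
    "cover photo": (3.5, 4.0),
    "portrait interview photo": (3.0, 4.0),
    "general photo": (2.5, 3.5),
    "general photo + text": (2.0, 3.0),
    "captured photo": (1.0, 2.0),
    "low-quality photo": (0.0, 1.0),
}

def clamp_second_index(label, proposed):
    """Force a proposed second quality index into the range preset for the label."""
    low, high = SECOND_INDEX_RANGES[label]
    return min(max(proposed, low), high)

# A proposed value above the cover photo range is pulled back to its upper bound.
print(clamp_second_index("cover photo", 4.6))
```

Keeping the ranges in a table matches the statement that labels and their priorities are configurable: adding or removing a label only changes the table.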

The second quality index corresponding to each label according to an embodiment of the present invention will be selected by the second neural network model within a preset range as described above. The second neural network model may be trained in advance with a training data set in order to select the second quality index. The training process of the second neural network model will be described in detail below.

The quality index calculation module 300 may generate a third quality index (final quality index) based on the first quality index of the query image calculated by the first neural network model and the second quality index calculated by the second neural network model.

The quality index calculation module 300 may generate the third quality index by adding the first quality index and the second quality index. For example, when the first quality index of the query image is 0.8, the label of the query image is a portrait pictorial, and the second quality index is 4, the quality index calculation module 300 calculates the third quality index as 0.8 (first quality index) + 4 (second quality index) = 4.8.

The image analysis module 200 and the quality index calculation module 300 according to an embodiment of the present invention may be trained through the following process.

The image analysis module 200 and the quality index calculation module 300 may be trained through supervised learning. In supervised learning, a model is trained in a state in which a label (correct answer) for the training data is given.

The image analysis module 200 may receive a training data set from the user terminal. The training data set will include a training image, a first label (type of image) of the training image, and a fourth quality index of the training image. For example, the training data may have the form (training image, first label, fourth quality index).
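The form of one training record can be sketched as a small Python data class. The field names, the file path, and the numeric value are hypothetical and serve only to show the triple (training image, first label, fourth quality index).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainingSample:
    image_path: str               # hypothetical: where the training image is stored
    first_label: str              # ground-truth image type
    fourth_quality_index: float   # ground-truth quality index

# Hypothetical record; the values are not taken from the patent.
sample = TrainingSample("images/0001.jpg", "cover photo", 4.3)
print(sample)
```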

The image analysis module 200 may generate a feature vector for the training image through the first neural network model, and identify a second label corresponding to the training data based on this.

The image analysis module 200 may compare the second label with the first label for the training image included in the training data set to determine whether the second label is correctly identified. The image analysis module 200 will train the first neural network model by giving positive feedback to the first neural network model if the first label and the second label are the same, and negative feedback if they are not.
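The feedback step can be sketched in Python as follows. The specification does not say what form the positive or negative feedback takes, so the +1/-1 signal in `label_feedback` is an assumption. The `cross_entropy` function is shown as one common way such feedback is realized when training a softmax classifier; it is a substituted standard technique, not something the patent names.

```python
import math

def label_feedback(first_label, second_label):
    """Return +1 (positive feedback) when the prediction matches, else -1."""
    return 1 if first_label == second_label else -1

def cross_entropy(probs, true_index):
    """Negative log-probability assigned to the ground-truth label."""
    return -math.log(probs[true_index])

# Probabilities from the interview-photo example in the description.
probs = [0.13, 0.02, 0.07, 0.54, 0.17, 0.03, 0.03, 0.01]

# If the ground truth is "portrait interview photo" (index 3), the loss is small;
# if it were "portrait pictorial" (index 0), the loss would be larger.
loss_correct = cross_entropy(probs, 3)
loss_wrong = cross_entropy(probs, 0)
print(label_feedback("cover photo", "cover photo"), loss_correct < loss_wrong)
```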

The quality index calculation module 300 may calculate the final quality index (the fifth quality index) for the training data by using the second neural network model. The quality index calculation module 300 may calculate a fifth quality index for the training image through the feature vector and the second label generated by the image analysis module 200 through the training data set.

The quality index calculation module 300 may calculate a loss based on the fifth quality index for the training image and the fourth quality index included in the training data set. The quality index calculation module 300 may train the second neural network model by adjusting a parameter for selecting the second quality index for each label of the second neural network model based on the loss value.
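The loss-driven parameter adjustment can be illustrated with a minimal Python sketch. Everything beyond "compute a loss from the fourth and fifth quality indices and adjust a parameter" is an assumption: the squared-error loss, the single parameter `theta` per label, the sigmoid mapping that keeps the second quality index inside its preset range, the learning rate, and the numeric values are all hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def second_index(theta, low, high):
    """Map an unconstrained parameter into the label's preset range (low, high)."""
    return low + (high - low) * sigmoid(theta)

def train_step(theta, first_index, fourth_index, low, high, lr=0.5):
    """One gradient-descent step on the squared error between fifth and fourth indices."""
    s = sigmoid(theta)
    fifth_index = first_index + low + (high - low) * s  # predicted final index
    error = fifth_index - fourth_index
    loss = error ** 2
    grad = 2.0 * error * (high - low) * s * (1.0 - s)   # d(loss)/d(theta)
    return theta - lr * grad, loss

# Hypothetical sample: portrait pictorial range (4, 5), first index 0.8,
# ground-truth (fourth) quality index 5.5.
theta = 0.0
for _ in range(500):
    theta, loss = train_step(theta, first_index=0.8, fourth_index=5.5, low=4.0, high=5.0)

print(round(second_index(theta, 4.0, 5.0), 2))
```

The sigmoid mapping is one way to guarantee that the learned value never leaves the range assigned to the label, whatever value the parameter takes during training.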

Through this process, the image analysis module 200 and the quality index calculation module 300 according to an embodiment of the present invention may more accurately calculate the third quality index for the query image.

FIG. 2 is a flowchart illustrating an image quality calculation method according to an embodiment of the present invention. Hereinafter, an image quality calculation method will be described with reference to FIG. 2. In the description of the image quality calculation method, a detailed embodiment overlapping with the image quality calculation apparatus described above may be omitted.

The image quality calculating device (hereinafter, the electronic device) may receive at least one query image from the user terminal (S110). The electronic device may analyze the received query image to identify a label corresponding to the query image, and calculate a quality index by varying a weight according to the label. The electronic device may use the first neural network model to analyze the query image to identify a label corresponding to the query image, and use the second neural network model to calculate a quality index by changing a weight according to the label.

The electronic device may extract a feature vector of the received query image (S120). The electronic device may use a conventional method such as a Scale Invariant Feature Transform (SIFT) algorithm to extract a feature vector of a query image.

The electronic device may identify a label corresponding to the query image based on the feature vector (S130). The electronic device may identify a label corresponding to the query image by normalizing the feature vector by applying a softmax function to the feature vector. The electronic device may select the label having the highest probability among the result values of the softmax function for the feature vector as the label corresponding to the query image.

The label according to an embodiment of the present invention may include a portrait pictorial, an object (animal/landscape) pictorial, a cover photo, a portrait interview photo, a general photo, a general photo + text, a captured photo, a low-quality photo, and the like. Labels may be added or deleted according to the administrator's settings.

The electronic device may further calculate the first quality index of the query image based on the feature vector extracted in step S120 (S135). The quality index is a value indicating the degree of quality of the query image, and the electronic device will use a conventional method in calculating the first quality index.

The electronic device may calculate a final quality index (third quality index) based on the label of the query image and the first quality index obtained in steps S130 and S135 (S140). The electronic device may calculate the third quality index by using the first quality index and the second quality index according to the label of the query image.

The electronic device may vary the second quality index according to the label of the query image. The electronic device according to an embodiment of the present invention may select a second quality index within the range by varying the range of the second quality index according to the priority of a label preset by the user.

The electronic device may calculate the third quality index by adding the first quality index to the second quality index selected according to the label of the query image. For example, if the first quality index of the query image is 0.8, the label of the query image is a portrait pictorial, and the second quality index is 4, the electronic device calculates the third quality index as 0.8 (first quality index) + 4 (second quality index) = 4.8.
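Steps S120 to S140 can be tied together in a short Python sketch. Each model is passed in as a callable, and the lambdas in the usage example are toy stand-ins that return fixed values so the sketch runs; none of these names or signatures come from the patent.

```python
def calculate_quality_index(image, extract_features, classify,
                            first_index_fn, second_index_fn):
    """Run steps S120 to S140 with each model supplied as a callable (all hypothetical)."""
    features = extract_features(image)         # S120: feature vector
    label = classify(features)                 # S130: label with highest probability
    first = first_index_fn(features)           # S135: first quality index
    second = second_index_fn(label, first)     # second quality index chosen per label
    return label, first + second               # S140: third quality index

# Toy stand-ins reproducing the worked example (0.8 + 4 = 4.8).
label, third = calculate_quality_index(
    image="query.jpg",
    extract_features=lambda img: [0.1, 0.9],
    classify=lambda feats: "portrait pictorial",
    first_index_fn=lambda feats: 0.8,
    second_index_fn=lambda lbl, first: 4.0,
)
print(label, round(third, 2))
```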

The quality index of an image generated through the image quality calculation method according to an embodiment of the present invention may be applied to services such as image filtering and ranking.
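As an illustration of such a service, the following Python sketch filters out images below a quality threshold and ranks the rest. The threshold, file names, and scores are hypothetical; the specification does not describe how filtering or ranking is performed.

```python
def filter_and_rank(scored_images, min_index):
    """Keep images whose quality index is at least min_index, best first."""
    kept = [(name, q) for name, q in scored_images if q >= min_index]
    return sorted(kept, key=lambda item: item[1], reverse=True)

# Hypothetical (image, third quality index) pairs.
scored = [("a.jpg", 4.8), ("b.jpg", 1.3), ("c.jpg", 3.9)]
ranked = filter_and_rank(scored, min_index=3.0)
print(ranked)  # [('a.jpg', 4.8), ('c.jpg', 3.9)]
```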

FIG. 3 is a flowchart illustrating a training process of a neural network model of an image quality calculating apparatus according to an embodiment of the present invention. The electronic device may train the neural network model through supervised learning. Referring to FIG. 3, the electronic device may receive a training data set from the user terminal (S210). The training data set will include a training image, a first label (type of image) of the training image, and a fourth quality index of the training image. For example, the training data may have the form (training image, first label, fourth quality index).

The electronic device may generate a feature vector for the training image through the first neural network model, and identify a second label corresponding to the training data based on this (S220).

The electronic device may determine whether the second label is correctly identified by comparing the second label with the first label for the training image included in the training data set (S230). The electronic device will train the first neural network model by giving positive feedback to the first neural network model if the first label and the second label are the same, and negative feedback if they are not (S240).

The electronic device may calculate a final quality index (fifth quality index) for the training data using the second neural network model (S250). The electronic device may calculate a fifth quality index for the training image based on the feature vector and the second label generated by the image analysis module 200 through the training data set.

The electronic device may calculate a loss based on the fifth quality index for the training image and the fourth quality index included in the training data set (S260).

The electronic device may train the second neural network model by adjusting a parameter for selecting the second quality index for each label of the second neural network model based on the loss value (S270).

Through this process, the electronic device according to an embodiment of the present invention may more accurately calculate the third quality index for the query image.

The embodiments of the present invention disclosed in the present specification and drawings are merely provided for specific examples to easily explain the technical content of the present invention and help the understanding of the present invention, and are not intended to limit the scope of the present invention. It will be apparent to those of ordinary skill in the art to which the present invention pertains that other modifications based on the technical spirit of the present invention can be implemented in addition to the embodiments disclosed herein.

Claims

1. A method for an electronic device to calculate an image quality index, the method comprising:

step A of receiving at least one query image from a user terminal;

step B of generating a feature vector by applying a first neural network model to the query image;

step C of identifying a label of the query image based on the feature vector and calculating a first quality index; and

step D of calculating a third quality index of the query image by using the label and the first quality index.
2. The method of claim 1, wherein step C further comprises:

normalizing the feature vector by applying a softmax function; and

selecting a label having the highest probability in the normalization result as the label corresponding to the query image.
3. The method of claim 1, wherein, in step D,

the third quality index is calculated by applying a second neural network model to the label and the first quality index.
4. The method of claim 3, wherein

the third quality index is calculated by adding the first quality index and a second quality index according to the label, and

the second quality index has a different range according to the priority of a preset label, and a value within the range is selected by the second neural network model.
5. The method of claim 1, wherein

the first neural network model is based on a deep learning-based convolutional neural network and is trained by:

receiving a training data set comprising a training image, a first label of the training image, and a fourth quality index;

generating a feature vector for the training image by applying the first neural network model to the training image;

identifying a second label according to the training image based on the feature vector for the training image; and

comparing the first label and the second label, and transmitting feedback to the first neural network model according to the result.
6. The method of claim 5, wherein

the second neural network model is trained by:

calculating a fifth quality index (final quality index) for the training image based on the feature vector and the second label for the training image;

calculating a loss value based on the fourth quality index and the fifth quality index; and

adjusting a parameter for selecting a second quality index of the second neural network model based on the loss value.
7. An image quality index calculation apparatus comprising:

a query image input module for receiving at least one query image from a user terminal;

an image analysis module that generates a feature vector by applying a first neural network model to the query image, identifies a label of the query image based on the feature vector, and calculates a first quality index; and

a quality index calculation module for calculating a third quality index of the query image by using the label and the first quality index.