CN116052090A - Image quality evaluation method, model training method, device, equipment and medium - Google Patents

Image quality evaluation method, model training method, device, equipment and medium

Info

Publication number
CN116052090A
CN116052090A
Authority
CN
China
Prior art keywords
image
model
attribute
dimension
vehicle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211607912.6A
Other languages
Chinese (zh)
Inventor
张洪
肖嵘
王孝宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Intellifusion Technologies Co Ltd
Original Assignee
Shenzhen Intellifusion Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Intellifusion Technologies Co Ltd filed Critical Shenzhen Intellifusion Technologies Co Ltd
Priority to CN202211607912.6A priority Critical patent/CN116052090A/en
Publication of CN116052090A publication Critical patent/CN116052090A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/54Surveillance or monitoring of activities, e.g. for recognising suspicious objects of traffic, e.g. cars on the road, trains or boats
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/98Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30168Image quality inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30248Vehicle exterior or interior
    • G06T2207/30252Vehicle exterior; Vicinity of vehicle
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The present invention relates to the field of artificial intelligence technologies, and in particular to an image quality evaluation method, a model training method, a device, equipment, and a medium. The method comprises: preprocessing an initial image; training a classification model with the sample images and label sets obtained by preprocessing; inputting the sample images into the trained classification model to obtain a probability distribution vector for each image attribute dimension; splicing the probability distribution vectors of all image attribute dimensions; and training the sub-model corresponding to each vehicle attribute dimension in a quality assessment model with the splicing result and the evaluation score label under that vehicle attribute dimension, so as to obtain a trained quality assessment model. Training the classification model on classification tasks over multiple image attribute dimensions improves the feature extraction capability of the classification model, and taking the spliced probability distribution vectors of multiple image attribute dimensions as the input samples of the quality assessment model improves the accuracy of image quality evaluation.

Description

Image quality evaluation method, model training method, device, equipment and medium
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to an image quality evaluation method, a model training method, a device, equipment, and a medium.
Background
Motor vehicles are important targets in urban management; by identifying motor vehicle attributes (common attributes include brand, color, and type, e.g., BMW-Black-Car), the condition of vehicles in a city can be understood. The quality of a motor vehicle image is a key factor affecting the identification of vehicle attributes, and poor-quality images unsuitable for vehicle attribute identification need to be filtered out by a quality model.
However, the quality requirements differ across vehicle attributes. For example, when the side of the vehicle body faces the camera, the color and type can be identified, but the brand generally cannot; likewise, when the logo is blocked, the color and type can still be identified, but the brand generally cannot. Existing image quality evaluation methods usually output a single quality score directly, so their accuracy under multiple vehicle attribute dimensions is low. How to improve the accuracy of image quality evaluation is therefore a problem to be solved.
Disclosure of Invention
In view of the above, the embodiments of the present invention provide an image quality evaluation method, a model training method, a device, equipment and a medium, so as to solve the problem that the accuracy of image quality evaluation is low.
In a first aspect, an embodiment of the present invention provides a training method for an image quality evaluation model, where the training method includes:
preprocessing an obtained initial image containing a target vehicle to obtain a sample image and a corresponding label set thereof, wherein the label set comprises class labels corresponding to image attribute dimensions, and the image attribute dimensions comprise an angle dimension and a vehicle main body dimension;
training the classification model by using the sample image and the label set to obtain a trained classification model;
inputting the sample image into the trained classification model to obtain probability distribution vectors of each image attribute dimension, and splicing all probability distribution vectors of all image attribute dimensions to obtain an input vector of the sample image;
acquiring a quality assessment model, where the quality assessment model comprises at least one sub-model corresponding to a preset vehicle attribute dimension; and traversing each vehicle attribute dimension, acquiring an evaluation score label of the sample image under the vehicle attribute dimension, and training the sub-model corresponding to the vehicle attribute dimension by using the input vector and the evaluation score label, so as to obtain a trained quality assessment model.
In a second aspect, an embodiment of the present invention provides an image quality evaluation method, including:
acquiring an image to be evaluated, and inputting the image to be evaluated into a trained vehicle attribute recognition model to obtain a recognition result and a confidence coefficient of corresponding vehicle attribute dimensions;
inputting the images to be evaluated into a trained classification model to obtain reference probability distribution vectors of each image attribute dimension, and splicing all the reference probability distribution vectors of all the image attribute dimensions to obtain the reference vectors of the images to be evaluated;
inputting the reference vector into a sub-model corresponding to the attribute dimension of the vehicle in the trained quality evaluation model to obtain a prediction evaluation score;
multiplying the confidence by the predicted evaluation score and determining the product as the quality assessment result corresponding to the vehicle attribute dimension, where the trained classification model and the trained quality assessment model are each obtained based on the image quality assessment model training method according to any one of claims 1-6.
In a third aspect, an embodiment of the present invention provides an image quality evaluation model training apparatus, including:
The preprocessing module is used for preprocessing an obtained initial image containing a target vehicle to obtain a sample image and a corresponding label set thereof, wherein the label set comprises class labels corresponding to image attribute dimensions, and the image attribute dimensions comprise an angle dimension and a vehicle main body dimension;
the classification model training module is used for training the classification model by using the sample image and the label set to obtain a trained classification model;
the vector splicing module is used for inputting the sample image into the trained classification model to obtain probability distribution vectors of each image attribute dimension, and splicing all probability distribution vectors of all image attribute dimensions to obtain an input vector of the sample image;
the quality evaluation model training module is used for acquiring a quality evaluation model, wherein the quality evaluation model comprises at least one sub-model corresponding to a preset vehicle attribute dimension, traversing each vehicle attribute dimension, acquiring an evaluation score label of the sample image under the vehicle attribute dimension, and training the sub-model corresponding to the vehicle attribute dimension by using the input vector and the evaluation score label to obtain a trained quality evaluation model.
In a fourth aspect, an embodiment of the present invention provides a computer device, the computer device comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, the processor implementing the model training method according to the first aspect when executing the computer program.
In a fifth aspect, embodiments of the present invention provide a computer readable storage medium storing a computer program which, when executed by a processor, implements the model training method according to the first aspect.
Compared with the prior art, the embodiment of the invention has the beneficial effects that:
the method comprises: preprocessing an acquired initial image containing a target vehicle to obtain a sample image and its corresponding label set; training a classification model to obtain a trained classification model; inputting the sample image into the trained classification model to obtain a probability distribution vector for each image attribute dimension; splicing all probability distribution vectors of all image attribute dimensions to obtain the input vector of the sample image; and acquiring a quality assessment model that comprises at least one sub-model corresponding to a preset vehicle attribute dimension, traversing each vehicle attribute dimension to obtain an evaluation score label of the sample image under that dimension, and training the corresponding sub-model with the input vector and the evaluation score label to obtain a trained quality assessment model. Training the classification model on classification tasks over multiple image attribute dimensions improves the feature extraction capability of the classification model, and taking the spliced probability distribution vectors of multiple image attribute dimensions as the input samples of the quality assessment model improves the accuracy of quality assessment for each vehicle attribute.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic view of an application environment of an image quality assessment model training method according to an embodiment of the present invention;
FIG. 2 is a flowchart of an image quality assessment model training method according to an embodiment of the present invention;
fig. 3 is a flowchart of an image quality evaluation method according to a second embodiment of the present invention;
fig. 4 is a schematic structural diagram of an image quality evaluation model training device according to a third embodiment of the present invention;
fig. 5 is a schematic structural diagram of a computer device according to a fourth embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, techniques, etc., in order to provide a thorough understanding of the embodiments of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
As used in the present specification and the appended claims, the term "if" may be interpreted as "when", "upon", "in response to determining", or "in response to detecting", depending on the context. Similarly, the phrase "if it is determined" or "if [a described condition or event] is detected" may be interpreted, depending on the context, as "upon determining", "in response to determining", "upon detecting [the described condition or event]", or "in response to detecting [the described condition or event]".
Furthermore, the terms "first," "second," "third," and the like in the description of the present specification and in the appended claims, are used for distinguishing between descriptions and not necessarily for indicating or implying a relative importance.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the invention. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
The embodiment of the invention can acquire and process the related data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, sense the environment, acquire knowledge, and use knowledge to obtain optimal results.
Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.
It should be understood that the sequence numbers of the steps in the following embodiments do not mean the order of execution, and the execution order of the processes should be determined by the functions and the internal logic, and should not be construed as limiting the implementation process of the embodiments of the present invention.
In order to illustrate the technical scheme of the invention, the following description is made by specific examples.
The image quality evaluation model training method provided by the embodiment of the invention can be applied to an application environment as shown in fig. 1, where a client communicates with a server. The client includes, but is not limited to, a palmtop computer, a desktop computer, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a cloud terminal device, a personal digital assistant (PDA), and other computer devices. The server may be an independent server, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (CDN), and basic cloud computing services such as big data and artificial intelligence platforms. The server may include at least one image acquisition device, including but not limited to a camera, a video camera, and the like, through which the server acquires an initial image containing a target vehicle. The deployment scene of the image acquisition device may be a practical application scene such as a road scene or a parking scene, and the target vehicle may be a motor vehicle. The client may be used to train the image quality evaluation model; the evaluation object of the trained model is an image containing the target vehicle, and the trained model can be used to screen out, from images to be identified, those whose image quality is good enough for motor vehicle attribute identification. The embodiment can therefore be widely applied in scenes involving multiple vehicle attribute identification, such as vehicle violation identification, urban vehicle information statistics, and parking anomaly identification.
Referring to fig. 2, a flowchart of an image quality assessment model training method provided by an embodiment of the present invention is shown, where the image quality assessment model training method may be applied to a client in fig. 1, where a computer device corresponding to the client is connected to a server to obtain an initial image including a target vehicle, and the client includes an initialized classification model and an initialized quality assessment model, and trains the initialized classification model and the initialized quality assessment model to obtain a trained quality assessment model applicable to multiple vehicle attribute dimensions. As shown in fig. 2, the image quality assessment model training method may include the steps of:
step S201, preprocessing an acquired initial image including a target vehicle to obtain a sample image and a corresponding label set thereof.
The target vehicle may refer to a vehicle to which vehicle attribute identification is to be applied; in this embodiment, the target vehicle may be a motor vehicle. The initial image may refer to an original image acquired by the image acquisition device that contains the target vehicle, and the sample image may refer to a training sample for training the subsequent classification model and quality assessment model. The label set includes class labels corresponding to the image attribute dimensions, and the image attribute dimensions may include an angle dimension and a vehicle body dimension. The class label corresponding to the angle dimension may be an angle class label comprising at least one preset angle class, where an angle class may be used to represent the angle between the orientation of the target vehicle and the imaging plane of the image acquisition device. The class label corresponding to the vehicle body dimension may be a body clarity class; in this embodiment there are two such classes, a body-clear class and a body-blurred class.
Specifically, the image capturing device captures original images in the application scene where it is deployed, which may be a scene related to vehicle attribute recognition such as a road scene or a parking scene. In this embodiment, the image capturing device captures original images at a fixed frame rate, e.g., thirty frames per second. After capture, a trained image classification model may be used to perform preliminary screening on the original images to obtain initial images containing the target vehicle.
The trained image classification model can be implemented with a deep convolutional network such as ResNet or VGG and may comprise a trained encoder and a trained classifier. The trained encoder may comprise a plurality of convolutional layers; its input is a single original image and its output is the image feature vector corresponding to that original image. The trained classifier may be implemented as a fully connected layer; its input is the image feature vector and its output is the image class.
The training data set of the image classification model may include a plurality of original images collected by the image acquisition device at different time points. Because the classification task of this model is simple, the labeling cost is low: each original image is manually labeled with an image class label. An original image is input into the image classification model to obtain a predicted image class, the image classification loss is calculated from the predicted image class, the image class label, and a preset classification loss function, and the model is iteratively trained with gradient descent based on this loss until the loss converges, yielding the trained image classification model. Since the classification task in this embodiment is binary, the preset classification loss function may be the binary cross-entropy loss function.
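As an illustrative sketch only (not the patent's implementation), the screening classifier and its binary cross-entropy training step might look as follows in PyTorch, assuming a ResNet-18 backbone; the class and function names are assumptions:

```python
import torch.nn as nn
from torchvision import models

class ScreeningClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)  # encoder: multi-layer conv stack
        backbone.fc = nn.Identity()
        self.encoder = backbone
        self.classifier = nn.Linear(512, 1)       # fully connected classifier head

    def forward(self, x):
        return self.classifier(self.encoder(x)).squeeze(1)

def train_step(model, optimizer, images, labels):
    # labels: 1.0 = image contains a target vehicle, 0.0 = it does not
    criterion = nn.BCEWithLogitsLoss()            # binary cross-entropy loss
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()                               # iterate with gradient descent
    optimizer.step()
    return loss.item()
```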
In an embodiment, the original images may be initially screened with a trained target detection model instead. Compared with the trained image classification model, the trained target detection model additionally includes a trained predictor for predicting bounding-box information of the target vehicle; the model may be an R-CNN model, a YOLO model, or the like. The input of the trained predictor is also the image feature vector, but its output is the predicted bounding box of the target vehicle, which can be characterized by its upper-left and lower-right corner points. The trained target detection model can screen not only for whether the original image contains a target vehicle but also, further, by the size of the target vehicle's bounding box. In general, the pose of the image acquisition device is fixed, and when capturing a moving target vehicle (e.g., as it drives into and out of the imaging range), only original images in which the target vehicle occupies a sufficiently large region should be kept. Accordingly, an area threshold is set: among the original images whose image class indicates that a target vehicle is present, the bounding-box area of the target vehicle is compared with the preset area threshold, and the original images whose bounding-box area exceeds the threshold are retained as initial images. It should be noted that, since this is only a secondary screening, the area threshold is set small to avoid missing valid initial images; in this embodiment it may be set to 1/12 of the imaging size, as in the sketch below.
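The bounding-box area screening just described can be expressed compactly; this sketch assumes corner-point bounding boxes and the 1/12 threshold from this embodiment:

```python
def keep_as_initial_image(bbox, image_w, image_h, ratio=1/12):
    # bbox = (x1, y1, x2, y2): upper-left and lower-right corners of the
    # predicted bounding box of the target vehicle.
    x1, y1, x2, y2 = bbox
    box_area = max(0, x2 - x1) * max(0, y2 - y1)
    return box_area > ratio * image_w * image_h   # retain if above the area threshold
```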
Optionally, the image attribute dimension further includes a truncated attribute dimension;
preprocessing the acquired initial image containing the target vehicle to obtain a sample image and its corresponding label set comprises:
inputting the initial image into a trained segmentation model to obtain a segmentation area;
and carrying out truncation processing on the segmentation area according to a preset truncation proportion to obtain at least two truncated images and the truncation proportion thereof, wherein the truncated images are used as sample images, and the truncation proportion is used as a class label of a truncation attribute dimension.
The class labels corresponding to the cutoff attribute dimensions may be cutoff proportion classes, the cutoff proportion classes may include at least one cutoff proportion, the trained segmentation model may be a trained semantic segmentation model or a trained instance segmentation model, and the segmentation region may refer to a region formed by all pixels corresponding to the target vehicle in the initial image.
The truncation process may be referred to as percentage truncation, which essentially retains the pixel values of the pixels of the initial image that belong to the target gray value interval, and adjusts the pixel values of the pixels of the initial image that do not belong to the target gray value interval to a preset value. The truncated image may refer to an image obtained by performing a truncation process on a divided region in the initial image, and the truncation ratio may include a truncation upper limit ratio and a truncation lower limit ratio.
Specifically, in this embodiment the trained segmentation model is a trained instance segmentation model, which can distinguish different instances of the same object class in the initial image. For example, if the initial image contains several target vehicles, each target vehicle receives a different marking value in the segmentation result output by the trained instance segmentation model, whereas all target vehicles receive the same marking value in the result output by a trained semantic segmentation model. Clearly, the trained instance segmentation model can segment and extract a single target vehicle more accurately, improving the effect of subsequent preprocessing.
The segmentation region is truncated according to a preset upper-limit truncation ratio, which in this embodiment may be set to 2%. All pixels in the segmentation region are sorted by gray value in ascending order to obtain a gray-value sequence. The total number of pixels in the segmentation region is multiplied by the preset upper-limit ratio, the product is rounded, and the rounded result is taken as the number K1 of pixels to be truncated. The pixels corresponding to the last K1 gray values in the sequence are selected as truncated pixels, i.e., pixels that do not belong to the target gray-value interval; the target gray-value interval can then be obtained by counting the gray values of the retained pixels.
Similarly, the segmentation region is truncated according to a preset lower-limit ratio, which in this embodiment may be set to 4%. With the pixels of the segmentation region sorted by gray value in ascending order as before, the total number of pixels is multiplied by the preset lower-limit ratio, the product is rounded, and the rounded result is taken as the number K2 of pixels to be truncated; the pixels corresponding to the first K2 gray values in the sequence are selected as truncated pixels.
In this embodiment, the preset value may be set to zero, and then the truncated process is to perform the zero setting operation on the gray values of all truncated pixel points, so as to obtain a truncated image.
In an embodiment, the segmentation area can be directly truncated according to a preset upper limit proportion and a preset lower limit proportion to obtain a single truncated image, and the preset upper limit proportion and the preset lower limit proportion are used as class labels of the truncated attribute dimension.
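A minimal sketch of the single-truncated-image variant just described, assuming a NumPy gray-value image and a boolean segmentation mask (the thresholding is approximate when gray values repeat; names are illustrative):

```python
import numpy as np

def truncate_region(gray, mask, upper_ratio=0.02, lower_ratio=0.04):
    # gray: 2-D gray-value image; mask: boolean segmentation region of the vehicle.
    values = np.sort(gray[mask])                  # ascending gray-value sequence
    k1 = round(values.size * upper_ratio)         # pixels truncated at the upper end
    k2 = round(values.size * lower_ratio)         # pixels truncated at the lower end
    lo = values[k2]                               # smallest retained gray value
    hi = values[values.size - k1 - 1]             # largest retained gray value
    out = gray.copy()
    cut = mask & ((gray < lo) | (gray > hi))      # pixels outside the target interval
    out[cut] = 0                                  # preset value: zero the truncated pixels
    return out, (upper_ratio, lower_ratio)        # ratios as the truncation class label
```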
In this embodiment, the initial image is preprocessed through the truncation operation, so that in the subsequent training process of the classification model, the gray value distribution information of the target vehicle region in the initial image can be learned, the training quality of the classification model is improved, and the classification accuracy of the trained classification model under the dimension of the truncation attribute is also improved.
Optionally, the image attribute dimension further includes an occlusion attribute dimension;
preprocessing the acquired initial image containing the target vehicle to obtain a sample image and a corresponding label set thereof, wherein the step of obtaining the sample image comprises the following steps of:
carrying out shielding treatment on the initial image by adopting a preset shielding area to obtain a shielding image;
calculating a difference value between a first area of the target vehicle in the initial image and a second area of the target vehicle in the shielding image, and comparing the difference value with the first area to obtain a shielding proportion;
and taking the shielding image as a sample image, and taking the shielding proportion as a category label of shielding attribute dimension.
The class labels corresponding to the occlusion attribute dimensions may be occlusion proportion classes, the occlusion proportion classes may include at least one occlusion proportion, occlusion processing may refer to that a mask image and an initial image are multiplied to occlude part of pixel points in the initial image, and an occlusion region may refer to a region obtained by composing pixel points to be occluded in the initial image.
The first area may be a first number counted by all pixels corresponding to the target vehicle in the initial image, the second area may be a second number counted by all pixels corresponding to the target vehicle in the occlusion image, and the difference is a difference between the first number and the second number, and the occlusion ratio may be used to represent a ratio of the number of occluded pixels to the number of all pixels corresponding to the target vehicle.
Specifically, in order to reduce the calculation amount, the initial image can still be input into a trained segmentation model to obtain a segmentation region during the occlusion processing, and then the segmentation region is subjected to the occlusion processing, so that the ineffective calculation of the background pixel points is avoided.
The mask image has the same size as the initial image, i.e., every pixel in the initial image has a corresponding pixel in the mask image. Pixels to be retained have the value 1 in the mask image, and pixels to be occluded have the value 0. When the mask image and the initial image are multiplied pixel by pixel, a mask value of 1 leaves the corresponding pixel value of the initial image unchanged, i.e., that pixel is not occluded, while a mask value of 0 changes the corresponding pixel value to 0, i.e., that pixel is occluded. In the resulting occlusion image, the occluded pixels have the value 0 and the non-occluded pixels keep the same values as in the initial image.
In this embodiment, the occlusion proportion is obtained by statistical calculation: an operator can directly set a fixed occlusion region, and the occlusion proportion with respect to the target vehicle region is determined from the result of applying that occlusion region. In image processing, the area of a region can be represented by the number of pixels it contains.
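A sketch of the mask-multiplication occlusion and the statistical occlusion-proportion calculation, assuming NumPy arrays; the function name and interfaces are illustrative:

```python
import numpy as np

def occlude(image, vehicle_mask, occl_region):
    # image: H*W(*C) initial image; vehicle_mask: boolean vehicle segmentation;
    # occl_region: boolean mask of the region to block out.
    keep = (~occl_region).astype(image.dtype)          # mask image: 1 = keep, 0 = occlude
    blocked = image * (keep[..., None] if image.ndim == 3 else keep)
    first_area = int(vehicle_mask.sum())               # vehicle pixels in the initial image
    second_area = int((vehicle_mask & ~occl_region).sum())  # vehicle pixels still visible
    ratio = (first_area - second_area) / first_area    # occlusion-proportion class label
    return blocked, ratio
```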
In an embodiment, the occlusion proportion can instead be set manually; in that case the occlusion region must be adjusted in real time according to the target vehicle region in the initial image so as to meet the preset occlusion proportion. This can produce occlusion images with a better occlusion effect, but it increases the consumption of human resources.
In this embodiment, the initial image is preprocessed through the occlusion operation, so that in the subsequent training process of the classification model, the shape information of the target vehicle region in the initial image can be learned, the training quality of the classification model is improved, and the classification accuracy of the trained classification model under the occlusion attribute dimension is also improved.
Optionally, the image attribute dimension further includes a sharpness attribute dimension;
preprocessing the acquired initial image containing the target vehicle to obtain a sample image and its corresponding label set comprises:
Image compression is carried out on the initial image according to a preset compression ratio, and a compressed image is obtained;
the compressed image is used as a sample image, and the compression ratio is used as a category label of the definition attribute dimension.
The class label corresponding to the sharpness attribute dimension may be a compression ratio class, where the compression ratio class may include at least one compression ratio, and in this embodiment, image compression refers only to a compression operation on an image size, and compresses an initial image from an original size to a target size.
The compressed image may refer to an image in which the original image is compressed to a target size, and the compression ratio may be characterized by a ratio of the target size to the original size.
Specifically, the pixel values in the compressed image may be obtained by pooling during image compression; the pooling manner may be maximum pooling, mean pooling, or weighted-mean pooling, and in this embodiment image compression is performed with weighted-mean pooling. Each pixel in the compressed image corresponds to at least one pixel in the initial image. Obviously, under a preset compression ratio, one pixel in the compressed image may not correspond perfectly to a whole group of pixels in the initial image. For example, in the ideal case, the upper-left pixel of the compressed image corresponds to the group of pixels in the upper-left 3*3 region of the initial image, and its value can then be obtained directly by mean pooling.
In this embodiment, the compression is performed in a superpixel manner. Each pixel is regarded as a 1*1 cell; a pixel in the compressed image, when mapped back to the initial image, may completely cover some pixels and partially cover others. For each covered pixel, the intersection area between the coverage region and the pixel cell is computed, and the ratio of this intersection area to the cell area is used as that pixel's weight. For example, suppose the upper-left pixel of the compressed image corresponds to a 1.5*1.5 region in the upper-left of the initial image. This coverage region completely covers only the pixel at image coordinates (1, 1), so the intersection area for (1, 1) is 1 and its weight is 1; for the pixels at (1, 2) and (2, 1) the intersection area is 0.5, giving weight 0.5; and for the pixel at (2, 2) the intersection area is 0.25, giving weight 0.25. The pixel values of all covered pixels are weighted by their corresponding weights and summed, and the sum is divided by the coverage area to obtain the compressed pixel value. For instance, if the pixel values at (1, 1), (2, 1), and (1, 2) are all 1 and the value at (2, 2) is 2, the compressed pixel value is (1×1 + 1×0.5 + 1×0.5 + 2×0.25)/(1.5×1.5) = 10/9; note that the coverage area serves as the denominator in the weighted-mean calculation.
It should be noted that, in order to facilitate training of the classification model, after obtaining the compressed image, the compressed image may be expanded, that is, the compressed image may be expanded by using zero elements, so that the compressed image after expansion is consistent with the initial image, so as to ensure that the compressed image can be adapted to the input size of the classification model when the compressed image is used as a sample image.
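The weighted-mean superpixel computation described above coincides with OpenCV's area-based resampling; a hedged sketch, assuming OpenCV and NumPy, with zero-element expansion back to the original size:

```python
import cv2
import numpy as np

def compress_and_pad(image, ratio=0.5):
    # ratio = target size / original size; it serves as the sharpness class label.
    h, w = image.shape[:2]
    th, tw = max(1, int(h * ratio)), max(1, int(w * ratio))
    # INTER_AREA resampling averages source pixels weighted by covered area,
    # matching the superpixel weighted-mean computation described above.
    small = cv2.resize(image, (tw, th), interpolation=cv2.INTER_AREA)
    padded = np.zeros_like(image)        # expand with zero elements back to the
    padded[:th, :tw, ...] = small        # original size for a fixed model input
    return padded, ratio
```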
In this embodiment, the initial image is preprocessed through the image compression operation, so that in the subsequent training process of the classification model, the definition information of the initial image can be learned, the training quality of the classification model is improved, and the classification accuracy of the trained classification model under the definition attribute dimension is also improved.
Optionally, the image attribute dimension further includes a brightness attribute dimension;
preprocessing the acquired initial image to obtain a sample image and its corresponding label set comprises:
converting the initial image from an RGB color space to an HSV color space to obtain a converted image, and obtaining a sub-image of the converted image under a brightness channel to obtain a first brightness image;
performing brightness adjustment on the first brightness image according to a preset brightness adjustment proportion to obtain a second brightness image;
And taking the second brightness image as a sample image, and taking the brightness adjustment proportion as a category label of the brightness attribute dimension.
The class labels corresponding to the brightness attribute dimensions may be brightness adjustment proportion classes, the brightness adjustment proportion classes may include at least one brightness adjustment proportion, the converted image may refer to an image obtained by converting the initial image into an HSV color space, the first brightness image may refer to a brightness channel sub-image of the converted image, and the second brightness image may refer to an image obtained by brightness adjustment of the first brightness image.
Specifically, the initial images acquired by the image acquisition device belong to the RGB color space by default. To extract the brightness information of an initial image, it is converted from the RGB color space to the HSV color space according to a preset mapping relationship between RGB channel values and HSV channel values; the converted V channel is the brightness channel.
In this embodiment, the brightness adjustment ratio may be set to 0.5, i.e., each brightness value in the first brightness image is multiplied by 0.5 to obtain the second brightness image. It should be noted that, during brightness adjustment, the adjusted brightness values should remain within the range [0, 255].
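A minimal sketch of this brightness preprocessing, assuming OpenCV (whose images are BGR-ordered; the patent describes RGB input, and the conversion flag would change accordingly):

```python
import cv2
import numpy as np

def adjust_brightness(image_bgr, scale=0.5):
    # Convert to HSV, take the V (brightness) channel as the first luminance
    # image, scale it, and clip so the adjusted values stay within [0, 255].
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    first = hsv[..., 2].astype(np.float32)
    second = np.clip(first * scale, 0, 255).astype(np.uint8)
    return second, scale        # second luminance image + brightness class label
```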
In this embodiment, the first luminance image corresponding to the initial image is preprocessed through the luminance adjustment operation, so that luminance information of the initial image can be learned in a subsequent classification model training process, training quality of the classification model is improved, and classification accuracy of the trained classification model under the dimension of luminance attribute is also improved.
Through the above step of preprocessing the acquired initial image containing the target vehicle to obtain the sample image and its corresponding label set, sample images under multiple image attribute dimensions are obtained via the preprocessing modes of those dimensions, which facilitates improving the training quality of the subsequent classification model.
Step S202, training the classification model by using the sample image and the label set to obtain a trained classification model.
The classification model may include an encoder and a classifier. The encoder may be implemented with MobileNet, ResNet, ConvNeXt, ViT, or the like, and the classifier may be implemented as a fully connected layer.
Specifically, a sample image is input into the classification model, and a predicted classification result under each image attribute dimension is computed. The predicted classification result and the class label under each image attribute dimension are substituted into a preset classification loss function (the cross-entropy loss function may be used) to calculate the classification loss. Based on the classification loss, the classification model is iteratively trained by gradient descent until the loss converges, yielding the trained classification model.
In the above step of training the classification model with the sample images and label sets, the classification model is trained on classification tasks over multiple image attribute dimensions. This improves the model's feature extraction capability and provides more accurate input samples for the subsequent quality assessment model training, thereby improving the accuracy of quality assessment.
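One way to realize the multi-dimension classification model and its summed cross-entropy loss, sketched in PyTorch; the backbone choice and head layout are assumptions:

```python
import torch.nn as nn
from torchvision import models

class MultiDimClassifier(nn.Module):
    # One classification head per image attribute dimension (angle, vehicle body,
    # truncation, occlusion, sharpness, brightness); head sizes are assumptions.
    def __init__(self, n_classes_per_dim):
        super().__init__()
        backbone = models.resnet18(weights=None)   # MobileNet/ConvNeXt/ViT also fit
        feat_dim = backbone.fc.in_features         # 512 for ResNet-18
        backbone.fc = nn.Identity()
        self.encoder = backbone
        self.heads = nn.ModuleList([nn.Linear(feat_dim, n) for n in n_classes_per_dim])

    def forward(self, x):
        f = self.encoder(x)
        return [head(f) for head in self.heads]    # one logit vector per dimension

def classification_loss(logits_per_dim, labels_per_dim):
    # Sum of per-dimension cross-entropy losses; minimized by gradient descent.
    ce = nn.CrossEntropyLoss()
    return sum(ce(l, y) for l, y in zip(logits_per_dim, labels_per_dim))
```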
Step S203, inputting the sample image into the trained classification model to obtain probability distribution vectors of each image attribute dimension, and splicing all probability distribution vectors of all image attribute dimensions to obtain the input vector of the sample image.
The probability distribution vector may refer to the vector formed by the probabilities that the sample image belongs to each category of an image attribute dimension; stitching may refer to a concatenation operation; and the input vector may serve as the training sample of the subsequent quality assessment model.
Specifically, in the classification tasks over multiple image attribute dimensions, the classification categories differ across image attribute dimensions, so each attribute dimension corresponds to its own probability distribution vector. The probability distribution vector contains the prediction probabilities of the respective categories in that image attribute dimension, and the category corresponding to each element of the vector is fixed. For example, in the vehicle body dimension, the corresponding categories are the body-clear class and the body-blurred class, and the probability distribution vector can be expressed as [p1, p2], where p1 is the prediction probability of the body-clear class and p2 is the prediction probability of the body-blurred class.
Optionally, splicing all probability distribution vectors of all image attribute dimensions to obtain the input vector of the sample image includes:
splicing all probability distribution vectors of all image attribute dimensions to obtain a spliced vector;
inputting the initial image into a trained detection model to obtain a bounding box of the target vehicle in the initial image;
and splicing the spliced vector and the coordinate points corresponding to the bounding box to obtain an input vector.
The stitching vector may refer to a stitching result of probability distribution vectors of all image attribute dimensions, the trained detection model may refer to a target detection model, and bounding boxes of the target vehicle may be used to represent position information of the target vehicle in an initial image, where the bounding boxes are generally represented by upper left corner points and lower right corner points of the bounding boxes, and coordinate points are coordinate points of the upper left corner points and the lower right corner points of the bounding boxes.
In the embodiment, when the input vector is constructed, bounding box information of the target vehicle is additionally added, so that characterization information of the input vector is enriched, quality evaluation model training is conveniently carried out according to richer input vector information in the subsequent quality evaluation model training process, accuracy of quality evaluation model training is improved, and accuracy of quality evaluation is further improved.
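The construction of the input vector from the per-dimension probability distribution vectors and the bounding-box corner coordinates can be sketched as follows (interfaces assumed):

```python
import numpy as np

def build_input_vector(prob_vectors, bbox):
    # prob_vectors: softmax outputs of the trained classifier, one per image
    # attribute dimension; bbox: (x1, y1, x2, y2) from the trained detection model.
    spliced = np.concatenate([np.asarray(p, dtype=np.float32) for p in prob_vectors])
    return np.concatenate([spliced, np.asarray(bbox, dtype=np.float32)])
```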
In the above step, the sample image is input into the trained classification model to obtain the probability distribution vector of each image attribute dimension, and all probability distribution vectors of all attribute dimensions are spliced to obtain the input vector of the sample image. Taking the splicing result of the probability distribution vectors of multiple image attribute dimensions as the input sample of the quality assessment model allows the quality requirements of each vehicle attribute to be taken into account, improving the accuracy of image quality evaluation under multiple vehicle attribute dimensions.
Step S204, a quality assessment model is obtained, the quality assessment model comprises at least one sub-model corresponding to a preset vehicle attribute dimension, each vehicle attribute dimension is traversed, an assessment score label of a sample image in the vehicle attribute dimension is obtained, and the sub-model corresponding to the vehicle attribute dimension is trained by using an input vector and the assessment score label, so that a trained quality assessment model is obtained.
The quality evaluation model may include at least one sub-model corresponding to a preset vehicle attribute dimension, each sub-model may be configured as an encoder and a predictor, the encoder may be configured to extract feature information of an input vector, and the predictor may output a predicted quality evaluation result according to the feature information of the input vector.
Specifically, to adapt to the different quality requirements of different vehicle attribute dimensions, evaluation score labels must be annotated separately for each vehicle attribute dimension. During training, if the evaluation score labels are drawn from a preset fixed set, e.g., the integers from 0 to 10, manual annotation simply selects one label from that fixed set as the evaluation score label. The input vector is then fed into the quality assessment model to obtain a predicted evaluation score, and the classification loss is calculated from the predicted evaluation score, the evaluation score label, and a preset classification loss function (the cross-entropy loss function may still be used). The sub-model corresponding to the vehicle attribute dimension is iteratively trained by gradient descent until the classification loss converges, yielding the trained sub-model for that attribute dimension.
In an embodiment, if the manually annotated evaluation score label is a continuous value rather than one of a fixed set, e.g., any value in the range 0 to 10, the prediction loss is calculated from the predicted evaluation score, the evaluation score label, and a preset prediction loss function; in this case the mean-squared-error loss function may be used, and the sub-model corresponding to the attribute dimension is iteratively trained based on the prediction loss.
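A hedged sketch of one sub-model (encoder plus predictor) and its regression training step under the continuous-label variant; the hidden size and names are assumptions, and nn.CrossEntropyLoss would replace the loss for the fixed-label variant:

```python
import torch.nn as nn

class QualitySubModel(nn.Module):
    # One sub-model per preset vehicle attribute dimension (brand, color, type, ...).
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.predictor = nn.Linear(hidden, 1)   # scalar predicted evaluation score

    def forward(self, x):
        return self.predictor(self.encoder(x)).squeeze(1)

def train_submodel(model, optimizer, inputs, score_labels):
    # score_labels: continuous scores, e.g. in [0, 10] -> mean-squared-error loss.
    loss = nn.MSELoss()(model(inputs), score_labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```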
It should be noted that, the vehicle attribute dimension may refer to a dimension used for attribute recognition, for example, a brand dimension, a color dimension, a type dimension, and the like of the target vehicle, so that after training, the quality evaluation model is better generalized in a specific vehicle attribute recognition task, and in general, the vehicle attribute dimension and the image attribute dimension are different.
In this embodiment, the sample image corresponds to a probability distribution vector in an image attribute dimension, all probability distribution vectors in all image attribute dimensions are spliced to obtain an input vector of the sample image, at this time, the input vector can be simultaneously used as a training sample of each sub-model corresponding to the vehicle attribute dimension in the quality evaluation model, and the sub-model corresponding to each vehicle attribute dimension defaults to be capable of learning information which is favorable for quality evaluation in the corresponding vehicle attribute dimension from classification information of all image attribute dimensions in the training process.
In one embodiment, for each sub-model of the quality assessment model to be trained, an implementer may select, from all probability distribution vectors of all image attribute dimensions, those corresponding to the image attribute dimensions most strongly associated with the sub-model's vehicle attribute dimension, and splice only those vectors to obtain training samples specific to that sub-model. For example, for the sub-model corresponding to vehicle color, the image attribute dimensions most strongly associated with color are the brightness, occlusion, and truncation attribute dimensions; splicing the probability distribution vectors of these three dimensions yields training samples used only in training the color sub-model.
In the above step, a quality assessment model comprising at least one sub-model corresponding to a preset vehicle attribute dimension is acquired; each vehicle attribute dimension is traversed, the evaluation score label of the sample image under that dimension is obtained, and the corresponding sub-model is trained with the input vector and the evaluation score label to obtain the trained quality assessment model. Training the sub-models separately for different vehicle attribute dimensions better meets the different quality requirements of those dimensions and improves the accuracy of quality assessment.
In this embodiment, the classification model is trained with classification tasks over multiple image attribute dimensions, which improves its feature extraction capability; taking the spliced probability distribution vectors of multiple image attribute dimensions as the input samples of the quality assessment model allows the quality requirements of each vehicle attribute to be considered and improves the accuracy of image quality evaluation.
Referring to fig. 3, a flowchart of an image quality evaluation method according to a second embodiment of the present invention is shown, and the image quality evaluation method includes the following steps:
Step S301, acquiring an image to be evaluated and inputting it into a trained vehicle attribute recognition model to obtain a recognition result and a confidence for the corresponding vehicle attribute dimension;
step S302, inputting the image to be evaluated into a trained classification model to obtain a reference probability distribution vector for each image attribute dimension, and splicing all reference probability distribution vectors across all image attribute dimensions to obtain a reference vector of the image to be evaluated;
step S303, inputting the reference vector into the sub-model corresponding to the vehicle attribute dimension in the trained quality evaluation model to obtain a prediction evaluation score;
and step S304, multiplying the confidence by the prediction evaluation score, and taking the product as the quality evaluation result for the corresponding vehicle attribute dimension.
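Taken together, steps S301-S304 can be sketched as the small pipeline below. The three model objects are assumed to be plain callables with the illustrated return shapes; the actual interfaces of the trained models are not specified by this embodiment.

```python
import numpy as np

def evaluate_quality(image, recog_model, cls_model, submodels):
    """Sketch of steps S301-S304 across vehicle attribute dimensions."""
    # S301: recognition result and confidence per vehicle attribute dimension,
    # e.g. {"color": ("white", 0.92), "brand": ("brandA", 0.81)} (assumed form).
    recognition = recog_model(image)
    # S302: per-dimension reference probability vectors, spliced into one
    # reference vector (assumed to be returned as a dict of 1-D arrays).
    ref_vector = np.concatenate(list(cls_model(image).values()))
    results = {}
    for dim, (label, confidence) in recognition.items():
        # S303: prediction evaluation score from the matching sub-model.
        score = float(submodels[dim](ref_vector))
        # S304: the product of confidence and score is the quality result.
        results[dim] = {"label": label, "quality": confidence * score}
    return results
```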
Here, the trained classification model and the trained quality evaluation model are both obtained by the image quality evaluation model training method according to the first embodiment.
The trained vehicle attribute recognition model may be any existing trained multi-label image classification model, such as an MLkNN model or a Rank-SVM model. Here, multi-label refers to multiple vehicle attributes of a vehicle, such as a color attribute, a category attribute and a brand attribute. The input of the trained vehicle attribute recognition model may be the image to be evaluated, i.e. an image containing a target vehicle, and its output may be the recognition results for the multiple vehicle attributes of the target vehicle in that image.
The image to be evaluated may refer to an image whose target-vehicle content is to undergo quality evaluation. The reference probability distribution vector may refer to a vector composed of the reference probabilities of the respective categories of the image to be evaluated in a corresponding image attribute dimension; the reference vector, which serves as the input of the quality evaluation model, characterizes the classification prediction information of the image to be evaluated in each image attribute dimension. The recognition result may refer to the predicted category of the image to be evaluated in a corresponding vehicle attribute dimension, the confidence may refer to the probability that the image to be evaluated belongs to that recognition result in the corresponding vehicle attribute dimension, and the prediction evaluation score may refer to the evaluation score output by the trained quality evaluation model.
In this embodiment, the confidence is multiplied by the prediction evaluation score to determine the quality evaluation result under the corresponding vehicle attribute dimension, so both the accuracy of attribute category prediction and the quality evaluation score are taken into account, improving the accuracy of quality evaluation. This also makes it convenient for an implementer to set different result thresholds according to actual requirements and filter out images with poor quality evaluation results, for better application in vehicle attribute recognition scenarios.
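For instance, a hypothetical per-dimension threshold table and filter might look as follows; the threshold values are arbitrary placeholders an implementer would choose:

```python
# Hypothetical result thresholds per vehicle attribute dimension.
THRESHOLDS = {"color": 0.5, "category": 0.4, "brand": 0.6}

def keep_for_recognition(results: dict, dim: str) -> bool:
    """Keep an image for attribute recognition only if its quality result
    in the given dimension reaches the implementer-defined threshold."""
    return results[dim]["quality"] >= THRESHOLDS[dim]
```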
Fig. 4 shows a block diagram of an image quality evaluation model training apparatus according to a third embodiment of the present invention. The apparatus may be applied to a client whose corresponding computer device is connected to a server to obtain initial images containing target vehicles. The client includes an initialized classification model and an initialized quality evaluation model, and trains both to obtain a trained quality evaluation model applicable to multiple vehicle attribute dimensions. For convenience of explanation, only the portions relevant to this embodiment of the present invention are shown.
Referring to fig. 4, the image quality evaluation model training apparatus includes:
the preprocessing module 41 is configured to preprocess an acquired initial image including a target vehicle to obtain a sample image and a label set corresponding to the sample image, where the label set includes a category label corresponding to an image attribute dimension, and the image attribute dimension includes an angle dimension and a vehicle main body dimension;
a classification model training module 42, configured to train the classification model using the sample image and the label set to obtain a trained classification model;
the vector stitching module 43 is configured to input the sample image into the trained classification model to obtain probability distribution vectors of each image attribute dimension, and stitch all probability distribution vectors of all image attribute dimensions to obtain an input vector of the sample image;
the evaluation model training module 44 is configured to obtain a quality evaluation model, where the quality evaluation model includes at least one sub-model corresponding to a preset vehicle attribute dimension, traverse each vehicle attribute dimension, obtain an evaluation score label of the sample image in the vehicle attribute dimension, and train the sub-model corresponding to the vehicle attribute dimension by using the input vector and the evaluation score label to obtain a trained quality evaluation model.
Optionally, the image attribute dimension further includes a truncated attribute dimension;
the preprocessing module 41 includes:
the image segmentation unit is used for inputting the initial image into the trained segmentation model to obtain a segmentation area;
the first label determining unit is used for carrying out truncation processing on the segmented region according to a preset truncation proportion to obtain at least two truncated images and the truncation proportion thereof, wherein the truncated images are used as sample images, and the truncation proportion is used as a class label of a truncation attribute dimension.
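A minimal sketch of this truncation unit follows; cropping from the bottom of the segmented region and the ratio values 0.2/0.4 are assumptions, since the embodiment only requires preset truncation proportions yielding at least two truncated images.

```python
import numpy as np

def truncate_samples(region: np.ndarray, ratios=(0.2, 0.4)):
    """Crop the segmented vehicle region by each preset truncation proportion.

    Returns (truncated_image, ratio) pairs: each truncated image is a sample
    and its ratio is the class label of the truncation attribute dimension.
    """
    h = region.shape[0]
    samples = []
    for r in ratios:
        kept_rows = int(h * (1.0 - r))   # rows remaining after the cut
        samples.append((region[:kept_rows], r))
    return samples

# Placeholder segmented region (H x W x 3); real input comes from the
# trained segmentation model described above.
region = np.zeros((200, 100, 3), dtype=np.uint8)
samples = truncate_samples(region)       # two samples with labels 0.2 and 0.4
```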
Optionally, the image attribute dimension further includes an occlusion attribute dimension;
the preprocessing module 41 includes:
the image occlusion unit is used for carrying out occlusion processing on the initial image by adopting a preset occlusion region to obtain an occluded image;
the ratio calculation unit is used for calculating the difference between a first area of the target vehicle in the initial image and a second area of the target vehicle in the occluded image, and dividing the difference by the first area to obtain an occlusion proportion;
the second label determining unit is used for taking the occluded image as a sample image and taking the occlusion proportion as a category label of the occlusion attribute dimension.
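The ratio computed by the ratio calculation unit can be sketched as below, assuming boolean vehicle masks are available for both images (the embodiment does not say how the areas are measured):

```python
import numpy as np

def occlusion_ratio(initial_mask: np.ndarray, occluded_mask: np.ndarray) -> float:
    """(first area - second area) / first area, per the unit described above.

    `initial_mask` / `occluded_mask` are assumed boolean masks marking the
    target vehicle's pixels in the initial and occluded images respectively.
    """
    first_area = float(initial_mask.sum())    # vehicle area before occlusion
    second_area = float(occluded_mask.sum())  # visible vehicle area after
    return (first_area - second_area) / first_area
```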
Optionally, the image attribute dimension further includes a sharpness attribute dimension;
the preprocessing module 41 includes:
the image compression unit is used for performing image compression on the initial image according to a preset compression ratio to obtain a compressed image;
and the third label determining unit is used for taking the compressed image as a sample image and taking the compression ratio as a category label of the sharpness attribute dimension.
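One plausible realization of the image compression unit is JPEG re-encoding at preset quality levels, sketched below with OpenCV; treating JPEG quality as the "compression ratio" is an assumption, as the embodiment only requires a preset ratio.

```python
import cv2

def compress_samples(initial_image, qualities=(90, 50, 10)):
    """Generate sharpness-dimension samples by JPEG re-encoding.

    Returns (compressed_image, quality) pairs: each compressed image is a
    sample and its quality level is the class label (an assumed stand-in
    for the preset compression ratio).
    """
    samples = []
    for q in qualities:
        ok, buf = cv2.imencode(".jpg", initial_image,
                               [int(cv2.IMWRITE_JPEG_QUALITY), q])
        compressed = cv2.imdecode(buf, cv2.IMREAD_COLOR)
        samples.append((compressed, q))
    return samples
```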
Optionally, the image attribute dimension further includes a brightness attribute dimension;
the preprocessing module 41 includes:
the image conversion unit is used for converting the initial image from an RGB color space to an HSV color space to obtain a converted image, and obtaining a sub-image of the converted image under a brightness channel to obtain a first brightness image;
the brightness adjusting unit is used for adjusting the brightness of the first brightness image according to a preset brightness adjusting proportion to obtain a second brightness image;
and the fourth label determining unit is used for taking the second brightness image as a sample image and taking the brightness adjustment proportion as a class label of the brightness attribute dimension.
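A sketch of the brightness units follows; the adjustment proportions are illustrative, and following the text above, the adjusted V-channel sub-image itself (the second brightness image) is taken as the sample.

```python
import cv2
import numpy as np

def brightness_samples(initial_image, scales=(0.5, 0.8, 1.2, 1.5)):
    """RGB -> HSV, take the V channel as the first brightness image, then
    scale it by each preset brightness adjustment proportion.

    Returns (second_brightness_image, scale) pairs: the image is a sample
    and the scale is the class label of the brightness attribute dimension.
    """
    hsv = cv2.cvtColor(initial_image, cv2.COLOR_RGB2HSV)
    first = hsv[..., 2].astype(np.float32)            # first brightness image
    samples = []
    for s in scales:
        second = np.clip(first * s, 0, 255).astype(np.uint8)
        samples.append((second, s))
    return samples
```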
Optionally, the vector splicing module 43 includes:
the first vector splicing unit is used for splicing all probability distribution vectors of all image attribute dimensions to obtain a spliced vector;
the target detection unit is used for inputting the initial image into a trained detection model to obtain a bounding box of the target vehicle in the initial image;
and the second vector splicing unit is used for splicing the spliced vector with the coordinate points corresponding to the bounding box to obtain the input vector.
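The optional splicing with bounding-box coordinates might look as follows; normalizing the corner coordinates by the image size is an assumption, since the embodiment only states that the coordinate points are spliced onto the vector.

```python
import numpy as np

def splice_with_bbox(spliced_vector, bbox, image_w, image_h):
    """Append the bounding box's corner coordinates to the spliced
    probability vector to form the final input vector.

    `bbox` is assumed to be (x1, y1, x2, y2) from the trained detection
    model; coordinates are normalized here for scale invariance.
    """
    x1, y1, x2, y2 = bbox
    coords = np.array([x1 / image_w, y1 / image_h,
                       x2 / image_w, y2 / image_h], dtype=np.float32)
    return np.concatenate([np.asarray(spliced_vector, dtype=np.float32), coords])
```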
It should be noted that, because the content of information interaction and execution process between the modules and units is based on the same concept as the method embodiment of the present invention, specific functions and technical effects thereof may be referred to in the method embodiment section, and will not be described herein.
Fig. 5 is a schematic structural diagram of a computer device according to a fourth embodiment of the present invention. As shown in fig. 5, the computer device of this embodiment includes: at least one processor (only one is shown in fig. 5), a memory, and a computer program stored in the memory and executable on the at least one processor, the processor implementing the steps of any of the foregoing image quality evaluation model training method embodiments when executing the computer program.
The computer device may include, but is not limited to, a processor and a memory. It will be appreciated by those skilled in the art that fig. 5 is merely an example of a computer device and does not limit the computer device; a computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components, and may for example further include a network interface, a display screen, an input device, and the like.
The processor may be a CPU, or another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor or any conventional processor.
The memory includes a readable storage medium, an internal memory, and the like; the internal memory provides an environment for the operating system and for the execution of computer-readable instructions in the readable storage medium. The readable storage medium may be a hard disk of the computer device; in other embodiments, it may be an external storage device of the computer device, for example a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card or a flash card provided on the computer device. Further, the memory may also include both an internal storage unit and an external storage device of the computer device. The memory is used to store the operating system, application programs, a boot loader (BootLoader), data, and other programs such as the program code of the computer program, and may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above division of functional units and modules is illustrated as an example; in practical applications, the above functions may be assigned to different functional units and modules as needed, i.e. the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing them from each other and are not intended to limit the protection scope of the present invention. For the specific working process of the units and modules in the above apparatus, reference may be made to the corresponding process in the foregoing method embodiments, which is not repeated here.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the present invention may implement all or part of the flow of the methods of the above embodiments by instructing the relevant hardware through a computer program; the computer program may be stored in a computer readable storage medium and, when executed by a processor, implements the steps of the method embodiments described above. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer readable medium may include at least: any entity or device capable of carrying computer program code, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, such as a USB flash drive, a removable hard disk, a magnetic disk or an optical disk. In some jurisdictions, in accordance with legislation and patent practice, computer readable media may not include electrical carrier signals and telecommunications signals.
The present invention may also be implemented as a computer program product which, when run on a computer device, causes the computer device to execute all or part of the steps of the method embodiments described above.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts that are not detailed or illustrated in one embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus/computer device and method may be implemented in other manners. For example, the apparatus/computer device embodiments described above are merely illustrative, e.g., the division of modules or units is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, which may be in electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims (10)

1. An image quality assessment model training method, characterized in that the model training method comprises:
preprocessing an obtained initial image containing a target vehicle to obtain a sample image and a corresponding label set thereof, wherein the label set comprises class labels corresponding to image attribute dimensions, and the image attribute dimensions comprise an angle dimension and a vehicle main body dimension;
training a classification model by using the sample image and the label set to obtain a trained classification model;
inputting the sample image into the trained classification model to obtain probability distribution vectors of each image attribute dimension, and splicing all probability distribution vectors of all image attribute dimensions to obtain an input vector of the sample image;
acquiring a quality evaluation model, wherein the quality evaluation model comprises at least one sub-model corresponding to a preset vehicle attribute dimension; and traversing each vehicle attribute dimension, acquiring an evaluation score label of the sample image under the vehicle attribute dimension, and training the sub-model corresponding to the vehicle attribute dimension by using the input vector and the evaluation score label to obtain a trained quality evaluation model.
2. The model training method of claim 1, wherein the image attribute dimension further comprises a truncated attribute dimension;
the preprocessing the obtained initial image containing the target vehicle to obtain a sample image and a corresponding label set thereof comprises the following steps:
inputting the initial image into a trained segmentation model to obtain a segmentation area;
and carrying out truncation processing on the segmentation area according to a preset truncation proportion to obtain at least two truncated images and their truncation proportions, wherein the truncated images are used as the sample images, and the truncation proportions are used as class labels of the truncation attribute dimension.
3. The model training method of claim 1, wherein the image attribute dimension further comprises an occlusion attribute dimension;
the preprocessing the obtained initial image containing the target vehicle to obtain a sample image and a corresponding label set thereof comprises the following steps:
carrying out occlusion processing on the initial image by adopting a preset occlusion region to obtain an occluded image;
calculating the difference between a first area of the target vehicle in the initial image and a second area of the target vehicle in the occluded image, and dividing the difference by the first area to obtain an occlusion proportion;
and taking the occluded image as the sample image, and taking the occlusion proportion as a category label of the occlusion attribute dimension.
4. The model training method of claim 1, wherein the image attribute dimension further comprises a sharpness attribute dimension;
the preprocessing the obtained initial image containing the target vehicle to obtain a sample image and a corresponding label set thereof comprises the following steps:
carrying out image compression on the initial image according to a preset compression ratio to obtain a compressed image;
and taking the compressed image as the sample image, and taking the compression ratio as a category label of the sharpness attribute dimension.
5. The model training method of claim 1, wherein the image attribute dimension further comprises a brightness attribute dimension;
the preprocessing the obtained initial image containing the target vehicle to obtain a sample image and a corresponding label set thereof comprises the following steps:
converting the initial image from an RGB color space to an HSV color space to obtain a converted image, and obtaining a sub-image of the converted image under a brightness channel to obtain a first brightness image;
performing brightness adjustment on the first brightness image according to a preset brightness adjustment proportion to obtain a second brightness image;
and taking the second brightness image as the sample image, and taking the brightness adjustment proportion as a category label of the brightness attribute dimension.
6. The model training method of claim 1, wherein the splicing of all probability distribution vectors of all image attribute dimensions to obtain the input vector of the sample image comprises:
splicing all probability distribution vectors of all image attribute dimensions to obtain a spliced vector;
inputting the initial image into a trained detection model to obtain a bounding box of a target vehicle in the initial image;
and splicing the splicing vector and the coordinate point corresponding to the bounding box to obtain the input vector.
7. An image quality evaluation method, characterized in that the image quality evaluation method comprises:
acquiring an image to be evaluated, and inputting the image to be evaluated into a trained vehicle attribute recognition model to obtain a recognition result and a confidence for the corresponding vehicle attribute dimension;
inputting the image to be evaluated into a trained classification model to obtain a reference probability distribution vector for each image attribute dimension, and splicing all reference probability distribution vectors of all image attribute dimensions to obtain a reference vector of the image to be evaluated;
inputting the reference vector into a sub-model corresponding to the attribute dimension of the vehicle in the trained quality evaluation model to obtain a prediction evaluation score;
multiplying the confidence by the prediction evaluation score, and determining the multiplied result as the quality evaluation result corresponding to the vehicle attribute dimension, wherein the trained classification model and the trained quality evaluation model are each obtained based on the image quality assessment model training method according to any one of claims 1-6.
8. An image quality assessment model training apparatus, characterized in that the model training apparatus comprises:
the preprocessing module is used for preprocessing an obtained initial image containing a target vehicle to obtain a sample image and a corresponding label set thereof, wherein the label set comprises class labels corresponding to image attribute dimensions, and the image attribute dimensions comprise an angle dimension and a vehicle main body dimension;
the classification model training module is used for training the classification model by using the sample image and the label set to obtain a trained classification model;
the vector splicing module is used for inputting the sample image into the trained classification model to obtain probability distribution vectors of each image attribute dimension, and splicing all probability distribution vectors of all image attribute dimensions to obtain an input vector of the sample image;
the quality evaluation model training module is used for acquiring a quality evaluation model, wherein the quality evaluation model comprises at least one sub-model corresponding to a preset vehicle attribute dimension, traversing each vehicle attribute dimension, acquiring an evaluation score label of the sample image under the vehicle attribute dimension, and training the sub-model corresponding to the vehicle attribute dimension by using the input vector and the evaluation score label to obtain a trained quality evaluation model.
9. A computer device, characterized in that it comprises a processor, a memory and a computer program stored in the memory and executable on the processor, which processor implements the model training method according to any of claims 1 to 6 when executing the computer program.
10. A computer readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the model training method according to any of claims 1 to 6.


