CN111626353A

CN111626353A - Image processing method, terminal and storage medium

Info

Publication number: CN111626353A
Application number: CN202010457562.4A
Authority: CN
Inventors: 张弓
Original assignee: Oppo Chongqing Intelligent Technology Co Ltd
Current assignee: Oppo Chongqing Intelligent Technology Co Ltd
Priority date: 2020-05-26
Filing date: 2020-05-26
Publication date: 2020-09-04

Abstract

The embodiment of the invention discloses an image processing method, a terminal and a storage medium. The image processing method comprises: inputting an image to be classified into an image classification model to obtain a class label parameter corresponding to the image to be classified; determining a similarity parameter between the class label parameter and an average characteristic parameter, wherein the average characteristic parameter is a characteristic parameter determined from a plurality of sample images by using the image classification model, and the plurality of sample images are images of the same class as the image to be classified; and outputting label information corresponding to the class label parameter under the condition that the similarity parameter meets a preset similarity parameter range, so as to use the label information to perform scene recognition on the image to be classified.

Description

Image processing method, terminal and storage medium

Technical Field

The present invention relates to the field of image processing technologies, and in particular, to an image processing method, a terminal, and a storage medium.

Background

With the continuous development of science and technology, image classification technology is widely applied in character recognition, face recognition, object recognition, pedestrian detection, image retrieval and other fields.

In the prior art, when a terminal receives an image to be classified, the terminal inputs the image to be classified into an image classification model, the image classification model calculates a category label parameter of the image to be classified, determines label information of the image to be classified according to the category label parameter, and outputs the label information.

Disclosure of Invention

To solve the above technical problem, embodiments of the present invention provide an image processing method, a terminal, and a storage medium, which can improve the accuracy of the label information that the terminal outputs for an image to be classified.

The technical scheme of the invention is realized as follows:

the embodiment of the application provides an image processing method, which comprises the following steps:

inputting an image to be classified into an image classification model to obtain a class label parameter corresponding to the image to be classified;

determining a similarity parameter between the class label parameter and an average characteristic parameter, wherein the average characteristic parameter is a characteristic parameter determined from a plurality of sample images by using the image classification model, and the sample images are images with the same class as the image to be classified;

and outputting label information corresponding to the category label parameter under the condition that the similarity parameter meets a preset similarity parameter range, so as to perform scene recognition on the image to be classified by using the label information.
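The final step of the scheme above can be sketched as a simple gate. This is a minimal illustration, not the claimed implementation: the function name `gated_label`, the closed interval used for the preset range, and the choice to return `None` when the similarity falls outside the range are all assumptions made for the example.

```python
from typing import Optional, Tuple


def gated_label(similarity: float,
                similarity_range: Tuple[float, float],
                label: str) -> Optional[str]:
    """Return the label only when the similarity lies in the preset range.

    Assumption: the preset range is a closed interval (low, high), and a
    rejected prediction is signalled by returning None.
    """
    low, high = similarity_range
    if low <= similarity <= high:
        return label
    return None  # label withheld because the similarity is out of range


# Hypothetical values: a similarity of 0.93 against a preset range of [0.8, 1.0].
print(gated_label(0.93, (0.8, 1.0), "beach"))
```

With a gate of this kind, an image whose class label parameter is far from the average characteristic parameter of its predicted class produces no label at all instead of a likely wrong one.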

In the above scheme, before the image to be classified is input into the image classification model and the class label parameter corresponding to the image to be classified is obtained, the method further includes:

inputting the sample images into a first transmission layer of an original image classification model to obtain a plurality of first parameter information corresponding to the sample images, wherein one sample image corresponds to one first parameter information;

inputting the first parameter information into a first convolution layer of the original image classification model, and obtaining a plurality of sample label parameters corresponding to the first parameter information based on the weight of the first convolution layer, wherein one piece of first parameter information corresponds to one sample label parameter, and the first convolution layer is a next transmission layer executed after the first transmission layer;

determining a loss value of the original image classification model according to the plurality of sample label parameters and standard sample label parameters corresponding to the plurality of sample images;

and training the original image classification model by using the loss value to obtain the image classification model.

In the above scheme, the training the original image classification model by using the loss value to obtain the image classification model includes:

iteratively adjusting the weight of a first convolution layer in the original image classification model under the condition that the loss value does not meet the preset parameter range to obtain an iterative image classification model;

training the iterative image classification model by using the plurality of first parameter information until the image classification model is obtained;

and taking the original image classification model as the image classification model under the condition that the loss value meets a preset parameter range.

In the above scheme, after the original image classification model is trained by using the loss value to obtain the image classification model, the method further includes:

inputting the plurality of first parameter information into a first convolution layer of the image classification model, and obtaining a plurality of first sample label parameters based on the adjusted weight of the first convolution layer;

determining an average value of the plurality of first sample label parameters;

and taking the average value as the average characteristic parameter.
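The averaging step above can be sketched as follows, assuming that each first sample label parameter is a vector of equal length and that the average is taken element by element. The function name `average_feature` is hypothetical.

```python
from typing import List, Sequence


def average_feature(sample_params: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of the first sample label parameters.

    Assumption: every parameter is a vector with the same number of elements.
    """
    if not sample_params:
        raise ValueError("at least one sample label parameter is required")
    dim = len(sample_params[0])
    if any(len(p) != dim for p in sample_params):
        raise ValueError("all sample label parameters must have the same length")
    count = len(sample_params)
    return [sum(p[i] for p in sample_params) / count for i in range(dim)]


# Two hypothetical sample label parameters of the same class.
print(average_feature([[1.0, 2.0], [3.0, 4.0]]))  # [2.0, 3.0]
```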

In the above scheme, the inputting the image to be classified into an image classification model to obtain the class label parameter corresponding to the image to be classified includes:

inputting an image to be classified into an image classification model to obtain image characteristic parameters of the image to be classified;

and carrying out normalization processing on the image characteristic parameters by using the image classification model to obtain the class label parameters.

In the above scheme, the determining the similarity parameter between the category label parameter and the average feature parameter includes:

determining a first feature vector corresponding to the image feature parameter and a second feature vector corresponding to the average feature parameter;

determining the L2 norm of the first feature vector and the L2 norm of the second feature vector;

and determining the similarity parameter according to the first feature vector, the L2 norm of the first feature vector and the L2 norm of the second feature vector.
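Reading the norm of each feature vector in the steps above as its Euclidean (L2) norm, one natural way to combine the two vectors and their norms is the cosine similarity, that is, the dot product divided by the product of the two norms. The text does not name the formula, so the sketch below is an assumption, and the function names are hypothetical.

```python
import math
from typing import Sequence


def l2_norm(vector: Sequence[float]) -> float:
    """Euclidean (L2) norm of a feature vector."""
    return math.sqrt(sum(x * x for x in vector))


def cosine_similarity(first: Sequence[float], second: Sequence[float]) -> float:
    """Dot product of the two vectors divided by the product of their L2 norms."""
    if len(first) != len(second):
        raise ValueError("feature vectors must have the same length")
    norm_first, norm_second = l2_norm(first), l2_norm(second)
    if norm_first == 0.0 or norm_second == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    dot = sum(a * b for a, b in zip(first, second))
    return dot / (norm_first * norm_second)
```

Under this reading the similarity parameter lies between -1 and 1, and a value close to 1 means the feature vector of the image points in almost the same direction as the average feature vector of its class.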

The embodiment of the application provides a terminal, the terminal includes:

the input unit is used for inputting the image to be classified into the image classification model to obtain the class label parameter corresponding to the image to be classified;

a determining unit, configured to determine a similarity parameter between the category label parameter and an average feature parameter, where the average feature parameter is a feature parameter determined from a plurality of sample images by using the image classification model, and the plurality of sample images are images of the same category as the image to be classified;

and the output unit is used for outputting label information corresponding to the category label parameter under the condition that the similarity parameter meets a preset similarity parameter range so as to utilize the label information to carry out scene identification on the image to be classified.

In the above scheme, the terminal further includes a training unit;

the input unit is used for inputting the sample images into a first transmission layer of an original image classification model to obtain a plurality of first parameter information corresponding to the sample images, and one sample image corresponds to one first parameter information; inputting the first parameter information into a first convolution layer of the original image classification model, and obtaining a plurality of sample label parameters corresponding to the first parameter information based on the weight of the first convolution layer, wherein one piece of first parameter information corresponds to one sample label parameter, and the first convolution layer is a next transmission layer executed after the first transmission layer;

the determining unit is used for determining a loss value of the original image classification model according to the plurality of sample label parameters and standard sample label parameters corresponding to the plurality of sample images;

and the training unit is used for training the original image classification model by using the loss value to obtain the image classification model.

An embodiment of the present application further provides a terminal, where the terminal includes:

a memory, a processor and a communication bus, wherein the memory communicates with the processor through the communication bus, the memory stores an image classification program executable by the processor, and the processor performs the image processing method when the image classification program is executed.

The embodiment of the application provides a storage medium applied to a terminal, on which a computer program is stored, wherein the image processing method is implemented when the computer program is executed by a processor.

The embodiment of the invention provides an image processing method, a terminal and a storage medium, wherein the image processing method comprises the following steps: inputting the image to be classified into an image classification model to obtain a class label parameter corresponding to the image to be classified; determining a similarity parameter between the class label parameter and an average characteristic parameter, wherein the average characteristic parameter is a characteristic parameter determined from a plurality of sample images by using the image classification model, and the plurality of sample images are images of the same class as the image to be classified; and outputting label information corresponding to the category label parameter under the condition that the similarity parameter meets the preset similarity parameter range, so as to use the label information to perform scene recognition on the image to be classified. By adopting the method, when the terminal obtains the category label parameter of the image to be classified by using the image classification model, the terminal first determines the similarity between the category label parameter and the average characteristic parameter, and outputs the label information corresponding to the category label parameter only when the similarity meets the preset similarity parameter range, rather than outputting the label information directly as soon as the category label parameter is obtained from the image classification model. This improves the accuracy of the classification label that the terminal outputs for the image to be classified.

Drawings

FIG. 1 is a schematic network structure diagram of an exemplary deep residual network provided in an embodiment of the present application;

FIG. 2 is a schematic structural diagram of a training phase and an inference phase of an exemplary CNN image classification model according to an embodiment of the present application;

FIG. 3 is a flow chart of a model training portion of an exemplary image classification model provided in an embodiment of the present application;

FIG. 4 is a schematic structural diagram of an inference stage of an exemplary CNN image classification model according to an embodiment of the present application;

FIG. 5 is a flowchart of an image processing method according to an embodiment of the present application;

FIG. 6 is a first schematic structural diagram illustrating an inference stage of an exemplary image classification model according to an embodiment of the present application;

FIG. 7 is a schematic structural diagram of an inference phase and a training phase of an exemplary image classification model provided in an embodiment of the present application;

FIG. 8 is a flowchart illustrating an exemplary process of training an original image classification model by a terminal according to an embodiment of the present disclosure;

FIG. 9 is a flowchart of an exemplary image processing method provided in an embodiment of the present application;

FIG. 10 is a flowchart illustrating an exemplary process of training an original image classification model by a terminal according to an embodiment of the present application;

FIG. 11 is a first schematic structural diagram of a terminal according to an embodiment of the present disclosure;

FIG. 12 is a schematic structural diagram of a terminal according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.

Image classification is an image processing method that distinguishes different types of images according to the features reflected in the image information. Traditional image classification methods can be divided into methods based on texture, shape, spatial relationship and the like. In recent years, deep learning has become a popular direction in computer vision, image classification is a basic task in computer vision, and image classification techniques based on the Convolutional Neural Network (CNN) have been applied to actual scenes. For image classification, deep learning uses a convolutional neural network model to produce an end-to-end output: the input is a picture and the output is the label of the picture, so that complex steps such as the feature extraction of traditional algorithms are omitted.

The CNN image classification model generally includes: a convolution layer (Convolution), a pooling layer (Pooling), a BN layer (Batch Normalization), an activation layer (ReLU), a Softmax layer and a loss layer (Loss).

The network structure of the classical deep residual network is shown in fig. 1:

the network structure of the deep residual network comprises six parts: an input stem, stage 1, stage 2, stage 3, stage 4 and an output part. The input of the input stem is data with 224 points on the horizontal axis, 224 points on the vertical axis and 3 channels; the input stem specifically comprises a convolution layer and a pooling layer; and the output of the input stem is data with 56 points on the horizontal axis, 56 points on the vertical axis and 64 channels. The layers in stage 1, stage 2, stage 3 and stage 4 are arranged in the same way, and the stages differ only in the data form of the image being processed; each of stage 1, stage 2, stage 3 and stage 4 comprises a convolution layer, a normalization layer and an activation layer.
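The reduction from 224 points to 56 points per axis in the input part can be checked with the usual output-size formula for convolution and pooling layers. The text only states that this part contains a convolution layer and a pooling layer; the kernel sizes, strides and paddings below are the values commonly used in residual networks and are assumptions here, as is the helper name `conv_output_size`.

```python
def conv_output_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output length along one axis: floor((size + 2*padding - kernel) / stride) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


# Assumed settings: 7x7 convolution with stride 2 and padding 3,
# followed by 3x3 pooling with stride 2 and padding 1.
after_conv = conv_output_size(224, kernel=7, stride=2, padding=3)
after_pool = conv_output_size(after_conv, kernel=3, stride=2, padding=1)
print(after_conv, after_pool)  # 112 56
```

The channel count is independent of this formula: it changes from 3 to 64 because the convolution layer is assumed to have 64 filters.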

The image classification model of CNN can be divided into a training phase and an inference phase:

Firstly, in the training stage, the image to be classified passes through the image classification model to obtain the label information corresponding to the class label parameter of the image to be classified; this process is called forward propagation, and the loss of the current image classification model is obtained from the ground truth (GT) and the label information. Starting from the label information, the calculated loss is then used with a gradient descent algorithm to calculate the loss of each layer in the image classification model and to update the weight of each layer; this process is called back propagation. As shown in fig. 2, when an image to be classified is input into the image classification model, it passes through the convolution layer, the normalization layer, the activation layer, the pooling layer, …, the first convolution layer and the classification network layer of the image classification model, and the label information corresponding to the image to be classified is obtained directly; this is the forward propagation process of the image classification model. The loss layer of the image classification model calculates the loss of the image classification model according to the category information and the GT; starting from the classification network layer, the loss of each layer in the image classification model is calculated from this loss through the gradient descent algorithm and the weight of each layer is updated. This is the back propagation process of the image classification model, which yields the trained image classification model.

Specifically, as shown in fig. 3, a trained image classification model can be obtained by continuously performing forward and backward propagation training using the pictures (a plurality of sample images) of a data set. The current image classification model training process is as follows:

1. The server selects the original image classification model.

2. The server determines a plurality of sample images.

3. The server inputs the plurality of sample images into the original image classification model, and determines whether the original image classification model converges.

4. When the server determines that the original image classification model does not converge, the server determines the loss of the original image classification model by using the plurality of sample images.

5. The server updates the weight of each layer in the original image classification model by using the loss of the original image classification model to obtain an updated image classification model.

6. When the server determines that the updated image classification model converges, the server takes the updated image classification model as the image classification model.

7. When the server determines that the original image classification model converges, the server takes the original image classification model as the image classification model.
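The control flow of the steps above can be illustrated on a deliberately tiny model. The sketch below fits a single weight by gradient descent and is not the CNN training itself: the one-weight model, the mean squared error loss, the convergence tolerance, the learning rate and the `max_steps` safeguard are all assumptions chosen to keep the example short.

```python
from typing import List, Tuple


def train_until_converged(samples: List[Tuple[float, float]],
                          weight: float = 0.0,
                          learning_rate: float = 0.1,
                          tolerance: float = 1e-6,
                          max_steps: int = 10_000) -> float:
    """Fit y = weight * x by repeating forward and back propagation."""
    count = len(samples)
    for _ in range(max_steps):
        # Forward propagation: compare predictions with the ground truth.
        loss = sum((weight * x - y) ** 2 for x, y in samples) / count
        if loss < tolerance:
            break  # converged: keep the current model (steps 6 and 7)
        # Back propagation: gradient of the loss with respect to the weight.
        gradient = sum(2.0 * (weight * x - y) * x for x, y in samples) / count
        weight -= learning_rate * gradient  # weight update (step 5)
    return weight


# Hypothetical samples generated by y = 2 * x.
print(train_until_converged([(1.0, 2.0), (2.0, 4.0)]))
```

If the starting weight already gives a loss below the tolerance, the loop exits on its first check, which corresponds to step 7.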

Secondly, in the inference stage, a picture to be classified is input and passes through the image classification model to obtain the label information corresponding to the picture, that is, forward propagation is performed once, as shown in fig. 4. When an image to be classified is input into the image classification model, it passes through the convolution layer, the normalization layer, the activation layer, the pooling layer, …, the first convolution layer and the classification network layer of the image classification model, and the label information corresponding to the image to be classified is obtained directly; this is the forward propagation process of the image classification model.

An image classification model deployed on a mobile phone often encounters more problems than expected. At present, the training of a CNN image classification model is completed on the server side: training requires a high hardware configuration, whereas the hardware configuration of a mobile phone is lower, so training the image classification model on the mobile phone is difficult. Mainstream mobile phones currently optimize only the inference process of the image classification model and not its training process. When a user needs to add new image data or new categories of images, the mobile phone uploads them to the server over the network, the server trains the image classification model with them, and the trained image classification model is then transmitted back to the terminal. This way of training the image classification model is inefficient and not user friendly.

Mobile phone application scenes are open-ended, so when a classification model is used for scene recognition, a user may use the image classification model to recognize scenes outside the training set, which places high demands on the robustness and generalization capability of the scene recognition classification network. The current common approach is to add corresponding pictures when misidentification or missed identification occurs, thereby optimizing the image classification model. Such an approach is inefficient and has a significant lag, since corrections often have to wait for user feedback, and it may ultimately cause users to use this function less often.

The problems in the prior art can be solved by the methods in the first embodiment and the second embodiment.

Example one

An embodiment of the present application provides an image processing method. Fig. 5 is a flowchart of the image processing method provided in the embodiment of the present application. As shown in fig. 5, the image processing method may include:

S101, inputting the image to be classified into an image classification model to obtain a class label parameter corresponding to the image to be classified.

The image processing method provided by the embodiment of the application is suitable for a scene in which the terminal performs scene recognition on the image to be classified by using the image classification model.

In this embodiment of the application, the image to be classified may be an image that is received by the terminal and for which a category label needs to be determined. Specifically, the image to be classified may be a face image, a landscape image, an animal image, a plant image, a gesture image, and the like, which may be determined according to the actual situation and is not limited in this embodiment of the application.

In this embodiment of the present application, the category label parameter is a label parameter describing a feature of the image to be classified. Specifically, the category label parameter may be a pixel value parameter of the image to be classified, a contour feature parameter of the image to be classified, a texture feature parameter of the image to be classified, and the like, which may be determined according to the actual situation and is not limited in this embodiment of the present application.

In the embodiment of the application, before the terminal inputs the image to be classified into the image classification model and the category label parameter corresponding to the image to be classified is obtained, the terminal inputs a plurality of sample images into a first transmission layer of an original image classification model to obtain a plurality of first parameter information corresponding to the plurality of sample images; after the terminal obtains a plurality of first parameter information corresponding to a plurality of sample images, the terminal inputs the plurality of first parameter information into a first convolution layer of the original image classification model, and obtains a plurality of sample label parameters corresponding to the plurality of first parameter information based on the weight of the first convolution layer; after the terminal obtains a plurality of sample label parameters corresponding to a plurality of pieces of first parameter information, the terminal determines a loss value of the original image classification model according to the plurality of sample label parameters and standard sample label parameters corresponding to a plurality of sample images; after the terminal determines the loss value of the original image classification model, the terminal trains the original image classification model by using the loss value to obtain the image classification model.

It should be noted that one sample image corresponds to one first parameter information, and one first parameter information corresponds to one sample label parameter.

Note that the first convolution layer is the next transmission layer executed after the first transmission layer.

In this embodiment, the first transmission layer may include a convolution layer, a normalization layer, an activation layer, a pooling layer, and the like, which may be determined according to the actual situation and is not limited in this embodiment.

It should be noted that the convolution layer may be a vector convolution operation layer; the activation layer may be a rectified linear unit (ReLU) activation layer, or may be a hyperbolic tangent (Tanh) activation layer, which may be determined according to the actual situation and is not limited in this embodiment of the present application.

In the embodiment of the present application, the plurality of sample images are images of the same category as the image to be classified.

In the embodiment of the application, the terminal may determine the loss value of the original image classification model according to the plurality of sample label parameters and the standard sample label parameters corresponding to the plurality of sample images in any of the following ways. The terminal may obtain a plurality of difference values between the plurality of sample label parameters and the corresponding standard sample label parameters, and take the average value of the plurality of difference values as the loss value of the original image classification model. The terminal may also obtain a plurality of quotients between the plurality of sample label parameters and the corresponding standard sample label parameters, and take the average value of the plurality of quotients as the loss value of the original image classification model. The terminal may further obtain a plurality of cosine values between the parameter vectors corresponding to the plurality of sample label parameters and the parameter vectors corresponding to the corresponding standard sample label parameters, and take the average value of the plurality of cosine values as the loss value of the original image classification model. The way used may be determined according to the actual situation, which is not limited in this embodiment of the present application.
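The three ways of forming the loss value share one shape: compute one number per sample, then average. The sketch below makes that explicit. It assumes scalar parameters for the difference and quotient variants and vector parameters for the cosine variant; taking the absolute value of each difference is also an assumption, and all function names are hypothetical.

```python
import math
from typing import Any, Callable, Sequence


def mean_over_samples(per_sample: Callable[[Any, Any], float],
                      predicted: Sequence[Any],
                      standard: Sequence[Any]) -> float:
    """Average a per-sample comparison over all sample/standard pairs."""
    if not predicted or len(predicted) != len(standard):
        raise ValueError("need one standard parameter per sample parameter")
    values = [per_sample(p, s) for p, s in zip(predicted, standard)]
    return sum(values) / len(values)


def difference(p: float, s: float) -> float:
    return abs(p - s)  # assumption: absolute difference, so errors do not cancel


def quotient(p: float, s: float) -> float:
    return p / s


def cosine(p: Sequence[float], s: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(p, s))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_s = math.sqrt(sum(b * b for b in s))
    return dot / (norm_p * norm_s)


# Hypothetical scalar parameters compared by difference.
print(mean_over_samples(difference, [0.9, 0.6], [1.0, 1.0]))
```

Note that the variants point in different directions: a smaller average difference means closer agreement, whereas a larger average cosine does, so the preset parameter range would have to be chosen to match the variant in use.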

In the embodiment of the present application, the process of training the original image classification model by using the loss value by the terminal to obtain the image classification model includes: under the condition that the terminal determines that the loss value does not meet the preset parameter range, the terminal iteratively adjusts the weight of a first convolution layer in the original image classification model to obtain an iterative image classification model; after the terminal obtains the iterative image classification model, the terminal trains the iterative image classification model by using a plurality of pieces of first parameter information until the image classification model is obtained; and under the condition that the terminal determines that the loss value meets the preset parameter range, the terminal takes the original image classification model as an image classification model.

Note that the first convolution layer is the last convolution layer in the original image classification model.

In this embodiment of the application, while the terminal iteratively adjusts the weight of the first convolution layer in the original image classification model to obtain an iterative image classification model and trains the iterative image classification model by using the plurality of pieces of first parameter information, the terminal may also count the number of times the original image classification model has been trained; when this count reaches a preset count threshold, the terminal stops training the original image classification model, thereby obtaining the image classification model.

In the embodiment of the application, the terminal may therefore end the training in either of two ways. The terminal may count the number of times the original image classification model has been trained and stop training when the count reaches the preset count threshold, so as to obtain the image classification model. The terminal may also determine an iteration loss value corresponding to the iterative image classification model and, when the iteration loss value meets the preset parameter range, take the iterative image classification model as the image classification model. The way used may be determined according to the actual situation, which is not limited in the embodiment of the application.

It can be understood that, when the terminal trains the original image classification model, only the weight of the first convolution layer in the original image classification model is adjusted and the weights of the other convolution layers are not changed, which reduces the amount of calculation needed to train the original image classification model. The terminal can therefore train the original image classification model directly by using the plurality of sample images, without transmitting the plurality of sample images to the server for the server to train the original image classification model, thereby reducing the amount of data transmitted over the network, reducing the use of network resources, and reducing the maintenance cost of the server.
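A minimal sketch of this kind of training follows, assuming that the output of the frozen earlier layers (the first parameter information) is computed once and reused, and that the trainable last layer can be modelled as a plain linear map. The mean squared error loss, the learning rate, the reading of the preset parameter range as an upper limit on the loss, and all names are assumptions for illustration.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def last_layer_forward(weights: Matrix, features: Sequence[float]) -> List[float]:
    """Apply the trainable last layer to one cached feature vector."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def mean_squared_loss(weights: Matrix,
                      first_params: Sequence[Sequence[float]],
                      targets: Sequence[Sequence[float]]) -> float:
    total, count = 0.0, 0
    for features, target in zip(first_params, targets):
        output = last_layer_forward(weights, features)
        total += sum((o - t) ** 2 for o, t in zip(output, target))
        count += len(target)
    return total / count


def train_last_layer(weights: Matrix,
                     first_params: Sequence[Sequence[float]],
                     targets: Sequence[Sequence[float]],
                     loss_limit: float = 1e-4,
                     max_iterations: int = 1000,
                     learning_rate: float = 0.1) -> Tuple[Matrix, int]:
    """Adjust only the last-layer weights; earlier layers are never touched."""
    weights = [row[:] for row in weights]  # leave the caller's weights intact
    scale = learning_rate / (len(first_params) * len(targets[0]))
    for iteration in range(max_iterations):
        # Stop when the loss falls inside the preset range.
        if mean_squared_loss(weights, first_params, targets) <= loss_limit:
            return weights, iteration
        grads = [[0.0] * len(row) for row in weights]
        for features, target in zip(first_params, targets):
            output = last_layer_forward(weights, features)
            for i, (o, t) in enumerate(zip(output, target)):
                for j, f in enumerate(features):
                    grads[i][j] += 2.0 * (o - t) * f
        for i, row in enumerate(weights):
            for j in range(len(row)):
                row[j] -= scale * grads[i][j]
    # Stop when the iteration count reaches the preset threshold.
    return weights, max_iterations
```

Because the cached first parameter information never changes during training, each iteration costs one pass through a single layer rather than through the whole network, which is the saving described above.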

In this embodiment of the present application, after the terminal trains the original image classification model by using the loss value to obtain the image classification model, the terminal inputs the plurality of first parameter information into the first convolution layer of the image classification model, and obtains a plurality of first sample label parameters based on the adjusted weight of the first convolution layer; after the terminal obtains the plurality of first sample label parameters, the terminal determines the average value of the plurality of first sample label parameters; after the terminal determines the average value of the plurality of first sample label parameters, the terminal takes the average value as the average characteristic parameter.

It should be noted that one first sample label parameter corresponds to one sample image.

In this embodiment of the present application, the terminal may obtain the average characteristic parameter according to the average value of the plurality of first sample label parameters, or according to weights corresponding to the plurality of first sample label parameters, which may be determined according to the actual situation and is not limited in this embodiment of the present application.

In this embodiment of the application, the terminal may use the average value of the plurality of first sample label parameters as the average characteristic parameter; the terminal may also perform normalization processing on the plurality of first sample label parameters, and then use the average value of the normalized first sample label parameters as the average characteristic parameter, which may be determined according to the actual situation and is not limited in this embodiment of the present application.

In this embodiment of the application, the terminal may perform the normalization processing on the plurality of first sample label parameters by dividing each parameter value in the plurality of first sample label parameters by a preset adjustment value, by multiplying each parameter value by the preset adjustment value, by adding the preset adjustment value to each parameter value, or by subtracting the preset adjustment value from each parameter value, which may be determined according to the actual situation and is not limited in this embodiment of the application.

It should be noted that the preset adjustment value may be an adjustment value configured in the terminal, or may also be a parameter value carried in the instruction received by the terminal, which may be specifically determined according to an actual situation, and this is not limited in this embodiment of the present application.
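The normalization and averaging described above can be illustrated with a minimal sketch. The function names `normalize` and `average_feature` are hypothetical, and the sketch assumes that each first sample label parameter is a fixed-length list of numbers and that the average is taken element by element; the embodiment does not fix either detail.

```python
from typing import List, Sequence


def normalize(params: Sequence[float], adjustment: float, mode: str = "divide") -> List[float]:
    """Apply one of the four adjustments described above to every parameter value.

    `mode` is a hypothetical switch; the preset adjustment value is `adjustment`.
    """
    operations = {
        "divide": lambda v: v / adjustment,
        "multiply": lambda v: v * adjustment,
        "add": lambda v: v + adjustment,
        "subtract": lambda v: v - adjustment,
    }
    if mode not in operations:
        raise ValueError(f"unknown normalization mode: {mode}")
    return [operations[mode](v) for v in params]


def average_feature(sample_params: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean over the per-sample parameter vectors (assumed interpretation)."""
    if not sample_params:
        raise ValueError("at least one sample is required")
    dim = len(sample_params[0])
    if any(len(p) != dim for p in sample_params):
        raise ValueError("all samples must have the same number of parameter values")
    count = len(sample_params)
    return [sum(p[i] for p in sample_params) / count for i in range(dim)]


# Two sample vectors, averaged directly and after dividing by a preset value of 2.
samples = [[2.0, 4.0], [4.0, 8.0]]
plain_average = average_feature(samples)
normalized_average = average_feature([normalize(s, 2.0) for s in samples])
```

Either result could serve as the average characteristic parameter, depending on which of the two options the terminal is configured to use.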

In this embodiment of the present application, the process of inputting, by the terminal, an image to be classified into the image classification model to obtain a category label parameter corresponding to the image to be classified specifically includes: the terminal inputs the image to be classified into an image classification model to obtain image characteristic parameters of the image to be classified; after the terminal obtains the image characteristic parameters of the image to be classified, the terminal performs normalization processing on the image characteristic parameters by using the image classification model to obtain class label parameters.

In the embodiment of the application, when the terminal inputs the image to be classified into the image classification model, the first transmission layer of the image classification model determines the image characteristic parameters to be classified corresponding to the image to be classified, and after the first convolution layer in the image classification model processes the image characteristic parameters to be classified, the terminal obtains the image characteristic parameters of the image to be classified.

In the embodiment of the application, the image classification model further comprises a classification network layer. The terminal may perform normalization processing on the image characteristic parameters by using the image classification model to obtain the category label parameters in any of the following ways: the terminal may use the classification network layer to divide each parameter value in the image characteristic parameters by a preset adjustment value, to multiply each parameter value by the preset adjustment value, to add the preset adjustment value to each parameter value, or to subtract the preset adjustment value from each parameter value; this may be determined according to the actual situation and is not limited in the embodiment of the present application.

S102, determining similar parameters between class label parameters and average characteristic parameters, wherein the average characteristic parameters are characteristic parameters determined from a plurality of sample images by using an image classification model, and the plurality of sample images are images with the same class as the images to be classified.

In the embodiment of the application, after the terminal obtains the class label parameter corresponding to the image to be classified by using the image classification model, the terminal can determine the similar parameter between the class label parameter and the average characteristic parameter.

In the embodiment of the application, to determine the similarity parameter between the category label parameter and the average characteristic parameter, the terminal may first determine a first feature vector corresponding to the image characteristic parameter and a second feature vector corresponding to the average characteristic parameter; the terminal then determines the 2-norm of the first feature vector and the 2-norm of the second feature vector; finally, the terminal determines the similarity parameter according to the first feature vector, the second feature vector, the 2-norm of the first feature vector and the 2-norm of the second feature vector.

In this embodiment of the application, the image feature parameters include a plurality of image feature parameter values, and the terminal may convert the plurality of image feature parameter values into a vector form, so as to obtain the first feature vector.

In this embodiment, the average feature parameter includes a plurality of average feature parameter values, and the terminal may convert the plurality of average feature parameter values into a vector form, so as to obtain a second feature vector.

In this embodiment of the present application, both the first feature vector and the second feature vector may be one-dimensional vectors or multidimensional vectors, which may be determined according to the actual situation and is not limited in this embodiment of the present application.

In this embodiment, the terminal may obtain a first product between the first feature vector and the second feature vector and a second product between the second norm of the first feature vector and the second norm of the second feature vector, respectively, and then use a quotient between the first product and the second product as the similarity parameter.

For example, the first feature vector may be $F_{cur}$, the second feature vector may be $F_{avg}$, the 2-norm of the first feature vector may be $\|F_{cur}\|$, the 2-norm of the second feature vector may be $\|F_{avg}\|$, and the similarity parameter may be $\cos\langle F_{cur}, F_{avg}\rangle$. The terminal may then determine the similarity parameter through formula (1):

$$\cos\langle F_{cur}, F_{avg}\rangle = \frac{F_{cur} \cdot F_{avg}}{\|F_{cur}\| \, \|F_{avg}\|} \qquad (1)$$

That is, the terminal determines the first product between the first feature vector $F_{cur}$ and the second feature vector $F_{avg}$, and the second product between the 2-norm $\|F_{cur}\|$ of the first feature vector and the 2-norm $\|F_{avg}\|$ of the second feature vector, and obtains the similarity parameter as the quotient of the first product and the second product.
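As an illustration, the quotient described for formula (1) could be computed as in the following sketch. It assumes that the first product is the inner (dot) product of the two feature vectors and that both vectors are plain lists of equal length; the function name `cosine_similarity` is hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(f_cur: Sequence[float], f_avg: Sequence[float]) -> float:
    """Quotient of the dot product and the product of the two 2-norms."""
    if len(f_cur) != len(f_avg):
        raise ValueError("feature vectors must have the same length")
    # First product: inner product of the two feature vectors (assumed meaning).
    first_product = sum(a * b for a, b in zip(f_cur, f_avg))
    # Second product: 2-norm of the first vector times 2-norm of the second.
    norm_cur = math.sqrt(sum(a * a for a in f_cur))
    norm_avg = math.sqrt(sum(b * b for b in f_avg))
    if norm_cur == 0.0 or norm_avg == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return first_product / (norm_cur * norm_avg)


same_direction = cosine_similarity([3.0, 4.0], [6.0, 8.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Vectors pointing in the same direction give a value close to 1, and orthogonal vectors give 0, so a larger value indicates that the image to be classified is closer to the average of its class.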

S103, outputting label information corresponding to the category label parameters under the condition that the similarity parameters meet the preset similarity parameter range, so as to perform scene recognition on the image to be classified by using the label information.

In the embodiment of the application, after the terminal determines the similar parameter between the category label parameter and the average characteristic parameter, the terminal judges the similar parameter, and the terminal outputs the label information corresponding to the category label parameter when determining that the similar parameter meets the preset similar parameter range.

It should be noted that the preset similarity parameter range may be a similarity parameter range configured in the terminal, or may also be a similarity parameter range carried in the instruction received by the terminal, which may be specifically determined according to an actual situation, and this is not limited in this embodiment of the present application.

For example, the preset similarity parameter range may be 0.8 to 1. When the similarity parameter between the image feature parameter determined by the terminal and the average feature parameter is 0.9, the similarity parameter satisfies the preset similarity parameter range; when it is 0.6, the similarity parameter does not satisfy the preset similarity parameter range.

In the embodiment of the present application, the similarity parameter satisfying the preset similarity parameter range may mean that the similarity parameter is greater than the lower bound of the preset similarity parameter range, or that it is greater than or equal to that lower bound; this may be determined according to the actual situation and is not limited in the embodiment of the present application.
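The range check can be sketched as follows. The function name `satisfies_range`, the `inclusive` switch and the default bounds are assumptions for illustration; the defaults simply reuse the 0.8 to 1 example given above.

```python
def satisfies_range(
    similarity: float,
    lower: float = 0.8,
    upper: float = 1.0,
    inclusive: bool = True,
) -> bool:
    """Return True when the similarity parameter falls in the preset range.

    `inclusive` selects between "greater than or equal to" and "greater than"
    for the lower bound, covering both variants mentioned in the text.
    """
    if similarity > upper:
        return False
    if inclusive:
        return similarity >= lower
    return similarity > lower


accepted = satisfies_range(0.9)   # the 0.9 example from the text
rejected = satisfies_range(0.6)   # the 0.6 example from the text
```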

For example, in fig. 6, when the terminal receives a plurality of sample images, the terminal inputs the plurality of sample images into the original image classification model and trains the original image classification model to obtain the image classification model. Specifically, when the terminal inputs the plurality of sample images into the original image classification model, a plurality of first parameter information of the plurality of sample images is obtained as the sample images pass through the first transmission layer of the original image classification model; a plurality of sample image feature parameters is obtained after the plurality of first parameter information passes through the first convolution layer of the original image classification model; and a plurality of sample label parameters is obtained after the plurality of sample image feature parameters passes through the classification network layer. The terminal then uses the loss layer in the original image classification model to calculate, from the plurality of sample label parameters and the standard sample label parameters (GT) corresponding to the plurality of sample images, the loss value of the original image classification model. This is the forward propagation process of the original image classification model. After the terminal obtains the loss value of the original image classification model, the terminal determines whether the loss value meets the preset parameter range; when the loss value does not meet the preset parameter range, the terminal adjusts the weight of the first convolution layer in the original image classification model to obtain an iterative image classification model. This is the back propagation process of the original image classification model. The terminal iteratively adjusts the weight of the first convolution layer in this way and trains the iterative image classification model by using the plurality of first parameter information until the image classification model is obtained.

The first transmission layer includes a convolution layer, a normalization layer, an activation layer, a pooling layer …, which may be determined according to the actual situation and is not limited in this embodiment.

The transmission layers after the first convolution layer form the classification network layer, and the transmission layers before the first convolution layer form the first transmission layer.

Illustratively, when the terminal trains an original image classification model by using a plurality of sample images to obtain an image classification model, the terminal can determine an average feature parameter by using the image classification model. Specifically, as shown in fig. 7: the terminal can input a plurality of sample images into the image classification model, and after the convolution layer, the normalization layer, the activation layer, the pooling layer … and the first convolution layer of the image classification model, the terminal obtains a plurality of sample image characteristic parameters, and the terminal obtains the average characteristic parameters of the plurality of sample images by calculating the average value of the plurality of sample image characteristic parameters.

In this embodiment of the present application, the terminal may further obtain an average characteristic parameter by averaging the characteristic parameters of the plurality of sample images corresponding to the iteration loss value when the iteration loss value meets the preset parameter range, which may be specifically determined according to an actual situation, and this is not limited in this embodiment of the present application.

Illustratively, when the terminal receives the image to be classified, the terminal classifies the image to be classified by using the image classification flow shown in fig. 8. Specifically, the terminal inputs the image to be classified into the image classification model; after the image to be classified passes through the convolution layer, the normalization layer, the activation layer, the pooling layer … and the first convolution layer of the image classification model, the image classification model obtains the image characteristic parameters of the image to be classified; when the terminal inputs the image characteristic parameters into the classification network layer of the image classification model, the terminal obtains the class label parameters of the image to be classified; the terminal then calculates the cosine value between the image characteristic parameters and the average characteristic parameters and, when it determines that the cosine value meets the preset similarity parameter range, outputs the label information corresponding to the class label parameters.

Exemplarily, fig. 9 is a flowchart when the terminal receives an image to be classified and classifies the image to be classified:

11. The terminal receives an image to be classified.

12. The terminal determines the category label parameter corresponding to the image to be classified.

13. The terminal determines a similarity parameter between the category label parameter and the average characteristic parameter.

In the embodiment of the application, after the terminal determines the category label parameter corresponding to the image to be classified, the terminal obtains the average characteristic parameter corresponding to the image to be classified according to the category label parameter, and determines the similarity parameter between the category label parameter and the average characteristic parameter.

14. The terminal judges whether the similarity parameter meets the preset similarity parameter range.

15. When the terminal determines that the similarity parameter meets the preset similarity parameter range, the terminal outputs the category label information.

In the embodiment of the application, when the terminal determines that the similarity parameter meets the preset similarity parameter range, the terminal outputs the category label information of the image to be classified.

16. When the terminal determines that the similarity parameter does not meet the preset similarity parameter range, the terminal outputs prompt information.

In the embodiment of the application, when the terminal determines that the similar parameters do not meet the preset similar parameter range, the terminal outputs the prompt information.

It should be noted that the prompt information may be information that the category label information corresponding to the image to be classified cannot be determined.
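Steps 11 to 16 can be sketched as a single decision function. Everything named here is hypothetical: `model` stands in for the image classification model and is assumed to return a label together with the parameter vector used for comparison, `average_features` is assumed to map each label to its average characteristic parameter, and `similarity` is any function that returns the similarity parameter for two vectors.

```python
from typing import Callable, Dict, Sequence, Tuple

Vector = Sequence[float]

# Hypothetical prompt text for the case where no label can be output.
PROMPT = "category label information cannot be determined"


def classify_with_check(
    image: object,
    model: Callable[[object], Tuple[str, Vector]],
    average_features: Dict[str, Vector],
    similarity: Callable[[Vector, Vector], float],
    lower_bound: float = 0.8,
) -> str:
    # Steps 11-12: receive the image and obtain its label and parameter vector.
    label, params = model(image)
    # Step 13: look up the average characteristic parameter for that label
    # and compute the similarity parameter.
    average = average_features.get(label)
    if average is None:
        return PROMPT
    score = similarity(params, average)
    # Steps 14-16: output the label only when the similarity is in range.
    if score >= lower_bound:
        return label
    return PROMPT


# Toy stand-ins: the "image" is already a unit vector, and the dot product
# of unit vectors equals their cosine.
toy_model = lambda img: ("beach", img)
toy_similarity = lambda a, b: sum(x * y for x, y in zip(a, b))
averages = {"beach": [1.0, 0.0]}

close_result = classify_with_check([1.0, 0.0], toy_model, averages, toy_similarity)
far_result = classify_with_check([0.0, 1.0], toy_model, averages, toy_similarity)
```

Passing the model and the similarity function as arguments keeps the sketch independent of any particular network or similarity measure.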

For example, fig. 10 is a flowchart of a method by which the terminal trains the original image classification model:

21. The terminal receives a plurality of sample images.

22. The terminal inputs the sample images to a first transmission layer of the original image classification model to obtain a plurality of first parameter information corresponding to the sample images.

23. The terminal inputs the first parameter information into a first convolution layer of the original image classification model, and obtains a plurality of sample label parameters corresponding to the first parameter information based on the weight of the first convolution layer.

24. The terminal determines the loss value of the original image classification model according to the plurality of sample label parameters and the standard sample label parameters corresponding to the plurality of sample images.

25. The terminal judges whether the loss value meets a preset parameter range.

26. When the terminal determines that the loss value does not meet the preset parameter range, the terminal iteratively adjusts the weight of the first convolution layer in the original image classification model to obtain an iterative image classification model, and trains the iterative image classification model by using the plurality of first parameter information until the image classification model is obtained.

27. When the terminal determines that the loss value meets the preset parameter range, the terminal takes the original image classification model as the image classification model.
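Steps 21 to 27 can be illustrated with a toy training loop in which only one layer is updated. This is a sketch under strong simplifying assumptions, none of which come from the embodiment: the first transmission layer is replaced by a fixed function, the first convolution layer by a single linear weight vector, the loss by mean squared error, and the weight adjustment by plain gradient descent. All names are hypothetical.

```python
from typing import List, Sequence, Tuple


def frozen_transmission_layer(image: Sequence[float]) -> List[float]:
    """Stand-in for the first transmission layer; it is never updated."""
    return [2.0 * x for x in image]


def forward(weights: Sequence[float], info: Sequence[float]) -> float:
    """Stand-in for the trainable first convolution layer (a linear map)."""
    return sum(w * x for w, x in zip(weights, info))


def train(
    samples: Sequence[Sequence[float]],
    targets: Sequence[float],
    weights: Sequence[float],
    loss_limit: float = 1e-4,
    learning_rate: float = 0.05,
    max_iters: int = 1000,
) -> Tuple[List[float], float]:
    # Step 22: the first parameter information is computed once, because the
    # layers that produce it stay fixed during training.
    infos = [frozen_transmission_layer(s) for s in samples]
    weights = list(weights)
    loss = float("inf")
    for _ in range(max_iters):
        # Step 23: sample label parameters from the trainable layer.
        predictions = [forward(weights, info) for info in infos]
        # Step 24: loss against the standard sample label parameters.
        errors = [p - t for p, t in zip(predictions, targets)]
        loss = sum(e * e for e in errors) / len(errors)
        # Steps 25 and 27: stop once the loss is within the preset range.
        if loss <= loss_limit:
            break
        # Step 26: adjust only the weights of the trainable layer.
        for j in range(len(weights)):
            gradient = 2.0 * sum(e * info[j] for e, info in zip(errors, infos)) / len(errors)
            weights[j] -= learning_rate * gradient
    return weights, loss


trained_weights, final_loss = train(
    samples=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    targets=[2.0, 4.0, 6.0],
    weights=[0.0, 0.0],
)
```

Because the fixed layers are evaluated only once, each iteration touches just the small trainable layer, which reflects the reduced calculation that makes training on the terminal practical.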

It can be understood that, after the terminal obtains the category label parameter of the image to be classified by using the image classification model, the terminal first determines the similarity between the category label parameter and the average characteristic parameter, and outputs the label information corresponding to the category label parameter only when the similarity satisfies the preset similarity parameter range, instead of outputting the label information directly as soon as the category label parameter is obtained from the image classification model, thereby improving the accuracy of the classification label that the terminal outputs for the image to be classified.

Example two

Based on the same inventive concept of the embodiments, the embodiments of the present application provide a terminal 1, corresponding to an image processing method; fig. 11 is a schematic structural diagram of a terminal according to an embodiment of the present application, where the terminal 1 may include:

the input unit 11 is configured to input an image to be classified into an image classification model, so as to obtain a class label parameter corresponding to the image to be classified;

a determining unit 12, configured to determine a similarity parameter between the class label parameter and an average feature parameter, where the average feature parameter is a feature parameter determined from a plurality of sample images by using the image classification model, and the plurality of sample images are images of the same class as the image to be classified;

and the output unit 13 is configured to output label information corresponding to the category label parameter when the similarity parameter meets a preset similarity parameter range, so as to perform scene identification on the image to be classified by using the label information.

In some embodiments of the present application, the terminal further comprises a training unit;

the input unit 11 is configured to input the plurality of sample images into a first transmission layer of an original image classification model, to obtain a plurality of first parameter information corresponding to the plurality of sample images, where one sample image corresponds to one first parameter information; inputting the first parameter information into a first convolution layer of the original image classification model, and obtaining a plurality of sample label parameters corresponding to the first parameter information based on the weight of the first convolution layer, wherein one piece of first parameter information corresponds to one sample label parameter, and the first convolution layer is a next transmission layer executed after the first transmission layer;

the determining unit 12 is configured to determine a loss value of the original image classification model according to the plurality of sample label parameters and standard sample label parameters corresponding to the plurality of sample images;

In some embodiments of the present application, the terminal further comprises an adjustment unit;

the adjusting unit is used for iteratively adjusting the weight of the first convolution layer in the original image classification model under the condition that the loss value does not meet the preset parameter range to obtain an iterative image classification model;

the training unit is used for training the iterative image classification model by using the plurality of pieces of first parameter information until the image classification model is obtained;

the determining unit 12 is configured to use the original image classification model as the image classification model when the loss value satisfies a preset parameter range.

In some embodiments of the present application, the determining unit 12 is configured to input the plurality of first parameter information into a first convolution layer of the image classification model, and obtain a plurality of first sample label parameters based on the adjusted weights of the first convolution layer; determining an average of the plurality of first sample tag parameters; and taking the average value as the average characteristic parameter.

In some embodiments of the present application, the terminal further comprises a processing unit;

the input unit 11 is configured to input an image to be classified into an image classification model, so as to obtain image characteristic parameters of the image to be classified;

and the processing unit is used for carrying out normalization processing on the image characteristic parameters by using the image classification model to obtain the class label parameters.

In some embodiments of the present application, the determining unit 12 is configured to determine a first feature vector corresponding to the image feature parameter and a second feature vector corresponding to the average feature parameter; determine the 2-norm of the first feature vector and the 2-norm of the second feature vector; and determine the similarity parameter according to the first feature vector, the second feature vector, the 2-norm of the first feature vector and the 2-norm of the second feature vector.

In practical applications, the input unit 11, the determining unit 12 and the output unit 13 may be implemented by a processor 14 on the terminal 1, specifically by a Central Processing Unit (CPU), a Microprocessor Unit (MPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), or the like; the above data storage may be implemented by the memory 15 on the terminal 1.

An embodiment of the present invention further provides a terminal 1, and as shown in fig. 12, the terminal 1 includes: a processor 14, a memory 15 and a communication bus 16, the memory 15 communicating with the processor 14 via the communication bus 16, the memory 15 storing a program executable by the processor 14, the program, when executed, performing the image processing method as described above via the processor 14.

In practical applications, the memory 15 may be a volatile memory, such as a Random-Access Memory (RAM); or a non-volatile memory, such as a Read-Only Memory (ROM), a flash memory, a Hard Disk Drive (HDD) or a Solid-State Drive (SSD); or a combination of the above types of memory, and provides instructions and data to the processor 14.

An embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, which when executed by the processor 14 implements the image processing method as described above.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention.

Claims

1. An image processing method, characterized in that the method comprises:

2. The method according to claim 1, wherein before the image to be classified is input into the image classification model and the class label parameter corresponding to the image to be classified is obtained, the method further comprises:

3. The method of claim 2, wherein the training the original image classification model using the loss value to obtain the image classification model comprises:

4. The method of claim 2, wherein after the training the original image classification model with the loss value to obtain the image classification model, the method further comprises:

determining an average of the plurality of first sample tag parameters;

and taking the average value as the average characteristic parameter.

5. The method according to claim 1, wherein the inputting the image to be classified into an image classification model to obtain the class label parameter corresponding to the image to be classified comprises:

6. The method of claim 5, wherein determining the similarity parameter between the category label parameter and the average feature parameter comprises:

7. A terminal, characterized in that the terminal comprises:

8. The terminal of claim 7, wherein the terminal further comprises a training unit;

9. A terminal, characterized in that the terminal comprises:

a memory, a processor, and a communication bus, the memory in communication with the processor through the communication bus, the memory storing an image processing program executable by the processor, the image processing program when executed causing the processor to perform the method of any of claims 1 to 6.

10. A storage medium having stored thereon a computer program for application to a terminal, characterized in that the computer program, when being executed by a processor, is adapted to carry out the method of any one of claims 1 to 6.