WO2021244425A1 - Model training method and apparatus, image recognition method and apparatus, electronic device, and storage medium - Google Patents

Model training method and apparatus, image recognition method and apparatus, electronic device, and storage medium

Info

Publication number
WO2021244425A1
WO2021244425A1 (Application No. PCT/CN2021/096763)
Authority
WO
WIPO (PCT)
Prior art keywords
loss
image
classification
image recognition
neural network
Prior art date
Application number
PCT/CN2021/096763
Other languages
French (fr)
Chinese (zh)
Inventor
黄颖
邱尚锋
张文伟
Original Assignee
广州虎牙科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广州虎牙科技有限公司
Publication of WO2021244425A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/172 Classification, e.g. identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F 18/24133 Distances to prototypes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent

Definitions

  • This application relates to the field of image recognition technology, and in particular to a model training method and apparatus, an image recognition method and apparatus, an electronic device, and a storage medium.
  • The inventor's research found that, in related technologies, the process of training an image recognition model imposes insufficient constraints and restrictions on the information on which the training is based, so the trained image recognition model extracts relatively little information when performing image recognition, which leads to the problem of low recognition accuracy of the image recognition model.
  • In view of this, the purpose of this application is to provide a model training method and apparatus, an image recognition method and apparatus, an electronic device, and a storage medium, so as to alleviate the problem of low recognition accuracy of image recognition models trained in related technologies.
  • An image recognition model training method, including:
  • classifying image features through a preset image classification layer to obtain a classification result, where the image classification layer belongs to a preset neural network model, and the image features are obtained by processing a sample image with the feature extraction layer of the neural network model;
  • performing reconstruction processing on the image features through a preset image reconstruction layer to obtain a reconstructed image, where the image reconstruction layer belongs to the neural network model;
  • performing loss determination processing on the reconstructed image through a loss determination layer in the neural network model to obtain a reconstruction loss, and performing loss determination processing on the classification result to obtain a classification loss;
  • updating the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model, where the image recognition model is configured to recognize a target image.
  • Optionally, the step of updating the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model includes: performing summation processing based on the reconstruction loss and the classification loss to obtain a total loss value; and updating the neural network model based on the total loss value and a preset back propagation algorithm to obtain the image recognition model.
  • Optionally, the step of performing summation processing based on the reconstruction loss and the classification loss to obtain a total loss value includes: obtaining pre-configured weight coefficients; and calculating a weighted sum of the reconstruction loss and the classification loss based on the weight coefficients, the weighted sum being used as the total loss value.
  • Optionally, the step of updating the neural network model based on the total loss value and a preset back propagation algorithm to obtain an image recognition model includes: step a, updating the neural network model based on the obtained total loss value and the preset back propagation algorithm to obtain a new neural network model, where the new neural network model is configured to process the sample images again to obtain a new total loss value; and step b, determining whether the new total loss value is less than a preset loss value, using the neural network model obtained from the last update as the image recognition model when the new total loss value is less than the preset loss value, and executing step a again when the new total loss value is not less than the preset loss value.
  • Optionally, the step of performing loss determination processing on the reconstructed image to obtain a reconstruction loss includes: for each reconstructed image, determining the pixel loss between the reconstructed image and the corresponding sample image through the loss determination layer, where there are multiple reconstructed images and multiple sample images; and performing a first loss calculation process on the determined multiple pixel losses through the loss determination layer to obtain the reconstruction loss.
  • Optionally, the step of performing loss determination processing on the classification result to obtain a classification loss includes: obtaining multiple preset classification labels through the loss determination layer, where the classification labels are generated by performing identification processing on the multiple sample images; and performing a second loss calculation process on the classification result and the classification labels through the loss determination layer to obtain the classification loss.
  • On this basis, an embodiment of the present application also provides an image recognition method, including: obtaining a target image and inputting the target image into a preset image recognition model, where the image recognition model is trained based on the above image recognition model training method; and performing recognition processing on the target image through the image recognition model to obtain a recognition result.
  • the embodiment of the present application also provides an image recognition model training device, including:
  • the feature classification module is configured to classify image features through a preset image classification layer to obtain a classification result, where the image classification layer belongs to a preset neural network model, and the image features are obtained by processing a sample image with the feature extraction layer of the neural network model;
  • the feature reconstruction module is configured to perform reconstruction processing on the image features through a preset image reconstruction layer to obtain a reconstructed image, where the image reconstruction layer belongs to the neural network model;
  • a loss determination module configured to perform loss determination processing on the reconstructed image through a loss determination layer in the neural network model to obtain a reconstruction loss, and perform loss determination processing on the classification result to obtain a classification loss;
  • the model update module is configured to update the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model, wherein the image recognition model is configured to recognize a target image.
  • the model update module is further configured to: perform summation processing based on the reconstruction loss and the classification loss to obtain a total loss value; and update the neural network model based on the total loss value and a preset back propagation algorithm to obtain an image recognition model.
  • the model update module is further configured to: obtain pre-configured weight coefficients; and calculate a weighted sum of the reconstruction loss and the classification loss based on the weight coefficients, the weighted sum being used as the total loss value.
  • the model update module is further configured to: step a, update the neural network model based on the obtained total loss value and the preset back propagation algorithm to obtain a new neural network model, where the new neural network model is configured to process the sample images again to obtain a new total loss value; and step b, determine whether the new total loss value is less than a preset loss value, use the neural network model obtained from the last update as the image recognition model when the new total loss value is less than the preset loss value, and execute step a again when the new total loss value is not less than the preset loss value.
  • the loss determination module is further configured to: for each reconstructed image, determine the pixel loss between the reconstructed image and the corresponding sample image through the loss determination layer, where there are multiple reconstructed images and multiple sample images; and perform a first loss calculation process on the determined multiple pixel losses through the loss determination layer to obtain the reconstruction loss.
  • the loss determination module is further configured to: obtain multiple preset classification labels through the loss determination layer, where the classification labels are generated by performing identification processing on the multiple sample images; and perform a second loss calculation process on the classification result and the classification labels through the loss determination layer to obtain the classification loss.
  • an embodiment of the present application also provides an image recognition device, including:
  • the image input module is configured to obtain a target image and input the target image to a preset image recognition model, wherein the image recognition model is trained based on the above-mentioned image recognition model training device;
  • the image recognition module is configured to perform recognition processing on the target image through the image recognition model to obtain a recognition result.
  • an embodiment of the present application also provides an electronic device, including:
  • a memory, configured to store a computer program;
  • a processor connected to the memory, configured to execute the computer program stored in the memory to implement the above-mentioned image recognition model training method or the above-mentioned image recognition method.
  • the embodiments of the present application also provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed, the foregoing image recognition model training method or the foregoing image recognition method is implemented.
  • FIG. 1 is a structural block diagram of an electronic device provided by an embodiment of the application.
  • FIG. 2 is a schematic flowchart of an image recognition model training method provided by an embodiment of the application.
  • FIG. 3 is a schematic diagram of the effect between the sample image and the corresponding reconstructed image provided by the embodiment of the application.
  • FIG. 4 is a schematic flowchart of each sub-step included in step S130 in FIG. 2.
  • FIG. 5 is a schematic diagram of the effect of pixel information of a sample image provided by an embodiment of the application.
  • FIG. 6 is a schematic diagram of the effect of reconstructed image pixel information provided by an embodiment of the application.
  • FIG. 7 is a schematic flowchart of other sub-steps included in step S130 in FIG. 2.
  • FIG. 8 is a schematic flowchart of each sub-step included in step S140 in FIG. 2.
  • FIG. 9 is a schematic flowchart of each sub-step included in step S142 in FIG. 8.
  • FIG. 10 is a schematic flowchart of an image recognition method provided by an embodiment of this application.
  • FIG. 11 is a block diagram of the functional modules included in the image recognition model training apparatus provided by an embodiment of the application.
  • FIG. 12 is a schematic block diagram of functional modules included in the image recognition device provided by an embodiment of the application.
  • Reference numerals: 10-electronic device; 12-memory; 14-processor; 100-image recognition model training device; 110-feature classification module; 120-feature reconstruction module; 130-loss determination module; 140-model update module; 200-image recognition device; 210-image input module; 220-image recognition module.
  • an embodiment of the present application provides an electronic device 10.
  • the electronic device 10 may include a memory 12 and a processor 14.
  • the memory 12 and the processor 14 are directly or indirectly electrically connected to realize data transmission or interaction.
  • the memory 12 may store at least one software function module that may exist in the form of software or firmware.
  • The processor 14 may be configured to execute an executable computer program stored in the memory 12, such as the aforementioned software function module, so as to implement the image recognition model training method provided by the embodiment of the present application (described later) to obtain the image recognition model, or to implement the image recognition method provided by the embodiment of the present application (described later) to obtain the recognition result of a target image.
  • The memory 12 may be, but is not limited to, a random access memory (RAM), a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), etc.
  • The processor 14 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), a system on chip (SoC), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the electronic device 10 may be a computer or server with data processing capabilities.
  • FIG. 1 is only for illustration; the electronic device 10 may also include more or fewer components than those shown in FIG. 1, or have a configuration different from that shown in FIG. 1. For example, it may also include a communication unit for information interaction with other devices, such as interaction with databases to obtain sample images, or interaction with terminal devices to obtain target images. This is not limited here.
  • an embodiment of the present application also provides an image recognition model training method that can be applied to the above-mentioned electronic device 10.
  • the method steps defined in the process related to the image recognition model training method can be implemented by the electronic device 10, that is, the image recognition model training method provided in the embodiments of the present application can be executed by the electronic device.
  • the main steps S110 to S140 shown in FIG. 2 will be described in detail below.
  • Step S110 Perform classification processing on the image features through a preset image classification layer to obtain a classification result.
  • the electronic device 10 may first perform classification processing on the image features through the image classification layer of the preset neural network model to obtain the classification result.
  • the image feature may be obtained by processing a sample image based on the feature extraction layer of the neural network model.
  • Step S120 Perform reconstruction processing on the image feature through a preset image reconstruction layer to obtain a reconstructed image.
  • the electronic device 10 may also perform reconstruction processing on the image features through the image reconstruction layer of the neural network model to obtain a reconstructed image.
  • Step S130 Perform loss determination processing on the reconstructed image through the loss determination layer in the neural network model to obtain a reconstruction loss, and perform loss determination processing on the classification result to obtain a classification loss.
  • In this embodiment of the application, on the one hand, the electronic device 10 may perform loss determination processing on the classification result through the loss determination layer in the neural network model to obtain the corresponding classification loss; on the other hand, the electronic device 10 may also perform loss determination processing on the reconstructed image through the loss determination layer to obtain the corresponding reconstruction loss.
  • Step S140 Update the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model.
  • The aforementioned neural network model includes a feature extraction layer, an image classification layer, an image reconstruction layer, and a loss determination layer. The feature extraction layer is connected to the image classification layer and to the image reconstruction layer, and the image classification layer and the image reconstruction layer are each connected to the loss determination layer. That is: the input of the feature extraction layer is the sample image, and its output is the image features; the input of both the image classification layer and the image reconstruction layer is the image features, the output of the image classification layer is the classification result, and the output of the image reconstruction layer is the reconstructed image; the input of the loss determination layer is the classification result and the reconstructed image, and its output is the reconstruction loss and the classification loss.
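  • The application does not tie this layout to any particular framework. As an illustration only, the sketch below shows one way the four layers could be wired up, assuming PyTorch, 112x112 RGB inputs, and arbitrary layer sizes (all of which are choices made here, not values from the application):

```python
# Minimal sketch of the described model layout, assuming PyTorch.
# Encoder = feature extraction layer, FC classifier = image classification layer,
# decoder = image reconstruction layer; both heads consume the same image features.
import torch
import torch.nn as nn

class RecognitionModel(nn.Module):
    def __init__(self, feat_dim: int = 128, num_classes: int = 1000):
        super().__init__()
        # Feature extraction layer (encoder): sample image -> image features
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 28 * 28, feat_dim),          # assumes 112x112 inputs
        )
        # Image classification layer: image features -> classification result
        self.classifier = nn.Linear(feat_dim, num_classes)
        # Image reconstruction layer (decoder): image features -> reconstructed image
        self.decoder = nn.Sequential(
            nn.Linear(feat_dim, 64 * 28 * 28), nn.ReLU(),
            nn.Unflatten(1, (64, 28, 28)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, images: torch.Tensor):
        features = self.encoder(images)          # image features
        logits = self.classifier(features)       # classification result
        reconstruction = self.decoder(features)  # reconstructed image
        return logits, reconstruction
```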
  • The electronic device 10 may then update the neural network model based on the reconstruction loss and the classification loss (that is, train the neural network model); specifically, the model parameters of the neural network model are updated, the finally trained neural network model is the image recognition model, and the image recognition model is configured to recognize a target image.
  • In this way, the reconstruction loss and the classification loss are both fully considered, which strengthens the constraints and restrictions on the information on which the neural network model is trained, so that the trained image recognition model can extract more image feature information, thereby improving the accuracy of image recognition and better alleviating the problem of low recognition accuracy in related image recognition technologies.
  • In addition, the image recognition model obtained by training can extract more distinct feature information, which makes the recognition result more accurate.
  • In step S110, the specific method for classifying the image features is not limited and can be selected according to actual application requirements.
  • For example, the image classification layer may be a fully connected (FC) layer, and the image classification is realized by means of the classification capability of the fully connected layer; that is, multiple image features can be processed through the fully connected layer to obtain the classification result.
  • In addition, the output of the fully connected layer may also be connected to a normalization function, so that when the value output by the normalization function is used as the classification result, it has a probabilistic meaning, namely the probability that each image feature belongs to each of the different categories.
  • The aforementioned normalization function may be a normalized exponential function (such as a softmax function), which compresses the value of each dimension of any input k-dimensional vector z into the interval (0, 1) such that the compressed values of the k dimensions sum to 1, so that the value of each dimension of the input vector z has a probabilistic meaning.
  • the output of the fully connected layer is a k-dimensional vector, that is, there are k image features, and each image feature corresponds to a one-dimensional vector.
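  • Written out, the normalized exponential function mentioned above maps the k-dimensional fully connected output z to values that can be read as probabilities:

```latex
% softmax of the k-dimensional fully connected output z
\sigma(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{k} e^{z_j}}, \qquad i = 1, \dots, k,
\qquad \text{so that } \sigma(z)_i \in (0, 1) \text{ and } \sum_{i=1}^{k} \sigma(z)_i = 1.
```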
  • In step S120, the specific method of performing reconstruction processing on the image features is not limited and can also be selected according to actual application requirements.
  • For example, the feature extraction layer configured to process the sample image may be an encoder in the neural network model, and correspondingly, the image reconstruction layer may be a decoder in the neural network model.
  • In this way, the electronic device 10 may perform feature extraction processing on the sample image based on the encoder to obtain the corresponding image features (a feature vector), and then perform reconstruction processing on the image features based on the decoder to obtain the corresponding reconstructed image.
  • As shown in FIG. 3, the first row shows the facial images of 9 persons (i.e. 9 sample images).
  • Feature extraction processing on the 9 sample images yields 9 image features, and reconstruction processing on these 9 image features yields the 9 face images in the second row (i.e. 9 reconstructed images).
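  • Continuing the illustrative sketch above (the batch size of 9 mirrors FIG. 3, while the resolution and feature dimension remain assumptions), a batch of sample face images would pass through the encoder and decoder as follows:

```python
import torch

# RecognitionModel is the illustrative class sketched earlier in this description.
model = RecognitionModel(feat_dim=128, num_classes=9)
sample_images = torch.rand(9, 3, 112, 112)   # stand-in for the 9 face images in FIG. 3
logits, reconstructed = model(sample_images) # 9 classification results + 9 reconstructed images
print(logits.shape)          # torch.Size([9, 9])
print(reconstructed.shape)   # torch.Size([9, 3, 112, 112])
```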
  • In step S130, the specific method for determining the reconstruction loss is not limited and can be selected according to actual application requirements.
  • the reconstructed image may be subjected to feature extraction processing again through the feature extraction layer in the neural network model to obtain new image features. Then, the new image feature is compared with the image feature obtained by performing feature extraction on the sample image, and the corresponding reconstruction loss is obtained.
  • step S130 may include step S131 and step S132, and the specific content is as follows.
  • Step S131 For each reconstructed image, determine the pixel loss between the reconstructed image and the corresponding sample image through the loss determination layer.
  • There may be multiple sample images, and correspondingly, there may also be multiple reconstructed images.
  • the loss determination layer in the neural network model can be used to determine the pixel loss between the reconstructed image and the corresponding sample image.
  • Step S132 Perform a first loss calculation process on the determined multiple pixel losses through the loss determination layer to obtain a reconstruction loss.
  • the first loss calculation process may be performed on the determined plurality of pixel losses through the loss determination layer again to obtain the total reconstruction loss.
  • For example, as shown in FIG. 5, a sample image may include 9 pixels, namely pixel A1, pixel A2, pixel A3, pixel A4, pixel A5, pixel A6, pixel A7, pixel A8, and pixel A9.
  • As shown in FIG. 6, the reconstructed image obtained from the sample image shown in FIG. 5 after feature extraction processing and reconstruction processing also includes 9 corresponding pixels, namely pixel B1, pixel B2, pixel B3, pixel B4, pixel B5, pixel B6, pixel B7, pixel B8, and pixel B9.
  • In this way, the pixel difference (i.e. the pixel loss) between pixel A1 and pixel B1, the pixel difference between pixel A2 and pixel B2, the pixel difference between pixel A3 and pixel B3, and so on up to the pixel difference between pixel A9 and pixel B9, can be calculated separately.
  • the reconstruction loss is determined based on the pixel loss of each pixel, so that when the neural network model is trained based on the reconstruction loss, it can be guaranteed that the constraint information or the supervision information is at the pixel level.
  • the constraints and restrictions on the information on which the neural network model is trained are strengthened. In this way, the image recognition model obtained by training can have a higher feature information extraction capability, thereby achieving high-precision image recognition.
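  • The application does not fix the form of the first loss calculation process; one minimal sketch, assuming a mean squared error over all corresponding pixel pairs (such as A1/B1 through A9/B9 above), is:

```python
import torch

def reconstruction_loss(reconstructed: torch.Tensor, samples: torch.Tensor) -> torch.Tensor:
    """Pixel-level reconstruction loss: per-pixel losses (e.g. between A1 and B1, A2 and B2, ...)
    averaged over all pixels of all image pairs."""
    return ((reconstructed - samples) ** 2).mean()
```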
  • For step S130, it should also be noted that the specific method for determining the classification loss is not limited and can be selected according to actual application requirements.
  • For example, for each sample image, the corresponding prediction result may include the probability values of that sample image being predicted as each of all the sample images.
  • the obtained classification result may include k probability values.
  • For example, the largest of the k probability values may be mapped to a probability of 1, and the other k-1 probability values mapped to a probability of 0; then, based on the k probability values included in the classification result and the k values thus determined (consisting of one 1 and k-1 0s), the classification loss incurred when classifying a sample image is calculated.
  • step S130 may include step S133 and step S134, and the specific content is as follows.
  • Step S133 Obtain multiple preset classification labels through the loss determination layer.
  • multiple preset classification labels may be obtained through the loss determination layer in the neural network model.
  • The classification labels are generated by performing identification processing on the multiple sample images; that is, there is a one-to-one correspondence between classification labels and sample images. For example, sample images with different human faces have different classification labels, so 1 million sample images of different faces correspond to 1 million classification labels.
  • Step S134 Perform a second loss calculation process on the classification result and the classification label through the loss determination layer to obtain a classification loss.
  • the classification result and the classification label may be subjected to a second loss calculation process through the loss determination layer to obtain the classification loss.
  • the classification result may be a k-dimensional column vector.
  • For multiple sample images, the classification results can be expressed as a classification matrix with k rows and k columns (each column indicates the probability values of one sample image being predicted as each of all the sample images).
  • a preset loss function can be used to calculate the classification loss.
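  • As one example of such a preset loss function (an assumption here, since the application leaves the second loss calculation process open), cross-entropy between the classification result and the per-image classification labels could be used:

```python
import torch
import torch.nn.functional as F

def classification_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Classification loss between the classification result (one row of k scores per sample
    image) and the preset classification labels (one label index per sample image)."""
    # cross_entropy applies softmax internally, matching the normalized classification result
    return F.cross_entropy(logits, labels)
```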
  • In step S140, the specific method for updating the neural network model is not limited and can be selected according to actual application requirements.
  • the neural network model may be trained based on the reconstruction loss and the classification loss respectively.
  • Step S140 may include step S141 and step S142; the specific content is as follows.
  • Step S141: Perform summation processing based on the reconstruction loss and the classification loss to obtain a total loss value.
  • In this embodiment of the application, after the reconstruction loss and the classification loss are obtained, they can be summed, that is, the sum of the reconstruction loss and the classification loss can be calculated, so as to obtain the corresponding total loss value.
  • Step S142: Update the neural network model based on the total loss value and a preset back propagation algorithm to obtain an image recognition model.
  • In this embodiment of the application, after the total loss value is obtained, the neural network model can be updated based on the total loss value according to the back propagation algorithm (BP algorithm, a supervised learning algorithm); that is, the parameters of each network layer included in the neural network model (such as the feature extraction layer, the image classification layer, and the image reconstruction layer) are updated to obtain a new neural network model, i.e. the required image recognition model.
  • The specific method used in step S141 to calculate the total loss value is not limited and can be selected according to actual application requirements.
  • the reconstruction loss and the classification loss can be directly summed to obtain the corresponding total loss.
  • a pre-configured weight coefficient can be obtained first, and then the weighted sum of the reconstruction loss and the classification loss is calculated based on the weight coefficient to obtain the corresponding total loss value.
  • the weight coefficient may be a fixed value or a dynamically changing value.
  • For example, the weight coefficients may be adjusted based on the determined reconstruction loss and classification loss: the larger the value of a loss, the larger its corresponding weight coefficient can be set.
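  • A minimal sketch of step S141 with the weighted variant described above (the default coefficients of 1.0 are placeholders; with both set to 1.0 the weighted sum reduces to the direct sum):

```python
import torch

def total_loss(rec_loss: torch.Tensor, cls_loss: torch.Tensor,
               w_rec: float = 1.0, w_cls: float = 1.0) -> torch.Tensor:
    """Total loss value: weighted sum of the reconstruction loss and the classification loss."""
    return w_rec * rec_loss + w_cls * cls_loss
```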
  • The specific method used in step S142 to update the neural network model is not limited and can also be selected according to actual application requirements.
  • For example, the neural network model can be updated once based on the total loss value (that is, the number of iterations of the back propagation algorithm is limited to 1) to ensure the efficiency of updating the neural network model.
  • step S142 may include step S142a and step S142b, and the specific content is as follows.
  • Step S142a: Update the neural network model based on the obtained total loss value and the preset back propagation algorithm to obtain a new neural network model.
  • In this embodiment of the application, after the total loss value is obtained, the neural network model can be updated according to the back propagation algorithm based on the total loss value (that is, one iteration is completed based on the back propagation algorithm), so as to obtain a new neural network model.
  • the new neural network model may be configured to process the sample image again to obtain a new total loss value.
  • Step S142b: Determine whether the new total loss value is less than a preset loss value.
  • In this embodiment of the application, after the new neural network model is obtained in step S142a, feature extraction processing, classification processing, reconstruction processing, and loss determination processing are performed on the sample images based on the new neural network model, and it is then determined whether the resulting new total loss value is less than the preset loss value.
  • If the new total loss value is less than the preset loss value, the current neural network model (that is, the neural network model obtained from the last update) is used as the image recognition model.
  • If the new total loss value is not less than the preset loss value, step S142a is executed again: based on the new total loss value, the current neural network model is updated again according to the back propagation algorithm (that is, a second iteration is completed based on the back propagation algorithm), so as to obtain a new neural network model once more.
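  • Putting steps S142a and S142b together, the update loop can be sketched as follows; PyTorch, SGD, the learning rate, the preset loss value, and the added iteration cap are all assumptions made for illustration, and the loss helpers are the sketches given earlier:

```python
import torch

def train(model, sample_images, labels, preset_loss_value=0.01, lr=0.1, max_iters=10000):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(max_iters):
        # Step S142a: one back-propagation update based on the current total loss value.
        logits, reconstructed = model(sample_images)
        loss = total_loss(reconstruction_loss(reconstructed, sample_images),
                          classification_loss(logits, labels))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Step S142b: process the sample images again with the updated model and
        # compare the new total loss value with the preset loss value.
        with torch.no_grad():
            logits, reconstructed = model(sample_images)
            new_loss = total_loss(reconstruction_loss(reconstructed, sample_images),
                                  classification_loss(logits, labels))
        if new_loss < preset_loss_value:
            break  # the last updated model is used as the image recognition model
    return model
```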
  • an embodiment of the present application also provides an image recognition method applicable to the above-mentioned electronic device 10.
  • the method steps defined in the process related to the image recognition method can be implemented by the electronic device 10, that is, the electronic device can execute the image recognition method provided in the embodiment of the present application.
  • the specific process shown in FIG. 10 will be described in detail below.
  • Step S210 Input the obtained target image into a preset image recognition model.
  • the electronic device 10 may first input the target image into a preset image recognition model.
  • the image recognition model may be obtained by training based on the aforementioned image recognition model training method.
  • Step S220 Perform recognition processing on the target image through the image recognition model to obtain a recognition result.
  • the electronic device 10 may perform recognition processing on the target image through the image recognition model to obtain a corresponding recognition result.
  • For example, when the target image includes a human face, the corresponding person information can be determined from the facial features in the target image, such as determining whether the face belongs to a certain person.
  • Since the image recognition model is trained based on the above-mentioned image recognition model training method, it has a strong feature information extraction capability, so the target image can be recognized with high accuracy and the obtained recognition result is more accurate.
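  • A minimal sketch of steps S210 and S220, assuming the illustrative model above; mapping the returned class index back to a person is application-specific and not shown:

```python
import torch

def recognize(image_recognition_model, target_image: torch.Tensor) -> int:
    """Input the target image into the image recognition model (step S210) and perform
    recognition processing to obtain a recognition result (step S220), here a class index."""
    image_recognition_model.eval()
    with torch.no_grad():
        logits, _ = image_recognition_model(target_image.unsqueeze(0))  # add batch dimension
    return int(logits.argmax(dim=1).item())
```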
  • An embodiment of the present application also provides an image recognition model training apparatus 100, which can be applied to the above-mentioned electronic device 10.
  • the image recognition model training device 100 may include a feature classification module 110, a feature reconstruction module 120, a loss determination module 130, and a model update module 140.
  • The feature classification module 110 may be configured to classify image features through a preset image classification layer to obtain a classification result, where the image classification layer belongs to a preset neural network model, and the image features are obtained by processing a sample image with the feature extraction layer of the neural network model.
  • the feature classification module 110 may be configured to perform step S110 shown in FIG. 2, and for related content of the feature classification module 110, reference may be made to the foregoing description of step S110.
  • the feature reconstruction module 120 may be configured to perform reconstruction processing on the image features through a preset image reconstruction layer to obtain a reconstructed image, where the image reconstruction layer belongs to the neural network model.
  • the feature reconstruction module 120 may be configured to perform step S120 shown in FIG. 2, and the relevant content of the feature reconstruction module 120 can refer to the foregoing description of step S120.
  • the loss determination module 130 may be configured to perform loss determination processing on the reconstructed image through the loss determination layer in the neural network model to obtain a reconstruction loss, and perform loss determination processing on the classification result to obtain a classification loss.
  • the loss determination module 130 may be configured to perform step S130 shown in FIG. 2, and for related content of the loss determination module 130, reference may be made to the foregoing description of step S130.
  • the model update module 140 may be configured to update the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model, wherein the image recognition model is configured to recognize a target image.
  • the model update module 140 may be configured to perform step S140 shown in FIG. 2. For related content of the model update module 140, reference may be made to the foregoing description of step S140.
  • In summary, the above-mentioned image recognition model training device obtains the classification result and the reconstructed image by performing classification processing and reconstruction processing on the image features, so that when the parameters of the neural network model are updated (that is, when the neural network model is trained), the neural network model can be trained based on the reconstruction loss and the classification loss to obtain the image recognition model.
  • In this way, the reconstruction loss and the classification loss are fully considered, so the constraints and restrictions on the information on which the training is based are relatively strong, and the image recognition model obtained by training can extract more image feature information when performing image recognition, thereby improving the accuracy of image recognition, better alleviating the problem of low recognition accuracy in existing image recognition technologies, and having high practical value.
  • the model update module is further configured to: perform summation processing based on the reconstruction loss and the classification loss to obtain a total loss value; and update the neural network model based on the total loss value and a preset back propagation algorithm to obtain an image recognition model.
  • the model update module is further configured to: obtain pre-configured weight coefficients; and calculate a weighted sum of the reconstruction loss and the classification loss based on the weight coefficients, the weighted sum being used as the total loss value.
  • the model update module is further configured to: step a, update the neural network model based on the obtained total loss value and the preset back propagation algorithm to obtain a new neural network model, where the new neural network model is configured to process the sample images again to obtain a new total loss value; and step b, determine whether the new total loss value is less than a preset loss value, use the neural network model obtained from the last update as the image recognition model when the new total loss value is less than the preset loss value, and execute step a again when the new total loss value is not less than the preset loss value.
  • the loss determination module is further configured to: for each reconstructed image, determine the pixel loss between the reconstructed image and the corresponding sample image through the loss determination layer, where there are multiple reconstructed images and multiple sample images; and perform a first loss calculation process on the determined multiple pixel losses through the loss determination layer to obtain the reconstruction loss.
  • the loss determination module is further configured to: obtain multiple preset classification labels through the loss determination layer, where the classification labels are generated by performing identification processing on the multiple sample images; and perform a second loss calculation process on the classification result and the classification labels through the loss determination layer to obtain the classification loss.
  • An embodiment of the present application also provides an image recognition apparatus 200, which can be applied to the above-mentioned electronic device 10.
  • the image recognition device 200 may include an image input module 210 and an image recognition module 220.
  • the image input module 210 may be configured to obtain a target image and input the target image into a preset image recognition model, wherein the image recognition model is trained based on the aforementioned image recognition model training device.
  • the image input module 210 may be configured to perform step S210 shown in FIG. 10, and for related content of the image input module 210, reference may be made to the foregoing description of step S210.
  • the image recognition module 220 may be configured to perform recognition processing on the target image through the image recognition model to obtain a recognition result.
  • the image recognition module 220 may be configured to perform step S220 shown in FIG. 10, and for related content of the image recognition module, reference may be made to the foregoing description of step S220.
  • Since the image recognition device provided by the embodiment of the present application uses the image recognition model obtained by the aforementioned model training device for image recognition, it has high image recognition accuracy.
  • An embodiment of the present application also provides a computer-readable storage medium storing a computer program; when the computer program runs, the steps of the above-mentioned image recognition model training method are executed.
  • An embodiment of the present application also provides a computer-readable storage medium storing a computer program; when the computer program runs, the steps of the above-mentioned image recognition method are executed.
  • In summary, the model training and image recognition methods and apparatuses, the electronic device, and the storage medium provided by this application perform classification processing and reconstruction processing on image features to obtain classification results and reconstructed images, so that the neural network model can be trained based on the reconstruction loss and the classification loss to obtain an image recognition model.
  • In this way, the reconstruction loss and the classification loss are fully considered, so the constraints and restrictions on the information on which the training is based are relatively strong, and the image recognition model obtained by training can extract more image feature information when performing image recognition, thereby improving the accuracy of image recognition and better alleviating the problem of low recognition accuracy in related image recognition technologies. This has high practical value, especially for face recognition: because the feature information of different faces is relatively similar (if too few features are extracted, recognition failures or errors are very likely to occur), being able to extract more distinct feature information makes the recognition result more accurate, and the application effect is remarkable.
  • Each block in the flowchart or block diagram may represent a module, program segment, or part of code, and the module, program segment, or part of code contains one or more executable instructions for realizing the specified logical function.
  • It should also be noted that the functions marked in the blocks may occur in an order different from the order marked in the drawings.
  • Each block in the block diagram and/or flowchart, and each combination of blocks in the block diagram and/or flowchart, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
  • the functional modules in the various embodiments of the present application may be integrated together to form an independent part, or each module may exist alone, or two or more modules may be integrated to form an independent part.
  • If the function is implemented in the form of a software function module and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of this application in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions configured to cause a computer device (which may be a personal computer, an electronic device, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage media include various media that can store program code, such as a USB flash drive, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
  • In this document, the terms "comprising", "including", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to the process, method, article, or device. Without further restrictions, an element defined by the phrase "including a..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
  • In the solutions provided above, the classification result and the reconstructed image are obtained by performing classification processing and reconstruction processing on the image features, so that when the neural network model is updated (that is, when the neural network model is trained), the neural network model can be trained based on the reconstruction loss and the classification loss to obtain the image recognition model.
  • In this way, the reconstruction loss and the classification loss are fully considered, so the constraints and restrictions on the information on which the training is based are relatively strong, and the image recognition model obtained by training can extract more image feature information when performing image recognition, thereby improving the accuracy of image recognition, better alleviating the problem of low recognition accuracy in related image recognition technologies, and having high practical value.

Abstract

Provided are a model training method and apparatus, an image recognition method and apparatus, an electronic device, and a storage medium, which relate to the technical field of image recognition. In the present application, the model training method comprises: firstly, performing classification processing on image features by means of an image classification layer in a neural network model, so as to obtain a classification result, wherein the image features are obtained by means of processing a sample image on the basis of a feature extraction layer of the neural network model; secondly, performing reconstruction processing on the image features by means of an image reconstruction layer in the neural network model, so as to obtain a reconstructed image; then, performing loss determination processing on the reconstructed image by means of a loss determination layer in the neural network model, so as to obtain a reconstruction loss, and then performing loss determination processing on the classification result to obtain a classification loss; and finally, performing update processing on the neural network model on the basis of the reconstruction loss and the classification loss, so as to obtain an image recognition model. By means of the method, the problem of the recognition precision of a trained image recognition model in the related art not being high can be solved.

Description

Model training method and apparatus, image recognition method and apparatus, electronic device, and storage medium
Cross-reference to related applications
This application claims priority to the Chinese patent application No. 2020104869488, titled "Model training and image recognition method and apparatus, electronic device and storage medium", filed with the Chinese Patent Office on June 1, 2020, the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the field of image recognition technology, and in particular to a model training method and apparatus, an image recognition method and apparatus, an electronic device, and a storage medium.
Background
With the continuous development of image recognition technology, its range of applications has become wider and wider, and the accuracy requirements for image recognition results have become higher and higher.
The inventor's research found that, in related technologies, the process of training an image recognition model imposes insufficient constraints and restrictions on the information on which the training is based, so the trained image recognition model extracts relatively little information when performing image recognition, which leads to the problem of low recognition accuracy of the image recognition model.
Summary of the invention
In view of this, the purpose of this application is to provide a model training method and apparatus, an image recognition method and apparatus, an electronic device, and a storage medium, so as to alleviate the problem of low recognition accuracy of image recognition models trained in related technologies.
In order to achieve the foregoing objectives, the embodiments of the present application adopt the following technical solutions:
An image recognition model training method, including:
classifying image features through a preset image classification layer to obtain a classification result, where the image classification layer belongs to a preset neural network model, and the image features are obtained by processing a sample image with the feature extraction layer of the neural network model;
performing reconstruction processing on the image features through a preset image reconstruction layer to obtain a reconstructed image, where the image reconstruction layer belongs to the neural network model;
performing loss determination processing on the reconstructed image through a loss determination layer in the neural network model to obtain a reconstruction loss, and performing loss determination processing on the classification result to obtain a classification loss;
updating the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model, where the image recognition model is configured to recognize a target image.
Optionally, in the above image recognition model training method, the step of updating the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model includes:
performing summation processing based on the reconstruction loss and the classification loss to obtain a total loss value;
updating the neural network model based on the total loss value and a preset back propagation algorithm to obtain an image recognition model.
Optionally, the step of performing summation processing based on the reconstruction loss and the classification loss to obtain a total loss value includes:
obtaining pre-configured weight coefficients;
calculating a weighted sum of the reconstruction loss and the classification loss based on the weight coefficients, and using the weighted sum as the total loss value.
Optionally, in the above image recognition model training method, the step of updating the neural network model based on the total loss value and a preset back propagation algorithm to obtain an image recognition model includes:
a. updating the neural network model based on the obtained total loss value and the preset back propagation algorithm to obtain a new neural network model, where the new neural network model is configured to process the sample images again to obtain a new total loss value;
b. determining whether the new total loss value is less than a preset loss value; when the new total loss value is less than the preset loss value, using the neural network model obtained from the last update as the image recognition model; when the new total loss value is not less than the preset loss value, executing step a again.
Optionally, in the above image recognition model training method, the step of performing loss determination processing on the reconstructed image to obtain a reconstruction loss includes:
for each reconstructed image, determining the pixel loss between the reconstructed image and the corresponding sample image through the loss determination layer, where there are multiple reconstructed images and multiple sample images;
performing a first loss calculation process on the determined multiple pixel losses through the loss determination layer to obtain the reconstruction loss.
Optionally, in the above image recognition model training method, the step of performing loss determination processing on the classification result to obtain a classification loss includes:
obtaining multiple preset classification labels through the loss determination layer, where the classification labels are generated by performing identification processing on the multiple sample images;
performing a second loss calculation process on the classification result and the classification labels through the loss determination layer to obtain the classification loss.
On this basis, an embodiment of the present application also provides an image recognition method, including:
obtaining a target image and inputting the target image into a preset image recognition model, where the image recognition model is trained based on the above image recognition model training method;
performing recognition processing on the target image through the image recognition model to obtain a recognition result.
An embodiment of the present application also provides an image recognition model training apparatus, including:
a feature classification module, configured to classify image features through a preset image classification layer to obtain a classification result, where the image classification layer belongs to a preset neural network model, and the image features are obtained by processing a sample image with the feature extraction layer of the neural network model;
a feature reconstruction module, configured to perform reconstruction processing on the image features through a preset image reconstruction layer to obtain a reconstructed image, where the image reconstruction layer belongs to the neural network model;
a loss determination module, configured to perform loss determination processing on the reconstructed image through a loss determination layer in the neural network model to obtain a reconstruction loss, and perform loss determination processing on the classification result to obtain a classification loss;
a model update module, configured to update the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model, where the image recognition model is configured to recognize a target image.
Optionally, the model update module is further configured to:
perform summation processing based on the reconstruction loss and the classification loss to obtain a total loss value;
update the neural network model based on the total loss value and a preset back propagation algorithm to obtain an image recognition model.
Optionally, the model update module is further configured to:
obtain pre-configured weight coefficients;
calculate a weighted sum of the reconstruction loss and the classification loss based on the weight coefficients, and use the weighted sum as the total loss value.
Optionally, the model update module is further configured to:
a. update the neural network model based on the obtained total loss value and the preset back propagation algorithm to obtain a new neural network model, where the new neural network model is used to process the sample images again to obtain a new total loss value;
b. determine whether the new total loss value is less than a preset loss value; when the new total loss value is less than the preset loss value, use the neural network model obtained from the last update as the image recognition model; when the new total loss value is not less than the preset loss value, execute step a again.
Optionally, the loss determination module is further configured to:
for each reconstructed image, determine the pixel loss between the reconstructed image and the corresponding sample image through the loss determination layer, where there are multiple reconstructed images and multiple sample images;
perform a first loss calculation process on the determined multiple pixel losses through the loss determination layer to obtain the reconstruction loss.
Optionally, the loss determination module is further configured to:
obtain multiple preset classification labels through the loss determination layer, where the classification labels are generated by performing identification processing on the multiple sample images;
perform a second loss calculation process on the classification result and the classification labels through the loss determination layer to obtain the classification loss.
On this basis, an embodiment of the present application also provides an image recognition device, including:
an image input module, configured to obtain a target image and input the target image into a preset image recognition model, where the image recognition model is trained based on the above image recognition model training apparatus;
an image recognition module, configured to perform recognition processing on the target image through the image recognition model to obtain a recognition result.
On this basis, an embodiment of the present application also provides an electronic device, including:
a memory, configured to store a computer program;
a processor connected to the memory, configured to execute the computer program stored in the memory to implement the above image recognition model training method or the above image recognition method.
On this basis, the embodiments of the present application also provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed, the above image recognition model training method or the above image recognition method is implemented.
为使本申请的上述目的、特征和优点能更明显易懂,下文特举较佳实施例,并配合所附附图,作详细说明如下。In order to make the above objectives, features, and advantages of the present application more comprehensible, preferred embodiments and accompanying drawings are described below in detail.
Description of the Drawings
To explain the technical solutions of the present application more clearly, the drawings required for the description are briefly introduced below. It should be understood that the following drawings show only some implementations of the present application and therefore should not be regarded as limiting the scope; for those of ordinary skill in the art, other related drawings can be obtained from these drawings without creative effort.
FIG. 1 is a structural block diagram of an electronic device provided by an embodiment of the present application.
FIG. 2 is a schematic flowchart of an image recognition model training method provided by an embodiment of the present application.
FIG. 3 is a schematic diagram of sample images and the corresponding reconstructed images provided by an embodiment of the present application.
FIG. 4 is a schematic flowchart of the sub-steps included in step S130 in FIG. 2.
FIG. 5 is a schematic diagram of the pixel information of a sample image provided by an embodiment of the present application.
FIG. 6 is a schematic diagram of the pixel information of a reconstructed image provided by an embodiment of the present application.
FIG. 7 is a schematic flowchart of other sub-steps included in step S130 in FIG. 2.
FIG. 8 is a schematic flowchart of the sub-steps included in step S140 in FIG. 2.
FIG. 9 is a schematic flowchart of the sub-steps included in step S142 in FIG. 8.
FIG. 10 is a schematic flowchart of an image recognition method provided by an embodiment of the present application.
FIG. 11 is a block diagram of the functional modules included in the image recognition model training apparatus provided by an embodiment of the present application.
FIG. 12 is a block diagram of the functional modules included in the image recognition apparatus provided by an embodiment of the present application.
Reference numerals: 10 - electronic device; 12 - memory; 14 - processor; 100 - image recognition model training apparatus; 110 - feature classification module; 120 - feature reconstruction module; 130 - loss determination module; 140 - model update module; 200 - image recognition apparatus; 210 - image input module; 220 - image recognition module.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present application. The components of the embodiments of the present application, as generally described and illustrated in the drawings herein, may be arranged and designed in a variety of different configurations.
Therefore, the following detailed description of the embodiments of the present application provided in the accompanying drawings is not intended to limit the scope of the claimed application, but merely represents selected embodiments of the application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application without creative effort shall fall within the protection scope of the present application.
As shown in FIG. 1, an embodiment of the present application provides an electronic device 10. The electronic device 10 may include a memory 12 and a processor 14.
In detail, the memory 12 and the processor 14 are electrically connected, directly or indirectly, to enable data transmission or interaction. For example, they may be electrically connected to each other through one or more communication buses or signal lines. The memory 12 may store at least one software functional module that may exist in the form of software or firmware. The processor 14 may execute an executable computer program stored in the memory 12, such as the aforementioned software functional module, so as to implement the image recognition model training method provided by the embodiments of the present application (described later) to obtain an image recognition model, or to implement the image recognition method provided by the embodiments of the present application (described later) to obtain a recognition result for a target image.
Optionally, the memory 12 may be, but is not limited to, a random access memory (RAM), a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or the like.
The processor 14 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), a system on chip (SoC), or the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
It can be understood that the electronic device 10 may be a computer, a server, or another device with data processing capability.
The structure shown in FIG. 1 is only illustrative. The electronic device 10 may also include more or fewer components than those shown in FIG. 1, or have a configuration different from that shown in FIG. 1. For example, it may further include a communication unit for information interaction with other devices, such as interacting with a database to obtain sample images, or interacting with a terminal device to obtain a target image. The structure of the electronic device 10 is not limited in the embodiments of the present application.
With reference to FIG. 2, an embodiment of the present application further provides an image recognition model training method applicable to the above electronic device 10. The method steps defined in the flow of the image recognition model training method may be implemented by the electronic device 10; that is, the image recognition model training method provided by the embodiments of the present application may be executed by the electronic device. The main steps S110 to S140 shown in FIG. 2 are described in detail below.
Step S110: perform classification processing on image features through a preset image classification layer to obtain a classification result.
In this embodiment, the electronic device 10 may first perform classification processing on the image features through the image classification layer of a preset neural network model to obtain the classification result.
The image features may be obtained by processing sample images based on the feature extraction layer of the neural network model.
Step S120: perform reconstruction processing on the image features through a preset image reconstruction layer to obtain reconstructed images.
In this embodiment, the electronic device 10 may also perform reconstruction processing on the image features through the image reconstruction layer of the neural network model to obtain reconstructed images.
Step S130: perform loss determination on the reconstructed images through the loss determination layer in the neural network model to obtain a reconstruction loss, and perform loss determination on the classification result to obtain a classification loss.
In this embodiment, after the classification result and the reconstructed images are obtained in step S110 and step S120, respectively, the electronic device 10 may, on the one hand, perform loss determination on the classification result through the loss determination layer in the neural network model to obtain the corresponding classification loss; on the other hand, the electronic device 10 may also perform loss determination on the reconstructed images through that loss determination layer to obtain the corresponding reconstruction loss.
Step S140: update the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model.
That is, the above neural network model includes a feature extraction layer, an image classification layer, an image reconstruction layer, and a loss determination layer. The feature extraction layer is connected to the image classification layer and to the image reconstruction layer, and the image classification layer and the image reconstruction layer are each connected to the loss determination layer. In other words, the input of the feature extraction layer is the sample images and its output is the image features; the input of both the image classification layer and the image reconstruction layer is the image features, the output of the image classification layer is the classification result, and the output of the image reconstruction layer is the reconstructed images; the inputs of the loss determination layer are the classification result and the reconstructed images, and its outputs are the reconstruction loss and the classification loss.
In this embodiment, after the reconstruction loss and the classification loss are obtained in step S130, the electronic device 10 may update the neural network model based on the reconstruction loss and the classification loss (i.e., train the neural network model). Specifically, the model parameters of the neural network model are updated, and the finally trained neural network model is the image recognition model, which is configured to recognize a target image.
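By way of a non-limiting illustration only, a minimal sketch of such a network is given below in PyTorch-style Python. It merely illustrates the layer arrangement described above: the class name RecognitionNet, the convolutional encoder, the assumed 3x112x112 input size, and all layer widths are assumptions of this sketch rather than features of the disclosed embodiments.

```python
# Illustrative sketch only; class name, layer sizes, and input resolution are assumptions.
import torch
import torch.nn as nn

class RecognitionNet(nn.Module):
    def __init__(self, feature_dim=128, num_classes=1000):
        super().__init__()
        # Feature extraction layer (encoder): sample image -> image feature
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 28 * 28, feature_dim),
        )
        # Image classification layer: image feature -> classification result
        self.classifier = nn.Linear(feature_dim, num_classes)
        # Image reconstruction layer (decoder): image feature -> reconstructed image
        self.decoder = nn.Sequential(
            nn.Linear(feature_dim, 64 * 28 * 28), nn.ReLU(),
            nn.Unflatten(1, (64, 28, 28)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        feature = self.encoder(x)               # image feature
        logits = self.classifier(feature)       # classification result (before normalization)
        reconstruction = self.decoder(feature)  # reconstructed image
        return logits, reconstruction
```

In this sketch, encoder plays the role of the feature extraction layer, classifier that of the image classification layer, and decoder that of the image reconstruction layer; the loss determination layer corresponds to the loss computations illustrated in the later sketches.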
With the above method, because the training of the neural network model fully takes into account both the reconstruction loss and the classification loss, the constraints and restrictions on the information on which the training is based are strengthened. As a result, when the trained image recognition model performs image recognition, it can extract more image feature information, thereby improving the accuracy of image recognition and alleviating the problem of low recognition accuracy in related image recognition technologies. This is particularly valuable for face recognition: because the feature information of different faces is often quite similar (if only a few features are extracted, recognition failures or errors are very likely to occur), an image recognition model trained in this way can extract more distinct feature information, so that the recognition results are more accurate.
In a first aspect, it should be noted for step S110 that the specific way of classifying the image features is not limited and may be selected according to actual application requirements.
For example, in an alternative example, there are multiple sample images and, correspondingly, multiple image features. The image classification layer may be a fully connected (FC) layer, and image classification is achieved through the classification function of the fully connected layer; that is, the multiple image features may be processed through the fully connected layer to obtain the classification result.
To make it convenient for the electronic device 10 to determine the classification loss based on the classification result, in an alternative example, the output of the fully connected layer may further be connected to a normalization function, so that when the values output by the normalization function are used as the classification result, they can be interpreted as the probabilities that each image feature belongs to the different classes.
Optionally, the normalization function may be a normalized exponential function (such as the softmax function), which compresses the value of every dimension of an arbitrary k-dimensional input vector z into the range (0, 1), with the compressed values of the k dimensions summing to 1, so that the value of each dimension of the input vector z has a probabilistic meaning.
In other words, what the fully connected layer actually outputs is a k-dimensional vector; that is, there are k image features, and each image feature corresponds to a one-dimensional vector.
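As a simple numerical illustration (not part of the embodiments), the following sketch applies the softmax normalization described above to a hypothetical 3-dimensional fully connected output; the specific values are arbitrary.

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([2.0, 0.5, -1.0])  # hypothetical k-dimensional FC output (k = 3)
probs = F.softmax(logits, dim=0)         # each compressed value lies in (0, 1)
print(probs)                             # approximately tensor([0.786, 0.175, 0.039])
print(probs.sum())                       # the k values sum to 1
```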
In a second aspect, it should be noted for step S120 that the specific way of reconstructing the image features is not limited either and may likewise be selected according to actual application requirements.
For example, in an alternative example, the feature extraction layer configured to process the sample images may be an encoder in the neural network model, and correspondingly, the image reconstruction layer may be a decoder in the neural network model.
That is, the electronic device 10 may perform feature extraction on the sample images based on the encoder to obtain the corresponding image features (a type of feature vector), and then reconstruct the image features based on the decoder to obtain the corresponding reconstructed images.
As shown in FIG. 3, the first row shows the facial images of nine persons (i.e., nine sample images). Performing feature extraction on these nine sample images yields nine image features, and reconstructing these nine image features then yields the nine face images in the second row (i.e., nine reconstructed images).
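Reusing the hypothetical RecognitionNet sketch given earlier (an assumption of this document, not of the embodiments), the encoder/decoder behaviour described above could be exercised as follows; the 9-image batch and 112x112 resolution are arbitrary placeholders.

```python
import torch

images = torch.rand(9, 3, 112, 112)    # nine placeholder sample face images
model = RecognitionNet(num_classes=9)  # hypothetical model from the earlier sketch
logits, reconstructed = model(images)
print(logits.shape)         # torch.Size([9, 9])           - one k-dimensional vector per image
print(reconstructed.shape)  # torch.Size([9, 3, 112, 112]) - same shape as the input images
```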
In a third aspect, it should be noted for step S130 that the specific way of determining the reconstruction loss is not limited and may be selected according to actual application requirements.
For example, in an alternative example, feature extraction may first be performed again on the reconstructed image through the feature extraction layer in the neural network model to obtain new image features; the new image features are then compared with the image features previously obtained by performing feature extraction on the sample image, and the corresponding reconstruction loss is obtained from the comparison.
For another example, in another alternative example, so that the image recognition model obtained by training the neural network model based on the reconstruction loss has a higher information extraction capability, step S130 may include step S131 and step S132 with reference to FIG. 4, as described below.
Step S131: for each reconstructed image, determine the pixel loss between the reconstructed image and the corresponding sample image through the loss determination layer.
In this embodiment, there may be multiple sample images and, correspondingly, multiple reconstructed images. Thus, for each reconstructed image, the pixel loss between the reconstructed image and the corresponding sample image may be determined through the loss determination layer in the neural network model.
Step S132: perform a first loss calculation on the multiple determined pixel losses through the loss determination layer to obtain the reconstruction loss.
In this embodiment, after the multiple pixel losses are obtained in step S131, a first loss calculation may be performed on the determined pixel losses through the loss determination layer to obtain the total reconstruction loss.
In detail, in a specific application example, as shown in FIG. 5, a sample image may include nine pixels: pixel A1, pixel A2, pixel A3, pixel A4, pixel A5, pixel A6, pixel A7, pixel A8, and pixel A9. As shown in FIG. 6, the reconstructed image obtained from the sample image of FIG. 5 after feature extraction and reconstruction may likewise include nine corresponding pixels: pixel B1, pixel B2, pixel B3, pixel B4, pixel B5, pixel B6, pixel B7, pixel B8, and pixel B9.
On this basis, the pixel differences (i.e., pixel losses) between pixel A1 and pixel B1, between pixel A2 and pixel B2, between pixel A3 and pixel B3, between pixel A4 and pixel B4, between pixel A5 and pixel B5, between pixel A6 and pixel B6, between pixel A7 and pixel B7, between pixel A8 and pixel B8, and between pixel A9 and pixel B9 may be calculated respectively.
In this way, nine pixel differences are obtained, and the reconstruction loss between the sample image shown in FIG. 5 and the reconstructed image shown in FIG. 6 can then be determined based on these nine pixel differences (pixel losses).
That is, the reconstruction loss is determined based on the pixel loss of every pixel, so that when the neural network model is trained based on the reconstruction loss, the constraint or supervision information is guaranteed to be at the pixel level, which strengthens the constraints and restrictions on the information on which the training of the neural network model is based. In this way, the trained image recognition model can have a higher feature information extraction capability, thereby achieving high-precision image recognition.
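A minimal sketch of such a pixel-level reconstruction loss is shown below. Using the mean of the squared per-pixel differences is an assumption of this sketch, since the embodiments only require that the reconstruction loss be computed from the individual pixel losses.

```python
import torch

def reconstruction_loss(samples: torch.Tensor, reconstructions: torch.Tensor) -> torch.Tensor:
    # one pixel loss per corresponding pixel pair (A1 vs B1, A2 vs B2, ...)
    pixel_losses = (samples - reconstructions) ** 2
    # first loss calculation: aggregate all pixel losses into the reconstruction loss
    return pixel_losses.mean()

samples = torch.rand(9, 3, 112, 112)          # placeholder sample images
reconstructions = torch.rand(9, 3, 112, 112)  # placeholder reconstructed images
loss_rec = reconstruction_loss(samples, reconstructions)
```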
In addition, it should also be noted for step S130 that the specific way of determining the classification loss is not limited and may be selected according to actual application requirements.
For example, in an alternative example, for one sample image, the corresponding prediction result may include the probability values that this sample image is predicted to be each one of all the sample images. If the number of all the sample images is k, then for one sample image, the obtained classification result may include k probability values.
In this way, the probability value with the largest value among the k probability values may be used to determine one probability value of 1, and the other k-1 probability values may be used to determine k-1 probability values of 0. Then, based on the k probability values included in the classification result and the determined k probability values consisting of one 1 and k-1 0s, the classification loss arising from classifying this sample image is calculated.
For another example, in another alternative example, it is taken into account that the probability value with the largest value in the classification result may, because the extracted feature information is insufficient, fail to indicate that the two corresponding sample images are actually the same sample image. Therefore, with reference to FIG. 7, step S130 may include step S133 and step S134, as described below.
Step S133: obtain multiple preset classification labels through the loss determination layer.
In this embodiment, when the classification loss needs to be calculated, multiple preset classification labels may first be obtained through the loss determination layer in the neural network model.
The classification labels are generated by labeling the multiple sample images; that is, there is a one-to-one correspondence between the classification labels and the sample images. For example, sample images of different faces have different classification labels, and one million sample images of different faces correspond to one million classification labels.
Step S134: perform a second loss calculation on the classification result and the classification labels through the loss determination layer to obtain the classification loss.
In this embodiment, after the classification labels are obtained in step S133, a second loss calculation may be performed on the classification result and the classification labels through the loss determination layer to obtain the classification loss.
The classification result may be a k-dimensional column vector. For example, for k image features (k sample images), the classification result may be expressed as a classification vector matrix with k rows and k columns (one column of data indicates the probability values that one sample image is predicted to be each one of all the sample images). In this way, based on the classification vector matrix and the classification labels, the classification loss can be calculated using a preset loss function.
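The following sketch illustrates one possible second loss calculation. The use of cross-entropy and the placeholder values are assumptions of this sketch, since the embodiments only require that "a preset loss function" be applied to the classification result and the classification labels.

```python
import torch
import torch.nn.functional as F

k = 9                                      # number of sample images / classification labels
classification_result = torch.randn(k, k)  # placeholder k x k matrix of class scores
labels = torch.arange(k)                   # one preset classification label per sample image

# second loss calculation: compare the predicted class probabilities with the labels
# (F.cross_entropy applies the softmax normalization internally)
loss_cls = F.cross_entropy(classification_result, labels)
```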
In a fourth aspect, it should be noted for step S140 that the specific way of updating the neural network model is not limited and may be selected according to actual application requirements.
For example, in an alternative example, the neural network model may be trained based on the reconstruction loss and the classification loss separately.
For another example, in another alternative example, in order to improve the efficiency of training the neural network model, step S140 may include step S141 and step S142 with reference to FIG. 8, as described below.
Step S141: perform a summation based on the reconstruction loss and the classification loss to obtain a total loss value.
In this embodiment, after the reconstruction loss and the classification loss are obtained in step S130, the reconstruction loss and the classification loss may be summed, that is, the sum of the reconstruction loss and the classification loss is calculated, to obtain the corresponding total loss value.
Step S142: update the neural network model based on the total loss value and a preset back-propagation algorithm to obtain an image recognition model.
In this embodiment, after the total loss value is obtained in step S141, the neural network model may be updated based on the total loss value according to the back-propagation (BP) algorithm, which is a supervised learning algorithm; that is, the parameters of each network layer included in the neural network model (such as the feature extraction layer, the image classification layer, and the image reconstruction layer) are updated to obtain a new neural network model, i.e., the required image recognition model.
Optionally, the specific way of performing step S141 to calculate the total loss value is not limited either and may be selected according to actual application requirements.
For example, in an alternative example, the reconstruction loss and the classification loss may be summed directly to obtain the corresponding total loss value.
For another example, in another alternative example, pre-configured weight coefficients may first be obtained, and the weighted sum of the reconstruction loss and the classification loss is then calculated based on the weight coefficients to obtain the corresponding total loss value.
The weight coefficients may be fixed values or dynamically changing values. For example, the weight coefficients may be adjusted based on the specific magnitudes of the determined reconstruction loss and classification loss; for instance, when one of the losses has a larger value, the corresponding weight coefficient may be set larger.
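A minimal sketch of the weighted summation is given below; the particular weight values are arbitrary assumptions and could equally be adjusted dynamically as described above.

```python
import torch

loss_rec = torch.tensor(0.8)  # placeholder reconstruction loss
loss_cls = torch.tensor(1.2)  # placeholder classification loss
w_rec, w_cls = 0.5, 1.0       # pre-configured weight coefficients (assumed values)

total_loss = w_rec * loss_rec + w_cls * loss_cls  # total loss value used for the update
```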
Optionally, the specific way of performing step S142 to update the neural network model is not limited either and may likewise be selected according to actual application requirements.
For example, in an alternative example, the neural network model may be updated once based on the total loss value (that is, the number of iterations of the back-propagation algorithm is limited to 1), so as to ensure the efficiency of updating the neural network model.
For another example, in another alternative example, in order to ensure that the obtained image recognition model has higher image recognition accuracy, step S142 may include step S142a and step S142b with reference to FIG. 9, as described below.
Step S142a: update the neural network model based on the obtained total loss value and the preset back-propagation algorithm to obtain a new neural network model.
In this embodiment, after the total loss value is obtained in step S141, the neural network model may be updated according to the back-propagation algorithm based on the total loss value (that is, one iteration is completed based on the back-propagation algorithm), thereby obtaining a new neural network model.
The new neural network model may be configured to process the sample images again to obtain a new total loss value.
Step S142b: determine whether the new total loss value is less than a preset loss value.
In this embodiment, after the new neural network model is obtained in step S142a and feature extraction, classification, reconstruction, and loss determination have been performed on the sample images based on the new neural network model, it is determined whether the resulting new total loss value is less than the preset loss value.
In this way, on the one hand, when the new total loss value is less than the preset loss value, it indicates that the current neural network model already has a high feature information extraction capability and high recognition accuracy; therefore, the current neural network model (i.e., the neural network model obtained by the last update) may be taken as the image recognition model.
On the other hand, when the new total loss value is not less than the preset loss value, it indicates that the current neural network model does not yet have a high feature information extraction capability and its recognition accuracy is not high enough; therefore, step S142a needs to be performed again, so that the current neural network model is updated once more according to the back-propagation algorithm based on the new total loss value (i.e., a second iteration is completed based on the back-propagation algorithm), thereby obtaining a new neural network model again.
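Putting steps S141, S142a, and S142b together, one possible training loop is sketched below. It reuses the hypothetical RecognitionNet and reconstruction_loss definitions from the earlier sketches, and the optimizer, learning rate, and preset loss value are assumptions made only for illustration.

```python
import torch
import torch.nn.functional as F

# assumes the hypothetical RecognitionNet and reconstruction_loss from the earlier sketches
model = RecognitionNet(num_classes=9)
samples = torch.rand(9, 3, 112, 112)
labels = torch.arange(9)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # optimizer choice is an assumption
preset_loss_value = 0.05                                  # preset loss value (assumed)

while True:
    # step S142a: one back-propagation update based on the current total loss value
    logits, reconstructions = model(samples)
    total = F.cross_entropy(logits, labels) + reconstruction_loss(samples, reconstructions)
    optimizer.zero_grad()
    total.backward()
    optimizer.step()
    # process the sample images again with the updated model to obtain a new total loss value
    with torch.no_grad():
        logits, reconstructions = model(samples)
        new_total = F.cross_entropy(logits, labels) + reconstruction_loss(samples, reconstructions)
    # step S142b: take the last updated model as the image recognition model once the
    # new total loss value is below the preset loss value; otherwise iterate again
    if new_total.item() < preset_loss_value:
        break
```

In practice such a loop would normally also bound the number of iterations; the unconditional loop above simply mirrors the iterate-until-below-threshold description given in the text.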
With reference to FIG. 10, an embodiment of the present application further provides an image recognition method applicable to the above electronic device 10. The method steps defined in the flow of the image recognition method may be implemented by the electronic device 10; that is, the image recognition method provided by the embodiments of the present application may be executed by the electronic device. The specific flow shown in FIG. 10 is described in detail below.
Step S210: input the obtained target image into a preset image recognition model.
In this embodiment, after obtaining the target image, the electronic device 10 may first input the target image into a preset image recognition model.
The image recognition model may be trained based on the aforementioned image recognition model training method.
Step S220: perform recognition processing on the target image through the image recognition model to obtain a recognition result.
In this embodiment, after the target image is input into the image recognition model in step S210, the electronic device 10 may perform recognition processing on the target image through the image recognition model to obtain the corresponding recognition result. For example, the corresponding person information may be determined from the facial features in the target image, such as determining whether the face belongs to a certain person.
Since the image recognition model is trained based on the above image recognition model training method, it has a high feature information extraction capability, so that the target image can be recognized with high accuracy and the obtained recognition result can have high accuracy.
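For completeness, a hypothetical inference sketch is shown below, again reusing the RecognitionNet assumption from the earlier sketches; interpreting the most probable class as the recognized identity is an assumption of this sketch rather than a limitation of the method.

```python
import torch
import torch.nn.functional as F

model = RecognitionNet(num_classes=9)  # in practice, the trained image recognition model
model.eval()

target_image = torch.rand(1, 3, 112, 112)  # placeholder target image
with torch.no_grad():
    logits, _ = model(target_image)         # the reconstruction branch is not needed at inference
    probs = F.softmax(logits, dim=1)
    recognized_id = probs.argmax(dim=1).item()  # recognition result: most likely identity index
```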
With reference to FIG. 11, an embodiment of the present application further provides an image recognition model training apparatus 100, which may be applied to the above electronic device 10. The image recognition model training apparatus 100 may include a feature classification module 110, a feature reconstruction module 120, a loss determination module 130, and a model update module 140.
The feature classification module 110 may be configured to classify image features through a preset image classification layer to obtain a classification result, wherein the image classification layer belongs to a preset neural network model and the image features are obtained by processing sample images based on the feature extraction layer of the neural network model. In this embodiment, the feature classification module 110 may be configured to perform step S110 shown in FIG. 2; for the relevant content of the feature classification module 110, reference may be made to the foregoing description of step S110.
The feature reconstruction module 120 may be configured to reconstruct the image features through a preset image reconstruction layer to obtain reconstructed images, wherein the image reconstruction layer belongs to the neural network model. In this embodiment, the feature reconstruction module 120 may be configured to perform step S120 shown in FIG. 2; for the relevant content of the feature reconstruction module 120, reference may be made to the foregoing description of step S120.
The loss determination module 130 may be configured to perform loss determination on the reconstructed images through the loss determination layer in the neural network model to obtain a reconstruction loss, and to perform loss determination on the classification result to obtain a classification loss. In this embodiment, the loss determination module 130 may be configured to perform step S130 shown in FIG. 2; for the relevant content of the loss determination module 130, reference may be made to the foregoing description of step S130.
The model update module 140 may be configured to update the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model, wherein the image recognition model is configured to recognize a target image. In this embodiment, the model update module 140 may be configured to perform step S140 shown in FIG. 2; for the relevant content of the model update module 140, reference may be made to the foregoing description of step S140.
With the image recognition model training apparatus provided by the embodiments of the present application, the classification result and the reconstructed images are obtained by performing classification processing and reconstruction processing on the image features respectively, so that when the parameters of the neural network model are updated (i.e., when the neural network model is trained), the neural network model can be trained based on both the reconstruction loss and the classification loss to obtain the image recognition model. Because the training of the neural network model fully takes both the reconstruction loss and the classification loss into account, the constraints and restrictions on the information on which the training is based are strengthened, so that the trained image recognition model can extract more image feature information when performing image recognition. This improves the accuracy of image recognition, alleviates the problem of low recognition accuracy in existing image recognition technologies, and has high practical value. In particular, when applied to face recognition, because the feature information of different faces is often quite similar (if only a few features are extracted, recognition failures or errors are very likely to occur), more distinct feature information can be extracted, so that the recognition results are more accurate and the application effect is significant.
Optionally, the model update module is further configured to:
perform a summation based on the reconstruction loss and the classification loss to obtain a total loss value;
update the neural network model based on the total loss value and a preset back-propagation algorithm to obtain an image recognition model.
Optionally, the model update module is further configured to:
obtain a pre-configured weight coefficient;
calculate the weighted sum of the reconstruction loss and the classification loss based on the weight coefficient, and take the weighted sum as the total loss value.
Optionally, the model update module is further configured to:
a. update the neural network model based on the obtained total loss value and a preset back-propagation algorithm to obtain a new neural network model, wherein the new neural network model is used to process the sample images again to obtain a new total loss value;
b. determine whether the new total loss value is less than a preset loss value; when the new total loss value is less than the preset loss value, take the neural network model obtained by the last update as the image recognition model; and when the new total loss value is not less than the preset loss value, perform step a again.
Optionally, the loss determination module is further configured to:
for each reconstructed image, determine, through the loss determination layer, the pixel loss between the reconstructed image and the corresponding sample image, wherein there are multiple reconstructed images and multiple sample images;
perform a first loss calculation on the multiple determined pixel losses through the loss determination layer to obtain the reconstruction loss.
Optionally, the loss determination module is further configured to:
obtain multiple preset classification labels through the loss determination layer, wherein the classification labels are generated by labeling the multiple sample images;
perform a second loss calculation on the classification result and the classification labels through the loss determination layer to obtain the classification loss.
With reference to FIG. 12, an embodiment of the present application further provides an image recognition apparatus 200, which may be applied to the above electronic device 10. The image recognition apparatus 200 may include an image input module 210 and an image recognition module 220.
The image input module 210 may be configured to obtain a target image and input the target image into a preset image recognition model, wherein the image recognition model is trained by the aforementioned image recognition model training apparatus. In this embodiment, the image input module 210 may be configured to perform step S210 shown in FIG. 10; for the relevant content of the image input module 210, reference may be made to the foregoing description of step S210.
The image recognition module 220 may be configured to perform recognition processing on the target image through the image recognition model to obtain a recognition result. In this embodiment, the image recognition module 220 may be configured to perform step S220 shown in FIG. 10; for the relevant content of the image recognition module 220, reference may be made to the foregoing description of step S220.
Since the image recognition apparatus provided by the embodiments of the present application performs image recognition using the image recognition model trained by the aforementioned model training apparatus, it achieves high image recognition accuracy.
In the embodiments of the present application, corresponding to the above image recognition model training method, a computer-readable storage medium is further provided. A computer program is stored in the computer-readable storage medium, and when the computer program runs, the steps of the above image recognition model training method are executed.
The steps executed when the aforementioned computer program runs are not described here again one by one; reference may be made to the foregoing explanation of the image recognition model training method.
In addition, in the embodiments of the present application, corresponding to the above image recognition method, a computer-readable storage medium is also provided. A computer program is stored in the computer-readable storage medium, and when the computer program runs, the steps of the above image recognition method are executed.
The steps executed when the aforementioned computer program runs are not described here again one by one; reference may be made to the foregoing explanation of the image recognition method.
In summary, with the model training and image recognition methods and apparatuses, the electronic device, and the storage medium provided by the present application, the classification result and the reconstructed images are obtained by performing classification processing and reconstruction processing on the image features respectively, so that when the parameters of the neural network model are updated (i.e., when the neural network model is trained), the neural network model can be trained based on both the reconstruction loss and the classification loss to obtain the image recognition model. Because the training of the neural network model fully takes both the reconstruction loss and the classification loss into account, the constraints and restrictions on the information on which the training is based are strengthened, so that the trained image recognition model can extract more image feature information when performing image recognition. This improves the accuracy of image recognition, alleviates the problem of low recognition accuracy in related image recognition technologies, and has high practical value. In particular, when applied to face recognition, because the feature information of different faces is often quite similar (if only a few features are extracted, recognition failures or errors are very likely to occur), more distinct feature information can be extracted, so that the recognition results are more accurate and the application effect is significant.
In the several embodiments provided in the embodiments of the present application, it should be understood that the disclosed apparatus and method may also be implemented in other ways. The apparatus and method embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the accompanying drawings show the possible architectures, functions, and operations of the apparatuses, methods, and computer program products according to multiple embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a part of code, and the module, program segment, or part of code contains one or more executable instructions for realizing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two consecutive blocks may actually be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should further be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the various embodiments of the present application may be integrated together to form an independent part, each module may exist separately, or two or more modules may be integrated to form an independent part.
If the functions are implemented in the form of software functional modules and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, an electronic device, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc. It should be noted that, in this document, the terms "include", "comprise", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device including a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
The above descriptions are only preferred embodiments of the present application and are not intended to limit the present application. For those skilled in the art, various modifications and changes may be made to the present application. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present application shall fall within the protection scope of the present application.
Industrial Applicability
In the technical solution proposed in the present application, the classification result and the reconstructed images are obtained by performing classification processing and reconstruction processing on the image features respectively, so that when the parameters of the neural network model are updated (i.e., when the neural network model is trained), the neural network model can be trained based on both the reconstruction loss and the classification loss to obtain the image recognition model. Because the training of the neural network model fully takes both the reconstruction loss and the classification loss into account, the constraints and restrictions on the information on which the training is based are strengthened, so that the trained image recognition model can extract more image feature information when performing image recognition. This improves the accuracy of image recognition, alleviates the problem of low recognition accuracy in related image recognition technologies, and has high practical value.

Claims (16)

  1. 一种图像识别模型训练方法,其特征在于,包括:An image recognition model training method, which is characterized in that it includes:
    通过预设的图像分类层对图像特征进行分类处理,得到分类结果,其中,该图像分类层属于预设的神经网络模型,该图像特征基于该神经网络模型的特征提取层对样本图像进行处理得到;The image features are classified through the preset image classification layer to obtain the classification result. The image classification layer belongs to the preset neural network model, and the image feature is based on the feature extraction layer of the neural network model to process the sample image. ;
    通过预设的图像重构层对所述图像特征进行重构处理,得到重构图像,其中,该图像重构层属于所述神经网络模型;Performing reconstruction processing on the image features through a preset image reconstruction layer to obtain a reconstructed image, where the image reconstruction layer belongs to the neural network model;
    通过所述神经网络模型中的损失确定层对所述重构图像进行损失确定处理得到重构损失、对所述分类结果进行损失确定处理得到分类损失;Performing loss determination processing on the reconstructed image through the loss determination layer in the neural network model to obtain a reconstruction loss, and performing loss determination processing on the classification result to obtain a classification loss;
    基于所述重构损失和所述分类损失对所述神经网络模型进行更新处理,得到图像识别模型,其中,该图像识别模型配置成对目标图像进行识别。The neural network model is updated based on the reconstruction loss and the classification loss to obtain an image recognition model, where the image recognition model is configured to recognize a target image.
  2. The image recognition model training method according to claim 1, wherein the step of updating the neural network model based on the reconstruction loss and the classification loss to obtain the image recognition model comprises:
    performing summation processing on the reconstruction loss and the classification loss to obtain a total loss value; and
    updating the neural network model based on the total loss value and a preset back-propagation algorithm to obtain the image recognition model.
  3. The image recognition model training method according to claim 2, wherein the step of performing summation processing on the reconstruction loss and the classification loss to obtain the total loss value comprises:
    obtaining a pre-configured weight coefficient; and
    calculating a weighted sum of the reconstruction loss and the classification loss based on the weight coefficient, and taking the weighted sum as the total loss value.
  4. The image recognition model training method according to claim 2 or 3, wherein the step of updating the neural network model based on the total loss value and the preset back-propagation algorithm to obtain the image recognition model comprises:
    a. updating the neural network model based on the obtained total loss value and the preset back-propagation algorithm to obtain a new neural network model, wherein the new neural network model is used to process the sample image again to obtain a new total loss value; and
    b. determining whether the new total loss value is less than a preset loss value; when the new total loss value is less than the preset loss value, taking the neural network model obtained from the last update as the image recognition model; and when the new total loss value is not less than the preset loss value, performing step a again.
  5. The image recognition model training method according to any one of claims 1-4, wherein the step of performing loss determination processing on the reconstructed image to obtain the reconstruction loss comprises:
    for each reconstructed image, determining, through the loss determination layer, a pixel loss between the reconstructed image and the corresponding sample image, wherein there are a plurality of reconstructed images and a plurality of sample images; and
    performing, through the loss determination layer, first loss calculation processing on the plurality of determined pixel losses to obtain the reconstruction loss.
  6. The image recognition model training method according to any one of claims 1-5, wherein the step of performing loss determination processing on the classification result to obtain the classification loss comprises:
    obtaining a plurality of preset classification labels through the loss determination layer, wherein the classification labels are generated by performing identification processing on the plurality of sample images; and
    performing, through the loss determination layer, second loss calculation processing on the classification result and the classification labels to obtain the classification loss.
  7. An image recognition method, characterized in that it comprises:
    obtaining a target image and inputting the target image into a preset image recognition model, wherein the image recognition model is trained based on the image recognition model training method according to any one of claims 1-6; and
    performing recognition processing on the target image through the image recognition model to obtain a recognition result.
  8. An image recognition model training apparatus, characterized in that it comprises:
    a feature classification module, configured to classify image features through a preset image classification layer to obtain a classification result, wherein the image classification layer belongs to a preset neural network model, and the image features are obtained by processing a sample image through a feature extraction layer of the neural network model;
    a feature reconstruction module, configured to perform reconstruction processing on the image features through a preset image reconstruction layer to obtain a reconstructed image, wherein the image reconstruction layer belongs to the neural network model;
    a loss determination module, configured to perform loss determination processing on the reconstructed image through a loss determination layer in the neural network model to obtain a reconstruction loss, and to perform loss determination processing on the classification result to obtain a classification loss; and
    a model update module, configured to update the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model, wherein the image recognition model is configured to recognize a target image.
  9. The image recognition model training apparatus according to claim 8, wherein the model update module is further configured to:
    perform summation processing on the reconstruction loss and the classification loss to obtain a total loss value; and
    update the neural network model based on the total loss value and a preset back-propagation algorithm to obtain the image recognition model.
  10. The image recognition model training apparatus according to claim 9, wherein the model update module is further configured to:
    obtain a pre-configured weight coefficient; and
    calculate a weighted sum of the reconstruction loss and the classification loss based on the weight coefficient, and take the weighted sum as the total loss value.
  11. The image recognition model training apparatus according to claim 9 or 10, wherein the model update module is further configured to:
    a. update the neural network model based on the obtained total loss value and the preset back-propagation algorithm to obtain a new neural network model, wherein the new neural network model is used to process the sample image again to obtain a new total loss value; and
    b. determine whether the new total loss value is less than a preset loss value; when the new total loss value is less than the preset loss value, take the neural network model obtained from the last update as the image recognition model; and when the new total loss value is not less than the preset loss value, perform step a again.
  12. The image recognition model training apparatus according to any one of claims 8-11, wherein the loss determination module is further configured to:
    for each reconstructed image, determine, through the loss determination layer, a pixel loss between the reconstructed image and the corresponding sample image, wherein there are a plurality of reconstructed images and a plurality of sample images; and
    perform, through the loss determination layer, first loss calculation processing on the plurality of determined pixel losses to obtain the reconstruction loss.
  13. The image recognition model training apparatus according to any one of claims 8-12, wherein the loss determination module is further configured to:
    obtain a plurality of preset classification labels through the loss determination layer, wherein the classification labels are generated by performing identification processing on the plurality of sample images; and
    perform, through the loss determination layer, second loss calculation processing on the classification result and the classification labels to obtain the classification loss.
  14. An image recognition apparatus, characterized in that it comprises:
    an image input module, configured to obtain a target image and input the target image into a preset image recognition model, wherein the image recognition model is trained by the image recognition model training apparatus according to claim 8; and
    an image recognition module, configured to perform recognition processing on the target image through the image recognition model to obtain a recognition result.
  15. An electronic device, characterized in that it comprises:
    a memory, configured to store a computer program; and
    a processor connected to the memory, configured to execute the computer program stored in the memory so as to implement the image recognition model training method according to any one of claims 1-6 or the image recognition method according to claim 7.
  16. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed, the image recognition model training method according to any one of claims 1-6 or the image recognition method according to claim 7 is implemented.
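As an illustration of the iterative update described in claims 2-4 above (together with the per-loss computations of claims 5-6), the following hedged Python sketch reuses the model and optimizer from the earlier sketch. The weight coefficient, preset loss value, batch format, and function names are all hypothetical and are not taken from this application.

```python
# Illustrative training loop for the weighted-loss update of claims 2-4
# (assumes `model`, `optimizer`, and a `sample_batches` iterable of
#  (images, labels) pairs already exist; all names are hypothetical).
import torch
import torch.nn.functional as F

WEIGHT_RECONSTRUCTION = 0.5   # pre-configured weight coefficient
PRESET_LOSS_VALUE = 0.05      # preset loss value used as the stop criterion

def total_loss_for_batch(model, images, labels):
    logits, reconstructed = model(images)
    classification_loss = F.cross_entropy(logits, labels)    # vs. classification labels
    reconstruction_loss = F.mse_loss(reconstructed, images)   # pixel loss vs. sample images
    # Weighted sum of the two losses gives the total loss value.
    return classification_loss + WEIGHT_RECONSTRUCTION * reconstruction_loss

def train_until_converged(model, optimizer, sample_batches, max_rounds=100):
    for _ in range(max_rounds):
        # Step a: update the model with back-propagation of the total loss.
        for images, labels in sample_batches:
            loss = total_loss_for_batch(model, images, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        # Step b: re-process the sample images with the updated model to get
        # the new total loss value and compare it against the preset value.
        with torch.no_grad():
            new_total = sum(float(total_loss_for_batch(model, imgs, lbls))
                            for imgs, lbls in sample_batches)
        if new_total < PRESET_LOSS_VALUE:
            break   # the last updated model becomes the image recognition model
    return model
```

The weighted sum lets the weight coefficient balance how strongly the pixel-level reconstruction constraint competes with the classification objective during the update.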
PCT/CN2021/096763 2020-06-01 2021-05-28 Model training method and apparatus, image recognition method and apparatus, electronic device, and storage medium WO2021244425A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010486948.8A CN111639607A (en) 2020-06-01 2020-06-01 Model training method, image recognition method, model training device, image recognition device, electronic equipment and storage medium
CN202010486948.8 2020-06-01

Publications (1)

Publication Number Publication Date
WO2021244425A1 (en)

Family

ID=72329555

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/096763 WO2021244425A1 (en) 2020-06-01 2021-05-28 Model training method and apparatus, image recognition method and apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN111639607A (en)
WO (1) WO2021244425A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111639607A (en) * 2020-06-01 2020-09-08 广州虎牙科技有限公司 Model training method, image recognition method, model training device, image recognition device, electronic equipment and storage medium
CN112668637B (en) * 2020-12-25 2023-05-23 苏州科达科技股份有限公司 Training method, recognition method and device of network model and electronic equipment
CN112651445A (en) * 2020-12-29 2021-04-13 广州中医药大学(广州中医药研究院) Biological information identification method and device based on deep network multi-modal information fusion

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10453444B2 (en) * 2017-07-27 2019-10-22 Microsoft Technology Licensing, Llc Intent and slot detection for digital assistants
CN108304829B (en) * 2018-03-08 2020-03-06 北京旷视科技有限公司 Face recognition method, device and system
CN109508669B (en) * 2018-11-09 2021-07-23 厦门大学 Facial expression recognition method based on generative confrontation network
CN110070030B (en) * 2019-04-18 2021-10-15 北京迈格威科技有限公司 Image recognition and neural network model training method, device and system
CN111160095B (en) * 2019-11-26 2023-04-25 华东师范大学 Unbiased face feature extraction and classification method and system based on depth self-encoder network

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10529318B2 (en) * 2015-07-31 2020-01-07 International Business Machines Corporation Implementing a classification model for recognition processing
CN107180248A (en) * 2017-06-12 2017-09-19 桂林电子科技大学 Strengthen the hyperspectral image classification method of network based on associated losses
CN110490878A (en) * 2019-07-29 2019-11-22 上海商汤智能科技有限公司 Image processing method and device, electronic equipment and storage medium
CN111639607A (en) * 2020-06-01 2020-09-08 广州虎牙科技有限公司 Model training method, image recognition method, model training device, image recognition device, electronic equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ROBERT THOMAS; THOME NICOLAS; CORD MATTHIEU: "HybridNet: Classification and Reconstruction Cooperation for Semi-supervised Learning", vol. 11211, Chap. 10, no. 558, 6 October 2018 (2018-10-06), Berlin, Heidelberg, pages 158-175, XP047488230, ISBN: 3540745491, DOI: 10.1007/978-3-030-01234-2_10 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115587629A (en) * 2022-12-07 2023-01-10 中国科学院上海高等研究院 Covariance expansion coefficient estimation method, model training method and storage medium terminal
CN115587629B (en) * 2022-12-07 2023-04-07 中国科学院上海高等研究院 Covariance expansion coefficient estimation method, model training method and storage medium terminal

Also Published As

Publication number Publication date
CN111639607A (en) 2020-09-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21816687

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21816687

Country of ref document: EP

Kind code of ref document: A1