WO2021244425A1 - Model training method and apparatus, image recognition method and apparatus, electronic device, and storage medium - Google Patents

Model training method and apparatus, image recognition method and apparatus, electronic device, and storage medium

Info

Publication number
WO2021244425A1
WO2021244425A1 (Application No. PCT/CN2021/096763)
Authority
WO
WIPO (PCT)
Prior art keywords
loss
image
classification
image recognition
neural network
Prior art date
Application number
PCT/CN2021/096763
Other languages
French (fr)
Chinese (zh)
Inventor
黄颖
邱尚锋
张文伟
Original Assignee
广州虎牙科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广州虎牙科技有限公司
Publication of WO2021244425A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/172 Classification, e.g. identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F 18/24133 Distances to prototypes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent

Definitions

  • This application relates to the field of image recognition technology, and in particular to a model training method and apparatus, an image recognition method and apparatus, an electronic device, and a storage medium.
  • The inventor's research found that, in related technologies, the process of training an image recognition model imposes insufficient constraints and restrictions on the information on which the training is based, so the trained image recognition model extracts relatively little information when performing image recognition, which leads to the problem of low recognition accuracy of the image recognition model.
  • In view of this, the purpose of this application is to provide a model training method and apparatus, an image recognition method and apparatus, an electronic device, and a storage medium, so as to alleviate the problem of low recognition accuracy of image recognition models trained in related technologies.
  • An image recognition model training method, including:
  • classifying image features through a preset image classification layer to obtain a classification result, where the image classification layer belongs to a preset neural network model, and the image features are obtained by processing a sample image with the feature extraction layer of the neural network model;
  • performing reconstruction processing on the image features through a preset image reconstruction layer to obtain a reconstructed image, where the image reconstruction layer belongs to the neural network model;
  • performing loss determination processing on the reconstructed image through a loss determination layer in the neural network model to obtain a reconstruction loss, and performing loss determination processing on the classification result to obtain a classification loss;
  • updating the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model, where the image recognition model is configured to recognize a target image.
  • Optionally, the step of updating the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model includes: performing summation processing based on the reconstruction loss and the classification loss to obtain a total loss value; and updating the neural network model based on the total loss value and a preset back propagation algorithm to obtain the image recognition model.
  • Optionally, the step of performing summation processing based on the reconstruction loss and the classification loss to obtain a total loss value includes: obtaining pre-configured weight coefficients; and calculating a weighted sum of the reconstruction loss and the classification loss based on the weight coefficients, the weighted sum being used as the total loss value.
  • Optionally, the step of updating the neural network model based on the total loss value and a preset back propagation algorithm to obtain an image recognition model includes: step a, updating the neural network model based on the obtained total loss value and the preset back propagation algorithm to obtain a new neural network model, where the new neural network model is configured to process the sample images again to obtain a new total loss value; and step b, determining whether the new total loss value is less than a preset loss value, using the neural network model obtained from the last update as the image recognition model when the new total loss value is less than the preset loss value, and executing step a again when the new total loss value is not less than the preset loss value.
  • Optionally, the step of performing loss determination processing on the reconstructed image to obtain a reconstruction loss includes: for each reconstructed image, determining the pixel loss between the reconstructed image and the corresponding sample image through the loss determination layer, where there are multiple reconstructed images and multiple sample images; and performing a first loss calculation process on the determined multiple pixel losses through the loss determination layer to obtain the reconstruction loss.
  • Optionally, the step of performing loss determination processing on the classification result to obtain a classification loss includes: obtaining multiple preset classification labels through the loss determination layer, where the classification labels are generated by performing identification processing on the multiple sample images; and performing a second loss calculation process on the classification result and the classification labels through the loss determination layer to obtain the classification loss.
  • On this basis, an embodiment of the present application also provides an image recognition method, including: obtaining a target image and inputting the target image into a preset image recognition model, where the image recognition model is trained based on the above image recognition model training method; and performing recognition processing on the target image through the image recognition model to obtain a recognition result.
  • the embodiment of the present application also provides an image recognition model training device, including:
  • the feature classification module is configured to classify image features through a preset image classification layer to obtain a classification result, where the image classification layer belongs to a preset neural network model, and the image features are obtained by processing a sample image with the feature extraction layer of the neural network model;
  • the feature reconstruction module is configured to perform reconstruction processing on the image features through a preset image reconstruction layer to obtain a reconstructed image, where the image reconstruction layer belongs to the neural network model;
  • a loss determination module configured to perform loss determination processing on the reconstructed image through a loss determination layer in the neural network model to obtain a reconstruction loss, and perform loss determination processing on the classification result to obtain a classification loss;
  • the model update module is configured to update the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model, wherein the image recognition model is configured to recognize a target image.
  • the model update module is further configured to: perform summation processing based on the reconstruction loss and the classification loss to obtain a total loss value; and update the neural network model based on the total loss value and a preset back propagation algorithm to obtain an image recognition model.
  • the model update module is further configured to: obtain pre-configured weight coefficients; and calculate a weighted sum of the reconstruction loss and the classification loss based on the weight coefficients, the weighted sum being used as the total loss value.
  • the model update module is further configured to: step a, update the neural network model based on the obtained total loss value and the preset back propagation algorithm to obtain a new neural network model, where the new neural network model is configured to process the sample images again to obtain a new total loss value; and step b, determine whether the new total loss value is less than a preset loss value, use the neural network model obtained from the last update as the image recognition model when the new total loss value is less than the preset loss value, and execute step a again when the new total loss value is not less than the preset loss value.
  • the loss determination module is further configured to: for each reconstructed image, determine the pixel loss between the reconstructed image and the corresponding sample image through the loss determination layer, where there are multiple reconstructed images and multiple sample images; and perform a first loss calculation process on the determined multiple pixel losses through the loss determination layer to obtain the reconstruction loss.
  • the loss determination module is further configured to: obtain multiple preset classification labels through the loss determination layer, where the classification labels are generated by performing identification processing on the multiple sample images; and perform a second loss calculation process on the classification result and the classification labels through the loss determination layer to obtain the classification loss.
  • an embodiment of the present application also provides an image recognition device, including:
  • the image input module is configured to obtain a target image and input the target image to a preset image recognition model, wherein the image recognition model is trained based on the above-mentioned image recognition model training device;
  • the image recognition module is configured to perform recognition processing on the target image through the image recognition model to obtain a recognition result.
  • an embodiment of the present application also provides an electronic device, including:
  • a memory, configured to store a computer program;
  • a processor connected to the memory, configured to execute the computer program stored in the memory to implement the above-mentioned image recognition model training method or the above-mentioned image recognition method.
  • the embodiments of the present application also provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed, the foregoing image recognition model training method or the foregoing image recognition method is implemented.
  • FIG. 1 is a structural block diagram of an electronic device provided by an embodiment of the application.
  • FIG. 2 is a schematic flowchart of an image recognition model training method provided by an embodiment of the application.
  • FIG. 3 is a schematic diagram of the effect between the sample image and the corresponding reconstructed image provided by the embodiment of the application.
  • FIG. 4 is a schematic flowchart of each sub-step included in step S130 in FIG. 2.
  • FIG. 5 is a schematic diagram of the effect of pixel information of a sample image provided by an embodiment of the application.
  • FIG. 6 is a schematic diagram of the effect of reconstructed image pixel information provided by an embodiment of the application.
  • FIG. 7 is a schematic flowchart of other sub-steps included in step S130 in FIG. 2.
  • FIG. 8 is a schematic flowchart of each sub-step included in step S140 in FIG. 2.
  • FIG. 9 is a schematic flowchart of each sub-step included in step S142 in FIG. 8.
  • FIG. 10 is a schematic flowchart of an image recognition method provided by an embodiment of this application.
  • FIG. 11 is a block diagram of the functional modules included in the image recognition model training apparatus provided by an embodiment of the application.
  • FIG. 12 is a schematic block diagram of functional modules included in the image recognition device provided by an embodiment of the application.
  • Reference numerals: 10-electronic device; 12-memory; 14-processor; 100-image recognition model training device; 110-feature classification module; 120-feature reconstruction module; 130-loss determination module; 140-model update module; 200-image recognition device; 210-image input module; 220-image recognition module.
  • an embodiment of the present application provides an electronic device 10.
  • the electronic device 10 may include a memory 12 and a processor 14.
  • the memory 12 and the processor 14 are directly or indirectly electrically connected to realize data transmission or interaction.
  • the memory 12 may store at least one software function module that may exist in the form of software or firmware.
  • The processor 14 may be configured to execute an executable computer program stored in the memory 12, such as the aforementioned software function module, so as to implement the image recognition model training method provided by the embodiment of the present application (described later) to obtain the image recognition model, or to implement the image recognition method provided by the embodiment of the present application (described later) to obtain the recognition result of a target image.
  • The memory 12 may be, but is not limited to, a random access memory (RAM), a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), etc.
  • The processor 14 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), a system on chip (SoC), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the electronic device 10 may be a computer or server with data processing capabilities.
  • FIG. 1 is only for illustration; the electronic device 10 may also include more or fewer components than those shown in FIG. 1, or have a configuration different from that shown in FIG. 1. For example, it may also include a communication unit for information interaction with other devices, such as interaction with databases to obtain sample images, or interaction with terminal devices to obtain target images. This is not limited here.
  • an embodiment of the present application also provides an image recognition model training method that can be applied to the above-mentioned electronic device 10.
  • the method steps defined in the process related to the image recognition model training method can be implemented by the electronic device 10, that is, the image recognition model training method provided in the embodiments of the present application can be executed by the electronic device.
  • the main steps S110 to S140 shown in FIG. 2 will be described in detail below.
  • Step S110 Perform classification processing on the image features through a preset image classification layer to obtain a classification result.
  • the electronic device 10 may first perform classification processing on the image features through the image classification layer of the preset neural network model to obtain the classification result.
  • the image feature may be obtained by processing a sample image based on the feature extraction layer of the neural network model.
  • Step S120 Perform reconstruction processing on the image feature through a preset image reconstruction layer to obtain a reconstructed image.
  • the electronic device 10 may also perform reconstruction processing on the image features through the image reconstruction layer of the neural network model to obtain a reconstructed image.
  • Step S130 Perform loss determination processing on the reconstructed image through the loss determination layer in the neural network model to obtain a reconstruction loss, and perform loss determination processing on the classification result to obtain a classification loss.
  • In this embodiment of the application, on the one hand, the electronic device 10 may perform loss determination processing on the classification result through the loss determination layer in the neural network model to obtain the corresponding classification loss; on the other hand, the electronic device 10 may also perform loss determination processing on the reconstructed image through the loss determination layer to obtain the corresponding reconstruction loss.
  • Step S140 Update the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model.
  • The aforementioned neural network model includes a feature extraction layer, an image classification layer, an image reconstruction layer, and a loss determination layer. The feature extraction layer is connected to the image classification layer and to the image reconstruction layer, and the image classification layer and the image reconstruction layer are each connected to the loss determination layer. That is: the input of the feature extraction layer is the sample image, and its output is the image features; the input of both the image classification layer and the image reconstruction layer is the image features, the output of the image classification layer is the classification result, and the output of the image reconstruction layer is the reconstructed image; the input of the loss determination layer is the classification result and the reconstructed image, and its output is the reconstruction loss and the classification loss.
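  • The application does not tie this layout to any particular framework. As an illustration only, the sketch below shows one way the four layers could be wired up, assuming PyTorch, 112x112 RGB inputs, and arbitrary layer sizes (all of which are choices made here, not values from the application):

```python
# Minimal sketch of the described model layout, assuming PyTorch.
# Encoder = feature extraction layer, FC classifier = image classification layer,
# decoder = image reconstruction layer; both heads consume the same image features.
import torch
import torch.nn as nn

class RecognitionModel(nn.Module):
    def __init__(self, feat_dim: int = 128, num_classes: int = 1000):
        super().__init__()
        # Feature extraction layer (encoder): sample image -> image features
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 28 * 28, feat_dim),          # assumes 112x112 inputs
        )
        # Image classification layer: image features -> classification result
        self.classifier = nn.Linear(feat_dim, num_classes)
        # Image reconstruction layer (decoder): image features -> reconstructed image
        self.decoder = nn.Sequential(
            nn.Linear(feat_dim, 64 * 28 * 28), nn.ReLU(),
            nn.Unflatten(1, (64, 28, 28)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, images: torch.Tensor):
        features = self.encoder(images)          # image features
        logits = self.classifier(features)       # classification result
        reconstruction = self.decoder(features)  # reconstructed image
        return logits, reconstruction
```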
  • The electronic device 10 may then update the neural network model based on the reconstruction loss and the classification loss (that is, train the neural network model); specifically, the model parameters of the neural network model are updated, the finally trained neural network model is the image recognition model, and the image recognition model is configured to recognize a target image.
  • In this way, the reconstruction loss and the classification loss are both fully considered, which strengthens the constraints and restrictions on the information on which the neural network model is trained, so that the trained image recognition model can extract more image feature information, thereby improving the accuracy of image recognition and better alleviating the problem of low recognition accuracy in related image recognition technologies.
  • In addition, the image recognition model obtained by training can extract more distinct feature information, which makes the recognition result more accurate.
  • In step S110, the specific method for classifying the image features is not limited and can be selected according to actual application requirements.
  • For example, the image classification layer may be a fully connected (FC) layer, and the image classification is realized by means of the classification capability of the fully connected layer; that is, multiple image features can be processed through the fully connected layer to obtain the classification result.
  • In addition, the output of the fully connected layer may also be connected to a normalization function, so that when the value output by the normalization function is used as the classification result, it has a probabilistic meaning, namely the probability that each image feature belongs to each of the different categories.
  • The aforementioned normalization function may be a normalized exponential function (such as a softmax function), which compresses the value of each dimension of any input k-dimensional vector z into the interval (0, 1) such that the compressed values of the k dimensions sum to 1, so that the value of each dimension of the input vector z has a probabilistic meaning.
  • the output of the fully connected layer is a k-dimensional vector, that is, there are k image features, and each image feature corresponds to a one-dimensional vector.
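  • Written out, the normalized exponential function mentioned above maps the k-dimensional fully connected output z to values that can be read as probabilities:

```latex
% softmax of the k-dimensional fully connected output z
\sigma(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{k} e^{z_j}}, \qquad i = 1, \dots, k,
\qquad \text{so that } \sigma(z)_i \in (0, 1) \text{ and } \sum_{i=1}^{k} \sigma(z)_i = 1.
```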
  • In step S120, the specific method of performing reconstruction processing on the image features is not limited and can also be selected according to actual application requirements.
  • For example, the feature extraction layer configured to process the sample image may be an encoder in the neural network model, and correspondingly, the image reconstruction layer may be a decoder in the neural network model.
  • In this way, the electronic device 10 may perform feature extraction processing on the sample image based on the encoder to obtain the corresponding image features (a feature vector), and then perform reconstruction processing on the image features based on the decoder to obtain the corresponding reconstructed image.
  • As shown in FIG. 3, the first row shows the facial images of 9 persons (i.e. 9 sample images).
  • Feature extraction processing on the 9 sample images yields 9 image features, and reconstruction processing on these 9 image features yields the 9 face images in the second row (i.e. 9 reconstructed images).
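  • Continuing the illustrative sketch above (the batch size of 9 mirrors FIG. 3, while the resolution and feature dimension remain assumptions), a batch of sample face images would pass through the encoder and decoder as follows:

```python
import torch

# RecognitionModel is the illustrative class sketched earlier in this description.
model = RecognitionModel(feat_dim=128, num_classes=9)
sample_images = torch.rand(9, 3, 112, 112)   # stand-in for the 9 face images in FIG. 3
logits, reconstructed = model(sample_images) # 9 classification results + 9 reconstructed images
print(logits.shape)          # torch.Size([9, 9])
print(reconstructed.shape)   # torch.Size([9, 3, 112, 112])
```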
  • In step S130, the specific method for determining the reconstruction loss is not limited and can be selected according to actual application requirements.
  • the reconstructed image may be subjected to feature extraction processing again through the feature extraction layer in the neural network model to obtain new image features. Then, the new image feature is compared with the image feature obtained by performing feature extraction on the sample image, and the corresponding reconstruction loss is obtained.
  • step S130 may include step S131 and step S132, and the specific content is as follows.
  • Step S131 For each reconstructed image, determine the pixel loss between the reconstructed image and the corresponding sample image through the loss determination layer.
  • There may be multiple sample images, and correspondingly, there may also be multiple reconstructed images.
  • the loss determination layer in the neural network model can be used to determine the pixel loss between the reconstructed image and the corresponding sample image.
  • Step S132 Perform a first loss calculation process on the determined multiple pixel losses through the loss determination layer to obtain a reconstruction loss.
  • the first loss calculation process may be performed on the determined plurality of pixel losses through the loss determination layer again to obtain the total reconstruction loss.
  • For example, as shown in FIG. 5, a sample image may include 9 pixels, namely pixel A1, pixel A2, pixel A3, pixel A4, pixel A5, pixel A6, pixel A7, pixel A8, and pixel A9.
  • As shown in FIG. 6, the reconstructed image obtained from the sample image shown in FIG. 5 after feature extraction processing and reconstruction processing also includes 9 corresponding pixels, namely pixel B1, pixel B2, pixel B3, pixel B4, pixel B5, pixel B6, pixel B7, pixel B8, and pixel B9.
  • In this way, the pixel difference (i.e. the pixel loss) between pixel A1 and pixel B1, the pixel difference between pixel A2 and pixel B2, the pixel difference between pixel A3 and pixel B3, and so on up to the pixel difference between pixel A9 and pixel B9, can be calculated separately.
  • the reconstruction loss is determined based on the pixel loss of each pixel, so that when the neural network model is trained based on the reconstruction loss, it can be guaranteed that the constraint information or the supervision information is at the pixel level.
  • the constraints and restrictions on the information on which the neural network model is trained are strengthened. In this way, the image recognition model obtained by training can have a higher feature information extraction capability, thereby achieving high-precision image recognition.
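  • The application does not fix the form of the first loss calculation process; one minimal sketch, assuming a mean squared error over all corresponding pixel pairs (such as A1/B1 through A9/B9 above), is:

```python
import torch

def reconstruction_loss(reconstructed: torch.Tensor, samples: torch.Tensor) -> torch.Tensor:
    """Pixel-level reconstruction loss: per-pixel losses (e.g. between A1 and B1, A2 and B2, ...)
    averaged over all pixels of all image pairs."""
    return ((reconstructed - samples) ** 2).mean()
```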
  • For step S130, it should also be noted that the specific method for determining the classification loss is not limited and can be selected according to actual application requirements.
  • For example, for each sample image, the corresponding prediction result may include the probability values of that sample image being predicted as each of all the sample images.
  • the obtained classification result may include k probability values.
  • For example, the largest of the k probability values may be mapped to a probability of 1, and the other k-1 probability values mapped to a probability of 0; then, based on the k probability values included in the classification result and the k values thus determined (consisting of one 1 and k-1 0s), the classification loss incurred when classifying a sample image is calculated.
  • step S130 may include step S133 and step S134, and the specific content is as follows.
  • Step S133 Obtain multiple preset classification labels through the loss determination layer.
  • multiple preset classification labels may be obtained through the loss determination layer in the neural network model.
  • The classification labels are generated by performing identification processing on the multiple sample images; that is, there is a one-to-one correspondence between classification labels and sample images. For example, sample images with different human faces have different classification labels, so 1 million sample images of different faces correspond to 1 million classification labels.
  • Step S134 Perform a second loss calculation process on the classification result and the classification label through the loss determination layer to obtain a classification loss.
  • the classification result and the classification label may be subjected to a second loss calculation process through the loss determination layer to obtain the classification loss.
  • the classification result may be a k-dimensional column vector.
  • For multiple sample images, the classification results can be expressed as a classification matrix with k rows and k columns (each column indicates the probability values of one sample image being predicted as each of all the sample images).
  • a preset loss function can be used to calculate the classification loss.
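  • As one example of such a preset loss function (an assumption here, since the application leaves the second loss calculation process open), cross-entropy between the classification result and the per-image classification labels could be used:

```python
import torch
import torch.nn.functional as F

def classification_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Classification loss between the classification result (one row of k scores per sample
    image) and the preset classification labels (one label index per sample image)."""
    # cross_entropy applies softmax internally, matching the normalized classification result
    return F.cross_entropy(logits, labels)
```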
  • In step S140, the specific method for updating the neural network model is not limited and can be selected according to actual application requirements.
  • the neural network model may be trained based on the reconstruction loss and the classification loss respectively.
  • Step S140 may include step S141 and step S142; the specific content is as follows.
  • Step S141: Perform summation processing based on the reconstruction loss and the classification loss to obtain a total loss value.
  • In this embodiment of the application, after the reconstruction loss and the classification loss are obtained, they can be summed, that is, the sum of the reconstruction loss and the classification loss can be calculated, so as to obtain the corresponding total loss value.
  • Step S142: Update the neural network model based on the total loss value and a preset back propagation algorithm to obtain an image recognition model.
  • In this embodiment of the application, after the total loss value is obtained, the neural network model can be updated based on the total loss value according to the back propagation algorithm (BP algorithm, a supervised learning algorithm); that is, the parameters of each network layer included in the neural network model (such as the feature extraction layer, the image classification layer, and the image reconstruction layer) are updated to obtain a new neural network model, i.e. the required image recognition model.
  • The specific method used in step S141 to calculate the total loss value is not limited and can be selected according to actual application requirements.
  • the reconstruction loss and the classification loss can be directly summed to obtain the corresponding total loss.
  • a pre-configured weight coefficient can be obtained first, and then the weighted sum of the reconstruction loss and the classification loss is calculated based on the weight coefficient to obtain the corresponding total loss value.
  • the weight coefficient may be a fixed value or a dynamically changing value.
  • For example, the weight coefficients may be adjusted based on the determined reconstruction loss and classification loss: the larger the value of a loss, the larger its corresponding weight coefficient can be set.
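  • A minimal sketch of step S141 with the weighted variant described above (the default coefficients of 1.0 are placeholders; with both set to 1.0 the weighted sum reduces to the direct sum):

```python
import torch

def total_loss(rec_loss: torch.Tensor, cls_loss: torch.Tensor,
               w_rec: float = 1.0, w_cls: float = 1.0) -> torch.Tensor:
    """Total loss value: weighted sum of the reconstruction loss and the classification loss."""
    return w_rec * rec_loss + w_cls * cls_loss
```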
  • The specific method used in step S142 to update the neural network model is not limited and can also be selected according to actual application requirements.
  • For example, the neural network model can be updated once based on the total loss value (that is, the number of iterations of the back propagation algorithm is limited to 1) to ensure the efficiency of updating the neural network model.
  • step S142 may include step S142a and step S142b, and the specific content is as follows.
  • Step S142a: Update the neural network model based on the obtained total loss value and the preset back propagation algorithm to obtain a new neural network model.
  • In this embodiment of the application, after the total loss value is obtained, the neural network model can be updated according to the back propagation algorithm based on the total loss value (that is, one iteration is completed based on the back propagation algorithm), so as to obtain a new neural network model.
  • the new neural network model may be configured to process the sample image again to obtain a new total loss value.
  • Step S142b: Determine whether the new total loss value is less than a preset loss value.
  • In this embodiment of the application, after the new neural network model is obtained in step S142a, feature extraction processing, classification processing, reconstruction processing, and loss determination processing are performed on the sample images based on the new neural network model, and it is then determined whether the resulting new total loss value is less than the preset loss value.
  • If the new total loss value is less than the preset loss value, the current neural network model (that is, the neural network model obtained from the last update) is used as the image recognition model.
  • If the new total loss value is not less than the preset loss value, step S142a is executed again: based on the new total loss value, the current neural network model is updated again according to the back propagation algorithm (that is, a second iteration is completed based on the back propagation algorithm), so as to obtain a new neural network model once more.
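  • Putting steps S142a and S142b together, the update loop can be sketched as follows; PyTorch, SGD, the learning rate, the preset loss value, and the added iteration cap are all assumptions made for illustration, and the loss helpers are the sketches given earlier:

```python
import torch

def train(model, sample_images, labels, preset_loss_value=0.01, lr=0.1, max_iters=10000):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(max_iters):
        # Step S142a: one back-propagation update based on the current total loss value.
        logits, reconstructed = model(sample_images)
        loss = total_loss(reconstruction_loss(reconstructed, sample_images),
                          classification_loss(logits, labels))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Step S142b: process the sample images again with the updated model and
        # compare the new total loss value with the preset loss value.
        with torch.no_grad():
            logits, reconstructed = model(sample_images)
            new_loss = total_loss(reconstruction_loss(reconstructed, sample_images),
                                  classification_loss(logits, labels))
        if new_loss < preset_loss_value:
            break  # the last updated model is used as the image recognition model
    return model
```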
  • an embodiment of the present application also provides an image recognition method applicable to the above-mentioned electronic device 10.
  • the method steps defined in the process related to the image recognition method can be implemented by the electronic device 10, that is, the electronic device can execute the image recognition method provided in the embodiment of the present application.
  • the specific process shown in FIG. 10 will be described in detail below.
  • Step S210 Input the obtained target image into a preset image recognition model.
  • the electronic device 10 may first input the target image into a preset image recognition model.
  • the image recognition model may be obtained by training based on the aforementioned image recognition model training method.
  • Step S220 Perform recognition processing on the target image through the image recognition model to obtain a recognition result.
  • the electronic device 10 may perform recognition processing on the target image through the image recognition model to obtain a corresponding recognition result.
  • For example, when the target image includes a human face, the corresponding person information can be determined from the facial features in the target image, such as determining whether the face belongs to a certain person.
  • Since the image recognition model is trained based on the above-mentioned image recognition model training method, it has a strong feature information extraction capability, so the target image can be recognized with high accuracy and the obtained recognition result is more accurate.
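  • A minimal sketch of steps S210 and S220, assuming the illustrative model above; mapping the returned class index back to a person is application-specific and not shown:

```python
import torch

def recognize(image_recognition_model, target_image: torch.Tensor) -> int:
    """Input the target image into the image recognition model (step S210) and perform
    recognition processing to obtain a recognition result (step S220), here a class index."""
    image_recognition_model.eval()
    with torch.no_grad():
        logits, _ = image_recognition_model(target_image.unsqueeze(0))  # add batch dimension
    return int(logits.argmax(dim=1).item())
```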
  • An embodiment of the present application also provides an image recognition model training apparatus 100, which can be applied to the above-mentioned electronic device 10.
  • the image recognition model training device 100 may include a feature classification module 110, a feature reconstruction module 120, a loss determination module 130, and a model update module 140.
  • The feature classification module 110 may be configured to classify image features through a preset image classification layer to obtain a classification result, where the image classification layer belongs to a preset neural network model, and the image features are obtained by processing a sample image with the feature extraction layer of the neural network model.
  • the feature classification module 110 may be configured to perform step S110 shown in FIG. 2, and for related content of the feature classification module 110, reference may be made to the foregoing description of step S110.
  • the feature reconstruction module 120 may be configured to perform reconstruction processing on the image features through a preset image reconstruction layer to obtain a reconstructed image, where the image reconstruction layer belongs to the neural network model.
  • the feature reconstruction module 120 may be configured to perform step S120 shown in FIG. 2, and the relevant content of the feature reconstruction module 120 can refer to the foregoing description of step S120.
  • the loss determination module 130 may be configured to perform loss determination processing on the reconstructed image through the loss determination layer in the neural network model to obtain a reconstruction loss, and perform loss determination processing on the classification result to obtain a classification loss.
  • the loss determination module 130 may be configured to perform step S130 shown in FIG. 2, and for related content of the loss determination module 130, reference may be made to the foregoing description of step S130.
  • the model update module 140 may be configured to update the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model, wherein the image recognition model is configured to recognize a target image.
  • the model update module 140 may be configured to perform step S140 shown in FIG. 2. For related content of the model update module 140, reference may be made to the foregoing description of step S140.
  • In summary, the above-mentioned image recognition model training device obtains the classification result and the reconstructed image by performing classification processing and reconstruction processing on the image features, so that when the parameters of the neural network model are updated (that is, when the neural network model is trained), the neural network model can be trained based on the reconstruction loss and the classification loss to obtain the image recognition model.
  • In this way, the reconstruction loss and the classification loss are fully considered, so the constraints and restrictions on the information on which the training is based are relatively strong, and the image recognition model obtained by training can extract more image feature information when performing image recognition, thereby improving the accuracy of image recognition, better alleviating the problem of low recognition accuracy in existing image recognition technologies, and having high practical value.
  • the model update module is further configured to: perform summation processing based on the reconstruction loss and the classification loss to obtain a total loss value; and update the neural network model based on the total loss value and a preset back propagation algorithm to obtain an image recognition model.
  • the model update module is further configured to: obtain pre-configured weight coefficients; and calculate a weighted sum of the reconstruction loss and the classification loss based on the weight coefficients, the weighted sum being used as the total loss value.
  • the model update module is further configured to: step a, update the neural network model based on the obtained total loss value and the preset back propagation algorithm to obtain a new neural network model, where the new neural network model is configured to process the sample images again to obtain a new total loss value; and step b, determine whether the new total loss value is less than a preset loss value, use the neural network model obtained from the last update as the image recognition model when the new total loss value is less than the preset loss value, and execute step a again when the new total loss value is not less than the preset loss value.
  • the loss determination module is further configured to: for each reconstructed image, determine the pixel loss between the reconstructed image and the corresponding sample image through the loss determination layer, where there are multiple reconstructed images and multiple sample images; and perform a first loss calculation process on the determined multiple pixel losses through the loss determination layer to obtain the reconstruction loss.
  • the loss determination module is further configured to: obtain multiple preset classification labels through the loss determination layer, where the classification labels are generated by performing identification processing on the multiple sample images; and perform a second loss calculation process on the classification result and the classification labels through the loss determination layer to obtain the classification loss.
  • An embodiment of the present application also provides an image recognition apparatus 200, which can be applied to the above-mentioned electronic device 10.
  • the image recognition device 200 may include an image input module 210 and an image recognition module 220.
  • the image input module 210 may be configured to obtain a target image and input the target image into a preset image recognition model, wherein the image recognition model is trained based on the aforementioned image recognition model training device.
  • the image input module 210 may be configured to perform step S210 shown in FIG. 10, and for related content of the image input module 210, reference may be made to the foregoing description of step S210.
  • the image recognition module 220 may be configured to perform recognition processing on the target image through the image recognition model to obtain a recognition result.
  • the image recognition module 220 may be configured to perform step S220 shown in FIG. 10, and for related content of the image recognition module, reference may be made to the foregoing description of step S220.
  • Since the image recognition device provided by the embodiment of the present application uses the image recognition model obtained by the aforementioned model training device for image recognition, it has high image recognition accuracy.
  • An embodiment of the present application also provides a computer-readable storage medium storing a computer program; when the computer program runs, the steps of the above-mentioned image recognition model training method are executed.
  • An embodiment of the present application also provides a computer-readable storage medium storing a computer program; when the computer program runs, the steps of the above-mentioned image recognition method are executed.
  • In summary, the model training and image recognition methods and apparatuses, the electronic device, and the storage medium provided by this application perform classification processing and reconstruction processing on image features to obtain classification results and reconstructed images, so that the neural network model can be trained based on the reconstruction loss and the classification loss to obtain an image recognition model.
  • In this way, the reconstruction loss and the classification loss are fully considered, so the constraints and restrictions on the information on which the training is based are relatively strong, and the image recognition model obtained by training can extract more image feature information when performing image recognition, thereby improving the accuracy of image recognition and better alleviating the problem of low recognition accuracy in related image recognition technologies. This has high practical value, especially for face recognition: because the feature information of different faces is relatively similar (if too few features are extracted, recognition failures or errors are very likely to occur), being able to extract more distinct feature information makes the recognition result more accurate, and the application effect is remarkable.
  • Each block in the flowchart or block diagram may represent a module, program segment, or part of code, and the module, program segment, or part of code contains one or more executable instructions for realizing the specified logical function.
  • It should also be noted that the functions marked in the blocks may occur in an order different from the order marked in the drawings.
  • Each block in the block diagram and/or flowchart, and each combination of blocks in the block diagram and/or flowchart, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
  • the functional modules in the various embodiments of the present application may be integrated together to form an independent part, or each module may exist alone, or two or more modules may be integrated to form an independent part.
  • If the function is implemented in the form of a software function module and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of this application in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions configured to cause a computer device (which may be a personal computer, an electronic device, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage media include various media that can store program code, such as a USB flash drive, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
  • In this document, the terms "comprising", "including", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to the process, method, article, or device. Without further restrictions, an element defined by the phrase "including a..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
  • In the solutions provided above, the classification result and the reconstructed image are obtained by performing classification processing and reconstruction processing on the image features, so that when the neural network model is updated (that is, when the neural network model is trained), the neural network model can be trained based on the reconstruction loss and the classification loss to obtain the image recognition model.
  • In this way, the reconstruction loss and the classification loss are fully considered, so the constraints and restrictions on the information on which the training is based are relatively strong, and the image recognition model obtained by training can extract more image feature information when performing image recognition, thereby improving the accuracy of image recognition, better alleviating the problem of low recognition accuracy in related image recognition technologies, and having high practical value.

Abstract

Provided are a model training method and apparatus, an image recognition method and apparatus, an electronic device, and a storage medium, which relate to the technical field of image recognition. In the present application, the model training method comprises: firstly, performing classification processing on image features by means of an image classification layer in a neural network model, so as to obtain a classification result, wherein the image features are obtained by means of processing a sample image on the basis of a feature extraction layer of the neural network model; secondly, performing reconstruction processing on the image features by means of an image reconstruction layer in the neural network model, so as to obtain a reconstructed image; then, performing loss determination processing on the reconstructed image by means of a loss determination layer in the neural network model, so as to obtain a reconstruction loss, and then performing loss determination processing on the classification result to obtain a classification loss; and finally, performing update processing on the neural network model on the basis of the reconstruction loss and the classification loss, so as to obtain an image recognition model. By means of the method, the problem of the recognition precision of a trained image recognition model in the related art not being high can be solved.

Description

Model training method and apparatus, image recognition method and apparatus, electronic device, and storage medium
Cross-reference to related applications
This application claims priority to the Chinese patent application No. 2020104869488, titled "Model training and image recognition method and apparatus, electronic device and storage medium", filed with the Chinese Patent Office on June 1, 2020, the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the field of image recognition technology, and in particular to a model training method and apparatus, an image recognition method and apparatus, an electronic device, and a storage medium.
Background
With the continuous development of image recognition technology, its range of applications has become wider and wider, and the accuracy requirements for image recognition results have become higher and higher.
The inventor's research found that, in related technologies, the process of training an image recognition model imposes insufficient constraints and restrictions on the information on which the training is based, so the trained image recognition model extracts relatively little information when performing image recognition, which leads to the problem of low recognition accuracy of the image recognition model.
Summary of the invention
In view of this, the purpose of this application is to provide a model training method and apparatus, an image recognition method and apparatus, an electronic device, and a storage medium, so as to alleviate the problem of low recognition accuracy of image recognition models trained in related technologies.
In order to achieve the foregoing objectives, the embodiments of the present application adopt the following technical solutions:
An image recognition model training method, including:
classifying image features through a preset image classification layer to obtain a classification result, where the image classification layer belongs to a preset neural network model, and the image features are obtained by processing a sample image with the feature extraction layer of the neural network model;
performing reconstruction processing on the image features through a preset image reconstruction layer to obtain a reconstructed image, where the image reconstruction layer belongs to the neural network model;
performing loss determination processing on the reconstructed image through a loss determination layer in the neural network model to obtain a reconstruction loss, and performing loss determination processing on the classification result to obtain a classification loss;
updating the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model, where the image recognition model is configured to recognize a target image.
Optionally, in the above image recognition model training method, the step of updating the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model includes:
performing summation processing based on the reconstruction loss and the classification loss to obtain a total loss value;
updating the neural network model based on the total loss value and a preset back propagation algorithm to obtain an image recognition model.
Optionally, the step of performing summation processing based on the reconstruction loss and the classification loss to obtain a total loss value includes:
obtaining pre-configured weight coefficients;
calculating a weighted sum of the reconstruction loss and the classification loss based on the weight coefficients, and using the weighted sum as the total loss value.
Optionally, in the above image recognition model training method, the step of updating the neural network model based on the total loss value and a preset back propagation algorithm to obtain an image recognition model includes:
a. updating the neural network model based on the obtained total loss value and the preset back propagation algorithm to obtain a new neural network model, where the new neural network model is configured to process the sample images again to obtain a new total loss value;
b. determining whether the new total loss value is less than a preset loss value; when the new total loss value is less than the preset loss value, using the neural network model obtained from the last update as the image recognition model; when the new total loss value is not less than the preset loss value, executing step a again.
Optionally, in the above image recognition model training method, the step of performing loss determination processing on the reconstructed image to obtain a reconstruction loss includes:
for each reconstructed image, determining the pixel loss between the reconstructed image and the corresponding sample image through the loss determination layer, where there are multiple reconstructed images and multiple sample images;
performing a first loss calculation process on the determined multiple pixel losses through the loss determination layer to obtain the reconstruction loss.
Optionally, in the above image recognition model training method, the step of performing loss determination processing on the classification result to obtain a classification loss includes:
obtaining multiple preset classification labels through the loss determination layer, where the classification labels are generated by performing identification processing on the multiple sample images;
performing a second loss calculation process on the classification result and the classification labels through the loss determination layer to obtain the classification loss.
On this basis, an embodiment of the present application also provides an image recognition method, including:
obtaining a target image and inputting the target image into a preset image recognition model, where the image recognition model is trained based on the above image recognition model training method;
performing recognition processing on the target image through the image recognition model to obtain a recognition result.
An embodiment of the present application also provides an image recognition model training apparatus, including:
a feature classification module, configured to classify image features through a preset image classification layer to obtain a classification result, where the image classification layer belongs to a preset neural network model, and the image features are obtained by processing a sample image with the feature extraction layer of the neural network model;
a feature reconstruction module, configured to perform reconstruction processing on the image features through a preset image reconstruction layer to obtain a reconstructed image, where the image reconstruction layer belongs to the neural network model;
a loss determination module, configured to perform loss determination processing on the reconstructed image through a loss determination layer in the neural network model to obtain a reconstruction loss, and perform loss determination processing on the classification result to obtain a classification loss;
a model update module, configured to update the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model, where the image recognition model is configured to recognize a target image.
Optionally, the model update module is further configured to:
perform summation processing based on the reconstruction loss and the classification loss to obtain a total loss value;
update the neural network model based on the total loss value and a preset back propagation algorithm to obtain an image recognition model.
Optionally, the model update module is further configured to:
obtain pre-configured weight coefficients;
calculate a weighted sum of the reconstruction loss and the classification loss based on the weight coefficients, and use the weighted sum as the total loss value.
Optionally, the model update module is further configured to:
a. update the neural network model based on the obtained total loss value and the preset back propagation algorithm to obtain a new neural network model, where the new neural network model is used to process the sample images again to obtain a new total loss value;
b. determine whether the new total loss value is less than a preset loss value; when the new total loss value is less than the preset loss value, use the neural network model obtained from the last update as the image recognition model; when the new total loss value is not less than the preset loss value, execute step a again.
Optionally, the loss determination module is further configured to:
for each reconstructed image, determine the pixel loss between the reconstructed image and the corresponding sample image through the loss determination layer, where there are multiple reconstructed images and multiple sample images;
perform a first loss calculation process on the determined multiple pixel losses through the loss determination layer to obtain the reconstruction loss.
Optionally, the loss determination module is further configured to:
obtain multiple preset classification labels through the loss determination layer, where the classification labels are generated by performing identification processing on the multiple sample images;
perform a second loss calculation process on the classification result and the classification labels through the loss determination layer to obtain the classification loss.
On this basis, an embodiment of the present application also provides an image recognition device, including:
an image input module, configured to obtain a target image and input the target image into a preset image recognition model, where the image recognition model is trained based on the above image recognition model training apparatus;
an image recognition module, configured to perform recognition processing on the target image through the image recognition model to obtain a recognition result.
On this basis, an embodiment of the present application also provides an electronic device, including:
a memory, configured to store a computer program;
a processor connected to the memory, configured to execute the computer program stored in the memory to implement the above image recognition model training method or the above image recognition method.
On this basis, the embodiments of the present application also provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed, the above image recognition model training method or the above image recognition method is implemented.
为使本申请的上述目的、特征和优点能更明显易懂,下文特举较佳实施例,并配合所附附图,作详细说明如下。In order to make the above objectives, features, and advantages of the present application more comprehensible, preferred embodiments and accompanying drawings are described below in detail.
Description of the Drawings
To explain the technical solutions of the present application more clearly, the drawings required for the description are briefly introduced below. It should be understood that the following drawings show only some implementations of the present application and therefore should not be regarded as limiting the scope; for those of ordinary skill in the art, other related drawings can be obtained from these drawings without creative effort.
FIG. 1 is a structural block diagram of an electronic device provided by an embodiment of the present application.
FIG. 2 is a schematic flowchart of an image recognition model training method provided by an embodiment of the present application.
FIG. 3 is a schematic diagram of sample images and the corresponding reconstructed images provided by an embodiment of the present application.
FIG. 4 is a schematic flowchart of the sub-steps included in step S130 in FIG. 2.
FIG. 5 is a schematic diagram of the pixel information of a sample image provided by an embodiment of the present application.
FIG. 6 is a schematic diagram of the pixel information of a reconstructed image provided by an embodiment of the present application.
FIG. 7 is a schematic flowchart of other sub-steps included in step S130 in FIG. 2.
FIG. 8 is a schematic flowchart of the sub-steps included in step S140 in FIG. 2.
FIG. 9 is a schematic flowchart of the sub-steps included in step S142 in FIG. 8.
FIG. 10 is a schematic flowchart of an image recognition method provided by an embodiment of the present application.
FIG. 11 is a block diagram of the functional modules included in the image recognition model training apparatus provided by an embodiment of the present application.
FIG. 12 is a block diagram of the functional modules included in the image recognition apparatus provided by an embodiment of the present application.
Reference numerals: 10 - electronic device; 12 - memory; 14 - processor; 100 - image recognition model training apparatus; 110 - feature classification module; 120 - feature reconstruction module; 130 - loss determination module; 140 - model update module; 200 - image recognition apparatus; 210 - image input module; 220 - image recognition module.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present application. The components of the embodiments of the present application, as generally described and illustrated in the drawings herein, may be arranged and designed in a variety of different configurations.
Therefore, the following detailed description of the embodiments of the present application provided in the accompanying drawings is not intended to limit the scope of the claimed application, but merely represents selected embodiments of the application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application without creative effort shall fall within the protection scope of the present application.
As shown in FIG. 1, an embodiment of the present application provides an electronic device 10. The electronic device 10 may include a memory 12 and a processor 14.
In detail, the memory 12 and the processor 14 are electrically connected, directly or indirectly, to enable data transmission or interaction. For example, they may be electrically connected to each other through one or more communication buses or signal lines. The memory 12 may store at least one software functional module that may exist in the form of software or firmware. The processor 14 may execute an executable computer program stored in the memory 12, such as the aforementioned software functional module, so as to implement the image recognition model training method provided by the embodiments of the present application (described later) to obtain an image recognition model, or to implement the image recognition method provided by the embodiments of the present application (described later) to obtain a recognition result for a target image.
Optionally, the memory 12 may be, but is not limited to, a random access memory (RAM), a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or the like.
The processor 14 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), a system on chip (SoC), or the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
It can be understood that the electronic device 10 may be a computer, a server, or another device with data processing capability.
The structure shown in FIG. 1 is only illustrative. The electronic device 10 may also include more or fewer components than those shown in FIG. 1, or have a configuration different from that shown in FIG. 1. For example, it may further include a communication unit for information interaction with other devices, such as interacting with a database to obtain sample images, or interacting with a terminal device to obtain a target image. The structure of the electronic device 10 is not limited in the embodiments of the present application.
With reference to FIG. 2, an embodiment of the present application further provides an image recognition model training method applicable to the above electronic device 10. The method steps defined in the flow of the image recognition model training method may be implemented by the electronic device 10; that is, the image recognition model training method provided by the embodiments of the present application may be executed by the electronic device. The main steps S110 to S140 shown in FIG. 2 are described in detail below.
Step S110: perform classification processing on image features through a preset image classification layer to obtain a classification result.
In this embodiment, the electronic device 10 may first perform classification processing on the image features through the image classification layer of a preset neural network model to obtain the classification result.
The image features may be obtained by processing sample images based on the feature extraction layer of the neural network model.
Step S120: perform reconstruction processing on the image features through a preset image reconstruction layer to obtain reconstructed images.
In this embodiment, the electronic device 10 may also perform reconstruction processing on the image features through the image reconstruction layer of the neural network model to obtain reconstructed images.
Step S130: perform loss determination on the reconstructed images through the loss determination layer in the neural network model to obtain a reconstruction loss, and perform loss determination on the classification result to obtain a classification loss.
In this embodiment, after the classification result and the reconstructed images are obtained in step S110 and step S120, respectively, the electronic device 10 may, on the one hand, perform loss determination on the classification result through the loss determination layer in the neural network model to obtain the corresponding classification loss; on the other hand, the electronic device 10 may also perform loss determination on the reconstructed images through that loss determination layer to obtain the corresponding reconstruction loss.
Step S140: update the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model.
That is, the above neural network model includes a feature extraction layer, an image classification layer, an image reconstruction layer, and a loss determination layer. The feature extraction layer is connected to the image classification layer and to the image reconstruction layer, and the image classification layer and the image reconstruction layer are each connected to the loss determination layer. In other words, the input of the feature extraction layer is the sample images and its output is the image features; the input of both the image classification layer and the image reconstruction layer is the image features, the output of the image classification layer is the classification result, and the output of the image reconstruction layer is the reconstructed images; the inputs of the loss determination layer are the classification result and the reconstructed images, and its outputs are the reconstruction loss and the classification loss.
In this embodiment, after the reconstruction loss and the classification loss are obtained in step S130, the electronic device 10 may update the neural network model based on the reconstruction loss and the classification loss (i.e., train the neural network model). Specifically, the model parameters of the neural network model are updated, and the finally trained neural network model is the image recognition model, which is configured to recognize a target image.
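By way of a non-limiting illustration only, a minimal sketch of such a network is given below in PyTorch-style Python. It merely illustrates the layer arrangement described above: the class name RecognitionNet, the convolutional encoder, the assumed 3x112x112 input size, and all layer widths are assumptions of this sketch rather than features of the disclosed embodiments.

```python
# Illustrative sketch only; class name, layer sizes, and input resolution are assumptions.
import torch
import torch.nn as nn

class RecognitionNet(nn.Module):
    def __init__(self, feature_dim=128, num_classes=1000):
        super().__init__()
        # Feature extraction layer (encoder): sample image -> image feature
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 28 * 28, feature_dim),
        )
        # Image classification layer: image feature -> classification result
        self.classifier = nn.Linear(feature_dim, num_classes)
        # Image reconstruction layer (decoder): image feature -> reconstructed image
        self.decoder = nn.Sequential(
            nn.Linear(feature_dim, 64 * 28 * 28), nn.ReLU(),
            nn.Unflatten(1, (64, 28, 28)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        feature = self.encoder(x)               # image feature
        logits = self.classifier(feature)       # classification result (before normalization)
        reconstruction = self.decoder(feature)  # reconstructed image
        return logits, reconstruction
```

In this sketch, encoder plays the role of the feature extraction layer, classifier that of the image classification layer, and decoder that of the image reconstruction layer; the loss determination layer corresponds to the loss computations illustrated in the later sketches.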
With the above method, because the training of the neural network model fully takes into account both the reconstruction loss and the classification loss, the constraints and restrictions on the information on which the training is based are strengthened. As a result, when the trained image recognition model performs image recognition, it can extract more image feature information, thereby improving the accuracy of image recognition and alleviating the problem of low recognition accuracy in related image recognition technologies. This is particularly valuable for face recognition: because the feature information of different faces is often quite similar (if only a few features are extracted, recognition failures or errors are very likely to occur), an image recognition model trained in this way can extract more distinct feature information, so that the recognition results are more accurate.
In a first aspect, it should be noted for step S110 that the specific way of classifying the image features is not limited and may be selected according to actual application requirements.
For example, in an alternative example, there are multiple sample images and, correspondingly, multiple image features. The image classification layer may be a fully connected (FC) layer, and image classification is achieved through the classification function of the fully connected layer; that is, the multiple image features may be processed through the fully connected layer to obtain the classification result.
To make it convenient for the electronic device 10 to determine the classification loss based on the classification result, in an alternative example, the output of the fully connected layer may further be connected to a normalization function, so that when the values output by the normalization function are used as the classification result, they can be interpreted as the probabilities that each image feature belongs to the different classes.
Optionally, the normalization function may be a normalized exponential function (such as the softmax function), which compresses the value of every dimension of an arbitrary k-dimensional input vector z into the range (0, 1), with the compressed values of the k dimensions summing to 1, so that the value of each dimension of the input vector z has a probabilistic meaning.
In other words, what the fully connected layer actually outputs is a k-dimensional vector; that is, there are k image features, and each image feature corresponds to a one-dimensional vector.
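As a simple numerical illustration (not part of the embodiments), the following sketch applies the softmax normalization described above to a hypothetical 3-dimensional fully connected output; the specific values are arbitrary.

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([2.0, 0.5, -1.0])  # hypothetical k-dimensional FC output (k = 3)
probs = F.softmax(logits, dim=0)         # each compressed value lies in (0, 1)
print(probs)                             # approximately tensor([0.786, 0.175, 0.039])
print(probs.sum())                       # the k values sum to 1
```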
In a second aspect, it should be noted for step S120 that the specific way of reconstructing the image features is not limited either and may likewise be selected according to actual application requirements.
For example, in an alternative example, the feature extraction layer configured to process the sample images may be an encoder in the neural network model, and correspondingly, the image reconstruction layer may be a decoder in the neural network model.
That is, the electronic device 10 may perform feature extraction on the sample images based on the encoder to obtain the corresponding image features (a type of feature vector), and then reconstruct the image features based on the decoder to obtain the corresponding reconstructed images.
As shown in FIG. 3, the first row shows the facial images of nine persons (i.e., nine sample images). Performing feature extraction on these nine sample images yields nine image features, and reconstructing these nine image features then yields the nine face images in the second row (i.e., nine reconstructed images).
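Reusing the hypothetical RecognitionNet sketch given earlier (an assumption of this document, not of the embodiments), the encoder/decoder behaviour described above could be exercised as follows; the 9-image batch and 112x112 resolution are arbitrary placeholders.

```python
import torch

images = torch.rand(9, 3, 112, 112)    # nine placeholder sample face images
model = RecognitionNet(num_classes=9)  # hypothetical model from the earlier sketch
logits, reconstructed = model(images)
print(logits.shape)         # torch.Size([9, 9])           - one k-dimensional vector per image
print(reconstructed.shape)  # torch.Size([9, 3, 112, 112]) - same shape as the input images
```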
In a third aspect, it should be noted for step S130 that the specific way of determining the reconstruction loss is not limited and may be selected according to actual application requirements.
For example, in an alternative example, feature extraction may first be performed again on the reconstructed image through the feature extraction layer in the neural network model to obtain new image features; the new image features are then compared with the image features previously obtained by performing feature extraction on the sample image, and the corresponding reconstruction loss is obtained from the comparison.
For another example, in another alternative example, so that the image recognition model obtained by training the neural network model based on the reconstruction loss has a higher information extraction capability, step S130 may include step S131 and step S132 with reference to FIG. 4, as described below.
Step S131: for each reconstructed image, determine the pixel loss between the reconstructed image and the corresponding sample image through the loss determination layer.
In this embodiment, there may be multiple sample images and, correspondingly, multiple reconstructed images. Thus, for each reconstructed image, the pixel loss between the reconstructed image and the corresponding sample image may be determined through the loss determination layer in the neural network model.
Step S132: perform a first loss calculation on the multiple determined pixel losses through the loss determination layer to obtain the reconstruction loss.
In this embodiment, after the multiple pixel losses are obtained in step S131, a first loss calculation may be performed on the determined pixel losses through the loss determination layer to obtain the total reconstruction loss.
In detail, in a specific application example, as shown in FIG. 5, a sample image may include nine pixels: pixel A1, pixel A2, pixel A3, pixel A4, pixel A5, pixel A6, pixel A7, pixel A8, and pixel A9. As shown in FIG. 6, the reconstructed image obtained from the sample image of FIG. 5 after feature extraction and reconstruction may likewise include nine corresponding pixels: pixel B1, pixel B2, pixel B3, pixel B4, pixel B5, pixel B6, pixel B7, pixel B8, and pixel B9.
On this basis, the pixel differences (i.e., pixel losses) between pixel A1 and pixel B1, between pixel A2 and pixel B2, between pixel A3 and pixel B3, between pixel A4 and pixel B4, between pixel A5 and pixel B5, between pixel A6 and pixel B6, between pixel A7 and pixel B7, between pixel A8 and pixel B8, and between pixel A9 and pixel B9 may be calculated respectively.
In this way, nine pixel differences are obtained, and the reconstruction loss between the sample image shown in FIG. 5 and the reconstructed image shown in FIG. 6 can then be determined based on these nine pixel differences (pixel losses).
That is, the reconstruction loss is determined based on the pixel loss of every pixel, so that when the neural network model is trained based on the reconstruction loss, the constraint or supervision information is guaranteed to be at the pixel level, which strengthens the constraints and restrictions on the information on which the training of the neural network model is based. In this way, the trained image recognition model can have a higher feature information extraction capability, thereby achieving high-precision image recognition.
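A minimal sketch of such a pixel-level reconstruction loss is shown below. Using the mean of the squared per-pixel differences is an assumption of this sketch, since the embodiments only require that the reconstruction loss be computed from the individual pixel losses.

```python
import torch

def reconstruction_loss(samples: torch.Tensor, reconstructions: torch.Tensor) -> torch.Tensor:
    # one pixel loss per corresponding pixel pair (A1 vs B1, A2 vs B2, ...)
    pixel_losses = (samples - reconstructions) ** 2
    # first loss calculation: aggregate all pixel losses into the reconstruction loss
    return pixel_losses.mean()

samples = torch.rand(9, 3, 112, 112)          # placeholder sample images
reconstructions = torch.rand(9, 3, 112, 112)  # placeholder reconstructed images
loss_rec = reconstruction_loss(samples, reconstructions)
```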
In addition, it should also be noted for step S130 that the specific way of determining the classification loss is not limited and may be selected according to actual application requirements.
For example, in an alternative example, for one sample image, the corresponding prediction result may include the probability values that this sample image is predicted to be each one of all the sample images. If the number of all the sample images is k, then for one sample image, the obtained classification result may include k probability values.
In this way, the probability value with the largest value among the k probability values may be used to determine one probability value of 1, and the other k-1 probability values may be used to determine k-1 probability values of 0. Then, based on the k probability values included in the classification result and the determined k probability values consisting of one 1 and k-1 0s, the classification loss arising from classifying this sample image is calculated.
For another example, in another alternative example, it is taken into account that the probability value with the largest value in the classification result may, because the extracted feature information is insufficient, fail to indicate that the two corresponding sample images are actually the same sample image. Therefore, with reference to FIG. 7, step S130 may include step S133 and step S134, as described below.
Step S133: obtain multiple preset classification labels through the loss determination layer.
In this embodiment, when the classification loss needs to be calculated, multiple preset classification labels may first be obtained through the loss determination layer in the neural network model.
The classification labels are generated by labeling the multiple sample images; that is, there is a one-to-one correspondence between the classification labels and the sample images. For example, sample images of different faces have different classification labels, and one million sample images of different faces correspond to one million classification labels.
Step S134: perform a second loss calculation on the classification result and the classification labels through the loss determination layer to obtain the classification loss.
In this embodiment, after the classification labels are obtained in step S133, a second loss calculation may be performed on the classification result and the classification labels through the loss determination layer to obtain the classification loss.
The classification result may be a k-dimensional column vector. For example, for k image features (k sample images), the classification result may be expressed as a classification vector matrix with k rows and k columns (one column of data indicates the probability values that one sample image is predicted to be each one of all the sample images). In this way, based on the classification vector matrix and the classification labels, the classification loss can be calculated using a preset loss function.
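The following sketch illustrates one possible second loss calculation. The use of cross-entropy and the placeholder values are assumptions of this sketch, since the embodiments only require that "a preset loss function" be applied to the classification result and the classification labels.

```python
import torch
import torch.nn.functional as F

k = 9                                      # number of sample images / classification labels
classification_result = torch.randn(k, k)  # placeholder k x k matrix of class scores
labels = torch.arange(k)                   # one preset classification label per sample image

# second loss calculation: compare the predicted class probabilities with the labels
# (F.cross_entropy applies the softmax normalization internally)
loss_cls = F.cross_entropy(classification_result, labels)
```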
In a fourth aspect, it should be noted for step S140 that the specific way of updating the neural network model is not limited and may be selected according to actual application requirements.
For example, in an alternative example, the neural network model may be trained based on the reconstruction loss and the classification loss separately.
For another example, in another alternative example, in order to improve the efficiency of training the neural network model, step S140 may include step S141 and step S142 with reference to FIG. 8, as described below.
Step S141: perform a summation based on the reconstruction loss and the classification loss to obtain a total loss value.
In this embodiment, after the reconstruction loss and the classification loss are obtained in step S130, the reconstruction loss and the classification loss may be summed, that is, the sum of the reconstruction loss and the classification loss is calculated, to obtain the corresponding total loss value.
Step S142: update the neural network model based on the total loss value and a preset back-propagation algorithm to obtain an image recognition model.
In this embodiment, after the total loss value is obtained in step S141, the neural network model may be updated based on the total loss value according to the back-propagation (BP) algorithm, which is a supervised learning algorithm; that is, the parameters of each network layer included in the neural network model (such as the feature extraction layer, the image classification layer, and the image reconstruction layer) are updated to obtain a new neural network model, i.e., the required image recognition model.
Optionally, the specific way of performing step S141 to calculate the total loss value is not limited either and may be selected according to actual application requirements.
For example, in an alternative example, the reconstruction loss and the classification loss may be summed directly to obtain the corresponding total loss value.
For another example, in another alternative example, pre-configured weight coefficients may first be obtained, and the weighted sum of the reconstruction loss and the classification loss is then calculated based on the weight coefficients to obtain the corresponding total loss value.
The weight coefficients may be fixed values or dynamically changing values. For example, the weight coefficients may be adjusted based on the specific magnitudes of the determined reconstruction loss and classification loss; for instance, when one of the losses has a larger value, the corresponding weight coefficient may be set larger.
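A minimal sketch of the weighted summation is given below; the particular weight values are arbitrary assumptions and could equally be adjusted dynamically as described above.

```python
import torch

loss_rec = torch.tensor(0.8)  # placeholder reconstruction loss
loss_cls = torch.tensor(1.2)  # placeholder classification loss
w_rec, w_cls = 0.5, 1.0       # pre-configured weight coefficients (assumed values)

total_loss = w_rec * loss_rec + w_cls * loss_cls  # total loss value used for the update
```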
Optionally, the specific way of performing step S142 to update the neural network model is not limited either and may likewise be selected according to actual application requirements.
For example, in an alternative example, the neural network model may be updated once based on the total loss value (that is, the number of iterations of the back-propagation algorithm is limited to 1), so as to ensure the efficiency of updating the neural network model.
For another example, in another alternative example, in order to ensure that the obtained image recognition model has higher image recognition accuracy, step S142 may include step S142a and step S142b with reference to FIG. 9, as described below.
Step S142a: update the neural network model based on the obtained total loss value and the preset back-propagation algorithm to obtain a new neural network model.
In this embodiment, after the total loss value is obtained in step S141, the neural network model may be updated according to the back-propagation algorithm based on the total loss value (that is, one iteration is completed based on the back-propagation algorithm), thereby obtaining a new neural network model.
The new neural network model may be configured to process the sample images again to obtain a new total loss value.
Step S142b: determine whether the new total loss value is less than a preset loss value.
In this embodiment, after the new neural network model is obtained in step S142a and feature extraction, classification, reconstruction, and loss determination have been performed on the sample images based on the new neural network model, it is determined whether the resulting new total loss value is less than the preset loss value.
In this way, on the one hand, when the new total loss value is less than the preset loss value, it indicates that the current neural network model already has a high feature information extraction capability and high recognition accuracy; therefore, the current neural network model (i.e., the neural network model obtained by the last update) may be taken as the image recognition model.
On the other hand, when the new total loss value is not less than the preset loss value, it indicates that the current neural network model does not yet have a high feature information extraction capability and its recognition accuracy is not high enough; therefore, step S142a needs to be performed again, so that the current neural network model is updated once more according to the back-propagation algorithm based on the new total loss value (i.e., a second iteration is completed based on the back-propagation algorithm), thereby obtaining a new neural network model again.
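Putting steps S141, S142a, and S142b together, one possible training loop is sketched below. It reuses the hypothetical RecognitionNet and reconstruction_loss definitions from the earlier sketches, and the optimizer, learning rate, and preset loss value are assumptions made only for illustration.

```python
import torch
import torch.nn.functional as F

# assumes the hypothetical RecognitionNet and reconstruction_loss from the earlier sketches
model = RecognitionNet(num_classes=9)
samples = torch.rand(9, 3, 112, 112)
labels = torch.arange(9)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # optimizer choice is an assumption
preset_loss_value = 0.05                                  # preset loss value (assumed)

while True:
    # step S142a: one back-propagation update based on the current total loss value
    logits, reconstructions = model(samples)
    total = F.cross_entropy(logits, labels) + reconstruction_loss(samples, reconstructions)
    optimizer.zero_grad()
    total.backward()
    optimizer.step()
    # process the sample images again with the updated model to obtain a new total loss value
    with torch.no_grad():
        logits, reconstructions = model(samples)
        new_total = F.cross_entropy(logits, labels) + reconstruction_loss(samples, reconstructions)
    # step S142b: take the last updated model as the image recognition model once the
    # new total loss value is below the preset loss value; otherwise iterate again
    if new_total.item() < preset_loss_value:
        break
```

In practice such a loop would normally also bound the number of iterations; the unconditional loop above simply mirrors the iterate-until-below-threshold description given in the text.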
With reference to FIG. 10, an embodiment of the present application further provides an image recognition method applicable to the above electronic device 10. The method steps defined in the flow of the image recognition method may be implemented by the electronic device 10; that is, the image recognition method provided by the embodiments of the present application may be executed by the electronic device. The specific flow shown in FIG. 10 is described in detail below.
Step S210: input the obtained target image into a preset image recognition model.
In this embodiment, after obtaining the target image, the electronic device 10 may first input the target image into a preset image recognition model.
The image recognition model may be trained based on the aforementioned image recognition model training method.
Step S220: perform recognition processing on the target image through the image recognition model to obtain a recognition result.
In this embodiment, after the target image is input into the image recognition model in step S210, the electronic device 10 may perform recognition processing on the target image through the image recognition model to obtain the corresponding recognition result. For example, the corresponding person information may be determined from the facial features in the target image, such as determining whether the face belongs to a certain person.
Since the image recognition model is trained based on the above image recognition model training method, it has a high feature information extraction capability, so that the target image can be recognized with high accuracy and the obtained recognition result can have high accuracy.
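For completeness, a hypothetical inference sketch is shown below, again reusing the RecognitionNet assumption from the earlier sketches; interpreting the most probable class as the recognized identity is an assumption of this sketch rather than a limitation of the method.

```python
import torch
import torch.nn.functional as F

model = RecognitionNet(num_classes=9)  # in practice, the trained image recognition model
model.eval()

target_image = torch.rand(1, 3, 112, 112)  # placeholder target image
with torch.no_grad():
    logits, _ = model(target_image)         # the reconstruction branch is not needed at inference
    probs = F.softmax(logits, dim=1)
    recognized_id = probs.argmax(dim=1).item()  # recognition result: most likely identity index
```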
With reference to FIG. 11, an embodiment of the present application further provides an image recognition model training apparatus 100, which may be applied to the above electronic device 10. The image recognition model training apparatus 100 may include a feature classification module 110, a feature reconstruction module 120, a loss determination module 130, and a model update module 140.
The feature classification module 110 may be configured to classify image features through a preset image classification layer to obtain a classification result, wherein the image classification layer belongs to a preset neural network model and the image features are obtained by processing sample images based on the feature extraction layer of the neural network model. In this embodiment, the feature classification module 110 may be configured to perform step S110 shown in FIG. 2; for the relevant content of the feature classification module 110, reference may be made to the foregoing description of step S110.
The feature reconstruction module 120 may be configured to reconstruct the image features through a preset image reconstruction layer to obtain reconstructed images, wherein the image reconstruction layer belongs to the neural network model. In this embodiment, the feature reconstruction module 120 may be configured to perform step S120 shown in FIG. 2; for the relevant content of the feature reconstruction module 120, reference may be made to the foregoing description of step S120.
The loss determination module 130 may be configured to perform loss determination on the reconstructed images through the loss determination layer in the neural network model to obtain a reconstruction loss, and to perform loss determination on the classification result to obtain a classification loss. In this embodiment, the loss determination module 130 may be configured to perform step S130 shown in FIG. 2; for the relevant content of the loss determination module 130, reference may be made to the foregoing description of step S130.
The model update module 140 may be configured to update the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model, wherein the image recognition model is configured to recognize a target image. In this embodiment, the model update module 140 may be configured to perform step S140 shown in FIG. 2; for the relevant content of the model update module 140, reference may be made to the foregoing description of step S140.
With the image recognition model training apparatus provided by the embodiments of the present application, the classification result and the reconstructed images are obtained by performing classification processing and reconstruction processing on the image features respectively, so that when the parameters of the neural network model are updated (i.e., when the neural network model is trained), the neural network model can be trained based on both the reconstruction loss and the classification loss to obtain the image recognition model. Because the training of the neural network model fully takes both the reconstruction loss and the classification loss into account, the constraints and restrictions on the information on which the training is based are strengthened, so that the trained image recognition model can extract more image feature information when performing image recognition. This improves the accuracy of image recognition, alleviates the problem of low recognition accuracy in existing image recognition technologies, and has high practical value. In particular, when applied to face recognition, because the feature information of different faces is often quite similar (if only a few features are extracted, recognition failures or errors are very likely to occur), more distinct feature information can be extracted, so that the recognition results are more accurate and the application effect is significant.
Optionally, the model update module is further configured to:
perform a summation based on the reconstruction loss and the classification loss to obtain a total loss value;
update the neural network model based on the total loss value and a preset back-propagation algorithm to obtain an image recognition model.
Optionally, the model update module is further configured to:
obtain a pre-configured weight coefficient;
calculate the weighted sum of the reconstruction loss and the classification loss based on the weight coefficient, and take the weighted sum as the total loss value.
Optionally, the model update module is further configured to:
a. update the neural network model based on the obtained total loss value and a preset back-propagation algorithm to obtain a new neural network model, wherein the new neural network model is used to process the sample images again to obtain a new total loss value;
b. determine whether the new total loss value is less than a preset loss value; when the new total loss value is less than the preset loss value, take the neural network model obtained by the last update as the image recognition model; and when the new total loss value is not less than the preset loss value, perform step a again.
Optionally, the loss determination module is further configured to:
for each reconstructed image, determine, through the loss determination layer, the pixel loss between the reconstructed image and the corresponding sample image, wherein there are multiple reconstructed images and multiple sample images;
perform a first loss calculation on the multiple determined pixel losses through the loss determination layer to obtain the reconstruction loss.
Optionally, the loss determination module is further configured to:
obtain multiple preset classification labels through the loss determination layer, wherein the classification labels are generated by labeling the multiple sample images;
perform a second loss calculation on the classification result and the classification labels through the loss determination layer to obtain the classification loss.
With reference to FIG. 12, an embodiment of the present application further provides an image recognition apparatus 200, which may be applied to the above electronic device 10. The image recognition apparatus 200 may include an image input module 210 and an image recognition module 220.
The image input module 210 may be configured to obtain a target image and input the target image into a preset image recognition model, wherein the image recognition model is trained by the aforementioned image recognition model training apparatus. In this embodiment, the image input module 210 may be configured to perform step S210 shown in FIG. 10; for the relevant content of the image input module 210, reference may be made to the foregoing description of step S210.
The image recognition module 220 may be configured to perform recognition processing on the target image through the image recognition model to obtain a recognition result. In this embodiment, the image recognition module 220 may be configured to perform step S220 shown in FIG. 10; for the relevant content of the image recognition module 220, reference may be made to the foregoing description of step S220.
Since the image recognition apparatus provided by the embodiments of the present application performs image recognition using the image recognition model trained by the aforementioned model training apparatus, it achieves high image recognition accuracy.
In the embodiments of the present application, corresponding to the above image recognition model training method, a computer-readable storage medium is further provided. A computer program is stored in the computer-readable storage medium, and when the computer program runs, the steps of the above image recognition model training method are executed.
The steps executed when the aforementioned computer program runs are not described here again one by one; reference may be made to the foregoing explanation of the image recognition model training method.
In addition, in the embodiments of the present application, corresponding to the above image recognition method, a computer-readable storage medium is also provided. A computer program is stored in the computer-readable storage medium, and when the computer program runs, the steps of the above image recognition method are executed.
The steps executed when the aforementioned computer program runs are not described here again one by one; reference may be made to the foregoing explanation of the image recognition method.
In summary, with the model training and image recognition methods and apparatuses, the electronic device, and the storage medium provided by the present application, the classification result and the reconstructed images are obtained by performing classification processing and reconstruction processing on the image features respectively, so that when the parameters of the neural network model are updated (i.e., when the neural network model is trained), the neural network model can be trained based on both the reconstruction loss and the classification loss to obtain the image recognition model. Because the training of the neural network model fully takes both the reconstruction loss and the classification loss into account, the constraints and restrictions on the information on which the training is based are strengthened, so that the trained image recognition model can extract more image feature information when performing image recognition. This improves the accuracy of image recognition, alleviates the problem of low recognition accuracy in related image recognition technologies, and has high practical value. In particular, when applied to face recognition, because the feature information of different faces is often quite similar (if only a few features are extracted, recognition failures or errors are very likely to occur), more distinct feature information can be extracted, so that the recognition results are more accurate and the application effect is significant.
In the several embodiments provided in the embodiments of the present application, it should be understood that the disclosed apparatus and method may also be implemented in other ways. The apparatus and method embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the accompanying drawings show the possible architectures, functions, and operations of the apparatuses, methods, and computer program products according to multiple embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a part of code, and the module, program segment, or part of code contains one or more executable instructions for realizing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two consecutive blocks may actually be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should further be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the various embodiments of the present application may be integrated together to form an independent part, each module may exist separately, or two or more modules may be integrated to form an independent part.
If the functions are implemented in the form of software functional modules and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, an electronic device, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc. It should be noted that, in this document, the terms "include", "comprise", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device including a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
The above descriptions are only preferred embodiments of the present application and are not intended to limit the present application. For those skilled in the art, various modifications and changes may be made to the present application. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present application shall fall within the protection scope of the present application.
Industrial Applicability
In the technical solution proposed in the present application, the classification result and the reconstructed images are obtained by performing classification processing and reconstruction processing on the image features respectively, so that when the parameters of the neural network model are updated (i.e., when the neural network model is trained), the neural network model can be trained based on both the reconstruction loss and the classification loss to obtain the image recognition model. Because the training of the neural network model fully takes both the reconstruction loss and the classification loss into account, the constraints and restrictions on the information on which the training is based are strengthened, so that the trained image recognition model can extract more image feature information when performing image recognition. This improves the accuracy of image recognition, alleviates the problem of low recognition accuracy in related image recognition technologies, and has high practical value.

Claims (16)

  1. 一种图像识别模型训练方法,其特征在于,包括:An image recognition model training method, which is characterized in that it includes:
    通过预设的图像分类层对图像特征进行分类处理,得到分类结果,其中,该图像分类层属于预设的神经网络模型,该图像特征基于该神经网络模型的特征提取层对样本图像进行处理得到;The image features are classified through the preset image classification layer to obtain the classification result. The image classification layer belongs to the preset neural network model, and the image feature is based on the feature extraction layer of the neural network model to process the sample image. ;
    通过预设的图像重构层对所述图像特征进行重构处理,得到重构图像,其中,该图像重构层属于所述神经网络模型;Performing reconstruction processing on the image features through a preset image reconstruction layer to obtain a reconstructed image, where the image reconstruction layer belongs to the neural network model;
    通过所述神经网络模型中的损失确定层对所述重构图像进行损失确定处理得到重构损失、对所述分类结果进行损失确定处理得到分类损失;Performing loss determination processing on the reconstructed image through the loss determination layer in the neural network model to obtain a reconstruction loss, and performing loss determination processing on the classification result to obtain a classification loss;
    基于所述重构损失和所述分类损失对所述神经网络模型进行更新处理,得到图像识别模型,其中,该图像识别模型配置成对目标图像进行识别。The neural network model is updated based on the reconstruction loss and the classification loss to obtain an image recognition model, where the image recognition model is configured to recognize a target image.
  2. The image recognition model training method according to claim 1, wherein the step of updating the neural network model based on the reconstruction loss and the classification loss to obtain the image recognition model comprises:
    performing summation processing on the reconstruction loss and the classification loss to obtain a total loss value; and
    updating the neural network model based on the total loss value and a preset back-propagation algorithm to obtain the image recognition model.
  3. The image recognition model training method according to claim 2, wherein the step of performing summation processing on the reconstruction loss and the classification loss to obtain the total loss value comprises:
    obtaining a pre-configured weight coefficient; and
    calculating a weighted sum of the reconstruction loss and the classification loss based on the weight coefficient, and taking the weighted sum as the total loss value.
  4. The image recognition model training method according to claim 2 or 3, wherein the step of updating the neural network model based on the total loss value and the preset back-propagation algorithm to obtain the image recognition model comprises:
    a. updating the neural network model based on the obtained total loss value and the preset back-propagation algorithm to obtain a new neural network model, wherein the new neural network model is used to process the sample image again to obtain a new total loss value; and
    b. determining whether the new total loss value is less than a preset loss value; when the new total loss value is less than the preset loss value, taking the neural network model obtained from the last update as the image recognition model; and when the new total loss value is not less than the preset loss value, performing step a again.
  5. The image recognition model training method according to any one of claims 1-4, wherein the step of performing loss determination processing on the reconstructed image to obtain the reconstruction loss comprises:
    for each reconstructed image, determining, through the loss determination layer, a pixel loss between the reconstructed image and the corresponding sample image, wherein there are a plurality of reconstructed images and a plurality of sample images; and
    performing, through the loss determination layer, first loss calculation processing on the plurality of determined pixel losses to obtain the reconstruction loss.
  6. The image recognition model training method according to any one of claims 1-5, wherein the step of performing loss determination processing on the classification result to obtain the classification loss comprises:
    obtaining a plurality of preset classification labels through the loss determination layer, wherein the classification labels are generated by performing identification processing on the plurality of sample images; and
    performing, through the loss determination layer, second loss calculation processing on the classification result and the classification labels to obtain the classification loss.
  7. An image recognition method, characterized in that it comprises:
    obtaining a target image and inputting the target image into a preset image recognition model, wherein the image recognition model is trained based on the image recognition model training method according to any one of claims 1-6; and
    performing recognition processing on the target image through the image recognition model to obtain a recognition result.
  8. An image recognition model training apparatus, characterized in that it comprises:
    a feature classification module, configured to classify image features through a preset image classification layer to obtain a classification result, wherein the image classification layer belongs to a preset neural network model, and the image features are obtained by processing a sample image through a feature extraction layer of the neural network model;
    a feature reconstruction module, configured to perform reconstruction processing on the image features through a preset image reconstruction layer to obtain a reconstructed image, wherein the image reconstruction layer belongs to the neural network model;
    a loss determination module, configured to perform loss determination processing on the reconstructed image through a loss determination layer in the neural network model to obtain a reconstruction loss, and to perform loss determination processing on the classification result to obtain a classification loss; and
    a model update module, configured to update the neural network model based on the reconstruction loss and the classification loss to obtain an image recognition model, wherein the image recognition model is configured to recognize a target image.
  9. The image recognition model training apparatus according to claim 8, wherein the model update module is further configured to:
    perform summation processing on the reconstruction loss and the classification loss to obtain a total loss value; and
    update the neural network model based on the total loss value and a preset back-propagation algorithm to obtain the image recognition model.
  10. The image recognition model training apparatus according to claim 9, wherein the model update module is further configured to:
    obtain a pre-configured weight coefficient; and
    calculate a weighted sum of the reconstruction loss and the classification loss based on the weight coefficient, and take the weighted sum as the total loss value.
  11. The image recognition model training apparatus according to claim 9 or 10, wherein the model update module is further configured to:
    a. update the neural network model based on the obtained total loss value and the preset back-propagation algorithm to obtain a new neural network model, wherein the new neural network model is used to process the sample image again to obtain a new total loss value; and
    b. determine whether the new total loss value is less than a preset loss value; when the new total loss value is less than the preset loss value, take the neural network model obtained from the last update as the image recognition model; and when the new total loss value is not less than the preset loss value, perform step a again.
  12. The image recognition model training apparatus according to any one of claims 8-11, wherein the loss determination module is further configured to:
    for each reconstructed image, determine, through the loss determination layer, a pixel loss between the reconstructed image and the corresponding sample image, wherein there are a plurality of reconstructed images and a plurality of sample images; and
    perform, through the loss determination layer, first loss calculation processing on the plurality of determined pixel losses to obtain the reconstruction loss.
  13. The image recognition model training apparatus according to any one of claims 8-12, wherein the loss determination module is further configured to:
    obtain a plurality of preset classification labels through the loss determination layer, wherein the classification labels are generated by performing identification processing on the plurality of sample images; and
    perform, through the loss determination layer, second loss calculation processing on the classification result and the classification labels to obtain the classification loss.
  14. An image recognition apparatus, characterized in that it comprises:
    an image input module, configured to obtain a target image and input the target image into a preset image recognition model, wherein the image recognition model is trained by the image recognition model training apparatus according to claim 8; and
    an image recognition module, configured to perform recognition processing on the target image through the image recognition model to obtain a recognition result.
  15. An electronic device, characterized in that it comprises:
    a memory, configured to store a computer program; and
    a processor connected to the memory, configured to execute the computer program stored in the memory so as to implement the image recognition model training method according to any one of claims 1-6 or the image recognition method according to claim 7.
  16. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed, the image recognition model training method according to any one of claims 1-6 or the image recognition method according to claim 7 is implemented.
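As an illustration of the iterative update described in claims 2-4 above (together with the per-loss computations of claims 5-6), the following hedged Python sketch reuses the model and optimizer from the earlier sketch. The weight coefficient, preset loss value, batch format, and function names are all hypothetical and are not taken from this application.

```python
# Illustrative training loop for the weighted-loss update of claims 2-4
# (assumes `model`, `optimizer`, and a `sample_batches` iterable of
#  (images, labels) pairs already exist; all names are hypothetical).
import torch
import torch.nn.functional as F

WEIGHT_RECONSTRUCTION = 0.5   # pre-configured weight coefficient
PRESET_LOSS_VALUE = 0.05      # preset loss value used as the stop criterion

def total_loss_for_batch(model, images, labels):
    logits, reconstructed = model(images)
    classification_loss = F.cross_entropy(logits, labels)    # vs. classification labels
    reconstruction_loss = F.mse_loss(reconstructed, images)   # pixel loss vs. sample images
    # Weighted sum of the two losses gives the total loss value.
    return classification_loss + WEIGHT_RECONSTRUCTION * reconstruction_loss

def train_until_converged(model, optimizer, sample_batches, max_rounds=100):
    for _ in range(max_rounds):
        # Step a: update the model with back-propagation of the total loss.
        for images, labels in sample_batches:
            loss = total_loss_for_batch(model, images, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        # Step b: re-process the sample images with the updated model to get
        # the new total loss value and compare it against the preset value.
        with torch.no_grad():
            new_total = sum(float(total_loss_for_batch(model, imgs, lbls))
                            for imgs, lbls in sample_batches)
        if new_total < PRESET_LOSS_VALUE:
            break   # the last updated model becomes the image recognition model
    return model
```

The weighted sum lets the weight coefficient balance how strongly the pixel-level reconstruction constraint competes with the classification objective during the update.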
PCT/CN2021/096763 2020-06-01 2021-05-28 Model training method and apparatus, image recognition method and apparatus, electronic device, and storage medium WO2021244425A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010486948.8A CN111639607A (en) 2020-06-01 2020-06-01 Model training method, image recognition method, model training device, image recognition device, electronic equipment and storage medium
CN202010486948.8 2020-06-01

Publications (1)

Publication Number Publication Date
WO2021244425A1 (en)

Family

ID=72329555

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/096763 WO2021244425A1 (en) 2020-06-01 2021-05-28 Model training method and apparatus, image recognition method and apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN111639607A (en)
WO (1) WO2021244425A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111639607A (en) * 2020-06-01 2020-09-08 广州虎牙科技有限公司 Model training method, image recognition method, model training device, image recognition device, electronic equipment and storage medium
CN112668637B (en) * 2020-12-25 2023-05-23 苏州科达科技股份有限公司 Training method, recognition method and device of network model and electronic equipment
CN112651445A (en) * 2020-12-29 2021-04-13 广州中医药大学(广州中医药研究院) Biological information identification method and device based on deep network multi-modal information fusion

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10453444B2 (en) * 2017-07-27 2019-10-22 Microsoft Technology Licensing, Llc Intent and slot detection for digital assistants
CN108304829B (en) * 2018-03-08 2020-03-06 北京旷视科技有限公司 Face recognition method, device and system
CN109508669B (en) * 2018-11-09 2021-07-23 厦门大学 Facial expression recognition method based on generative confrontation network
CN110070030B (en) * 2019-04-18 2021-10-15 北京迈格威科技有限公司 Image recognition and neural network model training method, device and system
CN111160095B (en) * 2019-11-26 2023-04-25 华东师范大学 Unbiased face feature extraction and classification method and system based on depth self-encoder network

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10529318B2 (en) * 2015-07-31 2020-01-07 International Business Machines Corporation Implementing a classification model for recognition processing
CN107180248A (en) * 2017-06-12 2017-09-19 桂林电子科技大学 Strengthen the hyperspectral image classification method of network based on associated losses
CN110490878A (en) * 2019-07-29 2019-11-22 上海商汤智能科技有限公司 Image processing method and device, electronic equipment and storage medium
CN111639607A (en) * 2020-06-01 2020-09-08 广州虎牙科技有限公司 Model training method, image recognition method, model training device, image recognition device, electronic equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ROBERT THOMAS; THOME NICOLAS; CORD MATTHIEU: "HybridNet: Classification and Reconstruction Cooperation for Semi-supervised Learning", vol. 11211, Chap. 10, no. 558, 6 October 2018 (2018-10-06), Berlin, Heidelberg, pages 158-175, XP047488230, ISBN: 3540745491, DOI: 10.1007/978-3-030-01234-2_10 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115587629A (en) * 2022-12-07 2023-01-10 中国科学院上海高等研究院 Covariance expansion coefficient estimation method, model training method and storage medium terminal
CN115587629B (en) * 2022-12-07 2023-04-07 中国科学院上海高等研究院 Covariance expansion coefficient estimation method, model training method and storage medium terminal

Also Published As

Publication number Publication date
CN111639607A (en) 2020-09-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21816687

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21816687

Country of ref document: EP

Kind code of ref document: A1