CN111985554A - Model training method, bracelet identification method and corresponding device - Google Patents

Model training method, bracelet identification method and corresponding device

Info

Publication number
CN111985554A
CN111985554A (application CN202010834303.9A)
Authority
CN
China
Prior art keywords
image
bracelet
loss
classification
current batch
Prior art date
Legal status
Pending
Application number
CN202010834303.9A
Other languages
Chinese (zh)
Inventor
柯政远
李锴莹
Current Assignee
Innovation Qizhi Xi'an Technology Co ltd
Original Assignee
Innovation Qizhi Xi'an Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Innovation Qizhi Xi'an Technology Co ltd filed Critical Innovation Qizhi Xi'an Technology Co ltd
Priority to CN202010834303.9A priority Critical patent/CN111985554A/en
Publication of CN111985554A publication Critical patent/CN111985554A/en
Pending legal-status Critical Current

Classifications

    • G06F18/241 — Pattern recognition; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/214 — Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06T7/194 — Image analysis; segmentation; edge detection involving foreground-background segmentation
    • G06T7/40 — Image analysis; analysis of texture
    • G06T7/90 — Image analysis; determination of colour characteristics

Abstract

The application provides a model training method, a bracelet identification method and corresponding devices. The model training method comprises the following steps: inputting N training samples of a current batch into an image classification model to obtain a feature vector corresponding to each training sample; calculating the loss of the current batch based on the N feature vectors and their corresponding class labels, where the loss of the current batch comprises a classification loss and an auxiliary loss; and updating the network parameters of the image classification model based on the loss of the current batch. In this embodiment, the image classification model is trained in combination with the auxiliary loss, which characterizes the differences between feature vectors with the same class label and the differences between feature vectors with different class labels. The sensitivity of the image classification model to the differences between jade bracelet categories can thus be improved, effectively improving its classification performance.

Description

Model training method, bracelet identification method and corresponding device
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a model training method, a bracelet identification method and a corresponding device.
Background
In recent years, jewelry e-commerce platforms have developed continuously, and the numbers of resident merchants and commodities keep growing. For ornaments such as jade bracelets, merchants upload commodity images, and the platform is responsible for evaluating the bracelets in detail against indexes such as purity, evenness, lightness, seed and color tone, so as to determine their category and price. The number of commodities is large, and manual evaluation efficiency is limited. Meanwhile, the evaluation indexes of jade bracelets often depend on subtle differences for classification, such as the overall color of the jade, the colors within fine surface textures, the distribution of the textures, and the colors of streaks that differ from the overall color.
Disclosure of Invention
An object of the embodiments of the present application is to provide a model training method, a bracelet identification method and corresponding apparatuses, so as to address the above technical problems.
In a first aspect, the present application provides a model training method, including: inputting N training samples of a current batch into an image classification model to obtain a feature vector corresponding to each training sample, where each training sample comprises a target bracelet image and a corresponding class label; calculating the loss of the current batch based on the N feature vectors and their corresponding class labels, where the loss of the current batch comprises a classification loss and an auxiliary loss, the classification loss characterizes the difference between the classification prediction result and the real classification result, and the auxiliary loss characterizes the differences between feature vectors with the same class label and the differences between feature vectors with different class labels; and updating the network parameters of the image classification model based on the loss of the current batch.
In the classification of jade bracelets, the differences in texture, or in the color of speckles that differ from the whole, are small across categories; identifiable features are few and inter-class similarity is high. If only the classification loss is used, the degree to which the image classification model can be optimized is limited. This scheme therefore trains the image classification model in combination with an auxiliary loss, which characterizes the differences between feature vectors with the same class label and the differences between feature vectors with different class labels. The sensitivity of the image classification model to the inter-class differences of jade bracelets can thus be improved, effectively improving its classification performance.
In an optional embodiment, the auxiliary loss is a Triplet loss, and calculating the loss of the current batch based on the N feature vectors and their corresponding class labels includes: converting the N feature vectors into N probability distributions using a softmax classifier, and calculating the classification loss of the current batch based on the N probability distributions and the corresponding N class labels, where each probability distribution represents the classification prediction result of the corresponding training sample; calculating the Triplet loss of the current batch based on the N feature vectors and their corresponding class labels; and performing a weighted summation of the classification loss and the Triplet loss to obtain the loss of the current batch.
In an optional embodiment, the N training samples of the current batch correspond to K classes, and calculating the Triplet loss of the current batch based on the N feature vectors and their corresponding class labels includes: constructing a plurality of triplets [V_A, V_P, V_N] from the feature vectors of the N training samples, where in each triplet V_A is any one of the N feature vectors, V_P is any feature vector, other than V_A, of a training sample in the same class as V_A, and V_N is any feature vector of a training sample in a class different from that of V_A; calculating the Triplet loss of each constructed triplet to obtain a plurality of Triplet losses; and averaging the plurality of Triplet losses to obtain the Triplet loss of the current batch.
During the training phase, the classification loss and the auxiliary loss Triplet_loss_m jointly participate in the optimization of the image classification model. After training and learning, positive samples of the same class move closer to the anchor sample, while negative samples of different classes move away from it, so that categories of similar bracelet images can be accurately distinguished.
In an alternative embodiment, before inputting the current batch of N training samples into the image classification model, the method further comprises: acquiring N original bracelet images; and carrying out background removal processing on each original bracelet image, and intercepting the bracelet target in the processed image to obtain N target bracelet images.
For existing classification algorithms, the background in an image easily causes interference, and a neural network tends to learn background information during training. This scheme therefore proposes first removing the background from the bracelet image and then training the image classification model on the background-removed bracelet images, ensuring that the images input into the image classification model contain only information about the bracelet target.
In an optional implementation manner, the performing a background removal process on each original bracelet image includes: recognizing each original bracelet image by utilizing a semantic segmentation model to obtain N masks; and superposing each original bracelet image and the corresponding mask to obtain N processed images.
After training, the semantic segmentation model can accurately distinguish the bracelet target from the background and generate a corresponding mask; superimposing the mask on the original bracelet image removes the background from it. This scheme can effectively deal with a variety of complex backgrounds.
In an optional implementation manner, the performing a background removal process on each original bracelet image includes: acquiring a preset background image; and subtracting the background image from each original bracelet image to obtain N processed images.
In this scheme, a uniform requirement is imposed on the bracelet images uploaded by merchants, for example requiring that they be shot against a specific background, so the background can be removed quickly by simply subtracting the background image from the original bracelet image. This scheme is simple, fast and computationally cheap.
In a second aspect, the present application provides a bracelet identification method, including: acquiring a bracelet image to be identified; carrying out background removal processing on the bracelet image to be identified, and intercepting a bracelet target in the processed image to obtain a target bracelet image; inputting the target bracelet image into an image classification model obtained by training through the method of the first aspect or any optional embodiment of the first aspect to obtain a corresponding feature vector; and carrying out image classification based on the feature vectors to obtain the category of the bracelet image to be identified.
In an optional implementation manner, the performing the background removal processing on the bracelet image to be identified includes: utilizing a semantic segmentation model to identify the bracelet image to be identified to obtain a corresponding mask; and superposing the bracelet image to be identified and the mask to obtain a processed image.
In a third aspect, the present application provides a model training apparatus, comprising: the first feature extraction module is used for inputting the N training samples of the current batch into the image classification model to obtain a feature vector corresponding to each training sample; each training sample comprises a target bracelet image and a corresponding class label; a loss calculation module, configured to calculate a loss of a current batch based on the N feature vectors and the class labels corresponding to the N feature vectors, where the loss of the current batch includes a classification loss and an auxiliary loss, the classification loss represents a difference between a classification prediction result and a real classification result, and the auxiliary loss represents a difference between feature vectors having the same class label and a difference between feature vectors having different class labels; and the model updating module is used for updating the network parameters of the image classification model based on the loss of the current batch.
In a fourth aspect, the present application provides a bracelet identification device comprising: the image acquisition module is used for acquiring an image of the bracelet to be identified; the image preprocessing module is used for carrying out background removal processing on the bracelet image to be identified and intercepting a bracelet target in the processed image to obtain a target bracelet image; a second feature extraction module, configured to input the target bracelet image into an image classification model trained by the method according to the first aspect or any optional implementation manner of the first aspect, so as to obtain a corresponding feature vector; and the bracelet identification module is used for carrying out image classification based on the feature vector to obtain the category of the bracelet image to be identified.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and that those skilled in the art can also obtain other related drawings based on the drawings without inventive efforts.
FIG. 1 is a flow chart of a model training method provided in an embodiment of the present application;
FIG. 2 is a flow chart of one embodiment of step 120;
FIG. 3 is a flowchart of a model training method according to an embodiment of the present disclosure;
fig. 4 is a flowchart of a bracelet identification method provided in an embodiment of the present application;
FIG. 5 is a schematic diagram of a model training apparatus according to an embodiment of the present disclosure;
FIG. 6 is a schematic view of a bracelet identification device provided by an embodiment of the application;
fig. 7 is a schematic view of an electronic device provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
The embodiment of the application provides a model training method which, through a reasonable design of the loss, improves the sensitivity of an image classification model to differences between bracelet categories and effectively improves its classification performance. Fig. 1 shows a flow chart of the model training method; as shown in Fig. 1, the method comprises the following steps:
step 110: inputting N training samples of a current batch into an image classification model to obtain a feature vector corresponding to each training sample; each training sample includes an image of a target bracelet and a corresponding class label.
Before training on a current batch, the N training samples of the current batch are obtained, where each training sample comprises a target bracelet image and a corresponding class label. The N training samples are divided into K classes according to the class labels, and each class contains a plurality of training samples. In one embodiment, the number of training samples in the current batch is P × K, that is, every class contains the same number, P, of training samples; however, the classes may also contain different numbers of training samples, for example class 1 includes P1 training samples and class 2 includes P2 training samples.
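Purely as an illustration of this P × K batch composition — not part of the patent disclosure — a sampler could look like the following Python sketch, where the `samples_by_class` mapping and the function name are hypothetical:

```python
import random

def sample_pk_batch(samples_by_class, K, P):
    """Draw P training samples from each of K randomly chosen classes,
    giving a batch of P * K samples as described above.

    samples_by_class: hypothetical dict mapping class label -> list of samples.
    """
    chosen = random.sample(list(samples_by_class), K)
    batch = []
    for label in chosen:
        for sample in random.sample(samples_by_class[label], P):
            batch.append((sample, label))
    return batch
```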
After N training samples of the current batch are obtained, the N training samples are input into an image classification model, the features of each target bracelet image are extracted, and N feature vectors are obtained. It is understood that the image classification model employs a neural network for feature extraction of the target bracelet image.
Step 120: and calculating the loss of the current batch based on the N feature vectors and the corresponding class labels, wherein the loss of the current batch comprises classification loss and auxiliary loss.
Wherein the classification loss characterizes the difference between the classification prediction result and the real classification result. For example, the loss function of the classification loss may be any one of a cross-entropy loss function, a 0-1 loss function, an absolute-value loss function, a square loss function, and the like. The auxiliary loss characterizes the differences between feature vectors with the same class label and the differences between feature vectors with different class labels. In a specific embodiment, the auxiliary loss may be a Contrastive loss, a Triplet loss, or the like.
Step 130: network parameters of the image classification model are updated based on the loss of the current batch.
After the loss value of the current batch is obtained, the network parameters of the neural network in the image classification model are updated based on that loss value using the back-propagation algorithm.
As shown in Fig. 2, a specific implementation of step 120 includes the following steps:
step 121: and converting the N feature vectors into N probability distributions by using a softmax classifier, and calculating the classification loss of the current batch based on the N probability distributions and the corresponding N class labels.
The N feature vectors output by the image classification model are input into a next-layer softmax classifier, which converts them into N probability distributions, each representing the classification prediction result of the corresponding training sample. Each probability distribution comprises a plurality of probability values, each representing the probability that the training sample belongs to the corresponding category. The classification loss of the current batch can be calculated based on the N probability distributions and the corresponding N class labels, using an existing loss function such as those listed above.
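As a minimal sketch of step 121 only — assuming PyTorch, which the patent does not specify — the softmax conversion and a cross-entropy classification loss might be written as:

```python
import torch.nn.functional as F

def classification_loss(logits, labels):
    """Turn N classifier outputs into N probability distributions with
    softmax and compute the classification loss of the current batch.

    logits: (N, num_classes) tensor from the layer after the feature vectors.
    labels: (N,) tensor of integer class labels.
    """
    probs = F.softmax(logits, dim=1)  # each row is one probability distribution
    # F.cross_entropy applies log-softmax internally, so it takes raw logits.
    loss = F.cross_entropy(logits, labels)
    return loss, probs
```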
Step 122: and calculating the triple loss of the current batch based on the N eigenvectors and the corresponding class labels.
In addition to the classification loss, the auxiliary loss of the current batch needs to be calculated. This embodiment is described taking the auxiliary loss to be the Triplet loss as an example. The process of calculating the Triplet loss from the N feature vectors output by the image classification model is as follows.
For ease of understanding, assume the N training samples of the current batch fall into K classes, each containing P training samples. First, a plurality of triplets [V_A, V_P, V_N] are constructed from the feature vectors of the N training samples, where in each triplet V_A is any one of the N feature vectors, V_P is any feature vector, other than V_A, of a training sample in the same class as V_A, and V_N is any feature vector of a training sample in a class different from that of V_A. Specifically, when constructing each triplet, first take the feature vector V_A of any training sample of the i-th class, then take the feature vector V_P of any remaining training sample in that class, and then take the feature vector V_N of any training sample of a different class, forming the triplet [V_A, V_P, V_N]. The value range of i is [1, K].
The above operation is repeated to construct multiple triplets. When constructing them, different V_A, V_P, V_N can be combined arbitrarily in the manner above until all possible combinations are exhausted. It can be seen that V_A has P × K possible choices; once V_A is determined, V_P has (P − 1) choices and V_N has (P × K − P) choices, so at most P × K × (P − 1) × (P × K − P) triplets can be constructed in total. It should be noted that P × K × (P − 1) × (P × K − P) is only the maximum number of triplets when each of the K classes contains P training samples; this embodiment does not detail the number of triplets when the K classes contain different numbers of training samples.
It will be appreciated that the number of triplets constructed in the previous step may be exactly P × K × (P − 1) × (P × K − P), or not all combinations of the feature vectors need be taken, i.e. the number of constructed triplets may be less than P × K × (P − 1) × (P × K − P).
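The exhaustive construction described above can be sketched as follows (plain Python; the function name is illustrative, and a real implementation might sample only a subset of combinations, as the text notes):

```python
import itertools

def enumerate_triplets(labels):
    """Enumerate all (anchor, positive, negative) index triplets.

    labels: list of N class labels for the batch. With K classes of P
    samples each, this yields at most P*K * (P-1) * (P*K - P) triplets.
    """
    n = len(labels)
    triplets = []
    for a, p in itertools.permutations(range(n), 2):
        if labels[a] != labels[p]:
            continue  # V_P must come from the same class as V_A
        for neg in range(n):
            if labels[neg] != labels[a]:  # V_N comes from a different class
                triplets.append((a, p, neg))
    return triplets
```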
Once the plurality of triplets [V_A, V_P, V_N] has been constructed from the feature vectors of the N training samples, the Triplet loss of each constructed triplet is calculated, giving a plurality of Triplet losses. The formula for calculating the Triplet loss is:

Triplet_loss_A = max(d(V_A, V_P) − d(V_A, V_N) + margin, 0)

where max denotes taking the maximum value, d denotes the Euclidean distance, d(V_A, V_P) is the Euclidean distance between the feature vectors V_A and V_P, d(V_A, V_N) is the Euclidean distance between V_A and V_N, and margin is a minimum separation between the two distances, which can be set from an empirical value. Triplet_loss_A denotes the Triplet loss calculated for one triplet.
After the Triplet loss of each triplet has been calculated, the plurality of Triplet losses are averaged to obtain the Triplet loss of the current batch.
Following the foregoing example, if the number of constructed triplets is P × K × (P − 1) × (P × K − P), then P × K × (P − 1) × (P × K − P) values of Triplet_loss_A are calculated and averaged to obtain the Triplet loss of the current batch used to optimize the image classification model, denoted Triplet_loss_m.
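Under the same assumptions (PyTorch, Euclidean distance d, and the margin as an empirical hyperparameter), the per-triplet loss and the batch average Triplet_loss_m could be sketched as:

```python
import torch

def batch_triplet_loss(features, triplets, margin=0.3):
    """Average Triplet_loss_A = max(d(V_A, V_P) - d(V_A, V_N) + margin, 0)
    over all constructed triplets; margin=0.3 is an assumed empirical value.

    features: (N, D) tensor of feature vectors.
    triplets: list of (anchor, positive, negative) index tuples.
    """
    if not triplets:
        return torch.tensor(0.0)
    losses = []
    for a, p, n in triplets:
        d_ap = torch.dist(features[a], features[p])  # Euclidean d(V_A, V_P)
        d_an = torch.dist(features[a], features[n])  # Euclidean d(V_A, V_N)
        losses.append(torch.clamp(d_ap - d_an + margin, min=0.0))
    return torch.stack(losses).mean()  # Triplet_loss_m of the current batch
```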
After the Triplet loss of the current batch has been calculated in step 122, the method continues to step 123: performing a weighted summation of the classification loss and the Triplet loss to obtain the loss of the current batch.
The loss function loss_cls of the image classification model during training is:

loss_cls = loss + γ · Triplet_loss_m

where loss is the classification loss of the current batch, Triplet_loss_m is the auxiliary loss of the current batch, and γ is a weight coefficient that can be set from an empirical value.
In the present embodiment, the loss function loss_cls is used for training iterations of the image classification model; the training objective is to reduce loss_cls, and the smaller it falls during the training iterations, the better. After training and learning, the auxiliary loss Triplet_loss_m brings positive samples of the same class closer to the anchor sample and pushes negative samples of different classes away from it. Here, the training sample corresponding to V_A is the anchor sample, the training sample corresponding to V_P is a positive sample, and the training sample corresponding to V_N is a negative sample.
The value of margin in the auxiliary loss Triplet_loss_m should be chosen appropriately. If margin is set too small, Triplet_loss_m tends to approach 0 easily, and the trained image classification model cannot distinguish similar bracelet images well; if margin is set too large, Triplet_loss_m is likely to remain at a large value and is difficult to drive toward 0.
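Putting the pieces together, one training iteration with loss_cls = loss + γ·Triplet_loss_m might look like the following sketch, reusing the helpers above; the model/classifier split and the value of γ are assumptions, not specified by the patent:

```python
import torch
import torch.nn.functional as F

def training_step(model, classifier, optimizer, images, labels,
                  margin=0.3, gamma=1.0):
    """One iteration: compute loss_cls = loss + gamma * Triplet_loss_m and
    update the network parameters by back-propagation (step 130)."""
    features = model(images)                    # N feature vectors (step 110)
    logits = classifier(features)               # softmax classifier head
    cls_loss = F.cross_entropy(logits, labels)  # classification loss
    triplets = enumerate_triplets(labels.tolist())
    tri_loss = batch_triplet_loss(features, triplets, margin)
    loss_cls = cls_loss + gamma * tri_loss      # weighted summation (step 123)
    optimizer.zero_grad()
    loss_cls.backward()
    optimizer.step()
    return loss_cls.item()
```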
In this embodiment, the image classification model is trained in combination with the Triplet loss, which can effectively improve the model's sensitivity to differences between jade bracelet categories, improve its classification performance, and allow bracelet images of similar categories to be classified accurately.
In practical application scenarios, the backgrounds of the jade bracelet images uploaded by merchants are complex and varied, and the bracelets are placed at different angles; for existing classification algorithms, the background in an image very easily causes interference, and a neural network tends to learn background information during training.
It is understood that step 110 is preceded by the step of obtaining N training samples of the current batch, wherein each training sample includes a target bracelet image, and the target bracelet image may be an image subjected to background removal processing.
Specifically, as shown in fig. 3, before step 110, the training method further includes the following steps:
step 210: and acquiring N original bracelet images.
Step 220: and carrying out background removal processing on each original bracelet image.
Step 230: and intercepting the bracelet target in the processed image to obtain N target bracelet images.
The background removal process in step 220 includes, but is not limited to, the following two implementation manners:
the first method is as follows: after N original bracelet images are obtained, each original bracelet image is identified by using a trained semantic segmentation model, and N masks are obtained, wherein each mask value in each mask represents information of a corresponding pixel point on the original bracelet image, for example, the mask value is 0, the corresponding pixel point on the original bracelet image is a background, the mask value is 1, and the corresponding pixel point on the original bracelet image is a bracelet target. And superposing each original bracelet image with a corresponding mask, removing the background in the original bracelet image after superposition to obtain N processed images, wherein the size of the processed images is the same as that of the original bracelet image, but the background area is removed, and only the bracelet target is reserved.
Before the semantic segmentation model is applied, a number of original bracelet images and their corresponding manually labeled masks are input into the semantic segmentation model for training, yielding the trained semantic segmentation model. The first method can effectively deal with a variety of complex backgrounds.
The second method: a uniform requirement is imposed on the bracelet images uploaded by merchants, for example requiring that they be shot against a specific background, such as placing the jade bracelet on a pure white desktop and photographing it to obtain a bracelet image with a pure white background. Of course, the specific background may be a solid color or some other fixed background.
In the training stage, a preset background image and N original bracelet images shot against that background are obtained, and a subtraction operation is performed between each original bracelet image and the background image to remove the background from the original bracelet image, yielding N processed images. Specifically, the pixel values of corresponding pixels in each original bracelet image and the background image are subtracted. It can be understood that the pixel value of each background pixel in the original bracelet image should be the same as, or only slightly different from, that of the corresponding pixel in the background image; therefore, after subtraction, the pixel values of the background area should be 0 or close to 0. After the subtraction, all pixel values that fall within a first preset range are set to 0, and the background of the original bracelet image is thereby removed.
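As an illustrative sketch of this subtraction-and-threshold step (the threshold standing in for the "first preset range", whose bounds the text leaves unspecified):

```python
import numpy as np

def remove_background_by_subtraction(image, background, threshold=15):
    """Subtract the preset background image and zero out near-zero residues.

    Pixels whose per-channel absolute differences all fall within the
    assumed first preset range [0, threshold] are treated as background.
    """
    diff = np.abs(image.astype(np.int16) - background.astype(np.int16))
    out = image.copy()
    out[(diff <= threshold).all(axis=2)] = 0  # background pixels -> 0
    return out
```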
After the N background-removed processed images are obtained, each processed image is further processed: the rectangular area that tightly surrounds the bracelet target is cropped out, yielding the N target bracelet images used to train the image classification model.
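The tight rectangular crop can likewise be sketched as follows (assuming background pixels are exactly 0 after the previous step):

```python
import numpy as np

def crop_bracelet_target(processed_image):
    """Crop the rectangular area that tightly surrounds the bracelet target."""
    foreground = processed_image.sum(axis=2) > 0   # non-background pixels
    ys, xs = np.nonzero(foreground)
    if ys.size == 0:
        return processed_image                     # no target found
    return processed_image[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
```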
In the training stage, class labels are assigned to the N background-removed target bracelet images according to the bracelet category indexes, yielding N training samples, which are then fed into the image classification model for training. To strengthen the model's ability to recognize differences between categories, the feature vectors produced by the image classification model are input into the next-layer softmax classifier for classification so that the classification loss can be calculated, and the corresponding Triplet loss is also calculated and used as an auxiliary loss function; during the training stage the image classification model is jointly optimized by the classification loss and the Triplet loss.
The embodiment of the application further provides a bracelet identification method, which classifies the bracelet category in a bracelet image using the image classification model trained by the training method of the previous embodiment. The identification method may be executed by an electronic device and, as shown in Fig. 4, includes the following steps:
step 310: and acquiring an image of the bracelet to be identified.
Step 320: and carrying out background removal processing on the bracelet image to be identified, and intercepting a bracelet target in the processed image to obtain a target bracelet image.
Step 330: and inputting the target bracelet image into an image classification model to obtain a corresponding feature vector.
Step 340: and carrying out image classification based on the feature vector to obtain the category of the bracelet image to be identified.
In step 330, the image classification model is the model trained by the training method provided in the previous embodiment; since the classification loss and the auxiliary loss jointly participated in its optimization, bracelet images of similar categories are better distinguished.
In the step 320, the process of performing background removal processing on the bracelet image to be recognized includes, but is not limited to, the following two implementation manners:
the first method is as follows: identifying the bracelet image to be identified by using the trained semantic segmentation model to obtain a corresponding mask; and superposing the bracelet image to be identified and the corresponding mask to obtain a processed image. Since the semantic segmentation model has been trained in the previous training phase, the trained semantic segmentation model can be directly obtained here and used in step 320.
The second method: a uniform requirement is imposed on the bracelet images uploaded by merchants, for example requiring that they be shot against a specific background, such as placing the jade bracelet on a pure white desktop and photographing it to obtain a bracelet image with a pure white background. Of course, the specific background may be a solid color or some other fixed background. Accordingly, the bracelet image to be identified in step 310 is shot against the background in the preset background image.
In the prediction stage, the preset background image is obtained, and a subtraction operation is performed between the bracelet image to be identified and the background image to remove the background from the bracelet image to be identified, yielding a processed image. Specifically, the pixel values of corresponding pixels in the bracelet image to be identified and the background image are subtracted, and after the subtraction all pixel values that fall within the first preset range are set to 0, thereby removing the background from the bracelet image to be identified.
After the processed image is obtained, the processed image is further processed, and a rectangular area which closely surrounds the bracelet target is intercepted, so that a target bracelet image which is input into an image classification model is obtained.
In steps 330 and 340, the target bracelet image is input into the trained image classification model to obtain the corresponding feature vector, and image classification is then performed based on that feature vector to obtain the category of the bracelet image to be identified.
In summary, the backgrounds of the jade bracelet images uploaded by merchants are complex and varied, the bracelets are placed at different angles, and the backgrounds very easily interfere with existing classification algorithms. The training method and the identification method of this embodiment therefore first use a semantic segmentation model or other means to remove the background and crop out the bracelet target, ensuring that the images input into the image classification model contain only target features. Meanwhile, in the classification of jade bracelets, the differences in texture, or in the color of speckles that differ from the whole, are small across categories; identifiable features are few and inter-class similarity is high. If only an ordinary classification loss is used, the degree to which the image classification model can be optimized is limited. This embodiment trains the image classification model in combination with an auxiliary loss, which characterizes the differences between feature vectors with the same class label and the differences between feature vectors with different class labels, so the sensitivity of the image classification model to the inter-class differences of jade bracelets can be improved and its classification performance effectively improved.
Based on the same inventive concept, an embodiment of the present application provides a model training apparatus, as shown in fig. 5, the apparatus includes: a first feature extraction module 410, a loss calculation module 420, and a model update module 430.
The first feature extraction module 410 is configured to input N training samples of a current batch to the image classification model, so as to obtain a feature vector corresponding to each training sample; each training sample includes an image of a target bracelet and a corresponding class label.
The loss calculating module 420 is configured to calculate a loss of the current batch based on the N feature vectors and the class labels corresponding to the N feature vectors, where the loss of the current batch includes a classification loss and an auxiliary loss, the classification loss represents a difference between the classification prediction result and the real classification result, and the auxiliary loss represents a difference between feature vectors having the same class label and a difference between feature vectors having different class labels.
Wherein the model updating module 430 is configured to update the network parameters of the image classification model based on the loss of the current lot.
The model training apparatus provided in this embodiment has the same implementation principle and technical effects as the model training method in the foregoing method embodiment; for brevity, anything not mentioned in this apparatus embodiment may be found in the corresponding contents of the method embodiment.
Further, an embodiment of the present application provides a bracelet identification device, as shown in fig. 6, the device includes: an image acquisition module 510, an image preprocessing module 520, a second feature extraction module 530 and a bracelet identification module 540.
The image obtaining module 510 is configured to obtain an image of the bracelet to be identified.
The image preprocessing module 520 is configured to perform background removal processing on the bracelet image to be identified, and intercept a bracelet target in the processed image to obtain a target bracelet image.
The second feature extraction module 530 is configured to input the target bracelet image into an image classification model, so as to obtain a corresponding feature vector. The image classification model is a model obtained by training by using the model training method provided by the foregoing embodiment.
The bracelet identification module 540 is configured to perform image classification based on the feature vectors, and obtain a category of the bracelet image to be identified.
The implementation principle and technical effects of the bracelet identification device provided in this embodiment are consistent with those of the bracelet identification method in the foregoing method embodiment; for brevity, anything not mentioned in this device embodiment may be found in the corresponding contents of the method embodiment.
Referring to fig. 7, fig. 7 is a schematic structural diagram of an electronic device 600 according to an embodiment of the present disclosure. The application provides an electronic device 600 comprising a processor 601 and a memory 602, interconnected and communicating with each other via a communication bus 603 and/or another form of connection mechanism (not shown). The memory 602 stores a computer program executable by the processor 601; when the electronic device is running, the processor 601 executes the computer program to perform the model training method or the bracelet identification method provided by the above embodiments.
The embodiment of the application provides a storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the computer program executes the model training method or the bracelet identification method provided by the embodiment. The storage medium may be implemented by any type of volatile or nonvolatile storage device or combination thereof, such as a Static Random Access Memory (SRAM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), an Erasable Programmable Read-Only Memory (EPROM), a Programmable Read-Only Memory (PROM), a Read-Only Memory (ROM), a magnetic Memory, a flash Memory, a magnetic disk, or an optical disk.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the unit is only a logical division, and other divisions may be realized in practice. Furthermore, the functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A method of model training, comprising:
inputting N training samples of a current batch into an image classification model to obtain a feature vector corresponding to each training sample; each training sample comprises a target bracelet image and a corresponding class label;
calculating the loss of the current batch based on the N feature vectors and the corresponding class labels thereof, wherein the loss of the current batch comprises classification loss and auxiliary loss, the classification loss represents the difference between the classification prediction result and the real classification result, and the auxiliary loss represents the difference between the feature vectors with the same class label and the difference between the feature vectors with different class labels;
updating network parameters of the image classification model based on the loss of the current lot.
2. The method of claim 1, wherein the auxiliary loss is a Triplet loss, and wherein calculating the loss of the current batch based on the N feature vectors and their corresponding class labels comprises:
converting the N feature vectors into N probability distributions by using a softmax classifier, and calculating the classification loss of the current batch based on the N probability distributions and the corresponding N class labels; each probability distribution represents a classification prediction result of a corresponding training sample;
calculating the Triplet loss of the current batch based on the N feature vectors and their corresponding class labels;
and carrying out weighted summation on the classification loss and the Triplet loss to obtain the loss of the current batch.
3. The method of claim 2, wherein the N training samples of the current batch correspond to K classes, and wherein the calculating the Triplet loss of the current batch based on the N feature vectors and their corresponding class labels comprises:
constructing a plurality of triplets [V_A, V_P, V_N] from the feature vectors of the N training samples, wherein in each of the triplets V_A is any one of the N feature vectors, V_P is any feature vector, other than V_A, of a training sample in the same class as V_A, and V_N is any feature vector of a training sample in a class different from that of V_A;
calculating the Triplet loss of each constructed Triplet to obtain a plurality of Triplet losses;
and averaging the plurality of Triplet losses to obtain the Triplet loss of the current batch.
4. The method of any of claims 1-3, wherein prior to inputting the current batch of N training samples into the image classification model, the method further comprises:
acquiring N original bracelet images;
and carrying out background removal processing on each original bracelet image, and intercepting the bracelet target in the processed image to obtain N target bracelet images.
5. The method according to claim 4, wherein the performing of the background removal process on each original bracelet image comprises:
recognizing each original bracelet image by utilizing a semantic segmentation model to obtain N masks;
and superposing each original bracelet image and the corresponding mask to obtain N processed images.
6. The method according to claim 4, wherein the performing of the background removal process on each original bracelet image comprises:
acquiring a preset background image;
and subtracting the background image from each original bracelet image to obtain N processed images.
7. A bracelet identification method, comprising:
acquiring a bracelet image to be identified;
carrying out background removal processing on the bracelet image to be identified, and intercepting a bracelet target in the processed image to obtain a target bracelet image;
inputting the target bracelet image into an image classification model obtained by training according to the method of any one of claims 1-6 to obtain a corresponding feature vector;
and carrying out image classification based on the feature vectors to obtain the category of the bracelet image to be identified.
8. The method according to claim 7, wherein the background removing of the bracelet image to be identified comprises:
utilizing a semantic segmentation model to identify the bracelet image to be identified to obtain a corresponding mask;
and superposing the bracelet image to be identified and the mask to obtain a processed image.
9. A model training apparatus, comprising:
the first feature extraction module is used for inputting the N training samples of the current batch into the image classification model to obtain a feature vector corresponding to each training sample; each training sample comprises a target bracelet image and a corresponding class label;
a loss calculation module, configured to calculate a loss of a current batch based on the N feature vectors and the class labels corresponding to the N feature vectors, where the loss of the current batch includes a classification loss and an auxiliary loss, the classification loss represents a difference between a classification prediction result and a real classification result, and the auxiliary loss represents a difference between feature vectors having the same class label and a difference between feature vectors having different class labels;
and the model updating module is used for updating the network parameters of the image classification model based on the loss of the current batch.
10. A bracelet identification device, comprising:
the image acquisition module is used for acquiring an image of the bracelet to be identified;
the image preprocessing module is used for carrying out background removal processing on the bracelet image to be identified and intercepting a bracelet target in the processed image to obtain a target bracelet image;
a second feature extraction module, configured to input the target bracelet image into an image classification model trained by the method according to any one of claims 1 to 6, to obtain a corresponding feature vector;
and the bracelet identification module is used for carrying out image classification based on the feature vector to obtain the category of the bracelet image to be identified.
CN202010834303.9A 2020-08-18 2020-08-18 Model training method, bracelet identification method and corresponding device Pending CN111985554A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010834303.9A CN111985554A (en) 2020-08-18 2020-08-18 Model training method, bracelet identification method and corresponding device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010834303.9A CN111985554A (en) 2020-08-18 2020-08-18 Model training method, bracelet identification method and corresponding device

Publications (1)

Publication Number Publication Date
CN111985554A 2020-11-24

Family ID: 73434757

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010834303.9A Pending CN111985554A (en) 2020-08-18 2020-08-18 Model training method, bracelet identification method and corresponding device

Country Status (1)

Country Link
CN (1) CN111985554A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112465835A (en) * 2020-11-26 2021-03-09 深圳市对庄科技有限公司 Method for jadeite image segmentation and model training method
CN112465026A (en) * 2020-11-26 2021-03-09 深圳市对庄科技有限公司 Model training method and device for jadeite mosaic recognition
CN112580731A (en) * 2020-12-24 2021-03-30 深圳市对庄科技有限公司 Jadeite product identification method, system, terminal, computer equipment and storage medium
CN113673570A (en) * 2021-07-21 2021-11-19 南京旭锐软件科技有限公司 Training method, device and equipment for electronic device picture classification model
CN113887325A (en) * 2021-09-10 2022-01-04 北京三快在线科技有限公司 Model training method, expression recognition method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109784366A (en) * 2018-12-07 2019-05-21 北京飞搜科技有限公司 The fine grit classification method, apparatus and electronic equipment of target object
CN109919208A (en) * 2019-02-25 2019-06-21 中电海康集团有限公司 A kind of appearance images similarity comparison method and system
CN110796057A (en) * 2019-10-22 2020-02-14 上海交通大学 Pedestrian re-identification method and device and computer equipment

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109784366A (en) * 2018-12-07 2019-05-21 北京飞搜科技有限公司 The fine grit classification method, apparatus and electronic equipment of target object
CN109919208A (en) * 2019-02-25 2019-06-21 中电海康集团有限公司 A kind of appearance images similarity comparison method and system
CN110796057A (en) * 2019-10-22 2020-02-14 上海交通大学 Pedestrian re-identification method and device and computer equipment

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112465835A (en) * 2020-11-26 2021-03-09 深圳市对庄科技有限公司 Method for jadeite image segmentation and model training method
CN112465026A (en) * 2020-11-26 2021-03-09 深圳市对庄科技有限公司 Model training method and device for jadeite mosaic recognition
CN112465026B (en) * 2020-11-26 2022-06-24 深圳市对庄科技有限公司 Model training method and device for jadeite mosaic recognition
CN112580731A (en) * 2020-12-24 2021-03-30 深圳市对庄科技有限公司 Jadeite product identification method, system, terminal, computer equipment and storage medium
CN112580731B (en) * 2020-12-24 2022-06-24 深圳市对庄科技有限公司 Jadeite product identification method, system, terminal, computer equipment and storage medium
CN113673570A (en) * 2021-07-21 2021-11-19 南京旭锐软件科技有限公司 Training method, device and equipment for electronic device picture classification model
CN113887325A (en) * 2021-09-10 2022-01-04 北京三快在线科技有限公司 Model training method, expression recognition method and device

Similar Documents

Publication Publication Date Title
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
CN111985554A (en) Model training method, bracelet identification method and corresponding device
CN110188635B (en) Plant disease and insect pest identification method based on attention mechanism and multi-level convolution characteristics
US10846566B2 (en) Method and system for multi-scale cell image segmentation using multiple parallel convolutional neural networks
CN109241817B (en) Crop image recognition method shot by unmanned aerial vehicle
Xu et al. Learning-based shadow recognition and removal from monochromatic natural images
CN108537168B (en) Facial expression recognition method based on transfer learning technology
US8379994B2 (en) Digital image analysis utilizing multiple human labels
CN109684922B (en) Multi-model finished dish identification method based on convolutional neural network
CN109002755B (en) Age estimation model construction method and estimation method based on face image
CN109740679B (en) Target identification method based on convolutional neural network and naive Bayes
CN113486981B (en) RGB image classification method based on multi-scale feature attention fusion network
CN110619059B (en) Building marking method based on transfer learning
CN110188829B (en) Neural network training method, target recognition method and related products
CN114663346A (en) Strip steel surface defect detection method based on improved YOLOv5 network
CN112364791B (en) Pedestrian re-identification method and system based on generation of confrontation network
CN109214298A (en) A kind of Asia women face value Rating Model method based on depth convolutional network
CN111833322B (en) Garbage multi-target detection method based on improved YOLOv3
CN111986125A (en) Method for multi-target task instance segmentation
Özkan et al. Identification of wheat kernels by fusion of RGB, SWIR, and VNIR samples
CN108268890A (en) A kind of hyperspectral image classification method
Bappy et al. Real estate image classification
Marzan et al. Automated tobacco grading using image processing techniques and a convolutional neural network
CN111598854A (en) Complex texture small defect segmentation method based on rich robust convolution characteristic model
CN116910571B (en) Open-domain adaptation method and system based on prototype comparison learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination