CN113780243B - Training method, device, equipment and storage medium for pedestrian image recognition model - Google Patents

Info

Publication number
CN113780243B
Authority
CN
China
Prior art keywords
pedestrian image
image recognition
feature vector
pedestrian
data
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111167837.1A
Other languages
Chinese (zh)
Other versions
CN113780243A (en)
Inventor
司世景
王健宗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202111167837.1A
Publication of CN113780243A
Application granted
Publication of CN113780243B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a training method for a pedestrian image recognition model, comprising the following steps: acquiring an unlabeled first pedestrian image; performing data enhancement (augmentation) on the first pedestrian image to obtain a data-enhanced image; inputting the data-enhanced image into a first pedestrian image recognition network and a second pedestrian image recognition network, respectively, to extract a first anti-occlusion high-level semantic feature vector and a second anti-occlusion high-level semantic feature vector; and updating the network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the first anti-occlusion high-level semantic feature vector, the second anti-occlusion high-level semantic feature vector, and a preset loss function, thereby training the pedestrian image recognition model. Because the pedestrian image recognition model learns to extract anti-occlusion high-level semantic features from the data-enhanced image, the trained model can recognize pedestrians in occluded pedestrian images more accurately.

Description

Training method, device, equipment and storage medium for pedestrian image recognition model
Technical Field
The present application relates to the field of artificial intelligence, and in particular, to a training method and apparatus for a pedestrian image recognition model, a computer device, and a storage medium.
Background
Pedestrian re-recognition (person re-identification) is a technique that uses computer vision to determine whether a specific pedestrian is present in a given image or video. In practical applications, the image acquisition scene is usually complex and changeable, so pedestrians in an image are easily occluded by obstacles (such as luggage, counters, crowds in public places, automobiles, and trees), and the image becomes an occluded pedestrian image. Most current pedestrian re-recognition technologies focus on retrieval and matching over whole-body pedestrian images and neglect retrieval and matching over occluded pedestrian images (i.e., images in which the pedestrian is occluded by other objects). As a result, current pedestrian re-recognition technologies cannot accurately recognize pedestrians in occluded pedestrian images, and their recognition accuracy still has room for improvement.
Disclosure of Invention
The technical problem to be solved by the present application is that the recognition accuracy of existing pedestrian re-recognition technology is low.
In order to solve the technical problem, a first aspect of the present application discloses a training method for a pedestrian image recognition model, which comprises the following steps:
acquiring a first pedestrian image that has no corresponding annotation label;
adding occlusion noise to the first pedestrian image based on a preset data enhancement method to obtain a data-enhanced image;
inputting the data-enhanced image into a first pedestrian image recognition network for analysis, so as to extract a first anti-occlusion high-level semantic feature vector from the data-enhanced image;
inputting the data-enhanced image into a second pedestrian image recognition network for analysis, so as to extract a second anti-occlusion high-level semantic feature vector from the data-enhanced image;
updating network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the first anti-occlusion high-level semantic feature vector, the second anti-occlusion high-level semantic feature vector, and a preset loss function, so as to train a pedestrian image recognition model;
wherein the pedestrian image recognition model comprises the first pedestrian image recognition network and the second pedestrian image recognition network, and network parameters are shared between the first pedestrian image recognition network and the second pedestrian image recognition network.
The second aspect of the application discloses a training device for a pedestrian image recognition model, which comprises:
an acquisition module, configured to acquire a first pedestrian image that has no corresponding annotation label;
a data enhancement module, configured to add occlusion noise to the first pedestrian image based on a preset data enhancement method to obtain a data-enhanced image;
an analysis module, configured to input the data-enhanced image into a first pedestrian image recognition network for analysis, so as to extract a first anti-occlusion high-level semantic feature vector from the data-enhanced image;
the analysis module being further configured to input the data-enhanced image into a second pedestrian image recognition network for analysis, so as to extract a second anti-occlusion high-level semantic feature vector from the data-enhanced image;
an updating module, configured to update network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the first anti-occlusion high-level semantic feature vector, the second anti-occlusion high-level semantic feature vector, and a preset loss function, so as to train a pedestrian image recognition model;
wherein the pedestrian image recognition model comprises the first pedestrian image recognition network and the second pedestrian image recognition network, and network parameters are shared between the first pedestrian image recognition network and the second pedestrian image recognition network.
A third aspect of the application discloses a computer device comprising:
a memory storing executable program code;
a processor coupled to the memory;
the processor invokes the executable program code stored in the memory to perform some or all of the steps in the training method for the pedestrian image recognition model disclosed in the first aspect of the present application.
A fourth aspect of the present application discloses a computer storage medium storing computer instructions which, when invoked, perform some or all of the steps of the training method for the pedestrian image recognition model disclosed in the first aspect of the present application.
In the embodiments of the present application, a first pedestrian image recognition network and a second pedestrian image recognition network that share network parameters are arranged in the pedestrian image recognition model; this model structure is favorable for recognizing pedestrians in occluded pedestrian images. The data-enhanced image used for model training is obtained by adding occlusion noise to the first pedestrian image, which helps the pedestrian image recognition model learn, in a targeted manner, to recognize pedestrians in occluded pedestrian images. Anti-occlusion high-level semantic features are then extracted from the data-enhanced image by the two pedestrian image recognition networks, and training of the pedestrian image recognition model is completed using a preset loss function and these features. As a result, the trained pedestrian image recognition model can recognize pedestrians in occluded pedestrian images more accurately, which improves the recognition accuracy of pedestrian re-recognition technology.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a training method of a pedestrian image recognition model according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a pedestrian image recognition model according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a training device for a pedestrian image recognition model according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a computer device according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a computer storage medium according to an embodiment of the present application.
Detailed Description
In order that those skilled in the art may better understand the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort fall within the scope of protection of the application.
The terms "first", "second" and the like in the description, in the claims, and in the above drawings are used to distinguish between different objects, not necessarily to describe a sequential or chronological order. Furthermore, the terms "comprise" and "have", as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, apparatus, or article that comprises a list of steps or elements is not limited to those listed, but may optionally include other steps or elements that are not listed or that are inherent to such a process, method, apparatus, or article.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art will understand, both explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.
The application discloses a training method and device for a pedestrian image recognition model, a computer device, and a storage medium. A first pedestrian image recognition network and a second pedestrian image recognition network that share network parameters are arranged in the pedestrian image recognition model; this model structure is favorable for recognizing pedestrians in occluded pedestrian images. The data-enhanced image used for model training is obtained by adding occlusion noise to the first pedestrian image, which helps the model learn, in a targeted manner, to recognize pedestrians in occluded pedestrian images. Anti-occlusion high-level semantic features are then extracted from the data-enhanced image by the two pedestrian image recognition networks, and training is completed using a preset loss function and these features. As a result, the trained pedestrian image recognition model can recognize pedestrians in occluded pedestrian images more accurately, which improves the recognition accuracy of pedestrian re-recognition technology. The details are described below.
Example 1
Referring to fig. 1, fig. 1 is a flowchart of a training method of a pedestrian image recognition model according to an embodiment of the present application. As shown in fig. 1, the training method of the pedestrian image recognition model may include the following operations:
101. Acquiring a first pedestrian image that has no corresponding annotation label.
In step 101, the first pedestrian image may be obtained from an unlabeled pedestrian image dataset (e.g., a pedestrian re-identification dataset such as CUHK01, CUHK02, CUHK03, or DukeMTMC-reID), which typically includes a large number of unlabeled pedestrian images.
102. Adding occlusion noise to the first pedestrian image based on a preset data enhancement method to obtain a data-enhanced image.
In step 102, the preset data enhancement method may be Random Erasing Data Augmentation (REA). In short, a region is randomly selected in the pedestrian image and covered with an occlusion noise mask. The mask may be a black block, a gray block, or random noise. The method has been shown to improve model performance and robustness to occlusion across multiple CNN architectures and different fields. Specifically, the basic flow of the random erasing data augmentation method may be as follows:
(1) Input the width W, the height H, and the area S of the pedestrian image I, and set the area-ratio range of the erased region, Se ∈ (Sl, Sh), and the aspect-ratio range of the erased region, rl ∈ (r1, r2);
(2) Randomly take a point (xe, ye) in the pedestrian image I, randomly generate an erased-area ratio Se in the range (Sl, Sh), randomly generate an erased-region aspect ratio rl in the range (r1, r2), and then calculate the width We and the height He of the occlusion noise mask from Se and rl;
(3) Determine whether the occlusion noise mask exceeds the boundary of the pedestrian image I; if so, return to step (2); if not, proceed to the next step;
(4) Assign a random value or an average value to the pixels in the occlusion noise mask;
(5) Return the new pedestrian image.
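The flow above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function name `random_erase`, the default ranges (0.02, 0.4) and (0.3, 3.3), the retry limit, and the reading of the aspect ratio as height divided by width are all assumptions made for the example, and the image is represented as a plain list of grayscale rows.

```python
import math
import random


def random_erase(image, s_l=0.02, s_h=0.4, r_1=0.3, r_2=3.3,
                 fill="random", max_tries=100, rng=None):
    """Return a copy of `image` (list of rows of 0..255 ints) with one erased block.

    All default values are illustrative assumptions, not values from the source.
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    area = height * width                      # step (1): W, H and S of the image
    mean = sum(map(sum, image)) // area        # used when fill == "mean"

    for _ in range(max_tries):
        # Step (2): random point, area ratio and aspect ratio, then mask size.
        x_e = rng.randrange(width)
        y_e = rng.randrange(height)
        mask_area = rng.uniform(s_l, s_h) * area
        ratio = rng.uniform(r_1, r_2)          # assumed to mean height / width
        h_e = int(round(math.sqrt(mask_area * ratio)))
        w_e = int(round(math.sqrt(mask_area / ratio)))

        # Step (3): if the mask leaves the image, go back to step (2).
        if h_e < 1 or w_e < 1 or x_e + w_e > width or y_e + h_e > height:
            continue

        # Step (4): fill the mask with random values or the image mean.
        out = [row[:] for row in image]
        for y in range(y_e, y_e + h_e):
            for x in range(x_e, x_e + w_e):
                out[y][x] = rng.randrange(256) if fill == "random" else mean
        return out                             # step (5): the new image

    return [row[:] for row in image]           # no valid mask found: plain copy
```

Rejecting and resampling in step (3) keeps the mask entirely inside the image. If no mask fits within `max_tries` attempts, this sketch returns an unmodified copy rather than looping forever.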
103. Inputting the data-enhanced image into the first pedestrian image recognition network for analysis, so as to extract a first anti-occlusion high-level semantic feature vector from the data-enhanced image.
In step 103, because a first pedestrian image recognition network and a second pedestrian image recognition network that share network parameters are arranged in the pedestrian image recognition model, the first pedestrian image recognition network can, even under the influence of occlusion noise, still extract robust high-level features of the data-enhanced image that are not disturbed by the occlusion noise (i.e., the first anti-occlusion high-level semantic feature vector).
104. Inputting the data-enhanced image into the second pedestrian image recognition network for analysis, so as to extract a second anti-occlusion high-level semantic feature vector from the data-enhanced image.
In step 104, the same network design likewise enables the second pedestrian image recognition network, even under the influence of occlusion noise, to extract robust high-level features of the data-enhanced image that are not disturbed by the occlusion noise (i.e., the second anti-occlusion high-level semantic feature vector).
105. Updating network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the first anti-occlusion high-level semantic feature vector, the second anti-occlusion high-level semantic feature vector, and a preset loss function, so as to train the pedestrian image recognition model;
wherein the pedestrian image recognition model comprises the first pedestrian image recognition network and the second pedestrian image recognition network, and network parameters are shared between the first pedestrian image recognition network and the second pedestrian image recognition network.
In step 105, after the first anti-occlusion high-level semantic feature vector and the second anti-occlusion high-level semantic feature vector are extracted, the pedestrian image recognition model may be trained using the two feature vectors and a preset loss function. Specifically, the preset loss function constrains the two pedestrian image recognition networks to learn the information common to the two data-enhanced images. During training, the network parameters of the two networks are continuously updated through gradient backpropagation, so that the parameter settings become increasingly reasonable and the ability of the two networks to extract robust high-level features that are not disturbed by occlusion noise (i.e., anti-occlusion high-level semantic features) from the data-enhanced images continuously improves.
Therefore, in the training method described in FIG. 1, a first pedestrian image recognition network and a second pedestrian image recognition network that share network parameters are arranged in the pedestrian image recognition model; this model structure is favorable for recognizing pedestrians in occluded pedestrian images. The data-enhanced image used for model training is obtained by adding occlusion noise to the first pedestrian image, which helps the model learn, in a targeted manner, to recognize pedestrians in occluded pedestrian images. Anti-occlusion high-level semantic features are then extracted from the data-enhanced image by the two networks, and training is completed using a preset loss function and these features. As a result, the trained pedestrian image recognition model can recognize pedestrians in occluded pedestrian images more accurately, which improves the recognition accuracy of pedestrian re-recognition technology.
In an optional embodiment, updating the network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the first anti-occlusion high-level semantic feature vector, the second anti-occlusion high-level semantic feature vector, and a preset loss function, so as to train the pedestrian image recognition model, includes:
inputting the first anti-occlusion high-level semantic feature vector into a preset multi-layer perceptron for analysis to obtain a multi-layer perceptual feature vector;
and updating network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the multi-layer perceptual feature vector, the second anti-occlusion high-level semantic feature vector, and a preset loss function, so as to train the pedestrian image recognition model.
In this alternative embodiment, it has been found in practice that the symmetric model structure, in which the first pedestrian image recognition network and the second pedestrian image recognition network share network parameters, is prone to producing nearly identical outputs from the two networks, resulting in a collapsed solution. To reduce the occurrence of collapsed solutions, a multi-layer perceptron can be added after the first pedestrian image recognition network, making the model structure asymmetric. This reduces the likelihood that collapse occurs during training because the network parameters tend to become the same, and further enhances the stability and adaptability of the pedestrian image recognition model.
In an alternative embodiment, the data-enhanced image comprises a first data-enhanced sub-image and a second data-enhanced sub-image;
and, the loss function is:
wherein p_1 is the feature vector corresponding to the first data-enhanced sub-image in the multi-layer perceptual feature vector, p_2 is the feature vector corresponding to the second data-enhanced sub-image in the multi-layer perceptual feature vector, z_1 is the feature vector corresponding to the first data-enhanced sub-image in the second anti-occlusion high-level semantic feature vector, z_2 is the feature vector corresponding to the second data-enhanced sub-image in the second anti-occlusion high-level semantic feature vector, L is the loss value, and D(a, b) is the negative cosine similarity between feature vector a and feature vector b.
In this alternative embodiment, after adding the multi-layer perceptron after the first pedestrian image recognition network to modify the model structure to an asymmetric structure, the loss function may also be adaptively modified to the loss function described above to accommodate the new asymmetric model structure.
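The loss formula itself is not reproduced in the text above. One form consistent with the listed symbols, assumed here rather than confirmed by the source, is the symmetric loss L = D(p_1, z_2)/2 + D(p_2, z_1)/2, with D(a, b) = -(a · b) / (‖a‖ ‖b‖). The sketch below evaluates that assumed form on plain Python lists; the function names are hypothetical.

```python
import math


def neg_cosine(a, b):
    """D(a, b): negative cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return -dot / (norm_a * norm_b)


def symmetric_loss(p1, p2, z1, z2):
    """Assumed form: L = D(p1, z2) / 2 + D(p2, z1) / 2."""
    return 0.5 * neg_cosine(p1, z2) + 0.5 * neg_cosine(p2, z1)
```

Under this assumed form the loss ranges from -1 (each perceptron output is aligned with the other view's feature vector) to +1 (they point in opposite directions), so minimizing it pulls the two views of the same pedestrian toward a common representation.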
In an optional embodiment, updating the network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the first anti-occlusion high-level semantic feature vector, the second anti-occlusion high-level semantic feature vector, and a preset loss function, so as to train the pedestrian image recognition model, includes:
inputting the first anti-occlusion high-level semantic feature vector into a preset multi-layer perceptron for analysis to obtain a multi-layer perceptual feature vector;
converting the second anti-occlusion high-level semantic feature vector into a gradient stop feature vector based on a preset gradient stop operator;
and updating network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the multi-layer perceptual feature vector, the gradient stop feature vector, and a preset loss function, so as to train the pedestrian image recognition model.
In this alternative embodiment, as shown in FIG. 2, to further increase the asymmetry of the model structure, a multi-layer perceptron may be added after the first pedestrian image recognition network and, at the same time, a gradient stop operator may be added after the second pedestrian image recognition network. This further reduces the likelihood that a collapsed solution occurs during training because the network parameters tend to become the same, and further enhances the stability and adaptability of the pedestrian image recognition model.
In an alternative embodiment, the data-enhanced image comprises a first data-enhanced sub-image and a second data-enhanced sub-image;
and, the loss function is:
wherein p_1 is the feature vector corresponding to the first data-enhanced sub-image in the multi-layer perceptual feature vector, p_2 is the feature vector corresponding to the second data-enhanced sub-image in the multi-layer perceptual feature vector, z_1 is the feature vector corresponding to the first data-enhanced sub-image in the second anti-occlusion high-level semantic feature vector, z_2 is the feature vector corresponding to the second data-enhanced sub-image in the second anti-occlusion high-level semantic feature vector, stopgrad(z_1) is the feature vector corresponding to the first data-enhanced sub-image in the gradient stop feature vector, stopgrad(z_2) is the feature vector corresponding to the second data-enhanced sub-image in the gradient stop feature vector, L is the loss value, and D(a, b) is the negative cosine similarity between feature vector a and feature vector b.
In this alternative embodiment, after the multi-layer perceptron is added after the first pedestrian image recognition network and the gradient stop operator is added after the second pedestrian image recognition network, the loss function may also be adaptively modified to the loss function described above to suit the new asymmetric model structure.
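A gradient stop operator leaves the forward value unchanged and treats its argument as a constant during backpropagation. Assuming the loss has the form L = D(p_1, stopgrad(z_2))/2 + D(p_2, stopgrad(z_1))/2 (the formula is not reproduced in the text above, so this form is an assumption), each term sends gradient only into p. The hypothetical helper below gives the analytic gradient of a single term D(p, stopgrad(z)) with respect to p; no gradient is computed for z.

```python
import math


def neg_cosine(p, z):
    """D(p, z): negative cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(p, z))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_z = math.sqrt(sum(b * b for b in z))
    return -dot / (norm_p * norm_z)


def grad_wrt_p(p, z):
    """Gradient of D(p, stopgrad(z)) with respect to p, with z held constant.

    dD/dp_i = -(z_i / |z| - cos(p, z) * p_i / |p|) / |p|
    """
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_z = math.sqrt(sum(b * b for b in z))
    cos = sum(a * b for a, b in zip(p, z)) / (norm_p * norm_z)
    return [-(z_i / norm_z - cos * p_i / norm_p) / norm_p
            for p_i, z_i in zip(p, z)]
```

In an automatic differentiation framework the same effect is usually obtained by detaching z from the computation graph; the manual gradient here only makes explicit that the z branch contributes a target and nothing else.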
In an optional embodiment, after updating the network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the first anti-occlusion high-level semantic feature vector, the second anti-occlusion high-level semantic feature vector, and a preset loss function, so as to train the pedestrian image recognition model, the method further includes:
acquiring a second pedestrian image preset with a corresponding annotation label;
updating network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the second pedestrian image and a preset triplet loss function, so as to adjust the pedestrian image recognition model;
wherein the triplet loss function is:
L_triplet = max(d(a, p) - d(a, n) + margin, 0)
wherein a is the sample corresponding to the second pedestrian image in a preset labeled training dataset, p is a sample randomly selected from the labeled training dataset that belongs to the same class as the second pedestrian image, n is a sample randomly selected from the labeled training dataset that belongs to a different class from the second pedestrian image, margin is a preset margin constant, L_triplet is the triplet loss value, d(a, p) is the Euclidean distance between a and p, and d(a, n) is the Euclidean distance between a and n.
In this alternative embodiment, the labeled second pedestrian image may be obtained from a preset labeled training dataset (e.g., the MSMT17 dataset). A sample Anchor is randomly selected from the labeled training dataset, and then a sample Positive belonging to the same class as Anchor and a sample Negative belonging to a different class from Anchor are randomly selected. For a given triplet (Anchor, Positive, Negative), the triplet loss function tries to learn a feature space in which the reference sample (Anchor) is closer to the positive sample (Positive) and farther from the negative sample (Negative). Updating the network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network again based on the second pedestrian image and the preset triplet loss function means training the two networks with the second pedestrian image and the preset triplet loss function. Adjusting the pedestrian image recognition model with the triplet loss function can further improve the recognition accuracy of pedestrian re-recognition technology.
It can be seen that, after the training of the pedestrian image recognition model is completed, continuing to adjust the model with the labeled second pedestrian image and the triplet loss function can further improve the recognition accuracy of pedestrian re-recognition technology.
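The triplet loss above can be evaluated directly on feature vectors. The following is a minimal sketch on plain Python lists; the function names and the default margin of 0.3 are illustrative assumptions, not values given in the source.

```python
import math


def euclidean(u, v):
    """d(u, v): Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """L_triplet = max(d(a, p) - d(a, n) + margin, 0); margin is an assumed value."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)
```

The loss is zero once the negative sample is farther from the anchor than the positive sample by at least the margin, so only triplets that still violate the margin contribute to the parameter update.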
In an optional embodiment, after the updating of the network parameters of the first pedestrian image identification network and the second pedestrian image identification network based on the second pedestrian image and a preset triplet loss function to implement the adjustment of the pedestrian image identification model, the method further includes:
updating network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the second pedestrian image and a preset cross entropy loss function so as to realize adjustment of the pedestrian image recognition model;
wherein the cross entropy loss function is:
wherein $y_i$ is the label corresponding to the i-th sample in the labeled training dataset, $p_i$ is the predicted probability value corresponding to the i-th sample in the labeled training dataset, N is the number of samples in the labeled training dataset, and $L_{CE}$ is the cross-entropy loss value.
In this alternative embodiment, the process of updating the network parameters of the first pedestrian image identification network and the second pedestrian image identification network again based on the second pedestrian image and the preset cross entropy loss function is a process of training the first pedestrian image identification network and the second pedestrian image identification network using the second pedestrian image and the preset cross entropy loss function.
It can be seen that, in this alternative embodiment, after the labeled second pedestrian image and the triplet loss function have been used to adjust the pedestrian image recognition model, the second pedestrian image and the cross-entropy loss function are used to adjust the model again, so that the recognition accuracy of pedestrian re-identification can be further improved.
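The cross-entropy formula itself is not reproduced in this text, so the sketch below assumes a common form that is consistent with the symbols defined above: the mean over the N labeled samples of the negative log of the probability predicted for the labeled class. The function names, the use of integer class indices for the labels, and the softmax step that turns raw scores into probabilities are assumptions for illustration.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, labels):
    """Assumed form: L_CE = -(1/N) * sum_i log(p_i[y_i]).

    probs  -- list of per-sample probability vectors
    labels -- list of integer class indices y_i (equivalent to one-hot labels)
    """
    n = len(labels)
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / n
```

With this form the loss falls toward zero as the probability assigned to the labeled identity approaches 1.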
Optionally, the training information of the pedestrian image recognition model produced by the training method may also be uploaded to a blockchain.
Specifically, the training information of the pedestrian image recognition model is obtained by running the training method of the pedestrian image recognition model and records the training process of the model, for example the acquired first pedestrian image, the data enhancement image, the extracted first anti-occlusion high-level semantic feature vector, the extracted second anti-occlusion high-level semantic feature vector, and the trained pedestrian image recognition model. Uploading the training information to the blockchain helps ensure its security as well as fairness and transparency for users. A user can download the training information from the blockchain to verify whether it has been tampered with. The blockchain referred to in this example is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked to one another by cryptographic methods, where each data block contains a batch of network transaction information used to verify the validity of that information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
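The tamper check described above can be illustrated with a toy hash chain. This is only a sketch under stated assumptions: it is an in-memory list rather than a real blockchain platform, the record fields (`epoch`, `loss`) are hypothetical, and the choice of SHA-256 over canonical JSON is made for the example, not taken from the source.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first block


def digest(record, prev_hash):
    """SHA-256 over a canonical JSON encoding of the record and previous hash."""
    payload = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, record):
    """Append a training-information record, linking it to the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev": prev_hash,
                  "hash": digest(record, prev_hash)})
    return chain


def verify(chain):
    """Return True only if every block still matches its stored hash and link."""
    prev_hash = GENESIS
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        if block["hash"] != digest(block["record"], prev_hash):
            return False
        prev_hash = block["hash"]
    return True
```

Because each block's hash covers the previous hash, changing any stored record invalidates that block and every later link, which is what lets a user detect tampering after downloading the training information.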
The embodiments of the application can acquire and process the relevant data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technology, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results.
Artificial intelligence infrastructure technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, robotics, biometric recognition, speech processing, natural language processing, and machine learning/deep learning.
Example II
Referring to fig. 3, fig. 3 is a schematic structural diagram of a training device for a pedestrian image recognition model according to an embodiment of the application. As shown in fig. 3, the training device of the pedestrian image recognition model may include:
an acquiring module 201, configured to acquire a first pedestrian image not provided with a corresponding labeling tag;
the data enhancement module 202 is configured to add shielding noise to the first pedestrian image based on a preset data enhancement method, so as to obtain a data enhancement image;
the analysis module 203 is configured to input the data enhanced image to a first pedestrian image recognition network for analysis, so as to extract a first anti-occlusion high-level semantic feature vector in the data enhanced image;
the analysis module 203 is further configured to input the data enhanced image to a second pedestrian image recognition network for analysis, so as to extract a second anti-occlusion high-level semantic feature vector in the data enhanced image;
the updating module 204 is configured to update network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the first anti-occlusion high-level semantic feature vector, the second anti-occlusion high-level semantic feature vector, and a preset loss function, so as to train a pedestrian image recognition model;
the pedestrian image recognition model comprises a first pedestrian image recognition network and a second pedestrian image recognition network, and network parameters are shared between the first pedestrian image recognition network and the second pedestrian image recognition network.
For a detailed description of the training device of the pedestrian image recognition model, refer to the detailed description of the training method of the pedestrian image recognition model; it is not repeated here.
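As an illustration of what the data enhancement module 202 does, the sketch below adds occlusion noise by overwriting one random rectangle of the image with random pixel values. The source only refers to a preset data enhancement method, so this random-erasing style operation, the function name `add_occlusion_noise`, the `max_frac` parameter, and the representation of an image as a list of rows of grayscale integers are all assumptions.

```python
import random


def add_occlusion_noise(image, max_frac=0.5, rng=random):
    """Return a copy of `image` with one random rectangle replaced by noise.

    image    -- list of rows, each a list of grayscale ints
    max_frac -- largest fraction of height/width the rectangle may cover
    """
    height, width = len(image), len(image[0])
    box_h = rng.randint(1, max(1, int(height * max_frac)))
    box_w = rng.randint(1, max(1, int(width * max_frac)))
    top = rng.randint(0, height - box_h)
    left = rng.randint(0, width - box_w)
    out = [row[:] for row in image]  # copy so the input image is not modified
    for r in range(top, top + box_h):
        for c in range(left, left + box_w):
            out[r][c] = rng.randint(0, 255)  # random noise pixel
    return out
```

Calling the function twice on the same unlabeled image yields two differently occluded views, which could then be passed to the two parameter-sharing recognition networks.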
Example III
Referring to fig. 4, fig. 4 is a schematic structural diagram of a computer device according to an embodiment of the application. As shown in fig. 4, the computer device may include:
a memory 301 storing executable program code;
a processor 302 connected to the memory 301;
the processor 302 invokes the executable program code stored in the memory 301 to perform the steps in the training method of the pedestrian image recognition model disclosed in the first embodiment of the present application.
Example IV
Referring to fig. 5, an embodiment of the present application discloses a computer storage medium 401. The computer storage medium 401 stores computer instructions which, when called, execute the steps of the training method of the pedestrian image recognition model disclosed in the embodiments of the present application.
The apparatus embodiments described above are merely illustrative, wherein the modules illustrated as separate components may or may not be physically separate, and the components shown as modules may or may not be physical, i.e., may be located in one place, or may be distributed over a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present application without undue burden.
From the above detailed description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware. Based on this understanding, the foregoing technical solutions, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product that may be stored in a computer-readable storage medium, including Read-Only Memory (ROM), Random Access Memory (RAM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), One-Time Programmable Read-Only Memory (OTPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other optical disc memory, magnetic disk memory, tape memory, or any other computer-readable medium that can be used to carry or store data.
Finally, it should be noted that the training method, device, computer equipment, and storage medium of the pedestrian image recognition model disclosed in the embodiments of the application are only preferred embodiments of the application, used solely to illustrate the technical solutions of the application and not to limit them. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions recorded in the various embodiments can still be modified, or some of their technical features can be replaced by equivalents, and that such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims (8)

1. A method for training a pedestrian image recognition model, the method comprising:
acquiring a first pedestrian image which is not provided with a corresponding labeling label;
adding shielding noise into the first pedestrian image based on a preset data enhancement method to obtain a data enhancement image;
inputting the data enhanced image to a first pedestrian image recognition network for analysis so as to extract a first anti-shielding high-level semantic feature vector in the data enhanced image;
inputting the data enhanced image to a second pedestrian image recognition network for analysis so as to extract a second anti-shielding high-level semantic feature vector in the data enhanced image;
updating network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the first anti-shielding high-level semantic feature vector, the second anti-shielding high-level semantic feature vector and a preset loss function so as to train a pedestrian image recognition model, wherein in the training process, the network parameters in the two pedestrian image recognition networks are continuously updated through gradient backpropagation;
the pedestrian image recognition model comprises a first pedestrian image recognition network and a second pedestrian image recognition network, and network parameters are shared between the first pedestrian image recognition network and the second pedestrian image recognition network;
acquiring a second pedestrian image preset with a corresponding labeling label;
updating network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the second pedestrian image and a preset triplet loss function so as to realize adjustment of the pedestrian image recognition model;
wherein the triplet loss function is:
$$L_{\text{triplet}} = \max\bigl(d(a,p) - d(a,n) + \text{margin},\ 0\bigr)$$
wherein a is the sample corresponding to the second pedestrian image in a preset labeled training dataset, p is a sample randomly selected from the labeled training dataset that belongs to the same class as the second pedestrian image, n is a sample randomly selected from the labeled training dataset that belongs to a different class from the second pedestrian image, margin is a preset small margin constant, $L_{\text{triplet}}$ is the triplet loss value, d(a, p) is the Euclidean distance between a and p, and d(a, n) is the Euclidean distance between a and n;
updating network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the second pedestrian image and a preset cross entropy loss function so as to realize adjustment of the pedestrian image recognition model;
wherein the cross entropy loss function is:
wherein $y_i$ is the label corresponding to the i-th sample in the labeled training dataset, $p_i$ is the predicted probability value corresponding to the i-th sample in the labeled training dataset, N is the number of samples in the labeled training dataset, and $L_{CE}$ is the cross-entropy loss value.
2. The training method of the pedestrian image recognition model according to claim 1, wherein updating network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the first anti-occlusion high-level semantic feature vector, the second anti-occlusion high-level semantic feature vector, and a preset loss function to achieve training of the pedestrian image recognition model includes:
inputting the first anti-shielding high-level semantic feature vector into a preset multi-layer perceptron for analysis to obtain a multi-layer perceptual feature vector;
and updating network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the multi-layer perception feature vector, the second anti-occlusion high-layer semantic feature vector and a preset loss function so as to train the pedestrian image recognition model.
3. The method of training a pedestrian image recognition model of claim 2 wherein the data-enhanced image comprises a first data-enhanced sub-image and a second data-enhanced sub-image;
and, the loss function is:
wherein $p_1$ is the feature vector corresponding to the first data-enhanced sub-image in the multi-layer perceptual feature vector, $p_2$ is the feature vector corresponding to the second data-enhanced sub-image in the multi-layer perceptual feature vector, $z_1$ is the feature vector corresponding to the first data-enhanced sub-image in the second anti-occlusion high-level semantic feature vector, $z_2$ is the feature vector corresponding to the second data-enhanced sub-image in the second anti-occlusion high-level semantic feature vector, L is the loss value, and D(a, b) is the negative cosine similarity between feature vector a and feature vector b.
4. The training method of the pedestrian image recognition model according to claim 1, wherein updating network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the first anti-occlusion high-level semantic feature vector, the second anti-occlusion high-level semantic feature vector, and a preset loss function to achieve training of the pedestrian image recognition model includes:
inputting the first anti-shielding high-level semantic feature vector into a preset multi-layer perceptron for analysis to obtain a multi-layer perceptual feature vector;
converting the second anti-occlusion high-level semantic feature vector into a gradient stop feature vector based on a preset gradient stop operator;
and updating network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the multi-layer perception feature vector, the gradient stopping feature vector and a preset loss function so as to train the pedestrian image recognition model.
5. The method of training a pedestrian image recognition model of claim 4 wherein the data-enhanced image comprises a first data-enhanced sub-image and a second data-enhanced sub-image;
and, the loss function is:
wherein $p_1$ is the feature vector corresponding to the first data-enhanced sub-image in the multi-layer perceptual feature vector, $p_2$ is the feature vector corresponding to the second data-enhanced sub-image in the multi-layer perceptual feature vector, $z_1$ is the feature vector corresponding to the first data-enhanced sub-image in the second anti-occlusion high-level semantic feature vector, $z_2$ is the feature vector corresponding to the second data-enhanced sub-image in the second anti-occlusion high-level semantic feature vector, $\text{stopgrad}(z_1)$ is the feature vector corresponding to the first data-enhanced sub-image in the gradient stop feature vector, $\text{stopgrad}(z_2)$ is the feature vector corresponding to the second data-enhanced sub-image in the gradient stop feature vector, L is the loss value, and D(a, b) is the negative cosine similarity between feature vector a and feature vector b.
6. A training device for a pedestrian image recognition model, the device comprising:
the acquisition module is used for acquiring a first pedestrian image which is not provided with a corresponding labeling label;
the data enhancement module is used for adding shielding noise to the first pedestrian image based on a preset data enhancement method so as to obtain a data enhancement image;
the analysis module is used for inputting the data enhanced image into a first pedestrian image recognition network for analysis so as to extract a first anti-shielding high-level semantic feature vector in the data enhanced image;
the analysis module is further used for inputting the data enhanced image into a second pedestrian image recognition network for analysis so as to extract a second anti-shielding high-level semantic feature vector in the data enhanced image;
the updating module is used for updating network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the first anti-shielding high-level semantic feature vector, the second anti-shielding high-level semantic feature vector and a preset loss function so as to train a pedestrian image recognition model, and in the training process, the network parameters in the two pedestrian image recognition networks are continuously updated through gradient backpropagation;
the pedestrian image recognition model comprises a first pedestrian image recognition network and a second pedestrian image recognition network, and network parameters are shared between the first pedestrian image recognition network and the second pedestrian image recognition network;
the updating module is also used for acquiring a second pedestrian image preset with a corresponding labeling label;
updating network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the second pedestrian image and a preset triplet loss function so as to realize adjustment of the pedestrian image recognition model;
wherein the triplet loss function is:
$$L_{\text{triplet}} = \max\bigl(d(a,p) - d(a,n) + \text{margin},\ 0\bigr)$$
wherein a is the sample corresponding to the second pedestrian image in a preset labeled training dataset, p is a sample randomly selected from the labeled training dataset that belongs to the same class as the second pedestrian image, n is a sample randomly selected from the labeled training dataset that belongs to a different class from the second pedestrian image, margin is a preset small margin constant, $L_{\text{triplet}}$ is the triplet loss value, d(a, p) is the Euclidean distance between a and p, and d(a, n) is the Euclidean distance between a and n;
updating network parameters of the first pedestrian image recognition network and the second pedestrian image recognition network based on the second pedestrian image and a preset cross entropy loss function so as to realize adjustment of the pedestrian image recognition model;
wherein the cross entropy loss function is:
wherein $y_i$ is the label corresponding to the i-th sample in the labeled training dataset, $p_i$ is the predicted probability value corresponding to the i-th sample in the labeled training dataset, N is the number of samples in the labeled training dataset, and $L_{CE}$ is the cross-entropy loss value.
7. A computer device, the computer device comprising:
a memory storing executable program code;
a processor coupled to the memory;
the processor invokes the executable program code stored in the memory to perform the training method of the pedestrian image recognition model of any one of claims 1-5.
8. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the training method of the pedestrian image recognition model according to any one of claims 1-5.
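The loss formulas of claims 3 and 5 are not reproduced in this text. The symbol definitions (two views, projected vectors p_1 and p_2, encoder outputs z_1 and z_2, and a negative cosine similarity D) match the symmetric loss used in SimSiam-style self-supervised training, so the sketch below assumes that form: L = ½·D(p_1, z_2) + ½·D(p_2, z_1). The equal ½ weights, the cross pairing of views, and the function names are assumptions. The stop-gradient operator of claims 4 and 5 changes only how gradients flow during backpropagation, not the forward value, so it does not appear in this forward-only computation.

```python
import math


def neg_cosine(a, b):
    """D(a, b): negative cosine similarity between feature vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return -dot / (norm_a * norm_b)


def symmetric_loss(p1, p2, z1, z2):
    """Assumed form: L = 0.5 * D(p1, z2) + 0.5 * D(p2, z1).

    In a framework with automatic differentiation, z1 and z2 would be
    wrapped in a stop-gradient operator before this call (claims 4 and 5).
    """
    return 0.5 * neg_cosine(p1, z2) + 0.5 * neg_cosine(p2, z1)
```

Under this assumed form the loss ranges from -1, when each projected vector points in the same direction as the other view's encoder output, to 1, when they point in opposite directions.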
CN202111167837.1A 2021-09-29 2021-09-29 Training method, device, equipment and storage medium for pedestrian image recognition model Active CN113780243B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111167837.1A CN113780243B (en) 2021-09-29 2021-09-29 Training method, device, equipment and storage medium for pedestrian image recognition model

Publications (2)

Publication Number Publication Date
CN113780243A CN113780243A (en) 2021-12-10
CN113780243B true CN113780243B (en) 2023-10-17

Family

ID=78854672

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111167837.1A Active CN113780243B (en) 2021-09-29 2021-09-29 Training method, device, equipment and storage medium for pedestrian image recognition model

Country Status (1)

Country Link
CN (1) CN113780243B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114445681A (en) * 2022-01-28 2022-05-06 上海商汤智能科技有限公司 Model training and image recognition method and device, equipment and storage medium
CN114724183B (en) * 2022-04-08 2024-05-24 平安科技(深圳)有限公司 Human body key point detection method, system, electronic equipment and readable storage medium
CN116091907B (en) * 2023-04-12 2023-08-15 四川大学 Image tampering positioning model and method based on non-mutually exclusive ternary comparison learning
CN116385813B (en) * 2023-06-07 2023-08-29 南京隼眼电子科技有限公司 ISAR image space target classification method, device and storage medium based on unsupervised contrast learning

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111259850A (en) * 2020-01-23 2020-06-09 同济大学 Pedestrian re-identification method integrating random batch mask and multi-scale representation learning
CN111931637A (en) * 2020-08-07 2020-11-13 华南理工大学 Cross-modal pedestrian re-identification method and system based on double-current convolutional neural network
CN111967429A (en) * 2020-08-28 2020-11-20 清华大学 Pedestrian re-recognition model training method and device based on active learning
CN112801008A (en) * 2021-02-05 2021-05-14 电子科技大学中山学院 Pedestrian re-identification method and device, electronic equipment and readable storage medium
CN113095263A (en) * 2021-04-21 2021-07-09 中国矿业大学 Method and device for training heavy identification model of pedestrian under shielding and method and device for heavy identification of pedestrian under shielding
CN113111814A (en) * 2021-04-20 2021-07-13 合肥学院 Regularization constraint-based semi-supervised pedestrian re-identification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant