WO2020186914A1 - Pedestrian re-identification method, device and storage medium - Google Patents
Pedestrian re-identification method, device and storage medium
- Publication number
- WO2020186914A1 (PCT/CN2020/071499)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- target
- training
- pedestrian
- network
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims abstract description 70
- 238000012549 training Methods 0.000 claims abstract description 178
- 238000013528 artificial neural network Methods 0.000 claims abstract description 61
- 238000000605 extraction Methods 0.000 claims abstract description 23
- 230000004438 eyesight Effects 0.000 claims abstract description 12
- 230000009466 transformation Effects 0.000 claims description 59
- 230000006870 function Effects 0.000 claims description 38
- 238000012937 correction Methods 0.000 claims description 35
- 238000003062 neural network model Methods 0.000 claims description 30
- 238000004590 computer program Methods 0.000 claims description 26
- 238000006243 chemical reaction Methods 0.000 claims description 20
- 238000012545 processing Methods 0.000 claims description 13
- 230000015654 memory Effects 0.000 claims description 7
- 230000036544 posture Effects 0.000 description 28
- 238000010586 diagram Methods 0.000 description 13
- 238000011176 pooling Methods 0.000 description 13
- 238000013135 deep learning Methods 0.000 description 8
- 230000000694 effects Effects 0.000 description 6
- 238000007781 pre-processing Methods 0.000 description 5
- 230000008569 process Effects 0.000 description 5
- 230000004913 activation Effects 0.000 description 4
- 238000004422 calculation algorithm Methods 0.000 description 4
- 238000013527 convolutional neural network Methods 0.000 description 4
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 230000009471 action Effects 0.000 description 1
- 230000003044 adaptive effect Effects 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 230000000052 comparative effect Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 238000007499 fusion processing Methods 0.000 description 1
- 238000005286 illumination Methods 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000009434 installation Methods 0.000 description 1
- 210000002569 neuron Anatomy 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Definitions
- The present disclosure relates to, but is not limited to, the field of pedestrian re-identification, and in particular to a pedestrian re-identification method, device and storage medium.
- Pedestrian re-identification means that, given a person's identity (ID) in a multi-camera setting, the person's features are matched across the multiple cameras so as to accurately locate the person corresponding to that ID.
- In pedestrian re-identification methods not based on deep learning, manual feature design is cumbersome and algorithm accuracy is low; compared with such methods, deep-learning-based pedestrian re-identification methods offer improved accuracy and faster running speed.
- These methods perform well for specific scenarios (with controllable pedestrian flow), but their accuracy is limited in complex scenarios (such as crowded places, train stations, and JD unmanned stores), for example cross-dataset pedestrian misidentification caused by different scenes (across cameras) and different pedestrian clothing (different seasons, different clothing styles).
- Current deep learning methods still lack cross-domain model generalization ability, that is, a network model trained on a specific scene cannot be applied well to new scenes, including the same person wearing different clothes in the same scene or the same clothes in different scenes; in complex scenes, the problems of missed and false identification by pedestrian re-identification models remain to be solved.
- the embodiments of the present disclosure provide a pedestrian re-identification method, device and storage medium with strong generalization ability and accurate identification.
- embodiments of the present disclosure provide a pedestrian re-identification method, the method including:
- the training samples of the neural network include target domain images, obtained by converting source domain images in other field-of-view scenes to the target field-of-view scene, and the identity information of the objects contained in the target domain images.
- In some embodiments, before performing feature extraction and matching on the image to be recognized based on the trained neural network for pedestrian re-identification, the method further includes:
- the second training sample is input into the neural network model for iterative training until the loss function of the neural network model satisfies the convergence condition, obtaining the trained neural network for pedestrian re-identification.
- In some embodiments, before the second training sample is input into the neural network model for iterative training until the loss function of the neural network model meets the convergence condition, the method further includes:
- the generative adversarial network includes a generative network and a recognition network, and the first training sample is input into the trained generative adversarial network to perform style conversion to obtain the target domain image in the target field-of-view scene.
- the generative network and the recognition network are alternately and iteratively trained until the set loss function meets the convergence condition, obtaining the trained generative adversarial network.
- In some embodiments, before performing feature extraction and matching on the image to be recognized based on the trained neural network for pedestrian re-identification, the method further includes:
- performing posture correction on the object to be recognized in the image to be recognized, which includes:
- the embodiments of the present disclosure also provide a pedestrian re-identification device, including an acquisition module and a processing module, wherein:
- the acquisition module is configured to acquire an image to be recognized in a scene of a target field of view, where the image to be recognized includes an object to be recognized;
- the processing module is configured to perform feature extraction and matching on the image to be recognized based on the trained neural network for pedestrian re-identification to obtain a recognition result corresponding to the object to be recognized;
- wherein the training samples of the neural network include target domain images obtained by converting source domain images in other field-of-view scenes to the target field-of-view scene and the categories of the objects contained in the target domain images.
- In some embodiments, the device further includes a training module configured to obtain a first training sample, the first training sample including a source domain image of a target object in another field-of-view scene; input the first training sample into the trained generative adversarial network to perform style conversion to obtain the target domain image in the target field-of-view scene; form a second training sample from the target domain image labeled with the identity information of the contained target object; and input the second training sample into the neural network model for iterative training until the loss function of the neural network model satisfies the convergence condition, obtaining the trained neural network for pedestrian re-identification.
- the training module is further configured to obtain an original target domain image including the target object in the target field-of-view scene, and use the original target domain image, labeled with the identity information of the contained target object, as part of the second training sample.
- the generative adversarial network includes a generative network and a recognition network;
- the training module includes:
- a generative network training unit is configured to obtain source domain images in other field-of-view scenes and input the source domain images into the generative network for training to obtain corresponding output images; wherein the source domain images and the corresponding output images correspond to different scene styles;
- a recognition network training unit is configured to obtain a target domain image in the target field-of-view scene and a scene label corresponding to the target domain image, and to input the output image, the target domain image, and the scene label corresponding to the target domain image into the recognition network for training, determining the scene recognition results of the output image and the target domain image;
- a convergence unit is configured to obtain the trained generative adversarial network by alternately and iteratively training the generative network and the recognition network until the set loss function meets the convergence condition.
- In some embodiments, the device further includes a posture correction module configured to perform posture correction on the object to be recognized in the image to be recognized.
- the posture correction module includes:
- a spatial transformation network training unit is configured to obtain a target domain image training set in the target field-of-view scene; generate an affine-transformed image sample set from the target domain image training set based on the affine transformation parameters and pair the generated image samples with the original image samples to obtain paired samples; and train a spatial transformation network based on the paired samples and the affine transformation parameters until the corresponding loss function satisfies the convergence condition, obtaining a trained spatial transformation network for pedestrian pose alignment;
- the correction unit is configured to perform posture correction on the object to be recognized in the image to be recognized based on the trained spatial transformation network.
- an embodiment of the present disclosure also provides a pedestrian re-identification device, including: a processor and a memory for storing a computer program that can run on the processor;
- wherein the processor, when running the computer program, implements the pedestrian re-identification method according to any embodiment of the present disclosure.
- An embodiment of the present disclosure further provides a computer storage medium in which a computer program is stored, wherein the computer program, when executed by a processor, implements the pedestrian re-identification method according to any embodiment of the present disclosure.
- In the embodiments of the present disclosure, the method obtains an image to be recognized in a target field-of-view scene, the image to be recognized including an object to be recognized, and performs feature extraction and matching on the image to be recognized based on the trained neural network for pedestrian re-identification to obtain a recognition result corresponding to the object to be recognized.
- Here, the training samples of the neural network for pedestrian re-identification include target domain images obtained by converting source domain images in other field-of-view scenes to the target field-of-view scene, together with the identity information of the objects contained in the target domain images, and the trained neural network determines the recognition result of the object to be recognized in the image to be recognized.
- In this way, the pedestrian dataset expansion problem in cross-domain model generalization is solved, so that the re-identification neural network has more robust feature learning ability for different scenes, can be better applied to new application scenarios, improves recognition accuracy, and effectively reduces missed and false identification.
- FIG. 1 is a schematic diagram of a pedestrian re-identification scene in an unmanned store provided by an embodiment of the disclosure
- FIG. 2 is a schematic flowchart of a pedestrian re-identification method provided by an embodiment of the present disclosure
- FIG. 3 is a schematic flowchart of a pedestrian re-identification method according to another embodiment of the present disclosure.
- FIG. 4 is a schematic flowchart of a pedestrian re-identification method according to another embodiment of the present disclosure.
- FIG. 5a is an effect diagram before image conversion between a source domain and a target domain provided by an embodiment of the disclosure
- FIG. 5b is an effect diagram after image conversion between the source domain and the target domain provided by an embodiment of the disclosure.
- FIG. 6 is a schematic flowchart of a pedestrian re-identification method provided by another embodiment of the present disclosure.
- FIG. 7 is a schematic flowchart of a pedestrian re-identification method provided by another embodiment of the present disclosure.
- FIG. 8 is a schematic flowchart of a pedestrian re-identification method provided by another embodiment of the present disclosure.
- FIG. 9a is an effect diagram of an image to be recognized before posture correction according to an embodiment of the disclosure.
- FIG. 9b is an effect diagram after posture correction of an image to be recognized provided by an embodiment of the present disclosure.
- FIG. 10 is a schematic flowchart of a pedestrian re-identification method according to another embodiment of the present disclosure.
- FIG. 11 is a schematic structural diagram of a pedestrian re-identification device provided by an embodiment of the present disclosure.
- FIG. 12 is a schematic structural diagram of a pedestrian re-identification device provided by another embodiment of the present disclosure.
- It should be noted that the terms "comprising", "including" or any other variants thereof are intended to cover non-exclusive inclusion, so that a method or device including a series of elements includes not only the explicitly recorded elements but also other elements not explicitly listed, or elements inherent to the implementation of the method or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other related elements in the method or device including that element (for example, steps in the method or units in the device, where a unit may be, for example, part of a circuit, part of a processor, part of a program or software, etc.).
- For example, the pedestrian re-identification method provided by the embodiments of the present disclosure includes a series of steps, but is not limited to the recorded steps;
- likewise, the pedestrian re-identification device provided by the embodiments of the present disclosure includes a series of modules, but is not limited to the explicitly recorded modules, and may also include modules that need to be provided for acquiring relevant information or for processing based on such information.
- FIG. 1 is a schematic diagram of a pedestrian re-identification scene in an unmanned store provided by an embodiment of the present disclosure.
- the unmanned store includes multiple cameras and a processing device connected to the cameras, for example, camera A1, camera A2, and camera A3, each camera being connected to processing device B. Each camera is set in a different corner of the unmanned store, and the different corners may have different light intensities, shooting angles, and so on.
- Whenever a shopper enters the unmanned store for the first time, the cameras collect the shopper's image data and the processing device assigns an identity to the shopper, so that each shopper entering the unmanned store has a unique identity; here, this can be done by obtaining the shopper's face image and determining the shopper's identity through an ID.
- the camera set at the entrance of the unmanned store is A1.
- When shopper X enters the unmanned store, the processing device collects shopper X's image data and correspondingly defines or obtains an ID.
- When shopper X moves from camera A1 into the shooting range of camera A2, by applying the pedestrian re-identification method of the present disclosure the processor can quickly and accurately identify shopper X in camera A2, realizing re-identification of the shopper and meeting the needs of automatic tracking, shopper information collection, and automatic settlement in the unmanned-store shopping process. For example, a shopper makes purchases after entering the store, and multiple cameras are used to determine the user ID so that different users have different IDs for tracking; when the shopper walks out of the unmanned store, automatic checkout is completed according to the user ID.
- an embodiment of the present disclosure provides a pedestrian re-identification method.
- FIG. 2 is a schematic flowchart of a pedestrian re-identification method provided by an embodiment of the present disclosure. The method includes:
- Step 11 Obtain an image to be recognized in the target field of view scene, where the image to be recognized includes the object to be recognized;
- the object to be recognized is a target object that needs to be recognized.
- the object to be recognized refers to a person, with features such as face, posture, and clothing, and may be a shopper within the scene range of a certain camera in an unmanned store with multiple cameras, for example, shopper A, shopper B, and shopper C;
- the target field-of-view scene corresponds to an image acquisition device, such as a camera; the field-of-view scene is usually related to the installation position of the camera, and different cameras correspond to specific light intensities, shooting angles, shooting ranges, and so on.
- the image to be recognized may be an image obtained by intercepting a frame sequence from a video captured by a camera and performing image data fusion processing on multiple frames in the frame sequence; or it may be a photo containing the object to be recognized directly captured by a different shooting device.
- Step 12 Perform feature extraction and matching on the image to be recognized based on the trained neural network for pedestrian re-identification to obtain a recognition result corresponding to the object to be recognized; wherein the training samples of the neural network include target domain images obtained by converting source domain images in other field-of-view scenes to the target field-of-view scene and the identity information of the objects contained in the target domain images.
- the pedestrian re-identification neural network may be a neural network model pre-trained on a known image dataset, for example, a BP neural network model, a convolutional neural network model, or a variant of the aforementioned neural network models.
- Before feature extraction and matching, certain preprocessing can be performed on the image to be recognized and on the images used for training; the preprocessed training image data are input into the neural network for training to obtain the neural network model.
- the following will take a convolutional neural network as an example:
- FIG. 3 is a schematic flow chart of a pedestrian re-identification method provided by another embodiment of the present disclosure.
- in step 12, performing feature extraction and matching on the image to be recognized based on the trained neural network for pedestrian re-identification may include:
- Step 21 Preprocess the image sample set used for training the pedestrian re-identification neural network, wherein the image sample set contains multiple pictures of the object to be recognized and the corresponding identity information.
- the preprocessing may be to normalize the image samples in the sample set into images of size 100*100*20 and to convert the images in the image sample set to grayscale.
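As an illustration of this preprocessing step, the following is a minimal Python sketch, assuming each sample is a 20-frame clip that is converted to grayscale and resized to 100*100 as stated above; the OpenCV-based implementation, the [0, 1] scaling, and the frame layout are assumptions, not part of the original description.

```python
# A minimal sketch of the preprocessing described above, assuming the
# samples are 20-frame clips resized to 100x100 and converted to grayscale.
import cv2
import numpy as np

def preprocess_clip(frames):
    """Normalize a list of BGR frames to a 100x100x20 grayscale volume."""
    processed = []
    for frame in frames[:20]:                      # keep 20 frames per sample
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        gray = cv2.resize(gray, (100, 100))        # normalize spatial size
        processed.append(gray.astype(np.float32) / 255.0)  # scale to [0, 1]
    return np.stack(processed, axis=-1)            # shape: (100, 100, 20)
```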
- Step 22 Input the preprocessed samples into the initial convolutional neural network for iterative training until convergence.
- In an optional embodiment, the initial convolutional neural network may sequentially include the following layers: convolutional layer C1, pooling layer S1, convolutional layer C2, pooling layer S2, convolutional layer C3, and pooling layer S3.
- Convolutional layer C1: 32 different convolution kernels of size 3*3*3 are selected, with a stride of 1 and zero padding at the margins. After the input image passes through this convolutional layer, the output image size of layer C1 is 100*100*20, and the total number of variables is 32*100*100*20;
- Pooling layer S1: this layer uses max pooling with a pooling size of 2*2*2, a stride of 2, and zero padding at the margins; the output image size of layer S1 is 50*50*10, and the total number of variables is 32*50*50*10;
- Convolutional layer C2: this layer selects 64 different convolution kernels; the kernel size, stride, and padding are the same as those of layer C1. The output image size of layer C2 is 50*50*10, and the total number of variables is 64*50*50*10;
- Pooling layer S2: the basic settings of this layer are exactly the same as those of layer S1. After layer S2, the output image size is 25*25*5, and the total number of variables is 64*25*25*5;
- Convolutional layer C3: the number of convolution kernels in this layer is set to 128; the kernel size, stride, and padding are the same as those of layers C1 and C2. The resulting feature map size is 25*25*5, and the total number of variables is 128*25*25*5;
- Pooling layer S3: the basic settings of this layer are exactly the same as those of layers S1 and S2. After layer S3, the output image size is 13*13*3, and the total number of variables is 128*13*13*3.
- Here, an activation layer is provided after each of the above convolutional layers; the activation layer includes a ReLU activation function used to add nonlinearity before the pooling operation is performed. After the above layers, a fully connected layer with 1024 neurons is used to obtain higher-level features; at this layer, a sparsity regularization term is added to the loss function to improve the model's generalization ability for the specific problem.
- Finally, the feature vector output by the fully connected layer is passed into the Softmax layer, and the network model is iteratively trained with the goal of minimizing the cross-entropy loss function; the Softmax operation assigns probabilities to the data categories to obtain the classification result and realize classification matching.
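The following is a minimal PyTorch sketch of the C1-S3 architecture described above. The 3*3*3 kernels and 2*2*2 pooling suggest 3D convolutions over the 100*100*20 input volume; padding and ceil-mode pooling are chosen here so that the stated output sizes (100*100*20 down to 13*13*3) hold, and the number of identity classes is a placeholder assumption.

```python
# A minimal sketch of the described C1-S3 network; channel counts (32/64/128),
# the 1024-unit fully connected layer, and the Softmax classifier follow the
# text, while padding/ceil_mode and num_classes are assumptions.
import torch
import torch.nn as nn

class ReIdCNN(nn.Module):
    def __init__(self, num_classes=751):            # hypothetical class count
        super().__init__()
        def block(c_in, c_out):
            return nn.Sequential(
                nn.Conv3d(c_in, c_out, kernel_size=3, stride=1, padding=1),
                nn.ReLU(inplace=True),               # activation before pooling
                nn.MaxPool3d(kernel_size=2, stride=2, ceil_mode=True),
            )
        self.features = nn.Sequential(
            block(1, 32),    # C1 + S1: 100x100x20 -> 50x50x10
            block(32, 64),   # C2 + S2: 50x50x10  -> 25x25x5
            block(64, 128),  # C3 + S3: 25x25x5   -> 13x13x3
        )
        self.fc = nn.Linear(128 * 3 * 13 * 13, 1024)  # higher-level features
        self.classifier = nn.Linear(1024, num_classes)

    def forward(self, x):                            # x: (N, 1, 20, 100, 100)
        x = self.features(x)
        x = torch.flatten(x, 1)
        x = torch.relu(self.fc(x))
        return self.classifier(x)                    # train with CrossEntropyLoss
```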
- Step 23 Preprocess the image to be recognized, and input the preprocessed image to be recognized into the trained pedestrian re-identification neural network to obtain a recognition result corresponding to the object to be recognized.
- Here, the training samples of the neural network include target domain images obtained by converting source domain images in other field-of-view scenes to the target field-of-view scene, that is, converting the source domain images into target domain images that conform to scene factors of the target field-of-view scene such as illumination and sharpness, which can increase the amount of training samples and reduce the workload of manual annotation.
- the above conversion can be performed based on an overall framework of cross-domain adaptive data augmentation with a generative adversarial network, which can be used for sample data augmentation in the training phase of the pedestrian re-identification network and for data preprocessing in the testing phase, see Figure 4.
- In FIG. 4, A is the source domain image and B is the target domain image.
- Here, images in the source domain scene can be converted by the generative adversarial network into images in the target domain B scene; conversely, B-domain images can be converted by the generative adversarial network into A-domain images.
- That is, the pedestrian re-identification process includes step 31: training the generative adversarial network to obtain A-to-B and B-to-A image conversion models, and using the A-to-B and B-to-A image conversion as preprocessing before pedestrian re-identification, so that the converted source domain images tend toward the scene style of the target domain images.
- An advantage of this network is that the generated B-domain images can be used as training samples for the B-domain pedestrian re-identification network, so that the trained B-domain re-identification model has better scene generalization, not only augmenting the B-domain sample data but also handling pedestrian re-identification in the current B-domain scene.
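As a usage sketch of this preprocessing step, the snippet below applies a trained A-to-B generator to source-domain images; G_ab is a placeholder for whatever generator architecture the trained generative adversarial network provides.

```python
# A small usage sketch of the A->B conversion step described above, assuming
# G_ab is a trained generator module and imgs_a is a batch of A-domain
# (source) images already converted to tensors.
import torch

@torch.no_grad()
def to_target_style(G_ab, imgs_a):
    """Convert source-domain images to the target-domain scene style."""
    G_ab.eval()
    return G_ab(imgs_a)   # usable as B-domain training samples or as
                          # preprocessing before re-identification
```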
- FIGS. 5a and 5b show a comparison of the effect before and after image conversion between the source domain and the target domain provided by an embodiment of the present disclosure.
- In the embodiments of the present disclosure, the method obtains an image to be recognized in the target field-of-view scene, the image to be recognized including an object to be recognized, and performs feature extraction and matching on the image to be recognized with the trained neural network for pedestrian re-identification to obtain a recognition result corresponding to the object to be recognized.
- Here, the training samples of the neural network used for pedestrian re-identification include target domain images obtained by converting source domain images in other field-of-view scenes to the target field-of-view scene and the identity information of the objects contained in the target domain images, and the trained neural network determines the recognition result of the object to be recognized in the image to be recognized.
- FIG. 6 is a schematic flowchart of a pedestrian re-recognition method provided by another embodiment of the present disclosure.
- as an embodiment, before step 12, performing feature extraction and matching on the image to be recognized based on the trained neural network for pedestrian re-identification, the method further includes:
- Step 41 Obtain a first training sample, where the first training sample includes a source domain image of the target object in another field-of-view scene;
- Here, each camera may correspond to one field-of-view scene. For example, three cameras correspond to three field-of-view scenes A, B, and C.
- If C is the target field of view, A and B are other fields of view: the images collected in the C scene are target domain images, and the images collected in the A and B scenes are source domain images.
- Similarly, if B is the target field of view, A and C are other fields of view: the images collected in the B scene are target domain images, and the images collected in the A and C scenes are source domain images.
- That is, the images in the other field-of-view scenes correspond to the source domain images.
- Step 42 Input the first training sample into the trained generative adversarial network to perform style conversion to obtain the target domain image in the target field-of-view scene;
- here, the style may mean that the pictures collected in different fields of view have different light intensities, postures, viewing angles, and the like.
- Step 43 Form a second training sample according to the target domain image labeled with the identity information of the contained target object;
- Here, the second training sample is obtained by converting the first training sample; since each sample picture in the first training sample carries an identity information label, the corresponding identity information can be used to label the converted sample picture.
- Step 44 Input the second training sample into the neural network model for iterative training until the loss function of the neural network model meets the convergence condition, and obtain the trained neural network for pedestrian re-identification.
- the samples for training the neural network model for iterative training may include not only the second training samples, but also samples obtained in the target domain scenario.
- In some embodiments, before step 44, inputting the second training sample into the neural network model for iterative training until the loss function of the neural network model satisfies the convergence condition, the method further includes:
- obtaining the original target domain image, which may be an image containing the target object collected after the target object was recognized when it earlier entered the target field of view. Using the original target domain image in the target field-of-view scene, labeled with the identity information of the target object, as part of the second training sample can increase the number of samples and augment them, so that the trained pedestrian re-identification network has better generalization ability, improved recognition accuracy, and good recognition results.
- As an embodiment, before step 42, inputting the first training sample into the trained generative adversarial network for style conversion to obtain the target domain image in the target field-of-view scene, the method includes:
- Step 51 Obtain source domain images in other view domain scenes
- Step 52 Input the source domain image into the generative network for training to obtain a corresponding output image; wherein the source domain image and the corresponding output image respectively correspond to different scene styles;
- Step 53 Obtain a target domain image in the target field-of-view scene and a scene label corresponding to the target domain image;
- Step 54 Input the output image, the target domain image, and the scene label corresponding to the target domain image into the recognition network for training, determine the scene recognition results of the output image and the target domain image, and alternately and iteratively train the generative network and the recognition network until the set loss function meets the convergence condition, obtaining the trained generative adversarial network.
- Here, the generative adversarial network (GAN) includes a generative model and a discriminative model.
- the generative model may also be called a generator or a generative network, and may be denoted G;
- the discriminative model may also be called a discriminator or a discriminative network, and may be denoted D.
- G can receive a random noise vector z, and generate data (such as an image) from this noise, denoted as G(z).
- In the embodiments of the present disclosure, the noise corresponds to the feature vector of the source image collected in the source domain.
- D can receive G(z) or a real image and determine the probability that the received image is a real image.
- the output of D can be represented as D(x), whose value lies in the range 0 to 1;
- here, a real image is a target domain image collected in the target field-of-view scene.
- Both G and D can be trained at the same time.
- the goal of G is to generate images as close to real images as possible so as to deceive D, while the goal of D is to distinguish the images generated by G from real ones as far as possible.
- Thus, G and D form a dynamic game process:
- D tries to minimize the discrimination error,
- while G tries to maximize it. Both goals can be achieved through the backpropagation method.
- in the embodiments of the present disclosure, the generative adversarial network can convert source domain images in other field-of-view scenes into target domain images that conform to the target field-of-view scene.
- Let $P_r$ and $P_g$ respectively denote the distribution of real images and the distribution of images generated by G, where a real image is an image collected in the target field-of-view scene and a generated image is the corresponding output obtained by feeding a source domain image into the generative network for training; the objective function of D can then be expressed in the standard GAN form:
- $\max_D \; \mathbb{E}_{x \sim P_r}[\log D(x)] + \mathbb{E}_{\tilde{x} \sim P_g}[\log(1 - D(\tilde{x}))]$
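A minimal sketch of this alternating training in PyTorch follows, assuming a generator G and a discriminator D whose output is a probability in (0, 1); the module internals and optimizers are placeholders, and binary cross-entropy instantiates the objective above.

```python
# One alternating G/D update as described above; G, D, and the optimizers
# are assumed to be supplied by the caller.
import torch
import torch.nn.functional as F

def train_step(G, D, opt_G, opt_D, source_imgs, target_imgs):
    """First update D on real vs. generated images, then update G."""
    fake_imgs = G(source_imgs)                       # style-converted images

    # --- Discriminator step: minimize its discrimination error ---
    opt_D.zero_grad()
    d_real = D(target_imgs)                          # images from target scene
    d_fake = D(fake_imgs.detach())                   # block gradients into G
    loss_D = F.binary_cross_entropy(d_real, torch.ones_like(d_real)) + \
             F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
    loss_D.backward()
    opt_D.step()

    # --- Generator step: fool D into scoring fakes as real ---
    opt_G.zero_grad()
    d_fake = D(fake_imgs)
    loss_G = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()
```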
- In some embodiments, before step 12, performing feature extraction and matching on the image to be recognized based on the trained neural network for pedestrian re-identification, the method further includes: performing posture correction on the object to be recognized in the image to be recognized.
- FIG. 8 is a schematic flowchart of a pedestrian re-recognition method according to another embodiment of the present disclosure.
- the posture correction of the object to be recognized in the image to be recognized includes:
- Step 61 Obtain a target domain image training set in the target field-of-view scene;
- Step 62 Generate an affine transformed image sample set from the target domain image training set based on the affine transformation parameters and pair the generated image samples with the original image samples to obtain paired samples;
- Here, the affine-transformed image sample set is generated from the target domain image training set by transformations including translation, rotation, scaling, shearing, and so on.
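The paired-sample generation of steps 61 and 62 might look like the following sketch, where an image is transformed with randomly sampled affine parameters and paired with its original; the sampling ranges for rotation, scale, and translation are illustrative assumptions.

```python
# A minimal sketch of paired-sample generation, assuming grayscale numpy
# images; the parameter ranges below are illustrative, not from the text.
import cv2
import numpy as np

def make_pair(img, rng=np.random):
    """Apply a random affine transform; return (original, transformed, theta_gt)."""
    h, w = img.shape[:2]
    angle = rng.uniform(-30, 30)                       # rotation in degrees
    scale = rng.uniform(0.8, 1.2)                      # isotropic scaling
    tx, ty = rng.uniform(-0.1, 0.1, size=2) * (w, h)   # translation
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, scale)
    M[:, 2] += (tx, ty)
    warped = cv2.warpAffine(img, M, (w, h))
    return img, warped, M.astype(np.float32)           # M is the 2x3 ground truth
```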
- Step 63 Train a spatial transformation network based on the paired samples and the affine transformation parameters until the corresponding loss function satisfies the convergence condition to obtain a trained spatial transformation network for pedestrian pose alignment;
- the spatial transformation network specifically includes a localization network, a grid generator, and a pixel sampler, where the localization network includes a convolutional layer, a pooling layer, a nonlinear activation unit layer, a fully connected layer, and a regression layer.
- the spatial transformation network is trained based on the paired samples and the affine transformation parameters: supervised training on the affine transformation parameters is implemented by computing the average mean square error between the network's regressed values and the ground-truth label values and performing backward gradient propagation, while supervised training on the paired samples is implemented by computing the mean square error between the average pixel values of the target domain training-set samples and the transformed samples and performing backward gradient propagation;
- the loss function is composed of two parts, namely the paired-sample loss and the transformation-parameter loss; the mathematical formula is as follows:
- $L = \mathrm{MSE}(I_{in}, I_{out}) + \mathrm{MSE}(\theta_{evl}, \theta_{gt})$
- where $I_{in}$ and $I_{out}$ respectively represent the input transformed image and the transformed image output by the network,
- $\theta_{evl}$ and $\theta_{gt}$ respectively represent the affine transformation parameters regressed by the deep spatial transformation network and the true transformation parameters,
- and MSE represents the average mean square error.
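A minimal PyTorch sketch of such a spatial transformation network and its two-part loss follows; the localization-network layout is a placeholder, while the grid generator and sampler use affine_grid and grid_sample, and the loss combines the paired-sample MSE and the transformation-parameter MSE from the formula above.

```python
# A sketch of an STN for single-channel 100x100 inputs; layer sizes in the
# localization network are assumptions, only the overall structure
# (localization net -> grid generator -> sampler) follows the text.
import torch
import torch.nn as nn
import torch.nn.functional as F

class STN(nn.Module):
    def __init__(self):
        super().__init__()
        self.localization = nn.Sequential(           # regresses 6 affine params
            nn.Conv2d(1, 8, kernel_size=7), nn.MaxPool2d(2), nn.ReLU(),
            nn.Conv2d(8, 10, kernel_size=5), nn.MaxPool2d(2), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(10 * 21 * 21, 32), nn.ReLU(),
            nn.Linear(32, 6),                        # regression layer
        )

    def forward(self, x):
        theta = self.localization(x).view(-1, 2, 3)  # affine parameters
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False), theta

def stn_loss(i_out, i_in, theta_evl, theta_gt):
    """Paired-sample loss plus transformation-parameter loss."""
    return F.mse_loss(i_out, i_in) + F.mse_loss(theta_evl, theta_gt)
```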
- Step 64 Perform posture correction on the object to be recognized in the image to be recognized based on the trained spatial transformation network.
- In the embodiments of the present disclosure, a spatial transformation network is used to perform posture correction on the object to be recognized in the image to be recognized, which prevents the uncertainty of pedestrian posture from causing uncertainty in recognition and leading to missed detection and misidentification by the pedestrian re-identification model.
- FIGS. 9a and 9b show a comparison of the effect before and after posture correction of the image to be recognized according to an embodiment of the present disclosure.
- FIG. 10 is a schematic flowchart of a pedestrian re-identification method provided by another embodiment of the present disclosure.
- the pedestrian re-identification method includes the following steps:
- Step S1 Obtain source domain images in other field-of-view scenes; input the source domain images into the generative adversarial network for training, and obtain a trained generative adversarial network;
- Here, the source domain image is input into the generative adversarial network, and the corresponding output image is obtained through the generative network, where the source domain image and the corresponding output image correspond to different scene styles; a target domain image in the target field-of-view scene and the scene label corresponding to the target domain image are obtained; the output image, the target domain image, and the scene label corresponding to the target domain image are input into the recognition network for training, the scene recognition results of the output image and the target domain image are determined, and the generative network and the recognition network are alternately and iteratively trained until the set loss function meets the convergence condition, obtaining the trained generative adversarial network;
- Step S2 Obtain a first training sample, and input the first training sample into the trained generative adversarial network to perform style conversion to obtain a target domain image in the target field-of-view scene; the neural network model is then trained to obtain the trained neural network for pedestrian re-identification;
- here, the first training sample includes the source domain images of the target object in other field-of-view scenes;
- a second training sample is formed from the target domain image labeled with the identity information of the contained target object;
- the second training sample is input into the initial neural network model for iterative training until the loss function of the neural network model satisfies the convergence condition, obtaining the trained neural network for pedestrian re-identification;
- Step S3 Obtain a target domain image training set in the target field-of-view scene; generate an affine-transformed image sample set from the target domain image training set based on the affine transformation parameters and pair the generated image samples with the original image samples to obtain paired samples; train a spatial transformation network based on the paired samples and the affine transformation parameters until the corresponding loss function meets the convergence condition, obtaining a trained spatial transformation network for pedestrian pose alignment;
- Step S4 Obtain an image to be recognized in the scene of the target field of view, where the image to be recognized includes the object to be recognized;
- Step S5 performing posture correction on the object to be recognized in the image to be recognized based on the trained spatial transformation network
- Step S6 Perform feature extraction and matching on the image to be recognized based on the trained neural network for pedestrian re-identification; wherein the training samples of the neural network include target domain images obtained by converting source domain images in other field-of-view scenes to the target field-of-view scene and the identity information of the objects contained in the target domain images;
- Step S7 Obtain the recognition result corresponding to the object to be recognized, and determine the ID of the object to be recognized.
- The embodiments of the present disclosure abandon the traditional, non-deep-learning strategy of combining hand-crafted features with separate feature-matching steps for images from different scenarios, and instead use a deep neural network to learn pedestrian feature extraction and feature matching end to end, giving more robust feature learning ability across different scenarios.
- Compared with non-deep-learning methods, the deep-learning-based pedestrian re-identification method has improved algorithm accuracy and faster running speed.
- The training samples of the neural network used for pedestrian re-identification in the present disclosure include target domain images obtained by converting source domain images in other field-of-view scenes to the target field-of-view scene and the identity information of the objects contained in the target domain images, and the trained neural network determines the recognition result of the object to be recognized in the image to be recognized.
- In this way, the pedestrian dataset expansion problem in the cross-domain model generalization of the pedestrian re-identification neural network is solved, so that the re-identification neural network has more robust feature learning ability for different scenes, can be better applied to new application scenarios, improves recognition accuracy, and effectively reduces missed and false identification.
- In addition, the present disclosure uses the spatial transformation neural network to perform posture correction on the image to be recognized, which prevents the uncertainty of pedestrian posture from causing uncertainty in recognition and leading to missed and false identification by the pedestrian re-identification model.
- an embodiment of the present disclosure provides a pedestrian re-identification device.
- FIG. 11 is a schematic structural diagram of a pedestrian re-identification device provided by an embodiment of the present disclosure.
- the pedestrian re-identification device includes an acquisition module 71 and a processing module 72, wherein:
- the obtaining module 71 is configured to obtain an image to be recognized in a scene of the target field of view, where the image to be recognized includes an object to be recognized;
- the processing module 72 is configured to perform feature extraction and matching on the image to be recognized based on the trained neural network for pedestrian re-identification to obtain a recognition result corresponding to the object to be recognized; wherein the training samples of the neural network include target domain images obtained by converting source domain images in other field-of-view scenes to the target field-of-view scene and the categories of the objects included in the target domain images.
- In some embodiments, the device further includes a training module 73 configured to obtain a first training sample, the first training sample including source domain images of the target object in other field-of-view scenes; input the first training sample into the trained generative adversarial network for style conversion to obtain the target domain image in the target field-of-view scene; form a second training sample from the target domain image labeled with the identity information of the contained target object; and input the second training sample into the neural network model for iterative training until the loss function of the neural network model satisfies the convergence condition, obtaining the trained neural network for pedestrian re-identification.
- the training module 73 is further configured to obtain an original target domain image including the target object in the target field-of-view scene, and use the original target domain image, labeled with the identity information of the contained target object, as part of the second training sample.
- the generative adversarial network includes a generative network and a recognition network;
- the training module includes:
- the generative network training unit 74 is configured to obtain source domain images in other field-of-view scenes and input the source domain images into the generative network for training to obtain corresponding output images; wherein the source domain images and the corresponding output images correspond to different scene styles;
- the recognition network training unit 75 is configured to obtain a target domain image in the target field-of-view scene and a scene label corresponding to the target domain image, and to input the output image, the target domain image, and the scene label corresponding to the target domain image into the recognition network for training, determining the scene recognition results of the output image and the target domain image;
- the convergence unit 76 is configured to obtain the trained generative adversarial network by alternately and iteratively training the generative network and the recognition network until the set loss function meets the convergence condition.
- In some embodiments, a posture correction module 77 is further included, configured to perform posture correction on the object to be recognized in the image to be recognized.
- the posture correction module 77 includes:
- the spatial transformation network training unit 78 is configured to obtain a target domain image training set in the target field-of-view scene; generate an affine-transformed image sample set from the target domain image training set based on affine transformation parameters and pair the generated image samples with the original image samples to obtain paired samples; and train a spatial transformation network based on the paired samples and the affine transformation parameters until the corresponding loss function meets the convergence condition, obtaining a trained spatial transformation network for pedestrian pose alignment;
- the correction unit 79 is configured to perform posture correction on the object to be recognized in the image to be recognized based on the trained spatial transformation network.
- an embodiment of the present disclosure provides a pedestrian re-identification device.
- FIG. 12 is a schematic structural diagram of a pedestrian re-identification device provided by another embodiment of the present disclosure.
- the pedestrian re-identification device includes: a processor 82 and a memory 81 for storing computer programs that can run on the processor 82;
- wherein, when the processor 82 runs the computer program, the pedestrian re-identification method is implemented, in which the training samples of the neural network include target domain images obtained by converting source domain images in other field-of-view scenes to the target field-of-view scene and the identity information of the objects contained in the target domain images.
- When the processor 82 executes the computer program, it is also used to implement:
- inputting the second training sample into the neural network model for iterative training until the loss function of the neural network model satisfies the convergence condition, obtaining the trained neural network for pedestrian re-identification.
- When the processor 82 executes the computer program, it is also used to implement: obtaining an original target domain image including the target object in the target field-of-view scene, and using the original target domain image, labeled with the identity information of the contained target object, as part of the second training sample.
- When the processor 82 executes the computer program, it is also used to implement:
- alternately and iteratively training the generative network and the recognition network until the set loss function meets the convergence condition, obtaining the trained generative adversarial network.
- When the processor 82 executes the computer program, it is also used to implement: performing posture correction on the object to be recognized in the image to be recognized.
- When the processor 82 executes the computer program, it is also used to implement: obtaining a target domain image training set in the target field-of-view scene;
- embodiments of the present disclosure provide a computer storage medium, for example, including a memory storing a computer program.
- the computer program can be executed by a processor in the above-mentioned apparatus to complete the steps described in the foregoing method.
- the computer storage medium can be FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, an optical disc, or CD-ROM, etc.; it can also be any of various devices including one of, or any combination of, the above memories, such as mobile phones, computers, tablet devices, personal digital assistants, etc.
- a computer program is stored in the computer storage medium.
- When the computer program is executed by the processor 82, the following steps are implemented:
- the pedestrian re-identification method, in which the training samples of the neural network include target domain images obtained by converting source domain images in other field-of-view scenes to the target field-of-view scene and the identity information of the objects contained in the target domain images.
- When the processor 82 executes the computer program, it is also used to implement:
- inputting the second training sample into the neural network model for iterative training until the loss function of the neural network model satisfies the convergence condition, obtaining the trained neural network for pedestrian re-identification.
- When the processor 82 executes the computer program, it is also used to implement: obtaining an original target domain image including the target object in the target field-of-view scene, and using the original target domain image, labeled with the identity information of the contained target object, as part of the second training sample.
- When the processor 82 executes the computer program, it is also used to implement:
- alternately and iteratively training the generative network and the recognition network until the set loss function meets the convergence condition, obtaining the trained generative adversarial network.
- When the processor 82 executes the computer program, it is also used to implement: performing posture correction on the object to be recognized in the image to be recognized.
- When the processor 82 executes the computer program, it is also used to implement: obtaining a target domain image training set in the target field-of-view scene;
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
A pedestrian re-identification method, the method including: obtaining an image to be recognized in a target field-of-view scene, the image to be recognized including an object to be recognized (11); performing feature extraction and matching on the image to be recognized based on a trained neural network for pedestrian re-identification to obtain a recognition result corresponding to the object to be recognized; wherein the training samples of the neural network include target domain images obtained by converting source domain images in other field-of-view scenes to the target field-of-view scene and the categories of the objects contained in the target domain images (12).
Description
This application is filed on the basis of, and claims priority to, Chinese patent application No. 201910213845.1 filed on March 20, 2019, the entire contents of which are incorporated herein by reference.
The present disclosure relates to, but is not limited to, the field of pedestrian re-identification, and in particular to a pedestrian re-identification method, device and storage medium.
Pedestrian re-identification has become a research hotspot in the field of computer vision. Pedestrian re-identification means that, given a person's identity (ID) in a multi-camera setting, the person's features are matched across the multiple cameras so as to accurately locate the person corresponding to that ID.
In pedestrian re-identification methods not based on deep learning, manual feature design is cumbersome and algorithm accuracy is low. Compared with such methods, deep-learning-based pedestrian re-identification methods offer improved accuracy and faster running speed, and perform well in specific scenarios (with controllable pedestrian flow); however, in complex scenarios (such as crowded places, train stations, and JD unmanned stores) their accuracy is limited, for example by cross-dataset pedestrian misidentification caused by different scenes (across cameras) and different pedestrian clothing (different seasons, different clothing styles).
It can thus be seen that current deep learning methods still lack cross-domain model generalization ability, that is, a network model trained on a specific scene cannot be applied well to new scenes, including the same person wearing different clothes in the same scene or the same clothes in different scenes; in complex scenes, the problems of missed and false identification by pedestrian re-identification models remain to be solved.
Summary of the Invention
The embodiments of the present disclosure provide a pedestrian re-identification method, device and storage medium with strong generalization ability and accurate identification.
The technical solutions of the embodiments of the present disclosure are implemented as follows:
In a first aspect, an embodiment of the present disclosure provides a pedestrian re-identification method, the method including:
obtaining an image to be recognized in a target field-of-view scene, the image to be recognized including an object to be recognized;
performing feature extraction and matching on the image to be recognized based on a trained neural network for pedestrian re-identification to obtain a recognition result corresponding to the object to be recognized; wherein the training samples of the neural network include target domain images obtained by converting source domain images in other field-of-view scenes to the target field-of-view scene and the identity information of the objects contained in the target domain images.
In some embodiments, before performing feature extraction and matching on the image to be recognized based on the trained neural network for pedestrian re-identification, the method further includes:
obtaining a first training sample, the first training sample including source domain images of a target object in other field-of-view scenes;
inputting the first training sample into a trained generative adversarial network for style conversion to obtain a target domain image in the target field-of-view scene;
forming a second training sample from the target domain image labeled with the identity information of the contained target object;
inputting the second training sample into the neural network model for iterative training until the loss function of the neural network model satisfies a convergence condition, obtaining the trained neural network for pedestrian re-identification.
In some embodiments, before inputting the second training sample into the neural network model for iterative training until the loss function of the neural network model satisfies the convergence condition, the method further includes:
obtaining an original target domain image including the target object in the target field-of-view scene, and using the original target domain image, labeled with the identity information of the contained target object, as part of the second training sample.
In some embodiments, the generative adversarial network includes a generative network and a recognition network, and before inputting the first training sample into the trained generative adversarial network for style conversion to obtain the target domain image in the target field-of-view scene, the method includes:
obtaining source domain images in other field-of-view scenes;
inputting the source domain images into the generative network for training to obtain corresponding output images; wherein the source domain images and the corresponding output images correspond to different scene styles;
obtaining a target domain image in the target field-of-view scene and a scene label corresponding to the target domain image;
inputting the output image, the target domain image, and the scene label corresponding to the target domain image into the recognition network for training, determining scene recognition results of the output image and the target domain image, and alternately and iteratively training the generative network and the recognition network until the set loss function satisfies the convergence condition, obtaining the trained generative adversarial network.
In some embodiments, before performing feature extraction and matching on the image to be recognized based on the trained neural network for pedestrian re-identification, the method further includes:
performing posture correction on the object to be recognized in the image to be recognized.
In some embodiments, performing posture correction on the object to be recognized in the image to be recognized includes:
obtaining a target domain image training set in the target field-of-view scene;
generating an affine-transformed image sample set from the target domain image training set based on affine transformation parameters and pairing the generated image samples with the original image samples to obtain paired samples;
training a spatial transformation network based on the paired samples and the affine transformation parameters until the corresponding loss function satisfies a convergence condition, obtaining a trained spatial transformation network for pedestrian pose alignment;
performing posture correction on the object to be recognized in the image to be recognized based on the trained spatial transformation network.
In a second aspect, an embodiment of the present disclosure further provides a pedestrian re-identification device, including an acquisition module and a processing module, wherein:
the acquisition module is configured to obtain an image to be recognized in a target field-of-view scene, the image to be recognized including an object to be recognized;
the processing module is configured to perform feature extraction and matching on the image to be recognized based on a trained neural network for pedestrian re-identification to obtain a recognition result corresponding to the object to be recognized; wherein the training samples of the neural network include target domain images obtained by converting source domain images in other field-of-view scenes to the target field-of-view scene and the categories of the objects contained in the target domain images.
In some embodiments, the device further includes a training module configured to obtain a first training sample, the first training sample including source domain images of a target object in other field-of-view scenes; input the first training sample into a trained generative adversarial network for style conversion to obtain a target domain image in the target field-of-view scene; form a second training sample from the target domain image labeled with the identity information of the contained target object; and input the second training sample into the neural network model for iterative training until the loss function of the neural network model satisfies the convergence condition, obtaining the trained neural network for pedestrian re-identification.
In some embodiments, the training module is further configured to obtain an original target domain image including the target object in the target field-of-view scene, and use the original target domain image, labeled with the identity information of the contained target object, as part of the second training sample.
In some embodiments, the generative adversarial network includes a generative network and a recognition network, and the training module includes:
a generative network training unit configured to obtain source domain images in other field-of-view scenes and input the source domain images into the generative network for training to obtain corresponding output images, wherein the source domain images and the corresponding output images correspond to different scene styles;
a recognition network training unit configured to obtain a target domain image in the target field-of-view scene and a scene label corresponding to the target domain image, and input the output image, the target domain image, and the scene label corresponding to the target domain image into the recognition network for training, determining scene recognition results of the output image and the target domain image;
a convergence unit configured to obtain the trained generative adversarial network by alternately and iteratively training the generative network and the recognition network until the set loss function satisfies the convergence condition.
In some embodiments, the device further includes a posture correction module configured to perform posture correction on the object to be recognized in the image to be recognized.
In some embodiments, the posture correction module includes:
a spatial transformation network training unit configured to obtain a target domain image training set in the target field-of-view scene; generate an affine-transformed image sample set from the target domain image training set based on affine transformation parameters and pair the generated image samples with the original image samples to obtain paired samples; and train a spatial transformation network based on the paired samples and the affine transformation parameters until the corresponding loss function satisfies the convergence condition, obtaining a trained spatial transformation network for pedestrian pose alignment;
a correction unit configured to perform posture correction on the object to be recognized in the image to be recognized based on the trained spatial transformation network.
In a third aspect, an embodiment of the present disclosure further provides a pedestrian re-identification device, including: a processor and a memory for storing a computer program capable of running on the processor;
wherein the processor, when running the computer program, implements the pedestrian re-identification method according to any embodiment of the present disclosure.
In a fourth aspect, an embodiment of the present disclosure further provides a computer storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the pedestrian re-identification method according to any embodiment of the present disclosure.
In the embodiments of the present disclosure, the method obtains an image to be recognized in a target field-of-view scene, the image to be recognized including an object to be recognized, and performs feature extraction and matching on the image to be recognized based on a trained neural network for pedestrian re-identification to obtain a recognition result corresponding to the object to be recognized. Here, the training samples of the neural network for pedestrian re-identification include target domain images obtained by converting source domain images in other field-of-view scenes to the target field-of-view scene, together with the identity information of the objects contained in the target domain images, and the trained neural network determines the recognition result of the object to be recognized in the image to be recognized. In this way, the pedestrian dataset expansion problem in the cross-domain model generalization of the pedestrian re-identification neural network is solved, so that the re-identification neural network has more robust feature learning ability for different scenes, can be better applied to new application scenarios, improves recognition accuracy, and effectively reduces missed and false identification.
FIG. 1 is a schematic diagram of a pedestrian re-identification scene in an unmanned store provided by an embodiment of the present disclosure;
FIG. 2 is a schematic flowchart of a pedestrian re-identification method provided by an embodiment of the present disclosure;
FIG. 3 is a schematic flowchart of a pedestrian re-identification method provided by another embodiment of the present disclosure;
FIG. 4 is a schematic flowchart of a pedestrian re-identification method provided by another embodiment of the present disclosure;
FIG. 5a is an effect diagram before image conversion between the source domain and the target domain provided by an embodiment of the present disclosure;
FIG. 5b is an effect diagram after image conversion between the source domain and the target domain provided by an embodiment of the present disclosure;
FIG. 6 is a schematic flowchart of a pedestrian re-identification method provided by another embodiment of the present disclosure;
FIG. 7 is a schematic flowchart of a pedestrian re-identification method provided by another embodiment of the present disclosure;
FIG. 8 is a schematic flowchart of a pedestrian re-identification method provided by another embodiment of the present disclosure;
FIG. 9a is an effect diagram of an image to be recognized before posture correction provided by an embodiment of the present disclosure;
FIG. 9b is an effect diagram of an image to be recognized after posture correction provided by an embodiment of the present disclosure;
FIG. 10 is a schematic flowchart of a pedestrian re-identification method provided by another embodiment of the present disclosure;
FIG. 11 is a schematic structural diagram of a pedestrian re-identification device provided by an embodiment of the present disclosure;
FIG. 12 is a schematic structural diagram of a pedestrian re-identification device provided by another embodiment of the present disclosure.
The present disclosure is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the embodiments provided here are merely intended to explain the present disclosure and are not intended to limit it. In addition, the embodiments provided below are some, not all, of the embodiments for implementing the present disclosure; where no conflict arises, the technical solutions recorded in the embodiments of the present disclosure may be combined in any manner.
It should be noted that, in the embodiments of the present disclosure, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a method or device including a series of elements includes not only the elements explicitly recorded, but also other elements not explicitly listed, or elements inherent to implementing the method or device. In the absence of further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other related elements in the method or device including that element (for example, steps in the method or units in the device; a unit may be, for example, part of a circuit, part of a processor, part of a program or software, and so on).
For example, the pedestrian re-identification method provided by the embodiments of the present disclosure contains a series of steps, but is not limited to the recorded steps; likewise, the pedestrian re-identification device provided by the embodiments of the present disclosure includes a series of modules, but is not limited to the explicitly recorded modules, and may further include modules required for acquiring related information or for processing based on that information.
To facilitate an understanding of the implementation flow of the pedestrian re-identification method provided by the embodiments of the present disclosure, the application scenario of the present disclosure is illustrated below taking an unmanned-store pedestrian re-identification scene as an example:
Referring to FIG. 1, a schematic diagram of an unmanned-store pedestrian re-identification scene provided by an embodiment of the present disclosure, the unmanned store contains multiple cameras and a processing device connected to the cameras, for example camera A1, camera A2, and camera A3, each connected to processing device B. Each camera is installed in a different corner of the store, and the different corners may have different light intensities, shooting angles, and so on. Whenever a shopper first enters the store, the cameras capture image data of the shopper and the processing device assigns the shopper an identity, so that every shopper entering the store corresponds to a unique identity; here, the shopper's identity may be determined by acquiring the shopper's face image and assigning an ID. For example, with camera A1 installed at the store entrance, when shopper X enters the store, the processing device captures image data of shopper X and defines or acquires a corresponding ID. When shopper X moves from the coverage of camera A1 into the shooting range of camera A2, the processor, by applying the pedestrian re-identification method of the present disclosure, can quickly and accurately identify shopper X in camera A2, achieving shopper re-identification to satisfy needs such as automatic tracking, shopper information collection, and automatic checkout in the unmanned-store shopping flow. For example, after a shopper enters and shops, a user ID is determined across the multiple cameras so that different users have different IDs for tracking, and when the shopper leaves the store, automatic checkout is completed according to the user ID.
The embodiments of the present disclosure are described in detail below:
In a first aspect, an embodiment of the present disclosure provides a pedestrian re-identification method. Referring to FIG. 2, a schematic flowchart of a pedestrian re-identification method provided by an embodiment of the present disclosure, the method includes:
Step 11: acquiring an image to be identified in a target view scene, the image to be identified including an object to be identified;
Here, the object to be identified is the target object that needs to be identified. In the embodiments of the present disclosure, the object to be identified refers to a person, with features such as face, posture, and clothing, and may be a shopper within the scene range of a certain camera in an unmanned store with multiple cameras, for example shopper A, shopper B, and shopper C. The target view scene corresponds to one image capture device, for example a camera; the view scene is generally related to the installation position of the camera, and different cameras correspond to specific illumination intensities, shooting angles, shooting ranges, and so on. The image to be identified may be an image obtained by extracting a frame sequence from video captured by a camera and fusing multiple frames of the frame sequence; it may also be a photo containing the object to be identified captured directly by a different capture device.
Step 12: performing feature extraction and matching on the image to be identified based on a trained neural network for pedestrian re-identification, to obtain an identification result corresponding to the object to be identified; wherein the training samples of the neural network include target-domain images obtained by converting source-domain images from other view scenes into the target view scene, together with the identity information of the objects contained in the target-domain images.
Here, the pedestrian re-identification neural network may be a neural network model pre-trained on a known image data set, for example a BP neural network model, a convolutional neural network model, or a variant of such models.
Here, before feature extraction and matching, certain preprocessing may be applied to the image to be identified and to the images used for training, and the preprocessed training image data is input into the neural network for training to obtain the neural network model. The following takes a convolutional neural network as an example:
As one implementation, referring to FIG. 3, a schematic flowchart of a pedestrian re-identification method provided by a further embodiment of the present disclosure, step 12 of performing feature extraction and matching on the image to be identified based on the trained neural network for pedestrian re-identification may include:
Step 21: preprocessing the image sample set used for training the pedestrian re-identification neural network, wherein the image sample set contains multiple pictures of the object to be identified and the corresponding identity information. The preprocessing may be normalizing the image samples in the sample set to a size of 100*100*20 and converting the images in the sample set to grayscale.
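By way of illustration only, a minimal sketch of such preprocessing, assuming OpenCV and NumPy and reading "100*100*20" as a 20-frame grayscale volume of 100x100 images (an interpretation, not something the disclosure specifies), might be:

```python
import cv2
import numpy as np

def preprocess_clip(frames):
    """Normalize a clip to a 100x100x20 grayscale volume in [0, 1].

    `frames` is assumed to be a list of at least 20 BGR images from one
    camera; only the first 20 frames are kept.
    """
    gray = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in frames[:20]]
    resized = [cv2.resize(g, (100, 100)) for g in gray]
    volume = np.stack(resized, axis=-1).astype(np.float32)  # (100, 100, 20)
    return volume / 255.0  # scale pixel values to [0, 1]
```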
Step 22: inputting the preprocessed samples into an initial convolutional neural network for iterative training until convergence.
In an optional embodiment, the initial convolutional neural network may include, in order, the following layers: convolutional layer C1, pooling layer S1, convolutional layer C2, pooling layer S2, convolutional layer C3, and pooling layer S3.
Convolutional layer C1: 32 different convolution kernels of size 3*3*3 are selected, with stride 1 and zero padding. After the input image passes through this layer, the output of C1 has size 100*100*20, with a total of 32*100*100*20 variables;
Pooling layer S1: this layer uses max pooling with pooling size 2*2*2, stride 2, and zero padding; the output of S1 has size 50*50*10, with a total of 32*50*50*10 variables;
Convolutional layer C2: this layer selects 64 different convolution kernels, with the same kernel size, stride, and padding as C1; the output of C2 has size 50*50*10, with a total of 64*50*50*10 variables;
Pooling layer S2: the basic settings of this layer are identical to S1; after S2, the output has size 25*25*5, with a total of 64*25*25*5 variables;
Convolutional layer C3: the number of kernels in this layer is set to 128, with kernel size, stride, and padding the same as C1 and C2; the resulting feature map has size 25*25*5, with a total of 128*25*25*5 variables;
Pooling layer S3: the basic settings of this layer are identical to S1 and S2; after S3, the output has size 13*13*3, with a total of 128*13*13*3 variables.
Here, an activation layer with a ReLU activation function follows each of the above convolutional layers to add non-linearity before the pooling operation. After the above layers, a fully connected layer with 1024 neurons produces higher-level features; at this layer, a sparsity regularization term is added to the loss function to improve the model's generalization for the specific problem. Finally, the feature vector output by the fully connected layer is passed to a Softmax layer, and the network model is iteratively trained with the goal of minimizing the cross-entropy loss; the Softmax operation assigns probabilities to the data categories, yielding the classification result and achieving classification matching.
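A compact PyTorch sketch of a network with this layer layout (treating each sample as a 3D volume; the kernel counts and sizes follow the numbers above, while details such as padding and the ceil-mode pooling needed to reach 13*13*3 are assumptions) could look like:

```python
import torch
import torch.nn as nn

class ReidCNN(nn.Module):
    """Three conv/pool stages (C1/S1 ... C3/S3) matching the sizes above."""
    def __init__(self, num_identities):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 32, kernel_size=3, stride=1, padding=1),    # C1
            nn.ReLU(inplace=True),
            nn.MaxPool3d(2, stride=2),                               # S1
            nn.Conv3d(32, 64, kernel_size=3, stride=1, padding=1),   # C2
            nn.ReLU(inplace=True),
            nn.MaxPool3d(2, stride=2),                               # S2
            nn.Conv3d(64, 128, kernel_size=3, stride=1, padding=1),  # C3
            nn.ReLU(inplace=True),
            nn.MaxPool3d(2, stride=2, ceil_mode=True),               # S3 -> 3x13x13
        )
        self.fc = nn.Linear(128 * 3 * 13 * 13, 1024)  # 1024-unit FC layer
        self.classifier = nn.Linear(1024, num_identities)

    def forward(self, x):  # x: (N, 1, 20, 100, 100)
        feats = self.features(x).flatten(1)
        embedding = torch.relu(self.fc(feats))
        return self.classifier(embedding)  # logits for cross-entropy loss
```

Training would then minimize `nn.CrossEntropyLoss()` over these logits; the sparsity term mentioned above could be approximated, under the same assumptions, by adding an L1 penalty on `embedding` to the loss.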
Step 23: preprocessing the image to be identified and inputting the preprocessed image into the trained pedestrian re-identification neural network, to obtain the identification result corresponding to the object to be identified.
Here, that the training samples of the neural network include target-domain images obtained by converting source-domain images from other view scenes into the target view scene means that source-domain images from other view scenes are converted into target-domain images matching scene factors such as the lighting and sharpness of the target view scene, increasing the amount of training data and reducing the manual annotation workload. For example, the conversion can be performed within an overall framework of GAN-based cross-domain adaptive data augmentation, where the generative adversarial network serves both for sample data augmentation in the training stage of the pedestrian re-identification network and for data preprocessing in the test stage. Referring to FIG. 4, a schematic flowchart of a pedestrian re-identification method provided by a further embodiment of the present disclosure, A is the source domain and B is the target domain. Here, images in source-domain scene A can be converted by the generative adversarial network into images in target-domain scene B and, conversely, B-domain images can be converted into A-domain images. That is, the pedestrian re-identification process includes step 31: training the generative adversarial network to obtain A-to-B and B-to-A image conversion models, and using A-B and B-A image conversion as preprocessing before pedestrian re-identification, so that the converted source-domain images better match the scene style of the target domain. The advantage of this network is that the generated B-domain images can serve as training samples for the B-domain pedestrian re-identification network, so the trained B-domain re-identification model generalizes better across scenes; this not only addresses B-domain sample augmentation well but also supports pedestrian re-identification in the current B-domain scene. Referring to FIGs. 5a and 5b, comparative diagrams of the effect before and after image conversion between the source domain and the target domain, provided by an embodiment of the present disclosure.
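As a sketch of how a trained A-to-B generator might be applied for such augmentation (assuming PyTorch; `generator_ab` stands in for whatever generator architecture was trained and is a hypothetical name):

```python
import torch

@torch.no_grad()
def convert_source_to_target(generator_ab, source_batch):
    """Style-convert source-domain (A) images into the target domain (B).

    `generator_ab` is a trained generator G: A -> B; `source_batch` is a
    float tensor of shape (N, C, H, W). The converted images keep their
    original identity labels and join the target-domain training set.
    """
    generator_ab.eval()
    fake_b = generator_ab(source_batch)  # B-style versions of the A images
    return fake_b.clamp(-1.0, 1.0)       # assumes tanh-normalized outputs
```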
In the embodiments of the present disclosure, the method acquires an image to be identified in a target view scene, the image to be identified including an object to be identified, and performs feature extraction and matching on the image to be identified based on a trained neural network for pedestrian re-identification, to obtain an identification result corresponding to the object to be identified. Here, the training samples of the neural network for pedestrian re-identification include target-domain images obtained by converting source-domain images from other view scenes into the target view scene, together with the identity information of the objects contained in the target-domain images, and the identification result for the object to be identified is determined by the trained neural network. In this way, the pedestrian data set expansion problem underlying the cross-domain generalization of the pedestrian re-identification network is solved, giving the re-identification network more robust feature learning across different scenes; it can be applied well to new scenes, improving identification accuracy and effectively reducing missed identifications and misidentifications.
Referring to FIG. 6, a schematic flowchart of a pedestrian re-identification method provided by a further embodiment of the present disclosure, as an embodiment, before step 12 of performing feature extraction and matching on the image to be identified based on the trained neural network for pedestrian re-identification, the method further includes:
Step 41: acquiring a first training sample, the first training sample including source-domain images of a target object in other view scenes;
Here, in an application scenario, each camera may correspond to one view scene. For example, if an application scenario includes three cameras A, B, and C, they correspond to three view scenes A, B, and C. When view C is the target view, views A and B are the other views; images captured in view scene C are target-domain images and images captured in view scenes A and B are source-domain images. When view B is the target view, views A and C are the other views; images captured in view scene B are target-domain images and images captured in view scenes A and C are source-domain images. Here, images in the other view scenes correspond to source-domain images.
Step 42: inputting the first training sample into the trained generative adversarial network for style conversion, to obtain target-domain images in the target view scene;
Here, the style may refer to the fact that pictures captured in different view scenes have different light intensities, postures, viewing angles, and so on.
Step 43: forming a second training sample from the target-domain images annotated with the identity information of the contained target objects;
Here, the second training sample is converted from the first training sample. The sample pictures in the first training sample carry identity annotations, so each converted sample picture likewise corresponds to identity information, which can be used to annotate the converted sample pictures.
Step 44: inputting the second training sample into the neural network model for iterative training until the loss function of the neural network model satisfies a convergence condition, to obtain the trained neural network for pedestrian re-identification.
Here, as one implementation, the samples used for iteratively training the neural network model may include not only the second training sample but also samples obtained in the target-domain scene.
Before step 44 of inputting the second training sample into the neural network model for iterative training until the loss function of the neural network model satisfies the convergence condition, the method further includes:
acquiring original target-domain images including the target object in the target view scene, and using the original target-domain images annotated with the identity information of the contained target objects as part of the second training sample.
Here, the original target-domain images may be images containing the target object captured when the target object was identified upon entering the target view during the previous period. Using the original target-domain images of the target object in the target view scene, together with their identity annotations, as part of the second training sample increases the number of samples and enhances them, so that the trained pedestrian re-identification network generalizes better, improving identification accuracy and achieving good recognition results.
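A small sketch of combining the converted images with the original target-domain images into one labeled training set (assuming PyTorch datasets; the tensors below are placeholders standing in for real data):

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

# Placeholders for real data: GAN-converted source-domain clips and
# original target-domain clips, each with carried-over identity labels.
converted_images = torch.randn(64, 1, 20, 100, 100)
converted_ids = torch.randint(0, 10, (64,))
original_images = torch.randn(32, 1, 20, 100, 100)
original_ids = torch.randint(0, 10, (32,))

second_training_set = ConcatDataset([
    TensorDataset(converted_images, converted_ids),  # converted samples
    TensorDataset(original_images, original_ids),    # original target-domain samples
])
loader = DataLoader(second_training_set, batch_size=16, shuffle=True)
```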
Referring to FIG. 7, a schematic flowchart of a pedestrian re-identification method provided by a further embodiment of the present disclosure, before step 42 of inputting the first training sample into the trained generative adversarial network for style conversion to obtain the target-domain images in the target view scene, the method includes:
Step 51: acquiring source-domain images in other view scenes;
Step 52: inputting the source-domain images into the generative network for training, to obtain corresponding output images; wherein the source-domain images and the corresponding output images correspond to different scene styles;
Step 53: acquiring target-domain images in the target view scene and scene labels corresponding to the target-domain images;
Step 54: inputting the output images, the target-domain images, and the scene labels corresponding to the target-domain images into the discriminative network for training, to determine scene discrimination results for the output images and the target-domain images; and performing separate alternating iterative training of the generative network and the discriminative network until the configured loss function satisfies a convergence condition, to obtain the trained generative adversarial network.
Here, the generative adversarial network (GAN) includes a generative model and a discriminative model. The generative model may also be called the generator or generative network, denoted G; the discriminative model may also be called the discriminator or discriminative network, denoted D. In general, G may receive a random noise vector z and generate data (such as an image) from this noise, denoted G(z); in the embodiments of the present disclosure, the noise corresponds to the feature vector of a source-domain image captured in the source domain. D may receive G(z) or a real image and judge the probability that the received image is real. If the image received by D is denoted x, the output of D may be denoted D(x), whose value lies in the interval from 0 to 1: D(x)=1 means x is a real image, D(x)=0.5 means x has a 50% probability of being real, and D(x)=0 means x cannot be a real image. In the embodiments of the present disclosure, the real images are the target-domain images captured in the target view scene. G and D can be trained simultaneously: the goal of G is to generate images as close to real as possible in an attempt to fool D, while the goal of D is to distinguish the images generated by G as well as possible. G and D thus form a dynamic game: when training D, the discrimination error is minimized; when training G, the discrimination error is maximized. Both objectives can be realized by back-propagation. Through alternating optimization, both models G and D improve until reaching the point at which the images generated by G cannot be distinguished from real images, i.e., D(G(z))=0.5. At that point, the generative adversarial network can convert source images from other view scenes into target-domain images matching the target view scene.
Here, letting $P_r$ and $P_g$ denote the distribution of real images and of the images generated by G, respectively, where the real images are the images captured in the target view scene and the generated images are the output images obtained by inputting the source-domain images into the generative network for training, the objective function of D can be expressed as:
$$\max_{D}\; \mathbb{E}_{x \sim P_r}\big[\log D(x)\big] + \mathbb{E}_{x \sim P_g}\big[\log\big(1 - D(x)\big)\big]$$
Combining the objective of G, the overall optimization objective function can be expressed as:
$$\min_{G}\max_{D}\; V(D, G) = \mathbb{E}_{x \sim P_r}\big[\log D(x)\big] + \mathbb{E}_{z}\big[\log\big(1 - D(G(z))\big)\big]$$
D and G are iterated alternately: G is fixed while D is optimized, and after a time D is fixed while G is optimized, until the preset loss function satisfies the convergence condition. In this way, through alternating iterative training, both models G and D improve until reaching the point at which the images generated by G cannot be distinguished from real images, i.e., D(G(z))=0.5.
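A minimal PyTorch sketch of this alternating scheme (the networks, optimizers, and data are illustrative assumptions; D is assumed to end in a sigmoid so its output is a probability):

```python
import torch
import torch.nn as nn

def train_gan_step(G, D, opt_g, opt_d, real_images, noise):
    """One alternating step: optimize D with G fixed, then G with D fixed."""
    bce = nn.BCELoss()
    real_labels = torch.ones(real_images.size(0), 1)
    fake_labels = torch.zeros(real_images.size(0), 1)

    # --- Train D (minimize the discrimination error), keeping G fixed ---
    opt_d.zero_grad()
    fake_images = G(noise).detach()  # detach: no gradient flows into G here
    d_loss = bce(D(real_images), real_labels) + bce(D(fake_images), fake_labels)
    d_loss.backward()
    opt_d.step()

    # --- Train G (maximize D's error on fakes), keeping D fixed ---
    opt_g.zero_grad()
    g_loss = bce(D(G(noise)), real_labels)  # non-saturating generator loss
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```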
In an optional embodiment, before step 12 of performing feature extraction and matching on the image to be identified based on the trained neural network for pedestrian re-identification, the method further includes: performing posture correction on the object to be identified in the image to be identified. Referring to FIG. 8, a schematic flowchart of a pedestrian re-identification method provided by a further embodiment of the present disclosure, as an embodiment, the performing posture correction on the object to be identified in the image to be identified includes:
Step 61: acquiring a training set of target-domain images in the target view scene;
Step 62: generating an affine-transformed image sample set from the target-domain image training set based on affine transformation parameters, and pairing the generated image samples with the original image samples to obtain paired samples;
Here, the affine-transformed image sample set generated from the target-domain image training set covers cases such as translation, rotation, scaling, and shearing.
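A sketch of generating such paired samples with random affine parameters (assuming torchvision; the parameter ranges below are illustrative choices, not values specified by the disclosure):

```python
import random
import torchvision.transforms.functional as TF

def make_affine_pair(image):
    """Return (transformed_image, original_image, params) for STN training.

    `image` is a (C, H, W) tensor; the returned `params` serve as the
    ground-truth affine transformation parameters for supervision.
    """
    angle = random.uniform(-30.0, 30.0)                          # rotation, degrees
    translate = [random.randint(-10, 10), random.randint(-10, 10)]
    scale = random.uniform(0.8, 1.2)
    shear = [random.uniform(-10.0, 10.0)]
    warped = TF.affine(image, angle=angle, translate=translate,
                       scale=scale, shear=shear)
    return warped, image, (angle, translate, scale, shear)
```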
Step 63: training a spatial transformer network based on the paired samples and the affine transformation parameters until the corresponding loss function satisfies a convergence condition, to obtain a trained spatial transformer network for pedestrian posture alignment;
Here, the spatial transformer network specifically includes a localization network, a grid generator, and a pixel sampler, wherein the localization network includes convolutional layers, pooling layers, non-linear activation layers, fully connected layers, and a regression layer.
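A compact PyTorch sketch of such a module (layer sizes and single-channel 100x100 inputs are assumptions; `F.affine_grid` and `F.grid_sample` play the roles of the grid generator and pixel sampler):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class STN(nn.Module):
    """Localization network + grid generator + sampler for 100x100 inputs."""
    def __init__(self):
        super().__init__()
        self.localization = nn.Sequential(  # conv / pool / non-linear stack
            nn.Conv2d(1, 8, kernel_size=7), nn.MaxPool2d(2), nn.ReLU(True),
            nn.Conv2d(8, 10, kernel_size=5), nn.MaxPool2d(2), nn.ReLU(True),
        )
        self.regressor = nn.Sequential(     # fully connected + regression layer
            nn.Flatten(),
            nn.Linear(10 * 21 * 21, 32), nn.ReLU(True),
            nn.Linear(32, 6),               # regresses a 2x3 affine matrix
        )
        # Initialize the regression layer to the identity transform.
        self.regressor[-1].weight.data.zero_()
        self.regressor[-1].bias.data.copy_(
            torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, x):                   # x: (N, 1, 100, 100)
        theta = self.regressor(self.localization(x)).view(-1, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False), theta
```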
Here, the spatial transformer network is trained based on the paired samples and the affine transformation parameters. Supervised training on the affine transformation parameters is realized by computing the mean squared error between the network's regressed values and the ground-truth label values and back-propagating the gradient; supervised training on the paired samples is realized by computing the mean squared error between the average pixel values of the target-domain training-set samples and the transformed samples and back-propagating the gradient. The loss function consists of two parts, a paired-sample loss and a transformation-parameter loss, expressed mathematically as:
$$\mathrm{Loss} = \mathrm{MSE}(I_{in}, I_{out}) + \mathrm{MSE}(\theta_{evl}, \theta_{gt})$$
where $I_{in}$ and $I_{out}$ denote the input transformed image and the transformed image output by the network, respectively; $\theta_{evl}$ and $\theta_{gt}$ denote the affine transformation parameters regressed by the deep spatial transformer network and the true transformation parameters, respectively; and MSE denotes the mean squared error. Under this loss function, the parameter values of the model are optimized through backward gradient propagation, bringing the model to a near-ideal state.
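A sketch of computing this two-part loss for the STN above (assuming the pairing convention that the STN's rectified output is compared against the original un-warped image, and that the ground-truth parameters have been converted to (N, 2, 3) affine matrices):

```python
import torch.nn.functional as F

def stn_loss(stn, warped, original, theta_gt):
    """Paired-sample pixel loss plus transformation-parameter loss."""
    rectified, theta_evl = stn(warped)            # forward pass of the STN
    pixel_loss = F.mse_loss(rectified, original)  # MSE(I_in, I_out)
    param_loss = F.mse_loss(theta_evl, theta_gt)  # MSE(theta_evl, theta_gt)
    return pixel_loss + param_loss
```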
Step 64: performing posture correction on the object to be identified in the image to be identified based on the trained spatial transformer network.
In the embodiments of the present disclosure, before feature extraction and matching is performed on the image to be identified based on the trained neural network for pedestrian re-identification, a spatial transformer network is used to perform posture correction on the object to be identified in the image to be identified, which avoids the recognition uncertainty caused by pedestrian posture uncertainty and the resulting missed detections and misidentifications of the pedestrian re-identification model.
Referring to FIGs. 9a and 9b, comparative diagrams of the effect before and after posture correction of the image to be identified, provided by an embodiment of the present disclosure.
To further facilitate an understanding of the implementation flow of the pedestrian re-identification method provided by the embodiments of the present disclosure, the pedestrian re-identification method provided by the present application is further described below through an optional specific embodiment:
Referring to FIG. 10, a schematic flowchart of a pedestrian re-identification method provided by a further embodiment of the present disclosure, the pedestrian re-identification method includes the following steps:
Step S1: acquiring source-domain images in other view scenes, and inputting the source-domain images into the generative adversarial network for training, to obtain a trained generative adversarial network;
wherein the source-domain images are input into the generative adversarial network and corresponding output images are obtained through the generative network, the source-domain images and the corresponding output images corresponding to different scene styles; target-domain images in the target view scene and the scene labels corresponding to the target-domain images are acquired; the output images, the target-domain images, and the scene labels corresponding to the target-domain images are input into the discriminative network for training, to determine scene discrimination results for the output images and the target-domain images; and separate alternating iterative training of the generative network and the discriminative network is performed until the configured loss function satisfies a convergence condition, obtaining the trained generative adversarial network;
Step S2: acquiring a first training sample, inputting the first training sample into the trained generative adversarial network for style conversion to obtain target-domain images in the target view scene, and training an initial neural network model based on the target-domain images, to obtain a trained neural network for pedestrian re-identification;
wherein the first training sample includes source-domain images of a target object in other view scenes; a second training sample is formed from the target-domain images annotated with the identity information of the contained target objects; and the second training sample is input into the initial neural network model for iterative training until the loss function of the neural network model satisfies a convergence condition, obtaining the trained neural network for pedestrian re-identification;
Step S3: acquiring a training set of target-domain images in the target view scene; generating an affine-transformed image sample set from the target-domain image training set based on affine transformation parameters and pairing the generated image samples with the original image samples to obtain paired samples; and training a spatial transformer network based on the paired samples and the affine transformation parameters until the corresponding loss function satisfies a convergence condition, to obtain a trained spatial transformer network for pedestrian posture alignment;
Step S4: acquiring an image to be identified in the target view scene, the image to be identified including an object to be identified;
Step S5: performing posture correction on the object to be identified in the image to be identified based on the trained spatial transformer network;
Step S6: performing feature extraction and matching on the image to be identified based on the trained neural network for pedestrian re-identification; wherein the training samples of the neural network include target-domain images obtained by converting source-domain images from other view scenes into the target view scene, together with the identity information of the objects contained in the target-domain images;
Step S7: obtaining the identification result corresponding to the object to be identified and determining the ID of the object to be identified.
The embodiments of the present disclosure abandon the traditional non-deep-learning strategy of pedestrian re-identification that combines different hand-crafted features with stepwise feature matching for images in different scenes, and instead use a deep-learning neural network for end-to-end learning of pedestrian feature extraction and feature matching, giving more robust feature learning across different scenes. Compared with non-deep-learning methods, the deep-learning-based pedestrian re-identification method improves algorithm accuracy and runs faster, and performs well in specific scenes with controllable pedestrian flow. In the present disclosure, the training samples of the neural network for pedestrian re-identification include target-domain images obtained by converting source-domain images from other view scenes into the target view scene, together with the identity information of the objects contained in the target-domain images, and the identification result for the object to be identified is determined by the trained neural network. In this way, the pedestrian data set expansion problem underlying the cross-domain generalization of the pedestrian re-identification network is solved, giving the re-identification network more robust feature learning across different scenes; it can be applied well to new scenes, improving identification accuracy and effectively reducing missed identifications and misidentifications. In addition, before feature extraction and matching is performed on the image to be identified based on the trained neural network for pedestrian re-identification, the present disclosure also uses a spatial transformer network to perform posture correction on the image to be identified, avoiding the recognition uncertainty caused by pedestrian posture uncertainty and the resulting missed detections and misidentifications of the pedestrian re-identification model.
In a second aspect, an embodiment of the present disclosure provides a pedestrian re-identification device. Referring to FIG. 11, a schematic structural diagram of a pedestrian re-identification device provided by an embodiment of the present disclosure, the pedestrian re-identification device includes an acquisition module 71 and a processing module 72, wherein
the acquisition module 71 is configured to acquire an image to be identified in a target view scene, the image to be identified including an object to be identified; and
the processing module 72 is configured to perform feature extraction and matching on the image to be identified based on a trained neural network for pedestrian re-identification, to obtain an identification result corresponding to the object to be identified; wherein the training samples of the neural network include target-domain images obtained by converting source-domain images from other view scenes into the target view scene, together with the categories of the objects contained in the target-domain images.
The device further includes a training module 73 configured to: acquire a first training sample, the first training sample including source-domain images of a target object in other view scenes; input the first training sample into a trained generative adversarial network for style conversion, to obtain target-domain images in the target view scene; form a second training sample from the target-domain images annotated with the identity information of the contained target objects; and input the second training sample into the neural network model for iterative training until the loss function of the neural network model satisfies a convergence condition, to obtain the trained neural network for pedestrian re-identification.
The training module 73 is further configured to acquire original target-domain images including the target object in the target view scene, and use the original target-domain images annotated with the identity information of the contained target objects as part of the second training sample.
The generative adversarial network includes a generative network and a discriminative network, and the training module includes:
a generative network training unit 74, configured to acquire source-domain images in other view scenes; the processing module 72 is further used to input the source-domain images into the generative network for training to obtain corresponding output images, wherein the source-domain images and the corresponding output images correspond to different scene styles;
a discriminative network training unit 75, configured to acquire target-domain images in the target view scene and scene labels corresponding to the target-domain images, and input the output images, the target-domain images, and the scene labels corresponding to the target-domain images into the discriminative network for training, to determine scene discrimination results for the output images and the target-domain images; and
a convergence unit 76, configured to perform separate alternating iterative training of the generative network and the discriminative network until the configured loss function satisfies a convergence condition, to obtain the trained generative adversarial network.
The device further includes a posture correction module 77, configured to perform posture correction on the object to be identified in the image to be identified.
The posture correction module 77 includes:
a spatial transformer network training unit 78, configured to acquire a training set of target-domain images in the target view scene; generate an affine-transformed image sample set from the target-domain image training set based on affine transformation parameters and pair the generated image samples with the original image samples to obtain paired samples; and train a spatial transformer network based on the paired samples and the affine transformation parameters until the corresponding loss function satisfies a convergence condition, to obtain a trained spatial transformer network for pedestrian posture alignment; and
a correction unit 79, configured to perform posture correction on the object to be identified in the image to be identified based on the trained spatial transformer network.
In a third aspect, an embodiment of the present disclosure provides a pedestrian re-identification device. Referring to FIG. 12, a schematic structural diagram of a pedestrian re-identification device provided by a further embodiment of the present disclosure, the pedestrian re-identification device includes: a processor 82 and a memory 81 for storing a computer program executable on the processor 82;
wherein the processor 82, when running the computer program, implements the following steps:
acquiring an image to be identified in a target view scene, the image to be identified including an object to be identified;
performing feature extraction and matching on the image to be identified based on a trained neural network for pedestrian re-identification, to obtain an identification result corresponding to the object to be identified; wherein the training samples of the neural network include target-domain images obtained by converting source-domain images from other view scenes into the target view scene, together with the identity information of the objects contained in the target-domain images.
Here, the processor 82, when executing the computer program, is further configured to implement:
acquiring a first training sample, the first training sample including source-domain images of a target object in other view scenes;
inputting the first training sample into a trained generative adversarial network for style conversion, to obtain target-domain images in the target view scene;
forming a second training sample from the target-domain images annotated with the identity information of the contained target objects;
inputting the second training sample into the neural network model for iterative training until the loss function of the neural network model satisfies a convergence condition, to obtain the trained neural network for pedestrian re-identification.
Here, the processor 82, when executing the computer program, is further configured to implement: acquiring original target-domain images including the target object in the target view scene, and using the original target-domain images annotated with the identity information of the contained target objects as part of the second training sample.
Here, the processor 82, when executing the computer program, is further configured to implement:
acquiring source-domain images in other view scenes;
inputting the source-domain images into the generative network for training, to obtain corresponding output images; wherein the source-domain images and the corresponding output images correspond to different scene styles;
acquiring target-domain images in the target view scene and scene labels corresponding to the target-domain images;
inputting the output images, the target-domain images, and the scene labels corresponding to the target-domain images into the discriminative network for training, to determine scene discrimination results for the output images and the target-domain images; and performing separate alternating iterative training of the generative network and the discriminative network until the configured loss function satisfies a convergence condition, to obtain the trained generative adversarial network.
Here, the processor 82, when executing the computer program, is further configured to implement: performing posture correction on the object to be identified in the image to be identified.
Here, the processor 82, when executing the computer program, is further configured to implement: acquiring a training set of target-domain images in the target view scene;
generating an affine-transformed image sample set from the target-domain image training set based on affine transformation parameters, and pairing the generated image samples with the original image samples to obtain paired samples;
training a spatial transformer network based on the paired samples and the affine transformation parameters until the corresponding loss function satisfies a convergence condition, to obtain a trained spatial transformer network for pedestrian posture alignment;
performing posture correction on the object to be identified in the image to be identified based on the trained spatial transformer network.
In a fourth aspect, an embodiment of the present disclosure provides a computer storage medium, for example a memory storing a computer program, the computer program being executable by a processor in the above device to complete the steps of the foregoing method. The computer storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface memory, optical disc, or CD-ROM; it may also be any of various devices including one of or any combination of the above memories, such as a mobile phone, computer, tablet device, or personal digital assistant. The computer storage medium stores a computer program, and the processor 82, when running the computer program, implements the following steps:
acquiring an image to be identified in a target view scene, the image to be identified including an object to be identified;
performing feature extraction and matching on the image to be identified based on a trained neural network for pedestrian re-identification, to obtain an identification result corresponding to the object to be identified; wherein the training samples of the neural network include target-domain images obtained by converting source-domain images from other view scenes into the target view scene, together with the identity information of the objects contained in the target-domain images.
Here, the processor 82, when executing the computer program, is further configured to implement:
acquiring a first training sample, the first training sample including source-domain images of a target object in other view scenes;
inputting the first training sample into a trained generative adversarial network for style conversion, to obtain target-domain images in the target view scene;
forming a second training sample from the target-domain images annotated with the identity information of the contained target objects;
inputting the second training sample into the neural network model for iterative training until the loss function of the neural network model satisfies a convergence condition, to obtain the trained neural network for pedestrian re-identification.
Here, the processor 82, when executing the computer program, is further configured to implement: acquiring original target-domain images including the target object in the target view scene, and using the original target-domain images annotated with the identity information of the contained target objects as part of the second training sample.
Here, the processor 82, when executing the computer program, is further configured to implement:
acquiring source-domain images in other view scenes;
inputting the source-domain images into the generative network for training, to obtain corresponding output images; wherein the source-domain images and the corresponding output images correspond to different scene styles;
acquiring target-domain images in the target view scene and scene labels corresponding to the target-domain images;
inputting the output images, the target-domain images, and the scene labels corresponding to the target-domain images into the discriminative network for training, to determine scene discrimination results for the output images and the target-domain images; and performing separate alternating iterative training of the generative network and the discriminative network until the configured loss function satisfies a convergence condition, to obtain the trained generative adversarial network.
Here, the processor 82, when executing the computer program, is further configured to implement: performing posture correction on the object to be identified in the image to be identified.
Here, the processor 82, when executing the computer program, is further configured to implement: acquiring a training set of target-domain images in the target view scene;
generating an affine-transformed image sample set from the target-domain image training set based on affine transformation parameters, and pairing the generated image samples with the original image samples to obtain paired samples;
training a spatial transformer network based on the paired samples and the affine transformation parameters until the corresponding loss function satisfies a convergence condition, to obtain a trained spatial transformer network for pedestrian posture alignment;
performing posture correction on the object to be identified in the image to be identified based on the trained spatial transformer network.
The above are merely preferred embodiments of the present invention and are not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, improvement, and the like made within the spirit and scope of the present invention shall fall within the protection scope of the present invention.
Claims (14)
- A pedestrian re-identification method, the method comprising: acquiring an image to be identified in a target view scene, the image to be identified including an object to be identified; performing feature extraction and matching on the image to be identified based on a trained neural network for pedestrian re-identification, to obtain an identification result corresponding to the object to be identified; wherein the training samples of the neural network include target-domain images obtained by converting source-domain images from other view scenes into the target view scene, together with the identity information of the objects contained in the target-domain images.
- The pedestrian re-identification method according to claim 1, wherein before the performing feature extraction and matching on the image to be identified based on the trained neural network for pedestrian re-identification, the method further comprises: acquiring a first training sample, the first training sample including source-domain images of a target object in other view scenes; inputting the first training sample into a trained generative adversarial network for style conversion, to obtain target-domain images in the target view scene; forming a second training sample from the target-domain images annotated with the identity information of the contained target objects; inputting the second training sample into a neural network model for iterative training until the loss function of the neural network model satisfies a convergence condition, to obtain the trained neural network for pedestrian re-identification.
- The pedestrian re-identification method according to claim 2, wherein before the inputting the second training sample into the neural network model for iterative training until the loss function of the neural network model satisfies the convergence condition, the method further comprises: acquiring original target-domain images including the target object in the target view scene, and using the original target-domain images annotated with the identity information of the contained target objects as part of the second training sample.
- The pedestrian re-identification method according to claim 2, wherein the generative adversarial network includes a generative network and a discriminative network, and before the inputting the first training sample into the trained generative adversarial network for style conversion to obtain the target-domain images in the target view scene, the method comprises: acquiring source-domain images in other view scenes; inputting the source-domain images into the generative network for training, to obtain corresponding output images, wherein the source-domain images and the corresponding output images correspond to different scene styles; acquiring target-domain images in the target view scene and scene labels corresponding to the target-domain images; inputting the output images, the target-domain images, and the scene labels corresponding to the target-domain images into the discriminative network for training, to determine scene discrimination results for the output images and the target-domain images; and performing separate alternating iterative training of the generative network and the discriminative network until the configured loss function satisfies a convergence condition, to obtain the trained generative adversarial network.
- The pedestrian re-identification method according to claim 1, wherein before the performing feature extraction and matching on the image to be identified based on the trained neural network for pedestrian re-identification, the method further comprises: performing posture correction on the object to be identified in the image to be identified.
- The pedestrian re-identification method according to claim 5, wherein the performing posture correction on the object to be identified in the image to be identified comprises: acquiring a training set of target-domain images in the target view scene; generating an affine-transformed image sample set from the target-domain image training set based on affine transformation parameters, and pairing the generated image samples with the original image samples to obtain paired samples; training a spatial transformer network based on the paired samples and the affine transformation parameters until the corresponding loss function satisfies a convergence condition, to obtain a trained spatial transformer network for pedestrian posture alignment; and performing posture correction on the object to be identified in the image to be identified based on the trained spatial transformer network.
- A pedestrian re-identification device, comprising an acquisition module and a processing module, wherein the acquisition module is configured to acquire an image to be identified in a target view scene, the image to be identified including an object to be identified; and the processing module is configured to perform feature extraction and matching on the image to be identified based on a trained neural network for pedestrian re-identification, to obtain an identification result corresponding to the object to be identified; wherein the training samples of the neural network include target-domain images obtained by converting source-domain images from other view scenes into the target view scene, together with the categories of the objects contained in the target-domain images.
- The pedestrian re-identification device according to claim 7, further comprising a training module configured to: acquire a first training sample, the first training sample including source-domain images of a target object in other view scenes; input the first training sample into a trained generative adversarial network for style conversion, to obtain target-domain images in the target view scene; form a second training sample from the target-domain images annotated with the identity information of the contained target objects; and input the second training sample into the neural network model for iterative training until the loss function of the neural network model satisfies a convergence condition, to obtain the trained neural network for pedestrian re-identification.
- The pedestrian re-identification device according to claim 8, wherein the training module is further configured to acquire original target-domain images including the target object in the target view scene, and use the original target-domain images annotated with the identity information of the contained target objects as part of the second training sample.
- The pedestrian re-identification device according to claim 8, wherein the generative adversarial network includes a generative network and a discriminative network, and the training module comprises: a generative network training unit, configured to acquire source-domain images in other view scenes, and input the source-domain images into the generative network for training to obtain corresponding output images, wherein the source-domain images and the corresponding output images correspond to different scene styles; a discriminative network training unit, configured to acquire target-domain images in the target view scene and scene labels corresponding to the target-domain images, and input the output images, the target-domain images, and the scene labels corresponding to the target-domain images into the discriminative network for training, to determine scene discrimination results for the output images and the target-domain images; and a convergence unit, configured to perform separate alternating iterative training of the generative network and the discriminative network until the configured loss function satisfies a convergence condition, to obtain the trained generative adversarial network.
- The pedestrian re-identification device according to claim 7, further comprising a posture correction module, the posture correction module being configured to perform posture correction on the object to be identified in the image to be identified.
- The pedestrian re-identification device according to claim 11, wherein the posture correction module comprises: a spatial transformer network training unit, configured to acquire a training set of target-domain images in the target view scene; generate an affine-transformed image sample set from the target-domain image training set based on affine transformation parameters and pair the generated image samples with the original image samples to obtain paired samples; and train a spatial transformer network based on the paired samples and the affine transformation parameters until the corresponding loss function satisfies a convergence condition, to obtain a trained spatial transformer network for pedestrian posture alignment; and a correction unit, configured to perform posture correction on the object to be identified in the image to be identified based on the trained spatial transformer network.
- A pedestrian re-identification device, comprising: a processor and a memory for storing a computer program executable on the processor; wherein the processor, when running the computer program, implements the pedestrian re-identification method according to any one of claims 1 to 6.
- A computer storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the pedestrian re-identification method according to any one of claims 1 to 6.