CN111126250A - Pedestrian re-identification method and device based on PTGAN - Google Patents

Pedestrian re-identification method and device based on PTGAN

Info

Publication number
CN111126250A
Authority
CN
China
Prior art keywords
image
pedestrian
ptgan
camera
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911327963.1A
Other languages
Chinese (zh)
Inventor
张斯尧
王思远
谢喜林
张�诚
黄晋
文戎
田磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changsha Qianshitong Intelligent Technology Co ltd
Original Assignee
Changsha Qianshitong Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changsha Qianshitong Intelligent Technology Co ltd filed Critical Changsha Qianshitong Intelligent Technology Co ltd
Priority to CN201911327963.1A
Publication of CN111126250A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Abstract

The invention discloses a pedestrian re-identification method and device based on PTGAN, wherein the method comprises the following steps: acquiring a first image which is acquired by a first camera and contains a target object; inputting the first image into a trained PTGAN model, which migrates the background difference region while keeping the pedestrian foreground unchanged, to obtain a second image with the same style as images shot by a second camera; extracting the pedestrian features of the second image; and calculating, according to the cosine distance, the similarity between the pedestrian feature vector extracted from the second image and the feature vectors of pedestrian images shot by the second camera, and acquiring the pedestrian image with the highest similarity to the target object according to the similarity. The invention solves the problems of difficult cross-camera retrieval and low re-identification accuracy in prior-art pedestrian re-identification methods.

Description

Pedestrian re-identification method and device based on PTGAN
Technical Field
The invention relates to the technical field of computer vision and smart cities, in particular to a pedestrian re-identification method and device based on PTGAN, terminal equipment and a computer readable medium.
Background
With the continuous development of artificial intelligence, computer vision and hardware technology, video image processing technology has been widely applied to intelligent city systems.
Pedestrian re-identification (Person Re-identification, abbreviated Re-ID) is a technology that uses computer vision to determine whether a specific pedestrian is present in an image or video sequence; it is widely regarded as a sub-problem of image retrieval. Given a monitored pedestrian image, the task is to retrieve images of that pedestrian across devices. Because of the differences between camera devices, and because the pedestrian body is both rigid and flexible, appearance is easily affected by clothing, scale, occlusion, pose, viewing angle and the like, making pedestrian re-identification a research-worthy and highly challenging hot topic in the field of computer vision.
Currently, although the detection capability of pedestrian re-identification has improved significantly, many challenging problems remain unsolved in practice. For example, cross-camera retrieval is often difficult: because of domain differences, different cameras have different styles (background, lighting conditions, camera parameters, etc.), so a pedestrian picture captured by camera A is difficult to retrieve among the pictures from camera B, and re-identification accuracy is low.
Disclosure of Invention
In view of the above, the present invention provides a pedestrian re-identification method, apparatus, terminal device and computer-readable medium based on PTGAN, which improve the accuracy of pedestrian re-identification across different cameras and solve the prior-art problems of difficult cross-camera retrieval and low re-identification accuracy.
The first aspect of the embodiment of the invention provides a pedestrian re-identification method based on PTGAN, which comprises the following steps:
acquiring a first image which is acquired by a first camera and contains a target object;
inputting the first image into a trained PTGAN model, and realizing the migration of a background difference region on the premise of keeping the foreground of the pedestrian unchanged to obtain a second image with the same style as the image shot by the second camera;
extracting pedestrian features of the second image;
and calculating, according to the cosine distance, the similarity between the pedestrian feature vector extracted from the second image and the feature vectors of pedestrian images shot by the second camera, and acquiring the pedestrian image with the highest similarity to the target object according to the similarity.
Further, before inputting the first image into the trained PTGAN model, the method further comprises:
constructing a network model based on PTGAN;
taking the video images acquired by the first camera and the video images acquired by the second camera as the training set for training the PTGAN-based network model, and converting, through training and iterative feedback, the video images acquired by the first camera into images with the same style as the video images acquired by the second camera;
wherein the loss function of the PTGAN-based network model is expressed as:

$$L_{PTGAN} = L_{Style} + \lambda_1 L_{ID}$$

where $L_{Style}$ represents the style loss (domain difference loss), $L_{ID}$ represents the identity loss of the generated image, and $\lambda_1$ is the weight balancing $L_{Style}$ and $L_{ID}$.
Further, after constructing the PTGAN-based network model, the method further comprises:
performing foreground segmentation on the first video image sequence by using PSPNet to obtain a mask layer region, wherein the identity loss $L_{ID}$ is expressed as:

$$L_{ID} = \mathbb{E}_{a \sim p_{data}(a)}\left[\left\|(G(a)-a) \odot M(a)\right\|_2\right] + \mathbb{E}_{b \sim p_{data}(b)}\left[\left\|(\bar{G}(b)-b) \odot M(b)\right\|_2\right]$$

where $G(a)$ is the transferred pedestrian image for image $a$, $\bar{G}(b)$ is the transferred pedestrian image for image $b$, $p_{data}(a)$ is the data distribution of the video images acquired by the first camera, $p_{data}(b)$ is the data distribution of the video images acquired by the second camera, and $M(a)$ and $M(b)$ are the two segmented mask regions.
Further, extracting the pedestrian features of the second image includes:
extracting appearance features of the second image based on the trained AlexNet model; and
extracting facial features of the second image based on the trained VGG-16 model.
A second aspect of an embodiment of the present invention provides a pedestrian re-identification apparatus based on PTGAN, including:
the acquisition module is used for acquiring a first image which is acquired by the first camera and contains a target object;
the PTGAN module is used for inputting the first image into a trained PTGAN model, and realizing the migration of a background difference region on the premise of keeping the foreground of a pedestrian unchanged to obtain a second image with the same style as the image shot by the second camera;
the feature extraction module is used for extracting the pedestrian features of the second image;
and the identification module is used for calculating, according to the cosine distance, the similarity between the pedestrian feature vector extracted from the second image and the feature vectors of pedestrian images shot by the second camera, and for acquiring the pedestrian image with the highest similarity to the target object according to the similarity.
Further, the apparatus further comprises:
the PTGAN construction module is used for constructing a network model based on the PTGAN;
the PTGAN training module is used for taking the video images acquired by the first camera and the video images acquired by the second camera as the training set for training the PTGAN-based network model, and for converting, through training and iterative feedback, the video images acquired by the first camera into images with the same style as the video images acquired by the second camera;
wherein the loss function of the PTGAN-based network model is expressed as:

$$L_{PTGAN} = L_{Style} + \lambda_1 L_{ID}$$

where $L_{Style}$ represents the style loss (domain difference loss), $L_{ID}$ represents the identity loss of the generated image, and $\lambda_1$ is the weight balancing $L_{Style}$ and $L_{ID}$.
Further, the apparatus further comprises:
a foreground segmentation module for performing foreground segmentation on the first video image sequence using PSPNet to obtain a mask layer region, wherein the identity loss $L_{ID}$ is expressed as:

$$L_{ID} = \mathbb{E}_{a \sim p_{data}(a)}\left[\left\|(G(a)-a) \odot M(a)\right\|_2\right] + \mathbb{E}_{b \sim p_{data}(b)}\left[\left\|(\bar{G}(b)-b) \odot M(b)\right\|_2\right]$$

where $G(a)$ is the transferred pedestrian image for image $a$, $\bar{G}(b)$ is the transferred pedestrian image for image $b$, $p_{data}(a)$ is the data distribution of the video images acquired by the first camera, $p_{data}(b)$ is the data distribution of the video images acquired by the second camera, and $M(a)$ and $M(b)$ are the two segmented mask regions.
Further, the feature extraction module comprises:
the appearance characteristic module is used for extracting appearance characteristics of the second image based on the trained AlexNet model;
and the facial feature module is used for extracting the facial features of the second image based on the trained VGG-16 model.
A third aspect of the embodiments of the present invention provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the PTGAN-based pedestrian re-identification method when executing the computer program.
A fourth aspect of the embodiments of the present invention provides a computer-readable medium, which stores a computer program that, when executed by a processor, implements the steps of the above-mentioned PTGAN-based pedestrian re-identification method.
In the embodiment of the invention, the first image acquired by the first camera is input into the trained PTGAN model, so that the migration of the background difference area is realized on the premise that the foreground of the pedestrian is not changed, and the second image with the same style as the image shot by the second camera can be obtained, thereby improving the accuracy of pedestrian re-identification under different cameras, and solving the problem that the image shot in one camera is difficult to search in the other camera due to the field difference or the different styles of the cameras in the prior art.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
Fig. 1 is a flowchart of a pedestrian re-identification method based on PTGAN according to an embodiment of the present invention;
FIG. 2 is a comparison graph of real-time conversion effects of different pedestrian re-identification methods provided by the embodiment of the invention;
fig. 3 is a schematic structural diagram of a pedestrian re-identification apparatus based on PTGAN according to an embodiment of the present invention;
FIG. 4 is a detailed structure diagram of a feature extraction module provided in an embodiment of the present invention;
fig. 5 is a schematic diagram of a terminal device according to an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
In order to explain the technical means of the present invention, the following description will be given by way of specific examples.
Referring to fig. 1, fig. 1 is a flowchart illustrating a pedestrian re-identification method based on PTGAN according to an embodiment of the present invention. As shown in fig. 1, the PTGAN-based pedestrian re-identification method of the present embodiment includes the following steps:
step S102, acquiring a first image which is acquired by a first camera and contains a target object;
step S104, inputting the first image into a trained PTGAN model, and realizing the migration of the background difference region on the premise that the foreground of the pedestrian is unchanged, to obtain a second image with the same style as the image shot by the second camera;
further, before inputting the first image into the trained PTGAN model, the method further comprises:
constructing a network model based on PTGAN;
taking the video images acquired by the first camera and the video images acquired by the second camera as the training set for training the PTGAN-based network model, and converting, through training and iterative feedback, the video images acquired by the first camera into images with the same style as the video images acquired by the second camera;
PTGAN (Person Transfer GAN) is a generative adversarial network aimed at the person re-identification (Re-ID) problem. In the present invention, the most distinctive feature of PTGAN is that it migrates the background region while keeping the pedestrian foreground unchanged as far as possible. The loss function of the PTGAN network consists of two parts:

$$L_{PTGAN} = L_{Style} + \lambda_1 L_{ID}$$

where $L_{Style}$ represents the style loss, or domain difference loss, which measures whether the generated image resembles the style of the new dataset; $L_{ID}$ represents the identity loss of the generated image, which verifies that the generated image depicts the same person as the original image; and $\lambda_1$ is a weight balancing the two losses. These two losses are defined as follows:
First, the style loss $L_{Style}$ is given by:

$$L_{Style} = L_{GAN}(G, D_B, A, B) + L_{GAN}(\bar{G}, D_A, B, A) + \lambda_2 L_{Cyc}(G, \bar{G})$$

where $L_{GAN}$ represents the standard adversarial loss, $L_{Cyc}$ represents the cycle consistency loss, $A$ and $B$ are the two image domains processed by the GAN, $G$ is the style mapping function from $A$ to $B$, $\bar{G}$ is the style mapping function from $B$ to $A$, and $\lambda_2$ is the weight of the cycle consistency loss.
These terms are the standard losses of PTGAN; they ensure that the generated picture lies in the same domain as the target dataset.
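For illustration, a minimal sketch of this style loss is given below, assuming a PyTorch implementation in which G, G_bar, D_A and D_B are CycleGAN-style generator and discriminator modules; the least-squares adversarial form and the default value of λ2 are illustrative assumptions rather than limitations of the present embodiment.

```python
# A minimal sketch of the style loss L_Style, assuming PyTorch modules
# G (A -> B), G_bar (B -> A) and discriminators D_A, D_B; not the
# patent's reference implementation.
import torch
import torch.nn.functional as F

def style_loss(G, G_bar, D_A, D_B, real_a, real_b, lambda2=10.0):
    fake_b = G(real_a)        # transfer camera-1 style to camera-2 style
    fake_a = G_bar(real_b)    # transfer camera-2 style to camera-1 style
    # standard adversarial terms (least-squares variant shown here)
    pred_b, pred_a = D_B(fake_b), D_A(fake_a)
    adv = F.mse_loss(pred_b, torch.ones_like(pred_b)) \
        + F.mse_loss(pred_a, torch.ones_like(pred_a))
    # cycle-consistency term: mapping back should recover the input
    cyc = F.l1_loss(G_bar(fake_b), real_a) + F.l1_loss(G(fake_a), real_b)
    return adv + lambda2 * cyc
```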
Second, to ensure that the foreground does not change during image migration, foreground segmentation is first performed on the video image with PSPNet to obtain a mask region. Conventional generative adversarial networks such as CycleGAN are not designed for the Re-ID task and therefore do not need to keep the identity information of the foreground object unchanged; as a result, the foreground may be of poor quality (e.g., blurred) and, worse, the appearance of the pedestrian may change. To solve this problem, the present invention proposes the $L_{ID}$ loss. The foreground extracted by PSPNet serves as a mask, and the final identity loss is:

$$L_{ID} = \mathbb{E}_{a \sim p_{data}(a)}\left[\left\|(G(a)-a) \odot M(a)\right\|_2\right] + \mathbb{E}_{b \sim p_{data}(b)}\left[\left\|(\bar{G}(b)-b) \odot M(b)\right\|_2\right]$$

where $G(a)$ is the transferred pedestrian image for image $a$, $\bar{G}(b)$ is the transferred pedestrian image for image $b$, $p_{data}(a)$ is the data distribution of domain $A$, $p_{data}(b)$ is the data distribution of domain $B$, and $M(a)$ and $M(b)$ are the two segmented foreground mask regions. This identity loss constrains the pedestrian foreground to remain as unchanged as possible during migration.
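A corresponding sketch of the masked identity loss and the combined objective follows; the tensor layout, the mask format (foreground values in [0, 1]) and the value of λ1 are again illustrative assumptions.

```python
# A minimal sketch of the identity loss L_ID and the combined PTGAN
# objective; the masks come from PSPNet foreground segmentation.
import torch

def identity_loss(real_a, fake_b, real_b, fake_a, mask_a, mask_b):
    # L2 distance between each image and its transferred version,
    # restricted to the pedestrian foreground by the masks
    loss_a = torch.norm((fake_b - real_a) * mask_a, p=2)  # G(a) vs a
    loss_b = torch.norm((fake_a - real_b) * mask_b, p=2)  # G_bar(b) vs b
    return loss_a + loss_b

def ptgan_loss(l_style, l_id, lambda1=10.0):
    # L_PTGAN = L_Style + lambda1 * L_ID
    return l_style + lambda1 * l_id
```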
Fig. 2 shows a comparison of the real-time conversion effects of different pedestrian re-identification methods: the first row shows the pictures to be converted, and the fourth row shows the PTGAN conversion results. It can be seen that the image quality generated by PTGAN is higher than that of the Cycle-GAN conversion results shown in the third row. For example, the appearance of the person remains the same while the style is effectively transferred to another camera: shadows, road markings and backgrounds are generated automatically, similar to what the other camera would capture. Meanwhile, PTGAN handles the noisy segmentation results produced by PSPNet well. Compared with the traditional cycle generative adversarial network (CycleGAN), the proposed algorithm visibly preserves the identity information of the pedestrian.
And step S106, extracting the pedestrian feature of the second image.
Appearance-based attributes are first extracted from the human detection; they capture the traits and characteristics of an individual in the form of appearance. The convolutional neural network (CNN) is the common choice for this image representation. The present invention uses an AlexNet model pre-trained on ImageNet as the appearance feature extractor: the top output layer is removed and the activation of the last fully connected layer is used as the feature (length 4096). The AlexNet architecture comprises five convolutional layers, three fully connected layers, and three max-pooling layers immediately following the first, second, and fifth convolutional layers. The first convolutional layer has 96 filters of size 11 × 11, the second has 256 filters of size 5 × 5, and the third, fourth and fifth layers, which are connected to each other without any intervening pooling layers, have 384, 384 and 256 filters of size 3 × 3, respectively. Each fully connected layer learns a nonlinear function

$$X_{i+1} = f(W_i X_i + b_i)$$

where $W_i$ and $b_i$ are the weights and biases applied to the input data $X_i$, and $f$ is a rectified linear unit (ReLU) that activates the hidden layer.
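For illustration, the following is a minimal sketch of this appearance-feature extraction, assuming torchvision's ImageNet-pretrained AlexNet; dropping the final classification layer leaves the 4096-dimensional activation of the last fully connected layer as the descriptor.

```python
# A minimal sketch of appearance-feature extraction with a pretrained
# AlexNet (assumed torchvision); not the patent's reference code.
import torch
from torchvision import models

alexnet = models.alexnet(pretrained=True).eval()
# remove the top output layer; the last remaining FC activation is 4096-d
alexnet.classifier = torch.nn.Sequential(
    *list(alexnet.classifier.children())[:-1])

def appearance_features(batch):
    """batch: (N, 3, 224, 224) ImageNet-normalized pedestrian crops."""
    with torch.no_grad():
        return alexnet(batch)  # (N, 4096) appearance descriptors
```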
Further, facial features are extracted. Facial biometrics is an established technology for identity recognition and verification, and facial morphology can be used for re-identification because it is essentially a non-contact biometric that can be extracted remotely. The invention extracts facial features from the face bounding box using a VGG-16 model pre-trained on ImageNet. This is done by removing the top output layer and using the activation of the last fully connected layer as the facial feature (length 4096). VGG-16 is a convolutional neural network consisting of 13 convolutional layers and 3 fully connected layers, with filters of size 3 × 3. Pooling is applied between convolutional layers over a 2 × 2 pixel window with a stride of 2. Subtracting the mean of the training set is used as a preprocessing step.
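The facial features can be sketched with the same pattern, again assuming torchvision's pretrained VGG-16 and face crops supplied by an external face detector:

```python
# A minimal sketch of facial-feature extraction with a pretrained VGG-16
# (assumed torchvision); face bounding boxes are detected upstream.
import torch
from torchvision import models

vgg16 = models.vgg16(pretrained=True).eval()
# drop the final classification layer to expose the 4096-d FC activation
vgg16.classifier = torch.nn.Sequential(
    *list(vgg16.classifier.children())[:-1])

def facial_features(face_batch):
    """face_batch: (N, 3, 224, 224) mean-subtracted face crops."""
    with torch.no_grad():
        return vgg16(face_batch)  # (N, 4096) facial descriptors
```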
And step S108, calculating the similarity between the pedestrian feature vector extracted from the second image and the feature vectors of the pedestrian images shot by the second camera according to the cosine distance, and acquiring the pedestrian image with the highest similarity to the target object according to the similarity.
The similarity is calculated using the cosine distance: cosine similarity uses the cosine of the angle between two vectors in a vector space as a measure of the difference between two individuals. Compared with distance measures, cosine similarity emphasizes the difference of two vectors in direction rather than in distance or length. The formula is as follows:

$$\cos(\theta) = \frac{X \cdot Y}{\|X\|\,\|Y\|}$$

where $X$ denotes the pedestrian feature vector extracted from the second image, and $Y$ denotes a pedestrian image feature vector shot by the second camera.
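A minimal sketch of this matching step is given below; the function and variable names are illustrative assumptions.

```python
# A minimal sketch of cosine-similarity matching between the query
# feature and the gallery features from the second camera.
import numpy as np

def cosine_similarity(x, y):
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

def best_match(query_feature, gallery_features):
    """Return the index and similarity of the gallery image most
    similar to the query pedestrian feature."""
    sims = [cosine_similarity(query_feature, g) for g in gallery_features]
    best = int(np.argmax(sims))
    return best, sims[best]
```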
In the embodiment of the invention, the first image acquired by the first camera is input into the trained PTGAN model, so that the migration of the background difference area is realized on the premise that the foreground of the pedestrian is not changed, and the second image with the same style as the image shot by the second camera can be obtained, thereby improving the accuracy of pedestrian re-identification under different cameras, and solving the problem that the image shot in one camera is difficult to search in the other camera due to the field difference or the different styles of the cameras in the prior art.
Referring to fig. 3, fig. 3 is a block diagram illustrating a pedestrian re-identification apparatus based on PTGAN according to an embodiment of the present invention. As shown in fig. 3, the PTGAN-based pedestrian re-identification apparatus 20 of the present embodiment includes an acquisition module 202, a PTGAN module 204, a feature extraction module 206, and an identification module 208, which are respectively configured to perform the specific methods of S102, S104, S106 and S108 in fig. 1; details can be found in the related description of fig. 1 and are only briefly summarized here:
the acquisition module 202 is configured to acquire a first image which includes a target object and is acquired by a first camera;
the PTGAN module 204 is configured to input the first image into a trained PTGAN model, and obtain a second image with the same style as an image shot by the second camera by implementing migration of a background difference region on the premise that a foreground of a pedestrian is not changed;
a feature extraction module 206, configured to extract pedestrian features of the second image;
the identification module 208 is configured to calculate, according to the cosine distance, the similarity between the pedestrian feature vector extracted from the second image and the feature vectors of pedestrian images shot by the second camera, and to obtain the pedestrian image with the highest similarity to the target object according to the similarity.
Further, the PTGAN-based pedestrian re-recognition apparatus further includes:
the PTGAN construction module is used for constructing a network model based on the PTGAN;
the PTGAN training module is used for taking the video images acquired by the first camera and the video images acquired by the second camera as the training set for training the PTGAN-based network model, and for converting, through training and iterative feedback, the video images acquired by the first camera into images with the same style as the video images acquired by the second camera;
wherein the loss function of the PTGAN-based network model is expressed as:

$$L_{PTGAN} = L_{Style} + \lambda_1 L_{ID}$$

where $L_{Style}$ represents the style loss (domain difference loss), $L_{ID}$ represents the identity loss of the generated image, and $\lambda_1$ is the weight balancing $L_{Style}$ and $L_{ID}$.
Further, the PTGAN-based pedestrian re-recognition apparatus further includes:
a foreground segmentation module for performing foreground segmentation on the first video image sequence using PSPNet to obtain a mask layer region, wherein the identity loss $L_{ID}$ is expressed as:

$$L_{ID} = \mathbb{E}_{a \sim p_{data}(a)}\left[\left\|(G(a)-a) \odot M(a)\right\|_2\right] + \mathbb{E}_{b \sim p_{data}(b)}\left[\left\|(\bar{G}(b)-b) \odot M(b)\right\|_2\right]$$

where $G(a)$ is the transferred pedestrian image for image $a$, $\bar{G}(b)$ is the transferred pedestrian image for image $b$, $p_{data}(a)$ is the data distribution of the video images acquired by the first camera, $p_{data}(b)$ is the data distribution of the video images acquired by the second camera, and $M(a)$ and $M(b)$ are the two segmented mask regions.
Further, as can be seen in fig. 4, the feature extraction module 206 includes:
an appearance feature module 2061, configured to extract an appearance feature of the second image based on the trained AlexNet model;
a facial feature module 2062, configured to extract facial features of the second image based on the trained VGG-16 model.
In the embodiment of the invention, the first image acquired by the first camera is input into the trained PTGAN model through the PTGAN module 204, the migration of the background difference region is realized on the premise that the foreground of the pedestrian is not changed, and the second image with the same style as the image shot by the second camera can be obtained, so that the accuracy of pedestrian re-identification under different cameras is improved, and the problem that the image shot in one camera is difficult to search in the other camera due to the field difference or the different styles of the cameras in the prior art is solved.
Fig. 5 is a schematic diagram of a terminal device according to an embodiment of the present invention. As shown in fig. 5, the terminal device 10 of this embodiment includes: a processor 100, a memory 101 and a computer program 102 stored in the memory 101 and executable on the processor 100, for example a program for pedestrian re-identification based on PTGAN. The processor 100, when executing the computer program 102, implements the steps in the above-described method embodiments, e.g., steps S102, S104, S106 and S108 shown in fig. 1. Alternatively, the processor 100, when executing the computer program 102, implements the functions of the modules/units in the above-mentioned device embodiments, such as the functions of the obtaining module 202, the PTGAN module 204, the feature extraction module 206 and the identification module 208 shown in fig. 3.
Illustratively, the computer program 102 may be partitioned into one or more modules/units that are stored in the memory 101 and executed by the processor 100 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program 102 in the terminal device 10. For example, the computer program 102 may be partitioned into an acquisition module 202, a PTGAN module 204, a feature extraction module 206, and a recognition module 208 (modules in a virtual device), whose specific functions are as follows:
the acquisition module 202 is configured to acquire a first image which includes a target object and is acquired by a first camera;
the PTGAN module 204 is configured to input the first image into a trained PTGAN model, and obtain a second image with the same style as an image shot by the second camera by implementing migration of a background difference region on the premise that a foreground of a pedestrian is not changed;
a feature extraction module 206, configured to extract pedestrian features of the second image;
the identification module 208 is configured to calculate, according to the cosine distance, the similarity between the pedestrian feature vector extracted from the second image and the feature vectors of pedestrian images shot by the second camera, and to obtain the pedestrian image with the highest similarity to the target object according to the similarity.
The terminal device 10 may be a computing device such as a desktop computer, a notebook, a palm computer, and a cloud server. Terminal device 10 may include, but is not limited to, a processor 100, a memory 101. Those skilled in the art will appreciate that fig. 5 is merely an example of a terminal device 10 and does not constitute a limitation of terminal device 10 and may include more or fewer components than shown, or some components may be combined, or different components, e.g., the terminal device may also include input-output devices, network access devices, buses, etc.
The Processor 100 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 101 may be an internal storage unit of the terminal device 10, such as a hard disk or a memory of the terminal device 10. The memory 101 may also be an external storage device of the terminal device 10, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the terminal device 10. Further, the memory 101 may also include both an internal storage unit of the terminal device 10 and an external storage device. The memory 101 is used for storing the computer program and other programs and data required by the terminal device 10. The memory 101 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, and the like. It should be noted that the computer readable medium may contain content that is subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable media does not include electrical carrier signals and telecommunications signals as is required by legislation and patent practice.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (10)

1. A pedestrian re-identification method based on PTGAN is characterized by comprising the following steps:
acquiring a first image which is acquired by a first camera and contains a target object;
inputting the first image into a trained PTGAN model, and realizing the migration of a background difference region on the premise of keeping the foreground of the pedestrian unchanged to obtain a second image with the same style as the image shot by the second camera;
extracting pedestrian features of the second image;
and calculating, according to the cosine distance, the similarity between the pedestrian feature vector extracted from the second image and the feature vectors of pedestrian images shot by the second camera, and acquiring the pedestrian image with the highest similarity to the target object according to the similarity.
2. The PTGAN-based pedestrian re-recognition method of claim 1, wherein prior to inputting the first image into a trained PTGAN model, the method further comprises:
constructing a network model based on PTGAN;
taking the video images acquired by the first camera and the video images acquired by the second camera as the training set for training the PTGAN-based network model, and converting, through training and iterative feedback, the video images acquired by the first camera into images with the same style as the video images acquired by the second camera;
wherein the loss function of the PTGAN-based network model is expressed as:

$$L_{PTGAN} = L_{Style} + \lambda_1 L_{ID}$$

where $L_{Style}$ represents the style loss (domain difference loss), $L_{ID}$ represents the identity loss of the generated image, and $\lambda_1$ is the weight balancing $L_{Style}$ and $L_{ID}$.
3. The PTGAN-based pedestrian re-identification method according to claim 2, wherein after constructing the PTGAN-based network model, the method further comprises:
performing foreground segmentation on the first video image sequence by using PSPNet to obtain a mask layer region, wherein the identity loss $L_{ID}$ is expressed as:

$$L_{ID} = \mathbb{E}_{a \sim p_{data}(a)}\left[\left\|(G(a)-a) \odot M(a)\right\|_2\right] + \mathbb{E}_{b \sim p_{data}(b)}\left[\left\|(\bar{G}(b)-b) \odot M(b)\right\|_2\right]$$

where $G(a)$ is the transferred pedestrian image for image $a$, $\bar{G}(b)$ is the transferred pedestrian image for image $b$, $p_{data}(a)$ is the data distribution of the video images acquired by the first camera, $p_{data}(b)$ is the data distribution of the video images acquired by the second camera, and $M(a)$ and $M(b)$ are the two segmented mask regions.
4. The PTGAN-based pedestrian re-identification method according to claim 3, wherein extracting the pedestrian features of the second image comprises:
extracting appearance features of the second image based on the trained AlexNet model; and
extracting facial features of the second image based on the trained VGG-16 model.
5. A pedestrian re-identification device based on PTGAN is characterized by comprising:
the acquisition module is used for acquiring a first image which is acquired by the first camera and contains a target object;
the PTGAN module is used for inputting the first image into a trained PTGAN model, and realizing the migration of a background difference region on the premise of keeping the foreground of a pedestrian unchanged to obtain a second image with the same style as the image shot by the second camera;
the feature extraction module is used for extracting the pedestrian features of the second image;
and the identification module is used for calculating, according to the cosine distance, the similarity between the pedestrian feature vector extracted from the second image and the feature vectors of pedestrian images shot by the second camera, and for acquiring the pedestrian image with the highest similarity to the target object according to the similarity.
6. The PTGAN-based pedestrian re-identification device according to claim 5, further comprising:
the PTGAN construction module is used for constructing a network model based on the PTGAN;
the PTGAN training module is used for taking the video images acquired by the first camera and the video images acquired by the second camera as the training set for training the PTGAN-based network model, and for converting, through training and iterative feedback, the video images acquired by the first camera into images with the same style as the video images acquired by the second camera;
wherein the loss function of the PTGAN-based network model is expressed as:

$$L_{PTGAN} = L_{Style} + \lambda_1 L_{ID}$$

where $L_{Style}$ represents the style loss (domain difference loss), $L_{ID}$ represents the identity loss of the generated image, and $\lambda_1$ is the weight balancing $L_{Style}$ and $L_{ID}$.
7. The PTGAN-based pedestrian re-identification device according to claim 6, further comprising:
a foreground segmentation module for performing foreground segmentation on the first video image sequence using PSPNet to obtain a mask layer region, wherein the identity loss $L_{ID}$ is expressed as:

$$L_{ID} = \mathbb{E}_{a \sim p_{data}(a)}\left[\left\|(G(a)-a) \odot M(a)\right\|_2\right] + \mathbb{E}_{b \sim p_{data}(b)}\left[\left\|(\bar{G}(b)-b) \odot M(b)\right\|_2\right]$$

where $G(a)$ is the transferred pedestrian image for image $a$, $\bar{G}(b)$ is the transferred pedestrian image for image $b$, $p_{data}(a)$ is the data distribution of the video images acquired by the first camera, $p_{data}(b)$ is the data distribution of the video images acquired by the second camera, and $M(a)$ and $M(b)$ are the two segmented mask regions.
8. The PTGAN-based pedestrian re-recognition device according to claim 7, wherein the feature extraction module comprises:
the appearance characteristic module is used for extracting appearance characteristics of the second image based on the trained AlexNet model;
and the facial feature module is used for extracting the facial features of the second image based on the trained VGG-16 model.
9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1-4 when executing the computer program.
10. A computer-readable medium, in which a computer program is stored which, when executed by a processor, carries out the steps of the method according to any one of claims 1 to 4.
CN201911327963.1A 2019-12-20 2019-12-20 Pedestrian re-identification method and device based on PTGAN Pending CN111126250A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911327963.1A CN111126250A (en) 2019-12-20 2019-12-20 Pedestrian re-identification method and device based on PTGAN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911327963.1A CN111126250A (en) 2019-12-20 2019-12-20 Pedestrian re-identification method and device based on PTGAN

Publications (1)

Publication Number Publication Date
CN111126250A true CN111126250A (en) 2020-05-08

Family

ID=70500742

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911327963.1A Pending CN111126250A (en) 2019-12-20 2019-12-20 Pedestrian re-identification method and device based on PTGAN

Country Status (1)

Country Link
CN (1) CN111126250A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108256439A (en) * 2017-12-26 2018-07-06 北京大学 A kind of pedestrian image generation method and system based on cycle production confrontation network
CN109886251A (en) * 2019-03-11 2019-06-14 南京邮电大学 A kind of recognition methods again of pedestrian end to end guiding confrontation study based on posture
CN110110755A (en) * 2019-04-04 2019-08-09 长沙千视通智能科技有限公司 Based on the pedestrian of PTGAN Regional disparity and multiple branches weight recognition detection algorithm and device

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112016402A (en) * 2020-08-04 2020-12-01 杰创智能科技股份有限公司 Unsupervised learning-based pedestrian re-identification field self-adaption method and device
CN112016402B (en) * 2020-08-04 2024-05-17 杰创智能科技股份有限公司 Self-adaptive method and device for pedestrian re-recognition field based on unsupervised learning
CN113221807A (en) * 2021-05-26 2021-08-06 新疆爱华盈通信息技术有限公司 Pedestrian re-identification method and system with multiple cameras
CN114218423A (en) * 2022-02-21 2022-03-22 广东联邦家私集团有限公司 5G-based non-labeling solid wood board identity digitalization method, device and system

Similar Documents

Publication Publication Date Title
Zhao et al. Multi-focus image fusion with a natural enhancement via a joint multi-level deeply supervised convolutional neural network
US11830230B2 (en) Living body detection method based on facial recognition, and electronic device and storage medium
CN109543548A (en) A kind of face identification method, device and storage medium
CN110334762B (en) Feature matching method based on quad tree combined with ORB and SIFT
CN112530019B (en) Three-dimensional human body reconstruction method and device, computer equipment and storage medium
CN111754396B (en) Face image processing method, device, computer equipment and storage medium
CN111079764A (en) Low-illumination license plate image recognition method and device based on deep learning
CN111291612A (en) Pedestrian re-identification method and device based on multi-person multi-camera tracking
CN110825900A (en) Training method of feature reconstruction layer, reconstruction method of image features and related device
CN112651380A (en) Face recognition method, face recognition device, terminal equipment and storage medium
CN113011253B (en) Facial expression recognition method, device, equipment and storage medium based on ResNeXt network
CN112528866A (en) Cross-modal face recognition method, device, equipment and storage medium
CN110222718A (en) The method and device of image procossing
CN111080670A (en) Image extraction method, device, equipment and storage medium
CN111833360B (en) Image processing method, device, equipment and computer readable storage medium
CN112507897A (en) Cross-modal face recognition method, device, equipment and storage medium
CN112101195A (en) Crowd density estimation method and device, computer equipment and storage medium
CN111209873A (en) High-precision face key point positioning method and system based on deep learning
CN111814682A (en) Face living body detection method and device
CN111126250A (en) Pedestrian re-identification method and device based on PTGAN
Liu et al. Iris recognition in visible spectrum based on multi-layer analogous convolution and collaborative representation
CN111626212B (en) Method and device for identifying object in picture, storage medium and electronic device
CN111353325A (en) Key point detection model training method and device
CN111104911A (en) Pedestrian re-identification method and device based on big data training
KR20180092453A (en) Face recognition method Using convolutional neural network and stereo image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination