CN112270241B - Pedestrian re-identification method and device, electronic equipment and computer readable storage medium - Google Patents

Pedestrian re-identification method and device, electronic equipment and computer readable storage medium

Info

Publication number
CN112270241B
CN112270241B (application number CN202011139919.0A)
Authority
CN
China
Prior art keywords
pedestrian
probability
visual
calculating
image data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011139919.0A
Other languages
Chinese (zh)
Other versions
CN112270241A (en)
Inventor
邓练兵 (Deng Lianbing)
文少杰 (Wen Shaojie)
陈小满 (Chen Xiaoman)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Dahengqin Technology Development Co Ltd
Original Assignee
Zhuhai Dahengqin Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhuhai Dahengqin Technology Development Co Ltd filed Critical Zhuhai Dahengqin Technology Development Co Ltd
Priority to CN202011139919.0A priority Critical patent/CN112270241B/en
Publication of CN112270241A publication Critical patent/CN112270241A/en
Application granted granted Critical
Publication of CN112270241B publication Critical patent/CN112270241B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 - Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Traffic Control Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a pedestrian re-identification method and device, an electronic device, and a computer-readable storage medium. The method comprises the following steps: acquiring pedestrian image data; extracting the pedestrian features in each image of the pedestrian image data and calculating the visual probability of the pedestrian from those features; when the visual probability does not exceed a preset threshold, acquiring the camera identifier and frame number information in the pedestrian image data and calculating the spatio-temporal probability of the pedestrian from them; calculating the position, in the actual environment, of the pedestrian in images shot by different cameras at the same time; configuring the influence parameters of the visual probability and the spatio-temporal probability according to that position information; and calculating a joint probability from the visual probability and the spatio-temporal probability together with their influence parameters, obtaining the pedestrian re-identification result. By adjusting the weights that the influence parameters give the visual and spatio-temporal probabilities under different conditions, the accuracy of the joint probability under those conditions is improved.

Description

Pedestrian re-identification method and device, electronic equipment and computer readable storage medium
Technical Field
The invention relates to the field of artificial intelligence, in particular to a pedestrian re-identification method, a pedestrian re-identification device, electronic equipment and a computer-readable storage medium.
Background
Society pays ever more attention to public safety, video surveillance systems keep growing in scale, and intelligent video surveillance is widely applied in public places such as schools, shopping malls, parks, and transport hubs to assist urban safety management.
However, because of the low resolution and the shooting angles of the cameras, pedestrians frequently occlude one another, and a frontal face picture of high quality and resolution usually cannot be obtained. In real scenes, the variability of pedestrians and the many interferences and uncertain factors make detection and recognition harder, while differences in appearance, viewing angle, and posture lower pedestrian-detection precision. When face recognition fails, the pedestrian can instead be recognized from the structural information or high-level semantic information of the human body; different cameras can then track, match, and identify a target person or crowd across time and space, effectively compensating for the limited field of view of any fixed camera.
In the prior art, the factors that influence the final re-identification result differ across conditions, and the existing way of fusing the visual probability and the spatio-temporal probability does not suit all scenes, so the recognition accuracy is low in some scenes.
Disclosure of Invention
Therefore, the technical problem to be solved by the present invention is to overcome the defect that the prior-art fusion of visual probability and spatio-temporal probability yields low recognition accuracy in some scenes, by providing a pedestrian re-identification method comprising the following steps:
acquiring pedestrian image data;
extracting the pedestrian features in each image of the pedestrian image data, and calculating the visual probability of the pedestrian from the pedestrian features, wherein the visual probability represents the similarity between pedestrians in different images;
judging whether the visual probability exceeds a preset threshold;
when the visual probability does not exceed the preset threshold, acquiring the camera identifier and frame number information in the pedestrian image data, and calculating the spatio-temporal probability of the pedestrian from the camera identifier and frame number information;
calculating the position, in the actual environment, of the pedestrian in images shot by different cameras at the same time;
configuring an influence parameter for the visual probability and for the spatio-temporal probability according to the position information, wherein each influence parameter reflects the weight of the visual probability or of the spatio-temporal probability and is a numerical value greater than or equal to 0 and less than or equal to 1;
and calculating a joint probability from the visual probability and its influence parameter and the spatio-temporal probability and its influence parameter, obtaining a pedestrian re-identification result.
Optionally, when the visual probability exceeds the preset threshold, it is determined that the pedestrians in the different images are the same person.
Optionally, the joint probability is calculated by the following formula:
[joint probability formula, rendered as an image in the original publication]
where φ represents a hyper-parameter that balances the visual probability and the spatio-temporal probability, e represents the base of the natural logarithm, ρ_V represents the visual probability, ρ_ST represents the spatio-temporal probability, γ takes the value 5, x represents the influence parameter of the visual probability, and y represents the influence parameter of the spatio-temporal probability.
Optionally, the configuring the influence parameters of the visual probability and the spatiotemporal probability according to the position information includes:
judging whether the pedestrians in different images are at the same position or not based on the position information;
when the pedestrians are located at the same position in different images, the values of x and y are both set to 0.
Optionally, when the pedestrian is at different positions in different images, the values of x and y are both greater than 0 and less than 1, wherein the greater the positional deviation of the pedestrian across the images, the greater the value of y.
Optionally, calculating the position information of the pedestrian in the images captured by different cameras at the same time in the actual environment includes:
acquiring camera parameters and a camera position of a camera for shooting the pedestrian image data;
calculating the position information of the pedestrian in the image using the camera parameters and the camera position.
Optionally, the extracting pedestrian features in each image of the pedestrian image data, and calculating a visual probability of a pedestrian according to the pedestrian features includes:
taking the pedestrian image data as the input of a pre-trained neural network model, and extracting the pedestrian features in each image of the pedestrian image data with the neural network model to obtain a pedestrian feature matrix;
the acquiring of the camera identification and the frame number information in the pedestrian image data and the calculating of the spatiotemporal probability of the pedestrian according to the camera identification and the frame number information comprise:
and taking the pedestrian image data, the camera identification and the frame number information as the input of a space-time probability model, and calculating by using the space-time probability model to obtain the space-time probability of the pedestrian.
The present invention also provides a pedestrian re-identification apparatus, including:
the first acquisition module is used for acquiring pedestrian image data;
the extraction module is used for extracting the pedestrian features in each image in the pedestrian image data, and calculating the visual probability of the pedestrian according to the pedestrian features, wherein the visual probability represents the similarity between pedestrians on different images;
the judging module is used for judging whether the visual probability exceeds a preset threshold value;
the second acquisition module is used for acquiring the camera identification and the frame number information in the pedestrian image data when the visual probability does not exceed the preset threshold value, and calculating the space-time probability of the pedestrian according to the camera identification and the frame number information;
the first calculation module is used for calculating the position information of the pedestrians in the images shot by different cameras at the same time in the actual environment;
a configuration module, configured to configure an influence parameter of the visual probability and the spatiotemporal probability according to the location information, where the influence parameter is used to reflect a weight of the visual probability or the spatiotemporal probability, and the influence parameter is a numerical value greater than or equal to 0 and less than or equal to 1;
and the second calculation module is used for calculating a joint probability by utilizing the visual probability and the influence parameters thereof as well as the space-time probability and the influence parameters thereof to obtain a pedestrian re-identification result.
The invention also provides an electronic device comprising a memory and a processor communicatively connected to each other, the memory storing computer instructions and the processor executing those computer instructions so as to perform the above pedestrian re-identification method.
The invention also provides a computer-readable storage medium, which stores computer instructions for causing a computer to execute the pedestrian re-identification method.
The technical scheme of the invention has the following advantages:
1. In the pedestrian re-identification method provided by the invention, pedestrian features are extracted from the acquired pedestrian image data, the visual probability of the pedestrian is calculated from them, and whether the pedestrians in two different images are the same person can be judged from that probability. When the visual probability does not exceed a preset threshold, the spatio-temporal probability of the pedestrian is calculated from the camera identifier and frame number information in the pedestrian image data; meanwhile, the positions, in the actual environment, of the pedestrians in images shot by different cameras at the same time are calculated, the influence parameters of the visual probability and the spatio-temporal probability are configured by comparing the computed positions of the pedestrian in the two photographs, and the final joint probability is obtained by adjusting those influence parameters, improving the accuracy of the final recognition result.
2. According to the pedestrian re-identification method provided by the invention, when the visual probability exceeds the preset threshold value, subsequent calculation of the space-time probability and the joint probability is not carried out, and the pedestrians on different images are directly determined to be the same person.
3. According to the pedestrian re-identification method provided by the invention, when the visual probability does not exceed the preset threshold value, subsequent calculation processing is carried out, but when the position information of the pedestrians in the images shot by different cameras at the same time in the actual environment is consistent, the pedestrians in the different images can be directly judged to be the same person.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flow chart of a pedestrian re-identification method according to an embodiment of the present invention;
fig. 2 is a functional block diagram of a pedestrian re-identification apparatus according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it should be noted that, unless otherwise explicitly specified or limited, the term "connected" is to be interpreted broadly: a connection may be fixed, detachable, or integral; mechanical or electrical; direct, or indirect through an intermediate medium, or internal between two elements; wireless or wired. The specific meanings of these terms in the present invention can be understood by those skilled in the art according to the specific circumstances.
In addition, the technical features involved in the different embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Example 1
The present embodiment provides a pedestrian re-identification method; fig. 1 is a flow chart of pedestrian re-identification, in which pedestrian features are extracted from pedestrian image data, according to some embodiments of the present invention. Although the processes described below include operations that occur in a particular order, it should be appreciated that the processes may include more or fewer operations, which may be performed sequentially or in parallel (e.g., using parallel processors or a multi-threaded environment).
The pedestrian re-identification method provided by the invention comprises the following steps:
and S101, acquiring pedestrian image data.
In this step, the pedestrian image data may be obtained from cameras at the specific location where the pedestrian to be identified is located; these may be surveillance or road-traffic cameras at that location, cameras installed in nearby malls, supermarkets, and the like, vehicle-mounted cameras, and so on.
In some embodiments, the pedestrian image data may also be obtained from electronic devices with a photographing or video-recording function, such as mobile phones, cameras, and video cameras of passers-by at that location, as long as such a device captured the relevant image or video. The obtained pedestrian image data carries embedded metadata such as the position and time of capture: the position may be expressed in longitude and latitude or in a system-defined coordinate frame, and the time is taken from the attributes of the video or photo. Most such electronic devices automatically record the capture time, location, and similar information when shooting or recording.
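As a brief illustrative sketch (not part of the claimed method), such embedded capture metadata can be read programmatically; the Pillow calls below are standard, while the file path and the choice of tags are ours:

```python
# Illustrative sketch: read the capture time and GPS position embedded in a
# photo's EXIF metadata with Pillow. "photo.jpg" is a hypothetical path.
from PIL import Image
from PIL.ExifTags import TAGS, GPSTAGS

def capture_metadata(path):
    exif = Image.open(path).getexif()
    named = {TAGS.get(t, t): v for t, v in exif.items()}
    gps_ifd = exif.get_ifd(0x8825)                     # the EXIF GPS IFD
    gps = {GPSTAGS.get(t, t): v for t, v in gps_ifd.items()}
    return named.get("DateTime"), gps.get("GPSLatitude"), gps.get("GPSLongitude")

when, lat, lon = capture_metadata("photo.jpg")
```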
S102, extracting pedestrian features in each image in the pedestrian image data, and calculating the visual probability of the pedestrian according to the pedestrian features, wherein the visual probability represents the similarity between pedestrians on different images.
In this step, the pedestrian image data is used as the input of a pre-trained neural network model. The residual attention module in the model extracts the pedestrian features (color, texture, shape, and the like) from each image, and the extracted features are arranged into a pedestrian feature matrix f of dimension H × W × C, where H, W, and C denote the height, width, and number of channels of the feature map, respectively.
Using the pedestrian feature sampling layer in the residual attention network, the pedestrian feature matrix f of dimension H × W × C is sampled into six local feature matrices f1, f2, f3, f4, f5, f6 of dimension (H/6) × W × C. Global Average Pooling (GAP) then computes a local feature vector V1, V2, V3, V4, V5, V6 from each local feature matrix, and finally a local feature concatenation layer (Concat) joins V1 through V6 into a single feature vector V.
For example, to judge whether the pedestrians x_test1 and x_test2 in two different photographs are the same person, their feature vectors V_test1 and V_test2 are obtained with the pre-trained neural network model, and the visual probability is calculated as the cosine similarity:

ρ_V = (V_test1 · V_test2) / (||V_test1|| · ||V_test2||)

where || · || denotes the l2 norm of a feature vector.
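In code, this cosine-similarity visual probability is a one-liner (a NumPy sketch):

```python
import numpy as np

def visual_probability(v1: np.ndarray, v2: np.ndarray) -> float:
    """rho_V: cosine similarity between two pedestrian feature vectors."""
    return float(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2)))
```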
The pre-trained neural network model can be obtained by training through the following steps:
step a, carrying out pedestrian feature extraction on an input pedestrian M by using a ResNet-50 model obtained by pretraining an ImageNet data set so as to obtain a pedestrian feature matrix, wherein the pedestrian feature matrix is represented as f; wherein the ImageNet dataset is a common dataset, and in some embodiments, other datasets may be selected, such as an MNIST dataset, an MS-COCO dataset, and the like; the ImageNet data set and the ResNet-50 model are prior art and are not described in detail herein.
Step b: take the pedestrian feature matrix f of dimension H × W × C obtained in step a as the input of the residual attention network, with the corresponding identity information N as the target output, where H, W, and C denote the height, width, and number of channels of the feature map, respectively.
Using the pedestrian feature sampling layer, sample the pedestrian feature matrix f of dimension H × W × C into six local feature matrices f1 through f6 of dimension (H/6) × W × C, and then compute the local feature vectors V1 through V6 from them by Global Average Pooling (GAP).
Step c: concatenate the obtained local feature vectors V1 through V6 into a feature vector V through the local feature concatenation layer (Concat), calculate the cross-entropy loss between the feature vector V of pedestrian M and the pedestrian identity N, and back-propagate with stochastic gradient descent to optimize the parameters of the residual attention network until the upper limit of training iterations is reached. The trained neural network model finally output is the optimized residual attention network.
In the above steps, a residual attention mechanism network needs to be constructed, and the residual attention mechanism network includes a residual attention mechanism module, a pedestrian feature sampling layer, a global pooling layer and a local feature connection layer, which are connected in sequence.
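A compressed, hedged sketch of steps a through c is given below. The residual attention module itself is omitted, the optimizer settings and the 751-identity head (the size used by the Market-1501 data set) are illustrative placeholders, and inputs are assumed sized so that the conv5 height is divisible by six (e.g. 384 × 128):

```python
import torch
import torch.nn as nn
import torchvision

class ReIDNet(nn.Module):
    """ResNet-50 backbone plus the stripe GAP/concat head described above;
    the residual attention module is omitted for brevity."""
    def __init__(self, num_identities: int, num_stripes: int = 6):
        super().__init__()
        resnet = torchvision.models.resnet50(weights="IMAGENET1K_V1")
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])
        self.classifier = nn.Linear(2048 * num_stripes, num_identities)
        self.num_stripes = num_stripes

    def forward(self, x):
        f = self.backbone(x)                          # pedestrian feature matrix
        b, c, h, w = f.shape
        stripes = f.split(h // self.num_stripes, dim=2)
        v = torch.cat([s.mean(dim=(2, 3)) for s in stripes], dim=1)  # vector V
        return v, self.classifier(v)

model = ReIDNet(num_identities=751)
criterion = nn.CrossEntropyLoss()                     # loss between V's logits and N
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

def train_step(images, identities):
    _, logits = model(images)
    loss = criterion(logits, identities)
    optimizer.zero_grad()
    loss.backward()                                   # back-propagation
    optimizer.step()                                  # stochastic gradient descent
    return loss.item()
```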
S103, judging whether the visual probability exceeds a preset threshold value.
In the implementation steps, the system can automatically judge the visual probability value between the pedestrians in the two different photos, namely judge the similarity between the pedestrians in the two different photos. When the visual probability between the pedestrians in the two different pictures exceeds a certain value set by the system, the system can directly judge that the pedestrians in the two different pictures are the same person.
For example, suppose the preset threshold is set in advance to 95%, and the visual probability ρ_V of pedestrian test1 and pedestrian test2 in the two different images is obtained through step S102. If ρ_V is 95% or higher, the system automatically judges pedestrian test1 and pedestrian test2 to be the same person and reports this by image, sound, or text, ending the whole pedestrian re-identification process. In some embodiments the preset threshold may be 90%, 96%, 97%, or 99%; the higher the preset threshold, the more accurate the re-identification result obtained directly from the visual probability.
In some embodiments, when the obtained visual probability deviates only slightly from the preset threshold (for example, the visual probability obtained in step S102 is slightly below it), the two pedestrians are not immediately declared different persons; the judgment can be made again in combination with other factors, for example spatio-temporal factors (the time at which an image was captured and the spatial position of the camera or of the pedestrian). When the spatio-temporal factors of the two pedestrians are close, and the visual probability, though below the preset threshold, differs from it only within a certain range, the two can still be considered the same pedestrian. If pedestrian test1 and pedestrian test2 in the two images are already determined in step S103 to be the same person, the subsequent steps are skipped, saving computation and time and yielding the result faster.
If the obtained visual probability does not exceed the preset threshold, and its difference from the threshold is not within the specified range, pedestrian test1 and pedestrian test2 in the two images are not yet considered the same person, and the subsequent steps are required to further obtain the re-identification result.
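The decision logic of step S103 might be sketched as follows; the threshold and tolerance values are the examples from the preceding paragraphs:

```python
def decide_by_visual(rho_v: float, threshold: float = 0.95, tolerance: float = 0.02):
    """Sketch of step S103. 'same' ends the pipeline; the other outcomes
    fall through to steps S104-S107, where near-threshold cases are
    re-judged with spatio-temporal cues."""
    if rho_v >= threshold:
        return "same"
    if rho_v >= threshold - tolerance:
        return "near-threshold"
    return "undecided"
```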
S104, when the visual probability does not exceed the preset threshold, acquiring camera identification and frame number information in the image data of the pedestrian, and calculating the space-time probability of the pedestrian according to the camera identification and frame number information;
In this step, the camera identifier and the frame number information are used as the input of a spatio-temporal probability model, which calculates the spatio-temporal probability of the pedestrian. During pre-training, the spatio-temporal probability model is built from the camera identifiers and frame number information in the sample pedestrian labels of the training data set, where the camera identifier includes the ID number of the camera (identifying where the camera is located) and the frame number information is the time at which the camera captured the pedestrian. In some embodiments, the spatio-temporal probability model may be merged into the neural network model, so that a single model computes both the visual and the spatio-temporal probability.
The spatio-temporal probability is modeled from the spatio-temporal information carried in the captured images. The spatio-temporal probability ρ_ST(p_test1 = p_test2 | k, c_test1, c_test2) denotes the probability that pedestrian test1 and pedestrian test2 are the same person given the time period k and the cameras c_test1 and c_test2, and can be expressed as:

ρ_ST(p_test1 = p_test2 | k, c_test1, c_test2) = n^k(c_test1, c_test2) / Σ_l n^l(c_test1, c_test2)

where p_test1 and p_test2 denote the identity information of pedestrian test1 and pedestrian test2 in the images; c_test1 and c_test2 denote the ID numbers of the cameras that captured them; k denotes the k-th time period and l indexes the time periods summed over in the denominator; n^k(c_test1, c_test2) is the number of pedestrians who travel from camera c_test1 to camera c_test2 with a time difference falling in the k-th time period, and n^l(c_test1, c_test2) is the corresponding count for the l-th period. In some embodiments a time period may be 1 frame, 10 frames, 25 frames, 50 frames, 100 frames, and so on; in this embodiment one frame is taken as one time period.
Since this probability-estimation model exhibits considerable jitter, a Gaussian distribution function is used for smoothing to reduce the resulting interference. The process can be expressed as:

ρ̂_ST(p_test1 = p_test2 | k, c_test1, c_test2) = (1/z) · Σ_l ρ_ST(p_test1 = p_test2 | l, c_test1, c_test2) · K(l − k)

K(x) = (1 / (√(2π) · λ)) · e^(−(x − μ)² / (2λ²))

where z = Σ_k ρ_ST(p_test1 = p_test2 | k, c_test1, c_test2) is the normalization factor, K(·) is the Gaussian distribution function, μ is the distribution-shift parameter (generally 0), λ is the distribution-scale parameter (a value of 50 is suggested), and e is the base of the natural logarithm.
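The smoothing step, as we read it, amounts to convolving the per-period probabilities with a Gaussian kernel (μ = 0, λ = 50 as suggested above):

```python
import numpy as np

def smooth_rho_st(probs: np.ndarray, mu: float = 0.0, lam: float = 50.0) -> np.ndarray:
    """probs[k] is the raw rho_ST for time period k; returns the smoothed,
    renormalized distribution. A sketch of our reading of the formula."""
    k = np.arange(len(probs))
    smoothed = np.empty_like(probs, dtype=float)
    for i in k:
        kernel = np.exp(-((k - i - mu) ** 2) / (2 * lam ** 2)) / (np.sqrt(2 * np.pi) * lam)
        smoothed[i] = np.sum(probs * kernel)
    return smoothed / smoothed.sum()                  # divide by the factor z
```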
And S105, calculating the position information of the pedestrians in the images shot by different cameras at the same time in the actual environment.
In this step, first the camera parameters and camera position of the camera that captured the pedestrian image data must be acquired. The camera parameters include the focal length, pixel size, baseline length, and so on; the camera position includes the actual location of the camera, its angle to the horizontal or vertical plane, its height above the ground, and similar information. The actual location may be identified by longitude-latitude coordinates or by coordinates defined by the system.
After camera parameters and position information of a camera are acquired, position information of a pedestrian in an actual environment in an image is calculated, and when the actual position of the camera is represented by longitude and latitude coordinate values, the position information of the pedestrian in the actual environment is also represented by the longitude and latitude coordinate values; when the position where the camera is actually located is represented by system-specified coordinate values, the position information where the pedestrian is located in the actual environment is also represented by the system-specified coordinate values. The position information of the pedestrian in the actual environment at the same time can be calculated by adopting images shot by a monocular camera or a binocular camera.
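One common way to realize step S105 for a single calibrated camera is to intersect the viewing ray of the pedestrian's foot point with the ground plane; the following sketch assumes pinhole intrinsics K and a pose (R, t) with x_cam = R·x_world + t, and is illustrative only:

```python
import numpy as np

def pixel_to_ground(u: float, v: float, K: np.ndarray, R: np.ndarray, t: np.ndarray):
    """Back-project pixel (u, v) (e.g. a pedestrian's foot point) onto the
    world ground plane Z = 0. Illustrative sketch, not the patent's method."""
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])    # viewing ray, camera frame
    ray_w = R.T @ ray                                  # rotate into world frame
    cam_center = -R.T @ t                              # camera position in world
    s = -cam_center[2] / ray_w[2]                      # intersect with Z = 0
    return cam_center + s * ray_w                      # pedestrian position (X, Y, 0)
```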
S106, configuring influence parameters of the visual probability and the space-time probability according to the position information, wherein the influence parameters are used for reflecting the weight of the visual probability or the space-time probability, and the influence parameters are numerical values which are greater than or equal to 0 and less than or equal to 1.
In this step, the position of the pedestrian in the actual environment is obtained for each image through step S105. The positions computed from the two different photographs are compared, and whether the pedestrian occupies the same position in both images is judged from the position information, so that the influence parameter x of the visual probability and the influence parameter y of the spatio-temporal probability are configured accordingly.
And S107, calculating to obtain a joint probability by using the visual probability and the influence parameters thereof and the space-time probability and the influence parameters thereof, and obtaining a pedestrian re-identification result.
In this step, the visual probability ρ_V obtained in step S102 and the spatio-temporal probability ρ_ST obtained in step S104 are used to calculate the final joint spatio-temporal probability ρ_joint, giving the pedestrian re-identification result. Because the spatio-temporal probability and the visual probability may differ in magnitude, the two are balanced through a sigmoid activation function, and the final joint probability ρ_joint can be expressed as a Bayesian joint probability:

[joint probability formula, rendered as an image in the original publication]

where φ represents the hyper-parameter balancing the visual probability and the spatio-temporal probability (a value of 50 to 70 is suggested), e represents the base of the natural logarithm, ρ_V is the visual probability obtained in step S102, ρ_ST is the spatio-temporal probability obtained in step S104, γ takes the value 5, x represents the influence parameter of the visual probability, and y represents the influence parameter of the spatio-temporal probability.
In the joint probability, the spatio-temporal probability and the visual probability are mutually independent, and the spatio-temporal probability constrains the visual probability, making the pedestrian re-identification more accurate.
When the pedestrians in the two different images are at the same position, the influence parameter x of the visual probability and the influence parameter y of the spatio-temporal probability are both set to 0, and the pedestrians in the two images can be judged to be the same person: if two different cameras photograph the same place from different positions at the same moment, the position of a pedestrian in the actual environment is unique. For example, if the pedestrian in both images is located at north latitude N31°18'3.84" and east longitude E120°34'52.11", the pedestrians in the two images can be determined to be the same person. In some embodiments, the positions of the pedestrians in the two images may instead be expressed in a system-defined coordinate frame.
When the pedestrian is at different positions in the two images but the difference between the positions is within the error tolerance, the influence parameter x of the visual probability and the influence parameter y of the spatio-temporal probability are both 0, or may be numbers close to 0, for example 0.001 or 0.002; the error tolerance may be set to 1%, 2%, 3%, or the like. For example, if pedestrian test1 is located at north latitude N31°18'3.84" and east longitude E120°34'52.11", and pedestrian test2 at north latitude N31°18'3.84" and east longitude E120°34'52.12", the difference between their positions is within the tolerance, and test1 and test2 can also be considered the same person. In some embodiments, the positions may be expressed in a system-defined coordinate frame.
When the pedestrians in the two images are at different positions, or the difference between their positions is not within the error tolerance, the influence parameter x of the visual probability and the influence parameter y of the spatio-temporal probability must be reconfigured: both take values greater than 0 and less than 1, and the larger the positional deviation of the pedestrian across the images, the larger the value of y, as sketched in the code below.
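A hedged sketch of this configuration rule follows; the description only fixes the endpoints (0 when the positions coincide, values in (0, 1) growing with the deviation), so the particular mapping below is our assumption:

```python
import math

def configure_influence(pos1, pos2, tolerance: float = 0.01):
    """Sketch of step S106: choose x (visual) and y (spatio-temporal) from
    the positional deviation of the pedestrian between the two images,
    with positions given in a common coordinate frame."""
    deviation = math.dist(pos1, pos2)
    if deviation <= tolerance:
        return 0.0, 0.0                   # same position -> same person
    y = min(0.99, deviation)              # larger deviation -> larger y (assumed)
    x = max(0.01, 1.0 - y)                # keep x in (0, 1); x + y = 1 is assumed
    return x, y
```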
With the influence parameter x of the visual probability and the influence parameter y of the spatio-temporal probability configured, the final joint probability is obtained. When the computed joint probability is greater than or equal to a preset probability value, the pedestrians in the two images are considered the same person; when it is smaller, they are considered two different people. The preset probability value may be 95%, 96%, 98%, or the like.
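Since the exact fusion formula appears only as an image in the original publication, the following sketch should be read as one plausible form consistent with the surrounding text: a sigmoid rescaling of each probability (φ in [50, 70], γ = 5), weighted by its influence parameter, with x = y = 0 yielding 1 to match the same-position rule above:

```python
import math

def joint_probability(rho_v, rho_st, x, y, phi=60.0, gamma=5.0):
    """Assumed form of rho_joint; NOT the formula image from the patent."""
    sig_v = 1.0 / (1.0 + gamma * math.exp(-phi * rho_v))     # rescaled rho_V
    sig_st = 1.0 / (1.0 + gamma * math.exp(-phi * rho_st))   # rescaled rho_ST
    return (sig_v ** x) * (sig_st ** y)

def same_person(rho_v, rho_st, x, y, preset: float = 0.95) -> bool:
    return joint_probability(rho_v, rho_st, x, y) >= preset
```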
Example 2
The present embodiment provides a pedestrian re-recognition apparatus, as shown in fig. 2, including:
the first obtaining module 21 is configured to obtain pedestrian image data captured by a camera, and the source of the pedestrian image data may refer to the related description of step S101 corresponding to the foregoing method embodiment, which is not described herein again.
The extracting module 22 is configured to extract pedestrian features in each image in the pedestrian image data from the first obtaining module 21, where the pedestrian features include features such as color, texture, and shape of the image, and calculate a visual probability of a pedestrian according to the pedestrian features, where the visual probability represents a similarity between pedestrians on different images. The process of the extracting module 22 can refer to the related description of step S102 corresponding to the above method embodiment, and is not described herein again.
The judging module 23 is configured to judge whether the visual probability exceeds a preset threshold. The threshold may be a single value or an interval: when the visual probability exceeds the upper bound of the interval, the system directly judges the pedestrians in the two images to be the same person; when it is below the lower bound, the system directly judges them not to be the same person, ending the recognition process; and when it falls inside the interval, the system must judge further, combining other factors to decide whether the two are the same pedestrian. For other processing of the judging module 23, reference may be made to the description of step S103 in the method embodiment above, which is not repeated here.
And the second obtaining module 24 is configured to obtain the camera identifier and the frame number information in the pedestrian image data when the visual probability does not exceed the preset threshold, and calculate the spatiotemporal probability of the pedestrian according to the camera identifier and the frame number information as the input of the spatiotemporal probability model. The process of the second obtaining module 24 can refer to the related description of step S104 corresponding to the above method embodiment, and is not described herein again.
The first calculating module 25 is used for calculating the position information of the pedestrians in the images shot by different cameras at the same time in the actual environment; the process of the first calculating module 25 can refer to the related description of the above method embodiment corresponding to step S105, and is not repeated herein.
A configuration module 26, configured to configure an influence parameter of the visual probability and the spatio-temporal probability according to the location information, where the influence parameter is used to reflect a weight of the visual probability or the spatio-temporal probability, and the influence parameter is a value greater than or equal to 0 and less than or equal to 1; by reconfiguring the influence parameters of the visual probability and the space-time probability, the final recognition result can be more accurate. The process of configuring the module 26 may refer to the related description of step S106 corresponding to the above method embodiment, and is not described herein again.
The second calculating module 27 is configured to calculate a joint probability using the visual probability and its influence parameter and the spatio-temporal probability and its influence parameter, obtaining the final pedestrian re-identification result. For the process of the second calculating module 27, reference may be made to the description of step S107 in the method embodiment above, which is not repeated here.
In the above-mentioned pedestrian re-identification device, when the judgment module 23 judges that the visual probability exceeds the preset threshold, it can be directly judged that the pedestrians on the two different images are the same pedestrian, so that the whole pedestrian re-identification process is finished, and the calculation amount and the calculation time of the system are reduced. When the judgment module 23 judges that the visual probability does not exceed the preset threshold, the influence parameters of the visual probability and the space-time probability are configured by combining the position relations of the pedestrians on different images in the actual environment, so that the final joint probability is obtained, and the identification accuracy is improved.
Example 3
The present embodiment provides a computer device. As shown in fig. 3, the device includes a processor 31 and a memory 32, which may be connected by a bus or by other means; fig. 3 takes connection by a bus as the example.
The processor 31 may be a Central Processing Unit (CPU) or another general-purpose processor, a Digital Signal Processor (DSP), a Graphics Processing Unit (GPU), an embedded Neural-network Processing Unit (NPU) or other dedicated deep-learning coprocessor, an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like, or a combination thereof.
The memory 32, which is a non-transitory computer readable storage medium, can be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the pedestrian re-identification method in embodiment 1 of the present invention. The processor 31 executes various functional applications and data processing of the processor 31 by running non-transitory software programs, instructions and modules stored in the memory 32, that is, implements the pedestrian re-identification method in the above-described method embodiment.
The memory 32 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created by the processor 31, and the like. Further, the memory 32 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 32 may optionally include memory located remotely from the processor 31, and these remote memories may be connected to the processor 31 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The one or more program instructions and/or modules are stored in the memory 32 and, when executed by the processor 31, perform the pedestrian re-identification method in the embodiment shown in fig. 1.
The embodiment of the invention also provides a non-transitory computer storage medium, wherein the computer storage medium stores computer executable instructions, and the computer executable instructions can execute the pedestrian re-identification method in any method embodiment. The storage medium may be a magnetic Disk, an optical Disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory (Flash Memory), a Hard Disk (Hard Disk Drive, abbreviated as HDD) or a Solid State Drive (SSD), etc.; the storage medium may also comprise a combination of memories of the kind described above.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
It should be understood that the above examples are only for clarity of illustration and are not intended to limit the embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. And are neither required nor exhaustive of all embodiments. And obvious variations or modifications therefrom are within the scope of the invention.

Claims (8)

1. A pedestrian re-identification method is characterized by comprising the following steps:
acquiring pedestrian image data;
extracting pedestrian features in each image in the pedestrian image data, and calculating to obtain the visual probability of the pedestrian according to the pedestrian features, wherein the visual probability represents the similarity between pedestrians on different images;
judging whether the visual probability exceeds a preset threshold value or not;
when the visual probability exceeds the preset threshold value, determining that the pedestrians on the different images are the same person; when the visual probability does not exceed the preset threshold, acquiring camera identification and frame number information in the pedestrian image data, and calculating the space-time probability of the pedestrian according to the camera identification and frame number information;
calculating the position information of the pedestrians in the images shot by different cameras at the same time in the actual environment;
configuring an influence parameter of the visual probability and the spatiotemporal probability according to the position information, wherein the influence parameter is used for reflecting the weight of the visual probability or the spatiotemporal probability, and the influence parameter is a numerical value which is greater than or equal to 0 and less than or equal to 1;
calculating to obtain a joint probability by using the visual probability and the influence parameters thereof and the space-time probability and the influence parameters thereof to obtain a pedestrian re-identification result; wherein the joint probability is calculated by the following formula:
[joint probability formula, rendered as an image in the original publication]
where φ represents a hyper-parameter that balances the visual probability and the spatio-temporal probability, e represents the base of the natural logarithm, ρ_V represents the visual probability, ρ_ST represents the spatio-temporal probability, γ takes the value 5, x represents the influence parameter of the visual probability, and y represents the influence parameter of the spatio-temporal probability.
2. The pedestrian re-identification method according to claim 1, wherein the configuring of the influence parameters of the visual probability and the spatiotemporal probability according to the position information includes:
judging whether the pedestrians in different images are at the same position or not based on the position information;
when the pedestrians are located at the same position in different images, the values of x and y are both set to 0.
3. The pedestrian re-identification method according to claim 2,
the values of x and y are both greater than 0 and less than 1 when the pedestrian is in different positions in different images, wherein the greater the positional deviation of the pedestrian in different images, the greater the value of y.
4. The pedestrian re-identification method according to any one of claims 1 to 3, wherein calculating the position information of the pedestrian in the images shot by different cameras at the same time in the actual environment comprises:
acquiring camera parameters and a camera position of a camera for shooting the pedestrian image data;
calculating the position information of the pedestrian in the image using the camera parameters and the camera position.
5. The pedestrian re-identification method according to any one of claims 1 to 3, wherein the extracting of the pedestrian feature in each image of the pedestrian image data and the calculating of the visual probability of the pedestrian according to the pedestrian feature comprise:
the pedestrian image data is used as input of a neural network model obtained by utilizing pre-training, and the neural network model is utilized to extract pedestrian features in each image in the pedestrian image data to obtain a pedestrian feature matrix;
the acquiring of the camera identification and the frame number information in the pedestrian image data and the calculating of the spatiotemporal probability of the pedestrian according to the camera identification and the frame number information comprise:
and taking the camera identification and the frame number information as the input of a space-time probability model, and calculating by using the space-time probability model to obtain the space-time probability of the pedestrian.
6. A pedestrian re-recognition apparatus, comprising:
the first acquisition module is used for acquiring pedestrian image data;
the extraction module is used for extracting the pedestrian features in each image in the pedestrian image data, and calculating the visual probability of the pedestrian according to the pedestrian features, wherein the visual probability represents the similarity between pedestrians on different images;
the judging module is used for judging whether the visual probability exceeds a preset threshold value;
the second obtaining module is used for determining that the pedestrians on the different images are the same person when the visual probability exceeds the preset threshold; when the visual probability does not exceed the preset threshold, acquiring camera identification and frame number information in the pedestrian image data, and calculating the space-time probability of the pedestrian according to the camera identification and frame number information;
the first calculation module is used for calculating the position information of the pedestrians in the images shot by different cameras at the same time in the actual environment;
a configuration module, configured to configure an influence parameter of the visual probability and the spatiotemporal probability according to the location information, where the influence parameter is used to reflect a weight of the visual probability or the spatiotemporal probability, and the influence parameter is a numerical value greater than or equal to 0 and less than or equal to 1;
the second calculation module is used for calculating a joint probability by utilizing the visual probability and the influence parameters thereof as well as the space-time probability and the influence parameters thereof to obtain a pedestrian re-identification result; wherein the content of the first and second substances,
the joint probability is calculated by the following formula:
[joint probability formula, rendered as an image in the original publication]
where φ represents a hyper-parameter that balances the visual probability and the spatio-temporal probability, e represents the base of the natural logarithm, ρ_V represents the visual probability, ρ_ST represents the spatio-temporal probability, γ takes the value 5, x represents the influence parameter of the visual probability, and y represents the influence parameter of the spatio-temporal probability.
7. An electronic device, characterized in that: the pedestrian re-identification method comprises a memory and a processor, wherein the memory and the processor are connected with each other in a communication mode, computer instructions are stored in the memory, and the processor executes the computer instructions so as to execute the pedestrian re-identification method according to any one of claims 1 to 5.
8. A computer-readable storage medium characterized by: the computer-readable storage medium stores computer instructions for causing a computer to execute the pedestrian re-identification method according to any one of claims 1 to 5.
CN202011139919.0A 2020-10-22 2020-10-22 Pedestrian re-identification method and device, electronic equipment and computer readable storage medium Active CN112270241B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011139919.0A CN112270241B (en) 2020-10-22 2020-10-22 Pedestrian re-identification method and device, electronic equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011139919.0A CN112270241B (en) 2020-10-22 2020-10-22 Pedestrian re-identification method and device, electronic equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN112270241A CN112270241A (en) 2021-01-26
CN112270241B (en) 2021-12-10

Family

ID=74341790

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011139919.0A Active CN112270241B (en) 2020-10-22 2020-10-22 Pedestrian re-identification method and device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN112270241B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116704448B (en) * 2023-08-09 2023-10-24 山东字节信息科技有限公司 Pedestrian recognition method and recognition system with multiple cameras

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105930768A (en) * 2016-04-11 2016-09-07 武汉大学 Spatial-temporal constraint-based target re-identification method
CN110414441A (en) * 2019-07-31 2019-11-05 浙江大学 A kind of pedestrian's whereabouts analysis method and system
CN111160297A (en) * 2019-12-31 2020-05-15 武汉大学 Pedestrian re-identification method and device based on residual attention mechanism space-time combined model
CN111178284A (en) * 2019-12-31 2020-05-19 珠海大横琴科技发展有限公司 Pedestrian re-identification method and system based on spatio-temporal union model of map data

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9396412B2 (en) * 2012-06-21 2016-07-19 Siemens Aktiengesellschaft Machine-learnt person re-identification
CN105389562B (en) * 2015-11-13 2018-08-21 武汉大学 A kind of double optimization method of the monitor video pedestrian weight recognition result of space-time restriction

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105930768A (en) * 2016-04-11 2016-09-07 武汉大学 Spatial-temporal constraint-based target re-identification method
CN110414441A (en) * 2019-07-31 2019-11-05 浙江大学 A kind of pedestrian's whereabouts analysis method and system
CN111160297A (en) * 2019-12-31 2020-05-15 武汉大学 Pedestrian re-identification method and device based on residual attention mechanism space-time combined model
CN111178284A (en) * 2019-12-31 2020-05-19 珠海大横琴科技发展有限公司 Pedestrian re-identification method and system based on spatio-temporal union model of map data

Also Published As

Publication number Publication date
CN112270241A (en) 2021-01-26

Similar Documents

Publication Publication Date Title
US11468697B2 (en) Pedestrian re-identification method based on spatio-temporal joint model of residual attention mechanism and device thereof
US11144786B2 (en) Information processing apparatus, method for controlling information processing apparatus, and storage medium
JP7190842B2 (en) Information processing device, control method and program for information processing device
CN110674688B (en) Face recognition model acquisition method, system and medium for video monitoring scene
WO2019230339A1 (en) Object identification device, system for moving body, object identification method, training method of object identification model, and training device for object identification model
TWI766201B (en) Methods and devices for biological testing and storage medium thereof
CN112052831B (en) Method, device and computer storage medium for face detection
JP6998554B2 (en) Image generator and image generation method
JP7276607B2 (en) Methods and systems for predicting crowd dynamics
CN107766864B (en) Method and device for extracting features and method and device for object recognition
CN112101195B (en) Crowd density estimation method, crowd density estimation device, computer equipment and storage medium
US11657592B2 (en) Systems and methods for object recognition
CN111325782A (en) Unsupervised monocular view depth estimation method based on multi-scale unification
WO2023279799A1 (en) Object identification method and apparatus, and electronic system
CN110443228B (en) Pedestrian matching method and device, electronic equipment and storage medium
CN112270241B (en) Pedestrian re-identification method and device, electronic equipment and computer readable storage medium
US11605220B2 (en) Systems and methods for video surveillance
CN114972182A (en) Object detection method and device
CN116051736A (en) Three-dimensional reconstruction method, device, edge equipment and storage medium
CN113743313A (en) Pedestrian identification method and device and electronic equipment
Genovese et al. Driver attention assistance by pedestrian/cyclist distance estimation from a single RGB image: A CNN-based semantic segmentation approach
US20230394628A1 (en) Method and apparatus for reconstructing face image by using video identity clarification network
CN115631296A (en) 3D target detection method, computer program product and electronic equipment
JP2024069041A (en) IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD, AND COMPUTER PROGRAM
CN116957999A (en) Depth map optimization method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant