CN115082854A - Pedestrian searching method oriented to security monitoring video - Google Patents

Pedestrian searching method oriented to security monitoring video

Info

Publication number
CN115082854A
CN115082854A
Authority
CN
China
Prior art keywords
pedestrian
space
frame
time
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210682446.1A
Other languages
Chinese (zh)
Inventor
冯德瀛
魏衍侠
肖海荣
张来刚
王政森
杨杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Liaocheng University
Original Assignee
Liaocheng University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Liaocheng University filed Critical Liaocheng University
Priority to CN202210682446.1A priority Critical patent/CN115082854A/en
Publication of CN115082854A publication Critical patent/CN115082854A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53 Recognition of crowd images, e.g. recognition of crowd congestion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/71 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7837 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content
    • G06F16/784 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content the detected or recognised objects being people
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items

Abstract

The invention relates to a pedestrian search method for security surveillance video, comprising the following steps: step one, detect each pedestrian frame by frame in the surveillance video using a pre-trained region-based convolutional neural network and generate the corresponding spatial features; step two, organize the pedestrian spatial features extracted frame by frame using the hidden states output by a gated recurrent unit, add an average pooling layer at the output of the gated recurrent unit to reduce the dimension of the hidden state vectors, and generate the corresponding pedestrian spatiotemporal features; step three, index all pedestrian spatiotemporal features through locality-sensitive hashing, and determine the final search result by computing the similarity between the spatiotemporal feature of the pedestrian to be searched and the pedestrian spatiotemporal features in the surveillance video.

Description

Pedestrian searching method oriented to security monitoring video
Technical Field
The invention relates to the technical field of computer vision, and in particular to a pedestrian search method for security surveillance video.
Background
With the continuing construction of smart cities, more and more surveillance cameras are deployed throughout city streets and alleys, where they play an important role in locating missing elderly people and children, finding and positioning criminal suspects, and similar tasks. However, retrieving the relevant information from surveillance video is far from easy: the number of cameras is large, recording times are long, and the volume of video data grows geometrically, so finding a specific pedestrian target in massive surveillance video often consumes a great deal of time, manpower, and material resources. Moreover, because cameras are installed in different positions, surveillance scenes differ, and in large public places such as shopping malls, stations, and exhibition centers the dense pedestrian flow makes scenes even more complex and pedestrian search more challenging. How to find the relevant pedestrians in massive security surveillance video more quickly and accurately has therefore become one of the hot topics in the field of computer vision.
A pedestrian search method queries a specific pedestrian target in an unknown image or video data set, thereby finding images of the same pedestrian in that set. Most existing pedestrian search methods are trained on the CUHK-SYSU or PRW image data sets and perform search with the trained neural network model. Unlike the CUHK-SYSU and PRW image data sets, real security surveillance video contains not only the spatial features of pedestrians but also involves their temporal correlation. A pedestrian search method trained on CUHK-SYSU or PRW therefore does not consider this temporal correlation, and its robustness and reliability need further improvement when applied to security surveillance video.
A search of the prior-art literature shows that patent CN 112241682A provides an end-to-end pedestrian search method based on block partitioning and multi-layer information fusion. The method uses a convolutional neural network to extract preliminary features and a candidate-region extraction network to extract the region where a pedestrian is located, obtaining a high-level feature map. By partitioning the high-level features and fusing them with mid-level features, pedestrian search accuracy is improved. Although the method takes whole images shot by surveillance cameras as input, it does not consider the temporal correlation of pedestrians across consecutive frames, which limits its applicability to security surveillance video.
Further retrieval shows that patent CN 109165540A provides a pedestrian search method based on a prior candidate-box selection strategy. The method constructs pedestrian candidate-box vectors from the lengths and widths of all pedestrian boxes in the training set, obtains prior candidate boxes through the K-means++ clustering algorithm, identifies pedestrian identities, and finally determines pedestrian positions in surveillance images with a trained pedestrian search network. This method, too, processes only the pedestrian features of a single image and does not address the temporal correlation of pedestrians across the multiple frames of a surveillance video.
Disclosure of Invention
To address the above shortcomings of the prior art, the invention provides a pedestrian search method for security surveillance video. Exploiting the spatial invariance and temporal continuity of such video, the method generates pedestrian spatiotemporal features from the surveillance video, makes full use of the temporal correlation of pedestrians across consecutive frames, and thereby improves search accuracy. By indexing all pedestrian spatiotemporal features in the surveillance video, real-time search performance is guaranteed.
The invention is realized through the following technical scheme, and specifically comprises the following steps:
A pedestrian search method for security surveillance video comprises the following steps:
Step one: detect each pedestrian frame by frame in the surveillance video using a pre-trained region-based convolutional neural network and generate the corresponding spatial features.
Step two: organize the pedestrian spatial features extracted frame by frame using the hidden states output by a gated recurrent unit, add an average pooling layer at the output of the gated recurrent unit to reduce the dimension of the hidden state vectors, and generate the corresponding pedestrian spatiotemporal features.
Step three: index all pedestrian spatiotemporal features through locality-sensitive hashing, and determine the final search result by computing the similarity between the spatiotemporal feature of the pedestrian to be searched and the pedestrian spatiotemporal features in the surveillance video.
Further, step one is performed according to the following steps:
1) The surveillance video V = {v_1, v_2, …, v_N} contains N frame images, where the i-th frame image is denoted v_i;
2) The surveillance video V is processed frame by frame through the pre-trained region-based convolutional neural network, and the spatial feature s_{i,j} of the j-th pedestrian is extracted from the i-th frame image v_i;
3) After the N frames of the surveillance video V have been processed, all pedestrian spatial features are expressed as S = {s_{i,j}}, 1 ≤ i ≤ n_j, 1 ≤ j ≤ M, where n_j denotes the number of frames containing the j-th pedestrian and M denotes the total number of pedestrians appearing in the surveillance video.
further, the second step is executed according to the following steps:
1) the ith frame image v i The jth spatial feature s of the pedestrian extracted from the previous image i,j As input vectors, input to the gated loop unit;
2) spatial features s of pedestrians in gated cyclic units i,j Updating candidate hidden state vector c by tanh activation function i,j And is represented as: c. C i,j =tanh(W n s i,j +U n (r i,j ⊙h i-1,j )+b n ) Wherein h is i-1,j Representing the hidden state vector r corresponding to the jth pedestrian in the ith-1 frame image i,j Is h i-1,j Corresponding weight, W n 、U n And b n Network parameters of gated cycle cells;
3) in gated-cycle units, according to h i-1,j And c i,j Generating a hidden state vector h corresponding to the jth pedestrian in the ith frame image i,j And is represented as: h is i,j =z i,j h i-1,j +(1-z i,j )c i,j ,z i,j Is a combination h i-1,j And c i,j The weight of (c);
4) after the spatial features corresponding to the jth pedestrian are completely processed in the gate control cycle unit, a hidden vector sequence h corresponding to the jth pedestrian is obtained j ={h i,j },1≤i≤n j And all pedestrians in the video are represented as a hidden vector sequence H ═ H j },1≤j≤M;
5) Adding an average pooling layer at the output of the gated cyclic unit for the sequence h j Reducing dimensions and generating the space-time characteristic p of the jth pedestrian j And is represented by
Figure BDA0003698860790000031
All pedestrian spatiotemporal features are denoted as P ═ { P ═ P j },1≤j≤M。
Further, step three is performed according to the following steps:
1) All pedestrian spatiotemporal features P in the surveillance video are mapped into a Hamming vector space, with the j-th pedestrian spatiotemporal feature p_j mapped to a b-bit hash code;
2) After the spatiotemporal feature q of the pedestrian to be searched is mapped into the Hamming vector space, the similarity between q and the pedestrian spatiotemporal features in the video is computed;
3) After all similarities between q and all pedestrian spatiotemporal features P have been computed, the similarities are sorted, and the pedestrian images corresponding to the top-T spatiotemporal features constitute the final search result.
The beneficial effects of the invention are as follows: the method uses a region-based convolutional neural network to extract pedestrian spatial features frame by frame from security surveillance video, avoiding interference from complex surveillance backgrounds. The temporal correlation of pedestrian spatial features across consecutive frames is organized through a gated recurrent unit, and pedestrian spatiotemporal features are generated at its output by an added average pooling layer, strengthening the recognizability of the same pedestrian across frames and the distinction between different pedestrians. The large set of pedestrian spatiotemporal features is organized through locality-sensitive hashing, reducing the computation required during search and guaranteeing real-time pedestrian search. Compared with the prior art, the method can be applied to real security surveillance video and improves search accuracy while preserving real-time search performance.
Drawings
FIG. 1 is a flow chart of the present invention.
Fig. 2 compares the search accuracy of the method of the invention with that of a search method using only pedestrian spatial features.
Detailed Description
The invention is further described below with reference to the accompanying drawings and specific embodiments. It should be understood that these examples are for illustration only and are not intended to limit the scope of the invention. Furthermore, various changes or modifications may be made by those skilled in the art after reading the teachings of the invention, and such equivalents likewise fall within the scope of the present application.
Example 1
The invention is realized through the following technical scheme; the specific steps are as follows:
First, since a security surveillance video consists of consecutive frame images, a pre-trained region-based convolutional neural network is used to detect pedestrians frame by frame in the surveillance video and to generate the corresponding spatial features. The specific method is as follows:
1) The surveillance video V = {v_1, v_2, …, v_N} contains N frame images, where the i-th frame image is denoted v_i. The pre-trained region-based convolutional neural network can be regarded as a nonlinear function f_RCNN(·);
2) The surveillance video V is processed frame by frame through the network f_RCNN(·), and the spatial feature s_{i,j} of the j-th pedestrian extracted from the i-th frame image v_i can be expressed as s_{i,j} = f_RCNN(v_i);
3) After the N frames of the surveillance video V have been processed, all pedestrian spatial features can be expressed as S = {s_{i,j}}, 1 ≤ i ≤ n_j, 1 ≤ j ≤ M, where n_j denotes the number of frames containing the j-th pedestrian and M denotes the total number of pedestrians appearing in the video.
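As a concrete illustration of step one, the sketch below extracts per-pedestrian spatial features frame by frame. It is a minimal approximation, assuming a torchvision Faster R-CNN detector plus a separate ResNet-50 crop encoder with a linear projection to 256 dimensions; the patent specifies only a single pre-trained region-based convolutional neural network f_RCNN(·), which would share these computations internally by reusing its region-of-interest features.

```python
import torch
import torch.nn as nn
import torchvision

# Hypothetical components standing in for the single network f_RCNN(.) of the patent.
detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()
resnet = torchvision.models.resnet50(weights="DEFAULT").eval()
encoder = nn.Sequential(*list(resnet.children())[:-1])  # up to global average pooling
project = nn.Linear(2048, 256)                          # assumed mapping to 256-d s_{i,j}

@torch.no_grad()
def spatial_features(frame):
    """Detect pedestrians in one frame v_i and return one 256-d feature per person."""
    det = detector([frame])[0]                             # frame: 3xHxW float in [0, 1]
    person = (det["labels"] == 1) & (det["scores"] > 0.8)  # COCO label 1 = person
    feats = []
    for x1, y1, x2, y2 in det["boxes"][person].round().long():
        crop = frame[:, y1:y2, x1:x2].unsqueeze(0)         # pedestrian region of v_i
        crop = nn.functional.interpolate(crop, size=(224, 224))
        feats.append(project(encoder(crop).flatten(1)).squeeze(0))
    return feats                                           # the s_{i,j} of this frame
```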
Second, the pedestrian spatial features extracted from consecutive frames of the video are organized through a gated recurrent unit, and an average pooling layer is added at its output to generate the pedestrian spatiotemporal features.
the pedestrian spatial features extracted from the front and rear frames of the video are organized through the gate control circulation unit, and the average pooling layer is added at the output end to generate the pedestrian spatial-temporal features, wherein the pedestrian spatial features are as follows: because the gating circulation unit can effectively process the data sequence, the spatial features of the pedestrians extracted from the video can be organized by utilizing the hidden state output by the gating circulation unit, so that the time correlation information of the pedestrians in the front frame and the rear frame of the video is transmitted. Because the generated hidden state vector has higher dimensionality, the calculated amount is increased in the pedestrian searching process, so that an average pooling layer is added at the output end of the gating circulating unit, the hidden state vector is subjected to dimension reduction, and the corresponding pedestrian space-time characteristics are generated at the same time, so that the calculated amount in the subsequent searching process can be reduced, and the time correlation information of the pedestrian is included.
The step of organizing the pedestrian spatial features extracted from different frames of the video through the gating cycle unit and generating the pedestrian spatial-temporal features through the average pooling layer comprises the following steps:
1) The spatial feature s_{i,j} of the j-th pedestrian extracted from the i-th frame image v_i is fed to the gated recurrent unit as the input vector;
2) In the gated recurrent unit, the pedestrian spatial feature s_{i,j} updates the candidate hidden state vector c_{i,j} through the tanh activation function, expressed as: c_{i,j} = tanh(W_n s_{i,j} + U_n (r_{i,j} ⊙ h_{i-1,j}) + b_n), where h_{i-1,j} denotes the hidden state vector corresponding to the j-th pedestrian in the (i-1)-th frame image, r_{i,j} is the weight corresponding to h_{i-1,j}, and W_n, U_n and b_n are network parameters of the gated recurrent unit;
3) In the gated recurrent unit, the hidden state vector h_{i,j} corresponding to the j-th pedestrian in the i-th frame image is generated from h_{i-1,j} and c_{i,j}, expressed as: h_{i,j} = z_{i,j} h_{i-1,j} + (1 - z_{i,j}) c_{i,j}, where z_{i,j} is the weight combining h_{i-1,j} and c_{i,j}. Since h_{i,j} takes into account not only the candidate hidden state of the j-th pedestrian in the i-th frame image but also that pedestrian's hidden state in the (i-1)-th frame image, a temporal correlation is established for the j-th pedestrian between the (i-1)-th and i-th frame images;
4) After all spatial features corresponding to the j-th pedestrian have been processed in the gated recurrent unit, the hidden vector sequence h_j = {h_{i,j}}, 1 ≤ i ≤ n_j, corresponding to the j-th pedestrian can be obtained, describing the temporal correlation of that pedestrian across different frames of the video; all pedestrians in the video can then be represented as the hidden vector sequence H = {h_j}, 1 ≤ j ≤ M;
5) Because the dimension of the sequence h_j is high, an average pooling layer is added at the output of the gated recurrent unit to reduce the dimension of h_j while generating the spatiotemporal feature p_j of the j-th pedestrian, expressed as
p_j = (1/n_j) Σ_{i=1}^{n_j} h_{i,j}.
For the j-th pedestrian, the average pooling layer converts the high-dimensional sequence h_j into the single vector p_j, and all pedestrian spatiotemporal features may be expressed as P = {p_j}, 1 ≤ j ≤ M.
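A minimal sketch of the gated-recurrent-unit update and average pooling of step two, written directly from the equations above. The reset-gate and update-gate computations producing r_{i,j} and z_{i,j} follow the standard GRU formulation, which the patent leaves implicit, and the random parameter initialization is illustrative; in practice the parameters W, U and b would be learned.

```python
import numpy as np

d = 256                                  # feature dimension used in the embodiment
rng = np.random.default_rng(0)

def init(shape):
    return rng.normal(scale=0.01, size=shape)

# Candidate-state parameters W_n, U_n, b_n from the patent, plus the standard
# reset-gate (r) and update-gate (z) parameters assumed here.
Wn, Un, bn = init((d, d)), init((d, d)), np.zeros(d)
Wr, Ur, br = init((d, d)), init((d, d)), np.zeros(d)
Wz, Uz, bz = init((d, d)), init((d, d)), np.zeros(d)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(s_ij, h_prev):
    """One gated-recurrent-unit update for the spatial feature s_{i,j}."""
    r_ij = sigmoid(Wr @ s_ij + Ur @ h_prev + br)           # reset gate r_{i,j}
    z_ij = sigmoid(Wz @ s_ij + Uz @ h_prev + bz)           # update gate z_{i,j}
    c_ij = np.tanh(Wn @ s_ij + Un @ (r_ij * h_prev) + bn)  # candidate c_{i,j}
    return z_ij * h_prev + (1.0 - z_ij) * c_ij             # hidden state h_{i,j}

def spatiotemporal_feature(spatial_seq):
    """Fold the sequence s_{1,j}, ..., s_{n_j,j} into the single 256-d feature p_j."""
    h = np.zeros(d)
    hidden = []
    for s in spatial_seq:
        h = gru_step(s, h)
        hidden.append(h)
    return np.mean(hidden, axis=0)  # average pooling: p_j = (1/n_j) * sum_i h_{i,j}
```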
Third, all pedestrian spatiotemporal features are indexed through locality-sensitive hashing, and the pedestrian search result is determined according to similarity.
Indexing all pedestrian spatiotemporal features through locality-sensitive hashing and determining the search result by similarity works as follows: a surveillance video contains many pedestrians, each corresponding to one spatiotemporal feature, so a large number of pedestrian spatiotemporal features are generated. Computing similarities directly against all of them makes real-time pedestrian search difficult to guarantee; all pedestrian spatiotemporal features are therefore indexed with locality-sensitive hashing, and the final search result is given by computing the similarity between the spatiotemporal feature of the pedestrian to be searched and the pedestrian spatiotemporal features in the surveillance video.
The steps of indexing all pedestrian spatiotemporal features through locality-sensitive hashing and determining the search result by similarity are as follows:
1) All pedestrian spatiotemporal features P in the surveillance video are mapped into a Hamming vector space; the j-th pedestrian spatiotemporal feature p_j can be mapped to a b-bit hash code, denoted H: p_j → {0,1}^b;
2) After the spatiotemporal feature q of the pedestrian to be searched is mapped into the Hamming vector space, its similarity to a pedestrian spatiotemporal feature in the video can be computed as sim(q, p_j) = P[H(q) = H(p_j)] = Jaccard(q, p_j);
3) After all similarities between q and all pedestrian spatiotemporal features P have been computed, the similarities are sorted, and the pedestrian images corresponding to the top-T spatiotemporal features constitute the final search result.
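A sketch of step three under the assumption that the b-bit codes are produced by random-hyperplane (sign) hashing, one common locality-sensitive family; the patent fixes only the mapping H: p_j → {0,1}^b and the Jaccard similarity, not the concrete hash functions.

```python
import numpy as np

rng = np.random.default_rng(1)
b, d = 128, 256
planes = rng.normal(size=(b, d))        # one random hyperplane per hash bit (assumed)

def hash_code(p):
    """Map a spatiotemporal feature to a b-bit code H(p) in {0,1}^b."""
    return (planes @ p > 0).astype(np.uint8)

def jaccard(a, c):
    """Jaccard coefficient of two binary codes viewed as bit sets."""
    inter = np.sum((a == 1) & (c == 1))
    union = np.sum((a == 1) | (c == 1))
    return inter / union if union else 0.0

def search(q_feat, gallery_feats, top_t=5):
    """Rank gallery pedestrians by sim(q, p_j) and return the top-T indices."""
    q_code = hash_code(q_feat)
    sims = [jaccard(q_code, hash_code(p)) for p in gallery_feats]
    return np.argsort(sims)[::-1][:top_t]
```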
Example 2
This embodiment adopts the pedestrian search method for security surveillance video; the specific implementation steps are as follows:
1. A region-based convolutional neural network is adopted to extract pedestrian spatial features frame by frame from the surveillance video.
In the region-based convolutional neural network, the conv1 through conv4_3 layers of the ResNet-50 model are used to detect pedestrian bounding boxes, and the conv4_4 through conv5_3 layers perform pedestrian identification. After global average pooling and feature mapping, a 256-dimensional spatial feature s_{i,j} can be generated for the j-th pedestrian in the i-th frame image v_i; for all pedestrians in the surveillance video, a spatial feature sequence S of dimension 256 × n_j × M can then be generated.
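A sketch of this backbone split, assuming the torchvision ResNet-50 layout in which layer3 holds the six conv4_x bottleneck blocks and layer4 the three conv5_x blocks, so the conv1 through conv4_3 / conv4_4 through conv5_3 division of the embodiment corresponds to cutting inside layer3:

```python
import torch.nn as nn
import torchvision

resnet = torchvision.models.resnet50(weights="DEFAULT")
stem = nn.Sequential(resnet.conv1, resnet.bn1, resnet.relu, resnet.maxpool)
# conv1 through conv4_3: used to detect pedestrian bounding boxes.
detection_trunk = nn.Sequential(stem, resnet.layer1, resnet.layer2, resnet.layer3[:3])
# conv4_4 through conv5_3: used for pedestrian identification.
identification_head = nn.Sequential(resnet.layer3[3:], resnet.layer4)
# After global average pooling, an assumed linear mapping yields the 256-d s_{i,j}.
pool, embed = nn.AdaptiveAvgPool2d(1), nn.Linear(2048, 256)
```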
2. The pedestrian spatial features extracted from consecutive frames of the video are organized through the gated recurrent unit, and an average pooling layer is added at its output to generate the pedestrian spatiotemporal features.
Because a security surveillance video contains many frames, only one gated recurrent unit is adopted to organize the pedestrian spatial features extracted from consecutive frame images, so as to keep the search efficient. The spatial feature s_{i,j} of the j-th pedestrian extracted from the i-th frame image v_i is taken as the input vector, and after processing by the gated recurrent unit a 256-dimensional hidden state vector h_{i,j} can be generated; the full set of spatial features of the j-th pedestrian can thus be represented as a hidden vector sequence h_j of dimension 256 × n_j, and all pedestrians in the video as a hidden vector sequence H of dimension 256 × n_j × M.
Because the hidden vector sequences h_j and H are high-dimensional, the average pooling layer reduces the dimension of each sequence h_j, converting the 256 × n_j dimensional hidden vector sequence h_j into a 256-dimensional pedestrian spatiotemporal feature p_j; the M pedestrians appearing in the video can then be represented as M independent pedestrian spatiotemporal features P = {p_j} (1 ≤ j ≤ M).
3. All pedestrian spatiotemporal features are indexed through locality-sensitive hashing, and the pedestrian search result is determined according to similarity.
To guarantee real-time pedestrian search, locality-sensitive hashing is adopted to index the M pedestrian spatiotemporal features, mapping them into a Hamming vector space; the j-th pedestrian spatiotemporal feature p_j can be mapped to a 128-bit hash code, denoted H: p_j → {0,1}^128.
The similarity between the spatiotemporal feature of the pedestrian to be searched and the pedestrian spatiotemporal features in the video is then computed according to the Jaccard coefficient in the Hamming vector space. After all similarities are sorted, the pedestrian images corresponding to the top 5 spatiotemporal features are selected as the final search result.
A simulation experiment of the method of the invention is as follows:
In the experiment, videos shot by 9 surveillance cameras were selected, 15000 video clips in total containing 897 pedestrian targets, from which a pedestrian search database was created. In this database, 11546 video clips were used as the training set and the remaining 3454 as the test set. Mean average precision (mAP) was selected to evaluate pedestrian search performance. With the number of hash bits varied, the method was compared against a search method using only pedestrian spatial features; the experimental results are shown in Fig. 2. As the hash code grows from 8 bits to 128 bits, the mAP of both methods increases, but the accuracy of the proposed method remains higher than that of the method using only pedestrian spatial features. The pedestrian spatiotemporal features used by the method not only contain the pedestrian spatial features of single frames but also organize the temporal correlation of those features across consecutive frames, which strengthens the discrimination between different pedestrians and helps improve search accuracy.

Claims (4)

1. A pedestrian search method for security surveillance video, comprising the following steps:
step one, detecting each pedestrian frame by frame in the surveillance video using a pre-trained region-based convolutional neural network and generating the corresponding spatial features;
step two, organizing the pedestrian spatial features extracted frame by frame using the hidden states output by a gated recurrent unit, adding an average pooling layer at the output of the gated recurrent unit to reduce the dimension of the hidden state vectors, and generating the corresponding pedestrian spatiotemporal features;
step three, indexing all pedestrian spatiotemporal features through locality-sensitive hashing, and determining the final search result by computing the similarity between the spatiotemporal feature of the pedestrian to be searched and the pedestrian spatiotemporal features in the surveillance video.
2. The pedestrian search method according to claim 1, wherein step one is performed according to the following steps:
1) the surveillance video V = {v_1, v_2, …, v_N} contains N frame images, where the i-th frame image is denoted v_i;
2) the surveillance video V is processed frame by frame through the pre-trained region-based convolutional neural network, and the spatial feature s_{i,j} of the j-th pedestrian is extracted from the i-th frame image v_i;
3) after the N frames of the surveillance video V have been processed, all pedestrian spatial features are expressed as S = {s_{i,j}}, 1 ≤ i ≤ n_j, 1 ≤ j ≤ M, where n_j denotes the number of frames containing the j-th pedestrian and M denotes the total number of pedestrians appearing in the surveillance video.
3. The pedestrian search method according to claim 1, wherein step two is performed according to the following steps:
1) the spatial feature s_{i,j} of the j-th pedestrian extracted from the i-th frame image v_i is fed to the gated recurrent unit as the input vector;
2) in the gated recurrent unit, the pedestrian spatial feature s_{i,j} updates the candidate hidden state vector c_{i,j} through the tanh activation function, expressed as: c_{i,j} = tanh(W_n s_{i,j} + U_n (r_{i,j} ⊙ h_{i-1,j}) + b_n), where h_{i-1,j} denotes the hidden state vector corresponding to the j-th pedestrian in the (i-1)-th frame image, r_{i,j} is the weight corresponding to h_{i-1,j}, and W_n, U_n and b_n are network parameters of the gated recurrent unit;
3) in the gated recurrent unit, the hidden state vector h_{i,j} corresponding to the j-th pedestrian in the i-th frame image is generated from h_{i-1,j} and c_{i,j}, expressed as: h_{i,j} = z_{i,j} h_{i-1,j} + (1 - z_{i,j}) c_{i,j}, where z_{i,j} is the weight combining h_{i-1,j} and c_{i,j};
4) after all spatial features corresponding to the j-th pedestrian have been processed in the gated recurrent unit, the hidden vector sequence h_j = {h_{i,j}}, 1 ≤ i ≤ n_j, corresponding to the j-th pedestrian is obtained, and all pedestrians in the video are represented as the hidden vector sequence H = {h_j}, 1 ≤ j ≤ M;
5) an average pooling layer is added at the output of the gated recurrent unit to reduce the dimension of the sequence h_j and generate the spatiotemporal feature p_j of the j-th pedestrian, expressed as
p_j = (1/n_j) Σ_{i=1}^{n_j} h_{i,j},
and all pedestrian spatiotemporal features are denoted P = {p_j}, 1 ≤ j ≤ M.
4. The pedestrian search method according to claim 1, wherein step three is performed according to the following steps:
1) all pedestrian spatiotemporal features P in the surveillance video are mapped into a Hamming vector space, with the j-th pedestrian spatiotemporal feature p_j mapped to a b-bit hash code;
2) after the spatiotemporal feature q of the pedestrian to be searched is mapped into the Hamming vector space, the similarity between q and the pedestrian spatiotemporal features in the video is computed;
3) after all similarities between q and all pedestrian spatiotemporal features P have been computed, the similarities are sorted, and the pedestrian images corresponding to the top-T spatiotemporal features constitute the final search result.
CN202210682446.1A 2022-06-16 2022-06-16 Pedestrian searching method oriented to security monitoring video Pending CN115082854A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210682446.1A CN115082854A (en) 2022-06-16 2022-06-16 Pedestrian searching method oriented to security monitoring video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210682446.1A CN115082854A (en) 2022-06-16 2022-06-16 Pedestrian searching method oriented to security monitoring video

Publications (1)

Publication Number Publication Date
CN115082854A (en) 2022-09-20

Family

ID=83253432

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210682446.1A Pending CN115082854A (en) 2022-06-16 2022-06-16 Pedestrian searching method oriented to security monitoring video

Country Status (1)

Country Link
CN (1) CN115082854A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115689297A (en) * 2022-12-29 2023-02-03 杭州天阙科技有限公司 Child abduction risk early warning method, device, equipment and storage medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination