CN110659576A - Pedestrian searching method and device based on joint judgment and generation learning


Info

Publication number
CN110659576A
Authority
CN
China
Prior art keywords
pedestrian
video
identified
image
feature vector
Prior art date
Legal status
Pending
Application number
CN201910783692.4A
Other languages
Chinese (zh)
Inventor
张斯尧
谢喜林
王思远
黄晋
蒋杰
张诚
文戎
Current Assignee
Shenzhen Jiu Ling Software Engineering Co Ltd
Original Assignee
Shenzhen Jiu Ling Software Engineering Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Jiu Ling Software Engineering Co Ltd filed Critical Shenzhen Jiu Ling Software Engineering Co Ltd
Priority to CN201910783692.4A
Publication of CN110659576A

Classifications

    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods

Abstract

The embodiment of the invention provides a pedestrian searching method and device based on joint judgment and generation learning, wherein the method comprises the steps of obtaining a pedestrian re-identification system model; segmenting the pedestrian video to be identified based on the key frame of the pedestrian video to be identified, and extracting a pedestrian feature vector from the segmented pedestrian video to be identified through the pedestrian re-identification system model; and calculating the similarity between the feature vector of the target pedestrian and the pedestrian feature vector in the pedestrian video to be recognized, and obtaining a retrieval image of the target pedestrian from the pedestrian video to be recognized according to the similarity. By the embodiment of the invention, the efficiency, accuracy and real-time performance of pedestrian video search can be improved.

Description

Pedestrian searching method and device based on joint judgment and generation learning
Technical Field
The invention belongs to the technical field of computer vision and intelligent traffic, and particularly relates to a pedestrian searching method, a pedestrian searching device, terminal equipment and a computer readable medium based on joint judgment and generation learning.
Background
With the continuous development of artificial intelligence, computer vision and hardware technology, video image processing technology has attracted more and more attention from researchers. However, current methods for dynamically identifying and monitoring pedestrians directly from video are not mature: on the one hand, every frame in a video is dynamically refreshed and camera capture rates keep increasing; on the other hand, existing video pedestrian search methods are inefficient and inaccurate. It is therefore difficult to search a large amount of video for a target pedestrian accurately and in real time.
Disclosure of Invention
In view of this, embodiments of the present invention provide a pedestrian search method, apparatus, terminal device and computer readable medium based on joint judgment and generation learning, which can improve efficiency, accuracy and real-time performance of video pedestrian search.
A first aspect of an embodiment of the present invention provides a pedestrian search method based on joint judgment and generative learning, including:
acquiring a pedestrian re-identification system model;
segmenting the pedestrian video to be identified based on the key frame of the pedestrian video to be identified, and extracting a pedestrian feature vector from the segmented pedestrian video to be identified through the pedestrian re-identification system model;
and calculating the similarity between the feature vector of the target pedestrian and the pedestrian feature vector in the pedestrian video to be recognized, and obtaining a retrieval image of the target pedestrian from the pedestrian video to be recognized according to the similarity.
A second aspect of an embodiment of the present invention provides a pedestrian search apparatus based on joint determination and generation learning, including:
the acquisition module is used for acquiring a pedestrian re-identification system model;
the extraction module is used for carrying out segmentation processing on the pedestrian video to be recognized based on the key frame of the pedestrian video to be recognized and extracting a pedestrian feature vector from the segmented pedestrian video to be recognized through the pedestrian re-recognition system model;
and the retrieval module is used for calculating the similarity between the feature vector of the target pedestrian and the pedestrian feature vector in the pedestrian video to be identified and obtaining the retrieval image of the target pedestrian from the pedestrian video to be identified according to the similarity.
A third aspect of the embodiments of the present invention provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the above pedestrian search method based on joint judgment and generation learning when executing the computer program.
A fourth aspect of the embodiments of the present invention provides a computer-readable medium storing a computer program which, when executed by a processor, implements the steps of the above-mentioned pedestrian search method based on joint judgment and generation learning.
In the pedestrian searching method based on joint judgment and generation learning provided by the embodiment of the invention, a pedestrian re-identification system model can be obtained, the pedestrian video to be identified is segmented based on the key frame of the pedestrian video to be identified, the pedestrian feature vector is extracted from the segmented pedestrian video to be identified through the pedestrian re-identification system model, the similarity between the feature vector of a target pedestrian and the pedestrian feature vector in the pedestrian video to be identified is calculated, and the retrieval image of the target pedestrian is obtained from the pedestrian video to be identified according to the similarity, so that the efficiency and the accuracy of video pedestrian searching can be improved, and the requirement of real-time searching can be met.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a flow chart of a pedestrian search method based on joint judgment and generative learning according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of reconstruction of a given pedestrian image in a pedestrian search method based on joint judgment and generation learning according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of cross-synthesis of given pedestrian images with different identities in a pedestrian search method based on joint judgment and generation learning according to an embodiment of the present invention;
fig. 4 is a schematic diagram illustrating Map and Reduce processing performed on a pedestrian video in a pedestrian search method based on joint judgment and generation learning according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a pedestrian searching apparatus based on joint judgment and generation learning according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a detailed structure of the acquisition module in FIG. 5;
FIG. 7 is a schematic diagram of a refined structure of the extraction module in FIG. 5;
FIG. 8 is a schematic diagram of a detailed structure of the retrieval module in FIG. 5;
fig. 9 is a schematic diagram of a terminal device according to an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
In order to explain the technical means of the present invention, the following description will be given by way of specific examples.
Referring to fig. 1, fig. 1 is a flowchart of a pedestrian searching method based on joint judgment and generation learning according to an embodiment of the present invention. As shown in fig. 1, the pedestrian search method based on joint determination and generation learning of the present embodiment includes the steps of:
s101: and acquiring a pedestrian re-identification system model.
In the embodiment of the invention, a generative learning convolutional neural network module can be constructed based on given pedestrian images; the generative learning convolutional neural network module comprises a self-generation network and a cross generation network, wherein the self-generation network is used for reconstructing a given pedestrian image to generate a reconstructed image, and the cross generation network is used for synthesizing pedestrian images with different identity characteristics to generate a synthesized image. Then, a joint discrimination module is constructed on top of the generative learning convolutional neural network module, wherein the joint discrimination module is used for learning the preliminary features of the reconstructed image and the synthesized image and for mining their fine-grained features. Finally, the loss function of the overall network formed by the generative learning convolutional neural network module and the joint discrimination module is determined from the loss functions of the two modules, and the pedestrian re-identification system model is determined according to this overall network loss function; the pedestrian re-identification system model is used for re-identifying pedestrians based on pedestrian images.
Specifically, a big data platform based on Spark (Apache Spark, a parallel computing architecture from the Apache Software Foundation built on HDFS) and the MLbase machine learning library (a part of the Spark ecosystem dedicated to machine learning) can be built. Over the past decade, extensible distributed programming frameworks have emerged to manage big data. The first such programming model was MapReduce and its open-source implementation Apache Hadoop (a distributed system infrastructure developed by the Apache Foundation, Hadoop for short). In recent years, a new distributed framework, Apache Spark, has emerged as a platform for fast and versatile large-scale data processing. The Spark platform is based on in-memory computation and is naturally suited to big data processing and analysis. Spark retains the advantages of Hadoop MapReduce, but unlike MapReduce, the intermediate output of a Job can be kept in memory without reading and writing HDFS (the Hadoop Distributed File System), so Spark is better suited to MapReduce-style algorithms that require iteration, such as data mining and machine learning. Since the video data is stored in the HDFS file system, Spark accesses the data source via TCP sockets and performs intelligent video analysis using the MapReduce distributed computing model. Within the Spark component stack, MLlib is Spark's implementation library of commonly used machine learning algorithms. MLlib currently supports four common machine learning problems: binary classification, regression, clustering and collaborative filtering, and also includes a basic gradient descent optimization algorithm at the bottom layer. A machine learning algorithm comprises two parts, training and prediction: a model is first trained and then used to predict unknown samples. MLbase automatically optimizes for distributed execution and performs algorithm selection according to MLbase best practices and cost-based models. The embodiment of the invention uses MLbase as a tool to handle feature detection and training for vehicles, faces, pedestrians, abandoned objects and the like in videos. After the Spark big data platform based on the MLbase machine learning library has been built from the corresponding components, video images are fed into the platform and the subsequent algorithm operations are carried out. In the subsequent construction of the pedestrian re-identification system model, a generative learning convolutional neural network module is first built from pedestrian images obtained from the Spark big data platform. Typically, a set of pedestrian images can first be defined as
$\{x_i\}_{i=1}^{N}$ with identity labels $\{y_i\}_{i=1}^{N}$, where N is the number of images, $y_i \in [1, K]$, and K is the number of identity categories in the dataset. Given two real pedestrian images $x_i$ and $x_j$ in the training set, the generative learning convolutional neural network module provided by the embodiment of the invention can generate a new pedestrian image by decoupling and recombining the latent codes of the given images. The module consists of a surface (appearance) encoder $E_a: x_i \to a_i$, a structure encoder $E_s: x_j \to s_j$, and a decoder $G: (a_i, s_j) \to x_j^i$. In order to make the generated image more controllable and to better match the data distribution of the real dataset, the algorithm provided by the embodiment of the invention strengthens two components contained in the generative learning convolutional neural network module: 1. the self-generation network; 2. the cross generation network.
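As a concrete illustration of this decoupled encoder/decoder layout, a minimal PyTorch sketch is given below. The module names, layer shapes and code dimensions are illustrative assumptions for exposition, not the network definition of the patent.

```python
import torch
import torch.nn as nn

class AppearanceEncoder(nn.Module):          # E_a: x_i -> a_i (surface code)
    def __init__(self, code_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, code_dim))
    def forward(self, x):
        return self.net(x)                   # one vector per image

class StructureEncoder(nn.Module):           # E_s: x_j -> s_j (structure code)
    def __init__(self, code_ch=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, code_ch, 4, stride=2, padding=1))
    def forward(self, x):
        return self.net(x)                   # spatial map keeps pose/shape layout

class Decoder(nn.Module):                    # G: (a, s) -> generated image
    def __init__(self, code_dim=128, code_ch=16):
        super().__init__()
        self.fuse = nn.Conv2d(code_ch + code_dim, 64, 3, padding=1)
        self.up = nn.Sequential(
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh())
    def forward(self, a, s):
        # broadcast the appearance vector over the spatial structure map
        a_map = a[:, :, None, None].expand(-1, -1, s.size(2), s.size(3))
        return self.up(self.fuse(torch.cat([s, a_map], dim=1)))
```

Swapping which image supplies `a` and which supplies `s` is what distinguishes the self-generation and cross generation paths described next.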
With respect to the self-generation network: given any pedestrian image $x_i$, the generative learning convolutional neural network module first learns how to reconstruct the image from itself. The general approach is shown in Fig. 2. Given two pedestrian images $x_i$ and $x_t$ with the same identity (i.e., the same person), their structure information codes correspond to the same identity (i.e., $y_i = y_t$). For pedestrian image $x_i$, a reconstructed image $x_i^i = G(a_i, s_i)$ can be generated from its own surface information code $a_i$ and structure information code $s_i$; for pedestrian image $x_t$, a reconstructed image $x_i^t = G(a_t, s_i)$ can be generated from its surface information code $a_t$ combined with the structure information code $s_i$ of $x_i$. This kind of simple self-reconstruction task provides an important regularization for the entire generative learning. The image reconstruction loss function for reconstructing a pedestrian image in the present invention is:

$$L_{recon}^{img} = \mathbb{E}\big[\lVert x_i - G(a_i, s_i)\rVert_1\big] \quad (1)$$
where $G(a_i, s_i)$ denotes the image reconstructed from pedestrian image $x_i$, $a_i$ denotes the surface information code of $x_i$, $s_i$ denotes the structure information code of $x_i$, and $\mathbb{E}$ is the expectation operator (the same below). Meanwhile, in order to strengthen the distinction between the surface information codes of different images, the embodiment of the invention constructs an identity loss function in the self-generation network to distinguish the identity characteristics of different images, with the formula:

$$L_{id}^{self} = \mathbb{E}\big[-\log p(y_i \mid x_i)\big] \quad (2)$$

where $p(y_i \mid x_i)$ is the predicted probability, based on the surface information code of the image, that image $x_i$ belongs to identity class $y_i$.
Regarding the cross-identity generation network: unlike the self-generation network, the cross-identity generation network performs generative learning across different identity characteristics of images. More specifically, the cross generation network synthesizes pedestrian images with different identity characteristics to generate a synthesized image. The present invention uses a latent code reconstruction technique based on the surface information code and the structure information code to control image generation. As shown in Fig. 3, given two pedestrian images $x_i$ and $x_j$ with different identities (i.e., not the same person, so $y_i \neq y_j$), the two images have different structure information codes; further, since the clothes, trousers, shoes, and so on of the pedestrians in the two images differ, their surface information codes also differ. From the surface information code $a_i$ of image $x_i$ and the structure information code $s_j$ of image $x_j$, a generated image $x_j^i = G(a_i, s_j)$ can be synthesized; likewise, from the surface information code $a_j$ of image $x_j$ and the structure information code $s_i$ of image $x_i$, a generated image $x_i^j = G(a_j, s_i)$ can be synthesized. The generated image $x_j^i = G(a_i, s_j)$ must preserve the information of the surface information code $a_i$ of $x_i$ and of the structure information code $s_j$ of $x_j$; the two latent codes should therefore be recoverable from the generated image. The specific formulas are:

$$L_{recon}^{code1} = \mathbb{E}\big[\lVert a_i - E_a(G(a_i, s_j))\rVert_1\big] \quad (3)$$
$$L_{recon}^{code2} = \mathbb{E}\big[\lVert s_j - E_s(G(a_i, s_j))\rVert_1\big] \quad (4)$$

where $L_{recon}^{code1}$ is the surface information code reconstruction loss used to recover the surface information code from the generated pedestrian image, and $L_{recon}^{code2}$ is the structure information code reconstruction loss used to recover the structure information code from the generated pedestrian image.
Similarly, the loss function for distinguishing the identity features of different images in the cross-identity generation network is:

$$L_{id}^{cross} = \mathbb{E}\big[-\log p(y_i \mid x_j^i)\big]$$

where $p(y_i \mid x_j^i)$ is the predicted probability that the synthesized image $x_j^i$ carries the identity label $y_i$ of the real pedestrian image $x_i$. In addition, the embodiment of the present invention provides an adversarial loss function that matches the generated images to the real data distribution:

$$L_{adv} = \mathbb{E}\big[\log D(x_i) + \log\big(1 - D(G(a_j, s_j))\big)\big] \quad (5)$$
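To make the interaction of losses (1)-(5) concrete, a hedged PyTorch sketch follows, under the assumptions of the earlier architecture sketch; `id_classifier` and `discriminator` are hypothetical auxiliary networks, and the exact loss forms used by the patent may differ in detail.

```python
import torch
import torch.nn.functional as F

def generative_losses(x_i, x_j, y_i, E_a, E_s, G, id_classifier, discriminator):
    a_i, s_i = E_a(x_i), E_s(x_i)
    a_j, s_j = E_a(x_j), E_s(x_j)

    # (1) image reconstruction: rebuild x_i from its own codes
    L_img = F.l1_loss(G(a_i, s_i), x_i)

    # (2) identity loss on the real image x_i
    L_id_self = F.cross_entropy(id_classifier(x_i), y_i)

    # cross-identity synthesis: appearance of x_i on the structure of x_j
    x_gen = G(a_i, s_j)

    # (3)/(4) latent code reconstruction on the generated image;
    # the original codes are treated as fixed targets
    L_code = (F.l1_loss(E_a(x_gen), a_i.detach())
              + F.l1_loss(E_s(x_gen), s_j.detach()))

    # (5) adversarial term, written as the discriminator-side loss;
    # D is assumed to output a probability in (0, 1)
    L_adv = -(torch.log(discriminator(x_i))
              + torch.log(1.0 - discriminator(G(a_j, s_j)))).mean()
    return L_img, L_id_self, L_code, L_adv
```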
based on the loss function and the picture processing method, a learning convolutional neural network module can be constructed and generated by combining the existing residual error network 50(Res-net50) model.
Further, regarding the construction of the joint discrimination module: the joint discrimination module can be divided into two main parts, primary feature learning and fine-grained feature mining. Prior experience shows that the online generated images (for example, the reconstructed images and synthesized images produced by the generative learning convolutional neural network module) can be better utilized through these two parts. Because the two parts typically focus on different aspects of the generated image, the algorithm provided by the embodiment of the invention branches two lightweight heads off the top of the appearance encoder, one for each part of the feature learning. Regarding the learning of the primary features (which may also be referred to as preliminary features), the images generated in S101 can be regarded as similar to training samples in an existing model. However, the variation of the pedestrian images across categories and across ID combinations leads the embodiment of the invention to adopt a teacher-student style supervision method with dynamic soft labels. The teacher model in this method is simply a baseline CNN (convolutional neural network) trained with an identification loss on the original training set; as this is the same as the prior art, it is not described again here. In order to train the discrimination model with the primary features, the algorithm provided in the embodiment of the present invention minimizes the Kullback-Leibler divergence (KL divergence for short) between the probability distribution p predicted by the joint discrimination module and the probability distribution q predicted by the teacher model, using the following loss function:

$$L_{prim} = \mathbb{E}\left[-\sum_{k=1}^{K} q(k \mid x_j^i)\,\log\frac{p(k \mid x_j^i)}{q(k \mid x_j^i)}\right] \quad (6)$$

where K is the number of identity classes, $p(k \mid x_j^i)$ is the probability predicted by the joint discrimination module that the synthesized image $x_j^i$ belongs to identity class k, and $q(k \mid x_j^i)$ is the corresponding probability predicted by the teacher model. In other words, this loss function is used to learn the preliminary features of the reconstructed image and the synthesized image. Compared with fixed feature labels, the dynamic labels used by the method are better suited to the joint discrimination model and can enhance the reliability of the primary-feature discrimination model.
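The KL term in eq. (6) can be written, for instance, as the following minimal PyTorch sketch; the variable names are assumptions, with `student_logits` coming from the joint discrimination module and `teacher_logits` from the fixed baseline CNN teacher.

```python
import torch.nn.functional as F

def primary_feature_loss(student_logits, teacher_logits):
    p_log = F.log_softmax(student_logits, dim=1)   # log p(k | x_j^i)
    q = F.softmax(teacher_logits, dim=1).detach()  # q(k | x_j^i), teacher frozen
    # KL(q || p) = sum_k q * (log q - log p); batchmean averages over samples
    return F.kl_div(p_log, q, reduction='batchmean')
```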
Regarding the mining of fine-grained features: unlike primary feature learning, which acts directly on the acquired generated data, fine-grained feature mining is mainly reflected in shifting the points of interest on general pedestrian images. Fine-grained feature mining trains on the pedestrian images in the training library or on the images generated in S101, and forces the joint discrimination module to learn fine-grained ID-related attributes (such as hair, hat, bag, body shape, etc.) that are independent of clothing. In this part's fine-grained feature discrimination model, the algorithm provided by the embodiment of the invention treats an image generated by combining one structure information code with different appearance information codes as belonging to the same class as the real image that provided the structure code. To realize this function, the fine-grained feature discrimination model in the joint discrimination module is obtained by training with the following loss function, which strengthens this specific classification:

$$L_{fine} = \mathbb{E}\big[-\log p(y_j \mid x_j^i)\big] \quad (7)$$
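In code, eq. (7) reduces to a cross-entropy on the generated image against the structure provider's label; a tiny hedged sketch (the names are assumptions):

```python
import torch.nn.functional as F

def fine_grained_loss(logits_on_generated, y_structure):
    # eq. (7): the synthesized image x_j^i inherits the identity label y_j
    # of the image x_j that supplied its structure code
    return F.cross_entropy(logits_on_generated, y_structure)
```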
such a loss function may impose additional identity supervision on the identification discrimination module in a multitasking manner. Compared with the existing method for mining the fine-grained feature samples, the algorithm does not need to explicitly search the hard training samples with the fine granularity, and the identification module of the algorithm is focused on the fine identity feature attributes of the pedestrian images through the fine-grained feature mining of the method.
It can be generally considered that a high quality composite image can be considered "portal" in nature (as opposed to "outliers") because the images generated by the generative learning model preserve and recombine visual content from the real data. Through the two characteristic learning tasks, the combined judgment module of the method can enable the integrally built model network to specifically use the generated data according to the operation on the appearance information code and the structure information code.
The method of the invention does not use single supervision like all previous methods, but processes the generated image from two different angles through the learning of primary features (also called main features) and the mining of fine-grained features, wherein the former focuses on the costume external information with invariable structural information, and the latter focuses on the structural clue with invariable apparent information.
Further, regarding the determination of the pedestrian re-identification system model: the embodiment of the present invention trains the surface encoder, the structure encoder, the decoder and the joint discrimination module together to optimize an overall objective. That is, based on the image reconstruction loss functions, the identity loss function of the self-generation network, the surface information code reconstruction loss function, the structure information code reconstruction loss function, the identity loss function of the cross generation network, and the adversarial loss function, together with the losses of the joint discrimination module, the overall network loss function of the generative learning convolutional neural network module and the joint discrimination module can be constructed as:

$$L_{total} = \lambda_{img}\big(L_{recon}^{img1} + L_{recon}^{img2}\big) + L_{recon}^{code1} + L_{recon}^{code2} + L_{id}^{self} + \lambda_{id}\,L_{id}^{cross} + L_{adv} + \lambda_{prim}\,L_{prim} + \lambda_{fine}\,L_{fine} \quad (8)$$

where $L_{recon}^{img1}$ and $L_{recon}^{img2}$ are the image reconstruction losses for the two reconstructed pedestrian images, and $L_{recon}^{code1}$ and $L_{recon}^{code2}$ are the latent information code reconstruction losses in cross-identity generative learning. $\lambda_{img}$, $\lambda_{id}$, $\lambda_{prim}$ and $\lambda_{fine}$ are weights controlling the importance of the corresponding loss terms. In the actual image-to-image conversion process, a large weight $\lambda_{img} = 5$ is generally used for the image reconstruction loss. Since the cross-ID generated images are of low quality at the beginning of training, the identity loss $L_{id}^{cross}$ may make training unstable, so a smaller weight $\lambda_{id} = 0.5$ is set. Meanwhile, before the generation quality stabilizes, the method does not involve the discriminative feature learning losses $L_{prim}$ and $L_{fine}$. After the overall model function is determined, the whole network of the generative learning convolutional neural network module and the joint discrimination module can be trained with the overall network loss function to obtain and output the pedestrian re-identification system model. The pedestrian re-identification system model can subsequently be used for pedestrian re-identification or large-scale retrieval based on pedestrian images.
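As an illustration of how eq. (8) and the training schedule fit together, here is a hedged sketch; the values of $\lambda_{prim}$ and $\lambda_{fine}$ and the `warmup_done` gate are assumptions beyond what the text states.

```python
def total_loss(L_img1, L_img2, L_code1, L_code2, L_id_self, L_id_cross,
               L_adv, L_prim, L_fine, warmup_done,
               lambda_img=5.0, lambda_id=0.5, lambda_prim=1.0, lambda_fine=1.0):
    # eq. (8): weighted sum of generative and discriminative losses
    L = (lambda_img * (L_img1 + L_img2) + L_code1 + L_code2
         + L_id_self + lambda_id * L_id_cross + L_adv)
    if warmup_done:  # L_prim / L_fine only after generation quality stabilizes
        L = L + lambda_prim * L_prim + lambda_fine * L_fine
    return L
```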
S102: and carrying out segmentation processing on the pedestrian video to be recognized based on the key frame of the pedestrian video to be recognized, and extracting pedestrian feature vectors from the segmented pedestrian video to be recognized through the pedestrian re-recognition system model.
In the embodiment of the present invention, a real-time pedestrian video stream or a video file may be transmitted to the Spark big data platform; the video image data is divided into video segments by a Map (mapping) step, pedestrian feature extraction is then performed, and the pedestrian images and pedestrian information (including the pedestrian feature vectors and the like) obtained from the extraction results are automatically aggregated and stored by a Reduce step, as specifically shown in Fig. 4.
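A hedged PySpark sketch of this Map/Reduce flow follows. `split_video` stands for the keyframe-based segmentation described next, and `extract_pedestrians` for the re-identification step; both are illustrative placeholders, not a published API.

```python
from pyspark import SparkContext

def split_video(path):
    # placeholder: return keyframe-aligned segments as (path, start_s, end_s)
    return [(path, 0, 60), (path, 60, 120)]

def extract_pedestrians(segment):
    # placeholder: run the re-identification model on one segment and
    # yield (pedestrian_id, feature_vector) pairs
    return []

sc = SparkContext(appName="pedestrian-search")
segments = split_video("hdfs:///videos/cam01.mp4")
features = (sc.parallelize(segments)
              .flatMap(extract_pedestrians)   # Map: per-segment features
              .groupByKey()                   # Reduce: aggregate per pedestrian
              .mapValues(list)
              .collect())
```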
Performing video segmentation based on video keyframes allows the video to be better parallelized. The segmentation process is roughly divided into the following three steps: 1. distinguish I-frame data and P-frame data according to the differences between frame data in the pedestrian video to be identified, and extract the keyframe information of the pedestrian video to be identified; 2. based on the keyframe information, take the keyframes of the pedestrian video to be identified as segmentation points of the video, where a keyframe contains the pedestrian to be identified and a preset marker (such as a building, a shop sign, a guideboard, a license plate, etc.); 3. according to a moving object detection algorithm, take the time points at which a moving object appears or disappears as the start and end points of the segmented videos, and obtain the file position of each segmented video from the appearance or disappearance of the moving object. The moving object detection algorithm is the same as in the prior art and is therefore not described again here. It can be seen that video segmentation rests mainly on the following 3 constraints: the time points at which a moving target appears or disappears in the video serve as the start and end points of a segmented video; a segmentation point must be a video keyframe, since only a video file cut at keyframes yields complete video images; and a video clip must be no shorter than 30 seconds and no longer than 6 minutes. Finally, the segmented video images can be output. In addition, considering the requirements of target detection in practical applications, the embodiment of the present invention may set the size of the search area so that the aspect ratio of the picture is variable while the overall picture size is unchanged. This not only helps to meet the processing requirements of the video images but also greatly reduces the amount of computation. For an original input picture, the region proposal network (RPN) will produce about twenty thousand search boxes. In practical applications, search boxes extending beyond the picture boundary can be eliminated; meanwhile, search boxes that overlap and cover the same target can be processed with Non-Maximum Suppression (NMS) to remove overlapping search boxes. This strategy can significantly improve the search efficiency for candidate target boxes. Finally, the pedestrian feature vectors are extracted from the segmented pedestrian video to be identified through the pedestrian re-identification system model obtained in S101. It should be noted that the more training images are input in S101, the more accurate the model and the wider its coverage. Through pedestrian target training on a huge number of pedestrian image learning samples together with extensive on-site system tuning and testing, features such as the appearance contour, relative positions, and the colors and textures of parts such as clothes, face, upper body and lower body can be collected and described, forming a large amount of auxiliary classification information, which together with results such as the pedestrian's age and gender finally yields a comprehensive confidence score.
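The three segmentation constraints can be sketched as the following minimal Python routine. The per-frame records and field names (`key`, `motion`) are illustrative assumptions, and segments shorter than 30 seconds are simply dropped here as a simplification.

```python
def segment_boundaries(frames, fps=25):
    """frames: list of dicts like {'key': bool, 'motion': bool}, one per frame."""
    min_len, max_len = 30 * fps, 6 * 60 * fps   # 30 seconds to 6 minutes
    cuts, start = [], None
    for idx, f in enumerate(frames):
        if start is None:
            if f['key'] and f['motion']:        # moving target appears at a keyframe
                start = idx
        elif f['key'] and (not f['motion'] or idx - start >= max_len):
            if idx - start >= min_len:          # close the segment at a keyframe
                cuts.append((start, idx))
            start = idx if f['motion'] else None
    return cuts
```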
S103: and calculating the similarity between the feature vector of the target pedestrian and the pedestrian feature vector in the pedestrian video to be recognized, and obtaining a retrieval image of the target pedestrian from the pedestrian video to be recognized according to the similarity.
In the embodiment of the invention, the similarity between the feature vector of the target pedestrian and a pedestrian feature vector in the pedestrian video to be identified can be calculated according to the following formula:

$$\cos\theta = \frac{\vec{x} \cdot \vec{y}}{\lVert\vec{x}\rVert\,\lVert\vec{y}\rVert}$$

where $\vec{x}$ is the feature vector of the target pedestrian, $\vec{y}$ is the pedestrian feature vector in the pedestrian video to be identified, $\lVert\vec{x}\rVert$ and $\lVert\vec{y}\rVert$ are the norms of $\vec{x}$ and $\vec{y}$, and $\theta$ is the angle between $\vec{x}$ and $\vec{y}$; the smaller the calculated angle, the higher the similarity. A retrieval image of the target pedestrian is then obtained from the pedestrian video to be identified according to the similarity. The feature vector of the pedestrian in a retrieval image of the target pedestrian is the same as or similar to the feature vector of the target pedestrian. Specifically, the pedestrian images and pedestrian information that are identical or most similar to the target pedestrian can be obtained by ranking the similarities. Suppose a pedestrian A (the target pedestrian) is to be found: with this method, retrieval images of pedestrian A can be obtained from the pedestrian videos (pedestrian videos to be identified) captured by a large number of cameras; since these retrieval images also contain pedestrian A, the specific trajectory of pedestrian A can be determined from the retrieval images and the pedestrian information in them.
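A small NumPy sketch of this retrieval step follows: cosine similarity between the target feature vector and every pedestrian feature extracted from the video, ranked best-first. The variable names are illustrative.

```python
import numpy as np

def search_target(target_vec, gallery_vecs, top_k=10):
    x = target_vec / np.linalg.norm(target_vec)
    g = gallery_vecs / np.linalg.norm(gallery_vecs, axis=1, keepdims=True)
    sims = g @ x                       # cos(theta) for every gallery vector
    order = np.argsort(-sims)[:top_k]  # smaller angle => larger cosine => more similar
    return order, sims[order]
```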
With the pedestrian search method based on joint judgment and generation learning provided in Fig. 1, a pedestrian re-identification system model can be obtained; the pedestrian video to be identified is segmented based on its keyframes; pedestrian feature vectors are extracted from the segmented video through the pedestrian re-identification system model; the similarity between the feature vector of a target pedestrian and the pedestrian feature vectors in the video is calculated; and retrieval images of the target pedestrian are obtained from the video according to the similarity. The efficiency and accuracy of pedestrian search over massive video can thus be improved, and the requirement of real-time search can be met. Based on the structure of the Spark big data platform and intelligent big data analysis, the embodiment of the invention successively realizes a deep-learning-based intelligent video keyframe segmentation algorithm and a video-based pedestrian search technique; it offers high system reliability, good recognition performance, good robustness and simple computation steps, maintains high efficiency, and meets real-time requirements.
Referring to fig. 5, fig. 5 is a block diagram illustrating a pedestrian searching apparatus based on joint judgment and generation learning according to an embodiment of the present invention. As shown in fig. 5, the pedestrian search apparatus 50 based on joint determination and generation learning of the present embodiment includes an acquisition module 501, an extraction module 502, and a retrieval module 503. The obtaining module 501, the extracting module 502 and the retrieving module 503 are respectively configured to perform the specific methods in S101, S102 and S103 in fig. 1, and details can be referred to in the related introduction of fig. 1 and are only briefly described here:
the obtaining module 501 is configured to obtain a pedestrian re-identification system model.
The extraction module 502 is configured to perform segmentation processing on the to-be-identified pedestrian video based on the keyframe of the to-be-identified pedestrian video, and extract a pedestrian feature vector from the segmented to-be-identified pedestrian video through the pedestrian re-identification system model.
The retrieval module 503 is configured to calculate a similarity between a feature vector of a target pedestrian and the pedestrian feature vector in the to-be-identified pedestrian video, and obtain a retrieval image of the target pedestrian from the to-be-identified pedestrian video according to the similarity.
Further, as can be seen in fig. 6, the obtaining module 501 may specifically include a learning network building unit 5011, a discrimination building unit 5012, and a re-recognition model determining unit 5013:
a learning network construction unit 5011 for constructing and generating a learning convolutional neural network module based on a given pedestrian image; the learning convolutional neural network generation module comprises a self-generation network and a cross generation network; the self-generation network is used for reconstructing a given pedestrian image to generate a reconstructed image, and the cross generation network is used for synthesizing pedestrian images with different identity characteristics to generate a synthesized image.
The judgment building unit 5012 is used for building a joint judgment module based on the generated learning convolutional neural network module; the joint discrimination module is used for learning the preliminary features of the reconstructed image and the synthesized image and mining the fine-grained features of the reconstructed image and the synthesized image.
A re-recognition model determining unit 5013, configured to determine a loss function of an overall network of the generated learning convolutional neural network module and the joint discrimination module based on the loss function of the generated learning convolutional neural network module and the loss function of the joint discrimination module, and determine a pedestrian re-recognition system model according to the overall network loss function; the pedestrian re-identification system model is used for re-identifying pedestrians on the basis of pedestrian images.
Further, referring to fig. 7, the extraction module 502 may specifically include a distinguishing unit 5021, a first dividing unit 5022, a second dividing unit 5023, and an extraction unit 5024:
the distinguishing unit 5021 is configured to distinguish I-frame data and P-frame data according to differences of different frame data in the pedestrian video to be identified, and extract key frame information of the pedestrian video to be identified.
A first dividing unit 5022, configured to use the keyframe of the to-be-identified pedestrian video as a segmentation point of the to-be-identified pedestrian video based on the keyframe information; the keyframes include pedestrians to be identified and preset markers (e.g., buildings, store signs, road boards, license plates, etc.).
And the second dividing unit 5023 is configured to use a time point when a moving object appears or disappears as a start-stop time point of the segmented video of the pedestrian video to be identified according to a moving object detection algorithm.
And the extraction unit 5024 is used for extracting the pedestrian feature vector from the segmented pedestrian video to be identified through the pedestrian re-identification system model.
Further, referring to fig. 8, the retrieving module 503 may specifically include a calculating unit 5031 and a searching unit 5032:
a calculating unit 5031, configured to calculate a similarity between a feature vector of a target pedestrian and the pedestrian feature vector in the to-be-identified pedestrian video
Figure BDA0002177347200000102
Wherein the content of the first and second substances,
Figure BDA0002177347200000103
is the feature vector of the target pedestrian,
Figure BDA0002177347200000104
for the pedestrian feature vector in the pedestrian video to be identified, | x | is
Figure BDA0002177347200000105
Is | y |Norm of theta is
Figure BDA0002177347200000107
And
Figure BDA0002177347200000108
the included angle therebetween.
A searching unit 5032, configured to obtain, according to the similarity, a retrieval image of the target pedestrian from the pedestrian video to be recognized.
The pedestrian search device based on joint judgment and generation learning shown in Fig. 5 can acquire a pedestrian re-identification system model, segment the pedestrian video to be identified based on its keyframes, extract pedestrian feature vectors from the segmented video through the pedestrian re-identification system model, calculate the similarity between the feature vector of a target pedestrian and the pedestrian feature vectors in the video, and obtain retrieval images of the target pedestrian according to the similarity, so that the efficiency and accuracy of pedestrian search over massive video can be improved and the requirement of real-time search can be met. Based on the structure of the Spark big data platform and intelligent big data analysis, the embodiment of the invention successively realizes a deep-learning-based intelligent video keyframe segmentation algorithm and a video-based pedestrian search technique; it offers high system reliability, good recognition performance, good robustness and simple computation steps, maintains high efficiency, and meets real-time requirements.
Fig. 9 is a schematic diagram of a terminal device according to an embodiment of the present invention. As shown in fig. 9, the terminal device 9 of this embodiment includes: a processor 90, a memory 91, and a computer program 92 stored in the memory 91 and executable on the processor 90, such as a program performing pedestrian search based on joint judgment and generation learning. When executing the computer program 92, the processor 90 implements the steps in the above method embodiments, e.g., S101 to S103 shown in fig. 1; alternatively, the processor 90 implements the functions of the modules/units in the above device embodiments, e.g., the functions of modules 501 to 503 shown in fig. 5.
Illustratively, the computer program 92 may be partitioned into one or more modules/units that are stored in the memory 91 and executed by the processor 90 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 92 in the terminal device 9. For example, the computer program 92 may be divided into an acquisition module 501, an extraction module 502 and a retrieval module 503. (modules in the virtual device), the specific functions of each module are as follows:
the obtaining module 501 is configured to obtain a pedestrian re-identification system model.
The extraction module 502 is configured to perform segmentation processing on the to-be-identified pedestrian video based on the keyframe of the to-be-identified pedestrian video, and extract a pedestrian feature vector from the segmented to-be-identified pedestrian video through the pedestrian re-identification system model.
The retrieval module 503 is configured to calculate a similarity between a feature vector of a target pedestrian and the pedestrian feature vector in the to-be-identified pedestrian video, and obtain a retrieval image of the target pedestrian from the to-be-identified pedestrian video according to the similarity.
The terminal device 9 may be a desktop computer, a notebook, a palm computer, a cloud server, or other computing devices. Terminal device 9 may include, but is not limited to, a processor 90, a memory 91. Those skilled in the art will appreciate that fig. 9 is only an example of a terminal device 9, and does not constitute a limitation to the terminal device 9, and may include more or less components than those shown, or combine some components, or different components, for example, the terminal device may also include an input-output device, a network access device, a bus, etc.
The Processor 90 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 91 may be an internal storage unit of the terminal device 9, such as a hard disk or a memory of the terminal device 9. The memory 91 may also be an external storage device of the terminal device 9, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) and the like provided on the terminal device 9. Further, the memory 91 may also include both an internal storage unit of the terminal device 9 and an external storage device. The memory 91 is used for storing the computer program and other programs and data required by the terminal device 9. The memory 91 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, and the like. It should be noted that the computer readable medium may contain content that is subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable media does not include electrical carrier signals and telecommunications signals as is required by legislation and patent practice.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (10)

1. A pedestrian searching method based on joint judgment and generation learning is characterized by comprising the following steps:
acquiring a pedestrian re-identification system model;
segmenting the pedestrian video to be identified based on the key frame of the pedestrian video to be identified, and extracting a pedestrian feature vector from the segmented pedestrian video to be identified through the pedestrian re-identification system model;
and calculating the similarity between the feature vector of the target pedestrian and the pedestrian feature vector in the pedestrian video to be recognized, and obtaining a retrieval image of the target pedestrian from the pedestrian video to be recognized according to the similarity.
2. The pedestrian search method based on joint judgment and generation learning according to claim 1, wherein the obtaining of the pedestrian re-identification system model comprises:
building and generating a learning convolution neural network module based on a given pedestrian image; the learning convolutional neural network generation module comprises a self-generation network and a cross generation network; the self-generation network is used for reconstructing a given pedestrian image to generate a reconstructed image, and the cross generation network is used for synthesizing pedestrian images with different identity characteristics to generate a synthesized image;
building a combined judgment module according to the generated learning convolutional neural network module; the joint discrimination module is used for learning the preliminary features of the reconstructed image and the synthesized image and mining the fine-grained features of the reconstructed image and the synthesized image;
determining a loss function of an overall network of the generated learning convolutional neural network module and the combined discrimination module based on the loss function of the generated learning convolutional neural network module and the loss function of the combined discrimination module, and determining a pedestrian re-identification system model according to the loss function of the overall network; the pedestrian re-identification system model is used for re-identifying pedestrians on the basis of pedestrian images.
3. The pedestrian search method based on joint judgment and generation learning according to claim 1, wherein the step of segmenting the pedestrian video to be recognized based on the keyframe of the pedestrian video to be recognized and extracting the pedestrian feature vector from the segmented pedestrian video to be recognized through the pedestrian re-recognition system model comprises the steps of:
distinguishing I frame data and P frame data according to different frame data in the pedestrian video to be identified, and taking out key frame information of the pedestrian video to be identified;
based on the key frame information, taking the key frame of the pedestrian video to be identified as a segmentation point of the pedestrian video to be identified; the key frame comprises a pedestrian to be identified and a preset marker;
according to a moving target detection algorithm, taking the time point when a moving target appears or disappears as the starting and stopping time point of the segmented video of the pedestrian video to be identified;
and extracting pedestrian feature vectors from the segmented pedestrian video to be recognized through the pedestrian re-recognition system model.
4. The pedestrian search method based on joint judgment and generation learning according to claim 1, wherein the calculating a similarity between a feature vector of a target pedestrian and the pedestrian feature vector in the pedestrian video to be recognized and obtaining a retrieval image of the target pedestrian from the pedestrian video to be recognized according to the similarity comprises:
calculating the similarity between the feature vector of the target pedestrian and the pedestrian feature vector in the pedestrian video to be identified as

$$\cos\theta = \frac{\vec{x} \cdot \vec{y}}{\|\vec{x}\|\,\|\vec{y}\|}$$

wherein $\vec{x}$ is the feature vector of the target pedestrian, $\vec{y}$ is the pedestrian feature vector in the pedestrian video to be identified, $\|\vec{x}\|$ is the norm of $\vec{x}$, $\|\vec{y}\|$ is the norm of $\vec{y}$, and $\theta$ is the included angle between $\vec{x}$ and $\vec{y}$; the value of $\cos\theta$ is taken as the similarity;
and obtaining a retrieval image of the target pedestrian from the pedestrian video to be identified according to the similarity.
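Claim 4's similarity is the standard cosine similarity and can be computed directly from the formula above; a minimal NumPy sketch (with made-up vectors) follows.

```python
# Minimal sketch of the cosine similarity in claim 4. The vectors below are
# made up; real feature vectors would come from the re-identification model.
import numpy as np

def cosine_similarity(x: np.ndarray, y: np.ndarray) -> float:
    """cos(theta) = (x . y) / (||x|| * ||y||)."""
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

query = np.array([0.2, 0.8, 0.1])        # feature vector of the target pedestrian
candidate = np.array([0.25, 0.7, 0.05])  # pedestrian feature vector from the video
print(cosine_similarity(query, candidate))  # values near 1.0 suggest a match
```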
5. A pedestrian search device based on joint judgment and generation learning, characterized by comprising:
the acquisition module is used for acquiring a pedestrian re-identification system model;
the extraction module is used for segmenting the pedestrian video to be identified based on the key frames of the pedestrian video to be identified and extracting a pedestrian feature vector from the segmented pedestrian video to be identified through the pedestrian re-identification system model;
and the retrieval module is used for calculating the similarity between the feature vector of the target pedestrian and the pedestrian feature vector in the pedestrian video to be identified and obtaining the retrieval image of the target pedestrian from the pedestrian video to be identified according to the similarity.
6. The pedestrian search device based on joint judgment and generation learning according to claim 5, wherein the acquisition module comprises:
the learning network building unit is used for building a generative-learning convolutional neural network module based on a given pedestrian image, the module comprising a self-generation network and a cross-generation network; the self-generation network is used for reconstructing the given pedestrian image to generate a reconstructed image, and the cross-generation network is used for synthesizing pedestrian images with different identity characteristics to generate a synthesized image;
the discrimination building unit is used for building a joint discrimination module based on the generative-learning convolutional neural network module; the joint discrimination module is used for learning the preliminary features of the reconstructed image and the synthesized image and for mining their fine-grained features;
the re-identification model determining unit is used for determining a loss function of the overall network formed by the generative-learning convolutional neural network module and the joint discrimination module based on the loss function of the generative-learning convolutional neural network module and the loss function of the joint discrimination module, and for determining the pedestrian re-identification system model according to the loss function of the overall network; the pedestrian re-identification system model is used for re-identifying pedestrians on the basis of pedestrian images.
7. The pedestrian search apparatus based on joint judgment and generation learning according to claim 5, wherein the extraction module includes:
the distinguishing unit is used for distinguishing I-frame data from P-frame data among the frame data of the pedestrian video to be identified and extracting the key frame information of the pedestrian video to be identified;
the first dividing unit is used for taking the key frame of the pedestrian video to be identified as the segmentation point of the pedestrian video to be identified based on the key frame information; the key frame comprises a pedestrian to be identified and a preset marker;
the second dividing unit is used for taking the time points at which a moving target appears or disappears as the start and stop time points of the segmented sub-videos of the pedestrian video to be identified according to a moving target detection algorithm;
and the extraction unit is used for extracting the pedestrian feature vector from the segmented pedestrian video to be identified through the pedestrian re-identification system model.
8. The pedestrian search apparatus based on joint judgment and generation learning according to claim 5, wherein the retrieval module includes:
a calculation unit for calculating the similarity between the feature vector of the target pedestrian and the pedestrian feature vector in the pedestrian video to be identified as

$$\cos\theta = \frac{\vec{x} \cdot \vec{y}}{\|\vec{x}\|\,\|\vec{y}\|}$$

wherein $\vec{x}$ is the feature vector of the target pedestrian, $\vec{y}$ is the pedestrian feature vector in the pedestrian video to be identified, $\|\vec{x}\|$ is the norm of $\vec{x}$, $\|\vec{y}\|$ is the norm of $\vec{y}$, and $\theta$ is the included angle between $\vec{x}$ and $\vec{y}$; the value of $\cos\theta$ is taken as the similarity;
and the searching unit is used for obtaining the retrieval image of the target pedestrian from the pedestrian video to be identified according to the similarity.
9. A terminal device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 4 when executing the computer program.
10. A computer-readable medium storing a computer program which, when executed by a processor, carries out the steps of the method according to any one of claims 1 to 4.
CN201910783692.4A 2019-08-23 2019-08-23 Pedestrian searching method and device based on joint judgment and generation learning Pending CN110659576A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910783692.4A CN110659576A (en) 2019-08-23 2019-08-23 Pedestrian searching method and device based on joint judgment and generation learning

Publications (1)

Publication Number Publication Date
CN110659576A (en) 2020-01-07

Family

ID=69037783

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910783692.4A Pending CN110659576A (en) 2019-08-23 2019-08-23 Pedestrian searching method and device based on joint judgment and generation learning

Country Status (1)

Country Link
CN (1) CN110659576A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106682108A (en) * 2016-12-06 2017-05-17 浙江大学 Video retrieval method based on multi-modal convolutional neural network
WO2019144575A1 (en) * 2018-01-24 2019-08-01 中山大学 Fast pedestrian detection method and device
CN109101913A (en) * 2018-08-01 2018-12-28 北京飞搜科技有限公司 Pedestrian re-identification method and device
CN110110755A (en) * 2019-04-04 2019-08-09 长沙千视通智能科技有限公司 Pedestrian re-identification detection algorithm and device based on PTGAN region gap and multiple branches
CN110110601A (en) * 2019-04-04 2019-08-09 深圳久凌软件技术有限公司 Video pedestrian re-identification algorithm and device based on multi-space attention model
CN110096982A (en) * 2019-04-22 2019-08-06 长沙千视通智能科技有限公司 Video vehicle big data search method based on deep learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ZHEDONG ZHENG et al.: "Joint Discriminative and Generative Learning for Person Re-Identification", IEEE *
吕云翔 et al.: "Design and Implementation of a Machine-Learning-Based Pedestrian Detection and Tracking System for Surveillance Video", Industry and Information Technology Education *
齐美彬 et al.: "Person Re-identification Based on Multi-feature Fusion and Independent Metric Learning", Journal of Image and Graphics *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112257692A (en) * 2020-12-22 2021-01-22 湖北亿咖通科技有限公司 Pedestrian target detection method, electronic device and storage medium
CN112257692B (en) * 2020-12-22 2021-03-12 湖北亿咖通科技有限公司 Pedestrian target detection method, electronic device and storage medium
CN113837022A (en) * 2021-09-02 2021-12-24 北京新橙智慧科技发展有限公司 Method for rapidly searching video pedestrian

Similar Documents

Publication Publication Date Title
CN109919031B (en) Human behavior recognition method based on deep neural network
CN110414432B (en) Training method of object recognition model, object recognition method and corresponding device
CN106909938B (en) Visual angle independence behavior identification method based on deep learning network
CN112419368A (en) Method, device and equipment for tracking track of moving target and storage medium
Hashmi et al. An exploratory analysis on visual counterfeits using conv-lstm hybrid architecture
CN110688897A (en) Pedestrian re-identification method and device based on joint judgment and generation learning
CN112801236B (en) Image recognition model migration method, device, equipment and storage medium
CN114758362B (en) Clothing changing pedestrian re-identification method based on semantic perception attention and visual shielding
CN108960412A (en) Image-recognizing method, device and computer readable storage medium
Fan Research and realization of video target detection system based on deep learning
CN110991278A (en) Human body action recognition method and device in video of computer vision system
CN113221770A (en) Cross-domain pedestrian re-identification method and system based on multi-feature hybrid learning
Hao et al. Recognition of basketball players’ action detection based on visual image and Harris corner extraction algorithm
CN110659576A (en) Pedestrian searching method and device based on joint judgment and generation learning
Varshney et al. Deep convolutional neural model for human activities recognition in a sequence of video by combining multiple CNN streams
Yu et al. Frequency feature pyramid network with global-local consistency loss for crowd-and-vehicle counting in congested scenes
CN110688512A (en) Pedestrian image search algorithm based on PTGAN region gap and deep neural network
Zhang et al. Human motion tracking and 3D motion track detection technology based on visual information features and machine learning
Shen et al. MCCG: A ConvNeXt-based Multiple-Classifier Method for Cross-view Geo-localization
Yuan et al. A systematic survey on human behavior recognition methods
CN113705301A (en) Image processing method and device
Zhang et al. Visual Object Tracking via Cascaded RPN Fusion and Coordinate Attention.
Gori et al. Semantic video labeling by developmental visual agents
Guo et al. An adaptive kernelized correlation filters with multiple features in the tracking application
Yan et al. Dance Action Recognition Model Using Deep Learning Network in Streaming Media Environment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200107)