CN107480178A - Pedestrian re-identification method based on cross-modal comparison of image and video - Google Patents
- Publication number
- CN107480178A CN107480178A CN201710536118.XA CN201710536118A CN107480178A CN 107480178 A CN107480178 A CN 107480178A CN 201710536118 A CN201710536118 A CN 201710536118A CN 107480178 A CN107480178 A CN 107480178A
- Authority
- CN
- China
- Prior art keywords
- video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/73—Querying
- G06F16/738—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7837—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content
- G06F16/784—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content the detected or recognised objects being people
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computational Linguistics (AREA)
- Databases & Information Systems (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Library & Information Science (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Human Computer Interaction (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a pedestrian re-identification method based on cross-modal comparison of image and video, for retrieving, from multiple videos, the videos containing the person who appears in an input query image. The method comprises the following steps: S1, build a configurable depth model; S2, obtain training samples and feed them into the depth model to train it, learning the parameters of each part of the model with the forward and backward algorithms; S3, initialize the depth model with the parameters learned in S2, input the query image to be matched and the multiple videos into the model, and compute a similarity measure between each video and the query image through the model; S4, list the videos whose similarity measure with the query image exceeds a threshold, ranked by the magnitude of the similarity measure. The invention achieves pedestrian re-identification based on cross-modal comparison of image and video while maintaining high accuracy.
Description
Technical field
The present invention relates to the fields of computer vision and pattern recognition, and in particular to a pedestrian re-identification method based on cross-modal comparison of image and video.
Background technology
Pedestrian re-identification is an important basic research problem in computer vision. It originates from person tracking in video: when a tracked person leaves the field of view of one camera and later re-enters a camera's field of view, he or she must be identified again and assigned the same ID as before. With the wide deployment of video surveillance, research on pedestrian re-identification has attracted increasing attention. Today, pedestrian re-identification is no longer limited to recognizing the same person under a single viewpoint; the more common setting is to re-identify a person across different times and different viewpoints.
Most existing research on pedestrian re-identification is confined to similarity comparison between images: the input query is an image, and the search database likewise consists of images. Although this line of research has a long history and has made significant progress, the problem remains a very challenging research topic. The main reason is that illumination, viewing angle, and background differ between cameras; combined with changes in body pose, the same person can look very different in photos captured by different cameras.
With the accelerating pace of smart-city construction, surveillance video containing person information is easy to obtain. In criminal investigation and security applications, a new demand has emerged: retrieving the surveillance videos that contain a person, given an input image of that person. Compared with a static image, a video carries richer information about a person and can characterize the person better. A video also contains multiple frames, so it can cope with occlusion better than a single static image. Moreover, because a video is a sequence of consecutive frames, the continuous spatio-temporal information it contains can assist person recognition. Using video information to perform pedestrian re-identification is therefore a more natural approach.
However, in image-to-video pedestrian re-identification, one difficulty is how to extract and exploit the video information effectively, since a video contains a large amount of information that is redundant with respect to a single image; handled improperly, it can actually reduce recognition accuracy. Another difficulty is how to carry out the cross-modal comparison reasonably, since comparing an image with a video involves two different modalities.
Content of the invention
In view of the above, there is a need to address the problems in the prior art by providing a pedestrian re-identification method based on cross-modal comparison of image and video. By computing the similarity between an image and each video in a video database, the method retrieves all the videos that contain the person corresponding to the input image.
To achieve the above object, the present invention adopts the following technical scheme:
A pedestrian re-identification method based on cross-modal comparison of image and video, for retrieving from multiple videos the videos containing the person in an input query image, comprising the following steps:
S1, build a configurable depth model;
the depth model comprises a convolutional neural network, a long short-term memory (LSTM) network, and a similarity-learning network; the convolutional neural network extracts the image feature of the query image and the frame features of the video; the LSTM network embeds the spatio-temporal information of the video into the extracted frame features and outputs a spatio-temporal video feature containing that information; the similarity-learning network maps the image feature and the spatio-temporal video feature to the same dimension and learns a similarity measure between the two;
S2, obtain training samples and feed them into the depth model to train it; learn the parameters of each part of the model using the forward and backward algorithms;
S3, initialize the depth model with the parameters learned in S2; input the query image to be matched and the multiple videos into the depth model, and compute the similarity measure between each video and the query image through the model;
S4, list the videos whose similarity measure with the query image exceeds a threshold, ranked by the magnitude of the similarity measure.
Further, in S2, before the training samples are input, the parameters of the depth model are initialized randomly.
Further, in S2, each group of training samples comprises one query image, one video, and a precomputed similarity measure between the two.
Further, in the convolutional neural network, let x denote the input query image and Y denote a video, Y = {y_t | t = 1, ..., N}, where y_t is the t-th frame of video Y and N is the total number of frames. Let Cnn denote the function computed by the convolutional neural network; the image feature obtained for the query image x is then:
f_x = Cnn(x);
for the video Y, the feature of each frame y_t is obtained with the convolutional neural network, i.e. the video features are expressed as:
{f(y_t) = Cnn(y_t) | t = 1, ..., N}.
Further, for the t-th frame of the video, the feature obtained through the convolutional neural network is f(y_t), which serves as the input to the LSTM network; the corresponding output of the LSTM network is the state h_t, the feature of that frame after passing through the LSTM. This is done for every frame of the video, and finally all states h_t are combined into the spatio-temporal video feature:
f_y = {h_t | t = 1, ..., N}.
Further, in the similarity-learning network, let S(x, y) denote the similarity measure between the query image x and the video Y; then:
S(x, y) = [f_x  f_y  1] [A  C  d; Cᵀ  B  e; dᵀ  eᵀ  f] [f_x; f_y; 1];
where A, B, C, d, e, f are the parameters of the similarity measure S(x, y); A and B are positive definite matrices, and C is a positive semidefinite matrix;
letting A = L_Aᵀ L_A, B = L_Bᵀ L_B, and C = −(L_C^x)ᵀ L_C^y, and dropping the constant f, which does not affect the ranking, this becomes:
S(x, y) = ||L_A f_x||² + ||L_B f_y||² + 2dᵀ f_x − 2(L_C^x f_x)ᵀ(L_C^y f_y) + 2eᵀ f_y;
where L_A, L_B, L_C are the parameters of the similarity-learning network, learned by training the depth model.
Further, in S2, a hinge-like loss function is used, and the depth model is trained with the standard stochastic gradient descent method;
letting W denote the parameters of the network, the loss function is defined as:
W = argmin_W Σ_{i=1..N} (1 − l_i · S(x_i, y_i))_+ + λ||W||²;
where (z)_+ = max(z, 0), and l_i is the indicator function: l_i = 1 if the query image x_i and the video y_i contain the same person, and l_i = −1 otherwise.
The present invention automatically compares an input query image containing a person with the videos in a database, returns the videos containing the same person as the query image, and ranks them by the similarity between the person in each video and the person in the input image. By embedding similarity learning into a deep neural network, the invention fuses deep feature learning and similarity learning into a single network architecture, so that the two can be learned and optimized jointly; this overcomes the defect of conventional methods, in which feature learning and similarity learning are performed in isolation and cannot be optimized end to end. The invention uses a deep neural network to adaptively learn the features of images and videos, thereby achieving pedestrian re-identification based on cross-modal comparison of image and video and solving a major difficulty in image-to-video pedestrian re-identification.
Brief description of the drawings
Fig. 1 is a flow diagram of the pedestrian re-identification method based on cross-modal comparison of image and video provided by the present invention.
Fig. 2 is a structural diagram of the depth model used in the present invention.
Embodiments
The technical scheme of the present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
The invention provides a pedestrian re-identification method based on cross-modal comparison of image and video, for retrieving from multiple videos the videos containing the person in an input query image. It should be noted that the multiple videos described in the present invention may be stored together in a single video database or stored in scattered locations.
As shown in Fig. 1, the pedestrian re-identification method based on cross-modal comparison of image and video provided by the invention comprises the following steps:
S1, build a configurable depth model;
the depth model comprises a convolutional neural network, a long short-term memory (LSTM) network, and a similarity-learning network; the convolutional neural network extracts the image feature of the query image and the frame features of the video; the LSTM network embeds the spatio-temporal information of the video into the extracted frame features and outputs a spatio-temporal video feature containing that information; the similarity-learning network maps the image feature and the spatio-temporal video feature to the same dimension and learns a similarity measure between the two;
S2, obtain training samples and feed them into the depth model to train it; learn the parameters of each part of the model using the forward and backward algorithms;
S3, initialize the depth model with the parameters learned in S2; input the query image to be matched and the multiple videos into the depth model, and compute the similarity measure between each video and the query image through the model;
S4, list the videos whose similarity measure with the query image exceeds a threshold, ranked by the magnitude of the similarity measure.
As shown in Fig. 2, the depth model used in the present invention is structured as follows. The model comprises three parts: a convolutional neural network, an LSTM network, and a similarity-learning network. The model has two inputs, the query image and the video, and accordingly the data flow of the model is divided into two branches. One branch extracts the feature of the query image with the convolutional neural network and feeds its output to the first input of the similarity-learning network; the other branch extracts the feature of the video with the convolutional neural network and the LSTM network and feeds its output to the second input of the similarity-learning network.
The function of each network is described in detail below.
Convolutional neural network: In the invention, the convolutional neural network is responsible for extracting the features of the query image and of the video. The network uses the classical GoogLeNet architecture. For the input query image, its feature is obtained through the convolutional neural network and passed to the similarity-learning network for learning the similarity measure.
Let x denote the input query image and Y denote a video, Y = {y_t | t = 1, ..., N}, where y_t is the t-th frame of video Y and N is the total number of frames. Let Cnn denote the function computed by the convolutional neural network; the image feature obtained for the query image x is then:
f_x = Cnn(x);
for the video Y, the feature of each frame y_t is obtained with the convolutional neural network, i.e. the video features are expressed as:
{f(y_t) = Cnn(y_t) | t = 1, ..., N}.
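The feature-extraction step above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `cnn` function below is a hypothetical stand-in (a fixed random projection) for the real GoogLeNet, and the frame size and feature dimension are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for GoogLeNet: a fixed random projection that maps
# a flattened 64x64x3 frame to a 128-dimensional feature vector.
W_cnn = rng.standard_normal((128, 64 * 64 * 3)) * 0.01

def cnn(frame):
    """f = Cnn(frame): extract a feature vector from one image/frame."""
    return W_cnn @ frame.reshape(-1)

# Query image x and a video Y with N frames y_t.
x = rng.random((64, 64, 3))
Y = [rng.random((64, 64, 3)) for _ in range(10)]  # N = 10 frames

f_x = cnn(x)                           # image feature f_x = Cnn(x)
frame_feats = [cnn(y_t) for y_t in Y]  # {f(y_t) = Cnn(y_t) | t = 1..N}

print(f_x.shape, len(frame_feats))
```

The same `cnn` function is applied to the query image and to every video frame, matching the shared convolutional stage of the two branches.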
After each frame of the video has obtained its feature through the convolutional neural network, the LSTM network embeds the spatio-temporal information of the video.
LSTM network: As a kind of recurrent neural network, the LSTM can process video data of arbitrary length and output video features. In the present invention, the LSTM further encodes the per-frame information obtained from the convolutional neural network. The characteristic of the LSTM is that its current state depends not only on the current input but also on its previous state.
For the t-th frame of the video, the feature obtained through the convolutional neural network is f(y_t), which serves as the input to the LSTM network; the corresponding output of the LSTM is the state h_t, the feature of that frame after passing through the LSTM. This is done for every frame of the video, and finally all states h_t are combined into the spatio-temporal video feature:
f_y = {h_t | t = 1, ..., N}.
Similarity-learning network: After the feature of the query image and the feature of the video have been extracted, the similarity-learning network learns the similarity between the query image and the video.
Specifically, let S(x, y) denote the similarity measure between the query image x and the video Y; then:
S(x, y) = [f_x  f_y  1] [A  C  d; Cᵀ  B  e; dᵀ  eᵀ  f] [f_x; f_y; 1];
where A, B, C, d, e, f are the parameters of the similarity measure S(x, y); A and B are positive definite matrices, and C is a positive semidefinite matrix. Letting A = L_Aᵀ L_A, B = L_Bᵀ L_B, and C = −(L_C^x)ᵀ L_C^y, and dropping the constant f, which does not affect the ranking, this becomes:
S(x, y) = ||L_A f_x||² + ||L_B f_y||² + 2dᵀ f_x − 2(L_C^x f_x)ᵀ(L_C^y f_y) + 2eᵀ f_y;
where L_A, L_B, L_C are the parameters of the similarity-learning network, learned by training the depth model.
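The two forms of the similarity measure can be checked against each other numerically. The sketch below builds A, B, C from L_A, L_B, L_C^x, L_C^y; the factorization sign convention and the dropped constant term f are assumptions inferred from the expanded formula, and the dimensions are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(2)
dx, dy, r = 5, 7, 3  # image-feature dim, video-feature dim, factor rank (assumed)

L_A  = rng.standard_normal((r, dx))
L_B  = rng.standard_normal((r, dy))
L_Cx = rng.standard_normal((r, dx))
L_Cy = rng.standard_normal((r, dy))
d = rng.standard_normal(dx)
e = rng.standard_normal(dy)
f_const = 0.0  # constant term, dropped in the expanded formula

# Factorized parameters -> block-matrix parameters.
A = L_A.T @ L_A      # symmetric positive semidefinite by construction
B = L_B.T @ L_B
C = -L_Cx.T @ L_Cy   # sign chosen to reproduce the -2(...) cross term

f_x = rng.standard_normal(dx)
f_y = rng.standard_normal(dy)

# Block-matrix form: [f_x f_y 1] M [f_x; f_y; 1].
M = np.block([[A,          C,          d[:, None]],
              [C.T,        B,          e[:, None]],
              [d[None, :], e[None, :], np.array([[f_const]])]])
v = np.concatenate([f_x, f_y, [1.0]])
S_block = v @ M @ v

# Expanded form from the patent.
S_expand = (np.sum((L_A @ f_x) ** 2) + np.sum((L_B @ f_y) ** 2)
            + 2 * d @ f_x
            - 2 * (L_Cx @ f_x) @ (L_Cy @ f_y)
            + 2 * e @ f_y)

print(abs(S_block - S_expand) < 1e-9)
```

Expanding v·M·v gives f_xᵀA f_x + 2f_xᵀC f_y + f_yᵀB f_y + 2dᵀf_x + 2eᵀf_y + f, which reduces term by term to the expanded formula once A, B, C are substituted.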
During the model training described in S2, each group of training samples comprises one query image, one video, and a precomputed similarity measure between the two. Before the training samples are input, the parameters of the depth model are initialized randomly.
In the present invention, a hinge-like loss function is used, and the depth model is trained with the standard stochastic gradient descent method. Letting W denote the parameters of the network, the loss function is defined as:
W = argmin_W Σ_{i=1..N} (1 − l_i · S(x_i, y_i))_+ + λ||W||²;
where (z)_+ = max(z, 0), and l_i is the indicator function: l_i = 1 if the query image x_i and the video y_i contain the same person, and l_i = −1 otherwise.
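The hinge-like loss can be sketched as follows. The indicator convention (l_i = 1 for a matching image-video pair, −1 otherwise) is the standard reading of the claim, and the similarity scores here are dummy values, not model outputs:

```python
import numpy as np

def hinge_like_loss(scores, labels, W, lam=1e-4):
    """sum_i (1 - l_i * S(x_i, y_i))_+ + lam * ||W||^2.

    scores: S(x_i, y_i) for each training pair
    labels: l_i = +1 if the pair shows the same person, -1 otherwise
    W:      flattened network parameters (for the regularizer)
    """
    margins = np.maximum(0.0, 1.0 - labels * scores)  # (.)_+ = max(., 0)
    return margins.sum() + lam * np.sum(W ** 2)

scores = np.array([1.5, 0.2, -2.0, 0.5])  # dummy similarity scores
labels = np.array([1,   1,   -1,  -1])    # matching / non-matching pairs
W = np.zeros(10)

loss = hinge_like_loss(scores, labels, W)
print(loss)
```

A pair contributes zero loss only once its score clears the margin in the right direction (S ≥ 1 for matches, S ≤ −1 for non-matches), which is what drives the similarity measure toward separated scores.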
After the parameters of each part of the depth model have been learned in S2, the depth model is initialized again with these parameters, after which pedestrian re-identification can formally be carried out.
In S3 and S4, the trained depth model computes the similarity between the input query image and each of the multiple videos in the video database. The videos in the database are then sorted by similarity in descending order, and the ranking result is returned.
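The retrieval step in S3 and S4 — score every video against the query, keep those above the threshold, and sort in descending order — can be sketched as follows; the scoring dictionary is a dummy stand-in for the trained model's similarity outputs:

```python
def retrieve(query_scores, threshold):
    """Return (video_id, score) pairs above threshold, best match first."""
    kept = [(vid, s) for vid, s in query_scores.items() if s > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Hypothetical similarity scores S(x, Y_i) for one query against five videos.
scores = {"video_1": 0.9, "video_2": -1.2, "video_3": 1.7,
          "video_4": 0.1, "video_5": 1.1}

ranking = retrieve(scores, threshold=0.5)
print(ranking)  # highest-similarity videos first
```

Videos below the threshold are discarded, and the remainder are returned in order of decreasing similarity, matching steps S3 and S4.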
The embodiments described above express only several implementations of the present invention, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be pointed out that a person of ordinary skill in the art can make various modifications and improvements without departing from the concept of the invention, and these fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be determined by the appended claims.
Claims (7)
1. A pedestrian re-identification method based on cross-modal comparison of image and video, for retrieving from multiple videos the videos containing the person in an input query image, characterized by comprising the following steps:
S1, build a configurable depth model;
the depth model comprises a convolutional neural network, a long short-term memory (LSTM) network, and a similarity-learning network; the convolutional neural network extracts the image feature of the query image and the frame features of the video; the LSTM network embeds the spatio-temporal information of the video into the extracted frame features and outputs a spatio-temporal video feature containing that information; the similarity-learning network maps the image feature and the spatio-temporal video feature to the same dimension and learns a similarity measure between the two;
S2, obtain training samples and feed them into the depth model to train it; learn the parameters of each part of the model using the forward and backward algorithms;
S3, initialize the depth model with the parameters learned in S2; input the query image to be matched and the multiple videos into the depth model, and compute the similarity measure between each video and the query image through the model;
S4, list the videos whose similarity measure with the query image exceeds a threshold, ranked by the magnitude of the similarity measure.
2. The method according to claim 1, characterized in that, in S2, before the training samples are input, the parameters of the depth model are initialized randomly.
3. The method according to claim 1, characterized in that, in S2, each group of training samples comprises one query image, one video, and a precomputed similarity measure between the two.
4. The method according to claim 1, characterized in that, in the convolutional neural network, x denotes the input query image and Y denotes a video, Y = {y_t | t = 1, ..., N}, where y_t is the t-th frame of video Y and N is the total number of frames; Cnn denotes the function computed by the convolutional neural network; the image feature obtained for the query image x is then:
f_x = Cnn(x);
for the video Y, the feature of each frame y_t is obtained with the convolutional neural network, i.e. the video features are expressed as:
{f(y_t) = Cnn(y_t) | t = 1, ..., N}.
5. The method according to claim 4, characterized in that, for the t-th frame of the video, the feature obtained through the convolutional neural network is f(y_t), which serves as the input to the LSTM network; the corresponding output of the LSTM network is the state h_t, the feature of that frame after passing through the LSTM; this is done for every frame of the video, and finally all states h_t are combined into the spatio-temporal video feature:
f_y = {h_t | t = 1, ..., N}.
6. The method according to claim 5, characterized in that, in the similarity-learning network, S(x, y) denotes the similarity measure between the query image x and the video Y, given by:
S(x, y) = [f_x  f_y  1] [A  C  d; Cᵀ  B  e; dᵀ  eᵀ  f] [f_x; f_y; 1];
where A, B, C, d, e, f are the parameters of the similarity measure S(x, y); A and B are positive definite matrices, and C is a positive semidefinite matrix;
letting A = L_Aᵀ L_A, B = L_Bᵀ L_B, and C = −(L_C^x)ᵀ L_C^y, and dropping the constant f, which does not affect the ranking, this becomes:
S(x, y) = ||L_A f_x||² + ||L_B f_y||² + 2dᵀ f_x − 2(L_C^x f_x)ᵀ(L_C^y f_y) + 2eᵀ f_y;
where L_A, L_B, L_C are the parameters of the similarity-learning network, learned by training the depth model.
7. The method according to claim 1, characterized in that, in S2, a hinge-like loss function is used and the depth model is trained with the standard stochastic gradient descent method;
letting W denote the parameters of the network, the loss function is defined as:
W = argmin_W Σ_{i=1..N} (1 − l_i · S(x_i, y_i))_+ + λ||W||²;
where (z)_+ = max(z, 0), which encourages the margin:
S(x_i, y_i) < −1, if l_i = −1;
S(x_i, y_i) ≥ 1, otherwise;
and l_i is the indicator function: l_i = 1 if the query image x_i and the video y_i contain the same person, and l_i = −1 otherwise.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710536118.XA CN107480178B (en) | 2017-07-01 | 2017-07-01 | Pedestrian re-identification method based on cross-modal comparison of image and video |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710536118.XA CN107480178B (en) | 2017-07-01 | 2017-07-01 | Pedestrian re-identification method based on cross-modal comparison of image and video |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107480178A true CN107480178A (en) | 2017-12-15 |
CN107480178B CN107480178B (en) | 2020-07-07 |
Family
ID=60595188
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710536118.XA Active CN107480178B (en) | 2017-07-01 | 2017-07-01 | Pedestrian re-identification method based on cross-modal comparison of image and video |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107480178B (en) |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108228915A (en) * | 2018-03-29 | 2018-06-29 | 华南理工大学 | A kind of video retrieval method based on deep learning |
CN108932509A (en) * | 2018-08-16 | 2018-12-04 | 新智数字科技有限公司 | A kind of across scene objects search methods and device based on video tracking |
CN108960141A (en) * | 2018-07-04 | 2018-12-07 | 国家新闻出版广电总局广播科学研究院 | Pedestrian's recognition methods again based on enhanced depth convolutional neural networks |
CN109063589A (en) * | 2018-07-12 | 2018-12-21 | 杭州电子科技大学 | Instrument and equipment on-line monitoring method neural network based and system |
CN109165563A (en) * | 2018-07-27 | 2019-01-08 | 北京市商汤科技开发有限公司 | Pedestrian recognition methods and device, electronic equipment, storage medium, program product again |
CN109635676A (en) * | 2018-11-23 | 2019-04-16 | 清华大学 | A method of positioning source of sound from video |
CN110084979A (en) * | 2019-04-23 | 2019-08-02 | 暗物智能科技(广州)有限公司 | Man-machine interaction method, device and controller and interactive device |
CN110245267A (en) * | 2019-05-17 | 2019-09-17 | 天津大学 | Multi-user's video flowing deep learning is shared to calculate multiplexing method |
CN110334743A (en) * | 2019-06-10 | 2019-10-15 | 浙江大学 | A kind of progressive transfer learning method based on the long memory network in short-term of convolution |
CN111050219A (en) * | 2018-10-12 | 2020-04-21 | 奥多比公司 | Spatio-temporal memory network for locating target objects in video content |
CN111931637A (en) * | 2020-08-07 | 2020-11-13 | 华南理工大学 | Cross-modal pedestrian re-identification method and system based on double-current convolutional neural network |
CN112651262A (en) * | 2019-10-09 | 2021-04-13 | 四川大学 | Cross-modal pedestrian re-identification method based on self-adaptive pedestrian alignment |
CN113283362A (en) * | 2021-06-04 | 2021-08-20 | 中国矿业大学 | Cross-modal pedestrian re-identification method |
CN113761995A (en) * | 2020-08-13 | 2021-12-07 | 四川大学 | Cross-mode pedestrian re-identification method based on double-transformation alignment and blocking |
- 2017-07-01: Application CN201710536118.XA filed in China; granted as CN107480178B (status: Active)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110182469A1 (en) * | 2010-01-28 | 2011-07-28 | Nec Laboratories America, Inc. | 3D convolutional neural networks for automatic human action recognition |
US20110222724A1 (en) * | 2010-03-15 | 2011-09-15 | Nec Laboratories America, Inc. | Systems and methods for determining personal characteristics |
CN104915643A (en) * | 2015-05-26 | 2015-09-16 | Sun Yat-sen University | Deep-learning-based pedestrian re-identification method |
CN106096568A (en) * | 2016-06-21 | 2016-11-09 | Tongji University | Pedestrian re-identification method based on CNN and convolutional LSTM networks |
Non-Patent Citations (4)
Title |
---|
Dong Yi et al.: "Deep Metric Learning for Person Re-identification", 2014 22nd International Conference on Pattern Recognition * |
Liang Zheng et al.: "MARS: A Video Benchmark for Large-Scale Person Re-identification", Computer Vision – ECCV 2016 * |
Yichao Yan et al.: "Person Re-identification via Recurrent Feature Aggregation", Computer Vision – ECCV 2016 * |
Ma Lianyang: "Research on Cross-Camera Pedestrian Re-identification", China Doctoral Dissertations Full-text Database, Information Science and Technology series * |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108228915A (en) * | 2018-03-29 | 2018-06-29 | South China University of Technology | Video retrieval method based on deep learning |
CN108960141B (en) * | 2018-07-04 | 2021-04-23 | Academy of Broadcasting Science, SAPPRFT | Pedestrian re-identification method based on an enhanced deep convolutional neural network |
CN108960141A (en) * | 2018-07-04 | 2018-12-07 | Academy of Broadcasting Science, SAPPRFT | Pedestrian re-identification method based on an enhanced deep convolutional neural network |
CN109063589A (en) * | 2018-07-12 | 2018-12-21 | Hangzhou Dianzi University | Neural-network-based online monitoring method and system for instruments and equipment |
CN109165563A (en) * | 2018-07-27 | 2019-01-08 | Beijing SenseTime Technology Development Co., Ltd. | Pedestrian re-identification method and apparatus, electronic device, storage medium, and program product |
CN109165563B (en) * | 2018-07-27 | 2021-03-23 | Beijing SenseTime Technology Development Co., Ltd. | Pedestrian re-identification method and apparatus, electronic device, storage medium, and program product |
CN108932509A (en) * | 2018-08-16 | 2018-12-04 | Xinzhi Digital Technology Co., Ltd. | Video-tracking-based cross-scene object retrieval method and apparatus |
CN111050219A (en) * | 2018-10-12 | 2020-04-21 | Adobe Inc. | Spatio-temporal memory network for locating target objects in video content |
CN109635676A (en) * | 2018-11-23 | 2019-04-16 | Tsinghua University | Method for localizing sound sources from video |
CN110084979B (en) * | 2019-04-23 | 2022-05-10 | DMAI (Guangzhou) Co., Ltd. | Human-computer interaction method and apparatus, controller, and interaction device |
CN110084979A (en) * | 2019-04-23 | 2019-08-02 | DMAI (Guangzhou) Co., Ltd. | Human-computer interaction method and apparatus, controller, and interaction device |
CN110245267A (en) * | 2019-05-17 | 2019-09-17 | Tianjin University | Multi-user video stream deep learning shared-computation multiplexing method |
CN110245267B (en) * | 2019-05-17 | 2023-08-11 | Tianjin University | Multi-user video stream deep learning shared-computation multiplexing method |
CN110334743A (en) * | 2019-06-10 | 2019-10-15 | Zhejiang University | Progressive transfer learning method based on a convolutional long short-term memory network |
CN110334743B (en) * | 2019-06-10 | 2021-05-04 | Zhejiang University | Progressive transfer learning method based on a convolutional long short-term memory network |
CN112651262A (en) * | 2019-10-09 | 2021-04-13 | Sichuan University | Cross-modal pedestrian re-identification method based on adaptive pedestrian alignment |
CN112651262B (en) * | 2019-10-09 | 2022-10-14 | Sichuan University | Cross-modal pedestrian re-identification method based on adaptive pedestrian alignment |
CN111931637A (en) * | 2020-08-07 | 2020-11-13 | South China University of Technology | Cross-modal pedestrian re-identification method and system based on a dual-stream convolutional neural network |
CN111931637B (en) * | 2020-08-07 | 2023-09-15 | South China University of Technology | Cross-modal pedestrian re-identification method and system based on a dual-stream convolutional neural network |
CN113761995A (en) * | 2020-08-13 | 2021-12-07 | Sichuan University | Cross-modal pedestrian re-identification method based on dual-transform alignment and partitioning |
CN113283362A (en) * | 2021-06-04 | 2021-08-20 | China University of Mining and Technology | Cross-modal pedestrian re-identification method |
CN113283362B (en) * | 2021-06-04 | 2024-03-22 | China University of Mining and Technology | Cross-modal pedestrian re-identification method |
Also Published As
Publication number | Publication date |
---|---|
CN107480178B (en) | 2020-07-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107480178A (en) | Pedestrian re-identification method based on image-video cross-modal comparison | |
CN107358257B (en) | Incrementally learnable image classification training method for big-data scenarios | |
CN108875674B (en) | Driver behavior identification method based on a multi-column fusion convolutional neural network | |
Li et al. | Building-a-nets: Robust building extraction from high-resolution remote sensing images with adversarial networks |
CN106204449B (en) | Single-image super-resolution reconstruction method based on a symmetric deep network | |
CN106503687B (en) | Surveillance video person identification system and method fusing multi-angle facial features | |
CN104361363B (en) | Deep deconvolutional feature learning network, generation method, and image classification method | |
Zhou et al. | Learning deep features for scene recognition using places database |
CN108062574B (en) | Weakly supervised object detection method based on category-specific spatial constraints | |
CN110008842A (en) | Pedestrian re-identification method based on a deep multi-loss fusion model | |
CN109543602A (en) | Pedestrian re-identification method based on multi-view image feature decomposition | |
CN106897714A (en) | Video action detection method based on convolutional neural networks | |
CN109815826A (en) | Method and apparatus for generating facial attribute models | |
CN107220604A (en) | Video-based fall detection method | |
CN105574510A (en) | Gait identification method and device | |
CN110503076B (en) | Artificial-intelligence-based video classification method, apparatus, device, and medium | |
CN109410190B (en) | Training method for a pole-breakage detection model based on high-resolution remote sensing satellite imagery | |
CN104063721B (en) | Human behavior recognition method based on automatic semantic feature learning and selection | |
CN105718960A (en) | Image ranking model based on convolutional neural networks and spatial pyramid matching | |
CN106682628B (en) | Face attribute classification method based on multilayer deep feature information | |
CN109784288B (en) | Pedestrian re-identification method based on discrimination-aware fusion | |
CN104462494A (en) | Remote sensing image retrieval method and system based on unsupervised feature learning | |
CN105005798B (en) | Target recognition method based on local structure statistics similarity matching | |
CN107633226A (en) | Human action tracking and recognition method and system | |
CN104281572B (en) | Mutual-information-based target matching method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 2020-06-12; Address after: 16th Floor, No. 37 Jinlong Road, Nansha District, Guangzhou, Guangdong 511400 (office only); Applicant after: DMAI (GUANGZHOU) Co., Ltd.; Address before: 210-5, Chongkai Building No. 1, No. 63 Genkai Road, Shilou Town, Panyu District, Guangzhou, Guangdong 510000; Applicant before: GUANGZHOU SHENYU INFORMATION TECHNOLOGY Co., Ltd.
GR01 | Patent grant | ||