CN108108699A - Human motion recognition method fusing a deep neural network model and binary hashing - Google Patents

Human motion recognition method fusing a deep neural network model and binary hashing

Info

Publication number
CN108108699A
Authority
CN
China
Prior art keywords
video
frame
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711422702.9A
Other languages
Chinese (zh)
Inventor
李伟生 (Li Weisheng)
冯晨 (Feng Chen)
肖斌 (Xiao Bin)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Posts and Telecommunications
Original Assignee
Chongqing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Posts and Telecommunications
Priority to CN201711422702.9A
Publication of CN108108699A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a human motion recognition method that combines a deep neural network model with binary hashing, and belongs to the technical field of pattern recognition. The method comprises: first, preprocessing the action recognition database by cutting videos into frame sequences, computing optical flow maps, computing the coordinates of human joint points with a pose estimation algorithm, and cropping video region frames using the resulting coordinates; second, extracting FC (fully connected) layer features from the RGB and optical-flow streams of each video with pre-trained VGG-16 network models, selecting key frames from the video frame sequence, and taking differences of the FC features corresponding to these key frames; binarizing the differences; then obtaining a uniform feature representation of each video with a binary hashing method; fusing these with P-CNN features and obtaining the feature representation of the video using several normalization methods such as L1 and L2; and finally training a classifier with a support vector machine algorithm to recognize human action videos. The present invention achieves higher action recognition accuracy.

Description

Human motion recognition method fusing a deep neural network model and binary hashing
Technical field
The invention belongs to the technical field of image/video processing, and more particularly relates to a human motion recognition method that combines a deep neural network model with binary hashing.
Background technology
In recent years, research on human action recognition in fields such as pattern recognition and image processing and analysis has made great progress, and some human motion recognition systems have already been put into practical use. A human action recognition algorithm mainly comprises two steps, action representation and action classification, and how to encode human action information is a crucial step for the subsequent classification. An ideal action representation algorithm must not only be robust to variations in human appearance, scale, complex backgrounds, and action speed, but must also supply enough information for the classifier to discriminate between action types. However, complex backgrounds and the inherent variability of the human body pose great challenges to human action recognition.
Deep learning methods treat a short video as a sequence of input frames. Obviously, a single frame is not enough to effectively capture the dynamics of an action, while a large number of frames requires a large number of parameters, which leads to model over-fitting, requires larger training sets, and increases computational complexity. This problem also exists in other popular CNN architectures, such as the 3D convolutional networks proposed by Tran D. et al. Therefore, state-of-the-art deep action recognition models are usually trained to generate useful features from short video clips, which are then aggregated into a whole-sequence-level descriptor that is used to train a linear classifier with specific action labels. The P-CNN model proposed by Cheron et al. obtains the feature representation of a video by extracting the output features of the FC layers of the RGB and optical-flow streams of the video and combining them with min or max pooling. However, min or max pooling only captures first-order associations between features; aggregation operators can more properly capture the higher-order correlations between CNN features.
Although CNN features at the frame level may be extremely complex, exploiting the associations between changes across video frames can capture features unique to a video, which potentially helps to improve video recognition performance.
Content of the invention
The present invention seeks to address the above problems of the prior art by proposing a human motion recognition method fusing a deep neural network model and binary hashing, with better recognition performance. The technical scheme of the invention is as follows:
A human motion recognition method fusing a deep neural network model and binary hashing, comprising the following steps:
101. Obtain a short video containing a human action, and cut the short video into a video frame sequence;
102. Compute the optical flow maps of adjacent video frames in the frame sequence of step 101 using an optical flow algorithm;
103. Obtain the coordinates of the human joint points from the frame sequence of step 101 using a pose estimation algorithm;
104. Crop the RGB and optical-flow region maps of different human body parts using the joint coordinates obtained in step 103, obtaining the RGB frame sequence and the optical-flow frame sequence of the video;
105. Extract the fully connected (FC) layer features of each frame in the RGB frame sequence and the optical-flow frame sequence obtained in step 104, using the VGG-16 model of the Oxford Visual Geometry Group and the optical-flow network (FlowNet) model; the dimension of this layer's features is 4096;
106. Pool and aggregate the FC features obtained in step 105 to obtain an n × 4096-dimensional video feature representation;
107. Apply l2 normalization to the video features obtained in step 106 and feed them into a linear SVM classifier for classification.
Further, step 102 computes the optical flow maps of the adjacent video frames of step 101 using an optical flow algorithm, and specifically includes the following steps (a code sketch follows the list):
201. Extract the optical flow vectors between each pair of adjacent video frames;
202. Sum the absolute values of the horizontal and vertical components of the resulting optical flow vectors over all pixel points, respectively, obtaining the sums of the optical flow absolute values of the frame in the horizontal and vertical directions;
203. Arrange the optical flow absolute-value sums of all frames in temporal order to generate the horizontal and vertical optical flow sequences of the entire video.
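For illustration only (not part of the original patent text), a minimal Python sketch of steps 201-203 follows; it assumes OpenCV's Farneback method as a stand-in for "an optical flow algorithm", and all function and variable names are illustrative:

import cv2
import numpy as np

def optical_flow_sequences(frames):
    """frames: list of BGR images; returns per-frame horizontal/vertical flow sums."""
    gray = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in frames]
    h_sums, v_sums = [], []
    for prev, nxt in zip(gray[:-1], gray[1:]):
        # step 201: dense optical flow between two adjacent frames;
        # flow[..., 0] is the horizontal component, flow[..., 1] the vertical
        flow = cv2.calcOpticalFlowFarneback(prev, nxt, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        # step 202: sum the absolute values over all pixel points, per direction
        h_sums.append(float(np.abs(flow[..., 0]).sum()))
        v_sums.append(float(np.abs(flow[..., 1]).sum()))
    # step 203: the lists are already in temporal order
    return h_sums, v_sums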
Further, the step of selecting key frames from the RGB frame sequence and the optical-flow frame sequence of the video in step 104 includes:
Choose a sliding window size h, and dynamically sample frames at a stride S determined by the number of video frames |F|, extracting their features. f_t denotes a frame in the original video frame sequence, where the original video has T frames in total; f_{t_i} denotes a frame in the selected key frame sequence. Key frames are extracted with the method shown in formula (2): one frame is chosen every S frames, h frames in total (a sampling sketch follows).
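A minimal sketch of this key-frame sampling, assuming the stride S = |F|/h used in the embodiment below; the function name is illustrative:

def select_key_frames(frames, h):
    """frames: per-frame items (images or FC features); returns up to h key frames."""
    S = max(len(frames) // h, 1)  # stride: one key frame every S frames
    return [frames[i * S] for i in range(h) if i * S < len(frames)]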
Further, in order to distinguish the RGB sequence from the optical-flow sequence, step 105 uses convolutional network models of two different architectures; each network contains five convolutional layers and three fully connected layers. The output of the second fully connected layer is used as the FC feature, i.e. the video frame feature. The input images are uniformly resized to 224 × 224, so that consistent FC layer features are obtained. After aggregating all frame features of a video with min and max pooling operations, we obtain the feature representation of the video (see the sketch below).
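As an illustrative sketch of this step (not the patent's exact models), the code below uses torchvision's ImageNet-pretrained VGG-16 as an assumed stand-in for the pre-trained networks (the optical-flow network is omitted), takes the activation after the second fully connected layer as the 4096-dimensional frame feature, and aggregates a video's frames with min and max pooling:

import torch
import torchvision.models as models
import torchvision.transforms as T

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()
# classifier: [fc6, ReLU, Dropout, fc7, ReLU, Dropout, fc8];
# truncate after fc7's ReLU to obtain the second FC layer's 4096-d activation
fc7 = torch.nn.Sequential(vgg.features, vgg.avgpool, torch.nn.Flatten(),
                          *list(vgg.classifier.children())[:5])

preprocess = T.Compose([T.ToPILImage(), T.Resize((224, 224)), T.ToTensor(),
                        T.Normalize(mean=[0.485, 0.456, 0.406],
                                    std=[0.229, 0.224, 0.225])])

@torch.no_grad()
def video_representation(frames):
    """frames: list of HxWx3 uint8 RGB arrays -> concatenated min/max pooled feature."""
    batch = torch.stack([preprocess(f) for f in frames])  # resize to 224 x 224
    feats = fc7(batch)                                    # (num_frames, 4096)
    pooled = torch.cat([feats.min(dim=0).values,          # min pooling over frames
                        feats.max(dim=0).values])         # max pooling over frames
    return pooled                                         # (8192,)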
Further, adjacent differences are computed on the FC features, each of 4096 dimensions, of the selected key frames, and 0 and 1 are used to represent the trend of change of each feature. A matrix of size 4096 × h is thus obtained in which each element is 0 or 1. The binary sequence of each row is extracted as input and its output is computed with formula (3), thus obtaining the 4096-dimensional binary hash feature of the video.
Further, step 106 computes the video feature values, specifically: compare the changes of the feature values of two adjacent key frames f_{t_i} and f_{t_{i+1}} on each dimension of the corresponding frame feature vectors f_t^p; an increase is represented by 1 and a decrease by 0. A 4096 × h eigenvalue matrix M is thus obtained whose elements are only 0 or 1. For each row vector [x_{h-1}, x_{h-2}, ..., x_0] of the matrix, its binary hash mapping is computed with the following formula (3), which converts the numeric string composed of 0s and 1s into an unsigned integer;
The binary hash features of the FC feature changes of the RGB streams and optical-flow frames of the different human body parts are finally obtained (a sketch follows).
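A minimal numpy sketch of this binarization and hashing; formula (3), B2U_w(x) = Σ_{i=0}^{w-1} x_i·2^i, is reconstructed from claim 6 below, and the helper name is illustrative:

import numpy as np

def binary_hash(key_feats):
    """key_feats: (h, 4096) array of key-frame FC features -> (4096,) unsigned hashes."""
    diffs = np.diff(key_feats, axis=0)      # adjacent differences between key frames
    bits = (diffs > 0).astype(np.uint64)    # 1 = feature value increased, 0 = decreased
    w = bits.shape[0]                       # number of bits per feature dimension
    powers = (2 ** np.arange(w)).astype(np.uint64)
    # formula (3): B2U_w(x) = sum_i x_i * 2^i, one unsigned integer per dimension
    return bits.T @ powers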
Further, in addition to l2 normalization, step 107 also uses a fused l1 + β·l2 feature normalization, where l2 denotes second-order normalization of the features, l1 denotes first-order normalization of the features, and β denotes the fusion normalization coefficient. After the features extracted by the deep neural network are finally fused with those obtained by binary hashing, the feature representation p of the video is obtained; since the feature value scales of the different sources differ, all feature values are normalized to one scale before the classifier is applied.
Further, the fused l1 + β·l2 normalization is used, i.e.
p = p / (||p||_1 + β·||p||_2)    (4)
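A one-line sketch of formula (4); the default value of β here is illustrative, not taken from the patent:

import numpy as np

def fused_normalize(p, beta=0.5):
    """p: fused deep + binary-hash feature vector, scaled by ||p||_1 + beta*||p||_2."""
    return p / (np.abs(p).sum() + beta * np.linalg.norm(p))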
The advantages and beneficial effects of the present invention are as follows:
The innovation of the present invention lies in fusing the deep network model with the binary hashing method. In view of the effectiveness and accuracy of deep convolutional neural networks on image object characterization problems in recent years, the VGG-16 network model pre-trained on the ImageNet dataset, which covers more than 20,000 kinds of objects, is chosen to extract features from the RGB frame sequences, and a deep model pre-trained on the UCF101 dataset, which contains 101 kinds of actions, is used to extract features from the optical-flow frame sequences. The simplicity and efficiency of the binary hashing method are exploited to apply further higher-order processing to the extracted static video frame and optical-flow frame features. The combined features are then trained for recognition with different normalization methods. The method thus achieves better recognition performance than traditional human action recognition methods.
Description of the drawings
Fig. 1 is the output result of the pose estimation method in a preferred embodiment of the present invention;
Fig. 2 is the flow chart of the method in a preferred embodiment of the present invention;
Fig. 3 is the flow of the binary hash algorithm;
Fig. 4 is a comparison of different normalization methods;
Fig. 5 is a comparison of hash windows of different sizes;
Fig. 6 is a comparison of fusion coefficients of different sizes.
Specific embodiment
The technical solutions in the embodiments of the present invention are described clearly and in detail below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention.
The technical solution of the present invention for solving the above technical problems is:
With reference to Figs. 1-2, a human action recognition method based on a deep network model and the binary hashing method comprises the following steps:
1. Extract the deep features of the video
The samples in the experimental video library are divided into a training set and a test set, and FC layer features are extracted from all samples. The detailed steps of the extraction method are as follows:
1) Cut the input video into frames
In order to extract local feature information of the video, the input video containing a human action is cut into a frame sequence.
2) Compute the optical-flow frames from the RGB frame sequence using an optical flow algorithm.
3) Locate the coordinates of the human joint points using a pose estimation algorithm.
4) Extract the regions where the human joint points are located in the RGB frame sequence and the optical-flow frame sequence according to the above joint coordinates, including the head, shoulders, waist, and elbows.
5) In order to distinguish the RGB sequence from the optical-flow sequence, we use convolutional network models of two different architectures; each network contains five convolutional layers and three fully connected layers. We use the output of the second fully connected layer as the FC feature, i.e. the video frame feature. We uniformly resize the input images to 224 × 224, so that we obtain consistent FC layer features. After aggregating all frame features of a video with min and max pooling operations, we obtain the feature representation of the video.
2. Compute the binary hash feature of the video
Observation shows that the motion characteristics of a video are sometimes distinguished by the transient motions of a few key parts. In order to further capture the motion characteristics of the video, we compute the binary hash feature of the video with the following steps:
1) Similar to the extraction of the deep video features: first cut the video into frames, compute the optical-flow frames, extract the human joint coordinates, and compute the FC features corresponding to the frame sequences of the different joint positions.
2) Different videos have different frame counts |F|. We define the sliding window size as h, and the stride S is |F|/h. A key frame is chosen at every stride, as shown in Fig. 3.
3) For the FC features of the selected key frames, each of 4096 dimensions, we compute adjacent differences and use 0 and 1 to represent the trend of change of each feature. We thus obtain a matrix of size 4096 × h in which each element is 0 or 1. We extract the binary sequence of each row as input and compute its output with formula (3). We have thus obtained the 4096-dimensional binary hash feature of the video.
3. Fuse the deep features and the hash features
For the deep features and binary hash features obtained in steps 1 and 2 above, we first perform feature fusion and then classify with an SVM classifier. The main detailed steps are (a training sketch follows the list):
1) Save the fusion feature obtained by concatenating the deep features and the hash features.
2) Compute the l1 norm and the l2 norm of the feature matrix using the fused features of all action videos.
3) Divide all elements of the feature matrix by the l1 norm and by the l2 norm respectively, obtaining two different normalized features.
4) Define the fusion factor β, and use l1 + β·l2 as the fused normalization norm to obtain another normalized feature.
5) Feed the above normalized features and the corresponding action class labels into the SVM classifier, selecting a linear kernel for training.
6) Train a classifier for each action class: mark the current class as positive samples and all other classes as negative samples, and train multiple classifiers in this one-vs-rest fashion.
7) For the videos of the test set, compute a score with each classifier and select the class with the highest score as the corresponding action class.
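An illustrative scikit-learn sketch of steps 5)-7), using LinearSVC as an assumed stand-in for the linear-kernel SVM; function and variable names are not from the patent:

import numpy as np
from sklearn.svm import LinearSVC

def train_and_predict(train_feats, train_labels, test_feats):
    """Features are the normalized fused vectors; labels are action class ids."""
    classes = np.unique(train_labels)
    classifiers = []
    for cls in classes:
        clf = LinearSVC()  # linear kernel SVM
        clf.fit(train_feats, (train_labels == cls).astype(int))  # one vs rest
        classifiers.append(clf)
    # step 7): score each test video with every classifier, take the arg-max class
    scores = np.stack([clf.decision_function(test_feats) for clf in classifiers],
                      axis=1)
    return classes[scores.argmax(axis=1)]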
An embodiment of the present invention is as follows:
The JHMDB and MPII-Cooking human action databases are used as the experimental databases.
The JHMDB action dataset contains 21 classes of human actions, including combing hair, sitting, standing, running, and waving. Each video contains only one very short clip of 15-40 frames. There are 928 videos and 31,838 annotated frames in total.
The MPII-Cooking action dataset contains a series of high-resolution videos of humans cooking in a kitchen, including actions such as washing dishes, cutting fruit, and washing hands. Each video contains one kind of cooking activity. In total it covers 64 classes of cooking actions, involving 3748 video clips with the same background.
(1) The JHMDB dataset has three different training/test splits at a ratio of 80/20, guaranteed to cover all action classes. The classification accuracy is computed on each test split, and the average over the three splits is used as the evaluation criterion. The specific test results are shown in Figs. 4 and 5. The results obtained with the normalization methods are significantly better than the results of classifying with the original features. Across hash windows of different sizes, l1 normalization is better than l2 normalization in most cases.
We likewise compare the influence of different fusion coefficients β on the l1 + β·l2 normalization under different hash windows. The experimental results are shown in Fig. 6.
(2) We test the classification performance with the same method on the JHMDB and MPII-Cooking datasets. As shown in Table 1, the classification results show that the method fusing deep network features and binary hash features is better than the previous method based on the P-CNN model.
Table 1: Influence of different normalization methods combined with the hash features on the classification results on the JHMDB and MPII-Cooking datasets
The above embodiments should be understood as merely illustrating the present invention rather than limiting its scope. After reading the contents of the present invention, those skilled in the art can make various changes or modifications to it, and these equivalent changes and modifications likewise fall within the scope defined by the claims of the present invention.

Claims (8)

1. A human motion recognition method fusing a deep neural network model and binary hashing, characterized by comprising the following steps:
101. Obtain a short video containing a human action, and cut the short video into a video frame sequence;
102. Compute the optical flow maps of adjacent frames in the frame sequence of step 101 using an optical flow algorithm;
103. Obtain the coordinates of the human joint points from the frame sequence of step 101 using a pose estimation algorithm;
104. Crop the RGB and optical-flow region maps of different human body parts using the joint coordinates obtained in step 103, obtaining the RGB frame sequence and the optical-flow frame sequence of the video;
105. Extract the fully connected layer features of each frame in the RGB frame sequence and the optical-flow frame sequence obtained in step 104, using the VGG-16 model of the Oxford Visual Geometry Group and the optical-flow network model; the dimension of this layer's features is 4096;
106. Pool and aggregate the FC features obtained in step 105 to obtain an n × 4096-dimensional video feature representation;
107. Apply l2 normalization to the video features obtained in step 106 and feed them into a linear SVM classifier for classification.
2. The human motion recognition method fusing a deep neural network model and binary hashing according to claim 1, characterized in that step 102 computes the optical flow maps of the adjacent video frames of step 101 using an optical flow algorithm, and specifically includes the steps:
201. Extract the optical flow vectors between each pair of adjacent video frames;
202. Sum the absolute values of the horizontal and vertical components of the resulting optical flow vectors over all pixel points, respectively, obtaining the sums of the optical flow absolute values of the frame in the horizontal and vertical directions;
203. Arrange the optical flow absolute-value sums of all frames in temporal order to generate the horizontal and vertical optical flow sequences of the entire video.
3. The human motion recognition method fusing a deep neural network model and binary hashing according to claim 1, characterized in that the step of selecting key frames from the RGB frame sequence and the optical-flow frame sequence of the video in step 104 includes:
Choose a sliding window size h, and dynamically sample frames at a stride S determined by the number of video frames |F|, extracting their features; f_t denotes a frame in the original video frame sequence, where the original video has T frames in total; f_{t_i} denotes a frame in the selected key frame sequence; key frames are extracted with the method shown in formula (2): one frame is chosen every S frames, h frames in total;
[f_{t_1}, f_{t_2}, ..., f_{t_h}] ⊆ F, where F = [f_1, f_2, ..., f_T]    (1)
4. The human motion recognition method fusing a deep neural network model and binary hashing according to claim 3, characterized in that, in order to distinguish the RGB sequence from the optical-flow sequence, step 105 uses convolutional network models of two different architectures; each network contains five convolutional layers and three fully connected layers; the output of the second fully connected layer is used as the FC feature, i.e. the video frame feature; the input images are uniformly resized to 224 × 224, so that consistent FC layer features are obtained; after all frame features of a video are aggregated with min and max pooling operations, the feature representation of the video is obtained.
5. The human motion recognition method fusing a deep neural network model and binary hashing according to claim 4, characterized in that adjacent differences are computed on the FC features, each of 4096 dimensions, of the selected key frames, and 0 and 1 are used to represent the trend of change of each feature; a matrix of size 4096 × h is thus obtained in which each element is 0 or 1; the binary sequence of each row is extracted as input and its output is computed with formula (3), thus obtaining the 4096-dimensional binary hash feature of the video.
6. The human motion recognition method fusing a deep neural network model and binary hashing according to claim 4, characterized in that step 106 computes the video feature values, specifically: compare the changes of the feature values of two adjacent key frames f_{t_i} and f_{t_{i+1}} on each dimension of the corresponding frame feature vectors f_t^p; an increase is represented by 1 and a decrease by 0; a 4096 × h eigenvalue matrix M is thus obtained whose elements are only 0 or 1; for each row vector [x_{h-1}, x_{h-2}, ..., x_0] of the matrix, its binary hash mapping is computed with the following formula (3), which converts the numeric string composed of 0s and 1s into an unsigned integer;
B2U_w(x) = Σ_{i=0}^{w-1} x_i × 2^i    (3)
The binary hash features of the FC feature changes of the RGB streams and optical-flow frames of the different human body parts are finally obtained.
7. The human motion recognition method fusing a deep neural network model and binary hashing according to claim 6, characterized in that, in addition to l2 normalization, step 107 also uses a fused l1 + β·l2 feature normalization, where l2 denotes second-order normalization of the features, l1 denotes first-order normalization of the features, and β denotes the fusion normalization coefficient; after the features extracted by the deep neural network are finally fused with those obtained by binary hashing, the feature representation p of the video is obtained; since the feature value scales of the different sources differ, all feature values are normalized to one scale before the classifier is applied.
8. The human motion recognition method fusing a deep neural network model and binary hashing according to claim 7, characterized in that the fused l1 + β·l2 normalization is used, i.e.
p = p / (||p||_1 + β·||p||_2)    (4).
CN201711422702.9A 2017-12-25 2017-12-25 Human motion recognition method fusing a deep neural network model and binary hashing Pending CN108108699A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711422702.9A CN108108699A (en) 2017-12-25 2017-12-25 Human motion recognition method fusing a deep neural network model and binary hashing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711422702.9A CN108108699A (en) 2017-12-25 2017-12-25 Human motion recognition method fusing a deep neural network model and binary hashing

Publications (1)

Publication Number Publication Date
CN108108699A true CN108108699A (en) 2018-06-01

Family

ID=62212862

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711422702.9A Pending CN108108699A (en) 2017-12-25 2017-12-25 Human motion recognition method fusing a deep neural network model and binary hashing

Country Status (1)

Country Link
CN (1) CN108108699A (en)



Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160148391A1 (en) * 2013-06-12 2016-05-26 Agency For Science, Technology And Research Method and system for human motion recognition
CN103384331A (en) * 2013-07-19 2013-11-06 上海交通大学 Video inter-frame forgery detection method based on light stream consistency
CN104469229A (en) * 2014-11-18 2015-03-25 北京恒华伟业科技股份有限公司 Video data storing method and device
CN105989611A (en) * 2015-02-05 2016-10-05 南京理工大学 Blocking perception Hash tracking method with shadow removing
CN105468755A (en) * 2015-11-27 2016-04-06 东方网力科技股份有限公司 Video screening and storing method and device
CN106937114A (en) * 2015-12-30 2017-07-07 株式会社日立制作所 Method and apparatus for being detected to video scene switching
CN105741853A (en) * 2016-01-25 2016-07-06 西南交通大学 Digital speech perception hash method based on formant frequency
CN106845351A (en) * 2016-05-13 2017-06-13 苏州大学 It is a kind of for Activity recognition method of the video based on two-way length mnemon in short-term
CN106203283A (en) * 2016-06-30 2016-12-07 重庆理工大学 Based on Three dimensional convolution deep neural network and the action identification method of deep video
CN106331524A (en) * 2016-08-18 2017-01-11 无锡天脉聚源传媒科技有限公司 Method and device for recognizing shot cut
CN107169415A (en) * 2017-04-13 2017-09-15 西安电子科技大学 Human motion recognition method based on convolutional neural networks feature coding
CN107403153A (en) * 2017-07-20 2017-11-28 大连大学 A kind of palmprint image recognition methods encoded based on convolutional neural networks and Hash

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
GUILHEM CHERON et al.: "P-CNN: Pose-based CNN Features for Action Recognition", 2015 IEEE International Conference on Computer Vision *
RANDAL E. BRYANT et al.: "Computer Systems: A Programmer's Perspective" *
XIUSHAN NIE et al.: "Key-Frame Based Robust Video Hashing Using Isometric Feature Mapping", Journal of Computational Information Systems *
PENG Tianqiang et al.: "Image retrieval method based on deep convolutional neural network and binary hash learning", Journal of Electronics & Information Technology *
WANG Huan: "Encoding and retrieval of motion capture data based on hash learning", Wanfang Database *

Cited By (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109086659A (en) * 2018-06-13 2018-12-25 深圳市感动智能科技有限公司 A kind of Human bodys' response method and apparatus based on multimode road Fusion Features
CN109086659B (en) * 2018-06-13 2023-01-31 深圳市感动智能科技有限公司 Human behavior recognition method and device based on multi-channel feature fusion
CN109255284A (en) * 2018-07-10 2019-01-22 西安理工大学 A kind of Activity recognition method of the 3D convolutional neural networks based on motion profile
CN108985223A (en) * 2018-07-12 2018-12-11 天津艾思科尔科技有限公司 A kind of human motion recognition method
CN108985223B (en) * 2018-07-12 2024-05-07 天津艾思科尔科技有限公司 Human body action recognition method
CN108960207A (en) * 2018-08-08 2018-12-07 广东工业大学 A kind of method of image recognition, system and associated component
CN108960207B (en) * 2018-08-08 2021-05-11 广东工业大学 Image recognition method, system and related components
CN111104837A (en) * 2018-10-29 2020-05-05 联发科技股份有限公司 Mobile device and related video editing method
CN109858406A (en) * 2019-01-17 2019-06-07 西北大学 A kind of extraction method of key frame based on artis information
CN109858406B (en) * 2019-01-17 2023-04-07 西北大学 Key frame extraction method based on joint point information
CN109918537A (en) * 2019-01-18 2019-06-21 杭州电子科技大学 A kind of method for quickly retrieving of the ship monitor video content based on HBase
CN109918537B (en) * 2019-01-18 2021-05-11 杭州电子科技大学 HBase-based rapid retrieval method for ship monitoring video content
CN109815921A (en) * 2019-01-29 2019-05-28 北京融链科技有限公司 The prediction technique and device of the class of activity in hydrogenation stations
CN110096950B (en) * 2019-03-20 2023-04-07 西北大学 Multi-feature fusion behavior identification method based on key frame
CN110096950A (en) * 2019-03-20 2019-08-06 西北大学 A kind of multiple features fusion Activity recognition method based on key frame
CN110163127A (en) * 2019-05-07 2019-08-23 国网江西省电力有限公司检修分公司 A kind of video object Activity recognition method from thick to thin
CN110135386A (en) * 2019-05-24 2019-08-16 长沙学院 A kind of human motion recognition method and system based on deep learning
CN112784658A (en) * 2019-11-01 2021-05-11 纬创资通股份有限公司 Method and system for recognizing actions based on atomic gestures and computer readable recording medium
CN111324744A (en) * 2020-02-17 2020-06-23 中山大学 Data enhancement method based on target emotion analysis data set
CN111324744B (en) * 2020-02-17 2023-04-07 中山大学 Data enhancement method based on target emotion analysis data set
CN111666845A (en) * 2020-05-26 2020-09-15 南京邮电大学 Small sample deep learning multi-mode sign language recognition method based on key frame sampling
CN111695507A (en) * 2020-06-12 2020-09-22 桂林电子科技大学 Static gesture recognition method based on improved VGGNet network and PCA
CN111695507B (en) * 2020-06-12 2022-08-16 桂林电子科技大学 Static gesture recognition method based on improved VGGNet network and PCA
WO2022012239A1 (en) * 2020-07-16 2022-01-20 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Action recognition method and related device, storage medium
CN112818859A (en) * 2021-02-02 2021-05-18 电子科技大学 Deep hash-based multi-level retrieval pedestrian re-identification method
CN113326724A (en) * 2021-02-07 2021-08-31 海南长光卫星信息技术有限公司 Method, device and equipment for detecting change of remote sensing image and readable storage medium
CN113326724B (en) * 2021-02-07 2024-02-02 海南长光卫星信息技术有限公司 Remote sensing image change detection method, device, equipment and readable storage medium
CN112560817B (en) * 2021-02-22 2021-07-06 西南交通大学 Human body action recognition method and device, electronic equipment and storage medium
CN112560817A (en) * 2021-02-22 2021-03-26 西南交通大学 Human body action recognition method and device, electronic equipment and storage medium
CN113313030B (en) * 2021-05-31 2023-02-14 华南理工大学 Human behavior identification method based on motion trend characteristics
CN113313030A (en) * 2021-05-31 2021-08-27 华南理工大学 Human behavior identification method based on motion trend characteristics
CN113420612A (en) * 2021-06-02 2021-09-21 深圳中集智能科技有限公司 Production beat calculation method based on machine vision
CN113420612B (en) * 2021-06-02 2022-03-18 深圳中集智能科技有限公司 Production beat calculation method based on machine vision
CN113420719A (en) * 2021-07-20 2021-09-21 北京百度网讯科技有限公司 Method and device for generating motion capture data, electronic equipment and storage medium
CN113326835B (en) * 2021-08-04 2021-10-29 中国科学院深圳先进技术研究院 Action detection method and device, terminal equipment and storage medium
WO2023010758A1 (en) * 2021-08-04 2023-02-09 中国科学院深圳先进技术研究院 Action detection method and apparatus, and terminal device and storage medium
CN113326835A (en) * 2021-08-04 2021-08-31 中国科学院深圳先进技术研究院 Action detection method and device, terminal equipment and storage medium

Similar Documents

Publication Publication Date Title
CN108108699A (en) Human motion recognition method fusing a deep neural network model and binary hashing
Tu et al. Edge-guided non-local fully convolutional network for salient object detection
CN104143079B (en) The method and system of face character identification
CN108648191B (en) Pest image recognition method based on Bayesian width residual error neural network
CN109523463A (en) A kind of face aging method generating confrontation network based on condition
CN106919951A (en) A kind of Weakly supervised bilinearity deep learning method merged with vision based on click
CN107066973A (en) A kind of video content description method of utilization spatio-temporal attention model
CN104298974B (en) A kind of Human bodys' response method based on deep video sequence
CN109815826A (en) The generation method and device of face character model
CN105160310A (en) 3D (three-dimensional) convolutional neural network based human body behavior recognition method
CN105469376B (en) The method and apparatus for determining picture similarity
CN108108674A (en) A kind of recognition methods again of the pedestrian based on joint point analysis
CN108520213B (en) Face beauty prediction method based on multi-scale depth
CN104077742B (en) Human face sketch synthetic method and system based on Gabor characteristic
Rao et al. Sign Language Recognition System Simulated for Video Captured with Smart Phone Front Camera.
CN103336835B (en) Image retrieval method based on weight color-sift characteristic dictionary
CN111062329B (en) Unsupervised pedestrian re-identification method based on augmented network
CN110378208A (en) A kind of Activity recognition method based on depth residual error network
Gan et al. Facial beauty prediction based on lighted deep convolution neural network with feature extraction strengthened
CN107463954A (en) A kind of template matches recognition methods for obscuring different spectrogram picture
CN106529586A (en) Image classification method based on supplemented text characteristic
CN104063721A (en) Human behavior recognition method based on automatic semantic feature study and screening
CN107563319A (en) Face similarity measurement computational methods between a kind of parent-offspring based on image
CN109034012A (en) First person gesture identification method based on dynamic image and video sequence
CN106203510A (en) A kind of based on morphological feature with the hyperspectral image classification method of dictionary learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180601