CN108108699A - Human action recognition method fusing a deep neural network model and binary hashing - Google Patents
Human action recognition method fusing a deep neural network model and binary hashing
- Publication number
- CN108108699A CN108108699A CN201711422702.9A CN201711422702A CN108108699A CN 108108699 A CN108108699 A CN 108108699A CN 201711422702 A CN201711422702 A CN 201711422702A CN 108108699 A CN108108699 A CN 108108699A
- Authority
- CN
- China
- Prior art keywords
- video
- frame
- sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Psychiatry (AREA)
- Social Psychology (AREA)
- Human Computer Interaction (AREA)
- Image Analysis (AREA)
Abstract
The present invention provides a human action recognition method that combines a deep neural network model with binary hashing, and belongs to the field of pattern recognition. The method comprises: first preprocessing the action recognition database by cutting videos into frame sequences, computing optical flow maps, computing the coordinates of human joint points with a pose estimation algorithm, and extracting video region frames with the resulting coordinates; then extracting fully connected (FC) features from the RGB stream and the optical flow stream of each video with a pre-trained VGG-16 network model, selecting key frames from the video frame sequence, and taking differences of the FC features of these key frames; binarizing the differences; obtaining a uniform feature representation of each video with a binary hashing method; fusing these with P-CNN features and applying several normalization schemes such as L1 and L2 to obtain the feature representation of the video; and finally training a classifier with a support vector machine algorithm to recognize human action videos. The present invention achieves higher action recognition accuracy.
Description
Technical field
The invention belongs to the field of image and video processing, and more particularly relates to a human action recognition method based on a deep neural network model combined with binary hashing.
Background technology
In recent years, research on human action recognition in fields such as pattern recognition and image processing and analysis has made great progress, and some human action recognition systems have already been put into practical use. A human action recognition algorithm mainly comprises two steps, action representation and action classification; how to encode human action information is a crucial step for the subsequent classification. An ideal action representation algorithm must not only be robust to variations in human appearance, scale, complex backgrounds and motion speed, but must also provide enough information for the classifier to discriminate between action types. However, complex backgrounds and the variability of the human body itself pose great challenges to human action recognition.
Deep learning methods treat a short video as a sequence of input frames. Clearly, a single frame is not enough to effectively capture the dynamics of an action, while a large number of frames requires a large number of parameters, which leads to model overfitting, demands a larger training set, and increases computational complexity. This problem also exists in other popular CNN architectures, such as the 3D convolutional networks proposed by Tran, D. et al. Therefore, state-of-the-art deep action recognition models are usually trained to generate useful features from short video clips, which are then aggregated into a whole-sequence-level descriptor that is used to train a linear classifier with specific action labels. In the P-CNN model proposed by Cheron et al., the FC-layer outputs of the RGB and optical flow streams of a video are extracted and combined with min or max pooling to obtain the video feature representation. However, min or max pooling only captures first-order associations between features; aggregation operators can more accurately capture the higher-order correlations between CNN features.
Although the CNN function at the frame level may be extremely complex, exploiting the associations between changes across video frames can capture features unique to a video, which can potentially improve video recognition performance.
Content of the invention
The present invention seeks to address the above problems of the prior art by proposing a human action recognition method that fuses a deep neural network model with binary hashing and achieves a better recognition effect. The technical scheme of the present invention is as follows:
A human action recognition method fusing a deep neural network model and binary hashing, comprising the following steps:
101: obtaining a short video containing a human action, and cutting the short video into a video frame sequence;
102: computing optical flow maps of adjacent frames in the video frame sequence of step 101 with an optical flow algorithm;
103: obtaining the coordinates of human joint points from the video frame sequence of step 101 with a pose estimation algorithm;
104: cropping RGB and optical flow region maps of different human body parts using the joint point coordinates obtained in step 103, to obtain the RGB frame sequence and the optical flow frame sequence of the video;
105: extracting the fully connected (Full Connected, FC) layer feature of each frame in the RGB frame sequence and the optical flow frame sequence obtained in step 104, using the VGG-16 model of the Visual Geometry Group of Oxford University and the optical flow network (FlowNet) model; the dimensionality of this layer's features is 4096;
106: pooling and aggregating the FC features obtained in step 105 to obtain an n × 4096-dimensional video feature representation;
107: after l2 normalization, feeding the video features obtained in step 106 into a linear SVM classifier for classification.
Further, step 102, computing the optical flow maps of the adjacent frames of the video frame sequence of step 101 with an optical flow algorithm, specifically comprises the following steps (a minimal sketch follows the list):
201: extracting the optical flow vectors between two adjacent video frames;
202: summing the absolute values of the horizontal and vertical components of the generated optical flow vectors over all pixel points, respectively, to obtain the horizontal and vertical optical flow absolute-value sums of the frame pair;
203: arranging the optical flow absolute-value sums of all frames in temporal order to generate the horizontal and vertical optical flow sequences of the entire video.
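As an illustration of steps 201-203, a minimal Python sketch follows; OpenCV's Farneback flow is an assumption, since the invention does not name a specific optical flow algorithm:

```python
# Sketch of steps 201-203: per-frame-pair horizontal/vertical optical-flow
# absolute-value sums, arranged in temporal order.
import cv2
import numpy as np

def optical_flow_sums(frames):
    """frames: list of grayscale frames (H x W uint8 arrays)."""
    sums = []
    for prev, curr in zip(frames[:-1], frames[1:]):
        # Step 201: optical flow vectors between two adjacent frames.
        flow = cv2.calcOpticalFlowFarneback(prev, curr, None, 0.5, 3, 15, 3, 5, 1.2, 0)
        # Step 202: sum absolute horizontal (x) and vertical (y) components.
        sums.append((np.abs(flow[..., 0]).sum(), np.abs(flow[..., 1]).sum()))
    # Step 203: rows in temporal order form the two flow sequences.
    return np.array(sums)  # shape (T - 1, 2)
```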
Further, the step of selecting the RGB frame sequence and the optical flow frame sequence of the video in step 104 comprises:
choosing sliding windows of different sizes h, and dynamically sampling frames at a step of S according to the number of video frames |F| and extracting their features; f_t denotes a frame in the original video frame sequence, where the original video has T frames in total, and f_{t_i} denotes a frame in the selected key frame sequence; key frames are extracted by the method shown in formula (2), choosing one frame every S frames, for h frames in total (a sampling sketch follows).
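A minimal sketch of this key-frame sampling, assuming the stride is rounded down to S = |F| // h (the description only states S = |F|/h):

```python
# Sketch: pick h key frames at a stride of S from the |F| frames of a video.
def select_key_frames(frames, h):
    T = len(frames)                  # |F|, total number of frames
    S = max(T // h, 1)               # sampling stride (rounding is an assumption)
    return [frames[i * S] for i in range(h) if i * S < T]  # [f_{t_1}, ..., f_{t_h}]
```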
Further, in order to treat the RGB sequence and the optical flow sequence separately, step 105 uses convolutional network models of two different architectures; each network contains five convolutional layers and three fully connected layers, and the output of the second fully connected layer is used as the FC feature, i.e. the video frame feature. The input images are uniformly resized to 224 × 224, so that consistent FC-layer features are obtained; the feature representation of a video is obtained after aggregating all of its frame features with min and max pooling operations.
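A sketch of the FC-feature extraction and min/max aggregation; torchvision's ImageNet-pre-trained VGG-16 stands in for both streams here, whereas the invention pairs the Oxford VGG-16 (RGB) with a FlowNet-style model (optical flow):

```python
# Sketch: 4096-d second-FC-layer features per frame, then min/max pooling
# over all frames of a video.
import torch
import torchvision.models as models
import torchvision.transforms as T

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()
# Keep the classifier up to (and including) the second fully connected layer.
fc2 = torch.nn.Sequential(*list(vgg.classifier.children())[:5])

preprocess = T.Compose([
    T.Resize((224, 224)),  # uniform 224 x 224 input, as described
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def video_fc_features(pil_frames):
    x = torch.stack([preprocess(f) for f in pil_frames])   # (T, 3, 224, 224)
    conv = torch.flatten(vgg.avgpool(vgg.features(x)), 1)  # (T, 25088)
    feats = fc2(conv)                                      # (T, 4096) FC features
    # Min and max pooling over frames give the per-video representation.
    return torch.cat([feats.min(0).values, feats.max(0).values])
```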
Further, adjacent differences are computed over the FC features of the selected key frames, i.e. over the corresponding 4096-dimensional features, and 0 and 1 are used to represent the direction of feature change. This yields a matrix of size 4096 × h whose elements are all 0 or 1; the binary sequence of each row is taken as input, computed with formula (3) and output, thus obtaining the 4096-dimensional binary hash feature of the video.
Further, step 106, computing the video feature values, specifically comprises: comparing the changes in feature values between two adjacent key frames f_{t_i} and f_{t_{i+1}}, corresponding to the video frame feature vectors f_t^p; for each dimension of the feature, an increase between the two adjacent frames is represented with 1 and a decrease with 0, yielding a 4096 × h feature value matrix M whose elements are only 0 or 1. For each row vector [x_{h-1}, x_{h-2}, ..., x_0] of the matrix, its binary hash mapping is computed with formula (3), B2U_w(x) = Σ_{i=0}^{w-1} x_i × 2^i, which converts the numeric string composed of 0s and 1s into one unsigned integer;
the binary hash features of the feature changes of the RGB streams and optical flow frames of the different human body parts are finally obtained.
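A sketch of this binarization and hashing; note that adjacent differencing of h key frames yields h − 1 difference columns, while the description speaks of a 4096 × h matrix, so the exact column convention is an assumption:

```python
# Sketch of formula (3): binarize adjacent key-frame feature differences,
# then pack each row of bits into one unsigned integer (B2U).
import numpy as np

def binary_hash(fc_feats):
    """fc_feats: (h, 4096) array of key-frame FC features."""
    diffs = np.diff(fc_feats, axis=0)        # adjacent differences over time
    bits = (diffs > 0).astype(np.uint64).T   # (4096, w): 1 = increase, 0 = decrease
    w = bits.shape[1]
    powers = np.power(2, np.arange(w), dtype=np.uint64)  # 2^i, i = 0..w-1
    return (bits * powers).sum(axis=1)       # B2U_w per feature dimension
```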
Further, in addition to l2 normalization, step 107 also uses a fused l1 + β·l2 feature normalization scheme, where l2 denotes second-order normalization of the features, l1 denotes first-order normalization of the features, and β denotes the fusion normalization coefficient. After the features extracted by the deep neural network are finally fused with the features obtained by binary hashing, the feature representation p of the video is obtained; since feature values from different sources differ in scale, all feature values are normalized to one scale before the classifier is applied.
Further, the fused l1 + β·l2 normalization scheme is used, i.e.
p = p / (||p||_1 + β·||p||_2)    (4)
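A one-line NumPy sketch of formula (4):

```python
# Sketch of formula (4): fused l1 + beta * l2 normalization of feature vector p.
import numpy as np

def fused_normalize(p, beta):
    return p / (np.linalg.norm(p, 1) + beta * np.linalg.norm(p, 2))
```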
Advantages and beneficial effects of the present invention:
The innovation of the present invention lies in blending a deep network model with a binary hashing method. Considering the effectiveness and accuracy that deep convolutional neural networks have shown in recent years on image object characterization problems, a VGG-16 network model pre-trained on the ImageNet dataset, which covers more than 20,000 kinds of objects, is chosen to extract features from the RGB frame sequence, and a deep model pre-trained on the UCF101 dataset, which contains 101 kinds of actions, is used to extract features from the optical flow frame sequence. The simplicity and efficiency of the binary hashing method allow the extracted static video frame and optical flow frame features to be further processed at a higher order. The combined features are then trained for recognition with different normalization methods. The method thus achieves a better recognition effect than traditional human action recognition methods.
Description of the drawings
Fig. 1 is an output result of the pose estimation method in a preferred embodiment of the present invention;
Fig. 2 is a flowchart of the method in a preferred embodiment of the present invention;
Fig. 3 is the flow of the binary hash algorithm;
Fig. 4 is a comparison of different normalization methods;
Fig. 5 is a comparison of hash windows of different sizes;
Fig. 6 is a comparison of fusion coefficients of different sizes.
Specific embodiment
The technical solution in the embodiments of the present invention is described clearly and in detail below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention.
The technical solution by which the present invention solves the above technical problems is as follows:
With reference to Figs. 1-2, a human action recognition method based on a deep network model and a binary hashing method comprises the following steps:
1. Extracting the deep features of the video
The samples in the experimental video library are divided into a training set and a test set, and FC-layer features are extracted from all samples. The detailed steps of the extraction method are as follows:
1) Cutting the input video into frames: in order to extract local feature information of the video, the input video containing a human action is cut into a frame sequence.
2) Computing optical flow frames from the RGB frame sequence with an optical flow algorithm.
3) Locating the coordinates of the joint points of the human body with a pose estimation algorithm.
4) Extracting, from the RGB frame sequence and the optical flow frame sequence, the regions where the human joint points are located according to the above joint point coordinates, including the head, shoulders, waist and elbows (a cropping sketch follows this list).
5) In order to treat the RGB sequence and the optical flow sequence separately, we use convolutional network models of two different architectures; each network contains five convolutional layers and three fully connected layers. We use the output of the second fully connected layer as the FC feature, i.e. the video frame feature. We uniformly resize the input images to 224 × 224, so that we obtain consistent FC-layer features. After aggregating all frame features of a video with min and max pooling operations, we obtain the feature representation of the video.
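A minimal sketch of the joint-region cropping in step 4); the crop box size is an assumption, since no crop dimensions are specified:

```python
# Sketch: crop a square region (e.g. head, shoulder, waist, elbow) around an
# estimated joint coordinate, clipped to the frame boundary.
def crop_part(frame, joint_xy, box=64):
    """frame: (H, W, C) array; joint_xy: (x, y) joint coordinate."""
    H, W = frame.shape[:2]
    x, y = int(joint_xy[0]), int(joint_xy[1])
    x0, y0 = max(x - box // 2, 0), max(y - box // 2, 0)
    return frame[y0:min(y0 + box, H), x0:min(x0 + box, W)]
```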
2. Computing the binary hash feature of the video
Observation shows that the dynamic characteristics of a video are sometimes distinguished by the transient motions of a few key parts. In order to further capture the dynamic characteristics of the video, we compute its binary hash feature with the following steps:
1) As with deep feature extraction, the video is first cut into frames, optical flow frames are computed, human joint point coordinates are extracted, and the FC features of the frame sequences at the different joint positions are computed.
2) Different videos have different frame counts |F|; we define the sliding window size as h, and the step S as |F|/h. A key frame is chosen at every step, as shown in Fig. 3.
3) On the FC features of the selected key frames, i.e. the corresponding 4096-dimensional features, we compute adjacent differences and represent the direction of feature change with 0 and 1. We thus obtain a matrix of size 4096 × h whose elements are all 0 or 1. We take the binary sequence of each row as input, compute it with formula (3) and output the result, thus obtaining the 4096-dimensional binary hash feature of the video.
3. Fusing the deep features and the hash features
For the deep features and binary hash features obtained in steps 1 and 2 above, we first perform feature fusion and then classify with an SVM classifier. The main detailed steps are as follows:
1) Concatenating the deep features and the hash features and saving the fused feature.
2) Computing the l1 norm and l2 norm of the feature matrix over the fused features of all action videos.
3) Dividing all elements of the feature matrix by the l1 norm and the l2 norm, respectively, to obtain two different normalized features.
4) Defining a fusion coefficient β and using l1 + β·l2 as the fused normalization norm to obtain another normalized feature.
5) Feeding the above normalized features and the corresponding action class labels into the SVM classifier and training with a linear kernel.
6) Training one classifier per action category: the current category is marked as the positive sample and all other categories as negative samples, and multiple classifiers are trained.
7) For each video of the test set, computing a score with each classifier and selecting the highest-scoring classifier's category as the action category (a one-vs-rest sketch follows this list).
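A sketch of the one-vs-rest training and scoring in steps 5)-7); scikit-learn's LinearSVC is an assumption, since only a linear-kernel SVM is specified:

```python
# Sketch of steps 5)-7): one linear SVM per action category (current category
# positive, all others negative); the highest decision score wins at test time.
import numpy as np
from sklearn.svm import LinearSVC

def train_ovr_svms(X_train, y_train):
    return {c: LinearSVC().fit(X_train, (y_train == c).astype(int))
            for c in np.unique(y_train)}

def predict(classifiers, X_test):
    labels = sorted(classifiers)
    scores = np.stack([classifiers[c].decision_function(X_test) for c in labels])
    return [labels[i] for i in scores.argmax(axis=0)]
```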
An embodiment of the present invention is as follows:
The JHMDB and MPII-Cooking human action libraries are used as the experimental databases.
The JHMDB action dataset contains 21 classes of human actions, including combing hair, sitting, standing, running and waving. Each video contains only one very short clip of 15-40 frames. There are 928 videos and 31838 annotated frames in total.
The MPII-Cooking action dataset contains a series of high-resolution videos of humans cooking in a kitchen, with actions such as washing dishes, cutting fruit and washing hands. Each video contains one cooking activity. It covers 64 cooking action categories in total, involving 3748 video clips against the same background.
(1) The JHMDB dataset has three different training set/test set splits with an 80/20 ratio, ensuring that all action categories are covered. The classification accuracy is computed on each test split, and the average over the three splits is used as the evaluation standard. The specific test results are shown in Figs. 4 and 5. The results obtained with the normalization methods are significantly better than classification with the original features. Across hash windows of different sizes, l1 normalization is better than l2 normalization in most cases.
We likewise compare the influence of different fusion coefficients β on l1 + β·l2 normalization under different hash windows. The experimental results are shown in Fig. 6.
(2) We test the classification performance with the same method on the JHMDB and MPII-Cooking datasets. As shown in Table 1, the classification results show that the method fusing deep network features and binary hash features is better than the previous method based on the P-CNN model.
Table 1: Influence of different normalization methods combined with the hash features on the classification results on the JHMDB and MPII-Cooking datasets
The above embodiments should be understood as merely illustrating the present invention rather than limiting its scope. After reading the content recorded herein, a skilled person may make various changes or modifications to the present invention, and such equivalent changes and modifications likewise fall within the scope of the claims of the present invention.
Claims (8)
1. A human action recognition method fusing a deep neural network model and binary hashing, characterized by comprising the following steps:
101: obtaining a short video containing a human action, and cutting the short video into a video frame sequence;
102: computing optical flow maps of adjacent frames in the video frame sequence of step 101 with an optical flow algorithm;
103: obtaining the coordinates of human joint points from the video frame sequence of step 101 with a pose estimation algorithm;
104: cropping RGB and optical flow region maps of different human body parts using the joint point coordinates obtained in step 103, to obtain the RGB frame sequence and the optical flow frame sequence of the video;
105: extracting the fully connected layer feature of each frame in the RGB frame sequence and the optical flow frame sequence obtained in step 104, using the VGG-16 model of the Visual Geometry Group of Oxford University and an optical flow network model; the dimensionality of this layer's features is 4096;
106: pooling and aggregating the FC features obtained in step 105 to obtain an n × 4096-dimensional video feature representation;
107: after l2 normalization, feeding the video features obtained in step 106 into a linear SVM classifier for classification.
2. The human action recognition method fusing a deep neural network model and binary hashing according to claim 1, characterized in that step 102, computing the optical flow maps of the adjacent frames of the video frame sequence of step 101 with an optical flow algorithm, specifically comprises the steps of:
201: extracting the optical flow vectors between two adjacent video frames;
202: summing the absolute values of the horizontal and vertical components of the generated optical flow vectors over all pixel points, respectively, to obtain the horizontal and vertical optical flow absolute-value sums of the frame pair;
203: arranging the optical flow absolute-value sums of all frames in temporal order to generate the horizontal and vertical optical flow sequences of the entire video.
3. The human action recognition method fusing a deep neural network model and binary hashing according to claim 1, characterized in that the step of selecting the RGB frame sequence and the optical flow frame sequence of the video in step 104 comprises:
choosing sliding windows of different sizes h, and dynamically sampling frames at a step of S according to the number of video frames |F| and extracting their features; f_t denotes a frame in the original video frame sequence, where the original video has T frames in total, and f_{t_i} denotes a frame in the selected key frame sequence; key frames are extracted by the method shown in formula (2), choosing one frame every S frames, for h frames in total;
[f_{t_1}, f_{t_2}, ..., f_{t_h}] ⊆ F
F = [f_1, f_2, ..., f_T]    (1)
4. The human action recognition method fusing a deep neural network model and binary hashing according to claim 3, characterized in that, in order to treat the RGB sequence and the optical flow sequence separately, step 105 uses convolutional network models of two different architectures; each network contains five convolutional layers and three fully connected layers, and the output of the second fully connected layer is used as the FC feature, i.e. the video frame feature; the input images are uniformly resized to 224 × 224 so that consistent FC-layer features are obtained; the feature representation of the video is obtained after aggregating all of its frame features with min and max pooling operations.
5. The human action recognition method fusing a deep neural network model and binary hashing according to claim 4, characterized in that adjacent differences are computed over the FC features of the selected key frames, i.e. over the corresponding 4096-dimensional features, and 0 and 1 are used to represent the direction of feature change, thus yielding a matrix of size 4096 × h whose elements are all 0 or 1; the binary sequence of each row is taken as input, computed with formula (3) and output, thus obtaining the 4096-dimensional binary hash feature of the video.
6. The human action recognition method fusing a deep neural network model and binary hashing according to claim 4, characterized in that step 106, computing the video feature values, specifically comprises: comparing the changes in feature values between two adjacent key frames f_{t_i} and f_{t_{i+1}}, corresponding to the video frame feature vectors f_t^p; for each dimension of the feature, an increase between the two adjacent frames is represented with 1 and a decrease with 0, yielding a 4096 × h feature value matrix M whose elements are only 0 or 1; for each row vector [x_{h-1}, x_{h-2}, ..., x_0] of the matrix, its binary hash mapping is computed with formula (3) below, which converts the numeric string composed of 0s and 1s into one unsigned integer;
B2U_w(x) = Σ_{i=0}^{w-1} x_i × 2^i    (3)
The binary hash features of the feature changes of the RGB streams and optical flow frames of the different human body parts are finally obtained.
7. The human action recognition method fusing a deep neural network model and binary hashing according to claim 6, characterized in that, in addition to l2 normalization, step 107 also uses a fused l1 + β·l2 feature normalization scheme, where l2 denotes second-order normalization of the features, l1 denotes first-order normalization of the features, and β denotes the fusion normalization coefficient; after the features extracted by the deep neural network are finally fused with the features obtained by binary hashing, the feature representation p of the video is obtained; since feature values from different sources differ in scale, all feature values are normalized to one scale before the classifier is applied.
8. The human action recognition method fusing a deep neural network model and binary hashing according to claim 7, characterized in that the fused l1 + β·l2 normalization scheme is used, i.e.
p = p / (||p||_1 + β·||p||_2)    (4).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711422702.9A CN108108699A (en) | 2017-12-25 | 2017-12-25 | Human action recognition method fusing a deep neural network model and binary hashing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711422702.9A CN108108699A (en) | 2017-12-25 | 2017-12-25 | Human action recognition method fusing a deep neural network model and binary hashing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108108699A true CN108108699A (en) | 2018-06-01 |
Family
ID=62212862
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711422702.9A Pending CN108108699A (en) | 2017-12-25 | 2017-12-25 | Human action recognition method fusing a deep neural network model and binary hashing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108108699A (en) |
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108960207A (en) * | 2018-08-08 | 2018-12-07 | 广东工业大学 | A kind of method of image recognition, system and associated component |
CN108985223A (en) * | 2018-07-12 | 2018-12-11 | 天津艾思科尔科技有限公司 | A kind of human motion recognition method |
CN109086659A (en) * | 2018-06-13 | 2018-12-25 | 深圳市感动智能科技有限公司 | A kind of Human bodys' response method and apparatus based on multimode road Fusion Features |
CN109255284A (en) * | 2018-07-10 | 2019-01-22 | 西安理工大学 | A kind of Activity recognition method of the 3D convolutional neural networks based on motion profile |
CN109815921A (en) * | 2019-01-29 | 2019-05-28 | 北京融链科技有限公司 | The prediction technique and device of the class of activity in hydrogenation stations |
CN109858406A (en) * | 2019-01-17 | 2019-06-07 | 西北大学 | A kind of extraction method of key frame based on artis information |
CN109918537A (en) * | 2019-01-18 | 2019-06-21 | 杭州电子科技大学 | A kind of method for quickly retrieving of the ship monitor video content based on HBase |
CN110096950A (en) * | 2019-03-20 | 2019-08-06 | 西北大学 | A kind of multiple features fusion Activity recognition method based on key frame |
CN110135386A (en) * | 2019-05-24 | 2019-08-16 | 长沙学院 | A kind of human motion recognition method and system based on deep learning |
CN110163127A (en) * | 2019-05-07 | 2019-08-23 | 国网江西省电力有限公司检修分公司 | A kind of video object Activity recognition method from thick to thin |
CN111104837A (en) * | 2018-10-29 | 2020-05-05 | 联发科技股份有限公司 | Mobile device and related video editing method |
CN111324744A (en) * | 2020-02-17 | 2020-06-23 | 中山大学 | Data enhancement method based on target emotion analysis data set |
CN111666845A (en) * | 2020-05-26 | 2020-09-15 | 南京邮电大学 | Small sample deep learning multi-mode sign language recognition method based on key frame sampling |
CN111695507A (en) * | 2020-06-12 | 2020-09-22 | 桂林电子科技大学 | Static gesture recognition method based on improved VGGNet network and PCA |
CN112560817A (en) * | 2021-02-22 | 2021-03-26 | 西南交通大学 | Human body action recognition method and device, electronic equipment and storage medium |
CN112784658A (en) * | 2019-11-01 | 2021-05-11 | 纬创资通股份有限公司 | Method and system for recognizing actions based on atomic gestures and computer readable recording medium |
CN112818859A (en) * | 2021-02-02 | 2021-05-18 | 电子科技大学 | Deep hash-based multi-level retrieval pedestrian re-identification method |
CN113313030A (en) * | 2021-05-31 | 2021-08-27 | 华南理工大学 | Human behavior identification method based on motion trend characteristics |
CN113326724A (en) * | 2021-02-07 | 2021-08-31 | 海南长光卫星信息技术有限公司 | Method, device and equipment for detecting change of remote sensing image and readable storage medium |
CN113326835A (en) * | 2021-08-04 | 2021-08-31 | 中国科学院深圳先进技术研究院 | Action detection method and device, terminal equipment and storage medium |
CN113420612A (en) * | 2021-06-02 | 2021-09-21 | 深圳中集智能科技有限公司 | Production beat calculation method based on machine vision |
CN113420719A (en) * | 2021-07-20 | 2021-09-21 | 北京百度网讯科技有限公司 | Method and device for generating motion capture data, electronic equipment and storage medium |
WO2022012239A1 (en) * | 2020-07-16 | 2022-01-20 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Action recognition method and related device, storage medium |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103384331A (en) * | 2013-07-19 | 2013-11-06 | 上海交通大学 | Video inter-frame forgery detection method based on light stream consistency |
CN104469229A (en) * | 2014-11-18 | 2015-03-25 | 北京恒华伟业科技股份有限公司 | Video data storing method and device |
CN105468755A (en) * | 2015-11-27 | 2016-04-06 | 东方网力科技股份有限公司 | Video screening and storing method and device |
US20160148391A1 (en) * | 2013-06-12 | 2016-05-26 | Agency For Science, Technology And Research | Method and system for human motion recognition |
CN105741853A (en) * | 2016-01-25 | 2016-07-06 | 西南交通大学 | Digital speech perception hash method based on formant frequency |
CN105989611A (en) * | 2015-02-05 | 2016-10-05 | 南京理工大学 | Blocking perception Hash tracking method with shadow removing |
CN106203283A (en) * | 2016-06-30 | 2016-12-07 | 重庆理工大学 | Based on Three dimensional convolution deep neural network and the action identification method of deep video |
CN106331524A (en) * | 2016-08-18 | 2017-01-11 | 无锡天脉聚源传媒科技有限公司 | Method and device for recognizing shot cut |
CN106845351A (en) * | 2016-05-13 | 2017-06-13 | 苏州大学 | It is a kind of for Activity recognition method of the video based on two-way length mnemon in short-term |
CN106937114A (en) * | 2015-12-30 | 2017-07-07 | 株式会社日立制作所 | Method and apparatus for being detected to video scene switching |
CN107169415A (en) * | 2017-04-13 | 2017-09-15 | 西安电子科技大学 | Human motion recognition method based on convolutional neural networks feature coding |
CN107403153A (en) * | 2017-07-20 | 2017-11-28 | 大连大学 | A kind of palmprint image recognition methods encoded based on convolutional neural networks and Hash |
- 2017-12-25: CN CN201711422702.9A patent/CN108108699A/en active Pending
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160148391A1 (en) * | 2013-06-12 | 2016-05-26 | Agency For Science, Technology And Research | Method and system for human motion recognition |
CN103384331A (en) * | 2013-07-19 | 2013-11-06 | 上海交通大学 | Video inter-frame forgery detection method based on light stream consistency |
CN104469229A (en) * | 2014-11-18 | 2015-03-25 | 北京恒华伟业科技股份有限公司 | Video data storing method and device |
CN105989611A (en) * | 2015-02-05 | 2016-10-05 | 南京理工大学 | Blocking perception Hash tracking method with shadow removing |
CN105468755A (en) * | 2015-11-27 | 2016-04-06 | 东方网力科技股份有限公司 | Video screening and storing method and device |
CN106937114A (en) * | 2015-12-30 | 2017-07-07 | 株式会社日立制作所 | Method and apparatus for being detected to video scene switching |
CN105741853A (en) * | 2016-01-25 | 2016-07-06 | 西南交通大学 | Digital speech perception hash method based on formant frequency |
CN106845351A (en) * | 2016-05-13 | 2017-06-13 | 苏州大学 | It is a kind of for Activity recognition method of the video based on two-way length mnemon in short-term |
CN106203283A (en) * | 2016-06-30 | 2016-12-07 | 重庆理工大学 | Based on Three dimensional convolution deep neural network and the action identification method of deep video |
CN106331524A (en) * | 2016-08-18 | 2017-01-11 | 无锡天脉聚源传媒科技有限公司 | Method and device for recognizing shot cut |
CN107169415A (en) * | 2017-04-13 | 2017-09-15 | 西安电子科技大学 | Human motion recognition method based on convolutional neural networks feature coding |
CN107403153A (en) * | 2017-07-20 | 2017-11-28 | 大连大学 | A kind of palmprint image recognition methods encoded based on convolutional neural networks and Hash |
Non-Patent Citations (5)
Title |
---|
GUILHEM CHERON et al.: "P-CNN: Pose-based CNN Features for Action Recognition", 2015 IEEE International Conference on Computer Vision *
RANDAL E. BRYANT et al.: "Computer Systems: A Programmer's Perspective" *
XIUSHAN NIE et al.: "Key-Frame Based Robust Video Hashing Using Isometric Feature Mapping", Journal of Computational Information Systems *
PENG Tianqiang et al.: "Image retrieval method based on deep convolutional neural network and binary hash learning", Journal of Electronics & Information Technology *
WANG Huan: "Encoding and retrieval of motion capture data based on hash learning", Wanfang Database *
Cited By (37)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109086659A (en) * | 2018-06-13 | 2018-12-25 | 深圳市感动智能科技有限公司 | A kind of Human bodys' response method and apparatus based on multimode road Fusion Features |
CN109086659B (en) * | 2018-06-13 | 2023-01-31 | 深圳市感动智能科技有限公司 | Human behavior recognition method and device based on multi-channel feature fusion |
CN109255284A (en) * | 2018-07-10 | 2019-01-22 | 西安理工大学 | A kind of Activity recognition method of the 3D convolutional neural networks based on motion profile |
CN108985223A (en) * | 2018-07-12 | 2018-12-11 | 天津艾思科尔科技有限公司 | A kind of human motion recognition method |
CN108985223B (en) * | 2018-07-12 | 2024-05-07 | 天津艾思科尔科技有限公司 | Human body action recognition method |
CN108960207A (en) * | 2018-08-08 | 2018-12-07 | 广东工业大学 | A kind of method of image recognition, system and associated component |
CN108960207B (en) * | 2018-08-08 | 2021-05-11 | 广东工业大学 | Image recognition method, system and related components |
CN111104837A (en) * | 2018-10-29 | 2020-05-05 | 联发科技股份有限公司 | Mobile device and related video editing method |
CN109858406A (en) * | 2019-01-17 | 2019-06-07 | 西北大学 | A kind of extraction method of key frame based on artis information |
CN109858406B (en) * | 2019-01-17 | 2023-04-07 | 西北大学 | Key frame extraction method based on joint point information |
CN109918537A (en) * | 2019-01-18 | 2019-06-21 | 杭州电子科技大学 | A kind of method for quickly retrieving of the ship monitor video content based on HBase |
CN109918537B (en) * | 2019-01-18 | 2021-05-11 | 杭州电子科技大学 | HBase-based rapid retrieval method for ship monitoring video content |
CN109815921A (en) * | 2019-01-29 | 2019-05-28 | 北京融链科技有限公司 | The prediction technique and device of the class of activity in hydrogenation stations |
CN110096950B (en) * | 2019-03-20 | 2023-04-07 | 西北大学 | Multi-feature fusion behavior identification method based on key frame |
CN110096950A (en) * | 2019-03-20 | 2019-08-06 | 西北大学 | A kind of multiple features fusion Activity recognition method based on key frame |
CN110163127A (en) * | 2019-05-07 | 2019-08-23 | 国网江西省电力有限公司检修分公司 | A kind of video object Activity recognition method from thick to thin |
CN110135386A (en) * | 2019-05-24 | 2019-08-16 | 长沙学院 | A kind of human motion recognition method and system based on deep learning |
CN112784658A (en) * | 2019-11-01 | 2021-05-11 | 纬创资通股份有限公司 | Method and system for recognizing actions based on atomic gestures and computer readable recording medium |
CN111324744A (en) * | 2020-02-17 | 2020-06-23 | 中山大学 | Data enhancement method based on target emotion analysis data set |
CN111324744B (en) * | 2020-02-17 | 2023-04-07 | 中山大学 | Data enhancement method based on target emotion analysis data set |
CN111666845A (en) * | 2020-05-26 | 2020-09-15 | 南京邮电大学 | Small sample deep learning multi-mode sign language recognition method based on key frame sampling |
CN111695507A (en) * | 2020-06-12 | 2020-09-22 | 桂林电子科技大学 | Static gesture recognition method based on improved VGGNet network and PCA |
CN111695507B (en) * | 2020-06-12 | 2022-08-16 | 桂林电子科技大学 | Static gesture recognition method based on improved VGGNet network and PCA |
WO2022012239A1 (en) * | 2020-07-16 | 2022-01-20 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Action recognition method and related device, storage medium |
CN112818859A (en) * | 2021-02-02 | 2021-05-18 | 电子科技大学 | Deep hash-based multi-level retrieval pedestrian re-identification method |
CN113326724A (en) * | 2021-02-07 | 2021-08-31 | 海南长光卫星信息技术有限公司 | Method, device and equipment for detecting change of remote sensing image and readable storage medium |
CN113326724B (en) * | 2021-02-07 | 2024-02-02 | 海南长光卫星信息技术有限公司 | Remote sensing image change detection method, device, equipment and readable storage medium |
CN112560817B (en) * | 2021-02-22 | 2021-07-06 | 西南交通大学 | Human body action recognition method and device, electronic equipment and storage medium |
CN112560817A (en) * | 2021-02-22 | 2021-03-26 | 西南交通大学 | Human body action recognition method and device, electronic equipment and storage medium |
CN113313030B (en) * | 2021-05-31 | 2023-02-14 | 华南理工大学 | Human behavior identification method based on motion trend characteristics |
CN113313030A (en) * | 2021-05-31 | 2021-08-27 | 华南理工大学 | Human behavior identification method based on motion trend characteristics |
CN113420612B (en) * | 2021-06-02 | 2022-03-18 | 深圳中集智能科技有限公司 | Production beat calculation method based on machine vision |
CN113420612A (en) * | 2021-06-02 | 2021-09-21 | 深圳中集智能科技有限公司 | Production beat calculation method based on machine vision |
CN113420719A (en) * | 2021-07-20 | 2021-09-21 | 北京百度网讯科技有限公司 | Method and device for generating motion capture data, electronic equipment and storage medium |
CN113326835B (en) * | 2021-08-04 | 2021-10-29 | 中国科学院深圳先进技术研究院 | Action detection method and device, terminal equipment and storage medium |
CN113326835A (en) * | 2021-08-04 | 2021-08-31 | 中国科学院深圳先进技术研究院 | Action detection method and device, terminal equipment and storage medium |
WO2023010758A1 (en) * | 2021-08-04 | 2023-02-09 | 中国科学院深圳先进技术研究院 | Action detection method and apparatus, and terminal device and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108108699A (en) | Human action recognition method fusing a deep neural network model and binary hashing | |
Tu et al. | Edge-guided non-local fully convolutional network for salient object detection | |
CN110956185B (en) | Method for detecting image salient object | |
CN104143079B (en) | The method and system of face character identification | |
CN108648191B (en) | Pest image recognition method based on Bayesian width residual error neural network | |
CN109523463A (en) | A kind of face aging method generating confrontation network based on condition | |
CN109615582A (en) | A kind of face image super-resolution reconstruction method generating confrontation network based on attribute description | |
CN108764308A (en) | Pedestrian re-identification method based on convolution cycle network | |
CN107066973A (en) | A kind of video content description method of utilization spatio-temporal attention model | |
CN109325443A (en) | A kind of face character recognition methods based on the study of more example multi-tag depth migrations | |
CN104298974B (en) | A kind of Human bodys' response method based on deep video sequence | |
CN109815826A (en) | The generation method and device of face character model | |
CN105160310A (en) | 3D (three-dimensional) convolutional neural network based human body behavior recognition method | |
CN108108674A (en) | A kind of recognition methods again of the pedestrian based on joint point analysis | |
CN108520213B (en) | Face beauty prediction method based on multi-scale depth | |
Rao et al. | Sign Language Recognition System Simulated for Video Captured with Smart Phone Front Camera. | |
CN104077742B (en) | Human face sketch synthetic method and system based on Gabor characteristic | |
CN109635812B (en) | The example dividing method and device of image | |
CN111062329B (en) | Unsupervised pedestrian re-identification method based on augmented network | |
CN110047081A (en) | Example dividing method, device, equipment and the medium of chest x-ray image | |
CN110378208A (en) | A kind of Activity recognition method based on depth residual error network | |
Gan et al. | Facial beauty prediction based on lighted deep convolution neural network with feature extraction strengthened | |
CN110490109A (en) | A kind of online human body recovery action identification method based on monocular vision | |
CN107463954A (en) | A kind of template matches recognition methods for obscuring different spectrogram picture | |
CN106529586A (en) | Image classification method based on supplemented text characteristic |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20180601 |