CN107808376A - Hand-raising detection method based on deep learning - Google Patents

Hand-raising detection method based on deep learning Download PDF

Info

Publication number
CN107808376A
CN107808376A (application CN201711044722.7A)
Authority
CN
China
Prior art keywords
hand
raising
frame
sample
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711044722.7A
Other languages
Chinese (zh)
Other versions
CN107808376B (en)
Inventor
林娇娇
姜飞
申瑞民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University
Priority to CN201711044722.7A
Publication of CN107808376A
Application granted
Publication of CN107808376B
Status: Expired - Fee Related
Anticipated expiration


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147Distances to closest patterns, e.g. nearest neighbour classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/254Analysis of motion involving subtraction of images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30232Surveillance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30242Counting objects in image

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a hand-raising detection method based on deep learning, comprising the following steps: 1) collecting samples, the samples being complex-environment samples; 2) building a hand-raising detection model, the model being based on a convolutional neural network structure and trained on the samples with the R-FCN object detection algorithm; 3) performing hand-raising detection on the video under test with the trained model to obtain the positions of hand-raising boxes. Compared with the prior art, the present invention has the advantage of detecting hand-raising actions in complex environments with high accuracy and recall.

Description

Hand-raising detection method based on deep learning
Technical field
The present invention relates to a video detection method, and more particularly to a hand-raising detection method based on deep learning.
Background art
Moving-human detection and activity recognition in video sequences is a research topic spanning computer vision, pattern recognition, artificial intelligence and other fields; because of its wide application value in commerce, medicine, the military and elsewhere, it has long been a focus of research. However, owing to the diversity and non-rigidity of human behavior and the intrinsic complexity of video images, a robust, real-time and accurate method remains difficult to achieve.
Because of noisy and highly dynamic backgrounds, varying illumination conditions, and the small size and many possible matches of the targets, detecting people's hand-raising actions in a typical classroom environment is a challenging task.
The paper "Haar-Feature Based Gesture Detection of Hand-Raising for Mobile Robot in HRI Environments" discloses a hand-raising detection technique based on Haar features. The method trains two classifiers: a face detector scans all positions of the input image to find people, and a hand-raising detector then scans the specific region around each face to detect whether a hand is raised. The method is divided into a training stage and a detection stage. The training stage comprises: (1) sample creation, in which training samples are divided into positive samples (the targets to be detected) and negative samples (any other images); (2) feature extraction, including edge features, linear features and center features; (3) cascaded AdaBoost training, performed by calling OpenCV's opencv_traincascade program. Training produces an .xml model file, and the resulting AdaBoost cascade classifier can detect hand-raising actions, which is the key to the whole detection technique. The detection stage comprises: (1) cutting the video into frames and performing face detection; (2) selecting a region of interest constrained by the face; (3) performing hand-raising detection in the region of interest with the trained cascade classifier.
Although the above method can produce detection results, it still has several drawbacks: (1) it requires face detection, and the quality of the face detection directly affects the final hand-raising detection; (2) region-of-interest selection requires repeated trial and error, and each new detection environment requires a new selection scheme, so the results are not robust; (3) hand-raising detection based on Haar features performs poorly, with low accuracy and recall.
Summary of the invention
An object of the present invention is to overcome the above-mentioned drawbacks of the prior art and to provide a hand-raising detection method based on deep learning.
A first object of the present invention is to detect hand-raising actions in complex environments (such as classroom environments).
A second object of the present invention is to improve the accuracy of hand-raising detection.
A third object of the present invention is to improve the recall of hand-raising detection.
A fourth object of the present invention is to merge the same hand-raising action across different frames and obtain a more realistic count of raised hands.
The objects of the present invention can be achieved through the following technical solutions:
A hand-raising detection method based on deep learning comprises the following steps:
1) collecting samples, the samples being complex-environment samples;
2) building a hand-raising detection model, the model being based on a convolutional neural network structure and trained on the samples with the R-FCN object detection algorithm;
3) performing hand-raising detection on the video under test with the trained model to obtain the positions of hand-raising boxes.
Further, in step 1), the number of samples is more than 30,000.
Further, step 1) also includes: saving sample information, the sample information including the video key frame images, the key frame image metadata, and the bounding-box coordinates of the hand-raising targets in the key frame images.
Further, step 1) also includes: clustering the sample sizes to obtain the template sizes needed for training.
Further, the convolutional neural network structure includes an intermediate-level fusion layer.
Further, the method also includes the step:
4) merging the same hand-raising action across different frames with a tracking algorithm.
Further, step 4) is specifically:
401) obtain the first image frame and the detected hand-raising box coordinates; create one tracklet array for each hand-raising box and initialize its state to ALIVE;
402) obtain the next image frame and judge whether a shot (camera view) change has occurred; if so, change the state of all tracklet arrays to DEAD, create new tracklet arrays, and return to step 402); if not, execute step 403);
403) traverse all hand-raising boxes detected in the current image frame and, using the tracking algorithm, select the best-matching tracklet array for each hand-raising box;
404) for each tracklet array not matched in the current image frame, judge whether its state is ALIVE; if so, change the state to WAIT; if not, change the state to DEAD; return to step 402) until all image frames have been processed.
Further, judging whether a shot change has occurred is specifically:
obtain two adjacent image frames and count the number of pixels whose change between corresponding pixels of the two frames exceeds a first threshold; judge whether the number of changed pixels exceeds a second threshold; if so, a shot change is deemed to have occurred; if not, no shot change has occurred.
Further, the method also includes the step:
5) counting the hand-raising actions after detection and merging.
Compared with the prior art, the invention has the following advantages:
1. The invention uses video images from complex environments as samples for training the hand-raising detection model, so the method is suitable for hand-raising detection in complex environments and adapts well to cluttered backgrounds.
2. The proposed hand-raising detection model is a deep learning model trained on a large number of samples (more than 30,000 hand-raising samples); the model's accuracy is high, exceeding 90% in extensive tests.
3. The template sizes required in training are obtained by clustering the sample sizes rather than by manual selection, which effectively improves the model.
4. The template-size clustering and the intermediate-level fusion of the network ensure the model's recall, which exceeds 70% in extensive tests.
5. The tracking algorithm used by the invention can effectively track the same hand-raising action across different frames, so the true number of raised hands can be obtained, providing a basis for further analysis and evaluation.
Brief description of the drawings
Fig. 1 is a flow chart of the present invention;
Fig. 2 is a flow chart of the sample-size clustering of the present invention;
Fig. 3 is a schematic diagram of the intermediate-level fusion of the network;
Fig. 4 is a schematic diagram of the network structure of the hand-raising detection model of the invention;
Fig. 5 is a flow chart of the merging of hand-raising actions in the present invention;
Fig. 6 is a flow chart of the shot-boundary judgment of the present invention;
Fig. 7 shows detection results in the embodiment.
Detailed description of the embodiments
The present invention is described in detail below with reference to the accompanying drawings and specific embodiments. The embodiments are implemented on the premise of the technical solution of the present invention and give detailed implementations and specific operating processes, but the protection scope of the present invention is not limited to the following embodiments.
As shown in Fig. 1, the present invention provides a hand-raising detection method based on deep learning, comprising the following steps:
1) Collect samples, the samples being complex-environment samples with a sample count of more than 30,000.
After collecting the samples, the sample information needs to be saved, including the video key frame images, the key frame image metadata, and the bounding-box coordinates of the hand-raising targets in the key frame images.
The sample information can be saved in the format of the PASCAL VOC data set. PASCAL VOC provides a whole set of standardized, high-quality data sets for image recognition and classification. Files saved in this format include JPEGImages, Annotations, etc.: JPEGImages stores the key frame images of the video, while Annotations stores the details of the corresponding image and the bounding-box coordinates of the hand-raising targets, each hand-raising box being marked by its top-left and bottom-right corner coordinates.
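For illustration, a PASCAL VOC-style annotation for one key frame might look like the following. The file name, class label and coordinates here are invented for the example; in this format each bounding box is stored by its xmin/ymin/xmax/ymax values, i.e. its top-left and bottom-right corners.

```xml
<annotation>
  <folder>JPEGImages</folder>
  <filename>frame_000123.jpg</filename>
  <size><width>1920</width><height>1080</height><depth>3</depth></size>
  <object>
    <name>hand_raising</name>
    <bndbox>
      <xmin>412</xmin><ymin>255</ymin>
      <xmax>468</xmax><ymax>351</ymax>
    </bndbox>
  </object>
</annotation>
```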
Templates (anchors) are needed during model training; in the present invention the template sizes are obtained by clustering the sample sizes. In certain embodiments, the sample sizes are clustered with the k-means algorithm and the 9 most representative sizes are selected as templates.
The distance metric in k-means is redefined here as:
d(box, centroid) = 1 - IOU(box, centroid)
where d(box, centroid) denotes the distance between a bounding box box and a centroid centroid, and IOU(box, centroid) denotes the corresponding overlap rate.
In the formula above, IOU (Intersection over Union) denotes the overlap rate between a template anchor (box) and a pre-labeled hand-raising box, i.e. the ground truth (centroid), defined as:
IOU(box, centroid) = area(box ∩ centroid) / area(box ∪ centroid)
As shown in Fig. 2, the clustering procedure can be described in pseudocode as:
Require: the bounding boxes of the pre-labeled hand-raising boxes
Ensure: the 9 most typical sizes, output as template sizes
1: k = 9
2: select k points as initial centroids
3: repeat
4:   compute distances with d(box, centroid) = 1 - IOU(box, centroid)
5:   assign each bounding box to its nearest centroid, forming k clusters
6:   recompute the centroid of each cluster
7: until the clusters no longer change
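The clustering loop above can be sketched in Python. This is an illustrative implementation rather than the patent's code: box sizes are (width, height) pairs and, as is usual for anchor clustering, IoU is computed as if the two boxes shared a corner; k = 9 follows the text.

```python
import random

def iou_wh(box, centroid):
    # IoU of two (w, h) boxes aligned at a common corner,
    # as is standard when clustering anchor sizes.
    w1, h1 = box
    w2, h2 = centroid
    inter = min(w1, w2) * min(h1, h2)
    return inter / (w1 * h1 + w2 * h2 - inter)

def kmeans_anchors(boxes, k=9, seed=0, max_iter=100):
    """k-means over (w, h) boxes with distance d = 1 - IoU(box, centroid)."""
    random.seed(seed)
    centroids = random.sample(boxes, k)
    for _ in range(max_iter):
        # Assign each box to the centroid with the smallest d, i.e. largest IoU.
        clusters = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda j: iou_wh(b, centroids[j]))
            clusters[best].append(b)
        # Recompute each centroid as the mean (w, h) of its cluster.
        new = [(sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c))
               if c else centroids[i] for i, c in enumerate(clusters)]
        if new == centroids:  # clusters no longer change
            break
        centroids = new
    return sorted(centroids)
```

Run over the pre-labeled hand-raising boxes of the training set, this would yield template sizes analogous to the nine listed in Embodiment 1.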
2) Build the hand-raising detection model, the model being based on a convolutional neural network structure and trained on the samples with the R-FCN object detection algorithm. The convolutional neural network structure includes an intermediate-level fusion layer to enrich the features extracted by the network and thereby improve detection accuracy.
In certain embodiments, the convolutional neural network structure uses a revised ResNet-101, where C1, C2, C3, C4 and C5 denote the outputs of ResNet-101's conv1, conv2, conv3, conv4 and conv5, respectively. As convolution layers are stacked, the receptive field of each convolution kernel grows and the learned semantic features become more abstract, but fine details are more easily lost. In some environments the resolution of hand-raising gestures can be small, so in order to detect small objects correctly we superimpose C3 onto the output of C5, making the features the network learns at the C5 level carry both high-level semantics and low-level details. As shown in Fig. 3, res5c_relu is the output of C5; C5_topdown is an upsampling layer that upsamples C5 to the same size as C3; finally C5_topdown is added to C3 to obtain the P3 layer, and P3 replaces res5c_relu as the output of C5. This enriches the features extracted by the convolutional neural network.
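The C3 + upsampled-C5 fusion can be sketched with NumPy. This is a shape-level illustration only: it assumes the channel counts of C3 and C5 already match (in a real ResNet-101 they differ, so a 1x1 convolution would be needed to align them) and uses nearest-neighbour upsampling.

```python
import numpy as np

def upsample_nearest(x, factor):
    # x: (C, H, W) feature map; repeat each spatial cell `factor` times per axis.
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse_p3(c3, c5):
    """P3 = C3 + upsample(C5 to C3's spatial size)."""
    factor = c3.shape[1] // c5.shape[1]
    return c3 + upsample_nearest(c5, factor)
```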
With ResNet-101 as the feature extraction network and the intermediate-level feature map fusion in place, model training uses the R-FCN object detection algorithm. First, a basic stack of conv+relu+pooling layers extracts the feature maps of the image. These feature maps are shared by the subsequent RPN network and detection network. The RPN network generates region proposals: it uses softmax to judge whether each anchor belongs to the foreground or the background, then refines the anchors with bounding-box regression to obtain accurate proposals. The RoI pooling layer collects the input feature maps and proposals, extracts the proposal feature maps after integrating this information, and computes position-sensitive score maps, which are fed into the subsequent detection network to determine the target class. Finally, the proposal feature maps are used to classify each proposal and obtain the final exact position of the detection box.
ResNet-101 contains 5 convolution blocks, 101 layers in total. The standard R-FCN uses the first 4 convolution blocks as the weight-sharing network of the RPN network and the detection network, with the 5th convolution block serving as the feature extraction network of the detection network. The present invention instead uses all 101 layers as the weight-sharing network of the RPN network and the detection network: the feature map output by the 5th convolution block is shared by both. This greatly reduces computation while maintaining accuracy.
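R-FCN's distinctive step, position-sensitive RoI pooling, can be sketched as follows: each of the k x k bins of a proposal pools only from its own score map, and the bin scores are then averaged ("voted") into a single class score. This is a simplified NumPy sketch (integer binning, a single class), not the patent's implementation.

```python
import numpy as np

def ps_roi_pool(score_maps, roi, k=3):
    # score_maps: (k*k, H, W) position-sensitive score maps for one class.
    # roi: (x1, y1, x2, y2) in feature-map coordinates.
    x1, y1, x2, y2 = roi
    bw, bh = (x2 - x1) / k, (y2 - y1) / k
    bins = np.zeros((k, k))
    for i in range(k):        # bin row
        for j in range(k):    # bin column
            ys = slice(int(y1 + i * bh), max(int(y1 + (i + 1) * bh), int(y1 + i * bh) + 1))
            xs = slice(int(x1 + j * bw), max(int(x1 + (j + 1) * bw), int(x1 + j * bw) + 1))
            # Bin (i, j) pools only from score map number i*k + j.
            bins[i, j] = score_maps[i * k + j, ys, xs].mean()
    return bins.mean()  # average voting over the k*k bins
```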
The network of the hand-raising detection model is shown in Fig. 4.
3) Perform hand-raising detection on the video under test with the trained hand-raising detection model to obtain the positions of the hand-raising boxes.
In certain embodiments, the method also includes step 4): according to the positions in the previous frame, track the hand-raising actions in the next frame, merging the same hand-raising action across different frames with a tracking algorithm. When the shot (camera view) does not change, the same hand-raising action in different frames can be tracked with the tracking algorithm. The tracking algorithm may use a backtracking-pruning method to optimally match the hand-raising actions of the previous frame with those of the next frame.
Step 4) is specifically:
401) obtain the first image frame and the detected hand-raising box coordinates; create one tracklet array for each hand-raising box and initialize its state to ALIVE;
402) obtain the next image frame and judge whether a shot change has occurred; if so, change the state of all tracklet arrays to DEAD, create new tracklet arrays, and return to step 402); if not, execute step 403);
403) traverse all hand-raising boxes detected in the current image frame and, using the backtracking-pruning method, select the best-matching tracklet array for each hand-raising box;
404) for each tracklet array not matched in the current image frame, judge whether its state is ALIVE; if so, change the state to WAIT; if not, change the state to DEAD; return to step 402) until all image frames have been processed.
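Steps 401-404 can be sketched as the following state machine. In this sketch, greedy IoU matching stands in for the backtracking-pruning matcher, and both the IoU threshold and the policy of opening a new tracklet for every unmatched detection are assumptions, not taken from the patent.

```python
ALIVE, WAIT, DEAD = "ALIVE", "WAIT", "DEAD"

def box_iou(a, b):
    # a, b: (x1, y1, x2, y2) rectangles.
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

class Tracklet:
    def __init__(self, box):
        self.boxes = [box]
        self.state = ALIVE

def step(tracklets, detections, shot_change, iou_thresh=0.3):
    """Process one image frame (steps 402-404)."""
    if shot_change:                     # step 402: kill everything, start over
        for t in tracklets:
            t.state = DEAD
        tracklets.extend(Tracklet(d) for d in detections)
        return tracklets
    live = [t for t in tracklets if t.state != DEAD]
    matched = set()
    for d in detections:                # step 403: best match per detection
        best, best_iou = None, iou_thresh
        for t in live:
            if id(t) in matched:
                continue
            v = box_iou(t.boxes[-1], d)
            if v > best_iou:
                best, best_iou = t, v
        if best is None:
            tracklets.append(Tracklet(d))  # unmatched detection: new tracklet
        else:
            best.boxes.append(d)
            best.state = ALIVE
            matched.add(id(best))
    for t in live:                      # step 404: demote unmatched tracklets
        if id(t) not in matched:
            t.state = WAIT if t.state == ALIVE else DEAD
    return tracklets
```

A tracklet that misses one frame moves ALIVE → WAIT, and a second miss moves it WAIT → DEAD, matching the two-strike policy of step 404.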
The pseudocode of the above process can be summarized as:
Require: a set of N input images and the hand-raising bounding boxes detected in each
Ensure: output tracklets
The merging process of hand-raising actions for a single image frame is shown in Fig. 5.
Video captured by a camera may undergo shot (view) changes; the present invention solves this with the frame-difference method, i.e. by subtracting successive frames. As shown in Fig. 6, judging whether a shot change has occurred is specifically:
obtain two adjacent image frames and count the number of pixels whose change between corresponding pixels of the two frames exceeds a first threshold; judge whether the number of changed pixels exceeds a second threshold; if so, a shot change is deemed to have occurred; otherwise, no shot change has occurred.
The concrete criterion is whether the white region (i.e. the moving part) exceeds 20% of all pixels; if it does, a shot switch has occurred.
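The frame-difference test can be sketched as follows. The per-pixel threshold of 30 grey levels is an assumed value (the text fixes only the 20% area criterion), and absolute grey-level difference stands in for the "rate of change" mentioned above.

```python
import numpy as np

def is_shot_change(prev, curr, pixel_thresh=30, area_ratio=0.2):
    """Return True when the fraction of changed pixels exceeds area_ratio."""
    diff = np.abs(curr.astype(np.int32) - prev.astype(np.int32))
    changed_fraction = (diff > pixel_thresh).mean()  # the "white" (moving) region
    return bool(changed_fraction > area_ratio)
```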
Based on the above merging process, the method may also include step:
5) counting the hand-raising actions after detection and merging.
Embodiment 1
This embodiment illustrates the above method with a primary and secondary school classroom environment. 40,000 samples were collected, and the hand-raising samples were made in the format of the PASCAL VOC data set. By clustering the sample sizes, the 9 anchor box sizes finally obtained are:
(37,59) (44,72) (53,80) (56,96) (67,105) (75,128) (91,150) (115,184) (177,283).
The training process in this embodiment iterated 20,000 times in total, yielding an effective hand-raising detection model. Part of the results of the trained hand-raising detection model is shown in Fig. 7.
After merging the hand-raising actions of different frames with the tracking algorithm, the quantities are counted and the frequency of hand-raising over the whole lesson is recorded, completing the count of raised hands for a lesson. This is used to assess classroom atmosphere and provides a basis for its intelligent analysis.
Experiments show that the above method achieves high hand-raising detection accuracy and recall: accuracy above 90% and recall above 70%.
The preferred embodiments of the present invention are described in detail above. It should be understood that those of ordinary skill in the art can make many modifications and variations according to the concept of the present invention without creative work. Therefore, any technical solution that can be obtained by those skilled in the art through logical analysis, reasoning or limited experimentation on the basis of the prior art under the concept of the present invention shall fall within the protection scope defined by the claims.

Claims (9)

1. A hand-raising detection method based on deep learning, characterized by comprising the following steps:
1) collecting samples, the samples being complex-environment samples;
2) building a hand-raising detection model, the model being based on a convolutional neural network structure and trained on the samples with the R-FCN object detection algorithm;
3) performing hand-raising detection on the video under test with the trained hand-raising detection model to obtain the positions of hand-raising boxes.
2. The hand-raising detection method based on deep learning according to claim 1, characterized in that in step 1) the number of samples is more than 30,000.
3. The hand-raising detection method based on deep learning according to claim 1, characterized in that step 1) further includes: saving sample information, the sample information including the video key frame images, the key frame image metadata, and the bounding-box coordinates of the hand-raising targets in the key frame images.
4. The hand-raising detection method based on deep learning according to claim 1, characterized in that step 1) further includes: clustering the sample sizes to obtain the template sizes needed for training.
5. The hand-raising detection method based on deep learning according to claim 1, characterized in that the convolutional neural network structure includes an intermediate-level fusion layer.
6. The hand-raising detection method based on deep learning according to claim 1, characterized in that the method further includes the step:
4) merging the same hand-raising action across different frames with a tracking algorithm.
7. The hand-raising detection method based on deep learning according to claim 6, characterized in that step 4) is specifically:
401) obtaining the first image frame and the detected hand-raising box coordinates, creating one tracklet array for each hand-raising box, and initializing its state to ALIVE;
402) obtaining the next image frame and judging whether a shot change has occurred; if so, changing the state of all tracklet arrays to DEAD, creating new tracklet arrays, and returning to step 402); if not, executing step 403);
403) traversing all hand-raising boxes detected in the current image frame and, using the tracking algorithm, selecting the best-matching tracklet array for each hand-raising box;
404) for each tracklet array not matched in the current image frame, judging whether its state is ALIVE; if so, changing the state to WAIT; if not, changing the state to DEAD; returning to step 402) until all image frames have been processed.
8. The hand-raising detection method based on deep learning according to claim 6, characterized in that judging whether a shot change has occurred is specifically:
obtaining two adjacent image frames and counting the number of pixels whose change between corresponding pixels of the two frames exceeds a first threshold; judging whether the number of changed pixels exceeds a second threshold; if so, judging that a shot change has occurred, and if not, that no shot change has occurred.
9. The hand-raising detection method based on deep learning according to claim 6, characterized in that the method further includes the step:
5) counting the hand-raising actions after detection and merging.
CN201711044722.7A 2017-10-31 2017-10-31 Hand raising detection method based on deep learning Expired - Fee Related CN107808376B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711044722.7A CN107808376B (en) 2017-10-31 2017-10-31 Hand raising detection method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711044722.7A CN107808376B (en) 2017-10-31 2017-10-31 Hand raising detection method based on deep learning

Publications (2)

Publication Number Publication Date
CN107808376A true CN107808376A (en) 2018-03-16
CN107808376B CN107808376B (en) 2022-03-11

Family

ID=61591064

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711044722.7A Expired - Fee Related CN107808376B (en) 2017-10-31 2017-10-31 Hand raising detection method based on deep learning

Country Status (1)

Country Link
CN (1) CN107808376B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108921748A (en) * 2018-07-17 2018-11-30 郑州大学体育学院 Didactic code method and computer-readable medium based on big data analysis
CN109508661A * 2018-10-31 2019-03-22 上海交通大学 Hand-raising person detection method based on object detection and pose estimation
CN110163836A * 2018-11-14 2019-08-23 宁波大学 Deep-learning-based excavator detection method for high-altitude inspection
CN110399822A * 2019-07-17 2019-11-01 思百达物联网科技(北京)有限公司 Hand-raising action recognition method, device and storage medium based on deep learning
CN110414380A * 2019-07-10 2019-11-05 上海交通大学 Student behavior detection method based on object detection
CN110941976A (en) * 2018-09-24 2020-03-31 天津大学 Student classroom behavior identification method based on convolutional neural network
CN112686128A (en) * 2020-12-28 2021-04-20 南京览众智能科技有限公司 Classroom desk detection method based on machine learning
CN116739859A (en) * 2023-08-15 2023-09-12 创而新(北京)教育科技有限公司 Method and system for on-line teaching question-answering interaction
CN117670259A (en) * 2024-01-31 2024-03-08 天津师范大学 Sample detection information management method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104112121A (en) * 2014-07-01 2014-10-22 深圳市欢创科技有限公司 Face recognition method and device for interactive game device, and interactive game system
CN106651765A (en) * 2016-12-30 2017-05-10 深圳市唯特视科技有限公司 Method for automatically generating thumbnails using a deep neural network
CN107122736A (en) * 2017-04-26 2017-09-01 北京邮电大学 Human body orientation prediction method and device based on deep learning
CN107145908A (en) * 2017-05-08 2017-09-08 江南大学 Small target detection method based on R-FCN
CN107273828A (en) * 2017-05-29 2017-10-20 浙江师范大学 Road sign detection method based on region-based fully convolutional networks


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
TIAGO S. NAZARÉ et al.: "Hand-Raising Gesture Detection with Lienhart-Maydt Method in Videoconference and Distance Learning", Springer *
SANG Nong et al.: "Gesture recognition based on R-FCN in complex scenes", Journal of Huazhong University of Science and Technology (Natural Science Edition) *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108921748B (en) * 2018-07-17 2022-02-01 郑州大学体育学院 Teaching planning method and computer-readable medium based on big data analysis
CN108921748A (en) * 2018-07-17 2018-11-30 郑州大学体育学院 Teaching planning method and computer-readable medium based on big data analysis
CN110941976A (en) * 2018-09-24 2020-03-31 天津大学 Student classroom behavior identification method based on convolutional neural network
CN109508661A (en) * 2018-10-31 2019-03-22 上海交通大学 Hand raiser detection method based on object detection and pose estimation
CN109508661B (en) * 2018-10-31 2021-07-09 上海交通大学 Hand raiser detection method based on object detection and pose estimation
CN110163836B (en) * 2018-11-14 2021-04-06 宁波大学 Excavator detection method for high-altitude inspection based on deep learning
CN110163836A (en) * 2018-11-14 2019-08-23 宁波大学 Excavator detection method for high-altitude inspection based on deep learning
CN110414380A (en) * 2019-07-10 2019-11-05 上海交通大学 Student behavior detection method based on object detection
CN110399822A (en) * 2019-07-17 2019-11-01 思百达物联网科技(北京)有限公司 Hand raising action recognition method, device and storage medium based on deep learning
CN112686128A (en) * 2020-12-28 2021-04-20 南京览众智能科技有限公司 Classroom desk detection method based on machine learning
CN116739859A (en) * 2023-08-15 2023-09-12 创而新(北京)教育科技有限公司 Method and system for on-line teaching question-answering interaction
CN117670259A (en) * 2024-01-31 2024-03-08 天津师范大学 Sample detection information management method
CN117670259B (en) * 2024-01-31 2024-04-19 天津师范大学 Sample detection information management method

Also Published As

Publication number Publication date
CN107808376B (en) 2022-03-11

Similar Documents

Publication Publication Date Title
CN107808376A (en) A kind of detection method of raising one's hand based on deep learning
Wu et al. Recent advances in video-based human action recognition using deep learning: A review
CN108334848B (en) Tiny face recognition method based on generation countermeasure network
Li et al. Robust visual tracking based on convolutional features with illumination and occlusion handing
CN108062525B (en) Deep learning hand detection method based on hand region prediction
CN107316058A Method for improving target detection performance by improving target classification and localization accuracy
CN105160310A (en) 3D (three-dimensional) convolutional neural network based human body behavior recognition method
CN107481264A Scale-adaptive video target tracking method
CN114220035A (en) Rapid pest detection method based on improved YOLO V4
CN109816689A Moving target tracking method based on adaptive fusion of multi-layer convolutional features
CN107808143A (en) Dynamic gesture identification method based on computer vision
CN107945153A Road surface crack detection method based on deep learning
CN105740758A (en) Internet video face recognition method based on deep learning
CN107292246A Infrared human body target recognition method based on HOG-PCA and transfer learning
Li et al. Sign language recognition based on computer vision
CN106650619A (en) Human action recognition method
CN114241548A (en) Small target detection algorithm based on improved YOLOv5
CN103400122A Rapid living-body face recognition method
CN108664838A End-to-end pedestrian detection method for surveillance scenes based on improved RPN deep network
CN110163567A (en) Classroom roll calling system based on multitask concatenated convolutional neural network
Yang et al. Facial expression recognition based on dual-feature fusion and improved random forest classifier
CN108171133A Dynamic gesture recognition method based on feature covariance matrix
CN114241422A (en) Student classroom behavior detection method based on ESRGAN and improved YOLOv5s
Tanisik et al. Facial descriptors for human interaction recognition in still images
CN109360179A Image fusion method, device and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20220311