CN112906438B - Human body action behavior prediction method and computer equipment


Publication number
CN112906438B
Authority
CN
China
Prior art keywords
characteristic
action behavior
human
frame
behavior
Prior art date
Legal status: Active
Application number
CN201911224818.0A
Other languages
Chinese (zh)
Other versions
CN112906438A
Inventor
李建军 (Li Jianjun)
李轲赛 (Li Kesai)
刘慧婷 (Liu Huiting)
张宝华 (Zhang Baohua)
张超 (Zhang Chao)
Current Assignee
Inner Mongolia University of Science and Technology
Original Assignee
Inner Mongolia University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Inner Mongolia University of Science and Technology filed Critical Inner Mongolia University of Science and Technology
Priority to CN201911224818.0A
Publication of CN112906438A
Application granted
Publication of CN112906438B

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 - Movements or behaviour, e.g. gesture recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches, based on the proximity to a decision surface, e.g. support vector machines
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 - Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V10/464 - Salient features, e.g. scale invariant feature transforms [SIFT], using a plurality of salient features, e.g. bag-of-words [BoW] representations
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/50 - Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis


Abstract

The invention provides a human body action behavior prediction method and a computer device. The prediction method comprises the following steps: sampling the image information of a human body action period in frames to obtain an image sequence with a reduced frame count; for each frame of the image sequence, in frame order, establishing a skeleton model and determining the actual values of a plurality of characteristic angles of the skeleton model; comparing the actual values of the characteristic angles of the skeleton model with the reference values of the characteristic angles corresponding to a plurality of predetermined human action behavior categories obtained in advance through training and learning; and determining the human action behavior category to which each frame of image belongs according to the comparison result. The invention can improve the recognition accuracy of human action behavior prediction.

Description

Human body action behavior prediction method and computer equipment
Technical Field
The present invention relates to the field of computer vision technology and pattern recognition, and in particular, to a method for predicting human motion behavior and a computer device.
Background
The rapid development of artificial intelligence has led to the widespread use of computer vision technology in fields such as video surveillance, motion retrieval, human-computer interaction, smart homes and healthcare. Human behavior recognition has achieved certain results and can accurately recognize and classify simple human behaviors such as walking, running, bending and waving. During action behavior detection, users often want to retrieve what they need quickly and avoid unnecessary waste of time, so retrieving action behavior key frames in video sequences is of great significance, and prediction of human action behavior is one of its important foundations.
Human action behavior prediction estimates the category of an action as early as possible, before the whole action has been completed; compared with traditional human action recognition, the temporal structure of the complete action is missing. Ryoo et al. extracted mid-level features related to the temporal structure of actions through two improved bag-of-words models and were the first to pose the action prediction problem. Kong et al. proposed the concept of multiple temporal scales: a video is divided into a fixed number of segments, dynamic behavior features are extracted from the segments one by one, and the ability to discriminate incomplete actions is obtained by constraining the temporal relations of the actions. Xu et al. proposed an automatic action completion method that borrows the auto-completion idea from information retrieval: each divided video segment corresponds to a single character of a sentence, the incomplete video forms the prefix of the sentence, and the actions in the video are predicted through similarity. Known human action behavior prediction algorithms have relatively low recognition accuracy.
Disclosure of Invention
In view of the above, the present invention provides a method for predicting human motion behavior and a computer device, so as to improve recognition accuracy of human motion behavior prediction.
In one aspect, the present invention provides a method for predicting human action behavior, including: sampling the image information of a human body action period in frames to obtain an image sequence with a reduced frame count; for each frame of the image sequence, in frame order, establishing a skeleton model and determining the actual values of a plurality of characteristic angles of the skeleton model; comparing the actual values of the characteristic angles of the skeleton model with the reference values of the characteristic angles corresponding to a plurality of predetermined human action behavior categories obtained in advance through training and learning; and determining the human action behavior category to which each frame of image belongs according to the comparison result.
Further, building a skeleton model for each frame of image and determining the actual values of a plurality of characteristic angles of the skeleton model includes: establishing a skeleton model for each frame of image, wherein the skeleton model comprises an upper limb part, a lower limb part and a middle part; the upper limb part includes: left hand, left elbow, left shoulder, right hand, right elbow and right shoulder; the lower limb part includes: left foot, left knee, right foot and right knee; and the middle part comprises the crotch;
respectively determining the actual value of the 1st characteristic angle formed by the left hand, left elbow and left shoulder, the actual value of the 2nd characteristic angle formed by the left elbow, left shoulder and crotch, the actual value of the 3rd characteristic angle formed by the right hand, right elbow and right shoulder, the actual value of the 4th characteristic angle formed by the right elbow, right shoulder and crotch, the actual value of the 5th characteristic angle formed by the left foot, left knee and crotch, the actual value of the 6th characteristic angle formed by the right foot, right knee and crotch, and the actual value of the 7th characteristic angle formed by the left knee, crotch and right knee; wherein the 1st to 4th characteristic angles originate from the upper limb part, the 5th and 6th characteristic angles originate from the lower limb part, and the 7th characteristic angle originates from the middle part.
Further, comparing the actual values of the plurality of characteristic angles of the skeleton model with the reference values of the plurality of characteristic angles corresponding to the predetermined human action behavior categories obtained in advance through training and learning includes: sequentially judging whether the actual value of the nth characteristic angle is within a preset deviation range of the reference value of the nth characteristic angle corresponding to a predetermined human action behavior category, where n is an integer between 1 and 7; and counting, for each frame of image, the number of characteristic angles within the preset deviation ranges of the reference values of the 1st to 7th characteristic angles of a specific human action behavior; the number is the comparison result, and the specific human action behavior is any one of the predetermined human action behavior categories.
Further, determining the human action behavior category to which each frame of image belongs according to the comparison result includes: if the number is greater than or equal to 5, preliminarily determining the human action behavior category to which the image of the corresponding frame belongs to be the specific human action behavior; and, for each frame of image, if the number of characteristic angles that are not within the preset deviation ranges of the reference values of the characteristic angles of the specific human action behavior is two, and those two characteristic angles each originate from a different one of the upper limb part, the lower limb part and the middle part, finally determining the human action behavior category to which the image of the corresponding frame belongs to be the specific human action behavior.
Further, the skeleton model is a tree-like structure model of left hand, left elbow, left shoulder, right hand, right elbow, right shoulder, left foot, left knee, right foot, right knee and crotch connection.
Further, the preset deviation range of the reference value of the nth characteristic angle is a range of ±5 degrees around the reference value of the nth characteristic angle.
Further, the reference values of the plurality of characteristic angles corresponding to the predetermined human action behavior categories are obtained in advance through training and learning, specifically by: inputting an action behavior training and learning data set; extracting feature information from the training and learning data set; inputting the feature information into a support vector machine for classification to obtain the action behavior category of each piece of training and learning data, each such category being a predetermined human action behavior category; and determining the reference values of the plurality of characteristic angles of each predetermined action behavior category according to the action behavior training and learning data corresponding to that category.
Further, sampling the human action period image information in frames to obtain the image sequence with a reduced frame count includes: merging the first five frames of the human action period image information into one frame, sampling at intervals of ten frames, and merging each sampled group of five frames, so as to obtain the image sequence with a reduced frame count.
On the other hand, the invention also provides computer equipment, which comprises a processor, wherein the processor is used for realizing the prediction method of the human body action behavior according to the above.
In yet another aspect, the present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the above-described human action behavior prediction method.
According to the human action behavior prediction method and the computer device of the present invention, sampling the human action period image information in frames to obtain an image sequence with a reduced frame count reduces the computational load. For each frame of the image sequence, in frame order, a skeleton model is established and the actual values of a plurality of characteristic angles of the skeleton model are determined; the actual values are compared with the reference values of the characteristic angles corresponding to the predetermined human action behavior categories obtained in advance through training and learning; and the human action behavior category to which each frame of image belongs is determined according to the comparison result. Because accurate reference values of the characteristic angles are determined in advance through training and learning, and behavior prediction is then performed from the comparison against those reference values, the recognition accuracy of human action behavior prediction is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a method of predicting human action behavior according to an exemplary first embodiment of the present invention;
fig. 2 is a flowchart of a method of predicting human action behavior according to an exemplary second embodiment of the present invention;
FIG. 3 is a flowchart of the SVM classification in FIG. 2 that obtains each predetermined action behavior category;
FIG. 4 is a flowchart of predicting the human action behavior class to which each frame of image belongs in FIG. 2;
FIGS. 5a and 5b are schematic diagrams of skeleton refinement of frames 15 and 25;
FIG. 6 is a schematic diagram of a skeleton model created for each frame of image;
fig. 7 is a schematic structural view of a computer device according to an exemplary third embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
It should be noted that, without conflict, the following embodiments and features in the embodiments may be combined with each other; and, based on the embodiments in this disclosure, all other embodiments that may be made by one of ordinary skill in the art without inventive effort are within the scope of the present disclosure.
It is noted that various aspects of the embodiments are described below within the scope of the following claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the present disclosure, one skilled in the art will appreciate that one aspect described herein may be implemented independently of any other aspect, and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. In addition, such apparatus may be implemented and/or such methods practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.
As shown in fig. 1, the method for predicting human action behavior of the present invention includes:
step 101: frame-dividing and sampling the human body action period image information to obtain an image sequence with reduced frame number;
step 102: for the images of the plurality of frames of the image sequence, in frame order, performing the following operations:
step 102a: establishing a skeleton model for each frame of image, and determining actual values of a plurality of characteristic angles of the skeleton model;
step 102b: comparing the actual values of the plurality of characteristic angles of the skeleton model with the reference values of the plurality of characteristic angles corresponding to the predetermined human action behavior categories obtained in advance through training and learning;
step 102c: and determining the human action behavior category of each frame of image according to the comparison result.
In this embodiment, sampling the human action period image information in frames to obtain an image sequence with a reduced frame count reduces the computational load. For each frame of the image sequence, in frame order, a skeleton model is established and the actual values of a plurality of characteristic angles of the skeleton model are determined; the actual values are compared with the reference values of the characteristic angles corresponding to the predetermined human action behavior categories obtained in advance through training and learning; and the human action behavior category to which each frame of image belongs is determined according to the comparison result. Because accurate reference values of the characteristic angles are determined in advance through training and learning, and behavior prediction is then performed from the comparison against those reference values, the recognition accuracy of human action behavior prediction is improved.
Fig. 2 shows a preferred implementation of the human action behavior prediction method of the present invention. This embodiment mainly includes two processes: the first obtains, through prior training and learning, the reference values of the plurality of characteristic angles corresponding to the predetermined human action behavior categories, and corresponds to steps 21-27 in fig. 2; the second recognizes the image information to be predicted according to those reference values, and corresponds to steps 28-32. Referring to figs. 2 to 6, the human action behavior prediction method of this embodiment includes:
First, the reference values of the plurality of characteristic angles corresponding to the predetermined human action behavior categories are obtained by modeling in combination with an SVM. In this embodiment, daily human action behaviors are first recognized by the SVM from a large number of published action behavior data sets, yielding the predetermined human action behavior categories. Because the SVM recognizes the various daily human actions with high accuracy, feature modeling can be performed on each predetermined category according to the SVM recognition results to obtain the reference values of the characteristic angles corresponding to each predetermined category.
The specific details of recognizing the predetermined human action behavior categories with the SVM are shown in steps 21 to 24 below, and can also be seen in steps 301 to 304 of the flow shown in fig. 3;
step 21: inputting human motion cycle image information including, but not limited to, video;
step 22: preprocessing, such as conventional signal processing (filtering, denoising, and the like), which is not expanded upon here;
step 23: extracting feature information from the human action period image information; specifically, features can be extracted from the action behavior using the HOG (Histogram of Oriented Gradients) method: the horizontal and vertical gradients of each pixel in the image are computed, the gradient magnitude and gradient direction are obtained, and features are extracted with histograms. For example, the horizontal and vertical gradients at pixel (x, y) are:
D_x(x,y) = H(x+1,y) - H(x-1,y);
D_y(x,y) = H(x,y+1) - H(x,y-1);
where H(x,y) is the pixel value of the input image at point (x,y). The gradient magnitude and gradient direction at pixel (x,y) are then:
G(x,y) = sqrt(D_x(x,y)^2 + D_y(x,y)^2);
α(x,y) = tan^(-1)(D_y(x,y) / D_x(x,y));
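For illustration only, the gradient computation above can be sketched in a few lines of Python with NumPy. This is an editorial sketch, not part of the patent disclosure; the function name and the zero-padded border handling are assumptions.

```python
import numpy as np

def gradient_features(H):
    """Per-pixel gradients of a 2-D grayscale image H, following the
    formulas above; border pixels are left at zero for simplicity."""
    H = H.astype(float)
    Dx = np.zeros_like(H)
    Dy = np.zeros_like(H)
    Dx[1:-1, :] = H[2:, :] - H[:-2, :]   # D_x(x,y) = H(x+1,y) - H(x-1,y)
    Dy[:, 1:-1] = H[:, 2:] - H[:, :-2]   # D_y(x,y) = H(x,y+1) - H(x,y-1)
    G = np.sqrt(Dx**2 + Dy**2)           # gradient magnitude G(x,y)
    alpha = np.arctan2(Dy, Dx)           # gradient direction alpha(x,y)
    return G, alpha
```

A HOG descriptor is then built by accumulating the magnitudes G into orientation histograms over local cells; that binning step is omitted here.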
step 24: the feature information is input into a support vector machine (SVM) for action behavior classification, obtaining the classification results for the predetermined action behavior categories.
The SVM converts the maximum geometric margin problem into a convex optimization problem to solve; this embodiment constructs a multi-class classifier from several binary classifiers to classify the multiple human action behaviors. The recognition results of the SVM on the KTH and Weizmann action behavior data sets are as follows:
[Table: SVM recognition accuracy on the KTH and Weizmann data sets; the numerical values appear only as an image in the source.]
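As an editorial illustration of this classification stage, the sketch below uses scikit-learn, whose SVC trains one binary SVM per pair of classes (one-vs-one), matching the construction of a multi-class classifier from several binary classifiers described here. The data loading and HOG extraction are assumed to happen elsewhere; names are illustrative.

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def train_action_classifier(X, y):
    """X: (n_samples, n_features) HOG descriptors; y: action labels
    such as "run" or "walk". SVC builds its multi-class decision from
    pairwise binary SVMs (one-vs-one)."""
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    clf.fit(X, y)
    return clf

# Usage, assuming HOG features for a training and a test split:
# clf = train_action_classifier(train_hog, train_labels)
# predicted = clf.predict(test_hog)
```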
step 25: frame extraction is carried out on the action behavior sequence with correct classification;
considering that the motion training often has a data redundancy phenomenon due to high-speed acquisition, the embodiment preferably adopts a processing method of sampling and framing to rearrange the motion sequence, reduces the operand of subsequent work and reduces the complexity of the process. The first five frames can be fused into one frame, ten frames are sampled every interval, the sampled five frames are fused, and finally, the behavior action sequence is divided into image sequences with fewer frames; for example, a motion behavior video is divided into a motion sequence of 750 frames, each 15 frames are divided into 50 basic units, the first 5 frames of each basic unit are sampled respectively and fused into 1 frame image, and the sequence after being regulated is a new sequence containing 50 frames of images.
Step 26: performing skeleton thinning (shown in fig. 6) on the human body appearance model in the image (such as the models shown in figs. 5a and 5b, where the left image in each of figs. 5a and 5b is the two-dimensional image and the right image is the image skeleton);
considering that the human body is a non-rigid structure, for the execution of action behaviors, the most remarkable action characteristics are mainly concentrated on the movement of the limbs of the human body, the angles of the movement of the limbs of the human body also change along with time in the action execution process, and meanwhile, different angle changes have certain relevance, such as certain relevance between the bending angle of the knee and the angles of the hands and the shoulders during running; there is also a certain correlation between the angle at the hand and shoulder and the bending angle of the elbow, etc. The present embodiment extracts the angle change of the human limbs as a main angle feature. Specifically, as shown in fig. 6, the skeleton-thinned image may divide the human skeleton into three parts according to a graphic structure: an upper limb portion, a lower limb portion, and a middle portion, the upper limb portion comprising: left hand, left elbow, left shoulder, right hand, right elbow, right shoulder; the lower limb portion includes: left foot, left knee, right foot, right knee; the middle part is crotch part;
step 27: calculating the characteristic angles to complete the modeling;
First, the angle features of the individual joints are extracted; these are the seven characteristic angles: left hand-left elbow-left shoulder, left elbow-left shoulder-crotch, right hand-right elbow-right shoulder, right elbow-right shoulder-crotch, left foot-left knee-crotch, right foot-right knee-crotch, and left knee-crotch-right knee;
secondly, the threshold range of each characteristic angle is set, a human skeleton angle feature model of the action behavior is established, and the angle feature models of the various action behaviors are established in turn;
Through a large number of experiments, the threshold setting of each selected characteristic angle and the allowable error range were obtained; a deviation of ±5 degrees was found to be optimal. The angle thresholds and allowable errors for the running action behavior are as follows:
[Table: angle thresholds and allowable errors for the running action; the numerical values appear only as an image in the source.]
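Each of the seven characteristic angles is an ordinary three-point joint angle; a minimal sketch follows, assuming 2-D joint coordinates located on the thinned skeleton (the joint names and their ordering are illustrative):

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle in degrees at joint b, formed by the segments b->a and b->c."""
    a, b, c = map(np.asarray, (a, b, c))
    v1, v2 = a - b, c - b
    cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return float(np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0))))

# The seven characteristic angles, as (point, vertex, point) triples.
ANGLE_TRIPLES = [
    ("left_hand", "left_elbow", "left_shoulder"),     # 1
    ("left_elbow", "left_shoulder", "crotch"),        # 2
    ("right_hand", "right_elbow", "right_shoulder"),  # 3
    ("right_elbow", "right_shoulder", "crotch"),      # 4
    ("left_foot", "left_knee", "crotch"),             # 5
    ("right_foot", "right_knee", "crotch"),           # 6
    ("left_knee", "crotch", "right_knee"),            # 7
]

def characteristic_angles(joints):
    """`joints` maps joint names to (x, y) coordinates."""
    return [joint_angle(joints[a], joints[b], joints[c])
            for a, b, c in ANGLE_TRIPLES]
```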
after the reference value of the characteristic angle of each action behavior is obtained by modeling in the above-described steps 21 to 27, the magnitude of the characteristic angle of each frame image is calculated for each data to be tested by the following steps 28 to 210, wherein the steps 28 to 30 basically correspond to the processing of the steps 25 to 207. The flow of steps 28-31 can be seen in the detailed flow of fig. 4.
Step 28: frame-dividing and sampling the human body action period image information to obtain an image sequence with reduced frame number;
step 29: building a skeleton model (refinement process) for each frame of image; the skeleton model comprises an upper limb part, a lower limb part and a middle part; the upper limb portion includes: left hand, left elbow, left shoulder, right hand, right elbow, and right shoulder; the lower limb portion includes: left foot, left knee, right foot, right knee; the intermediate portion comprising a crotch portion;
step 30: calculating the characteristic angle, namely respectively determining the actual value of the 1 st characteristic angle formed by the left hand, the left elbow and the left shoulder, the actual value of the 2 nd characteristic angle formed by the left elbow, the left shoulder and the crotch, the actual value of the 3 rd characteristic angle formed by the right hand, the right elbow and the right shoulder, the actual value of the 4 th characteristic angle formed by the right elbow, the right shoulder and the crotch, the actual value of the 5 th characteristic angle formed by the left foot, the left knee and the crotch, the actual value of the 6 th characteristic angle formed by the right foot, the right knee and the crotch, and the actual value of the 7 th characteristic angle formed by the left knee, the crotch and the right knee;
wherein the 1 st to 4 th characteristic angles originate from the upper limb portion, the 5 th to 6 th characteristic angles originate from the lower limb portion, and the 7 th characteristic angle originates from the intermediate portion.
Step 31: according to the reference value determined in step 27, an error range between the actual value of each characteristic angle and the reference value is determined. Namely: sequentially judging whether the actual value of the nth characteristic angle is within a preset deviation range of a reference value of the nth characteristic angle corresponding to a preset human body action behavior category, wherein n is an integer between 1 and 7;
counting, for each frame of image, the number of characteristic angles within the preset deviation ranges of the reference values of the 1st to 7th characteristic angles of a specific human action behavior; the number is the comparison result, and the specific human action behavior is any one of the predetermined human action behavior categories.
Step 32: and performing action pre-judgment according to a preset rule, and then performing corresponding human action prediction on the next frame of image.
Action behavior prediction can be performed by template matching. The specific steps are: first, coarse judgment, in which the extracted angle features of the sequence under test are compared with the template one by one, and if 5 of the 7 features are within the threshold variation range the action behavior is preliminarily predicted; second, fine judgment, in which angle feature matching is performed separately for the upper limb part, the lower limb part and the middle part, and if the local overall angle features are within the threshold variation range the action behavior is accurately predicted. Coarse judgment compares only the 7 individual angle features; fine judgment builds on it by adding the angle judgment of the locally combined overall features.
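A sketch of this two-stage decision, under one reading of the rule here and in claim 1 (at most two angles may miss the tolerance, and they must fall in different body parts); the grouping indices and the ±5 degree tolerance follow the description, while the helper names are illustrative:

```python
# 0-based indices of the seven angles belonging to each body part.
PART_GROUPS = {"upper": [0, 1, 2, 3], "lower": [4, 5], "middle": [6]}
TOL = 5.0  # allowed deviation of each angle from its reference, in degrees

def coarse_match(actual, reference, tol=TOL):
    """Coarse judgment: at least 5 of the 7 angles within tolerance."""
    hits = [abs(a - r) <= tol for a, r in zip(actual, reference)]
    return sum(hits) >= 5, hits

def fine_match(actual, reference, tol=TOL):
    """Fine judgment on top of the coarse pass: the (at most two)
    out-of-tolerance angles must come from different body parts."""
    ok, hits = coarse_match(actual, reference, tol)
    if not ok:
        return False
    misses = [i for i, hit in enumerate(hits) if not hit]
    parts = {part for i in misses
             for part, idxs in PART_GROUPS.items() if i in idxs}
    return len(parts) == len(misses)  # each miss in a distinct part
```

A frame would then be assigned the category of the template whose reference angles pass the fine match.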
In this embodiment, the pre-judgment effect is shown using the running action as an example. First, the images of the 10th, 15th, 25th, 35th, 45th and 50th frames of the action period are extracted, and skeleton thinning is applied to them to obtain the skeleton models. Figs. 5a and 5b show key-frame skeleton thinning results: fig. 5a is the thinned skeleton of the 15th frame, and fig. 5b is that of the 25th frame. The extracted characteristic angles are as follows:
[Table: characteristic angles extracted from the selected frames; the numerical values appear only as an image in the source.]
fine judgment: namely, the local overall characteristic angles are compared, and three local integers are arranged in the embodiment: the first local overall angle change comparison is performed by taking the left hand-elbow-left shoulder, left elbow-left shoulder-middle (namely crotch), right hand-elbow-right shoulder, right elbow-right shoulder-middle as a whole, namely an upper limb part; taking the middle of the left foot-left knee-and the middle of the right foot-right knee-as a whole, namely a lower limb part, and comparing the second local whole angle change; the crotch angle variation comparison is performed with the left knee-middle-right knee as a whole, i.e., the center portion. In the fine judgment, three local overall angle changes are all in a threshold range and serve as constraint conditions, the pre-judgment of the action behavior can be achieved according to requirements, and meanwhile action key frames in a video sequence can be determined.
Simple daily human actions were selected from public action behavior data sets, and experimental pre-judgment was performed on 6 types of daily actions (running, walking, clapping, bending, waving and waving upward); the results are shown in the following table.
[Table: pre-judgment results for the 6 action classes; the numerical values appear only as an image in the source.]
Running and walking show better pre-judgment effects than the other 4 types of actions: the specific action can be fully determined within half of the completed period, whereas clapping, bending, waving and waving upward are only clearly pre-judged after most of the behavior period has been completed. There are two reasons for this. First, running and walking involve a larger range of limb movement than the other 4 types of behavior, so their characteristic angle changes are more obvious. Second, the region where the pre-judgment effect of running and walking is obvious is concentrated around the half-period point, because the moment when the action is half completed is the moment of greatest change, where the characteristic angles vary most markedly.
In summary, the human action behaviors in the selected data set are first recognized and classified, using HOG feature extraction and an SVM classifier to obtain a correct action behavior classification. A skeleton model is then built for each type of action behavior. The specific steps include: sampling frames at intervals within the action period of the behavior and reconstructing its data set; processing the targets in the images with skeleton-thinning image processing; extracting the 7 individual angle features; setting the variation threshold of each characteristic angle; establishing the skeleton model; and completing model training. The characteristic angles of the action behavior data set under test are then extracted, and two rounds of judgment, by individual angles and by local overall angles, achieve the pre-judgment goal and determine the action behavior key frames of this embodiment. The human skeleton model provided by this embodiment can effectively represent human action postures with a low feature dimension; the interval-sampling fusion-frame algorithm removes redundant information well, so key actions are described more effectively and the computation is lower than extracting features frame by frame as in existing schemes; and the two-stage angle feature template matching further improves the classification accuracy of the action behaviors and the precision of the pre-judged action key frames.
As shown in fig. 7, an embodiment of a computer device includes a processor which, when executing a program, implements the human action behavior prediction method shown in fig. 1 or fig. 2 above. The method shown in fig. 2 specifically includes:
(1) An action behavior public data set is selected for action recognition, and the action behaviors are correctly classified by an SVM classifier using HOG feature extraction;
(2) The correctly classified action sequences are first extracted in frames; in this embodiment, interval frame division is used: the first five frames are fused into one frame, sampling is performed every ten frames, the sampled five frames are fused, and the action sequence is finally divided into an image sequence with fewer frames;
(3) Skeleton thinning is then performed on the human body appearance model in the image;
(4) In the thinned skeleton image, the human skeleton is divided into three parts according to its graph structure: an upper limb part, a lower limb part and a middle part. The upper limb part includes: left hand, left elbow, left shoulder, right hand, right elbow, right shoulder; the lower limb part includes: left foot, left knee, right foot, right knee; the middle part is the crotch;
(5) Individual joint angle features are extracted, specifically the seven characteristic angles: left hand-left elbow-left shoulder, left elbow-left shoulder-middle, right hand-right elbow-right shoulder, right elbow-right shoulder-middle, left foot-left knee-middle, right foot-right knee-middle, and left knee-middle-right knee;
(6) The threshold range of each characteristic angle is set, a human skeleton angle feature model of the action behavior is established, and the angle feature models of the various action behaviors are established in turn;
(7) The test sequence is divided and its angle features are extracted using the methods of steps (2) to (6);
(8) Action behavior prediction is performed by template matching, specifically: first, coarse judgment, in which the extracted angle features of the sequence under test are compared with the template, and if 5 of the 7 features are within the threshold variation range the action behavior is preliminarily predicted; second, fine judgment, in which angle feature matching is performed separately for the upper limb part, the lower limb part and the middle part, and if the local angle features are within the threshold variation range the action behavior is accurately predicted;
(9) The action behavior key frames are determined through the action behavior prediction step.
This embodiment can retrieve action key frames and pre-judge the classes of simple daily human actions with high classification accuracy. Based on the human skeleton information extracted from the video sequence, action key frames are retrieved and the action behavior at subsequent moments is pre-judged through a multi-angle feature matching model.
The present invention also provides a computer-readable storage medium embodiment storing a computer program which, when executed by a processor, implements the above-described human action behavior prediction method.
This computer-readable storage medium embodiment has technical effects corresponding to those of the human action behavior prediction method embodiment above, which are not repeated here.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any changes or substitutions easily contemplated by those skilled in the art within the scope of the present invention should be included in the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (8)

1. A method for predicting human action behavior, comprising:
frame-dividing and sampling the human body action period image information to obtain an image sequence with reduced frame number;
for the images of the plurality of frames of the image sequence, in frame order, performing the following operations:
establishing a skeleton model for each frame of image, and determining actual values of a plurality of characteristic angles of the skeleton model;
according to the reference values of the plurality of characteristic angles corresponding to the plurality of preset human action behavior categories obtained through pre-training and learning, the actual values of the plurality of characteristic angles of the skeleton model are compared with the reference values of the plurality of characteristic angles corresponding to the preset human action behavior categories, and the method comprises the following steps:
sequentially judging whether the actual value of the nth characteristic angle is within a preset deviation range of a reference value of the nth characteristic angle corresponding to a preset human body action behavior category, wherein n is an integer between 1 and 7;
counting the number of the characteristic angles within a preset deviation range of the reference values of the 1st to 7th characteristic angles of the specific human action behavior for each frame of image; the number is the comparison result; the specific human action behavior is any one of the predetermined human action behavior categories;
determining the human action behavior category of each frame of image according to the comparison result, wherein the human action behavior category comprises:
if the number is greater than or equal to 5, preliminarily determining the human body action behavior category to which the image of the corresponding frame belongs as the specific human body action behavior;
for each frame of image, if the number of the characteristic angles that are not within the preset deviation ranges of the reference values of the characteristic angles of the specific human action behavior is two, and those two characteristic angles each originate from a different one of the upper limb part, the lower limb part and the middle part, finally determining the human action behavior category to which the image of the corresponding frame belongs to be the specific human action behavior.
2. The method of predicting human action behavior according to claim 1, wherein building a skeleton model for each frame of images in the plurality of frames and determining actual values of a plurality of feature angles of the skeleton model comprises:
establishing a skeleton model for each frame of image, wherein the skeleton model comprises an upper limb part, a lower limb part and a middle part; the upper limb portion includes: left hand, left elbow, left shoulder, right hand, right elbow, and right shoulder; the lower limb portion includes: left foot, left knee, right foot, right knee; the intermediate portion comprising a crotch portion;
respectively determining the actual value of the 1st characteristic angle formed by the left hand, left elbow and left shoulder, the actual value of the 2nd characteristic angle formed by the left elbow, left shoulder and crotch, the actual value of the 3rd characteristic angle formed by the right hand, right elbow and right shoulder, the actual value of the 4th characteristic angle formed by the right elbow, right shoulder and crotch, the actual value of the 5th characteristic angle formed by the left foot, left knee and crotch, the actual value of the 6th characteristic angle formed by the right foot, right knee and crotch, and the actual value of the 7th characteristic angle formed by the left knee, crotch and right knee;
wherein the 1st to 4th characteristic angles originate from the upper limb part, the 5th and 6th characteristic angles originate from the lower limb part, and the 7th characteristic angle originates from the middle part.
3. The method of claim 1 or 2, wherein the skeletal model is a tree model of left hand, left elbow, left shoulder, right hand, right elbow, right shoulder, left foot, left knee, right foot, right knee, and crotch connection.
4. The prediction method of human motion behavior according to claim 1, wherein the preset deviation range of the reference value of the nth characteristic angle is a range of ±5 degrees of the reference value of the nth characteristic angle.
5. The method for predicting human action behavior according to claim 1, wherein the reference values of the plurality of feature angles corresponding to the predetermined human action behavior class obtained by training and learning in advance are obtained by:
inputting an action training learning data set;
extracting characteristic information of the training learning data set;
inputting the characteristic information into a support vector machine for classification to obtain action behavior categories of each training learning data; the action behavior type of each training learning data is a preset human action behavior type;
and determining reference values of a plurality of characteristic angles of the corresponding preset action behavior categories according to the action behavior training learning data corresponding to each preset action behavior category.
6. The method for predicting human motion behavior according to claim 1, wherein the step of sampling the human motion cycle image information in frames to obtain the image sequence with reduced frame number comprises:
and merging the first five frames into one frame according to the human motion period image information, sampling every ten frames, and merging the sampled five frames to obtain an image sequence with reduced frame number.
7. A computer device comprising a processor which, when executing a program, implements the method of predicting human action behavior according to any one of claims 1 to 6.
8. A computer readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the method of predicting human action behavior according to any one of claims 1 to 6.
CN201911224818.0A (filed 2019-12-04, priority 2019-12-04): Human body action behavior prediction method and computer equipment. Granted as CN112906438B; status: Active.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201911224818.0A | 2019-12-04 | 2019-12-04 | Human body action behavior prediction method and computer equipment


Publications (2)

Publication Number | Publication Date
CN112906438A | 2021-06-04
CN112906438B | 2023-05-02

Family

ID=76104735

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201911224818.0A (Active) | Human body action behavior prediction method and computer equipment | 2019-12-04 | 2019-12-04

Country Status (1)

CN - CN112906438B

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102129551A (en) * 2010-02-16 2011-07-20 微软公司 Gesture detection based on joint skipping
CN105320944A (en) * 2015-10-24 2016-02-10 西安电子科技大学 Human body behavior prediction method based on human body skeleton movement information
CN107122752A (en) * 2017-05-05 2017-09-01 北京工业大学 A kind of human action comparison method and device
CN108416251A (en) * 2018-01-08 2018-08-17 中国矿业大学 Efficient human motion recognition method based on quantum genetic algorithm optimization
CN108764120A (en) * 2018-05-24 2018-11-06 杭州师范大学 A kind of human body specification action evaluation method
CN109101864A (en) * 2018-04-18 2018-12-28 长春理工大学 The upper half of human body action identification method returned based on key frame and random forest
CN109409384A (en) * 2018-09-30 2019-03-01 内蒙古科技大学 Image-recognizing method, device, medium and equipment based on fine granularity image
CN110414453A (en) * 2019-07-31 2019-11-05 电子科技大学成都学院 Human body action state monitoring method under a kind of multiple perspective based on machine vision

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4677046B2 (en) * 2006-12-06 2011-04-27 本田技研工業株式会社 Fast human pose estimation using appearance and motion via multidimensional boost regression
JP6433149B2 (en) * 2013-07-30 2018-12-05 キヤノン株式会社 Posture estimation apparatus, posture estimation method and program
EP3574828B8 (en) * 2018-05-28 2020-12-30 Kaia Health Software GmbH Monitoring the performance of physical exercises


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Andriluka, Mykhaylo et al., "2D human pose estimation: New benchmark and state of the art analysis", Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 3686-3693 *
Iqbal, U. et al., "PoseTrack: Joint Multi-Person Pose Estimation and Tracking", Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 4654-4663 *
Mao Xia et al., "A human behavior recognition framework based on RGB-D feature fusion" (in Chinese), Computer Science, no. 8, Aug. 2018, pp. 22-28 *
Gan Ling et al., "Research and simulation of human behavior analysis methods" (in Chinese), Computer Simulation, no. 9, Sep. 2013, pp. 435-439 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant