CN112906438A - Human body action behavior prediction method and computer equipment - Google Patents

Human body action behavior prediction method and computer equipment

Info

Publication number
CN112906438A
CN112906438A (application CN201911224818.0A)
Authority
CN
China
Prior art keywords
human body
characteristic
frame
behavior
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911224818.0A
Other languages
Chinese (zh)
Other versions
CN112906438B (en)
Inventor
李建军
李轲赛
刘慧婷
张宝华
张超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inner Mongolia University of Science and Technology
Original Assignee
Inner Mongolia University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inner Mongolia University of Science and Technology filed Critical Inner Mongolia University of Science and Technology
Priority to CN201911224818.0A priority Critical patent/CN112906438B/en
Publication of CN112906438A publication Critical patent/CN112906438A/en
Application granted granted Critical
Publication of CN112906438B publication Critical patent/CN112906438B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V10/464 Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis

Abstract

The invention provides a human body action behavior prediction method and computer equipment, wherein the prediction method comprises the following steps: performing frame-division sampling on human body action period image information to obtain an image sequence with a reduced frame number; establishing a skeleton model for each frame of image, and determining actual values of a plurality of characteristic angles of the skeleton model; comparing the actual values of the plurality of characteristic angles of the skeleton model with reference values of the plurality of characteristic angles corresponding to predetermined human body action behavior categories, the reference values being obtained by pre-training and learning; and determining the human body action behavior category to which each frame of image belongs according to the comparison result. The invention can improve the recognition accuracy of human body action behavior prediction.

Description

Human body action behavior prediction method and computer equipment
Technical Field
The invention relates to the field of computer vision technology and pattern recognition, in particular to a human body action behavior prediction method and computer equipment.
Background
The rapid development of artificial intelligence has enabled computer vision technology to be widely applied in numerous fields such as video surveillance, motion retrieval, human-computer interaction, smart home, and medical care. Human behavior recognition technology has achieved certain results and can accurately recognize and classify simple human behaviors such as walking, running, bending, and waving. When detecting actions, users often want to obtain what they need quickly and avoid unnecessary waste of time. Retrieval of action behavior key frames in a video sequence is therefore of great significance, and the prediction of human body action behavior is one of its important foundations.
Human body action behavior prediction is to estimate the category of an action as early as possible while the whole action is not yet finished; compared with traditional human action recognition, human action prediction lacks the temporal structure of the complete action. Ryoo et al. first posed the action prediction problem, extracting mid-level features related to the temporal structure of an action through two improved bag-of-words models. Kong et al. proposed the concept of multiple time scales: the video is divided into a fixed number of segments, dynamic behavior features are extracted from each segment in turn, and discrimination of incomplete actions is achieved by constraining the temporal relations of the actions. Xu et al. proposed an action auto-completion method using the auto-completion idea from information retrieval: each divided video segment is treated as a single character of a sentence, an incomplete video forms the prefix of the sentence, and the action in the video is predicted through similarity. Known human body action behavior prediction algorithms have low recognition accuracy.
Disclosure of Invention
In view of the above, the present invention provides a method for predicting human body motion behavior and a computer device, so as to improve the recognition accuracy of human body motion behavior prediction.
In one aspect, the present invention provides a method for predicting human body action behavior, including: carrying out frame-dividing sampling on the human body action period image information to obtain an image sequence with reduced frame number; establishing a skeleton model for each frame of image, and determining actual values of a plurality of characteristic angles of the skeleton model; comparing the actual values of the plurality of characteristic angles of the skeleton model with the reference values of the plurality of characteristic angles corresponding to the preset human body action behavior categories according to the reference values of the plurality of characteristic angles corresponding to the preset human body action behavior categories obtained by pre-training and learning; and determining the human body action behavior category to which each frame of image belongs according to the comparison result.
Further, the building a skeleton model for each frame of image and determining the actual values of the plurality of feature angles of the skeleton model includes: aiming at each frame of image, establishing a skeleton model, wherein the skeleton model comprises an upper limb part, a lower limb part and a middle part; the upper limb portion includes: left hand, left elbow, left shoulder, right hand, right elbow, and right shoulder; the lower limb portion includes: a left foot, a left knee, a right foot, and a right knee; the middle portion comprises a crotch portion;
respectively determining the actual values of the 1 st characteristic angle formed by the left hand, the left elbow and the left shoulder, the 2 nd characteristic angle formed by the left elbow, the left shoulder and the crotch, the 3 rd characteristic angle formed by the right hand, the right elbow and the right shoulder, the 4 th characteristic angle formed by the right elbow, the right shoulder and the crotch, the 5 th characteristic angle formed by the left foot, the left knee and the crotch, the 6 th characteristic angle formed by the right foot, the right knee and the crotch, and the 7 th characteristic angle formed by the left knee, the crotch and the right knee; wherein the 1 st to 4 th characteristic angles are derived from the upper limb portion, the 5 th to 6 th characteristic angles are derived from the lower limb portion, and the 7 th characteristic angle is derived from the middle portion.
Further, the comparing, according to the reference values of the plurality of characteristic angles corresponding to the predetermined human body action behavior categories obtained by pre-training and learning, the actual values of the plurality of characteristic angles of the skeleton model with the reference values of the plurality of characteristic angles corresponding to the predetermined human body action behavior categories includes: sequentially judging whether the actual value of the nth characteristic angle is within a preset deviation range of the reference value of the nth characteristic angle corresponding to the predetermined human body action behavior category, wherein n is an integer from 1 to 7; and counting, for each frame of image, the number of characteristic angles within the preset deviation ranges of the reference values of the 1st to 7th characteristic angles of a specific human body action behavior; the number is the comparison result; the specific human body action behavior is any one of the predetermined human body action behavior categories.
Further, the determining the human body action behavior category to which each frame of image belongs according to the comparison result includes: if the number is more than or equal to 5, preliminarily determining the human body action behavior category to which the image of the corresponding frame belongs as the specific human body action behavior; and for each frame of image, when the number of the characteristic angles which are not located in the preset deviation range of the reference value of each characteristic angle of the specific human motion behavior is two, and the two characteristic angles which are not located in the preset deviation range of the reference value of each characteristic angle of the specific human motion behavior are respectively only from the upper limb part, the lower limb part and the middle part, the human motion behavior category to which the image of the corresponding frame belongs is finally determined as the specific human motion behavior.
Further, the skeleton model is a tree structure model with connected left hand, left elbow, left shoulder, right hand, right elbow, right shoulder, left foot, left knee, right foot, right knee and crotch.
Further, the preset deviation range of the reference value of the nth characteristic angle is a range of ±5 degrees around the reference value of the nth characteristic angle.
Further, the obtaining of the reference values of the plurality of characteristic angles corresponding to the predetermined human body motion behavior categories obtained by the pre-training learning through the following operations specifically includes: inputting a motion behavior training learning data set; extracting characteristic information of the training learning data set; inputting the characteristic information into a support vector machine for classification to obtain action behavior categories of each training learning data; the action behavior category of each training learning data is a preset human body action behavior category; and determining reference values of a plurality of characteristic angles of the corresponding preset action behavior categories according to the action behavior training learning data corresponding to each type of preset action behavior categories.
Further, performing frame-division sampling on the human body action period image information to obtain the image sequence with a reduced frame number includes: for the human body action period image information, fusing the first five frames into one frame, sampling at intervals of ten frames, and fusing the sampled five frames, so as to obtain the image sequence with a reduced frame number.
In another aspect, the present invention further provides a computer device, which includes a processor that, when executing a program, implements the method for predicting human body action behavior described above.
In still another aspect, the present invention also provides a computer readable storage medium storing a computer program, which when executed by a processor implements the above-mentioned method for predicting human body motion behavior.
The invention provides a method for predicting human body action behavior and computer equipment. Performing frame-division sampling on the human body action period image information to obtain an image sequence with a reduced frame number can reduce the computation load; a skeleton model is established for each frame of image, and actual values of a plurality of characteristic angles of the skeleton model are determined; the actual values of the plurality of characteristic angles of the skeleton model are compared with reference values of the plurality of characteristic angles corresponding to predetermined human body action behavior categories, the reference values being obtained by pre-training and learning; and the human body action behavior category to which each frame of image belongs is determined according to the comparison result. More accurate reference values of the plurality of characteristic angles are determined through pre-training and learning, and behavior prediction is then performed on the basis of the comparison with these reference values, so that the recognition accuracy of human body action behavior prediction is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a flowchart of a method of predicting human body motion behavior according to an exemplary first embodiment of the present invention;
fig. 2 is a flowchart of a method of predicting human body motion behavior according to an exemplary second embodiment of the present invention;
FIG. 3 is a flowchart illustrating the classification of the predetermined action categories by SVM in FIG. 2;
FIG. 4 is a flowchart of the method for predicting the motion behavior category of the human body to which each frame of image belongs in FIG. 2;
FIGS. 5a and 5b are schematic diagrams illustrating skeleton refinement of the 15th frame and the 25th frame;
FIG. 6 is a schematic diagram of a skeleton model established for each frame of image;
fig. 7 is a schematic structural diagram of a computer apparatus according to an exemplary third embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
It should be noted that, in the case of no conflict, the features in the following embodiments and examples may be combined with each other; moreover, all other embodiments that can be derived by one of ordinary skill in the art from the embodiments disclosed herein without making any creative effort fall within the scope of the present disclosure.
It is noted that various aspects of the embodiments are described below within the scope of the appended claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the disclosure, one skilled in the art should appreciate that one aspect described herein may be implemented independently of any other aspects and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. Additionally, such an apparatus may be implemented and/or such a method may be practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.
As shown in fig. 1, the method for predicting human body action behavior of the present invention includes:
step 101: carrying out frame-dividing sampling on the human body action period image information to obtain an image sequence with reduced frame number;
step 102: sequentially carrying out the following operations on the images of the frames of the image sequence according to the sequence of the frames:
step 102a: establishing a skeleton model for each frame of image, and determining actual values of a plurality of characteristic angles of the skeleton model;
step 102b: comparing the actual values of the plurality of characteristic angles of the skeleton model with the reference values of the plurality of characteristic angles corresponding to the predetermined human body action behavior categories, the reference values being obtained by pre-training and learning;
step 102c: and determining the human body action behavior category to which each frame of image belongs according to the comparison result.
According to this embodiment, performing frame-division sampling on the human body action period image information to obtain an image sequence with a reduced frame number can reduce the computation load; a skeleton model is established for each frame of image, and actual values of a plurality of characteristic angles of the skeleton model are determined; the actual values are compared with reference values of the plurality of characteristic angles corresponding to predetermined human body action behavior categories, the reference values being obtained by pre-training and learning; and the human body action behavior category to which each frame of image belongs is determined according to the comparison result. More accurate reference values of the plurality of characteristic angles are determined through pre-training and learning, and behavior prediction is then performed on the basis of the comparison with these reference values, so that the recognition accuracy of human body action behavior prediction is improved.
Fig. 2 provides a preferred embodiment of the method for predicting human body action behavior according to the present invention. This embodiment mainly includes two processes: the first process obtains, through pre-training and learning, the reference values of a plurality of characteristic angles corresponding to the predetermined human body action behavior categories, with specific steps corresponding to steps 21 to 27 in fig. 2; the second process recognizes the image information to be predicted according to the reference values of the plurality of characteristic angles. Referring to fig. 2 to fig. 6, another method for predicting human body action behavior according to an embodiment of the present invention includes:
firstly, a model is built by combining with an SVM to obtain reference values of a plurality of characteristic angles corresponding to preset human body action behavior categories. In the embodiment, the preset human body action behavior category can be obtained by recognizing the daily human body action from a large amount of public action behavior data sets through the SVM. Because the SVM has higher precision for recognizing various human daily action behaviors, according to the recognition result of the SVM, feature modeling can be carried out on each type of preset human action behavior categories to obtain the reference values of a plurality of feature angles corresponding to the preset human action behavior categories.
The process of recognizing the predetermined human body action behavior category according to the SVM is specifically detailed in steps 21 to 24, and also can be referred to in steps 301 to 304 in the process shown in fig. 3;
step 21: inputting human body action cycle image information, wherein the human body action cycle image information comprises but is not limited to videos;
step 22: preprocessing, such as conventional signal processing including filtering and denoising, which is not detailed herein;
step 23: extracting characteristic information from the human body action period image information. Specifically, the HOG (Histogram of Oriented Gradients) method may be used to perform feature extraction on the action behavior: the horizontal gradient and the vertical gradient of each pixel point in the image are calculated, the gradient magnitude and gradient direction are found, and feature extraction is performed using the histogram. The horizontal gradient and the vertical gradient of the pixel point (x, y) in the image are respectively:
Dx(x,y)=H(x+1,y)-H(x-1,y);
Dy(x,y)=H(x,y+1)-H(x,y-1);
where H(x, y) is the pixel value of the input image at point (x, y); the gradient magnitude and gradient direction at pixel point (x, y) are respectively:
G(x,y)=√(Dx(x,y)²+Dy(x,y)²);
α(x,y)=tan⁻¹(Dy(x,y)/Dx(x,y));
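To make the computation in step 23 concrete, the following Python sketch evaluates the gradients Dx and Dy, the magnitude G and the direction α defined above, and accumulates a magnitude-weighted orientation histogram. It is a minimal illustration, not the patent's implementation: the 9-bin count and the whole-image histogram are simplifying assumptions (a full HOG descriptor works on cells and blocks).

```python
import numpy as np

def hog_gradient_features(image, bins=9):
    # Central differences: Dx(x,y) = H(x+1,y) - H(x-1,y),
    #                      Dy(x,y) = H(x,y+1) - H(x,y-1).
    H = np.asarray(image, dtype=np.float64)
    Dx = np.zeros_like(H)
    Dy = np.zeros_like(H)
    Dx[:, 1:-1] = H[:, 2:] - H[:, :-2]   # x runs along columns
    Dy[1:-1, :] = H[2:, :] - H[:-2, :]   # y runs along rows
    G = np.sqrt(Dx ** 2 + Dy ** 2)       # gradient magnitude G(x,y)
    alpha = np.arctan2(Dy, Dx)           # gradient direction alpha(x,y)
    # Magnitude-weighted orientation histogram over the whole image.
    hist, _ = np.histogram(alpha, bins=bins,
                           range=(-np.pi, np.pi), weights=G)
    return hist / (np.linalg.norm(hist) + 1e-12)  # L2-normalized feature
```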
step 24: inputting the characteristic information into a support vector machine (SVM) for action behavior classification to obtain the classification results of the predetermined action behavior categories.
The essence of the SVM is to convert the maximum geometric margin problem into a convex optimization problem to be solved mathematically; classification of the various human action behaviors is realized by constructing a multi-class classifier from several binary classifiers. The SVM recognition results on the KTH and Weizmann action behavior data sets are as follows:
[Table: SVM recognition results on the KTH and Weizmann data sets; the original table image is not reproduced here]
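A minimal sketch of this classification step is given below, assuming scikit-learn is available and that `features` holds precomputed HOG descriptors; the shapes and the random placeholder data are assumptions for illustration only. scikit-learn's SVC builds its multi-class decision from pairwise binary classifiers (one-vs-one), matching the construction from multiple binary classifiers described above.

```python
import numpy as np
from sklearn.svm import SVC

# Placeholder data standing in for HOG descriptors of training sequences:
# 60 samples, 81 dimensions, 6 action classes (all shapes are assumptions).
rng = np.random.default_rng(0)
features = rng.random((60, 81))
labels = np.repeat(np.arange(6), 10)

# SVC solves the maximum-margin problem as a convex optimization and
# combines pairwise binary classifiers into a multi-class decision.
clf = SVC(kernel="rbf", decision_function_shape="ovo")
clf.fit(features, labels)
print(clf.predict(features[:5]))
```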
step 25: performing frame extraction on the action behavior sequence with correct classification;
considering that the motion training often has a data redundancy phenomenon due to high-speed acquisition, the embodiment preferably applies a sampling framing processing method to re-normalize the motion sequence, so as to reduce the computation load of subsequent work and reduce the complexity of the process. The first five frames can be fused into one frame, sampling is carried out at intervals of ten frames, the sampled five frames are fused, and finally the behavior action sequence is divided into an image sequence with fewer frames; for example, a motion sequence of one type of motion behavior video is divided into 750 frames, each 15 frames is divided into 50 basic units, the first 5 frames of each basic unit are respectively sampled and fused into 1 frame of image, and so on, and the normalized sequence is a new sequence containing 50 frames of images.
Step 26: performing skeleton refinement processing on the human appearance model in the image (see the models shown in fig. 5a and 5b, where the left image of each figure is the two-dimensional image and the right image is the refined image skeleton; the resulting skeleton model is shown in fig. 6);
considering that a human body is a non-rigid structure, the most remarkable action characteristics are mainly focused on the movement of limbs of the human body for the execution of action behaviors, the angle of the movement of the limbs of the human body changes along with time in the action execution process, and meanwhile, certain relevance exists between different angle changes, for example, a bending angle of a knee and an angle at a hand and a shoulder have certain relevance in running; the angles of the hand and the shoulder are also related to the bending angle of the elbow, and the like. The present embodiment extracts the angle change of the limbs of the human body as the main angular feature. Specifically, as shown in fig. 6, the skeleton-refined image may be divided into three parts according to the graphic structure: an upper limb portion, a lower limb portion and a middle portion, the upper limb portion comprising: left hand, left elbow, left shoulder, right hand, right elbow, right shoulder; the lower limb portion includes: left foot, left knee, right foot, right knee; the middle part is a crotch part;
step 27: calculating a characteristic angle to realize modeling;
firstly, the characteristic angles of the independent joint points are extracted, specifically seven characteristic angles: left hand-left elbow-left shoulder, left elbow-left shoulder-crotch, right hand-right elbow-right shoulder, right elbow-right shoulder-crotch; left foot-left knee-crotch, right foot-right knee-crotch; and left knee-crotch-right knee (computed as sketched below);
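A minimal sketch of the angle computation, assuming 2-D joint coordinates from the refined skeleton (the patent does not fix the coordinate representation):

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle in degrees at vertex b of the joint triple a-b-c."""
    ba = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    bc = np.asarray(c, dtype=float) - np.asarray(b, dtype=float)
    cos = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# The seven characteristic angles as (joint, vertex, joint) triples.
ANGLE_TRIPLES = [
    ("left_hand", "left_elbow", "left_shoulder"),     # 1st angle
    ("left_elbow", "left_shoulder", "crotch"),        # 2nd angle
    ("right_hand", "right_elbow", "right_shoulder"),  # 3rd angle
    ("right_elbow", "right_shoulder", "crotch"),      # 4th angle
    ("left_foot", "left_knee", "crotch"),             # 5th angle
    ("right_foot", "right_knee", "crotch"),           # 6th angle
    ("left_knee", "crotch", "right_knee"),            # 7th angle
]
```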
secondly, setting a threshold range of the characteristic angle, establishing a human skeleton angle characteristic model of the action behaviors, and sequentially establishing angle characteristic models of various action behaviors;
through a large number of experiments, threshold setting and error allowable range of the selected characteristic angle are obtained, and the error selecting deviation is optimal at 5 ℃. The specific setting and allowable error of the motion behavior characteristic angle threshold value of the running are as follows:
[Table: characteristic angle thresholds and allowable errors for running; the original table image is not reproduced here]
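The trained reference model for one action category can be represented as below. Note that the numeric reference angles are hypothetical placeholders: the patent's actual running thresholds are in the table image not reproduced here, and only the ±5 degree tolerance comes from the text.

```python
TOLERANCE_DEG = 5.0  # allowable error stated in the text (and claim 6)

# HYPOTHETICAL reference angles for "running"; placeholder values only.
RUNNING_REFERENCE = {
    1: 90.0,   # left hand - left elbow - left shoulder
    2: 45.0,   # left elbow - left shoulder - crotch
    3: 90.0,   # right hand - right elbow - right shoulder
    4: 45.0,   # right elbow - right shoulder - crotch
    5: 120.0,  # left foot - left knee - crotch
    6: 150.0,  # right foot - right knee - crotch
    7: 60.0,   # left knee - crotch - right knee
}

def within_tolerance(actual_deg, angle_id, reference=RUNNING_REFERENCE):
    return abs(actual_deg - reference[angle_id]) <= TOLERANCE_DEG
```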
after the reference value of the characteristic angle of each action behavior is obtained through modeling in the above steps 21 to 27, the magnitude of the characteristic angle of each frame image is calculated for each data to be tested through the following steps 28 to 210, wherein the steps 28 to 30 basically correspond to the processing of the steps 25 to 207. The flow of steps 28-31 can be seen in the detailed flow of fig. 4.
Step 28: carrying out frame-dividing sampling on the human body action period image information to obtain an image sequence with reduced frame number;
step 29: establishing a skeleton model for each frame of image (thinning processing); the skeletal model comprises an upper limb part, a lower limb part and a middle part; the upper limb portion includes: left hand, left elbow, left shoulder, right hand, right elbow, and right shoulder; the lower limb portion includes: a left foot, a left knee, a right foot, and a right knee; the middle portion comprises a crotch portion;
step 30: calculating the size of the characteristic angle, and specifically determining the actual value of the 1 st characteristic angle formed by the left hand, the left elbow and the left shoulder, the actual value of the 2 nd characteristic angle formed by the left elbow, the left shoulder and the crotch, the actual value of the 3 rd characteristic angle formed by the right hand, the right elbow and the right shoulder, the actual value of the 4 th characteristic angle formed by the right elbow, the right shoulder and the crotch, the actual value of the 5 th characteristic angle formed by the left foot, the left knee and the crotch, the actual value of the 6 th characteristic angle formed by the right foot, the right knee and the crotch, and the actual value of the 7 th characteristic angle formed by the left knee, the crotch and the right knee;
wherein the 1 st to 4 th characteristic angles are derived from the upper limb portion, the 5 th to 6 th characteristic angles are derived from the lower limb portion, and the 7 th characteristic angle is derived from the middle portion.
Step 31: based on the reference values determined in step 27, the error range between the actual value of each characteristic angle and the reference value is determined. Namely: sequentially judging whether the actual value of the nth characteristic angle is within a preset deviation range of the reference value of the nth characteristic angle corresponding to the preset human body action behavior type, wherein n is an integer between 1 and 7;
counting, for each frame of image, the number of characteristic angles within the preset deviation ranges of the reference values of the 1st to 7th characteristic angles of a specific human body action behavior; the number is the comparison result; the specific human body action behavior is any one of the predetermined human body action behavior categories.
Step 32: performing action behavior pre-judgment according to preset rules, and then predicting the corresponding human action in the next frame image.
Specifically, the action behavior prediction can be performed by template matching, with the following steps. First, coarse judgment: the extracted angle features of the sequence under test are compared with the template, and if 5 of the 7 features are within the threshold variation range, the action behavior is preliminarily predicted. Second, fine judgment: angle feature matching is performed separately for the three local parts, namely the upper limb part, the lower limb part, and the middle part, and if two of the local angle features are within the threshold variation range, the action behavior can be accurately predicted. The coarse judgment only compares the 7 independent angle features; the fine judgment builds on the coarse judgment by adding the judgment of the combined local whole characteristic angles (a sketch follows).
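The two-stage judgment can be sketched as follows. The fine-stage rule encodes the reading given earlier in the disclosure: at most two characteristic angles may fall outside the tolerance, and any such misses must come from different body parts; this interpretation of the fine judgment is an assumption where the text is ambiguous.

```python
PARTS = {"upper": (1, 2, 3, 4), "lower": (5, 6), "middle": (7,)}

def coarse_judgment(actual, reference, tol=5.0):
    """Coarse stage: at least 5 of the 7 characteristic angles must lie
    within +/-tol degrees of the template reference values."""
    hits = [k for k in range(1, 8) if abs(actual[k] - reference[k]) <= tol]
    return len(hits) >= 5, hits

def fine_judgment(actual, reference, tol=5.0):
    """Fine stage: the out-of-tolerance angles (at most two) must each
    come from a different body part (upper limb, lower limb, middle)."""
    ok, hits = coarse_judgment(actual, reference, tol)
    if not ok:
        return False
    misses = [k for k in range(1, 8) if k not in hits]
    parts = {p for k in misses for p, ids in PARTS.items() if k in ids}
    return len(parts) == len(misses)  # misses, if any, in distinct parts
```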
This embodiment shows the pre-judgment effect taking the running action as an example. Firstly, images of the 10th, 15th, 25th, 35th, 45th, and 50th frames are extracted within an action period, and skeleton refinement processing is then performed on the extracted images to obtain the skeleton models. FIGS. 5a and 5b are skeleton refinement diagrams of key frames: fig. 5a is the skeleton refinement diagram of the 15th frame, and fig. 5b is that of the 25th frame. The extracted characteristic angles are respectively:
[Table: characteristic angles extracted from the sampled frames; the original table image is not reproduced here]
and (3) fine judgment: that is, the local global feature angle comparison, the present embodiment sets three local global: performing a first local global angle change comparison on the left hand-elbow-left shoulder, left elbow-left shoulder-middle (namely crotch), right hand-elbow-right shoulder and right elbow-right shoulder-middle as a whole, namely an upper limb part; taking the left foot-left knee-middle and the right foot-right knee-middle as a whole, namely a lower limb part, and carrying out second local whole angle change comparison; the hip angle change comparison was performed with the left knee-middle-right knee as a whole, i.e., the center portion. In the fine judgment, the three local overall angle changes are all in the threshold range to serve as constraint conditions, the prejudgment of the action behaviors can be achieved according with requirements, and meanwhile, action key frames in the video sequence can also be determined.
Simple daily human actions were screened from a public action behavior data set, and experimental pre-judgment was carried out on 6 types of daily behaviors (running, walking, clapping, bending, waving, and waving upwards); the results are shown in the following table.
[Table: pre-judgment results for the 6 types of daily behaviors; the original table image is not reproduced here]
It can be seen that the pre-judgment effect for running and walking is better than for the other 4 types of behaviors: for running and walking, the specific action behavior can be fully judged within half a completed action cycle, whereas for clapping, bending, waving upwards, and waving, the pre-judgment effect only becomes evident after most of the behavior cycle has elapsed. This phenomenon has two causes: first, running and walking involve a larger range of limb movement than the other 4 types of behaviors, i.e., their characteristic angles change more markedly; second, the region where the pre-judgment effect of running and walking is obvious is concentrated around the half-cycle point, because the characteristic angle changes are most pronounced when the action is half completed and the motion change is greatest.
In this method, the human body action behaviors in the selected data set are first recognized and classified, with HOG feature extraction and an SVM classifier used to classify the action behaviors correctly. A skeleton model is then established for each type of action behavior, with the specific steps of: sampling frames at intervals within the action period of the behavior, reconstructing the data set of the behavior, processing the targets in the images with the skeleton refinement image processing method, extracting the 7 independent angle features, setting the variation thresholds of the characteristic angles, establishing the skeleton model, and finishing model training. The characteristic angles of the action behavior data set to be tested are then extracted, and through the two judgments of independent angles and local whole angles, the pre-judgment purpose of this embodiment and the determination of the action behavior key frames are achieved. The human skeleton model provided by this embodiment can effectively represent the human action posture with a lower feature dimension; meanwhile, the interval sampling and frame fusion algorithm removes redundant information well, describes the key actions more effectively, and requires less computation than prior schemes that extract features frame by frame; in addition, the two-stage angle feature template matching further improves the classification accuracy of action behaviors and the accurate pre-judgment of action behavior key frames.
As shown in fig. 7, an embodiment of the computer device includes a processor that, when executing a program, implements the method for predicting human body action behavior shown in fig. 1 or fig. 2. Executing the method shown in fig. 2 specifically includes:
(1) selecting an action behavior public data set for action recognition, and carrying out correct classification on action behaviors through HOG feature extraction and an SVM classifier;
(2) performing frame extraction on the correctly classified action behavior sequences: frame processing adopts the interval frame-division method of this embodiment, fusing the first five frames into one frame, sampling at intervals of ten frames, and fusing the sampled five frames, so that the action behavior sequence is finally divided into an image sequence with fewer frames;
(3) then, skeleton thinning processing is carried out on the human body appearance model in the image;
(4) the skeleton of the human body is divided into three parts according to the graph structure of the image with the skeleton refined: an upper limb portion, a lower limb portion and a central portion, the upper limb portion comprising: left hand, left elbow, left shoulder, right hand, right elbow, right shoulder; the lower limb portion includes: left foot, left knee, right foot, right knee; the middle part is a crotch part;
(5) extracting the angle features of the independent joint points, specifically the seven characteristic angles: left hand-left elbow-left shoulder, left elbow-left shoulder-middle, right hand-right elbow-right shoulder, right elbow-right shoulder-middle; left foot-left knee-middle, right foot-right knee-middle; and left knee-middle-right knee;
(6) setting a threshold range of the characteristic angle, establishing a human skeleton angle characteristic model of the action behaviors, and sequentially establishing angle characteristic models of various action behaviors;
(7) segmenting the test sequence and extracting the angle features by using the methods in the steps (2) to (6);
(8) the method for predicting the action behavior through the template matching comprises the following specific steps: step one, rough judgment, namely comparing the extracted angle characteristics of the sequence to be detected with a template, and if 5 characteristics in 7 characteristics are within a threshold variation range, preliminarily predicting the action behaviors; secondly, fine judgment is carried out, angle feature matching is carried out on three local parts, namely an upper limb part, a lower limb part and a central part, and if two local angle features are in a threshold value change range, the action behaviors can be accurately predicted;
(9) and determining a key frame of the action behavior through the action behavior prediction step.
The embodiment can perform action key frame retrieval and action pre-judging classification on the daily simple action of the human, and has higher classification accuracy. And on the basis of human skeleton information extracted from the video sequence, searching action key frames and prejudging action behaviors at subsequent moments through a multi-angle feature matching model.
The present invention also provides an embodiment of a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above-described method for predicting human body motion behavior.
The embodiment of the computer-readable storage medium storing the computer program of the present invention has the corresponding technical effects of the embodiment of the method for predicting human body action behaviors, and is not described herein again.
The above description is only for the specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A method for predicting human body action behaviors is characterized by comprising the following steps:
carrying out frame-dividing sampling on the human body action period image information to obtain an image sequence with reduced frame number;
sequentially carrying out the following operations on the images of the frames of the image sequence according to the sequence of the frames:
establishing a skeleton model aiming at each frame of image, and determining actual values of a plurality of characteristic angles of the skeleton model;
comparing the actual values of the plurality of characteristic angles of the skeleton model with the reference values of the plurality of characteristic angles corresponding to the preset human body action behavior categories according to the reference values of the plurality of characteristic angles corresponding to the preset human body action behavior categories obtained by pre-training and learning;
and determining the human body action behavior category to which each frame of image belongs according to the comparison result.
2. The method according to claim 1, wherein the establishing a skeleton model for each frame of image in the plurality of frames and determining actual values of a plurality of characteristic angles of the skeleton model comprises:
aiming at each frame of image, establishing a skeleton model, wherein the skeleton model comprises an upper limb part, a lower limb part and a middle part; the upper limb portion includes: left hand, left elbow, left shoulder, right hand, right elbow, and right shoulder; the lower limb portion includes: a left foot, a left knee, a right foot, and a right knee; the middle portion comprises a crotch portion;
respectively determining the actual values of the 1 st characteristic angle formed by the left hand, the left elbow and the left shoulder, the 2 nd characteristic angle formed by the left elbow, the left shoulder and the crotch, the 3 rd characteristic angle formed by the right hand, the right elbow and the right shoulder, the 4 th characteristic angle formed by the right elbow, the right shoulder and the crotch, the 5 th characteristic angle formed by the left foot, the left knee and the crotch, the 6 th characteristic angle formed by the right foot, the right knee and the crotch, and the 7 th characteristic angle formed by the left knee, the crotch and the right knee;
wherein the 1 st to 4 th characteristic angles are derived from the upper limb portion, the 5 th to 6 th characteristic angles are derived from the lower limb portion, and the 7 th characteristic angle is derived from the middle portion.
3. The method according to claim 2, wherein comparing actual values of the plurality of characteristic angles of the skeletal model with reference values of a plurality of characteristic angles corresponding to predetermined human motion behavior classes according to reference values of a plurality of characteristic angles corresponding to predetermined human motion behavior classes obtained by pre-training and learning, comprises:
sequentially judging whether the actual value of the nth characteristic angle is within a preset deviation range of the reference value of the nth characteristic angle corresponding to the preset human body action behavior type, wherein n is an integer between 1 and 7;
counting, for each frame of image, the number of characteristic angles within preset deviation ranges of the reference values of the 1st to 7th characteristic angles of a specific human body action behavior; the number is the comparison result; and the specific human body action behavior is any one of the predetermined human body action behavior categories.
4. The method for predicting human body action behavior according to claim 3, wherein the determining the human body action behavior category to which each frame of image belongs according to the comparison result comprises:
if the number is more than or equal to 5, preliminarily determining the human body action behavior category to which the image of the corresponding frame belongs as the specific human body action behavior;
and for each frame of image, when the number of the characteristic angles which are not located in the preset deviation range of the reference value of each characteristic angle of the specific human motion behavior is two, and the two characteristic angles which are not located in the preset deviation range of the reference value of each characteristic angle of the specific human motion behavior are respectively only from the upper limb part, the lower limb part and the middle part, the human motion behavior category to which the image of the corresponding frame belongs is finally determined as the specific human motion behavior.
5. The method for predicting human body motion behavior according to any one of claims 2 to 4, wherein the skeletal model is a tree-like structure model connected with a left hand, a left elbow, a left shoulder, a right hand, a right elbow, a right shoulder, a left foot, a left knee, a right foot, a right knee, and a crotch.
6. The method for predicting human body motion behavior according to claim 3 or 4, wherein the preset deviation range of the reference value of the nth characteristic angle is within a range of ±5 degrees from the reference value of the nth characteristic angle.
7. The method for predicting human body motion behavior according to any one of claims 1 to 4, wherein the obtaining of the reference values of the plurality of characteristic angles corresponding to the predetermined human body motion behavior categories obtained by pre-training and learning by the following operations specifically comprises:
inputting a motion behavior training learning data set;
extracting characteristic information of the training learning data set;
inputting the characteristic information into a support vector machine for classification to obtain action behavior categories of each training learning data; the action behavior category of each training learning data is a preset human body action behavior category;
and determining reference values of a plurality of characteristic angles of the corresponding preset action behavior categories according to the action behavior training learning data corresponding to each type of preset action behavior categories.
8. The method for predicting human body motion behavior according to any one of claims 1 to 4, wherein performing frame-wise sampling on the human body motion cycle image information to obtain the image sequence with reduced frame number comprises:
and (3) fusing the first five frames into one frame for human body action period image information, sampling every ten frames, and fusing the sampled five frames to obtain an image sequence with reduced frame number.
9. A computer device, characterized in that it comprises a processor which, when executing a program, implements the method of human body action behavior prediction according to any one of claims 1 to 8.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when being executed by a processor, implements the method of predicting human body action behavior according to any one of claims 1 to 8.
CN201911224818.0A 2019-12-04 2019-12-04 Human body action behavior prediction method and computer equipment Active CN112906438B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911224818.0A 2019-12-04 2019-12-04 Human body action behavior prediction method and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911224818.0A 2019-12-04 2019-12-04 Human body action behavior prediction method and computer equipment

Publications (2)

Publication Number Publication Date
CN112906438A 2021-06-04
CN112906438B 2023-05-02

Family

ID=76104735

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911224818.0A Human body action behavior prediction method and computer equipment 2019-12-04 2019-12-04

Country Status (1)

Country Link
CN (1) CN112906438B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080137956A1 (en) * 2006-12-06 2008-06-12 Honda Motor Co., Ltd. Fast Human Pose Estimation Using Appearance And Motion Via Multi-Dimensional Boosting Regression
CN102129551A (en) * 2010-02-16 2011-07-20 微软公司 Gesture detection based on joint skipping
US20150036879A1 (en) * 2013-07-30 2015-02-05 Canon Kabushiki Kaisha Posture estimating apparatus, posture estimating method and storing medium
CN105320944A (en) * 2015-10-24 2016-02-10 西安电子科技大学 Human body behavior prediction method based on human body skeleton movement information
CN107122752A (en) * 2017-05-05 2017-09-01 北京工业大学 A kind of human action comparison method and device
CN108416251A (en) * 2018-01-08 2018-08-17 中国矿业大学 Efficient human motion recognition method based on quantum genetic algorithm optimization
CN108764120A (en) * 2018-05-24 2018-11-06 杭州师范大学 A kind of human body specification action evaluation method
CN109101864A (en) * 2018-04-18 2018-12-28 长春理工大学 The upper half of human body action identification method returned based on key frame and random forest
CN109409384A (en) * 2018-09-30 2019-03-01 内蒙古科技大学 Image-recognizing method, device, medium and equipment based on fine granularity image
CN110414453A (en) * 2019-07-31 2019-11-05 电子科技大学成都学院 Human body action state monitoring method under a kind of multiple perspective based on machine vision
US20190362139A1 (en) * 2018-05-28 2019-11-28 Kaia Health Software GmbH Monitoring the performance of physical exercises

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080137956A1 (en) * 2006-12-06 2008-06-12 Honda Motor Co., Ltd. Fast Human Pose Estimation Using Appearance And Motion Via Multi-Dimensional Boosting Regression
CN102129551A (en) * 2010-02-16 2011-07-20 微软公司 Gesture detection based on joint skipping
US20150036879A1 (en) * 2013-07-30 2015-02-05 Canon Kabushiki Kaisha Posture estimating apparatus, posture estimating method and storing medium
US20180068461A1 (en) * 2013-07-30 2018-03-08 Canon Kabushiki Kaisha Posture estimating apparatus, posture estimating method and storing medium
CN105320944A (en) * 2015-10-24 2016-02-10 西安电子科技大学 Human body behavior prediction method based on human body skeleton movement information
CN107122752A (en) * 2017-05-05 2017-09-01 北京工业大学 A kind of human action comparison method and device
CN108416251A (en) * 2018-01-08 2018-08-17 中国矿业大学 Efficient human motion recognition method based on quantum genetic algorithm optimization
CN109101864A (en) * 2018-04-18 2018-12-28 长春理工大学 The upper half of human body action identification method returned based on key frame and random forest
CN108764120A (en) * 2018-05-24 2018-11-06 杭州师范大学 A kind of human body specification action evaluation method
US20190362139A1 (en) * 2018-05-28 2019-11-28 Kaia Health Software GmbH Monitoring the performance of physical exercises
CN109409384A (en) * 2018-09-30 2019-03-01 内蒙古科技大学 Image-recognizing method, device, medium and equipment based on fine granularity image
CN110414453A (en) * 2019-07-31 2019-11-05 电子科技大学成都学院 Human body action state monitoring method under a kind of multiple perspective based on machine vision

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
IQBAL U et al., "PoseTrack: Joint Multi-Person Pose Estimation and Tracking", Computer Vision and Pattern Recognition *
MYKHAYLO ANDRILUKA et al., "2D Human Pose Estimation: New Benchmark and State of the Art Analysis", Computer Vision and Pattern Recognition *
MAO Xia et al., "A Human Behavior Recognition Framework Based on RGB-D Feature Fusion", Computer Science *
GAN Ling et al., "Research and Simulation of Human Behavior Analysis Methods", Computer Simulation *

Also Published As

Publication number Publication date
CN112906438B (en) 2023-05-02

Similar Documents

Publication Publication Date Title
CN109657631B (en) Human body posture recognition method and device
CN108764065B (en) Pedestrian re-recognition feature fusion aided learning method
Ji et al. Interactive body part contrast mining for human interaction recognition
WO2016110005A1 (en) Gray level and depth information based multi-layer fusion multi-modal face recognition device and method
CN110674785A (en) Multi-person posture analysis method based on human body key point tracking
CN111310668B (en) Gait recognition method based on skeleton information
CN105956560A (en) Vehicle model identification method based on pooling multi-scale depth convolution characteristics
CN103942577A (en) Identity identification method based on self-established sample library and composite characters in video monitoring
CN108875586B (en) Functional limb rehabilitation training detection method based on depth image and skeleton data multi-feature fusion
CN112131908A (en) Action identification method and device based on double-flow network, storage medium and equipment
Zhao et al. Accurate pedestrian detection by human pose regression
CN105373810B (en) Method and system for establishing motion recognition model
CN112800892B (en) Human body posture recognition method based on openposition
Paul et al. Extraction of facial feature points using cumulative histogram
CN111950321A (en) Gait recognition method and device, computer equipment and storage medium
Kakadiaris et al. Show me your body: Gender classification from still images
CN113255630A (en) Moving target recognition training method, moving target recognition method and device
Yang et al. Combination of manual and non-manual features for sign language recognition based on conditional random field and active appearance model
CN104573722A (en) Three-dimensional face race classifying device and method based on three-dimensional point cloud
Waheed et al. Exploiting Human Pose and Scene Information for Interaction Detection
CN108268863A (en) A kind of image processing method, device and computer storage media
CN112287906A (en) Template matching tracking method and system based on depth feature fusion
Mallis et al. From keypoints to object landmarks via self-training correspondence: A novel approach to unsupervised landmark discovery
CN110490088A (en) DBSCAN Density Clustering method based on region growth method
CN112906438B (en) Human body action behavior prediction method and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant