CN111353346A

CN111353346A - Action recognition method, device, system, electronic equipment and storage medium

Info

Publication number: CN111353346A
Application number: CN201811578234.9A
Authority: CN
Inventors: 冯伟; 孟庆伟
Original assignee: Shanghai Myshape Information Technology Co ltd
Current assignee: Shanghai shibeisi Fitness Management Co.,Ltd.
Priority date: 2018-12-21
Filing date: 2018-12-21
Publication date: 2020-06-30

Abstract

The invention provides an action recognition method, device, system, electronic equipment and storage medium. The method comprises the following steps: a. receiving video data to be identified sent by a mobile terminal; b. generating a bone identification request message associated with the video data to be identified according to the video data to be identified, and adding the bone identification request message to a bone identification message queue; c. a bone identification module extracts a bone identification request message from the bone identification message queue and identifies bone data in the video data to be identified according to the bone identification request message; d. generating an action identification request message associated with the bone data according to the bone data, and adding the action identification request message to an action identification message queue; e. an action identification module extracts an action identification request message from the action identification message queue and performs action identification on the bone data according to the action identification request message; f. feeding back the action recognition result to the mobile terminal. The method and the equipment provided by the invention improve the identification efficiency.

Description

Action recognition method, device, system, electronic equipment and storage medium

Technical Field

The invention relates to the technical field of computer application, in particular to a method, a device, a system, electronic equipment and a storage medium for recognizing actions.

Background

Human motion capture and recognition methods are very widely used in today's society, for example: intelligent monitoring, human-computer interaction motion sensing games, video retrieval and the like.

Human motion detection and recognition has become an important topic and is moving from traditional RGB video sequences to today's popular RGB-D video sequences. Traditional motion trajectory capture is usually based on feature point detection algorithms, and different feature point detection methods can yield completely different motion trajectories. Moreover, because the retrieval of feature points across frames is very unstable and the feature points are often discontinuous over the whole video sequence, feature point trajectory methods mostly adopt histogram-based statistics: after the whole video sequence has been computed and counted, a classifier such as a support vector machine is used for classification.

Such video sequence matching methods are computationally expensive, cannot respond immediately, and are unsuitable for consumer-level human-computer interaction. The prior art therefore has difficulty meeting the needs of fitness recognition and error correction, where the system must give real-time feedback on whether an action is wrong.

Meanwhile, the prior art typically performs bone identification and action identification on the mobile terminal itself, which increases the calculation load of the mobile terminal, lengthens its response time, and degrades the user experience.

Disclosure of Invention

In order to overcome the defects in the prior art, the invention provides a motion recognition method, a motion recognition device, a motion recognition system, electronic equipment and a storage medium, so that the bone recognition and the motion recognition are realized by utilizing a server, the recognition operation time is reduced, and the user experience is improved.

According to an aspect of the present invention, there is provided a motion recognition method including:

a. the server side receives video data to be identified sent by the mobile terminal;

b. the server side generates a bone identification request message related to the video data to be identified according to the video data to be identified, and adds the bone identification request message into a bone identification message queue;

c. a bone identification module of the server side extracts a bone identification request message from the bone identification message queue and identifies bone data in the video data to be identified according to the bone identification request message;

d. the server side generates an action identification request message related to the skeleton data according to the skeleton data, and adds the action identification request message into an action identification message queue;

e. an action identification module of the server side extracts an action identification request message from the action identification message queue and carries out action identification on the skeleton data of the video data to be identified according to the action identification request message;

f. and the server side feeds back the action recognition result of the video data to be recognized to the mobile terminal.

Optionally, the step b further includes:

the server side uploads the video data to be identified to a first database;

correspondingly, the step c further comprises:

and a bone identification module of the server downloads the video data to be identified from the first database according to the bone identification request message extracted from the bone identification message queue so as to identify bones.

Optionally, the step d further includes:

the server side uploads the bone data of the video data to be identified to a second database;

correspondingly, the step e further comprises:

and the action identification module of the server side downloads the bone data of the video data to be identified from the second database according to the action identification request message extracted from the action identification message queue so as to identify the action.

Optionally, the step of motion recognition comprises:

determining a target action, wherein the target action at least comprises a target action stage, and each target action stage is divided into a plurality of target part actions;

taking the bone data as a to-be-detected action, dividing the to-be-detected action into at least one to-be-detected action stage according to the time of the target action stage of the target action, and forming a matching group by the target action stage and the to-be-detected action stage with corresponding time;

in each matching group, dividing the action stage to be detected into corresponding action of the part to be detected according to the action of the target part in the target action stage, and forming a part matching group by the action of the part to be detected in the action stage to be detected and the action of the target part in the corresponding target action stage;

performing motion recognition on each part matching group to obtain a motion recognition result of the part matching group;

and integrating the action recognition results of the plurality of matching groups to obtain an action recognition result of the action to be detected.
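The grouping steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Stage` type, the frame layout (`(timestamp, {part_name: data})`), the `recognize_part` callback and the use of `all()` to integrate results are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """One target action stage (hypothetical structure)."""
    start: float   # stage start time in seconds
    end: float     # stage end time in seconds (exclusive)
    parts: tuple   # names of the target part actions in this stage


def split_into_stages(frames, target_stages):
    """Pair each target stage with the captured frames whose time falls in it."""
    return [
        (stage, [f for f in frames if stage.start <= f[0] < stage.end])
        for stage in target_stages
    ]


def part_matching_groups(stage, segment):
    """For one matching group, pair each target part with its captured data."""
    return [
        (part, [(t, skeleton[part]) for t, skeleton in segment])
        for part in stage.parts
    ]


def recognize(frames, target_stages, recognize_part):
    """Recognize every part matching group, then integrate the results.

    Integration by all() is an assumption: the action passes only if
    every part in every stage passes.
    """
    stage_results = []
    for stage, segment in split_into_stages(frames, target_stages):
        part_results = [
            recognize_part(part, data)
            for part, data in part_matching_groups(stage, segment)
        ]
        stage_results.append(all(part_results))
    return all(stage_results)
```

A caller would supply `recognize_part` as the per-part comparison, for example a vector threshold check against standard coordinates.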

Optionally, the target part actions include 5 body part actions divided by body part and at least one random part action. The body parts include the torso, the left arm, the right arm, the left leg and the right leg. The random part is composed of at least two bone points selected from the body parts. The random part action corresponds to at least one process-oriented identification item, and each identification item comprises an identification object, an identification parameter, an identification rule and a standard bone point coordinate library. In a process-oriented identification item, the identification object comprises a vector formed by at least two bone points of the random part, and the standard bone point coordinate library stores the standard coordinates of all bone points in the target action in time sequence.

Optionally, the performing motion recognition on each part matching group to obtain a motion recognition result of the part matching group includes:

obtaining at least an identification item of a random part action among the target part actions; obtaining, according to the two-dimensional bone action model, a vector formed by at least two selected bone points in the part action to be detected; and performing a matching calculation between the vector of the random part action and the standard vector formed by the corresponding standard coordinates in the standard bone point coordinate library, and comparing the result with the vector threshold set by the identification parameter, so as to obtain the action recognition result of the part matching group.

According to still another aspect of the present invention, there is also provided a motion recognition apparatus including:

the receiving module is used for receiving video data to be identified sent by the mobile terminal;

the first generation module is used for generating a bone identification request message related to the video data to be identified according to the video data to be identified and adding the bone identification request message into a bone identification message queue;

the bone identification module is used for extracting a bone identification request message from the bone identification message queue and identifying bone data in the video data to be identified according to the bone identification request message;

the second generation module is used for generating an action identification request message related to the skeleton data according to the skeleton data and adding the action identification request message into an action identification message queue;

the action identification module is used for extracting an action identification request message from the action identification message queue and carrying out action identification on the bone data of the video data to be identified according to the action identification request message;

and the sending module is used for feeding back the action recognition result of the video data to be recognized to the mobile terminal.

Optionally, the apparatus includes a plurality of receiving modules, and the motion recognition apparatus further includes:

and the load balancing module is used for distributing the video data to be identified sent by the mobile terminal to the plurality of receiving modules according to the loads of the plurality of receiving modules.

Optionally, the motion recognition device comprises a plurality of said bone recognition modules and/or a plurality of motion recognition modules.

Optionally, the motion recognition apparatus further includes:

and the first database is used for storing the video data to be identified and allowing the bone identification module to download the video data to be identified from the first database according to the bone identification request message extracted from the bone identification message queue so as to identify bones.

Optionally, the motion recognition apparatus further includes:

and the second database is used for storing the bone data of the video data to be identified and allowing the action identification module to download the bone data of the video data to be identified from the second database according to the action identification request message extracted from the action identification message queue so as to identify actions.

According to still another aspect of the present invention, there is also provided a motion recognition system including:

the mobile terminal is used for shooting and sending video data to be identified;

the motion recognition device is configured to receive the video data to be recognized and perform motion recognition on the video data to be recognized.

According to still another aspect of the present invention, there is also provided an electronic apparatus, including: a processor; a storage medium having stored thereon a computer program which, when executed by the processor, performs the steps as described above.

According to yet another aspect of the present invention, there is also provided a storage medium having stored thereon a computer program which, when executed by a processor, performs the steps as described above.

Compared with the prior art, on the one hand, bone identification and action identification of the mobile terminal's video data to be identified are performed by the server side. This greatly reduces the calculation amount of the mobile terminal, and most of the calculation is carried out by the server side, which has better computing resources, so that the identification efficiency is improved, the identification response time is reduced, and the user experience is improved. On the other hand, the invention simplifies the skeleton points of each action to be detected according to the body structure and divides the action into part actions to be detected in units of three skeleton points. Each part action to be detected is identified by process-oriented identification items: the vector formed by the skeleton points collected in real time is matched against the vector formed by the skeleton point coordinates in the standard skeleton point coordinate library of the process-oriented identification item, and the result is compared with the set vector threshold. Because setting and matching skeleton points and vectors requires little calculation, real-time feedback can be achieved without feedback delay.

Drawings

The above and other features and advantages of the present invention will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings.

FIG. 1 shows a flow diagram of a method of motion recognition according to an embodiment of the invention;

FIG. 2 shows a schematic diagram of a motion recognition system according to an embodiment of the invention;

FIG. 3 shows a schematic diagram of a bone model according to an embodiment of the invention;

FIGS. 4 to 8 show schematic views of 5 body parts according to an embodiment of the invention;

FIG. 9 illustrates a comparison of a standard vector formed by bone points from a standard bone point coordinate base and a real-time acquisition vector according to an embodiment of the present invention;

FIGS. 10 and 11 are diagrams illustrating the angle between standard vectors formed by the bone points in the standard bone point coordinate base and the angle between real-time acquisition vectors according to an embodiment of the present invention;

FIG. 12 schematically illustrates a computer-readable storage medium in an exemplary embodiment of the disclosure;

FIG. 13 schematically illustrates an electronic device in an exemplary embodiment of the disclosure.

Detailed Description

Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams depicted in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.

Referring first to fig. 1, fig. 1 shows a flow chart of a motion recognition method according to an embodiment of the invention. Fig. 1 shows 6 steps in total:

step S110: the server side receives video data to be identified sent by the mobile terminal;

step S120: the server side generates a bone identification request message related to the video data to be identified according to the video data to be identified, and adds the bone identification request message into a bone identification message queue;

step S130: a bone identification module of the server side extracts a bone identification request message from the bone identification message queue and identifies bone data in the video data to be identified according to the bone identification request message;

step S140: the server side generates an action identification request message related to the skeleton data according to the skeleton data, and adds the action identification request message into an action identification message queue;

step S150: an action identification module of the server side extracts an action identification request message from the action identification message queue and carries out action identification on the skeleton data of the video data to be identified according to the action identification request message;

step S160: and the server side feeds back the action recognition result of the video data to be recognized to the mobile terminal.

Therefore, in the action identification method provided by the invention, the bone identification and the action identification of the video data to be identified of the mobile terminal are realized through the server side, the calculation amount of the mobile terminal is greatly reduced, and most of calculation is carried out by the server side with better calculation resources, so that the identification efficiency is improved, the identification response time is shortened, and the user experience is improved.
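The flow of steps S110 to S160 can be sketched in a single process as follows. This is only an illustrative sketch under stated assumptions: Python's in-memory `queue.Queue` stands in for the two message queues, plain dictionaries stand in for the storage, and the function names, the message layout (`{"request_id": ...}`) and the `extract_skeleton` and `classify` callbacks are hypothetical.

```python
import queue
import uuid

bone_queue = queue.Queue()     # bone identification message queue
action_queue = queue.Queue()   # action identification message queue
video_store = {}               # holds uploaded videos by request id
bone_store = {}                # holds recognized bone data by request id
results = {}                   # action recognition results by request id


def receive_video(video_bytes):
    """S110 + S120: accept a video and enqueue a bone identification request."""
    request_id = uuid.uuid4().hex
    video_store[request_id] = video_bytes
    bone_queue.put({"request_id": request_id})
    return request_id


def bone_worker_step(extract_skeleton):
    """S130 + S140: take one request, extract bone data, enqueue the next request."""
    msg = bone_queue.get()
    request_id = msg["request_id"]
    bone_store[request_id] = extract_skeleton(video_store[request_id])
    action_queue.put({"request_id": request_id})


def action_worker_step(classify):
    """S150: take one request and recognize the action from the bone data."""
    msg = action_queue.get()
    request_id = msg["request_id"]
    results[request_id] = classify(bone_store[request_id])


def fetch_result(request_id):
    """S160: the result that would be fed back to the mobile terminal."""
    return results.get(request_id)
```

Because each stage only reads from one queue and writes to the next, the stages can run in separate processes or machines without knowing about each other.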

The following describes the action recognition method provided by the present invention through interaction between the mobile terminal and the server (action recognition device) in detail with reference to fig. 1 and 2. FIG. 2 shows a schematic diagram of a motion recognition system according to an embodiment of the invention.

As shown in fig. 2, the motion recognition system includes a mobile terminal 101 and a motion recognition device 100 (server side). The mobile terminal 101 is used to capture and transmit video data to be recognized. The motion recognition device 100 is configured to receive the video data to be recognized and perform motion recognition on the video data to be recognized.

Specifically, the mobile terminal 101 may be, for example, a mobile phone on which a mobile APP or an APP-embedded applet is installed. The mobile terminal 101 has video shooting and uploading functions: it shoots a video of the user's movement as the video data to be identified and uploads it to the action recognition device 100. The modules of the motion recognition device 100 cooperate to complete the whole motion recognition process and push the result to the mobile terminal 101.

The motion recognition apparatus 100 includes a receiving module 103, a first generating module 111, a bone recognition module 106, a second generating module 112, a motion recognition module 109, and a transmitting module 110.

The receiving module 103 is configured to receive video data to be identified sent by the mobile terminal.

The first generating module 111 is configured to generate a bone identification request message associated with the video data to be identified according to the video data to be identified, and add the bone identification request message to the bone identification message queue 105.

Specifically, in some embodiments, the receiving module 103 and the first generating module 111 together constitute a web server. Multiple web servers can be deployed to achieve horizontal capacity expansion, thereby improving the concurrency capability.

In the above embodiment where multiple web servers are deployed, the motion recognition apparatus 100 provided by the present invention may further include a load balancing module 102. The load balancing module 102 is responsible for distributing the video data to be identified sent by the mobile terminals 101 (only one mobile terminal 101 is shown for clarity, and the number of the mobile terminals 101 is not limited thereto) to a plurality of Web servers according to a load balancing policy, so as to improve the capability of processing concurrent requests.
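One possible load balancing policy is to send each upload to the receiver with the fewest requests in flight. The sketch below assumes that policy; the text does not fix a particular one, and the receiver names and the `loads` dictionary are hypothetical.

```python
def pick_receiver(loads):
    """Return the receiver with the smallest current load.

    loads maps a receiver name to its number of in-flight requests.
    """
    return min(loads, key=loads.get)


def dispatch(loads):
    """Choose a receiver for one incoming video and record the added load."""
    receiver = pick_receiver(loads)
    loads[receiver] += 1
    return receiver
```

For example, with `loads = {"web-1": 2, "web-2": 0, "web-3": 1}` the next upload goes to `web-2`.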

In yet another variation of the above embodiment, the motion recognition apparatus 100 provided by the present invention may further include a first database 104. After receiving the video data to be recognized of the mobile terminal 101, the web server stores the video data to be recognized in the first database 104 in association with the generated bone recognition request message. The first database may be a cloud database.

The bone identification module 106 is configured to extract a bone identification request message from the bone identification message queue, and identify bone data in the video data to be identified according to the bone identification request message.

In particular, the aforementioned bone identification message queue may be a cloud message queue. In some embodiments, the bone identification message queue may be pre-configured to prevent loss of messages. For example, when a request A is taken out of the bone identification message queue, request A is set to an "invisible" state that lasts for a predetermined time. Request A is deleted only once the bone identification module 106 has finished processing it. In other words, if request A has still not been deleted after the predetermined time, it is considered not to have been processed correctly and its state is set back to visible, which prevents a message from being lost because a request was not processed correctly.

In particular, when a message is properly processed, the bone recognition module 106 that processes the message may actively delete the message to prevent the message from being repeatedly retrieved and processed.
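The invisible-then-delete behaviour described above can be sketched with a small in-memory queue. This is an illustration only: the class name `VisibilityQueue`, its methods and the explicit `now` parameter (used to make the timing easy to follow) are assumptions, and a real deployment would rely on the cloud message queue's own mechanism.

```python
import time


class VisibilityQueue:
    """In-memory queue where a received message is hidden, not removed."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._messages = {}          # message id -> body
        self._invisible_until = {}   # message id -> time it becomes visible again
        self._next_id = 0

    def send(self, body):
        self._next_id += 1
        self._messages[self._next_id] = body
        self._invisible_until[self._next_id] = 0.0
        return self._next_id

    def receive(self, now=None):
        """Return (id, body) of the first visible message and hide it, or None."""
        now = time.monotonic() if now is None else now
        for msg_id, body in self._messages.items():
            if self._invisible_until[msg_id] <= now:
                self._invisible_until[msg_id] = now + self.visibility_timeout
                return msg_id, body
        return None

    def delete(self, msg_id):
        """Called by the worker after the message was processed correctly."""
        self._messages.pop(msg_id, None)
        self._invisible_until.pop(msg_id, None)
```

If a worker crashes before calling `delete`, the message becomes visible again after the timeout and another worker can pick it up.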

The skeleton identification module 106 is configured to obtain a skeleton identification request message from the skeleton identification message queue, and identify human skeleton coordinate sequence frame data in the video data to be identified.

Further, the bone recognition module 106 is one or more "never-stop" worker machines responsible for monitoring the bone recognition message queue. When a new message arrives in the queue, the bone recognition module 106 takes out the message and downloads the corresponding video data from the first database according to the message content. The bone recognition module 106 then analyzes the video and extracts the human bone coordinate sequence frame data.

It should be noted that, in the motion recognition apparatus 100 provided by the present invention, multiple instances of the bone recognition module 106 may be deployed. Because both the input and the output of the bone recognition module 106 are queue components, the instances are independent of one another, and in theory any number of instances can be deployed. Horizontal capacity expansion is thereby achieved, and the capacity for processing concurrent requests is improved.
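The independence of instances can be illustrated with several worker threads draining one shared queue. This is a sketch under assumptions: threads stand in for separate worker machines, the `run_workers` helper and its `handle` callback are hypothetical, and the workers here stop when the queue is empty so the example terminates, whereas the workers described above run continuously.

```python
import queue
import threading


def run_workers(messages, handle, instances=3):
    """Process all messages with the given number of independent workers."""
    inbox = queue.Queue()
    for message in messages:
        inbox.put(message)

    outputs = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                message = inbox.get_nowait()
            except queue.Empty:
                return  # nothing left; a real worker would keep listening
            result = handle(message)
            with lock:
                outputs.append(result)

    threads = [threading.Thread(target=worker) for _ in range(instances)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return outputs
```

Changing `instances` alters only how many workers share the queue; the producer and the handler are unchanged.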

The second generating module 112 is configured to generate an action recognition request message associated with the bone data according to the bone data, and add the action recognition request message to an action recognition message queue.

Specifically, the action identification message queue may be configured in the same way as, or similarly to, the bone identification message queue.

In a specific embodiment, the motion recognition apparatus 100 provided by the present invention may further include a second database 108 for storing the bone data (associated with the motion recognition request message) recognized by the bone recognition module 106. The second database may also be a cloud database.

The action recognition module 109 is configured to extract an action recognition request message from the action recognition message queue, and perform action recognition on the bone data of the video data to be recognized according to the action recognition request message.

Specifically, the action recognition module 109 is also one or more "never-stop" worker machines responsible for listening to the action recognition message queue. When a new message arrives in the queue, the action recognition module 109 takes out the message and, according to its content, acquires the skeleton coordinate sequence frame file (skeleton data) to be processed from the second database. It then parses the file, identifies the bone coordinates and recognizes the action, and gives a judgment of whether the action is standard together with corresponding guidance. Like the skeleton recognition module 106, the action recognition module 109 can also be expanded horizontally without limit, thereby improving the concurrent processing capability.

The sending module 110 is configured to feed back the action recognition result of the video data to be recognized to the mobile terminal.

The above description is only illustrative of the various embodiments of the motion recognition device 100 and the motion recognition system of the present invention; the division, combination, addition or omission of modules all fall within the scope of the present invention.

In the motion recognition device 100 and the motion recognition system of the present invention, the Web server, the skeleton recognition module 106 and the motion recognition module 109 carry out the main computation of the system. The Web server exposes only one endpoint to the outside and is expanded horizontally by load balancing. The skeleton recognition module 106 and the action recognition module 109 are independent of the other modules and rely on the message queues to obtain requests, so they can also be expanded horizontally. The motion recognition device 100 and the motion recognition system of the present invention therefore have good horizontal capacity expansion capability and, in turn, strong concurrent processing capability.

The action recognition of the present invention is described below by way of some specific embodiments.

Firstly, determining a target action, wherein the target action at least comprises one target action stage, each target action stage is divided into a plurality of target part actions, and the target part actions comprise 5 body part actions divided according to body parts and at least one random part action.

In some embodiments, the target action may be determined by displaying a workout video. Specifically, the fitness video comprises a plurality of target actions, and each target action is associated with the playing time of the fitness video. In other embodiments, the user may directly select the target action.
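Where target actions are tied to the playing time of a fitness video, the current target action can be looked up from the playback position. The sketch below assumes a schedule of `(start_time_seconds, action_name)` pairs sorted by start time; the schedule layout and the action names are hypothetical.

```python
import bisect


def target_action_at(schedule, t):
    """Return the action whose start time is the latest one not after t."""
    starts = [start for start, _ in schedule]
    i = bisect.bisect_right(starts, t) - 1
    return schedule[i][1] if i >= 0 else None
```

With `schedule = [(0, "squat"), (30, "lunge"), (75, "plank")]`, a playback position of 40 seconds maps to `"lunge"`.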

Specifically, in the present case, 15 skeletal points are set for each human body (see fig. 3), and the 15 skeletal points are: head center 211, neck center (e.g., spinal center of neck) 212, torso center 213 (e.g., spinal center of torso), left shoulder joint point 221, left elbow joint point 222, left wrist joint point 223, right shoulder joint point 231, right elbow joint point 232, right wrist joint point 233, left hip joint point 241, left knee joint point 242, left ankle joint point 243, right hip joint point 251, right knee joint point 252, right ankle joint point 253.

In the present case, the 15 skeletal points are divided into five body parts by taking 3 skeletal points as units: the torso (see fig. 4), the left arm (see fig. 5), the right arm (see fig. 6), the left leg (see fig. 7), and the right leg (see fig. 8). Vectors are formed among the skeleton points in each body part, and included angles are formed among the vectors.

Specifically, the torso (see fig. 4) includes a head center 211, a spine center 212 of the neck, a spine center 213 of the torso, a first vector 214 formed from the head center 211 to the spine center 212 of the neck, a second vector 215 formed from the spine center 212 of the neck to the spine center 213 of the torso, a third vector 216 formed from the head center 211 to the spine center 213 of the torso, and an angle 217 formed by the first vector 214 and the second vector 215.

The left arm (see fig. 5) includes a left wrist joint point 223, a left elbow joint point 222, a left shoulder joint point 221, a first vector 224 formed from the left shoulder joint point 221 to the left elbow joint point 222, a second vector 225 formed from the left elbow joint point 222 to the left wrist joint point 223, a third vector 226 formed from the left shoulder joint point 221 to the left wrist joint point 223, and an angle 227 between the first vector 224 and the second vector 225.

The right arm (see fig. 6) includes a right wrist joint point 233, a right elbow joint point 232, a right shoulder joint point 231, a first vector 234 formed from the right shoulder joint point 231 to the right elbow joint point 232, a second vector 235 formed from the right elbow joint point 232 to the right wrist joint point 233, a third vector 236 formed from the right shoulder joint point 231 to the right wrist joint point 233, and an angle 237 between the first vector 234 and the second vector 235.

The left leg includes (see fig. 7) a left ankle joint point 243, a left knee joint point 242, a left hip joint point 241, a first vector 244 formed from left hip joint point 241 to left knee joint point 242, a second vector 245 formed from left knee joint point 242 to left ankle joint point 243, a third vector 246 formed from left hip joint point 241 to left ankle joint point 243, and an angle 247 between the first vector 244 and the second vector 245.

The right leg includes (see fig. 8) a right ankle joint point 253, a right knee joint point 252, a right hip joint point 251, a first vector 254 formed from right hip joint point 251 to right knee joint point 252, a second vector 255 formed from right knee joint point 252 to right ankle joint point 253, a third vector 256 formed from right hip joint point 251 to right ankle joint point 253, and an angle between the first vector 254 and the second vector 255.

A small number of representative joint points are set as skeleton points, which reduces the amount of calculation in motion recognition and error correction.

The target action is broken down into five body parts: left arm, right arm, left leg, right leg and torso. Each body part comprises three skeletal points as shown in fig. 4 to 8, three vectors formed by the three skeletal points and an included angle between two of the three vectors.
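The three vectors and the included angle of a body part can be computed directly from its three skeletal points. The sketch below keys points by the reference numerals used in the figures and assumes two-dimensional coordinates; the `BODY_PARTS` table and the `part_geometry` function are hypothetical names, and the angle is taken between the directions of the first and second vectors (so a fully straight limb gives 0 degrees), which is an assumption about how the angle is measured. Coincident points would make a vector zero-length and are not handled.

```python
import math

# Skeletal points of each body part, ordered as in the description
# (e.g. shoulder, elbow, wrist for an arm).
BODY_PARTS = {
    "torso": (211, 212, 213),
    "left_arm": (221, 222, 223),
    "right_arm": (231, 232, 233),
    "left_leg": (241, 242, 243),
    "right_leg": (251, 252, 253),
}


def part_geometry(points, part):
    """Return (first vector, second vector, third vector, angle in degrees).

    points maps a skeletal point numeral to its (x, y) coordinates.
    """
    p1, p2, p3 = (points[k] for k in BODY_PARTS[part])
    v1 = (p2[0] - p1[0], p2[1] - p1[1])   # first point to second point
    v2 = (p3[0] - p2[0], p3[1] - p2[1])   # second point to third point
    v3 = (p3[0] - p1[0], p3[1] - p1[1])   # first point to third point
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (
        math.hypot(*v1) * math.hypot(*v2)
    )
    # Clamp to guard against rounding slightly outside [-1, 1].
    angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))
    return v1, v2, v3, angle
```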

To increase the flexibility of motion recognition, the target motion may further comprise at least one random part motion. The random part is constituted by at least two bone points selected from the body parts, for example bone points 212 and 223 in fig. 3. The random part is not limited to this: any two or more bone points may form a random part, so that motion recognition in more dimensions can be achieved on top of the five body parts.

The random part action corresponds to at least one process-oriented identification item, and each identification item comprises an identification object, an identification parameter, an identification rule and a standard skeleton point coordinate library. In a process-oriented identification item, the identification object comprises a vector formed by at least two bone points of the random part, and the identification parameters include a set vector threshold. The identification rule requires that, throughout the motion, the similarity between the vector formed by the at least two bone points of the random part (the identification object) and the standard vector formed by the corresponding standard coordinates in the standard bone point coordinate library be greater than or equal to the set vector threshold (the identification parameter). If the similarity is smaller than the set vector threshold, an error is reported (the error to be reported can be stored in advance as an identification parameter).

In a specific embodiment, the vector of the random part motion and the standard vector formed by the corresponding standard coordinates in the standard bone point coordinate library are subjected to matching calculation, for comparison with the vector threshold set by the identification parameter, through the following steps:

calculating the cosine of the angle θ between the standard vector $\vec{a} = (x_{ai}, y_{ai})$ formed by the corresponding standard coordinates in the standard skeleton point coordinate library and the vector $\vec{b} = (x_{bi}, y_{bi})$ of the random part motion:

$$\cos\theta = \frac{x_{ai}x_{bi} + y_{ai}y_{bi}}{\sqrt{x_{ai}^{2} + y_{ai}^{2}}\,\sqrt{x_{bi}^{2} + y_{bi}^{2}}}$$

The cosine value of the included angle θ between the vector $\vec{a}$ and the vector $\vec{b}$ is used for comparing with the vector threshold set by the identification parameter. For example, when bone points 212 and 223 form the random part, the vector $\vec{b}$ and the vector $\vec{a}$ are, respectively, the vector formed by the bone points 212 and 223 acquired in real time and the vector formed by the bone points 212 and 223 in the standard bone point coordinate library.
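The matching calculation for a random part can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the sample coordinates and the threshold value passed in the example are assumptions chosen for demonstration.

```python
import math

def bone_vector(p, q):
    """Vector from bone point p to bone point q, each an (x, y) pair."""
    return (q[0] - p[0], q[1] - p[1])

def cosine_similarity(a, b):
    """Cosine of the angle between the 2-D vectors a and b."""
    norm = math.hypot(a[0], a[1]) * math.hypot(b[0], b[1])
    if norm == 0:
        raise ValueError("a zero-length vector has no direction")
    return (a[0] * b[0] + a[1] * b[1]) / norm

def vectors_match(standard_vec, live_vec, vector_threshold):
    """True when cos(theta) is greater than or equal to the vector threshold."""
    return cosine_similarity(standard_vec, live_vec) >= vector_threshold

# Hypothetical coordinates for bone points 212 and 223 (illustration only).
standard = bone_vector((0.0, 0.0), (1.0, 1.0))   # from the standard library
live = bone_vector((0.1, 0.0), (1.0, 1.2))       # acquired in real time
print(vectors_match(standard, live, vector_threshold=0.8))
```

Because only the direction of the two vectors enters the cosine, the comparison does not depend on how far the user stands from the camera.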

Further, in the invention, the two-dimensional video data collected in real time generates a two-dimensional skeleton motion model, whereas the coordinates in the standard skeleton point coordinate library may be three-dimensional coordinates. If the standard coordinates are already two-dimensional, the vector of the random part action is matched directly against the standard vector formed by the corresponding standard coordinates in the standard skeleton point coordinate library. If not, the corresponding standard coordinates in the standard skeleton point coordinate library are first converted into two-dimensional coordinates, and the matching calculation with the vector of the random part action is then performed.

In a specific embodiment, for the process-oriented identification item corresponding to the random part motion, the identification parameters may further include an initial amplitude threshold and an achievement amplitude threshold, where the initial amplitude threshold is used to determine whether the motion of the part to be detected has started, and the achievement amplitude threshold is used to determine whether the motion of the part to be detected has been completed. Specifically, the initial amplitude and the achievement amplitude are based on the position on the action time axis. In a specific implementation, frame numbers can be used to determine the initial amplitude and the achievement amplitude. For example, assume that an action has 20 frames of data in the standard bone point coordinate library, that the initial amplitude threshold is set to 0.2, and that the achievement amplitude threshold is set to 0.8. The action is then determined to have started when the random part action of the user's actual action has its highest matching degree (within the vector threshold range) with any frame among the 0th to 4th (i.e. 20 x 0.2) frames in the standard bone point coordinate library. After the user action has started, and provided that the random part action has not failed to match the standard bone point coordinate library during the motion, the action is determined to be achieved once the random part action of the user action has its highest matching degree (within the vector threshold range) with any frame among the 16th (i.e. 20 x 0.8) to 20th frames in the standard bone point coordinate library. The foregoing is merely an illustrative description of implementations of the invention and is not intended to be limiting thereof.
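The frame arithmetic in this example can be written out as a short sketch. The function name is hypothetical, and the inclusive frame ranges simply follow the wording of the example (frames 0 to 4 and frames 16 to 20); an implementation might index frames differently.

```python
def amplitude_windows(n_frames, initial_threshold, achievement_threshold):
    """Return the frame ranges that mark the start and the achievement of an action.

    The thresholds are positions on the action time axis, expressed as
    fractions of the number of standard frames.
    """
    start_last = round(n_frames * initial_threshold)
    achieve_first = round(n_frames * achievement_threshold)
    start_window = range(0, start_last + 1)
    achievement_window = range(achieve_first, n_frames + 1)
    return start_window, achievement_window

start_window, achievement_window = amplitude_windows(20, 0.2, 0.8)
print(list(start_window))        # [0, 1, 2, 3, 4]
print(list(achievement_window))  # [16, 17, 18, 19, 20]
```

A live frame whose best-matching standard frame falls in the first range would mark the start of the action, and one whose best match falls in the second range would mark its achievement.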

In a specific embodiment, the random part motion further corresponds to one or more distance-oriented identification items. For a distance-oriented identification item, the identification object comprises the distance between at least two bone points of the random part, and the identification parameter sets a distance threshold. The identification rule requires that the identification object of the motion of the part to be detected remain greater than or equal to the distance threshold set by the identification parameter throughout the motion. In distance recognition, the motion is achieved when the identification object remains greater than or equal to the distance threshold throughout the motion, and an error is reported when the identification object falls below the distance threshold during the motion. In negative distance recognition, an error is reported when the identification object is greater than or equal to the distance threshold at any time during the motion.
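A minimal sketch of the two distance-oriented rules is given below. The function names, the frame layout (a list of bone point pairs) and the result strings are assumptions made for illustration; in particular, the text does not name the non-error outcome of negative distance recognition, so "ok" is a placeholder.

```python
import math

def distance_recognition(frames, distance_threshold):
    """frames: list of (p, q) bone point pairs, one pair per frame.

    Achieved only if the distance stays at or above the threshold in every frame.
    """
    if all(math.dist(p, q) >= distance_threshold for p, q in frames):
        return "achieved"
    return "error"

def negative_distance_recognition(frames, distance_threshold):
    """Error as soon as the distance reaches the threshold in any frame."""
    if any(math.dist(p, q) >= distance_threshold for p, q in frames):
        return "error"
    return "ok"  # placeholder label, not taken from the text

# Hypothetical coordinates: the two bone points are 5 and then 10 units apart.
frames = [((0, 0), (3, 4)), ((0, 0), (6, 8))]
print(distance_recognition(frames, 5))
print(negative_distance_recognition(frames, 11))
```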

The above description is only an exemplary description of the random part motion recognition in the present invention, and the present invention is not limited thereto. The following will describe an embodiment mode of recognition and error correction of body part motion in the present invention.

The at least one body part action corresponds to one or more process-oriented or displacement-oriented identification items. Each identification item comprises an identification object, an identification parameter and an identification rule, wherein the identification object comprises one or more of: at least one of the three skeleton points of the part action; at least one of the three vectors; and an angle between two of the three vectors.

A process-oriented identification item needs to match the vectors collected in real time against a standard skeleton point coordinate library in order to judge whether the identification item is satisfied. The standard bone point coordinate library stores the coordinates of at least one bone point of the part motion as a time series at a given sampling frequency. For example, for the left arm movement of a push-up, at least the coordinates of the bone points 221, 222, and 223 of the left arm are stored in time series at a sampling frequency of 5 times/second, whereby the first vector 224 and the second vector 225 (and the angle 227) formed by the bone points 221, 222, and 223 can be derived.
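One possible in-memory layout for such a library, together with the derivation of the vectors and the angle from the stored bone points, is sketched below. The layout (one dictionary per sample, keyed by bone point number), the coordinates and the helper names are assumptions; only the sampling frequency of 5 samples per second comes from the example.

```python
import math

SAMPLES_PER_SECOND = 5  # sampling frequency used in the example

# Hypothetical library: one dict per sample, keyed by bone point number.
standard_left_arm = [
    {221: (0.0, 0.0), 222: (0.0, -1.0), 223: (1.0, -1.0)},  # t = 0.0 s
    {221: (0.0, 0.0), 222: (0.0, -1.0), 223: (0.0, -2.0)},  # t = 0.2 s
]

def sample_at(library, t_seconds, rate=SAMPLES_PER_SECOND):
    """Return the sample stored at or just before time t, clamped to the last one."""
    index = min(int(t_seconds * rate), len(library) - 1)
    return library[index]

def part_features(sample, first, middle, last):
    """Derive the three vectors of a body part and the angle (in degrees)
    between the first and second vectors from three stored bone points."""
    (x1, y1), (x2, y2), (x3, y3) = sample[first], sample[middle], sample[last]
    v1 = (x2 - x1, y2 - y1)   # first point to middle point
    v2 = (x3 - x2, y3 - y2)   # middle point to last point
    v3 = (x3 - x1, y3 - y1)   # first point to last point
    cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
    angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))
    return v1, v2, v3, angle

v1, v2, v3, angle = part_features(sample_at(standard_left_arm, 0.0), 221, 222, 223)
print(v1, v2, v3, angle)
```

Storing only the bone points keeps the library small, at the cost of recomputing the vectors and angle at matching time.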

Specifically, the identification items facing the process comprise track identification, negative track identification and hold identification; the identification items facing displacement include displacement identification and negative displacement identification.

Track identification is used to identify whether the part moves along a preset track; if the part does not move along the preset track, an error is prompted. The identification object comprises at least one of the three vectors and/or an included angle between two of the three vectors. The identification parameter sets one or more thresholds corresponding to the identification object. The thresholds comprise a vector threshold for the three vectors and an included angle threshold for the included angle, and the identification parameter adopts the vector threshold and/or the included angle threshold according to the identification object.

Specifically, the vector threshold and the included angle threshold are used to determine whether the vectors (and included angles) collected in real time match the standard vectors (and the included angles between the standard vectors) formed by the standard bone points in the standard bone point coordinate library. For example, referring to FIG. 9, for the vector threshold, the vector $\vec{b} = (x_{bi}, y_{bi})$ from bone point 292 to bone point 293 of a body part motion is collected in real time, and the vector $\vec{a} = (x_{ai}, y_{ai})$ formed by the corresponding bone points 222 to 223 at the corresponding time is found in the standard bone point coordinate library. The cosine of the angle θ between the vector $\vec{a}$ in the standard skeleton point coordinate library and the vector $\vec{b}$ of the body part motion acquired in real time is then calculated:

$$\cos\theta = \frac{x_{ai}x_{bi} + y_{ai}y_{bi}}{\sqrt{x_{ai}^{2} + y_{ai}^{2}}\,\sqrt{x_{bi}^{2} + y_{bi}^{2}}}$$

The cosine value of the included angle θ between the vector $\vec{a}$ and the vector $\vec{b}$ (a value from -1 to 1) is compared with the vector threshold set by the identification parameter. The vector threshold may be set to 0.8; correspondingly, when the cosine value of the included angle θ between the vector $\vec{a}$ and the vector $\vec{b}$ is greater than or equal to 0.8, the two vectors are considered to match. By comparing the vector threshold with the calculated cosine value, it can be determined whether the vector $\vec{b}$ is within the vector threshold.

For example, in an embodiment where an angle threshold is set, the standard bone point coordinate library stores at least the standard bone points in chronological order, and may also store the standard vectors formed by the standard bone points and the included angles between the standard vectors. The included angle between the first vector and the second vector of a body part motion may be calculated from the two vectors or may be stored directly in the standard bone point coordinate library. Referring to fig. 10 and 11, the angle threshold is compared with the ratio α/β, where α is the angle 297 between the first vector 294 (bone point 292 to bone point 291) and the second vector 295 (bone point 292 to bone point 293) of the part motion acquired in real time, and β is the angle 227 between the first vector 224 (bone point 222 to bone point 221) and the second vector 225 (bone point 222 to bone point 223) of the part motion in the standard bone point coordinate library, to determine whether the angle of the part motion acquired in real time is within the range of the angle threshold.
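The ratio test can be illustrated with a few lines of code. The text does not state how the angle threshold range is expressed, so the symmetric band around 1 used here (0.8 to 1.2) and the function name are assumptions for illustration only.

```python
def angle_ratio_within(alpha_deg, beta_deg, low=0.8, high=1.2):
    """Check whether the ratio alpha/beta lies inside an accepted band.

    alpha_deg: angle measured on the live skeleton (angle 297 in the text).
    beta_deg:  angle from the standard library (angle 227 in the text).
    low, high: assumed bounds of the angle threshold range.
    """
    if beta_deg == 0:
        raise ValueError("the standard angle must be non-zero")
    return low <= alpha_deg / beta_deg <= high

print(angle_ratio_within(85.0, 90.0))  # live elbow angle close to the standard
print(angle_ratio_within(60.0, 90.0))  # live elbow angle far from the standard
```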

Furthermore, the identification parameters of track identification also comprise an initial amplitude threshold and an achievement amplitude threshold, wherein the initial amplitude threshold is used for judging whether the part action has started, and the achievement amplitude threshold is used for judging whether the part action has been completed, that is, whether the amplitude has been achieved. Specifically, the initial amplitude and the achievement amplitude are based on the position on the action time axis. In a specific implementation, frame numbers can be used to determine the initial amplitude and the achievement amplitude. For example, assume that an action has 20 frames of data in the standard bone point coordinate library, that the initial amplitude threshold is set to 0.2, and that the achievement amplitude threshold is set to 0.8. The action is then considered to have started when the actual action of the user has its highest matching degree (within the vector threshold) with any frame among the 0th to 4th (i.e. 20 x 0.2) frames in the standard bone point coordinate library. After the action of the user has started, and provided that matching with the standard bone point coordinate library has not failed during the motion, the action is determined to be achieved once the action of the user has its highest matching degree (within the range of the vector threshold) with any frame among the 16th (i.e. 20 x 0.8) to 20th frames in the standard bone point coordinate library. The foregoing is merely an illustrative description of implementations of the invention and is not intended to be limiting thereof.

The recognition rules of track recognition include an achievement rule and, optionally, different error rules corresponding to the set recognition objects and recognition parameters. The achievement rule of track recognition is that: the recognition objects of the part action start from the position represented by the initial amplitude threshold and are all within the set vector threshold and/or included angle threshold; while the recognition objects of the part action move from the position represented by the initial amplitude threshold to the position represented by the achievement amplitude threshold, they are all within the set vector threshold and/or included angle threshold; and the recognition objects of the part action reach the position represented by the achievement amplitude threshold and are all within the set vector threshold and/or included angle threshold. The different error rules of track recognition include: an error of exceeding the corresponding vector threshold (e.g., the upper arm or thigh represented by the first vector is out of threshold); an error of exceeding the included angle threshold (e.g., the angle at the elbow or at the knee represented by the included angle is out of threshold); and an insufficient amplitude error.
The rule for the insufficient amplitude error is that: the recognition objects of the part action start from the position represented by the initial amplitude threshold and are all within the set vector threshold and/or included angle threshold; while the recognition objects of the part action move from the position represented by the initial amplitude threshold toward the position represented by the achievement amplitude threshold, they are all within the set vector threshold and/or included angle threshold; and the recognition objects of the part action do not reach the position represented by the achievement amplitude threshold while remaining within the set vector threshold and/or included angle threshold.
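The achievement rule and the three error rules can be sketched as a small classifier over per-frame observations. The `Observation` type, its fields and the result strings are hypothetical; the sketch assumes that each live frame has already been matched to a position on the standard action time axis and checked against the vector and angle thresholds.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    position: float        # matched position on the action time axis, 0.0 to 1.0
    vector_ok: bool        # recognition object within the vector threshold
    angle_ok: bool = True  # recognition object within the included angle threshold

def classify_track(observations, initial=0.2, achievement=0.8):
    """Apply the track recognition rules to an ordered list of observations."""
    started = False
    for obs in observations:
        if not started:
            # The action starts at the position given by the initial amplitude threshold.
            if obs.position <= initial and obs.vector_ok and obs.angle_ok:
                started = True
            continue
        if not obs.vector_ok:
            return "vector threshold exceeded"
        if not obs.angle_ok:
            return "angle threshold exceeded"
        if obs.position >= achievement:
            return "achieved"
    return "insufficient amplitude" if started else "not started"

print(classify_track([Observation(0.1, True), Observation(0.5, True), Observation(0.9, True)]))
print(classify_track([Observation(0.1, True), Observation(0.5, True), Observation(0.6, True)]))
```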

Negative track identification is used to identify whether the part moves along a preset track; if the part does move along the preset track, an error is prompted. Negative track recognition is similar to track recognition in that the recognition object comprises at least one of the three vectors and/or an angle between two of the three vectors (preferably, the angle between the first vector and the second vector). The identification parameters of negative track identification set one or more thresholds, the thresholds comprising a vector threshold for the three vectors and an included angle threshold for the included angle, and the identification parameters adopt the vector threshold and/or the included angle threshold according to the identification object. Negative track recognition differs from track recognition in its achievement rule, which is that: the recognition objects of the part action start from the position represented by the initial amplitude threshold and are within the set vector threshold and/or included angle threshold; while the recognition objects of the part action move from the position represented by the initial amplitude threshold to the position represented by the achievement amplitude threshold, they are all within the set vector threshold and/or included angle threshold; the recognition objects of the part action reach the position represented by the achievement amplitude threshold and are all within the set vector threshold and/or included angle threshold; and recognition other than negative recognition and hold recognition is currently in progress (in other words, the trajectory or displacement amplitude is growing). When this rule is met, a track error is prompted.
In other words, if the recognition object does not remain within the threshold range set by the recognition parameter while the part action represented by the recognition object produces a trajectory and/or displacement during the movement of the body part, no error is prompted.

Hold recognition is used to identify whether the part is kept in a certain state (for example, kept upright or kept at a certain bending angle) during the motion; if the state is not kept, an error is prompted. The identification object of hold recognition comprises at least one of the three vectors and/or an included angle between two of the three vectors. The identification parameters set one or more thresholds, the thresholds comprising a vector threshold for the three vectors and an included angle threshold for the included angle, and the identification parameters adopt the vector threshold and/or the included angle threshold according to the identification object. The achievement rule of hold recognition is that the recognition object of the part motion is always within the set vector threshold and/or included angle threshold. If the achievement rule of hold recognition is not met, an error corresponding to hold recognition is prompted.

Although displacement recognition and negative displacement recognition are described as displacement-oriented rather than process-oriented recognition items, they actually also need to recognize whether the part action is in a continuous motion state. If the part action is not in a continuous motion state, the recognition is interrupted, and either an error is prompted directly or recognition restarts from the current position.

Displacement identification is used for judging whether the identified object reaches a preset displacement direction and displacement distance; if not, an error is prompted. The recognition object of displacement recognition includes one of the three bone points. Preferably, one skeletal point of the part action is specified. The identification parameters set a displacement distance, a displacement direction (the displacement direction can be mapped to the positive direction of the X axis, the negative direction of the X axis, the positive direction of the Y axis or the negative direction of the Y axis in the two-dimensional coordinates, so the specific displacement direction does not need to be calculated) and an initial amplitude threshold. The initial amplitude threshold of the displacement is a value in the range of 0 to 1. For example, the initial amplitude threshold may be set to 0.2, indicating that the part action, or the displacement recognition, begins when the displacement of the specified bone point exceeds 20% of the set displacement distance. The recognition rules of displacement recognition include an achievement rule and, optionally, different error rules. The achievement rule of displacement identification is that the moving direction of the specified bone point is consistent with the displacement direction set in the identification parameter, and the displacement distance of one continuous motion is greater than or equal to the displacement distance set in the identification parameter.
The different error rules include: when the displacement of the specified bone point does not exceed the initial amplitude threshold, the initial action amplitude is not enough; and when the displacement amplitude of the specified bone point exceeds the initial amplitude threshold, the moving direction of the specified bone point is consistent with the displacement direction set in the identification parameter, and the displacement distance of one continuous motion is less than the displacement distance set in the identification parameter, the achievement amplitude is not enough.
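The displacement rules can be sketched as follows. The direction table, the function name and the result strings are hypothetical, and the sketch assumes that the positions passed in belong to one continuous motion of the specified bone point.

```python
# Unit steps for the four axis-aligned displacement directions.
DIRECTIONS = {"+x": (1, 0), "-x": (-1, 0), "+y": (0, 1), "-y": (0, -1)}

def displacement_recognition(track, distance, direction, initial_threshold=0.2):
    """track: successive (x, y) positions of the specified bone point."""
    dx, dy = DIRECTIONS[direction]
    x0, y0 = track[0]
    # Largest displacement from the first position along the set direction.
    travelled = max((x - x0) * dx + (y - y0) * dy for x, y in track)
    if travelled <= distance * initial_threshold:
        return "initial amplitude not enough"
    if travelled < distance:
        return "achievement amplitude not enough"
    return "achieved"

# Hypothetical track: the bone point moves 5 units in the negative Y direction.
track = [(0, 10), (0, 8), (0, 5)]
print(displacement_recognition(track, distance=5, direction="-y"))
print(displacement_recognition(track, distance=10, direction="-y"))
```

Because the direction is restricted to the four axis directions, the check reduces to a signed difference of one coordinate, which is the saving the parenthetical remark about not calculating the specific direction refers to.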

Negative displacement identification is used for judging whether the identified object reaches a preset displacement direction and displacement distance; if so, an error is prompted. As in displacement recognition, the recognition object includes one of the three bone points. Preferably, one skeletal point of the part action is specified. The identification parameters set a displacement distance, a displacement direction (the displacement direction can be mapped to the positive direction of the X axis, the negative direction of the X axis, the positive direction of the Y axis or the negative direction of the Y axis in the two-dimensional coordinates) and an initial amplitude threshold. The achievement rule of negative displacement recognition is that the moving direction of the specified bone point coincides with the displacement direction set in the recognition parameter, the displacement distance of one continuous motion is equal to or greater than the displacement distance set in the recognition parameter, and recognition other than negative recognition and hold recognition is currently in progress (in other words, the trajectory or the displacement amplitude is increasing). When this rule is met, a track error is prompted. In other words, if during the movement of the body part the recognition object does not move in the displacement direction set by the recognition parameter, or its movement distance is less than the displacement distance set by the recognition parameter, no error is prompted.

In the above embodiments, a difficulty factor may be added. For example, the product of the difficulty factor and the achievement condition of each action may be used as the achievement condition for actions of different difficulty.

Identification items are set for at least one part action of an action; the at least one part action and its identification items form the action file of the action; and the action file is stored in the standard action database in association with the action number.
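One way to picture an action file and its association with an action number is the sketch below. The class names, field names, the action number "A-0001" and the parameter keys are all hypothetical; the text only specifies that part actions, their identification items and the action number are stored together.

```python
from dataclasses import dataclass, field

@dataclass
class IdentificationItem:
    kind: str                      # e.g. "track", "hold", "displacement"
    obj: str                       # identification object, e.g. "first vector"
    params: dict = field(default_factory=dict)  # identification parameters

@dataclass
class ActionFile:
    action_number: str
    part_items: dict = field(default_factory=dict)  # part name -> list of items

# Hypothetical stand-in for the standard action database.
standard_action_db = {}

def store_action_file(action_file):
    """Store the action file keyed by its action number."""
    standard_action_db[action_file.action_number] = action_file

push_up = ActionFile(
    action_number="A-0001",
    part_items={
        "left arm": [
            IdentificationItem(
                kind="track",
                obj="first vector",
                params={"vector_threshold": 0.8,
                        "initial_amplitude": 0.2,
                        "achievement_amplitude": 0.8},
            )
        ]
    },
)
store_action_file(push_up)
print(standard_action_db["A-0001"].part_items["left arm"][0].kind)
```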

In one embodiment, for a deep squat action, identification items are set for the torso, the left leg and the right leg. The identification items of the torso include a hold identification and a displacement identification. In the hold identification of the torso, the identification object is only the first vector from the head center to the spine center of the neck, the parameters of the first vector are set correspondingly, and a standard skeleton point coordinate library of the skeleton points of the torso during the deep squat is stored for subsequent matching. When the first vector of the torso acquired in real time exceeds the threshold of the first vector, the body has not been kept upright, and an error is prompted. Here, owing to the characteristics of the torso, when the first vector from the head center to the spine center of the neck remains upright, it can generally be concluded directly that the second vector from the spine center of the neck to the spine center of the torso also remains upright. Only the threshold of one vector is therefore set, which reduces the subsequent amount of calculation and improves the efficiency of subsequent real-time error correction.

In the displacement recognition of the torso, the recognition object is the skeleton point at the spine center of the torso, and the corresponding recognition parameters are a predetermined displacement distance and a predetermined displacement direction (the negative direction of the Y axis) of that skeleton point. When the spine center of the torso moves more than the predetermined displacement distance in the negative Y-axis direction, this identification of the part motion is achieved. If the spine center of the torso does not move more than the predetermined displacement distance in the negative Y-axis direction, the amplitude of the part motion is not enough.

The left leg is provided with negative displacement recognition, which reminds the user that the knee should not go beyond the toes during the deep squat. In the negative displacement recognition of the left leg, the recognition object is the joint point of the left knee, and the recognition parameters are a predetermined displacement distance, a predetermined displacement direction (the positive X-axis direction), and an initial amplitude threshold. When the left knee moves more than the predetermined displacement distance in the positive X-axis direction, an error in the part motion is prompted. When the left knee does not move more than the predetermined displacement distance in the positive X-axis direction, this recognition of the part motion is achieved. The identification item of the right leg is the same as that of the left leg, and is not described again here.
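The torso displacement recognition and the knee negative displacement recognition of the deep squat can be combined in one check, as in the sketch below. The function names, messages, coordinates and limits are assumptions; the hold recognition of the torso and the initial amplitude threshold are left out to keep the example short.

```python
# Unit steps for the four axis-aligned displacement directions.
AXES = {"+x": (1, 0), "-x": (-1, 0), "+y": (0, 1), "-y": (0, -1)}

def travel_along(track, direction):
    """Largest displacement of a bone point from its first position along one axis."""
    dx, dy = AXES[direction]
    x0, y0 = track[0]
    return max((x - x0) * dx + (y - y0) * dy for x, y in track)

def check_squat(spine_track, knee_track, squat_depth, knee_limit):
    """Return the list of error messages for one squatting-down movement."""
    messages = []
    # Displacement recognition: the spine center must move down far enough.
    if travel_along(spine_track, "-y") < squat_depth:
        messages.append("torso: amplitude not enough")
    # Negative displacement recognition: the knee must not move forward too far.
    if travel_along(knee_track, "+x") >= knee_limit:
        messages.append("left leg: knee moved too far forward")
    return messages

# Hypothetical tracks of the spine center and the left knee.
good = check_squat([(0, 100), (0, 70), (0, 50)], [(30, 40), (33, 40)], 40, 10)
bad = check_squat([(0, 100), (0, 80)], [(30, 40), (45, 40)], 40, 10)
print(good)
print(bad)
```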

In some embodiments, each action may be divided into stages. For example, a deep squat may be divided into two stages, squatting down and rising. In some embodiments, for back-and-forth movements such as squats and push-ups, identification items may be set and recognized for only one pass of the back-and-forth movement. For example, the setting of identification items and the recognition and error correction are carried out only on the squatting-down movement of the deep squat, or only on one pass of the push-up, thereby further reducing the calculation amount of action identification and improving the real-time performance of error correction.

After the target action is determined, the identified bone data is used as a to-be-detected action, the to-be-detected action is divided into at least one to-be-detected action stage according to the time of the target action stage of the target action, and the target action stage and the to-be-detected action stage corresponding to the time form a matching group.

Specifically, for example, the target action is a deep squat, and is divided into two target action stages: squat and rise, the squat time being 2 seconds and the rise time being 2 seconds. According to time, correspondingly dividing the action to be tested into two action stages to be tested: squat down and rise up. And forming a matching group by the target action stage corresponding to squatting and the action stage to be tested, and forming a matching group by the target action stage corresponding to rising and the action stage to be tested.
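Dividing the action to be tested by the durations of the target action stages can be sketched as below. The function name and the frame rate of the live data are assumptions; the stage names and the two-second durations come from the example.

```python
def split_into_stages(frames, target_stages, frames_per_second):
    """Divide the frames of the action to be tested by the target stage durations.

    target_stages: ordered list of (stage name, duration in seconds).
    Returns a list of matching groups as (stage name, frames of that stage).
    """
    groups = []
    start = 0
    for name, seconds in target_stages:
        end = start + round(seconds * frames_per_second)
        groups.append((name, frames[start:end]))
        start = end
    return groups

# 4 seconds of live skeleton data at an assumed 5 frames per second.
frames = list(range(20))
groups = split_into_stages(frames, [("squat", 2), ("rise", 2)], frames_per_second=5)
for name, stage_frames in groups:
    print(name, stage_frames[0], stage_frames[-1])
```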

In each matching group, the action stage to be tested is divided into corresponding action of the part to be tested according to the action of the target part in the target action stage, and the action of the part to be tested in the action stage to be tested and the action of the target part in the corresponding target action stage form a part matching group.

For example, the motion phase to be measured is divided into five motion parts of a left arm, a right arm, a left leg, a right leg and a trunk. If the left arm, the right arm, the trunk and a random part of the target action stage are provided with identification items, taking the action of the part to be detected of the left arm and the action of the target part as a part matching group; taking the motion of the part to be measured of the right arm and the motion of the target part as a part matching group; taking the motion of the part to be detected of the trunk and the motion of the target part as a part matching group; the motion of the part to be measured of the random part and the motion of the target part are used as a part matching group.

For each part matching group, at least the identification item of the random part action in the target part action is acquired, a vector formed by at least two selected bone points in the action of the part to be detected is acquired according to the two-dimensional bone action model, and the vector of the random part action is subjected to matching calculation with the standard vector formed by the corresponding standard coordinates in the standard bone point coordinate library, so that the result can be compared with the vector threshold set by the identification parameters to obtain action identification feedback. That is, identification and error correction are performed according to the content of the different identification items as described in step S110 above.

In one embodiment, each of the fitness videos has a video file, the video file includes a number of a target action in the fitness video and a playing time of the target action, and the step S110 further includes: when the target action is played, searching a standard action database for a target action file of the number of the target action, wherein the target action file and the number of the target action are stored in the standard action database in a correlation manner, and each target action file comprises a target action stage of the target action, a target part action and an identification item corresponding to the target part action.

The action recognition feedback of the at least one matching group is integrated to obtain the action recognition feedback of the action to be detected.

In some embodiments, the target action at least includes a plurality of target action phases having a sequence, and when the action identification feedback of the previous target action phase and the corresponding action phase to be tested is that the action is not achieved, the action identification feedback achieved by the action of the subsequent target action phase and the corresponding action phase to be tested is invalid.
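The ordering constraint between stages can be sketched as follows. The function name and the way the overall result is combined (all stages must be valid) are assumptions; the text only states that achieved feedback of a later stage is invalid when an earlier stage was not achieved.

```python
def integrate_feedback(stage_results):
    """stage_results: ordered list of (stage name, achieved flag).

    A stage counts as achieved only if every earlier stage was achieved.
    Returns the per-stage validity and an overall result.
    """
    valid = []
    previous_ok = True
    for name, achieved in stage_results:
        valid.append((name, achieved and previous_ok))
        previous_ok = previous_ok and achieved
    overall = all(ok for _, ok in valid)
    return valid, overall

print(integrate_feedback([("squat", True), ("rise", True)]))
print(integrate_feedback([("squat", False), ("rise", True)]))
```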

On the one hand, the bone identification and the action identification of the video data to be identified from the mobile terminal are performed on the server side, which greatly reduces the calculation amount of the mobile terminal. Because the server side, which has better computing resources, performs most of the calculation, the identification efficiency is improved, the identification response time is reduced, and the user experience is improved. On the other hand, for each action to be detected, the invention simplifies the skeleton points according to the body structure, divides the action into part actions to be detected in units of three skeleton points, and identifies the part actions to be detected through process-oriented identification items. In a process-oriented identification item, the vector formed by the skeleton points collected in real time and the vector formed by the skeleton point coordinates in the standard skeleton point coordinate library undergo only a simple calculation whose result is compared with the set vector threshold. The calculation amount for setting and matching the skeleton points and vectors is therefore small, so that real-time feedback can be realized without feedback delay.

In an exemplary embodiment of the present disclosure, a computer-readable storage medium is also provided, on which a computer program is stored, which when executed by, for example, a processor, may implement the steps of the motion recognition method described in any of the above embodiments. In some possible embodiments, aspects of the invention may also be implemented in the form of a program product comprising program code means for causing a terminal device to carry out the steps according to various exemplary embodiments of the invention described in the above-mentioned action recognition method section of the present description, when said program product is run on the terminal device.

Referring to fig. 12, a program product 300 for implementing the above method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited in this regard and, in the present document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++, C#, or the like, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the tenant computing device, partly on the tenant device, as a stand-alone software package, partly on the tenant computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing devices may be connected to the tenant computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).

Engineering programs for performing the operations of the present invention may be built in any combination of one or more programming Integrated Development Environments (IDE), game development engines, such as Unity3D, Unreal, Visual Studio, and the like.

In an exemplary embodiment of the present disclosure, there is also provided an electronic device, which may include a processor, and a memory for storing executable instructions of the processor. Wherein the processor is configured to perform the steps of the action recognition method in any of the above embodiments via execution of the executable instructions.

As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or program product. Thus, various aspects of the invention may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.) or an embodiment combining hardware and software aspects that may all generally be referred to herein as a "circuit," "module," or "system."

An electronic device 600 according to this embodiment of the invention is described below with reference to fig. 13. The electronic device 600 shown in fig. 13 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.

As shown in fig. 13, the electronic device 600 is embodied in the form of a general purpose computing device. The components of the electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one storage unit 620, a bus 630 that connects the various system components (including the storage unit 620 and the processing unit 610), a display unit 640, and the like.

Wherein the storage unit stores program code executable by the processing unit 610 to cause the processing unit 610 to perform steps according to various exemplary embodiments of the present invention described in the above-mentioned action recognition method section of the present specification. For example, the processing unit 610 may perform the steps as described in fig. 1.

The storage unit 620 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM) 6201 and/or a cache memory unit 6202, and may further include a read-only memory unit (ROM) 6203.

The storage unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, such program modules 6205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.

Bus 630 may represent one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.

The electronic device 600 may also communicate with one or more external devices 700 (e.g., keyboard, pointing device, Bluetooth device, etc.), with one or more devices that enable a tenant to interact with the electronic device 600, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 600 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 650. Also, the electronic device 600 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 660. The network adapter 660 may communicate with other modules of the electronic device 600 via the bus 630. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.

Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB disk, a removable hard disk, etc.) or on a network, and which includes several instructions to enable a computing device (which may be a personal computer, a server, or a network device, etc.) to execute the above-mentioned action recognition method according to the embodiments of the present disclosure.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims

1. A motion recognition method, comprising:

2. The motion recognition method according to claim 1, wherein the step b further comprises:

the server side uploads the video data to be identified to a first database;

correspondingly, the step c further comprises:

3. The motion recognition method according to claim 1, wherein the step d further comprises:

correspondingly, the step e further comprises:

4. The motion recognition method according to claim 1, wherein the step of motion recognition comprises:

5. The motion recognition method according to claim 4, wherein the target part motion includes the motions of 5 body parts into which the body is divided and at least one random part motion, the body part including: the random part is composed of at least two bone points selected from the body part, the random part motion corresponds to at least one process-oriented identification item, each identification item comprises an identification object, an identification parameter, an identification rule and a standard bone point coordinate base, the identification object comprises a vector formed by at least two bone points of the random part in the process-oriented identification item, and the standard bone point coordinate base stores the standard coordinates of all the bone points in the target action in time sequence.

6. The motion recognition method according to claim 5, wherein the performing motion recognition on each part matching group to obtain the motion recognition result of the part matching group comprises:

7. An action recognition device, comprising:

8. The motion recognition apparatus of claim 7, comprising a plurality of receiving modules, the motion recognition apparatus further comprising:

9. The motion recognition apparatus of claim 7, wherein the motion recognition apparatus comprises a plurality of the bone recognition modules and/or a plurality of motion recognition modules.

10. The motion recognition apparatus according to claim 7, further comprising:

11. The motion recognition apparatus according to claim 7, further comprising:

12. A motion recognition system, comprising:

the motion recognition apparatus according to any one of claims 8 to 11, configured to receive the video data to be recognized and perform motion recognition on the video data to be recognized.

13. An electronic device, characterized in that the electronic device comprises:

a processor;

a storage medium having stored thereon a computer program which, when executed by the processor, performs the steps of any of claims 1 to 7.

14. A storage medium, having stored thereon a computer program which, when executed by a processor, performs the steps of any of claims 1 to 7.