CN115565244A - Winding body action counting method and counting system based on machine vision - Google Patents

Winding body action counting method and counting system based on machine vision

Info

Publication number
CN115565244A
Authority
CN
China
Prior art keywords
point
feature point
bitmap
frame
points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211237831.1A
Other languages
Chinese (zh)
Inventor
唐义平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Yishi Technology Co ltd
Original Assignee
Anhui Yishi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Yishi Technology Co ltd filed Critical Anhui Yishi Technology Co ltd
Priority to CN202211237831.1A
Publication of CN115565244A
Legal status: Pending (current)


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 - Scenes; Scene-specific elements
    • G06V 20/40 - Scenes; Scene-specific elements in video content
    • G06V 20/41 - Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V 20/42 - Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
    • A - HUMAN NECESSITIES
    • A63 - SPORTS; GAMES; AMUSEMENTS
    • A63B - APPARATUS FOR PHYSICAL TRAINING, GYMNASTICS, SWIMMING, CLIMBING, OR FENCING; BALL GAMES; TRAINING EQUIPMENT
    • A63B 71/00 - Games or sports accessories not covered in groups A63B1/00 - A63B69/00
    • A63B 71/06 - Indicating or scoring devices for games or players, or for other sports activities
    • A63B 71/0619 - Displays, user interfaces and indicating devices, specially adapted for sport equipment, e.g. display mounted on treadmills
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/20 - Image preprocessing
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 - Movements or behaviour, e.g. gesture recognition
    • G06V 40/23 - Recognition of whole body movements, e.g. for sport training
    • A - HUMAN NECESSITIES
    • A63 - SPORTS; GAMES; AMUSEMENTS
    • A63B - APPARATUS FOR PHYSICAL TRAINING, GYMNASTICS, SWIMMING, CLIMBING, OR FENCING; BALL GAMES; TRAINING EQUIPMENT
    • A63B 71/00 - Games or sports accessories not covered in groups A63B1/00 - A63B69/00
    • A63B 71/06 - Indicating or scoring devices for games or players, or for other sports activities
    • A63B 71/0619 - Displays, user interfaces and indicating devices, specially adapted for sport equipment, e.g. display mounted on treadmills
    • A63B 2071/0647 - Visualisation of executed movements

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Computational Linguistics (AREA)
  • Physical Education & Sports Medicine (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The invention discloses a machine-vision-based curl-up motion counting method. A video stream is saved and converted into images, and the images are processed to obtain a bitmap of feature points corresponding to human body parts, so that the human body posture is captured. Curl-up motions are counted by using the position of a dynamic feature point as the criterion for judging that a curl-up motion has been completed. This simplifies image detection and greatly reduces the amount of image data to be processed while maintaining counting accuracy, so that manual counting can be replaced. The invention further provides a machine-vision-based curl-up motion counting system.

Description

Winding body action counting method and counting system based on machine vision
Technical Field
The invention relates to the technical field of curl-up motion counting, and in particular to a machine-vision-based curl-up motion counting method and counting system.
Background
The sit-up is an important exercise in student physical training and military physical training. For body movements such as sit-ups and abdominal curls, conventional detection devices have poor applicability because of differences in the body sizes of the subjects. In most tests, the subject lies on a mat and an observer counts the repetitions manually. Exercising and testing in this manner requires additional labor, and non-standard movements may still be counted, so the results may not meet the accuracy requirements of the final test.
Disclosure of Invention
In order to solve the technical problems in the background art, the invention provides a machine-vision-based curl-up motion counting method and counting system.
The invention provides a machine-vision-based curl-up motion counting method, comprising the following steps:
S1, acquiring a real-time video stream of the subject while the curl-up motion is performed;
S2, storing the real-time video stream frame by frame and generating images;
S3, preprocessing the images;
S4, feeding the preprocessed images into a feature point detection network for feature point detection and obtaining feature point bitmaps, wherein the feature point bitmap comprises dynamic feature points; and
detecting the feature point bitmaps in sequence, judging that one curl-up motion has been completed according to the position of the dynamic feature point, and incrementing the count by one.
Preferably, in S4, judging that one curl-up motion has been completed according to the position of the dynamic feature point specifically means: when the X-th detection finds the dynamic feature point in the Y-th frame feature point bitmap at a preset position, and the (X+1)-th detection finds the dynamic feature point in the (Y+W)-th frame feature point bitmap at the preset position, it is judged that one curl-up motion was completed between the Y-th frame and the (Y+W)-th frame; wherein X, Y and W are all positive integers.
Preferably, the feature point bitmap further includes a static reference point, and "the position of the dynamic feature point is located at a preset position" specifically means that the dynamic feature point is judged to be at the preset position with reference to the static reference point;
preferably, when the preset position lies within a preset position threshold range, X is a positive odd number, and the preset position threshold range does not include the static reference point;
preferably, the dynamic feature point is judged to be at the preset position according to the distance between the dynamic feature point and the static reference point;
preferably, the dynamic feature point is an elbow point and the static reference point is a knee point.
Preferably, in S4, the static reference point is a knee point and the dynamic feature point is an elbow point;
when the X-th detection finds that the distance between the elbow point and the knee point in the Y-th frame feature point bitmap is M_0, and the (X+1)-th detection finds that the distance between the elbow point and the knee point in the (Y+W)-th frame feature point bitmap is M_0, it is judged that one curl-up motion was completed between the Y-th frame and the (Y+W)-th frame, and the count is incremented by one;
wherein M_0 is a preset distance threshold, M_0 ≥ 0, and X, Y and W are all positive integers.
Preferably, the feature points further include a static reference point, the static reference point is a knee point, and the dynamic feature point is an elbow point;
in S4, "judging that one curl-up motion has been completed according to the position of the dynamic feature point" specifically comprises:
the X-th detection finds that the distance between the elbow point and the knee point in the Z-th frame feature point bitmap is M_XZ; when M_XZ < M_X(Z-1) and M_XZ < M_X(Z+1), it is judged that one curl-up motion has been completed;
wherein M_X ≥ 0, and X, Z and W are all positive integers;
preferably, in S4, "determining that one roll motion is completed according to the position of the dynamic feature point" specifically includes:
detecting that the distance between the elbow point and the knee point in the Z-th frame feature point bitmap is M for the X time XZ Satisfy M XZ >M X(Z-1) And M XZ >M X(Z+1) While satisfying M (X+1)(Z+U) >M (X+1)(Z+U-1) And M (X+1)(Z+U) >M (X+1)(Z+U+1) Judging that one rolling motion is finished;
wherein M is X Not less than 0, and X, Z, U are all positive integers.
Preferably, in the X-th detection, the distance M_X between the elbow point and the knee point satisfies M_X ∈ (M_max - T_2, M_max), where T_2 is a preset maximum-value threshold;
if the distance M_XZ between the elbow point and the knee point in the Z-th frame feature point bitmap is greater than the distance M_X(Z-1) in the (Z-1)-th frame feature point bitmap, and the distance M_XZ in the Z-th frame feature point bitmap is greater than the distance M_X(Z+1) in the (Z+1)-th frame feature point bitmap,
then M_XZ is identified as M_Xmax.
Preferably, the counting method further comprises: detecting in sequence all feature point bitmaps between the X-th detection and the (X+1)-th detection, and if the distance between the elbow point and the knee point in any frame feature point bitmap is greater than a first preset violation threshold, decrementing the count by one;
and/or the feature points further comprise a hand point and a head point; the counting method further comprises: detecting in sequence all feature point bitmaps between the X-th detection and the (X+1)-th detection, and if the distance between the hand point and the head point in any frame feature point bitmap is greater than a second preset violation threshold, decrementing the count by one;
and/or the feature points further comprise a crotch point and a foot point; the counting method further comprises: detecting in sequence all feature point bitmaps between the X-th detection and the (X+1)-th detection, and if the angle centered on the knee point and formed by the crotch point, the knee point and the foot point in any frame feature point bitmap is greater than a third preset violation threshold, decrementing the count by one;
and/or the feature points further comprise a shoulder point and a crotch point; the counting method further comprises: detecting in sequence all feature point bitmaps between the X-th detection and the (X+1)-th detection, and if the height difference between the shoulder point and the crotch point in any frame feature point bitmap is greater than a fourth preset violation threshold, decrementing the count by one.
The machine-vision-based curl-up motion counting method provided by the invention saves a video stream to generate images, processes the images to obtain a bitmap of feature points corresponding to human body parts so that the human body posture is captured, and counts curl-up motions by using the position of the dynamic feature point as the criterion for judging that a curl-up motion has been completed. This simplifies image detection and greatly reduces the amount of image data to be processed while maintaining counting accuracy, so that manual counting can be replaced.
The invention also provides a machine-vision-based curl-up motion counting system, comprising:
a feature point detection network;
a video acquisition module, configured to acquire a real-time video stream of the subject while the curl-up motion is performed;
a video stream processing module, configured to store the real-time video stream and generate images;
an image preprocessing module, configured to preprocess the generated images;
a feature point bitmap acquisition module, configured to feed the preprocessed images into the feature point detection network for feature point detection and obtain feature point bitmaps, wherein the feature point bitmap comprises dynamic feature points;
a curl-up motion detection module, configured to detect the feature point bitmaps in sequence and judge, according to the position of the dynamic feature point, that one curl-up motion has been completed; and
a counter module, configured to increment the count by one when the curl-up motion detection module detects that one curl-up motion has been completed.
Preferably, in the feature point bitmap acquisition module, judging that one curl-up motion has been completed according to the position of the dynamic feature point specifically means: when the X-th detection finds the dynamic feature point in the Y-th frame feature point bitmap at a preset position, and the (X+1)-th detection finds the dynamic feature point in the (Y+W)-th frame feature point bitmap at the preset position, it is judged that one curl-up motion was completed between the Y-th frame and the (Y+W)-th frame; wherein X, Y and W are positive integers;
preferably, the feature point bitmap further includes a static reference point, and "the position of the dynamic feature point is located at a preset position" specifically means that the dynamic feature point is judged to be at the preset position with reference to the static reference point;
preferably, when the preset position lies within a preset position threshold range, X is a positive odd number, and the preset position threshold range does not include the static reference point;
preferably, the dynamic feature point is judged to be at the preset position according to the distance between the dynamic feature point and the static reference point;
preferably, the dynamic feature point is an elbow point and the static reference point is a knee point.
Preferably, in the feature point bitmap acquisition module, the static reference point is a knee point and the dynamic feature point is an elbow point;
when the X-th detection finds that the distance between the elbow point and the knee point in the Y-th frame feature point bitmap is M_0, and the (X+1)-th detection finds that the distance between the elbow point and the knee point in the (Y+W)-th frame feature point bitmap is M_0, it is judged that one curl-up motion was completed between the Y-th frame and the (Y+W)-th frame, and the count is incremented by one;
wherein M_0 is a preset distance threshold, M_0 ≥ 0, and X, Y and W are all positive integers;
preferably, the first and second electrodes are formed of a metal,
the characteristic points further comprise static reference points, the static reference points are knee points, and the dynamic characteristic points are elbow points;
in the feature point bitmap acquisition module, "determining that a scroll action is completed according to the position of the dynamic feature point" specifically includes:
detecting that the distance between the elbow point and the knee point in the Z-th frame feature point bitmap is M for the X time XZ When M is XZ <M X(Z-1) And M is XZ <M X(Z+1) Judging that one roll body action is finished;
wherein M is X Not less than 0, and X, Z, W are all positive integers;
preferably, in the feature point bitmap acquisition module, "determining that a scroll action is completed according to the position of the dynamic feature point" specifically includes:
detecting that the distance between the elbow point and the knee point in the Z-th frame feature point bitmap is M for the X time XZ Satisfy M XZ >M X(Z-1) And M XZ >M X(Z+1) While satisfying M (X+1)(Z+U) >M (X+1)(Z+U-1) And M (X+1)(Z+U) >M (X+1)(Z+U+1) Judging that one rolling motion is finished;
wherein M is X Not less than 0, and X, Z, U are all positive integers.
Preferably, in the feature point bitmap acquisition module, all feature point bitmaps between the X-th detection and the (X+1)-th detection are detected in sequence, and if the distance between the elbow point and the knee point in any frame feature point bitmap is greater than a first preset violation threshold, the count is decremented by one;
and/or the feature points further comprise a hand point and a head point; in the feature point bitmap acquisition module, all feature point bitmaps between the X-th detection and the (X+1)-th detection are detected in sequence, and if the distance between the hand point and the head point in any frame feature point bitmap is greater than a second preset violation threshold, the count is decremented by one;
and/or the feature points further comprise a crotch point and a foot point; in the feature point bitmap acquisition module, all feature point bitmaps between the X-th detection and the (X+1)-th detection are detected in sequence, and if the angle centered on the knee point and formed by the crotch point, the knee point and the foot point in any frame feature point bitmap is greater than a third preset violation threshold, the count is decremented by one;
and/or the feature points further comprise a shoulder point and a crotch point; in the feature point bitmap acquisition module, all feature point bitmaps between the X-th detection and the (X+1)-th detection are detected in sequence, and if the height difference between the shoulder point and the crotch point in any frame feature point bitmap is greater than a fourth preset violation threshold, the count is decremented by one.
The technical effect of the proposed machine-vision-based curl-up motion counting system is similar to that of the above method and is therefore not repeated here.
Detailed Description
The invention provides a machine-vision-based curl-up motion counting method, comprising the following steps:
S1, acquiring a real-time video stream of the subject while the curl-up motion is performed;
S2, storing the real-time video stream frame by frame and generating images;
S3, preprocessing the images;
S4, feeding the preprocessed images into a feature point detection network for feature point detection and obtaining feature point bitmaps, wherein the feature point bitmap comprises dynamic feature points; and
detecting the feature point bitmaps in sequence, judging that one curl-up motion has been completed according to the position of the dynamic feature point, and incrementing the count by one.
In this embodiment, the proposed machine-vision-based curl-up motion counting method stores a video stream to generate images, processes the images to obtain a bitmap of feature points corresponding to human body parts so that the human body posture is captured, and counts curl-up motions by using the position of the dynamic feature point as the criterion for judging that a curl-up motion has been completed. This simplifies image detection and greatly reduces the amount of image data to be processed while maintaining counting accuracy, so that manual counting can be replaced.
The counting method proposed in this embodiment is described in detail below, taking the sit-up exercise as an example. The counting method of this embodiment comprises the following steps:
S1, acquiring a real-time video stream of the subject while performing sit-ups;
S2, storing the real-time video stream frame by frame and generating images;
S3, preprocessing the images;
S4, feeding the preprocessed images into a feature point detection network for feature point detection and obtaining feature point bitmaps, wherein the feature point bitmap comprises dynamic feature points.
specifically, in the sit-up exercise, one exercise cycle is from when the subject lies down on his upper body to when the subject sits up with his elbows touching their knees. The step of judging that a scroll motion is completed according to the position of the dynamic feature point is specifically that when the position of the dynamic feature point detected in the Y-th frame feature point bitmap for the X-th time is located at a preset position and the position of the dynamic feature point detected in the Y + W-th frame feature point bitmap for the X + 1-th time is located at the preset position, a scroll motion from the Y-th frame to the Y + W-th frame is judged to be completed; wherein X, Y, W are positive integers. In the actual selection, one point of the body of the tested person moving in the sit-up process can be selected as a dynamic characteristic point, and any position on the motion track of the dynamic characteristic point is taken as a preset position; and then, sequentially detecting the feature point bitmap. For example, when the position of the dynamic feature point is detected to be located at the preset position, the testee is in the process from the supine position to the sitting-up position, the position of the dynamic feature point is detected to be located at the preset position again next time, and the testee is in the process of resetting from the sitting-up position to the supine position, then the testee completes one supine starting action. And vice versa.
For the human body feature point detection networks used in this field, the network structure may comprise a backbone network, a feature fusion network and a detection network. The backbone network inherits the darknet53 network structure. The feature fusion network fuses the third downsampled feature map A, the fourth downsampled feature map B and the fifth downsampled feature map C, whose sizes are 52 × 52 × 128, 26 × 26 × 256 and 13 × 13 × 512 respectively. The fifth downsampled feature map C is passed through a 1 × 1 convolution to change the number of channels to 256, then upsampled and fused with the fourth downsampled feature map B to form a new fourth downsampled feature map D; the new fourth downsampled feature map D is passed through a 1 × 1 convolution to change the number of channels to 128, then upsampled and fused with the third downsampled feature map A to form a new third downsampled feature map E.
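The fusion steps described above can be illustrated with a short sketch. The following PyTorch snippet only illustrates the described channel reduction, upsampling and fusion of maps A, B and C; the patent does not specify the fusion operator, so channel concatenation is assumed here, and all layer names are placeholders rather than the patented implementation.

```python
# Hedged sketch of the described feature fusion (assumption: fusion = channel concatenation).
import torch
import torch.nn as nn

class FeatureFusion(nn.Module):
    def __init__(self):
        super().__init__()
        self.reduce_c = nn.Conv2d(512, 256, kernel_size=1)  # 1x1 conv: map C (512 ch) -> 256 ch
        self.reduce_d = nn.Conv2d(512, 128, kernel_size=1)  # 1x1 conv: fused map D (512 ch) -> 128 ch
        self.up = nn.Upsample(scale_factor=2, mode="nearest")

    def forward(self, a, b, c):
        # a: (N, 128, 52, 52), b: (N, 256, 26, 26), c: (N, 512, 13, 13)
        d = torch.cat([self.up(self.reduce_c(c)), b], dim=1)  # new fourth map D: (N, 512, 26, 26)
        e = torch.cat([self.up(self.reduce_d(d)), a], dim=1)  # new third map E:  (N, 256, 52, 52)
        return d, e

fusion = FeatureFusion()
d, e = fusion(torch.zeros(1, 128, 52, 52),
              torch.zeros(1, 256, 26, 26),
              torch.zeros(1, 512, 13, 13))
print(d.shape, e.shape)
```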
In a specific implementation of this embodiment, the feature point bitmap further includes a static reference point, and the dynamic feature point is judged to be at the preset position with reference to the static reference point, which simplifies the position calculation for the dynamic feature point. For example, the dynamic feature point is judged to be at the preset position according to the distance between the dynamic feature point and the static reference point.
In one embodiment, when the preset position lies within a preset position threshold range, the preset position threshold range does not include the static reference point. In one sit-up, the start point and the end point of the dynamic feature point's trajectory can serve as static reference points, and the preset position threshold range lies between the start point and the end point on the trajectory. When the preset position is anywhere between the start point and the end point, the dynamic feature point passes through the preset position twice per repetition: once while sitting up and once while returning. To avoid double counting, X is taken as a positive odd number, and the count is incremented only on every other detection. In a specific choice of feature point and reference point, the dynamic feature point is an elbow point and the static reference point is a knee point.
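The every-other-detection counting described above can be sketched as follows. This is only an illustrative sketch: the preset position is assumed to be expressed as a target elbow-to-knee distance, and the function name and tolerance value are placeholders not taken from the patent.

```python
# Hedged sketch: the preset position lies strictly between the start and end of the
# elbow trajectory, so it is crossed twice per repetition; a repetition is counted
# on every even-numbered pass (the (X+1)-th pass in the patent's notation).
def count_reps(distances, preset, tol=5.0):
    """distances: per-frame elbow-to-knee distance; preset: target distance."""
    count = 0
    passes = 0          # X in the patent's notation
    inside = False      # currently within the preset-position band?
    for d in distances:
        at_preset = abs(d - preset) <= tol
        if at_preset and not inside:
            passes += 1
            if passes % 2 == 0:      # the 2nd, 4th, ... pass closes a repetition
                count += 1
        inside = at_preset
    return count

print(count_reps([100, 80, 60, 40, 20, 40, 60, 80, 100], preset=60))  # -> 1
```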
In another embodiment, when the preset position is at the start point or the end point of the dynamic feature point's trajectory, the dynamic feature point passes through the preset position only once per sit-up-and-return cycle. In this case, every detection of the dynamic feature point at the preset position corresponds to one completed curl-up motion, and the count is incremented on every detection. For ease of calculation, the preset position may be set at the static reference point, in which case the dynamic feature point is judged to be at the preset position according to the distance between the dynamic feature point and the static reference point. In a sit-up, the dynamic feature point may be the elbow point and the static reference point the knee point.
In a specific way of computing whether the position of the dynamic feature point is at the preset position, the static reference point is a knee point and the dynamic feature point is an elbow point;
when the X-th detection finds that the distance between the elbow point and the knee point in the Y-th frame feature point bitmap is M_0, and the (X+1)-th detection finds that the distance between the elbow point and the knee point in the (Y+W)-th frame feature point bitmap is M_0, it is judged that one curl-up motion was completed between the Y-th frame and the (Y+W)-th frame, and the count is incremented by one;
wherein M_0 is a preset distance threshold, M_0 ≥ 0, and X, Y and W are all positive integers.
Because the sit-up is a reciprocating motion of the upper body, for ease of data processing, in the ideal case where M_0 = 0 or M_0 = M_MAX, one curl-up motion can be judged to be completed from a single detection. That is, the motion is judged complete each time the elbow point touches the knee point, or each time the elbow point returns to its initial lying position. In actual detection, however, the posture of each repetition has a tolerance, so the start and end points of the elbow point's trajectory always deviate somewhat; that is, M_0 cannot be set to an exact threshold value.
Therefore, when completion is judged from the contact between the elbow point and the knee point, "judging that one curl-up motion has been completed according to the position of the dynamic feature point" is specifically: the X-th detection finds that the distance between the elbow point and the knee point in the Z-th frame feature point bitmap is M_XZ; when M_XZ < M_X(Z-1) and M_XZ < M_X(Z+1), it is judged that one curl-up motion has been completed, wherein M_X ≥ 0 and X, Z and W are all positive integers. In the X-th detection, when the elbow-to-knee distance M_XZ in the Z-th frame is smaller than the distances in the preceding and following frames, M_XZ = M_Xmin. The distance between the elbow point and the knee point is then at its minimum between the (Z-1)-th frame and the (Z+1)-th frame, i.e. the elbow and the knee are in contact, and one curl-up motion has been completed.
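A minimal sketch of this local-minimum judgment, assuming the per-frame elbow-to-knee distances have already been extracted from the feature point bitmaps (the function name and sample values are illustrative):

```python
# Hedged sketch: count a repetition whenever the elbow-to-knee distance reaches
# a local minimum (the frame is closer than both of its neighbours).
def count_knee_touches(distances):
    """distances: per-frame elbow-to-knee distance M_XZ for consecutive frames Z."""
    count = 0
    for z in range(1, len(distances) - 1):
        if distances[z] < distances[z - 1] and distances[z] < distances[z + 1]:
            count += 1          # local minimum -> elbow and knee in contact
    return count

print(count_knee_touches([90, 60, 20, 55, 92, 58, 18, 60, 95]))  # -> 2
```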
Similarly, when completion is judged from the elbow point being at its farthest position from the knee point, "judging that one curl-up motion has been completed according to the position of the dynamic feature point" is specifically: the X-th detection finds that the distance between the elbow point and the knee point in the Z-th frame feature point bitmap is M_XZ and satisfies M_XZ > M_X(Z-1) and M_XZ > M_X(Z+1); the distance between the elbow point and the knee point is then at its maximum between the (Z-1)-th frame and the (Z+1)-th frame. In the X-th detection, when the elbow-to-knee distance M_XZ in the Z-th frame is larger than the distances in the preceding and following frames, M_XZ = M_Xmax. However, since the initial position of the sit-up is usually lying with the upper body flat, where the elbow-to-knee distance is already largest, the initial lying image must not be miscounted. Therefore M_(X+1)(Z+U) > M_(X+1)(Z+U-1) and M_(X+1)(Z+U) > M_(X+1)(Z+U+1) must be satisfied at the same time: only when the elbow point is at the initial position in both the X-th and the (X+1)-th detections is one curl-up motion judged to be completed, wherein M_X ≥ 0 and X, Z and U are all positive integers.
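A corresponding sketch of the local-maximum judgment, again assuming pre-computed per-frame elbow-to-knee distances; counting only from the second maximum onwards reflects the requirement that both the X-th and the (X+1)-th detections find the elbow at the initial position (names and sample values are illustrative):

```python
# Hedged sketch: a repetition is counted only once two successive local maxima of the
# elbow-to-knee distance have been seen (the lying position before and after the
# repetition), so the very first lying frame alone does not count.
def count_returns_to_start(distances):
    count = 0
    maxima_seen = 0
    for z in range(1, len(distances) - 1):
        if distances[z] > distances[z - 1] and distances[z] > distances[z + 1]:
            maxima_seen += 1
            if maxima_seen >= 2:    # a maximum following a previous maximum closes one repetition
                count += 1
    return count

print(count_returns_to_start([20, 60, 92, 58, 18, 60, 95, 55, 22]))  # -> 1
```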
In actual detection, to ensure accurate counting, it is also necessary to monitor whether the subject's movement is standard during the cycle from the supine position to the sitting-up position. If a movement violates the rules, it is not counted and the count is decremented by one.
In specific violation detection, for the knee-touch judgment, all feature point bitmaps between the X-th detection and the (X+1)-th detection are detected in sequence; if the distance between the elbow point and the knee point in any frame feature point bitmap is greater than a first preset violation threshold, the elbow point did not touch the knee point, the movement is non-compliant, and the count is decremented by one.
In the head-holding judgment, the feature point bitmap further comprises a hand point and a head point; all feature point bitmaps between the X-th detection and the (X+1)-th detection are detected in sequence, and if the distance between the hand point and the head point in any frame feature point bitmap is greater than a second preset violation threshold, the subject did not hold the head with the hands, and the count is decremented by one.
In the leg-movement judgment, the feature point bitmap further comprises a crotch point and a foot point; all feature point bitmaps between the X-th detection and the (X+1)-th detection are detected in sequence, and if the angle centered on the knee point and formed by the crotch point, the knee point and the foot point in any frame feature point bitmap is greater than a third preset violation threshold, the sit-up leg movement is non-compliant, and the count is decremented by one.
In the lying-flat judgment, the feature point bitmap further comprises a shoulder point and a crotch point; all feature point bitmaps between the X-th detection and the (X+1)-th detection are detected in sequence, and if the height difference between the shoulder point and the crotch point in any frame feature point bitmap is greater than a fourth preset violation threshold, the lying-flat posture is non-compliant, and the count is decremented by one.
To ensure the accuracy of the violation detection, whether the subject is lying back or sitting up can be determined from two adjacent extreme M values of the feature point bitmaps used for the violation judgment.
Specifically, in the X-th detection, the distance M_X between the elbow point and the knee point satisfies M_X ∈ (M_min, M_min + T_1), where T_1 is a preset minimum-value threshold; if the distance M_XZ between the elbow point and the knee point in the Z-th frame feature point bitmap is less than the distance M_X(Z-1) in the (Z-1)-th frame feature point bitmap, and the distance M_XZ in the Z-th frame feature point bitmap is less than the distance M_X(Z+1) in the (Z+1)-th frame feature point bitmap, then M_XZ is identified as M_Xmin. At this point the distance between the elbow point and the knee point is at its minimum.
Correspondingly, in the X-th detection, the distance M_X between the elbow point and the knee point satisfies M_X ∈ (M_max - T_2, M_max), where T_2 is a preset maximum-value threshold; if the distance M_XZ between the elbow point and the knee point in the Z-th frame feature point bitmap is greater than the distance M_X(Z-1) in the (Z-1)-th frame feature point bitmap, and the distance M_XZ in the Z-th frame feature point bitmap is greater than the distance M_X(Z+1) in the (Z+1)-th frame feature point bitmap, then M_XZ is identified as M_Xmax. At this point the distance between the elbow point and the knee point is at its maximum.
In summary, when a feature point bitmap involved in a violation judgment lies in the frames between M_Xmin and M_Xmax, the subject is in the lying-back phase; when it lies in the frames between M_Xmax and M_Xmin, the subject is in the sitting-up phase.
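A small sketch of this phase determination, assuming the frame indices of the M_Xmin and M_Xmax frames have already been identified (names and values are illustrative):

```python
# Hedged sketch: frames after a minimum-distance frame (and before the next maximum)
# belong to the lying-back phase; frames after a maximum-distance frame belong to the
# sitting-up phase.
def phase_of_frame(z, minima, maxima):
    """minima/maxima: sorted frame indices where M_XZ = M_Xmin / M_Xmax."""
    last_min = max((m for m in minima if m <= z), default=None)
    last_max = max((m for m in maxima if m <= z), default=None)
    if last_min is None and last_max is None:
        return "unknown"
    if last_max is None or (last_min is not None and last_min > last_max):
        return "lying back"      # between M_Xmin and the next M_Xmax
    return "sitting up"          # between M_Xmax and the next M_Xmin

print(phase_of_frame(12, minima=[10, 30], maxima=[0, 20]))  # -> 'lying back'
```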
The counting method of this embodiment is described in detail below through an example. When the athlete lies down with bent knees at the test position, the human body is detected by a self-trained object detection algorithm to judge whether the posture meets the requirements and whether the sit-up test should start. After the sit-up test starts, the system detects the human body in real time with a human posture detection algorithm and obtains key points of the human body as feature points. The key points of the human body may include the joint positions of the head, shoulders, hands, elbows, crotch, knees, feet and so on.
First, violations are judged from the position information of the key points.
Violation item 1 (whether the head is held): judge whether the distance between the hand key point and the head key point is smaller than a set threshold ε1; if so, the head-holding action is compliant, otherwise it is a violation.
Violation item 2 (whether the knee is touched): judge whether the distance between the elbow key point and the knee key point is smaller than a set threshold ε2 and whether the elbow key point is below the knee key point in the vertical direction; if so, the knee-touching action is compliant, otherwise it is a violation.
Violation item 3 (whether the knees are bent): judge whether the angle formed by the crotch key point, the knee key point and the foot key point is smaller than a set threshold ε3; if so, the leg action is compliant, otherwise it is a violation.
Violation item 4 (whether the subject lies flat): judge whether the vertical distance between the shoulder key point and the crotch key point is smaller than a set threshold ε4; if so, the lying-flat action is compliant, otherwise it is a violation.
When all the actions are qualified, the sit-up score is calculated. The interval from one lying-flat action to the next knee-touching action is one qualified sit-up cycle, and the head-holding and knee-bending actions within that cycle must also meet the requirements; the counting module then increments the count by 1. If any action in the cycle is a violation, the counting module does not count it. After the next lying-flat action, whether the knee-touching action of the following cycle is qualified is judged in the same way.
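A sketch of this per-cycle counting logic, assuming per-frame check results such as those produced by the previous sketch (all names are illustrative):

```python
# Hedged sketch: a cycle runs from a lying-flat frame to the next knee-touch frame and
# is counted only if no violation occurred in between.
def count_qualified_situps(frames):
    """frames: list of per-frame dicts with keys lying_flat, head_held, knees_bent, knee_touched."""
    count = 0
    in_cycle = False
    cycle_ok = True
    for f in frames:
        if not in_cycle and f["lying_flat"]:
            in_cycle, cycle_ok = True, True          # cycle starts at a lying-flat frame
        elif in_cycle:
            if not (f["head_held"] and f["knees_bent"]):
                cycle_ok = False                     # any violation disqualifies the cycle
            if f["knee_touched"]:
                if cycle_ok:
                    count += 1                       # qualified cycle completed
                in_cycle = False                     # wait for the next lying-flat frame
    return count

ok = {"lying_flat": False, "head_held": True, "knees_bent": True, "knee_touched": False}
frames = [dict(ok, lying_flat=True), ok, dict(ok, knee_touched=True)]
print(count_qualified_situps(frames))  # -> 1
```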
The present embodiment further provides a machine-vision-based curl-up motion counting system, comprising:
a feature point detection network;
a video acquisition module, configured to acquire a real-time video stream of the subject while the curl-up motion is performed;
a video stream processing module, configured to store the real-time video stream and generate images;
an image preprocessing module, configured to preprocess the generated images;
a feature point bitmap acquisition module, configured to feed the preprocessed images into the feature point detection network for feature point detection and obtain feature point bitmaps, wherein the feature point bitmap comprises dynamic feature points;
a curl-up motion detection module, configured to detect the feature point bitmaps in sequence and judge, according to the position of the dynamic feature point, that one curl-up motion has been completed; and
a counter module, configured to increment the count by one when the curl-up motion detection module detects that one curl-up motion has been completed.
The specific operation of the feature point bitmap acquisition module is similar to the counting method described above and is therefore not repeated here.
The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or modification of the technical solutions and the inventive concept of the present invention that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A machine-vision-based curl-up motion counting method, characterized by comprising the following steps:
S1, acquiring a real-time video stream of the subject while the curl-up motion is performed;
S2, storing the real-time video stream frame by frame and generating images;
S3, preprocessing the images;
S4, feeding the preprocessed images into a feature point detection network for feature point detection and obtaining feature point bitmaps, wherein the feature point bitmap comprises dynamic feature points; and
detecting the feature point bitmaps in sequence, judging that one curl-up motion has been completed according to the position of the dynamic feature point, and incrementing the count by one.
2. The machine-vision-based curl-up motion counting method according to claim 1, characterized in that, in S4, judging that one curl-up motion has been completed according to the position of the dynamic feature point specifically means: when the X-th detection finds the dynamic feature point in the Y-th frame feature point bitmap at a preset position, and the (X+1)-th detection finds the dynamic feature point in the (Y+W)-th frame feature point bitmap at the preset position, it is judged that one curl-up motion was completed between the Y-th frame and the (Y+W)-th frame; wherein X, Y and W are positive integers.
3. The machine-vision-based curl-up motion counting method according to claim 2, characterized in that the feature point bitmap further includes a static reference point, and "the position of the dynamic feature point is located at a preset position" specifically means that the dynamic feature point is judged to be at the preset position with reference to the static reference point;
preferably, when the preset position does not coincide with either end of the motion trajectory of the dynamic feature point, X is a positive odd number; preferably, the dynamic feature point is judged to be at the preset position according to the distance between the dynamic feature point and the static reference point;
preferably, the dynamic feature point is an elbow point and the static reference point is a knee point.
4. The machine-vision-based curl-up motion counting method according to any one of claims 1 to 3, characterized in that, in S4, the static reference point is a knee point and the dynamic feature point is an elbow point;
when the X-th detection finds that the distance between the elbow point and the knee point in the Y-th frame feature point bitmap is M_0, and the (X+1)-th detection finds that the distance between the elbow point and the knee point in the (Y+W)-th frame feature point bitmap is M_0, it is judged that one curl-up motion was completed between the Y-th frame and the (Y+W)-th frame, and the count is incremented by one;
wherein M_0 is a preset distance threshold, M_0 ≥ 0, and X, Y and W are all positive integers.
5. The machine-vision-based curl-up motion counting method according to any one of claims 1 to 3, characterized in that the feature points further comprise a static reference point, the static reference point is a knee point, and the dynamic feature point is an elbow point;
in S4, "judging that one curl-up motion has been completed according to the position of the dynamic feature point" specifically comprises:
the X-th detection finds that the distance between the elbow point and the knee point in the Z-th frame feature point bitmap is M_XZ; when M_XZ < M_X(Z-1) and M_XZ < M_X(Z+1), it is judged that one curl-up motion has been completed;
wherein M_X ≥ 0, and X, Z and W are all positive integers;
preferably, in S4, "judging that one curl-up motion has been completed according to the position of the dynamic feature point" specifically comprises:
the X-th detection finds that the distance between the elbow point and the knee point in the Z-th frame feature point bitmap is M_XZ and satisfies M_XZ > M_X(Z-1) and M_XZ > M_X(Z+1), while at the same time M_(X+1)(Z+U) > M_(X+1)(Z+U-1) and M_(X+1)(Z+U) > M_(X+1)(Z+U+1) are satisfied; it is then judged that one curl-up motion has been completed;
wherein M_X ≥ 0, and X, Z and U are all positive integers.
6. The machine-vision-based curl-up motion counting method according to any one of claims 2, 4 and 5, characterized in that
the counting method further comprises: detecting in sequence all feature point bitmaps between the X-th detection and the (X+1)-th detection, and if the distance between the elbow point and the knee point in any frame feature point bitmap is greater than a first preset violation threshold, decrementing the count by one;
and/or the feature point bitmap further comprises a hand point and a head point; the counting method further comprises: detecting in sequence all feature point bitmaps between the X-th detection and the (X+1)-th detection, and if the distance between the hand point and the head point in any frame feature point bitmap is greater than a second preset violation threshold, decrementing the count by one;
and/or the feature point bitmap further comprises a crotch point and a foot point; the counting method further comprises: detecting in sequence all feature point bitmaps between the X-th detection and the (X+1)-th detection, and if the angle centered on the knee point and formed by the crotch point, the knee point and the foot point in any frame feature point bitmap is greater than a third preset violation threshold, decrementing the count by one;
and/or the feature point bitmap further comprises a shoulder point and a crotch point; the counting method further comprises: detecting in sequence all feature point bitmaps between the X-th detection and the (X+1)-th detection, and if the height difference between the shoulder point and the crotch point in any frame feature point bitmap is greater than a fourth preset violation threshold, decrementing the count by one.
7. A machine-vision-based curl-up motion counting system, characterized by comprising:
a feature point detection network;
a video acquisition module, configured to acquire a real-time video stream of the subject while the curl-up motion is performed;
a video stream processing module, configured to store the real-time video stream and generate images;
an image preprocessing module, configured to preprocess the generated images;
a feature point bitmap acquisition module, configured to feed the preprocessed images into the feature point detection network for feature point detection and obtain feature point bitmaps, wherein the feature point bitmap comprises dynamic feature points;
a curl-up motion detection module, configured to detect the feature point bitmaps in sequence and judge, according to the position of the dynamic feature point, that one curl-up motion has been completed; and
a counter module, configured to increment the count by one when the curl-up motion detection module detects that one curl-up motion has been completed.
8. The machine-vision-based curl-up motion counting system according to claim 7, characterized in that, in the feature point bitmap acquisition module, judging that one curl-up motion has been completed according to the position of the dynamic feature point specifically means: when the X-th detection finds the dynamic feature point in the Y-th frame feature point bitmap at a preset position, and the (X+1)-th detection finds the dynamic feature point in the (Y+W)-th frame feature point bitmap at the preset position, it is judged that one curl-up motion was completed between the Y-th frame and the (Y+W)-th frame; wherein X, Y and W are positive integers;
preferably, the feature point bitmap further includes a static reference point, and "the position of the dynamic feature point is located at a preset position" specifically means that the dynamic feature point is judged to be at the preset position with reference to the static reference point;
preferably, when the preset position lies within a preset position threshold range, X is a positive odd number, and the preset position threshold range does not include the static reference point;
preferably, the dynamic feature point is judged to be at the preset position according to the distance between the dynamic feature point and the static reference point;
preferably, the dynamic feature point is an elbow point and the static reference point is a knee point.
9. The machine-vision-based curl-up motion counting system according to claim 7 or 8, characterized in that, in the feature point bitmap acquisition module, the static reference point is a knee point and the dynamic feature point is an elbow point;
when the X-th detection finds that the distance between the elbow point and the knee point in the Y-th frame feature point bitmap is M_0, and the (X+1)-th detection finds that the distance between the elbow point and the knee point in the (Y+W)-th frame feature point bitmap is M_0, it is judged that one curl-up motion was completed between the Y-th frame and the (Y+W)-th frame, and the count is incremented by one;
wherein M_0 is a preset distance threshold, and X, Y and W are positive integers;
preferably, the feature points further comprise a static reference point, the static reference point is a knee point, and the dynamic feature point is an elbow point;
in the feature point bitmap acquisition module, "judging that one curl-up motion has been completed according to the position of the dynamic feature point" specifically comprises:
the X-th detection finds that the distance between the elbow point and the knee point in the Z-th frame feature point bitmap is M_XZ; when M_XZ < M_X(Z-1) and M_XZ < M_X(Z+1), it is judged that one curl-up motion has been completed;
wherein M_X ≥ 0, and X, Z and W are all positive integers;
preferably, in the feature point bitmap acquisition module, "judging that one curl-up motion has been completed according to the position of the dynamic feature point" specifically comprises:
the X-th detection finds that the distance between the elbow point and the knee point in the Z-th frame feature point bitmap is M_XZ and satisfies M_XZ > M_X(Z-1) and M_XZ > M_X(Z+1), while at the same time M_(X+1)(Z+U) > M_(X+1)(Z+U-1) and M_(X+1)(Z+U) > M_(X+1)(Z+U+1) are satisfied; it is then judged that one curl-up motion has been completed;
wherein M_X ≥ 0, and X, Z and U are all positive integers.
10. The machine-vision-based curl-up motion counting system according to claim 9, characterized in that,
in the feature point bitmap acquisition module, all feature point bitmaps between the X-th detection and the (X+1)-th detection are detected in sequence, and if the distance between the elbow point and the knee point in any frame feature point bitmap is greater than a first preset violation threshold, the count is decremented by one;
and/or the feature point bitmap further comprises a hand point and a head point; in the feature point bitmap acquisition module, all feature point bitmaps between the X-th detection and the (X+1)-th detection are detected in sequence, and if the distance between the hand point and the head point in any frame feature point bitmap is greater than a second preset violation threshold, the count is decremented by one;
and/or the feature point bitmap further comprises a crotch point and a foot point; in the feature point bitmap acquisition module, all feature point bitmaps between the X-th detection and the (X+1)-th detection are detected in sequence, and if the angle centered on the knee point and formed by the crotch point, the knee point and the foot point in any frame feature point bitmap is greater than a third preset violation threshold, the count is decremented by one;
and/or the feature point bitmap further comprises a shoulder point and a crotch point; in the feature point bitmap acquisition module, all feature point bitmaps between the X-th detection and the (X+1)-th detection are detected in sequence, and if the height difference between the shoulder point and the crotch point in any frame feature point bitmap is greater than a fourth preset violation threshold, the count is decremented by one.
CN202211237831.1A 2022-10-10 2022-10-10 Winding body action counting method and counting system based on machine vision Pending CN115565244A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211237831.1A CN115565244A (en) 2022-10-10 2022-10-10 Winding body action counting method and counting system based on machine vision

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211237831.1A CN115565244A (en) 2022-10-10 2022-10-10 Winding body action counting method and counting system based on machine vision

Publications (1)

Publication Number Publication Date
CN115565244A true CN115565244A (en) 2023-01-03

Family

ID=84744796

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211237831.1A Pending CN115565244A (en) 2022-10-10 2022-10-10 Winding body action counting method and counting system based on machine vision

Country Status (1)

Country Link
CN (1) CN115565244A (en)

Similar Documents

Publication Publication Date Title
CN110969114B (en) Human body action function detection system, detection method and detector
CN111368810B (en) Sit-up detection system and method based on human body and skeleton key point identification
CN111275032B (en) Deep squatting detection method, device, equipment and medium based on human body key points
CN110298218B (en) Interactive fitness device and interactive fitness system
CN112464918A (en) Body-building action correcting method and device, computer equipment and storage medium
CN112364694A (en) Human body sitting posture identification method based on key point detection
CN113398556B (en) Push-up identification method and system
CN114596451B (en) Body fitness testing method and device based on AI vision and storage medium
CN112568898A (en) Method, device and equipment for automatically evaluating injury risk and correcting motion of human body motion based on visual image
US20220222975A1 (en) Motion recognition method, non-transitory computer-readable recording medium and information processing apparatus
CN113255622B (en) System and method for intelligently identifying sit-up action posture completion condition
CN117115922A (en) Seat body forward-bending evaluation method, system, electronic equipment and storage medium
CN115565244A (en) Winding body action counting method and counting system based on machine vision
CN117333932A (en) Method, equipment and medium for identifying sarcopenia based on machine vision
CN115937969A (en) Method, device, equipment and medium for determining target person in sit-up examination
CN110070036A (en) The method, apparatus and electronic equipment of synkinesia action training
CN115116125A (en) Push-up examination method and implementation device thereof
CN114067354A (en) Pull-up test counting method, device and medium based on visual technology
CN114022952A (en) High leg lifting test counting method, device and medium based on vision technology
Hakim et al. 3D Human Pose Estimation Using Blazepose and Direct Linear Transform (DLT) for Joint Angle Measurement
Zeng et al. Machine learning based automatic sport event detection and counting
WO2016135560A2 (en) Range of motion capture
US11944870B2 (en) Movement determination method, movement determination device and computer-readable storage medium
TWI821772B (en) Muscle state detection method and muscle state detection device using the same
CN113935921B (en) Mirror type body-building information interaction method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination