CN110490173B

CN110490173B - Intelligent action scoring system based on 3D somatosensory model

Info

Publication number: CN110490173B
Application number: CN201910798879.1A
Authority: CN
Inventors: 邝翠珊
Original assignee: Shenzhen Digital Galaxy Technology Co ltd
Current assignee: Shenzhen Digital Galaxy Technology Co ltd
Priority date: 2019-08-28
Filing date: 2019-08-28
Publication date: 2022-11-18
Anticipated expiration: 2039-08-28
Also published as: CN110490173A

Abstract

An intelligent action scoring system based on a 3D somatosensory model is an obtaining method of 3D somatosensory coordinate data, the action data of a teacher related to action training and learning are used as a standard model, a scaling ratio is obtained through the same posture data of the teacher and the user body type, the coordinate data of the standard model which is forced to be aligned with a user and the coordinate data of the user action are compared and scored, the standard corresponding to a timestamp with a frame score lower than a threshold value and the frame picture of the user are processed and compared, and the user can visually know the difference between the action and the standard.

Description

Intelligent action scoring system based on 3D somatosensory model

Technical Field

The invention relates to the technical field of 3D (three-dimensional) somatosensory, in particular to an intelligent action scoring system based on a 3D somatosensory model.

Background

The 3D motion sensing technology is the technology field of leading edge and hot spot nowadays, and a large amount of technical research and development investments have all been made in the 3D field to a large amount of international science and technology companies and enterprises on the market, and the scientific and technological achievements are also revealed in the spring bamboo shoots after rain, like a dispute: the infrared imaging principle of the somatosensory interaction equipment is consistent with that of the active infrared camera technology in a night vision system; provided is a polarized stereoscopic display technology having an excellent 3D effect. Gesture recognition is converted into user instruction technology, and the application of the technology is mostly limited to games or instruction input. But rarely has the application in the education industries relating to body shapes and actions such as yoga, dance, martial arts and the like, so that the technology for automatically scoring and intelligently correcting errors and guiding in the aspect of action through further expansion of the 3D motion sensing technology has pioneering application significance.

Disclosure of Invention

In view of the contents explained in the background art, the technical scheme is based on 3D body feeling, and aims at the technical scheme of automatic scoring and intelligent error correction guidance of dance, yoga, martial arts and gymnastics training related to action learning, namely an intelligent action scoring system, and the specific contents are as follows:

obtaining human body skeleton joint coordinates, obtaining user images and videos through a special photographic device, obtaining 3D coordinates of key points of a target object skeleton according to a human body skeleton recognition algorithm, and in order to recognize motion of a human body, adopting KINECT skeleton joint points as a human body joint point recognition basis, taking SpineBase on a spine as a reference 0 point through coordinate transformation, and respectively taking other 24 joint points as 1 and SpineMid; 2. hack; 3. head; 4. ShoulderLeft; 5. elbowleft; 6. (xxix), wristLef; 7. handleft; 8. Shoulderrright; 9. ElbowRight; 10. WristRight; 11. HandRight; 12. HipLeft 13, kneeLeft 14, ankleLeft; 15. FootLeft; 16. HipRight; 17. kneeRight; 18. AnkleRight; 19. FootRight; 20, spineShoulder; 21. HandTipLeft; 22. ThumbLeft; 23, handTipRight; 24. ThumbRight; and acquiring the coordinates of the joint point and related data by using a technical method specified by GetJointKinectposition () and a related interface.

And (3) identifying the standard standing posture: and (3) monitoring coordinates of human skeleton joint points, setting judgment conditions for relative positions between the nodes, and obtaining a judgment result of the standard standing posture through verification of the conditions.

The first requirement is as follows: comparing and calculating difference values of x coordinates of spine base points SpineBase and x coordinates of SpineMud, spineShoulder, neck and Head joints respectively, comparing and calculating difference values of z coordinates of spine base points SpineBase and Z coordinates of SpineMud, spineShoulder, neck and Head joints respectively, setting a threshold value F, and if absolute values of the difference values are smaller than the value F, the necessary condition is satisfied;

the second essential condition is that: x, y and z coordinates of the spine base points SpineBase are respectively compared with x, y and z coordinates corresponding to the joints of ShoulderLeft, elbowLeft, wristLef, handLeft, shoulderRight, elbowRight, wristright, handRight, hipLeft, kneeLeft, ankleLeft, footLeft, hipRight, kneeRight, ankleRight, footRight, handtLeft, thumbLeft, handtipRight and ThumbRight, and coordinates relative difference values between the joints are calculated, and a corresponding threshold value and an arithmetic method for judging a necessary condition are set for all the difference values, so that whether the necessary condition is satisfied or not is judged to be the standard standing posture or not is calculated.

Through the technical scheme of obtaining the coordinates of the joints of the bones of the human body, after operation and technical processing, the standard standing posture is judged, spineBase on the spine is taken as a reference 0 point in a coordinate system, namely the three-dimensional coordinates of each joint of the human body in the frame frequency of a time point with the coordinates of 0,0 and 0 of SpineBase, the serial number of the joint point represented by the following table number of the coordinate values corresponds to the joint point, and coordinate data are obtained

、

、

、

......

；

、

......

；

、

、

、

......

(ii) a Preferably, the requirements for setting the standing posture are as follows:

requirement 1: abs (

）<12；

Requirement 2: abs (

）<12；

Requirement 3: abs (

）<12；

Requirement 4: abs (

）<12；

Requirement 5: abs (

）<16；

Requirement 6: abs (

）<16；

Requirement 7: abs (

）<16；

Requirement 8: abs (

）<12；

Requirement 9: abs (

）- Abs（

）<8；

Requirements 10: abs (

）- Abs（

）<18；

Requirement 11: abs (

）- Abs（

）<18；

Requirements 12:

<-53；

requirement 13:

<-53；

requirement 14: abs (

）- Abs（

）<8；

Requirements 15: abs (

）- Abs（

）<8；

Requirements 16: abs (

）<16；

Requirement 17: abs (

）<16；

Requirements 18:

-

<68；

and sequentially verifying the joint coordinate values obtained in the time sequence through the conditions, and if all the joint coordinate values are true results, setting the mapped human body as a standard standing posture.

The gesture stay command response module is used for obtaining body form action scoring and correcting the error to guide good user experience, and avoiding the mixing of keyboard and mouse commands as necessary operation steps in action test, and the commands of man-machine conversation of gesture actions and limb actions are easy to be confused and interfered with the content of the scoring action, so the invention creates a technical method for responding the gesture stay command, namely, the preset duration of the stay time for monitoring specific gesture time is used as an input command of an operation system, a gesture command detection module is arranged, the coordinates in accordance with the command joint point are monitored, when the coordinates in accordance with the position triggered by the command, the duration for keeping the gesture is monitored to be equal to or exceed the preset command time threshold, and the command operation corresponding to the gesture is triggered.

Gesture dwell command response one embodiment: the system starts joint coordinate monitoring to obtain the coordinates of each joint point in real time, wherein the command module judges whether each coordinate accords with the position of the trigger command and accords with the condition-10<

<10 and-10<

<10；and-10<

<10 and -10<

<10; and the coordinates of HandLeft and HandRight are transmitted to the gesture command detection module.

Gesture command detection module obtains offIn nodal coordinates, preferably Abs: (

）- Abs（

）<18 hours; according to

Setting corresponding stay parameters for adding 1: -10<

<10. Time corresponding parameters A, abs (

-

）<10. Time corresponds to parameter B; abs (

-

）<10. Time corresponds to parameter C; abs (

-

）<10. Time corresponding to parameter D; and so on; the other stay parameters are assigned to 0; meanwhile, verifying whether a stay parameter corresponding to data received by a gesture command detection module is the same as the stay parameter corresponding to the time, and if not, modifying and assigning the stay parameter corresponding to the time to be 1; setting a dwell time threshold, setting a program event process corresponding to each dwell parameter, and triggering a corresponding command program event when the dwell parameter is greater than or equal to the threshold, so as to realize the operation of the system responding to the command of a user.

The system checks the standard standing posture and starts a standard model record or a scoring test process after completing the basis of the technical method, and the specific technical scheme is as follows:

firstly, a standard mode and a scoring mode are set in a system, and a module for recording background audio information of a project is created.

The method comprises the steps of establishing a standard model, monitoring by the system through a standard standing violet identification method according to audio rhythms and duration t0 of mp4 and rm formats recorded in a project, including but not limited to dance, gymnastics, martial arts, yoga teachers, coaches and standing standard standing postures, starting a posture stop command response module after monitoring and confirming the standard standing postures, prompting the start of motion demonstration when the system obtains a start command, enabling the teachers and the coaches to demonstrate standard motions once, monitoring coordinates of joints of the teachers and the coaches captured at different time points in real time by the system, recording the coordinates into the system, and establishing the standard model of the project.

The method comprises the following steps of forcibly aligning a standard model, directly comparing the data of action world coordinates of a teacher in the standard model with the data of student world coordinates in consideration of the difference between the fat, thinness and height of a demonstrator in the standard model, and having extremely large error and no reference value, so that the method respectively calculates the scaling correction coefficients of x, y and z in a world coordinate system according to the same standard posture of the demonstrator and the user in the standard model, corrects all the data of x, y and z in the standard model through the scaling correction coefficients to obtain the aligned standard model, compares the dynamic coordinate data of the user in the posture with the actual action coordinate data of the user to obtain a score, and comprises the following specific technical methods:

by the technical method for judging the standard standing posture, under the standard standing posture, the maximum x value of the user in the 24 joint coordinates of the user and the standard model is respectively obtained through circular comparison

Maximum y value

Maximum z value

Minimum x value

Minimum y value

Minimum z value

Maximum x value in corresponding standard model record

Maximum y value

Maximum z value

Minimum x value

Minimum y value

Minimum z value

。

Wherein a, b and c are divided into scaling alignment coefficients which are used for forcing the standard model to be aligned to the world coordinates x, y and z of the body type of the user, x in the record of the standard model is multiplied by a, y is multiplied by b, z is multiplied by c respectively, and 0 to 0 are obtained

Data set corresponding to all time stamps

、

、

、

......

；

、

、

、

......

；

、

、

、

......

(ii) a When the program language implementation of the subscript variable formula is not supported, a transition variable with the value of an integer from 0 to t × 24 is set and used as a subscript variable parameter.

After the user is confirmed by a standard standing posture system and when the system monitors a starting command of a posture stopping response module, corresponding action test is started, the system obtains the joint coordinates of the user calling a timestamp in real time and records the joint coordinates into the system, the test duration t1 is recorded when the test duration is finished, when t1 is greater than t0, a prompt is displayed on a system interface, and the prompt is that the action rhythm of the user is slower than the standard by t1-t0, and when t1 is less than or equal to t0, the prompt is displayed on the system interface, and the prompt is that the action rhythm of the user is faster than the standard by t0-t 1. While forcing alignment of time stamps of data in the user coordinate data set, i.e.

And forcibly compressing the time length t1 of the test action of the user to be equal to t0 by correcting the timestamp where the coordinate value is located. A variable t represents a timestamp from 0 to t0, then coordinate data of character nodes in a frame frequency image in a standard model after forced alignment and a user test video after duration correction, which correspond to the time point t respectively, are combined with 1 in terms of x, y and z of coordinates under a user test in order to distinguish the difference between the standard model and the user test video coordinates, that is, x1, y1 and z1 represent coordinate parameters, the standard model coordinates still represent corresponding coordinate parameters by x, y and z, and then the average difference of the two coordinates at the time point t is given as a score:

setting a system frame score qualification judgment threshold value H to be

Records below H, stored in an array RWaiting for system call; the average differences corresponding to all t are accumulated, superposed and averaged to obtain the average difference value between the user completing the action and the standard, and the calculation formula is as follows:

through the algorithm, the

Is displayed on the user interface as one of the scoring results, and the rating is set: and poor, qualified, good and excellent thresholds, judging the threshold interval in which the result score is positioned, and displaying corresponding grades.

In the above statistical score algorithm, although the absolute value is used to avoid the positive and negative hedging of different coordinate differences, which may make the total error sum zero and may not reflect the error of the coordinate sample, the algorithm for accumulating the absolute value cannot measure the dispersion degree of the numerical value, and the more stable small amplitude deviation is easy to be similar to the operation result under the condition of larger fluctuation of the deviation, so this technical solution further adopts the variance-based algorithm to supplement the statistical score, so that the user can know the deviation degree of the action result compared with the standard, first obtain the frame frequency data of the person node in the standard model after the forced alignment and the user test video after the time length correction corresponding to the t time point, and obtain the score calculated based on the variance principle and mapped by these coordinate data, where the corresponding coordinate value in the standard model after the forced alignment is set as the average value in the variance calculation, and the score formula of the t time point is as follows:

setting a system frame score qualification judgment threshold G to be

Notes below GAnd recording, storing in an array S, waiting for system call:

wherein

The larger the value is, the larger the deviation fluctuation of the marked user action and the standard is, while an absolute number is difficult to form the cognition of the error for the user, in order to further contact with the cognition of the user, the result based on the variance operation is converted into a percentile score which is more easily accepted and understood by the user, so a threshold value K is set,

values greater than the value K are directly 0 points, whereas values less than the value K are converted to percentile values by the following algorithm:

and F is a percentile error deviation score, and the result of the F value is pushed to a user interface to serve as an auxiliary result of the average difference algorithm score and provide a reference for the fluctuation size score of a measured deviation of the action.

In order to further let the user know the action with larger error and visually guide the user to correct the error action, the technical scheme further extracts the action of the human body in the frame with lower score and compares the action with the action of the human body in the frame of the corresponding timestamp of the standard model, so that the user visually knows the action defect, the user can conveniently improve the action, the key of the action can be efficiently learned, and the specific implementation is as follows:

step 1, obtaining the parameter members of the intersection of R and S in the arrays R and S recorded in the scoring technical method through circular operation, and storing the members in a new array Q.

And 2, extracting action error pictures, sequentially reading the member timestamp values in the Q array, performing destaticizing background processing on the standard model corresponding to the timestamp and the corresponding frame in the user test video, performing video preprocessing, setting a static judgment threshold value U, and firstly converting the unit of the video frame picture acquired by software into pixels.

The method comprises the following specific implementation steps: the unit obtained by the software is T twip, and the value of T/15 is the pixel value.

And further extracting RGB values of mapping pixel points in the obtained frame data, comparing the RGB values with RGB of pixel points at the position of a frame before the member timestamp in the current Q, respectively subtracting R/G/B in RGB, dividing the sum of the results by 3, comparing the result with a static judgment threshold value K, assigning the RGB values of the pixel points to three fixed values u1, u2 and u3 when the result is smaller than the threshold value, obtaining the optimal visual effect through tests, wherein the fixed values are preferably 166, 166 and 166, and filtering out parts which do not have motion and image change compared with the previous frame, including typical bone points which do not change.

Preferably, the static decision threshold U has a value of 11.

Further, a judgment threshold value algorithm is optimized, the length of the appointed span interval is used for judgment, and the operation of the judgment result is prolonged to a range of half of the span value before and after the judgment. One embodiment is as follows: frame object

，

From x is 0 to

The value of Width, with an interval period length of k, is calculated as follows:

from y is 0 to

The value of Height, with a period interval length of k, is calculated as follows:

setting coordinates as points of current x and y values in current frame picture

Corresponding RGB values are respectively

.R、

.G、

B, point in picture of frame corresponding to previous timestamp of current frame corresponding to x, y value

Corresponding RGB values are respectively

.R、

.G、

B, calculating the V value according to the following equation;

v is less than 0

RGB at the (X, Y) position is set to 166, and the position points RGB at the positive and negative k/2 value interval of the sum X difference or the sum Y difference are set to 166, 166.

And further performing a loop of increasing y by 1, and then performing a loop of increasing x by 1, and processing the frame picture.

In the above algorithm, the operation speed is increased by several times compared with the point-by-point algorithm, and when a unit twip is used, 15 twips equals to 1 pixel, which means that if a pixel is specified on the screen by taking the twip as a unit, 15 × 15=225 times! Since 0 to 14 twips are all 1 pixel in essence, when x is assigned to 0 to 14, y is assigned to 0 to 14, and z is assigned to 0 to 14, (x, y) is the point to which the coordinates point is the same pixel point (0, 0) on the screen in essence; therefore, points are sequentially drawn at 0 twip, 15 twip, 30 twip and 45 twip \8230 \ 8230with the step length of 15, repeated operation for many times can be avoided, the span value k is used, and on the premise of guaranteeing the operation result, the span is preferably increased, and comparison is carried out by a sampling method of balanced distribution; the method effectively reduces the operation load, improves the operation processing speed, shields unnecessary static background and further improves the operation efficiency.

Through the processing of the technical method, the member timestamp values in the Q array respectively correspond to the standard model and the processed picture of the corresponding frame picture in the user video, and are recorded into the system. Setting an action score test image analysis interface, and setting a progress bar by taking a value from 0 to t0 as an interval; and carrying out corresponding identification on the member timestamp values in the Q array on the progress bar, wherein the identification corresponds to the pictures subjected to action error comparison, the pictures subjected to corresponding standard model processing and the pictures subjected to user video processing are separately placed in a comparison mode, a user clicks the identification, the two processed pictures are displayed at different positions of an interface, a play function button is arranged, and the two pictures corresponding to the identification are displayed according to a set time interval.

Drawings

Fig. 1 is an overall logic framework diagram of an intelligent action scoring system based on a 3D somatosensory model.

FIG. 2 is a schematic diagram of a coordinate point of a 3D somatosensory human skeleton joint.

Particularly, it is stated that: reference in the specification to "an embodiment" or the like means that a particular feature, element or characteristic described in connection with the embodiment is included in the embodiments generally described in the present application. The appearances of the same phrase in various places in the specification are not necessarily all referring to the same embodiment. That is, when a particular feature, element, or characteristic is described in connection with any embodiment, it is intended that the feature, element, or characteristic be implemented in connection with other embodiments and included within the scope of the claims; the present invention has been described with reference to a number of illustrative embodiments of the logical architecture and concept of the present invention, but the scope of the invention is not limited thereto, and those skilled in the art can devise many other modifications and embodiments within the spirit and scope of the present invention, and various combinations and/or arrangements of the elements of the present invention, and other uses will be apparent to those skilled in the art, and insubstantial changes or substitutions in the implementation can be easily made, which will fall within the spirit and scope of the principles of the present invention.

Claims

1. The utility model provides an intelligent action system of scoring based on model is felt to 3D body, its characteristic includes step and key element have: obtaining user images and videos through a special shooting device, and obtaining 3D coordinates of key points of bones of a target object according to an algorithm for human bone recognition;

and (3) identifying the standard standing posture: monitoring coordinates of human skeleton joint points, setting judgment conditions for relative positions among the nodes, and obtaining a judgment result of standard standing posture through verification of the conditions;

the second requirement is as follows: calculating x, y and z coordinates of spine base points SpineBase, comparing and calculating difference values of x, y and z coordinates corresponding to ShoulderLeft, elbowleft, wristLef, handLeft, shoulderRight, elbowRight, wristRight, handRight, hipLeft, kneeLeft, ankleLeft, footLeft, hipRight, kneeRight, ankleRight, footRight, handLeft, thumbLeft, handTight and ThumbRight joints, namely coordinate x difference values, y difference values and z difference values, and setting a corresponding threshold value and an arithmetic method for judging necessary conditions for all the difference values, calculating whether the necessary conditions are satisfied and judging whether the spine base points are in a standard standing posture;

the gesture stopping command response module is used for monitoring specific gesture time and stopping for a preset time as an input command of an operating system, setting a gesture command detection module, monitoring coordinates of joint points conforming to an instruction, monitoring that the duration time for keeping the gesture is equal to or exceeds a preset command time threshold value when the coordinates conform to a position triggered by the command, and triggering command operation corresponding to the gesture;

the system checks the standard standing posture and starts a standard model record or a scoring test process;

firstly, setting a standard mode and a scoring mode in a system, and creating a module for recording background audio information of a project by the project;

creating a standard model, monitoring by the system through the standard standing purple identification method according to the audio rhythm and the time length t0 of mp4 and rm formats recorded in the project, including but not limited to dance, gymnastics, martial arts, yoga teachers and coaches, and standing standard standing postures, starting a posture starting and stopping command response module after monitoring and confirming the standard standing postures, prompting the start of action demonstration when the system obtains a start command, starting the standard action demonstration by the teachers and coaches, simultaneously monitoring the coordinates of joints captured at different time points of the teachers and coaches in real time by the system, recording the coordinates into the system, and creating the standard model of the project;

forced alignment of the standard model, namely respectively calculating the scaling correction coefficients of x, y and z in a world coordinate system according to the same standard posture of a presenter and a user in the standard model, correcting all data of x, y and z in the standard model through the scaling correction coefficients to obtain an aligned standard model, and comparing dynamic coordinate data of the user in the posture with actual action coordinate data of the user to obtain a score, wherein the specific technical method comprises the following steps:

by the technical method for judging the standard standing posture, under the standard standing posture, 24 joints of the user and the standard model are respectively obtained through circular comparisonMaximum x value of user in coordinates

Maximum y value

Maximum z value

Minimum x value

Minimum y value

Minimum z value

Maximum x value in corresponding standard model record

Maximum y value

Maximum z value

Minimum x value

Minimum y value

Minimum z value

；

Data groups corresponding to all the timestamps;

further, the human-object actions in the frame with the lower score are extracted and compared with the human-object actions in the frame of the corresponding timestamp of the standard model, so that a user can visually know the action defect.

2. The system of claim 1, wherein the system comprises the steps and elements of: further, when the spine SpineBase is taken as a reference 0 point in a coordinate system, namely the coordinate of SpineBase is 0, three-dimensional coordinates of each joint of the human body in a frame frequency of a time point, and the serial number of the joint point is represented by the following table number of the coordinate value, the coordinate data are obtained, and the standard standing posture is judged, the three-dimensional coordinates of each joint of the human body are taken as the three-dimensional coordinates of each joint of the human body in the frame frequency of the time point, and the serial number of the joint point corresponds to the joint point, so that the coordinate data are obtained

、

、

、

......

；

、

......

；

、

、

、

......

requirement 1: abs (

）<12；

Requirement 2: abs (

）<12；

Requirement 3: abs (

）<12；

Requirement 4: abs (

）<12；

Requirement 5: abs (

）<16；

Requirement 6: abs (

）<16；

Requirement 7: abs (

）<16；

Requirement 8: abs (

）<12；

Requirement 9: abs (

）- Abs（

）<8；

Requirements 10: abs (

）- Abs（

）<18；

Requirement 11: abs (

）- Abs（

）<18；

Requirements 12:

<-53；

requirement 13:

<-53；

requirement 14: abs (

）- Abs（

）<8；

Requirements 15: abs (

）- Abs（

）<8；

Requirements 16: abs (

）<16；

Requirement 17: abs (

）<16；

Requirements 18:

-

<68；

3. The system of claim 1, wherein the system comprises the steps and elements of: gesture dwell command response: the system starts joint coordinate monitoring to obtain the coordinates of each joint point in real time, wherein the command module judges whether each coordinate accords with the position of the trigger command and accords with the condition-10<

<10and-10<

<10；and-10<

<10and -10<

<10; the coordinates of Handleft and HandRight are transmitted to a gesture command detection module; the gesture command detection module obtains the joint coordinates, preferably Abs: (a)

）- Abs（

）<At 18 hours; according to

Setting corresponding stay parameters for adding 1: excellent-10<

<10. Time corresponding parameters A, abs (

-

）<10. Time corresponds to parameter B; abs (

-

）<10. Time corresponds to parameter C; abs (

-

）<10. Time corresponding to parameter D; and so on; the other stay parameters are assigned to 0; meanwhile, whether a staying parameter corresponding to data received by the gesture command detection module is the same as the staying parameter corresponding to the time is verified, and if not, the corresponding staying parameter corresponding to the time is modified and assigned to be 1; setting a retention time threshold value, setting a program event process corresponding to each retention parameter, and triggering a corresponding command program event when the retention parameter is greater than or equal to the threshold value, so as to realize the operation of responding to a user command by the system.

4. A method for scoring intelligent actions based on standard model forced alignment is characterized by comprising the following steps and elements:

based on the method for identifying the standard standing posture as claimed in claim 2, after the user stands and is confirmed by the system, when a starting command is monitored by the posture-staying response module method as claimed in claim 3, a corresponding motion test is started, the system obtains the coordinates of the user joint of the time stamp in real time and records the coordinates into the system, when the time duration t1 of the test is recorded and t1 is greater than t0, a prompt "your motion rhythm is slower than the standard by t1-t0" is displayed on the system interface, and when t1 is less than or equal to t0, a prompt "your motion rhythm is faster than the standard by t0-t1" is displayed on the system interface; and meanwhile, carrying out forced alignment on the time stamps of the data in the user coordinate data set:

forcibly compressing the time length t1 of the user test action to be equal to t0 through the correction of the timestamp corresponding to the coordinate value; using a variable t to represent a timestamp from 0 to t0, then in a frame frequency image in the standard model after forced alignment and the user test video after duration correction corresponding to the time point t respectively, combining x, y and z of coordinates under the user test with 1 respectively in order to distinguish the difference between the standard model and the user test video coordinates, that is, x1, y1 and z1 represent coordinate parameters, and using x, y and z to represent corresponding coordinate parameters of the standard model coordinates, then the average difference of the two coordinates at the time point t is the score:

setting a system frame score qualification judgment threshold value H to be

And (3) the records lower than H are additionally stored in an array R, the mean differences corresponding to all the t are accumulated, superposed and averaged, and the mean difference value between the user finishing the action and the standard can be obtained, wherein the calculation formula is as follows:

through the algorithm, the

As one of the scoring results, is displayed on the user interface, and the rating is set: poor, qualified, good and excellent threshold valueJudging the threshold interval of the result score, and displaying the corresponding grade;

acquiring coordinate data of character nodes in a frame frequency image in a standard model after forced alignment and a user test video after duration correction which respectively correspond to a t time point, and acquiring a score which is mapped by the t time point and calculated based on a variance principle according to the coordinate data, wherein the coordinate value corresponding to the standard model after forced alignment is set as an average value in variance calculation, and the score calculation formula of the t time point is as follows:

setting a system frame score qualification judgment threshold G to be

Records below G are additionally stored in an array S;

wherein

and F is a percentile error deviation score, and a result of the F value is pushed to a user interface to serve as an auxiliary result of an average difference algorithm score and provide a reference for a fluctuation size score of a measured deviation of the action.

5. The method for intelligent action scoring based on standard model forced alignment as claimed in claim 4, characterized by comprising the steps and elements of: the method for comparing the action error pictures is implemented as follows:

step 1, in the arrays R and S recorded in the scoring technical method of claim 4, obtaining the parameter members of the intersection of R and S through the circular operation, and storing the members in a new array Q;

step 2, extracting action error pictures, sequentially reading member timestamp values in the Q array, performing destaticizing background processing on a standard model corresponding to the timestamp and a corresponding frame in a user test video, performing video preprocessing, setting a static judgment threshold value U, and firstly converting a unit of a video frame picture acquired by software into pixels;

further extracting RGB numerical values of mapping pixel points in the obtained frame data, comparing the RGB numerical values with RGB of pixel points at the position of a frame before the member timestamp in the current Q, respectively subtracting R/G/B in RGB, dividing the sum of the results by 3, comparing the result with a static judgment threshold value K, assigning the RGB numerical values of the pixel points to three fixed values u1, u2 and u3 if the result is less than the threshold value, obtaining the optimal visual effect through tests, wherein the fixed values are preferably 166, 166 and 166, and filtering out parts which do not generate motion and image change compared with the previous frame, wherein the parts include typical bone points which do not change;

preferably, the static decision threshold U has a value of 11;

further, optimizing a judgment threshold algorithm, judging by using the length of the specified span interval, and extending the operation of the judgment result to a range of half of the span value before and after the change, wherein the span value is a frame object

，

From x is 0 to

The value of Width, with an interval period length of k, is calculated as follows;

from y is 0 to

The value of Height, with interval period length k, is calculated as follows;

Corresponding RGB values are respectively

.R、

.G、

Corresponding RGB values are respectively

.R、

.G、

B, calculating the V value according to the following equation;

v is less than 0 then

RGB of the (X, Y) position is set as 166, and position points RGB of the sum X difference or the sum Y difference in the positive and negative k/2 value interval are set as 166, 166;

further performing a cycle of increasing y by 1, then performing a cycle of increasing x by 1, and processing the frame picture;

through the processing of the technical method, the member timestamp values in the Q array are obtained and respectively correspond to the standard model and the processed pictures of the corresponding frame pictures in the user video, the pictures are recorded into the system, an action score test image analysis interface is set, and a progress bar is set with the value from 0 to t0 as an interval; and carrying out corresponding identification on the member timestamp values in the Q array on the progress bar, wherein the identification corresponds to the pictures subjected to action error comparison, the pictures subjected to corresponding standard model processing and the pictures subjected to user video processing are separately placed in a comparison mode, a user clicks the identification, the two processed pictures are displayed at different positions of an interface, a play function button is arranged, and the two pictures corresponding to the identification are displayed according to a set time interval.