CN113469469A - Student physical ability score prediction method based on sectional loss function - Google Patents
Student physical ability score prediction method based on sectional loss function
- Publication number
- CN113469469A CN202111023318.8A
- Authority
- CN
- China
- Prior art keywords
- data
- student
- time period
- physical
- physical ability
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
- G06Q50/205—Education administration or guidance
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Strategic Management (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Economics (AREA)
- Tourism & Hospitality (AREA)
- Human Resources & Organizations (AREA)
- Physics & Mathematics (AREA)
- Educational Technology (AREA)
- Software Systems (AREA)
- Marketing (AREA)
- Educational Administration (AREA)
- General Business, Economics & Management (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Primary Health Care (AREA)
- Mathematical Physics (AREA)
- Development Economics (AREA)
- Game Theory and Decision Science (AREA)
- Entrepreneurship & Innovation (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a student physical ability score prediction method based on a sectional loss function, comprising the following steps: step S1, collecting data D11, D12 and D13 required for student physical ability prediction in a first time period, and data D22 and D23 required for student physical ability prediction in a second time period, wherein data D11, D12 and D22 are high-density data and data D13 and D23 are sparse data; step S2, filling the data D13 and D23 collected in step S1 to obtain data D14 and D24; step S3, constructing a first-time-period feature module from data D11, D12 and D14, and a second-time-period feature module from data D22 and D24; and step S4, based on the first-time-period feature module and the second-time-period feature module of step S3, using XGBoost modeling to predict the student's physical ability score in the second time period.
Description
Technical Field
The invention belongs to the field of artificial intelligence and relates to a student physical ability prediction method, a storage medium and a system based on a segmented loss function.
Background
Students' physical ability reflects their physical fitness, and student health has long been a national concern. Schools and parents often emphasize academic learning and neglect students' physical health. At present, many schools test students' physical ability every year to learn their physical condition. In the prior art, physical ability testing works as follows: several tests are selected from items such as vital capacity, 50-meter run, sit-and-reach, one-minute rope skipping, shuttle run, sit-ups, step test, pull-ups, standing long jump and ball sports, and the scores of the individual tests are combined into a comprehensive physical ability score. The inventor finds that current student physical ability testing has at least the following problems: first, students must take multiple tests, which is time-consuming and laborious; second, schools can test students only once a year, so they cannot observe students' current physical ability in time and cannot promptly remind students whose physical ability has recently declined to maintain a healthy lifestyle and exercise actively; third, conventional prediction algorithms cannot handle data that has many dimensions and large differences in density and that easily causes severe model overfitting, so they cannot predict student physical ability accurately and effectively.
XGBoost is a gradient boosting algorithm proposed by Tianqi Chen and has good generalization performance. XGBoost improves on the GBDT algorithm in both predictive performance and running speed. It uses CART trees as base learners and makes the following improvements in performance, speed and prevention of overfitting: (1) Performance: XGBoost applies a second-order Taylor expansion to the loss function and supports user-defined loss functions; any function with a continuous second derivative can serve as the loss function, and the second-order expansion approximates the true loss more closely. In addition, XGBoost finds the best split point by using feature pre-sorting with caching, quantile approximation and parallel search in order to minimize the mean square error. (2) Speed: traditional boosting algorithms cannot compute in parallel and are therefore slow, whereas XGBoost supports parallelization. Before training, the data is sorted in advance and stored in a block structure that is reused in every iteration. When a node is split, the gain of each feature must be calculated and the feature with the largest gain is chosen, so the gain calculations for the features can run in parallel. (3) Prevention of overfitting: XGBoost introduces L1 and L2 regularization into the objective function to control model complexity; it applies shrinkage, introducing a shrinkage coefficient after each iteration to weaken the influence of each tree and its leaves on the result; and it samples feature columns, which both prevents overfitting and speeds up computation.
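The second-order Taylor expansion mentioned above can be written out explicitly. The notation below (iteration index t, gradient g_i, Hessian h_i, new tree f_t, complexity term Ω) follows the usual presentation of XGBoost and is not taken from this document; it is included only to show why any loss with a continuous second derivative can be plugged in.

```latex
\mathcal{L}^{(t)} \approx \sum_{i=1}^{n} \Big[ l\big(y_i, \hat{y}_i^{(t-1)}\big)
  + g_i\, f_t(x_i) + \tfrac{1}{2}\, h_i\, f_t^{2}(x_i) \Big] + \Omega(f_t),
\qquad
g_i = \frac{\partial\, l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \hat{y}_i^{(t-1)}},
\quad
h_i = \frac{\partial^{2} l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \big(\hat{y}_i^{(t-1)}\big)^{2}}
```

Because the tree-building step depends on the loss only through g_i and h_i, a user-defined loss needs to supply just these two quantities per sample.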
Disclosure of Invention
The embodiments of the invention aim to provide a student physical ability score prediction method based on a segmented loss function, addressing the technical problem that existing student physical ability testing methods cannot effectively process data that has many dimensions and large differences in density and that easily causes severe model overfitting, and therefore cannot predict students' physical ability in time.
A student physical ability score prediction method based on a segmented loss function comprises the following steps:
step S1, collecting data D11, D12 and D13 required for student physical ability prediction in a first time period, and data D22 and D23 required for student physical ability prediction in a second time period, wherein data D11, D12 and D22 are high-density data and data D13 and D23 are sparse data;
step S2, filling the data D13 and D23 collected in step S1 to obtain data D14 and D24;
step S3, constructing a first-time-period feature module from data D11, D12 and D14, and a second-time-period feature module from data D22 and D24;
step S4, based on the first-time-period feature module and the second-time-period feature module of step S3, using XGBoost modeling to predict the student's physical ability score in the second time period;
wherein the loss function is a piecewise function segmented according to the magnitude of a hyper-parameter.
Preferably, the piecewise loss function l1(y, f(x)) is:
wherein δ is a hyper-parameter, f(x) is the predicted physical ability score, and y is the true physical ability score.
Preferably, data D11 comprises physical ability evaluation data and health data; data D12 and D22 comprise course data and disease data; data D13 and D23 comprise diet data, exercise data and sleep data; and the first time period is earlier than the second time period.
Preferably, the physical ability evaluation data in data D11 comprises the physical ability test types and the score for each test type, and the health data comprises the student's age, BMI, metabolic syndrome classification, obesity classification, myopia, astigmatism and spectacle prescription; the course data in data D12 comprises the number of physical education classes, the number of academic classes, physical education scores and average academic scores; the disease data in data D12 comprises whether the student was ill, illness frequency, illness severity, disease type, number of absences and days absent; the diet data in data D13 comprises average daily energy intake and the energy the student requires, the exercise data comprises average daily exercise amount, exercise duration and number of in-school exercise sessions, and the sleep data comprises average daily sleep duration; the course schedule data in data D22 comprises the number of physical education classes and the number of academic classes; the disease data in data D22 comprises whether the student was ill, illness frequency, illness severity, disease type, number of absences and days absent; the diet data in data D23 comprises average daily energy intake and the energy the student requires, the exercise data comprises average daily exercise amount, exercise duration and number of in-school exercise sessions, and the sleep data comprises average daily sleep duration.
A student physical ability score prediction method based on a segmented loss function comprises the following steps:
step S1, collecting data D11, D12 and D13 required for student physical ability prediction in a first time period, and data D22 and D23 required for student physical ability prediction in a second time period, wherein data D11, D12 and D22 are high-density data and data D13 and D23 are sparse data;
step S2, filling the data D13 and D23 collected in step S1 to obtain data D14 and D24;
step S3, constructing a first-time-period feature module from data D11, D12 and D14, and a second-time-period feature module from data D22 and D24;
step S4, based on the first-time-period feature module and the second-time-period feature module of step S3, using XGBoost modeling to predict the student's physical ability score in the second time period;
wherein the loss function is the combination of a regularization term and a piecewise function segmented according to the magnitude of a hyper-parameter.
Preferably, the piecewise loss function l2(y, f(x)) is:
wherein δ is a hyper-parameter, f(x) is the predicted physical ability score, y is the true physical ability score, T is the number of leaf nodes, w_j is the predicted value of the j-th leaf node, and γ and λ are coefficients.
Preferably, data D11 comprises physical ability evaluation data and health data; data D12 and D22 comprise course data and disease data; data D13 and D23 comprise diet data, exercise data and sleep data; and the first time period is earlier than the second time period.
Preferably, the physical ability evaluation data in data D11 comprises the physical ability test types and the score for each test type, and the health data comprises the student's age, BMI, metabolic syndrome classification, obesity classification, myopia, astigmatism and spectacle prescription; the course data in data D12 comprises the number of physical education classes, the number of academic classes, physical education scores and average academic scores; the disease data in data D12 comprises whether the student was ill, illness frequency, illness severity, disease type, number of absences and days absent; the diet data in data D13 comprises average daily energy intake and the energy the student requires, the exercise data comprises average daily exercise amount, exercise duration and number of in-school exercise sessions, and the sleep data comprises average daily sleep duration; the course schedule data in data D22 comprises the number of physical education classes and the number of academic classes; the disease data in data D22 comprises whether the student was ill, illness frequency, illness severity, disease type, number of absences and days absent; the diet data in data D23 comprises average daily energy intake and the energy the student requires, the exercise data comprises average daily exercise amount, exercise duration and number of in-school exercise sessions, and the sleep data comprises average daily sleep duration.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of any of the methods described above.
A student fitness score prediction system based on a segmented loss function, the system comprising one or more processors; a memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for any of the methods described above.
Compared with the prior art, the student physical ability prediction method, storage medium and system based on a segmented loss function have the following beneficial technical effects:
1. The XGBoost algorithm is used: multiple feature factors that influenced a student's physical ability over an earlier period are collected and input for machine learning, and the student's physical ability in a recent period is then predicted from the student's data for that recent period. Prediction accuracy is high, computation is fast and results are timely, which effectively solves the technical problem that existing methods involve many test items and cannot obtain students' physical ability in time.
2. The loss function is defined piecewise, with a different form on each interval, and optimized. This addresses XGBoost's high sensitivity to outlying physical ability data, effectively reduces the number of non-differentiable points, prevents the XGBoost model from overfitting and improves the accuracy of physical ability prediction.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
The embodiments are illustrated by way of example in the accompanying drawings. These illustrations do not limit the embodiments; elements with the same reference numerals denote like elements, and the drawings are not to scale unless specifically indicated.
Fig. 1 is a flowchart of a student physical fitness score prediction method based on a segmented loss function according to an embodiment of the present invention.
Fig. 2 is a comparison graph of the predicted value and the actual value of the student physical ability score based on the segmented loss function according to the embodiment of the present invention.
Fig. 3 is a diagram of a student physical ability score prediction system based on a segmented loss function according to a third embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the embodiments are described in detail below. Those of ordinary skill in the art will appreciate that numerous technical details are set forth in the embodiments to give a better understanding of the present application. However, the technical solution claimed in the present application can be implemented even without these technical details and with various changes and modifications based on the following embodiments. The embodiments are divided for convenience of description; this division should not limit the specific implementation of the invention, and the embodiments may be combined and cross-referenced where they do not contradict each other.
The first embodiment of the invention relates to a student physical ability score prediction method based on a segmented loss function, as shown in Fig. 1, implemented as follows:
step S1, collecting data D11, D12 and D13 required for student physical ability prediction in a first time period, and data D22 and D23 required for student physical ability prediction in a second time period, wherein data D11, D12 and D22 are high-density data and data D13 and D23 are sparse data;
In this embodiment, the last school year is selected as the first time period, and the data obtained for it is used to train the XGBoost model; the last month of the school year is selected as the second time period, and the data obtained for it is fed to the trained XGBoost model to predict the student's physical ability in the last month of the school year. Data D11 comprises physical ability evaluation and health examination data, and data D12 and D22 comprise course data and disease data. Because the school evaluates students' physical ability and gives them health examinations every year, holds detailed course schedule data for each student and has a relatively complete picture of students' illnesses, data D11, D12 and D22 for each student are accurate; they are high-density data with essentially no missing values and need no filling. Data D13 and D23 comprise diet data, exercise data and sleep data. Because there are many students with widely varying living habits, and the school lacks adequate channels for day-to-day monitoring and statistics, this type of data has many missing values and is sparse data.
In this embodiment, the data preprocessing may adopt mature data preprocessing modes such as ETL and structured data transformation, and may be implemented based on multiple platforms such as SPARK.
The collected data is divided into high-density data and sparse data, and the high-density data and the sparse data are respectively processed and used according to the characteristics of the data in the subsequent steps, so that adverse effects on the accuracy of a prediction result due to improper data processing can be avoided to a certain extent.
Step S2, using interpolation to fill the missing values in the data D13 and D23 collected in step S1 for the first and second time periods, and detecting, removing and regenerating abnormal values during filling, to obtain data D14 for the first time period and data D24 for the second time period.
Filling the sparse data by interpolation before XGBoost modeling helps ensure that the input data for the subsequent XGBoost modeling is sufficient, complete and accurate, making the physical ability prediction more accurate.
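The filling step can be sketched as follows. This is a minimal illustration, not the patent's procedure: the source names only "interpolation" and the removal and regeneration of abnormal values, so the choice of linear interpolation, the outlier rule (a robust z-score based on the median absolute deviation), the threshold and the function name `fill_sparse_series` are all assumptions.

```python
from statistics import median


def fill_sparse_series(values, z_thresh=3.5):
    """Fill gaps (None) in a numeric series by linear interpolation.

    Outliers are first treated as missing so that they are regenerated
    by the same interpolation. The outlier rule and threshold are
    illustrative assumptions, not taken from the source.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("series has no observed values")

    # Robust z-score based on the median absolute deviation (MAD).
    med = median(observed)
    mad = median(abs(v - med) for v in observed)

    def is_outlier(v):
        return mad > 0 and 0.6745 * abs(v - med) / mad > z_thresh

    # Outliers become gaps; remaining values are the interpolation knots.
    cleaned = [None if v is None or is_outlier(v) else v for v in values]
    known = [i for i, v in enumerate(cleaned) if v is not None]

    filled = list(cleaned)
    for i, v in enumerate(cleaned):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:        # leading gap: copy the first known value
            filled[i] = cleaned[right]
        elif right is None:     # trailing gap: copy the last known value
            filled[i] = cleaned[left]
        else:                   # interior gap: linear interpolation
            t = (i - left) / (right - left)
            filled[i] = cleaned[left] + t * (cleaned[right] - cleaned[left])
    return filled
```

For a hypothetical week of sleep durations such as `[8.0, None, 7.0, 30.0, 7.5]`, the 30-hour entry would be discarded and regenerated from its neighbours together with the missing entry.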
Step S3, constructing a first-time-period feature module from data D11, D12 and D14, and a second-time-period feature module from data D22 and D24. In this embodiment, the specific features extracted by feature engineering, and the data contained in the two feature modules, are shown in Table 1. Feature engineering and the construction of feature modules for multiple time periods can use any of the known techniques provided by libraries such as Spark MLlib. The number of samples varies with the number of students to be evaluated.
TABLE 1
The partial dimension input data is shown in table 2:
TABLE 2
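One simple way to combine the two feature modules into a single model input row is sketched below. Since the tables are not reproduced in this text, the feature names used in the example (`bmi`, `pe_classes`, `sleep_hours`) and the `p1_`/`p2_` prefixes are hypothetical and serve only to show the structure.

```python
def build_feature_row(period1, period2):
    """Flatten two per-period feature dicts into one row.

    Keys are prefixed so that the same factor measured in both
    periods (for example the number of PE classes) stays distinct.
    """
    row = {}
    for prefix, module in (("p1_", period1), ("p2_", period2)):
        for name, value in module.items():
            row[prefix + name] = value
    return row


# Hypothetical example for one student.
row = build_feature_row(
    {"bmi": 21.0, "pe_classes": 72},           # first time period
    {"pe_classes": 8, "sleep_hours": 7.5},     # second time period
)
```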
Step S4, based on the feature module for the last school year and the feature module for the last month of the school year from step S3, using an XGBoost model to predict the student's physical ability score in the last month of the school year. Specifically:
step S41, preprocessing the data in the two feature modules of step S3.
Over a short period, different factors in the feature modules affect a student's physical ability to different degrees; for example, in a given week illness affects physical ability more than insufficient sleep does. Feeding all factors into the XGBoost model with equal weight is therefore unreasonable. The data must be preprocessed to adjust the weight of each factor in the physical ability prediction before the subsequent steps are carried out, so that the final prediction is accurate. The weights can be determined from feedback on the overlap rate between predicted and actual values on historical data.
Step S42, taking the data preprocessed in step S41 as input and using an XGBoost model to predict the student's physical ability score in the last month of the school year.
Before training, the XGBoost model sorts the data in advance and stores it in a block structure that is reused in every iteration. When a node is split, the gain of each feature is calculated and the feature with the largest gain is chosen; the gain calculations for the features can run in parallel. When the XGBoost model is used for physical ability prediction, the data in the feature modules of step S3 is randomly divided into a training part and a test part; the training data is input into the XGBoost model for machine learning, and the model's learning rate, student (row) sampling rate, physical ability feature (column) sampling rate and maximum tree depth are optimized by minimizing the loss function.
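The random split and the four tuned hyper-parameters can be sketched as below. The keys of `PARAM_GRID` are the XGBoost parameter names that correspond to the quantities named in the text; the candidate values, the 80/20 split and the helper names are assumptions made for illustration. The sketch stops short of calling XGBoost itself.

```python
import itertools
import random


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Randomly split `rows` into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Candidate values are illustrative assumptions, not from the source.
PARAM_GRID = {
    "learning_rate": [0.05, 0.1, 0.3],     # learning rate
    "subsample": [0.6, 0.8, 1.0],          # fraction of students (rows) per tree
    "colsample_bytree": [0.6, 0.8, 1.0],   # fraction of features (columns) per tree
    "max_depth": [3, 5, 7],                # maximum tree depth
}


def iter_param_combinations(grid):
    """Yield every combination of the grid as a parameter dict."""
    names = list(grid)
    for values in itertools.product(*(grid[n] for n in names)):
        yield dict(zip(names, values))
```

Each dict produced by `iter_param_combinations` could be passed to a training routine, keeping the combination with the lowest loss on the test part.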
The loss function is a stepped loss function, that is, a piecewise function segmented according to a hyper-parameter. In this embodiment, the segmented loss function l1(y, f(x)) is:
where δ is a hyper-parameter, f(x) is the predicted physical ability score and y is the true physical ability score. The value of δ determines how much weight the model gives to the root mean square error or to the mean absolute error on different intervals. δ is determined as follows: first, the range over which the hyper-parameter δ of the current prediction model may vary is determined from the loss-function hyper-parameters of all historical prediction models; second, the values in this range are traversed, each being substituted in turn as the loss-function hyper-parameter of the n historical prediction models most similar to the characteristics of the data to be predicted; the value of δ that yields the largest differentiable range is then selected as the loss-function hyper-parameter of the prediction model for the current sample data.
When |y - f(x)| ≤ δ, the loss function leans toward the root mean square error; when |y - f(x)| > δ, it leans toward the mean absolute error. Distinguishing the loss function on |y - f(x)| ≤ δ from that on |y - f(x)| > δ gives l1(y, f(x)) the advantages of error measures such as root mean square error and mean absolute error on different intervals. This improvement reduces sensitivity to outliers, effectively enlarges the differentiable range, lowers the proportion of non-differentiable points and makes the model more robust to abnormal values.
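The behaviour described here (quadratic for small residuals, linear for large ones) matches the well-known Huber loss. The formula itself is not reproduced in this text, so the sketch below assumes the standard Huber form; it returns the first and second derivatives with respect to the prediction, which is the pair of quantities a user-defined XGBoost objective has to supply. Function names are illustrative.

```python
def huber_loss(y, pred, delta):
    """Assumed Huber form: quadratic within delta, linear beyond it."""
    r = pred - y
    if abs(r) <= delta:
        return 0.5 * r * r
    return delta * (abs(r) - 0.5 * delta)


def huber_grad_hess(y, pred, delta):
    """Gradient and Hessian of huber_loss with respect to `pred`."""
    r = pred - y
    if abs(r) <= delta:
        return r, 1.0                       # quadratic region
    return (delta if r > 0 else -delta), 0.0  # linear region
```

Note that the second derivative is zero in the linear region. Implementations that rely on Newton-style updates often replace it with a small positive value or use a smooth variant; XGBoost, for instance, ships a built-in `reg:pseudohubererror` objective.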
To avoid overfitting, a penalty term can be added to the loss function. In another preferred embodiment, the segmented loss function l2(y, f(x)) is:
where f(x) is the predicted physical ability score and y is the true physical ability score, T is the number of leaf nodes, w_j is the predicted value of the j-th leaf node, and γ and λ are coefficients. Adding the penalty term further prevents the model from overfitting.
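The formulas for l1 and l2 are not reproduced in this text. One form consistent with the symbol definitions given (δ, T, w_j and two coefficients) is the Huber loss combined with the tree-complexity penalty commonly used in XGBoost. The block below is a reconstruction under that assumption, not the patent's verbatim equations; in particular the constants 1/2 are assumed.

```latex
l_1\big(y, f(x)\big) =
\begin{cases}
  \tfrac{1}{2}\,\big(y - f(x)\big)^{2}, & |y - f(x)| \le \delta \\[4pt]
  \delta\,|y - f(x)| - \tfrac{1}{2}\,\delta^{2}, & |y - f(x)| > \delta
\end{cases}
\qquad
l_2\big(y, f(x)\big) = l_1\big(y, f(x)\big) + \gamma\, T + \tfrac{1}{2}\,\lambda \sum_{j=1}^{T} w_j^{2}
```

Under this form the two branches of l1 agree in value and slope at |y - f(x)| = δ, and the penalty grows with both the number of leaves and the magnitude of the leaf values.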
The XGboost model establishment, model iteration or other processes adopt the XGboost mature model establishment and prediction method well known to those skilled in the art, and the structure and the establishment method are detailed in common API software description and official documents, so that the details are not repeated here.
In this embodiment, by using the loss function, on the basis of stepwise distinguishing the corresponding loss function according to the difference between the predicted value and the true value, a regular term that can reflect the degree of model complexity is added, so that overfitting is effectively avoided, and the accuracy of the prediction result is improved, and the comparison between the predicted value and the true value is shown in fig. 2.
In summary, the student physical ability prediction method provided by the invention adopts the XGboost algorithm to input a plurality of characteristic factors influencing the student physical ability in a period before the student for machine learning, and then predicts the physical ability state of the student in a period after the student is in a recent period according to various data conditions of the student in a recent period, so that the prediction accuracy is high, the calculation speed is high, the real-time performance is strong, and the technical problem that the physical ability state of the student can not be obtained in time when a plurality of test items exist in the existing student physical ability prediction method is effectively solved. Secondly, the invention solves the technical problem that the sensitivity of the XGboost to each body performance data of the outlier is higher by optimizing the step loss function in a stepped manner according to the difference between the real value and the predicted value, effectively reduces the number of data points of the non-conductible point, has robustness to the abnormal value, prevents the over-fitting of the XGboost model, and improves the accuracy of body performance prediction.
A second embodiment of the invention relates to a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the method for predicting a student physical ability score based on a segmented loss function as described above.
A third embodiment of the invention relates to a student physical ability score prediction system based on a segmented loss function, as shown in FIG. 3. The system includes one or more processors; a memory; and one or more programs stored in the memory and configured to be executed by the one or more processors. The one or more programs comprise a data acquisition module, a data interpolation module, a data feature construction module, and an XGBoost prediction module, which execute the instructions of the segmented loss function based student physical ability prediction method provided by the first embodiment.
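The division of work between the data interpolation module and the data feature construction module can be sketched as follows. The filling method is not specified in this passage, so linear interpolation between known neighbours is assumed here purely for illustration, and the function names `fill_sparse` and `build_features`, as well as the choice of summary statistics, are hypothetical.

```python
def fill_sparse(series):
    """Fill None gaps in a sparse series by linear interpolation (assumed method).

    Gaps before the first or after the last observation take the nearest
    observed value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i in range(len(series)):
        if filled[i] is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = series[left] + w * (series[right] - series[left])
    return filled


def build_features(dense, filled_sparse):
    """Join dense fields with summary statistics of a filled sparse series."""
    mean = sum(filled_sparse) / len(filled_sparse)
    return list(dense) + [mean, min(filled_sparse), max(filled_sparse)]


# Example: daily sleep duration in hours, recorded on only two of six days.
sleep = fill_sparse([None, 7.0, None, None, 8.5, None])
features = build_features([18, 21.5], sleep)  # e.g. age and BMI as dense fields
```

The resulting feature vector is what a prediction module would receive as one row of its training or inference input.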
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples for carrying out the invention, and that, in practice, various changes in form and details may be made therein without departing from the spirit and scope of the invention.
Claims (10)
1. A student physical ability score prediction method based on a segmented loss function, characterized by comprising the following steps:
Step S1, collecting data D11, data D12, and data D13 required for student physical ability prediction in a first time period, and data D22 and data D23 required for student physical ability prediction in a second time period, wherein the data D11, the data D12, and the data D22 are high-density data, and the data D13 and the data D23 are sparse data;
Step S2, filling the data D13 and the data D23 collected in step S1 to obtain data D14 and data D24;
Step S3, constructing a first time period characteristic module using the data D11, the data D12, and the data D14, and constructing a second time period characteristic module using the data D22 and the data D24;
Step S4, based on the first time period characteristic module and the second time period characteristic module of step S3, using XGBoost modeling to predict the physical ability score of the student in the second time period;
wherein the segmented loss function is segmented based on a hyperparameter and emphasizes the root mean square error or the mean absolute error in different intervals.
3. The student physical ability score prediction method based on a segmented loss function of claim 1, wherein: the data D11 includes physical ability evaluation data and health data; the data D12 and the data D22 include course data and disease data; the data D13 and the data D23 include diet data, exercise data, and sleep data; and the first time period is earlier than the second time period.
4. The student physical ability score prediction method based on a segmented loss function of claim 3, wherein: the physical ability evaluation data in the data D11 includes the physical ability test types and the test score for each physical ability test type, and the health data includes the student's age, BMI, metabolic syndrome classification, obesity classification, myopia, astigmatism, and spectacle prescription status; the course data in the data D12 includes the number of physical education courses, the number of culture courses, the physical education course scores, and the average culture course score; the disease data in the data D12 includes whether the student was ill, the illness frequency, the illness severity, the disease type, the student's absence frequency, and the number of days absent; the diet data in the data D13 includes the average daily energy intake and the energy required by the student, the exercise data includes the average daily exercise amount, the exercise duration, and the number of in-school class exercise sessions, and the sleep data includes the average daily sleep duration; the course arrangement data in the data D22 includes the number of physical education courses and the number of culture courses; the disease data in the data D22 includes whether the student was ill, the illness frequency, the illness severity, the disease type, the student's absence frequency, and the number of days absent; the diet data in the data D23 includes the average daily energy intake and the energy required by the student, the exercise data includes the average daily exercise amount, the exercise duration, and the number of in-school class exercise sessions, and the sleep data includes the average daily sleep duration.
5. A student physical ability score prediction method based on a segmented loss function, characterized by comprising the following steps:
Step S1, collecting data D11, data D12, and data D13 required for student physical ability prediction in a first time period, and data D22 and data D23 required for student physical ability prediction in a second time period, wherein the data D11, the data D12, and the data D22 are high-density data, and the data D13 and the data D23 are sparse data;
Step S2, filling the data D13 and the data D23 collected in step S1 to obtain data D14 and data D24;
Step S3, constructing a first time period characteristic module using the data D11, the data D12, and the data D14, and constructing a second time period characteristic module using the data D22 and the data D24;
Step S4, based on the first time period characteristic module and the second time period characteristic module of step S3, using XGBoost modeling to predict the physical ability score of the student in the second time period;
wherein the segmented loss function is segmented based on a hyperparameter and is the combination of a regular term with a piecewise function that emphasizes the root mean square error or the mean absolute error in different intervals.
6. The student physical ability score prediction method based on a segmented loss function of claim 5, wherein: the segmented loss function l2(y, f(x)) is:
7. The student physical ability score prediction method based on a segmented loss function of claim 5, wherein: the data D11 includes physical ability evaluation data and health data; the data D12 and the data D22 include course data and disease data; the data D13 and the data D23 include diet data, exercise data, and sleep data; and the first time period is earlier than the second time period.
8. The student physical ability score prediction method based on a segmented loss function of claim 7, wherein: the physical ability evaluation data in the data D11 includes the physical ability test types and the test score for each physical ability test type, and the health data includes the student's age, BMI, metabolic syndrome classification, obesity classification, myopia, astigmatism, and spectacle prescription status; the course data in the data D12 includes the number of physical education courses, the number of culture courses, the physical education course scores, and the average culture course score; the disease data in the data D12 includes whether the student was ill, the illness frequency, the illness severity, the disease type, the student's absence frequency, and the number of days absent; the diet data in the data D13 includes the average daily energy intake and the energy required by the student, the exercise data includes the average daily exercise amount, the exercise duration, and the number of in-school class exercise sessions, and the sleep data includes the average daily sleep duration; the course arrangement data in the data D22 includes the number of physical education courses and the number of culture courses; the disease data in the data D22 includes whether the student was ill, the illness frequency, the illness severity, the disease type, the student's absence frequency, and the number of days absent; the diet data in the data D23 includes the average daily energy intake and the energy required by the student, the exercise data includes the average daily exercise amount, the exercise duration, and the number of in-school class exercise sessions, and the sleep data includes the average daily sleep duration.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.
10. A student physical ability score prediction system based on a segmented loss function, the system comprising one or more processors; a memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing the method of any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111023318.8A CN113469469A (en) | 2021-09-02 | 2021-09-02 | Student physical ability score prediction method based on sectional loss function |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111023318.8A CN113469469A (en) | 2021-09-02 | 2021-09-02 | Student physical ability score prediction method based on sectional loss function |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113469469A true CN113469469A (en) | 2021-10-01 |
Family
ID=77867148
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111023318.8A Pending CN113469469A (en) | 2021-09-02 | 2021-09-02 | Student physical ability score prediction method based on sectional loss function |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113469469A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110111886A (en) * | 2019-05-16 | 2019-08-09 | 闻康集团股份有限公司 | A kind of intelligent interrogation system and method based on XGBoost disease forecasting |
CN110533214A (en) * | 2019-07-12 | 2019-12-03 | 北京航空航天大学 | A kind of subway passenger flow Forecasting Approach for Short-term based on XGBoost algorithm |
CN112990284A (en) * | 2021-03-04 | 2021-06-18 | 安徽大学 | Individual trip behavior prediction method, system and terminal based on XGboost algorithm |
Non-Patent Citations (1)
Title |
---|
FPGA开发圈: "Five Loss Functions Every Machine Learner Should Know", 《HTTPS://WWW.SOHU.COM/A/237688127_292853》 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20211001 |