CN113469469A - Student physical ability score prediction method based on sectional loss function - Google Patents

Student physical ability score prediction method based on sectional loss function

Info

Publication number
CN113469469A
Authority
CN
China
Prior art keywords
data
student
time period
physical
physical ability
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111023318.8A
Other languages
Chinese (zh)
Inventor
吴和俊
王敏康
王玲
傅天涯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Huawang Information Technology Co ltd
Original Assignee
Hangzhou Huawang Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Huawang Information Technology Co ltd filed Critical Hangzhou Huawang Information Technology Co ltd
Priority to CN202111023318.8A priority Critical patent/CN113469469A/en
Publication of CN113469469A publication Critical patent/CN113469469A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/20Education
    • G06Q50/205Education administration or guidance

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • Educational Technology (AREA)
  • Software Systems (AREA)
  • Marketing (AREA)
  • Educational Administration (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Primary Health Care (AREA)
  • Mathematical Physics (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a student physical ability score prediction method based on a sectional loss function, which comprises the following steps: step S1, collecting data D11, data D12 and data D13 required for student physical ability prediction in a first time period, and data D22 and data D23 required for student physical ability prediction in a second time period, wherein data D11, data D12 and data D22 are high-density data, and data D13 and data D23 are sparse data; step S2, filling the data D13 and data D23 collected in step S1 to obtain data D14 and data D24; step S3, constructing a first time period feature module using data D11, data D12 and data D14, and constructing a second time period feature module using data D22 and data D24; and step S4, based on the first time period feature module and the second time period feature module from step S3, using XGBoost modeling to predict the physical ability score of the student in the second time period.

Description

Student physical ability score prediction method based on sectional loss function
Technical Field
The invention belongs to the field of artificial intelligence, and relates to a student physical ability score prediction method, a storage medium and a system based on a sectional loss function.
Background
The physical ability of students reflects their physical fitness, and student health has always been a national concern. Schools and parents often emphasize children's studies and neglect their physical health. At present, many schools test students' physical ability once a year to learn their physical condition. In the prior art, a student physical ability test is conducted as follows: several tests are selected from items such as vital capacity, 50-meter run, sit-and-reach, one-minute rope skipping, shuttle run, sit-ups, step test, pull-ups, standing long jump and ball sports, and the scores of the individual tests are combined into a comprehensive physical ability score. The inventors have found that the current test method has at least the following problems: first, students must take multiple tests, which is time-consuming and labor-intensive; second, schools can test students only once a year, so they cannot observe students' current physical ability in time and cannot promptly remind students whose physical ability has recently declined to maintain a healthy lifestyle and exercise actively; third, conventional prediction algorithms cannot handle data that has many dimensions and large differences in density, which easily causes severe model overfitting, so an accurate and effective prediction of student physical ability cannot be obtained.
XGBoost is a gradient boosting algorithm proposed by Tianqi Chen, and has good generalization performance. XGBoost improves on the GBDT algorithm in both predictive performance and running speed. XGBoost uses CART trees as base learners and makes the following improvements in performance, speed, and prevention of overfitting: (1) Performance: XGBoost applies a second-order Taylor expansion to the loss function and supports user-defined loss functions; any function that is twice continuously differentiable can serve as the loss function, and the second-order expansion approximates the true loss more closely. In addition, XGBoost finds the best split point using feature pre-sorting with caching, approximate split-point search and parallel search. (2) Speed: traditional boosting algorithms cannot compute in parallel and are therefore slow, whereas XGBoost supports parallelization. Before training, the data is sorted in advance and stored in a block structure that is reused in every iteration. When a node is split, the gain of each feature must be calculated and the feature with the largest gain is selected for the split, so the gain calculations for the features can run in parallel. (3) Prevention of overfitting: XGBoost introduces L1 and L2 regularization into the objective function to control model complexity; it applies shrinkage, introducing a shrinkage coefficient after each iteration to weaken the influence of each tree and its leaves on the result; and it samples feature columns, which both prevents overfitting and increases running speed.
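For reference, the second-order approximation described above is usually written as follows in the XGBoost literature. The symbols g_i, h_i, f_t, Ω, γ and λ are that literature's standard notation and are not taken from this document; the optional L1 term on the leaf weights is omitted for brevity.

```latex
\mathcal{L}^{(t)} \approx \sum_{i=1}^{n}\Big[\, g_i\, f_t(x_i) + \tfrac{1}{2}\, h_i\, f_t(x_i)^2 \Big] + \Omega(f_t),
\qquad
g_i = \frac{\partial\, l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \hat{y}_i^{(t-1)}},
\quad
h_i = \frac{\partial^2 l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \big(\hat{y}_i^{(t-1)}\big)^2},
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\lambda \sum_{j=1}^{T} w_j^2
```

Because only g_i and h_i enter the approximation, a user-defined loss needs to supply just its first and second derivatives with respect to the prediction.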
Disclosure of Invention
The embodiments of the invention aim to provide a student physical ability score prediction method based on a segmented loss function, addressing the technical problem that existing student physical ability test methods cannot effectively handle data that has many dimensions and large differences in density and that easily causes severe model overfitting, and therefore cannot predict a student's physical ability state in time.
A student physical ability score prediction method based on a segmented loss function comprises the following steps:
step S1, collecting data D11, data D12 and data D13 required for student physical ability prediction in a first time period, and data D22 and data D23 required for student physical ability prediction in a second time period, wherein data D11, data D12 and data D22 are high-density data, and data D13 and data D23 are sparse data;
step S2, filling the data D13 and data D23 collected in step S1 to obtain data D14 and data D24;
step S3, constructing a first time period feature module using data D11, data D12 and data D14, and constructing a second time period feature module using data D22 and data D24;
step S4, based on the first time period feature module and the second time period feature module from step S3, using XGBoost modeling to predict the physical ability score of the student in the second time period;
wherein the loss function is a piecewise function whose segments are delimited by the size of a hyper-parameter.
Preferably, the piecewise loss function l1(y, f(x)) is:
[Formula image not reproduced: piecewise definition of l1(y, f(x))]
wherein δ is a hyper-parameter, f(x) represents the predicted value of the physical ability score, and y represents the true value of the physical ability score.
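The formula itself is an image in the source and cannot be recovered from the text. As a hedged illustration only: the detailed description later characterizes the loss as emphasizing a squared-error measure when |y - f(x)| ≤ δ and an absolute-error measure when |y - f(x)| > δ, which matches the well-known Huber loss. One plausible form, assumed here and not confirmed by the source, is:

```latex
l_1\big(y, f(x)\big) =
\begin{cases}
\tfrac{1}{2}\big(y - f(x)\big)^2, & |y - f(x)| \le \delta,\\[4pt]
\delta\,|y - f(x)| - \tfrac{1}{2}\delta^2, & |y - f(x)| > \delta.
\end{cases}
```

With this choice the two branches agree in value and in first derivative at |y - f(x)| = δ, so the loss stays differentiable at the breakpoint.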
Preferably, the data D11 includes physical ability evaluation data and health data; the data D12 and the data D22 include course data and disease data; the data D13 and the data D23 include diet data, exercise data and sleep data; and the first time period is earlier than the second time period.
Preferably, the physical ability evaluation data in the data D11 includes the physical ability test types and the test score for each type, and the health data includes the student's age, BMI, metabolic syndrome classification, obesity classification, myopia, astigmatism and spectacle prescription; the course data in the data D12 includes the number of physical education classes, the number of academic classes, the physical education scores and the average academic scores; the disease data in the data D12 includes whether the student was ill, the frequency of illness, the severity of illness, the type of disease, the number of absences and the number of days absent; the diet data in the data D13 includes the average daily energy intake and the student's required energy, the exercise data includes the average daily amount of exercise, the exercise duration and the number of in-school exercise sessions, and the sleep data includes the average daily sleep duration; the course schedule data in the data D22 includes the number of physical education classes and the number of academic classes; the disease data in the data D22 includes whether the student was ill, the frequency of illness, the severity of illness, the type of disease, the number of absences and the number of days absent; the diet data in the data D23 includes the average daily energy intake and the student's required energy, the exercise data includes the average daily amount of exercise, the exercise duration and the number of in-school exercise sessions, and the sleep data includes the average daily sleep duration.
A student physical ability score prediction method based on a segmented loss function comprises the following steps:
step S1, collecting data D11, data D12 and data D13 required for student physical ability prediction in a first time period, and data D22 and data D23 required for student physical ability prediction in a second time period, wherein data D11, data D12 and data D22 are high-density data, and data D13 and data D23 are sparse data;
step S2, filling the data D13 and data D23 collected in step S1 to obtain data D14 and data D24;
step S3, constructing a first time period feature module using data D11, data D12 and data D14, and constructing a second time period feature module using data D22 and data D24;
step S4, based on the first time period feature module and the second time period feature module from step S3, using XGBoost modeling to predict the physical ability score of the student in the second time period;
wherein the loss function is the combination of a regular term and a piecewise function whose segments are delimited by the size of a hyper-parameter.
Preferably, the piecewise loss function l2(y, f(x)) is:
[Formula image not reproduced: piecewise definition of l2(y, f(x)) including a regular term]
wherein δ is a hyper-parameter, f(x) represents the predicted value of the physical ability score, y represents the true value of the physical ability score, T is the number of leaf nodes, wj represents the predicted value of the j-th leaf node, and g and l are coefficients.
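This formula is likewise an image in the source. As an assumption rather than a reconstruction, the quantities defined here (the leaf count T, the leaf values wj and two coefficients) are consistent with adding the standard XGBoost complexity penalty to the piecewise loss l1. Writing the two coefficients as γ and λ, their usual XGBoost names, one plausible reading is:

```latex
l_2\big(y, f(x)\big) = l_1\big(y, f(x)\big) + \gamma\, T + \tfrac{1}{2}\,\lambda \sum_{j=1}^{T} w_j^{2}
```

Under this reading, γ penalizes each additional leaf and λ shrinks the leaf values toward zero, which is how the regular term limits model complexity.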
Preferably, the data D11 includes physical ability evaluation data and health data; the data D12 and the data D22 include course data and disease data; the data D13 and the data D23 include diet data, exercise data and sleep data; and the first time period is earlier than the second time period.
Preferably, the physical ability evaluation data in the data D11 includes the physical ability test types and the test score for each type, and the health data includes the student's age, BMI, metabolic syndrome classification, obesity classification, myopia, astigmatism and spectacle prescription; the course data in the data D12 includes the number of physical education classes, the number of academic classes, the physical education scores and the average academic scores; the disease data in the data D12 includes whether the student was ill, the frequency of illness, the severity of illness, the type of disease, the number of absences and the number of days absent; the diet data in the data D13 includes the average daily energy intake and the student's required energy, the exercise data includes the average daily amount of exercise, the exercise duration and the number of in-school exercise sessions, and the sleep data includes the average daily sleep duration; the course schedule data in the data D22 includes the number of physical education classes and the number of academic classes; the disease data in the data D22 includes whether the student was ill, the frequency of illness, the severity of illness, the type of disease, the number of absences and the number of days absent; the diet data in the data D23 includes the average daily energy intake and the student's required energy, the exercise data includes the average daily amount of exercise, the exercise duration and the number of in-school exercise sessions, and the sleep data includes the average daily sleep duration.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of any of the methods described above.
A student fitness score prediction system based on a segmented loss function, the system comprising one or more processors; a memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for any of the methods described above.
Compared with the prior art, the student physical ability score prediction method, storage medium and system based on a segmented loss function have the following beneficial technical effects:
1. The XGBoost algorithm is adopted: multiple feature factors that influenced a student's physical ability over a preceding period are collected and input for machine learning, and the student's physical ability state in a recent period is then predicted from the student's data for that recent period. Prediction accuracy is high, computation is fast and results are timely, which effectively solves the technical problem that existing student physical ability test methods involve many test items and cannot obtain a student's physical ability state in time.
2. The loss function is arranged in steps, with a different loss applied on each segment, and is optimized accordingly. This addresses the technical problem that XGBoost is highly sensitive to outlying physical ability data, effectively reduces the number of data points at which the loss is non-differentiable, prevents overfitting of the XGBoost model, and improves the accuracy of physical ability prediction.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
The embodiments are illustrated by way of example in the accompanying drawings, which do not limit the embodiments. Elements with the same reference numerals in the drawings denote similar elements, and the drawings are not to scale unless specifically indicated.
Fig. 1 is a flowchart of a student physical fitness score prediction method based on a segmented loss function according to an embodiment of the present invention.
Fig. 2 is a comparison graph of the predicted value and the actual value of the student physical ability score based on the segmented loss function according to the embodiment of the present invention.
Fig. 3 is a diagram of a student physical ability score prediction system based on a segmented loss function according to a third embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention will be described in detail below. However, it will be appreciated by those of ordinary skill in the art that numerous technical details are set forth in order to provide a better understanding of the present application in various embodiments of the present invention. However, the technical solution claimed in the present application can be implemented without these technical details and various changes and modifications based on the following embodiments. The following embodiments are divided for convenience of description, and should not constitute any limitation to the specific implementation manner of the present invention, and the embodiments may be mutually incorporated and referred to without contradiction.
The first embodiment of the invention relates to a method for predicting student physical ability based on a segmented loss function, as shown in fig. 1, and the implementation mode is as follows:
step S1, collecting data D11, data D12 and data D13 required for student physical ability prediction in a first time period, and data D22 and data D23 required for student physical ability prediction in a second time period, wherein data D11, data D12 and data D22 are high-density data, and data D13 and data D23 are sparse data;
In this embodiment, the last school year is selected as the first time period, and the data obtained for it is used to train the XGBoost model; the last month of the school year is selected as the second time period, and the data acquired for it is used as input to the trained XGBoost model to predict the student's physical ability for the last month of the school year. The data D11 includes physical ability evaluation data and health examination data, and the data D12 and data D22 include course data and disease data. The school evaluates students' physical ability and conducts health examinations every year, keeps detailed course schedule data for each student, and has a relatively comprehensive understanding of students' illnesses, so each student's data D11, data D12 and data D22 are accurate, belong to high-density data, have essentially no missing values and need no filling. The data D13 and data D23 include diet data, exercise data and sleep data. Because there are many students with diverse living habits, and the school lacks sufficient channels for daily monitoring and statistics, this type of data has many missing values and belongs to sparse data.
In this embodiment, data preprocessing may use mature techniques such as ETL and structured data transformation, and may be implemented on a variety of platforms such as Spark.
The collected data is divided into high-density data and sparse data, and the high-density data and the sparse data are respectively processed and used according to the characteristics of the data in the subsequent steps, so that adverse effects on the accuracy of a prediction result due to improper data processing can be avoided to a certain extent.
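The source assigns each data set to the high-density or sparse category by its origin (for example, school records versus daily-life monitoring). As a minimal sketch of how that assignment could be checked against the data itself, the code below measures the missing rate per column and splits the columns at a cut-off. The column names, the toy values and the 0.2 threshold are all hypothetical and do not come from the source.

```python
from typing import Dict, List, Optional, Tuple

Column = List[Optional[float]]  # None marks a missing observation


def missing_rate(values: Column) -> float:
    """Fraction of entries in a column that are missing."""
    if not values:
        return 1.0
    return sum(v is None for v in values) / len(values)


def split_by_density(
    table: Dict[str, Column], threshold: float = 0.2
) -> Tuple[List[str], List[str]]:
    """Return (dense_columns, sparse_columns).

    `threshold` is a hypothetical cut-off, not a value from the source.
    """
    dense: List[str] = []
    sparse: List[str] = []
    for name, values in table.items():
        (sparse if missing_rate(values) > threshold else dense).append(name)
    return dense, sparse


# Toy records for four students; the column names are illustrative only.
table = {
    "bmi": [18.5, 21.0, 19.2, 23.4],
    "pe_class_count": [3.0, 3.0, 2.0, 3.0],
    "daily_sleep_hours": [8.0, None, None, 7.5],
    "daily_energy_intake": [None, 2100.0, None, None],
}
dense_cols, sparse_cols = split_by_density(table)
```

Only the columns that end up in `sparse_cols` would then be passed to the filling step, while the dense columns are used as collected.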
Step S2, using an interpolation method to fill in the missing values of the data D13 and data D23 collected in step S1 for the first and second time periods, while identifying, eliminating and regenerating abnormal values during filling, to obtain data D14 for the first time period and data D24 for the second time period respectively.
Before XGBoost modeling and prediction, the sparse data is filled by interpolation, which helps ensure that the input data for the subsequent XGBoost modeling is sufficient, comprehensive and accurate, and makes the physical ability prediction more accurate.
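The source does not say which interpolation method or which abnormal-value rule is used. The sketch below is one simple possibility under stated assumptions: values far from the median (measured in scaled median absolute deviations) are treated as abnormal and blanked, and all gaps are then filled by linear interpolation between neighbouring observations. The function names, the factor `k` and the sample series are hypothetical.

```python
from statistics import median
from typing import List, Optional

Series = List[Optional[float]]  # None marks a missing observation


def drop_outliers(values: Series, k: float = 3.0) -> Series:
    """Mark values farther than k scaled MADs from the median as missing."""
    observed = [v for v in values if v is not None]
    if len(observed) < 3:
        return list(values)
    med = median(observed)
    mad = median(abs(v - med) for v in observed)
    if mad == 0:
        return list(values)
    limit = k * 1.4826 * mad  # 1.4826 scales the MAD to a std-dev estimate
    return [None if v is not None and abs(v - med) > limit else v for v in values]


def interpolate(values: Series) -> Series:
    """Fill gaps linearly between neighbours; copy the nearest value at the ends."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        return list(values)  # nothing to interpolate from
    filled = list(values)
    for i in range(len(values)):
        if filled[i] is not None:
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = values[left] + w * (values[right] - values[left])
    return filled


# Daily sleep hours for one student; 40.0 is an implausible entry.
sleep_hours = [8.0, None, 7.0, 40.0, 7.5, None, 8.0, 7.0]
cleaned = interpolate(drop_outliers(sleep_hours))
```

Blanking abnormal values before interpolating means each one is regenerated from its neighbours in the same pass that fills the original gaps.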
Step S3, constructing a first time period feature module using data D11, data D12 and data D14, and constructing a second time period feature module using data D22 and data D24. In this embodiment, the specific features extracted by feature engineering, and the data contained in the two constructed feature modules, are shown in Table 1. Feature engineering and the construction of feature modules for multiple time periods may use any of the known techniques provided by libraries such as Spark MLlib. The number of samples varies with the number of students to be evaluated.
TABLE 1
[Table 1 is an image in the source and is not reproduced here.]
Part of the input data, for a subset of dimensions, is shown in Table 2:
TABLE 2
[Table 2 is an image in the source and is not reproduced here.]
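Since the tables are not available, the exact layout of a feature module is unknown. As a rough sketch, assuming a feature module is simply one flat feature row per student assembled from the data sets named in step S3, the construction could look like this. The function name, the feature names and the values are hypothetical.

```python
from typing import Dict

Record = Dict[str, float]


def build_feature_module(sources: Dict[str, Dict[str, Record]]) -> Dict[str, Record]:
    """Merge per-student records from several data sets into one row each.

    `sources` maps a data-set label (e.g. "D11") to {student_id: record}.
    Feature names are prefixed with the label so that columns cannot collide.
    Only students present in every data set are kept.
    """
    common_ids = set.intersection(*(set(s) for s in sources.values()))
    module: Dict[str, Record] = {}
    for sid in sorted(common_ids):
        row: Record = {}
        for label, records in sources.items():
            for name, value in records[sid].items():
                row[f"{label}.{name}"] = value
        module[sid] = row
    return module


# Illustrative values only.
first_period = build_feature_module({
    "D11": {"s1": {"bmi": 19.2, "run_50m_score": 82.0}},
    "D12": {"s1": {"pe_class_count": 3.0, "sick_days": 1.0}},
    "D14": {"s1": {"daily_sleep_hours": 7.8}},
})
second_period = build_feature_module({
    "D22": {"s1": {"pe_class_count": 2.0, "sick_days": 0.0}},
    "D24": {"s1": {"daily_sleep_hours": 7.2}},
})
```

Prefixing each feature with its data-set label keeps same-named quantities from the two periods (such as the number of classes) distinguishable when the modules are later combined as model input.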
Step S4, predicting the physical ability score of the student in the last month of the school year with an XGBoost model, based on the feature module for the last school year and the feature module for the last month of the school year from step S3. Specifically:
step S41, preprocessing the data in the two feature modules in step S3.
Over a short period, different factors in a feature module affect a student's physical ability to different degrees; for example, within a given week, illness affects physical ability more than insufficient sleep does. Feeding all factors into the XGBoost model with the same weight would therefore be unreasonable. The data must be preprocessed to adjust the weight of each factor in the feature module for physical ability prediction before the subsequent steps are carried out, which helps ensure the accuracy of the final prediction. The weights can be determined from feedback on the overlap rate between the predicted and actual values on historical data.
Step S42, taking the data preprocessed in step S41 as input, and predicting the physical ability score of the student in the last month of the school year using an XGBoost model.
Before training, the XGBoost model sorts the data in advance and stores it in a block structure that is reused in every iteration. When a node is split, the gain of each feature is calculated and the feature with the largest gain is selected for the split; the gain calculations for the features can run in parallel. When the XGBoost model is used for physical ability prediction, the data in the feature modules from step S3 is randomly divided into a training part and a test part, the training data is input into the XGBoost model for machine learning, and the learning rate, the student (row) sampling rate, the physical ability feature (column) sampling rate and the maximum tree depth of the XGBoost model are tuned by minimizing the loss function.
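The random division and the tuning loop can be sketched without any training code. The example below splits student ids into training and test parts and enumerates candidate settings for the four tuned quantities, mapped here to the XGBoost parameters `learning_rate`, `subsample`, `colsample_bytree` and `max_depth`. That mapping is an interpretation of the text, and the candidate values, the 20% test fraction and the helper names are hypothetical. Each candidate would then be trained and scored with the loss function, which is omitted here.

```python
import itertools
import random
from typing import Dict, Iterator, List, Tuple


def train_test_split_ids(
    student_ids: List[str], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Randomly divide student ids into a training part and a test part."""
    ids = list(student_ids)
    random.Random(seed).shuffle(ids)
    n_test = max(1, round(len(ids) * test_fraction))
    return ids[n_test:], ids[:n_test]


def parameter_grid(space: Dict[str, List[float]]) -> Iterator[Dict[str, float]]:
    """Yield every combination of candidate hyper-parameter values."""
    names = list(space)
    for combo in itertools.product(*(space[n] for n in names)):
        yield dict(zip(names, combo))


# Candidate values are placeholders, not settings from the source.
search_space = {
    "learning_rate": [0.05, 0.1],    # learning rate
    "subsample": [0.7, 1.0],         # student (row) sampling rate
    "colsample_bytree": [0.7, 1.0],  # feature (column) sampling rate
    "max_depth": [3, 5],             # maximum tree depth
}

train_ids, test_ids = train_test_split_ids([f"s{i}" for i in range(10)])
candidates = list(parameter_grid(search_space))
```

Fixing the seed makes the split reproducible, so every candidate setting is compared on the same training and test students.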
The loss function is a stepped loss function, that is, a piecewise function segmented according to a hyper-parameter. In this embodiment, the segmented loss function l1(y, f(x)) is:
[Formula image not reproduced: piecewise definition of l1(y, f(x))]
wherein δ is a hyper-parameter, f(x) represents the predicted value of the physical ability score, and y represents the true value of the physical ability score. The size of δ determines on which intervals the model emphasizes the root mean square error and on which it emphasizes the mean absolute error. δ is determined as follows: first, the range within which the hyper-parameter δ of the current prediction model may vary is determined from the loss-function hyper-parameters of all historical prediction models; second, the values in that range are traversed, each being substituted in turn as the loss-function hyper-parameter into the n historical prediction models whose data characteristics are most similar to the data to be predicted, and the value of δ that yields the largest differentiable range is selected as the loss-function hyper-parameter of the prediction model for the current sample data.
When |y - f(x)| ≤ δ, the loss function emphasizes the root mean square error; when |y - f(x)| > δ, the loss function emphasizes the mean absolute error. Distinguishing the loss for |y - f(x)| ≤ δ from the loss for |y - f(x)| > δ gives the loss function l1(y, f(x)) the advantages of both error measures, root mean square error and mean absolute error, on their respective intervals. This improvement reduces sensitivity to outliers in the data, effectively enlarges the differentiable range, reduces the proportion of non-differentiable points, and makes the loss more robust to abnormal values.
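The behaviour described above (squared error for small residuals, absolute error for large ones) matches the standard Huber loss. The sketch below assumes that form, which the source does not confirm, and computes the loss together with the first and second derivatives that a gradient boosting objective needs. The function names and the `min_hess` floor are hypothetical.

```python
from typing import List, Tuple


def huber_loss(y: float, pred: float, delta: float) -> float:
    """Quadratic for small errors, linear for large ones (Huber form, assumed)."""
    r = y - pred
    if abs(r) <= delta:
        return 0.5 * r * r
    return delta * abs(r) - 0.5 * delta * delta


def huber_grad_hess(
    y: List[float], pred: List[float], delta: float, min_hess: float = 1e-6
) -> Tuple[List[float], List[float]]:
    """First and second derivatives of the loss with respect to the prediction.

    The second derivative is exactly 0 in the linear region, so it is floored
    at `min_hess` (a hypothetical safeguard) to keep leaf updates finite.
    """
    grad: List[float] = []
    hess: List[float] = []
    for yi, pi in zip(y, pred):
        r = yi - pi
        if abs(r) <= delta:
            grad.append(-r)
            hess.append(1.0)
        else:
            grad.append(-delta if r > 0 else delta)
            hess.append(min_hess)
    return grad, hess


# One ordinary point and one outlier, with delta = 1.0.
y_true = [80.0, 80.0]
y_pred = [79.5, 60.0]
losses = [huber_loss(t, p, 1.0) for t, p in zip(y_true, y_pred)]
grad, hess = huber_grad_hess(y_true, y_pred, 1.0)
```

The outlier's gradient magnitude is capped at δ rather than growing with the residual, which is the source of the robustness to abnormal values. Because the exact Huber form has a zero second derivative in its linear region, a smooth variant is often preferred in practice; XGBoost ships one as the built-in `reg:pseudohubererror` objective.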
To avoid overfitting, a penalty term can further be added to the loss function. In another preferred embodiment, the segmented loss function l2(y, f(x)) is:
[Formula image not reproduced: piecewise definition of l2(y, f(x)) including a penalty term]
wherein f(x) represents the predicted value of the physical ability score, and y represents the true value of the physical ability score. T is the number of leaf nodes, wj represents the predicted value of the j-th leaf node, and g and l are coefficients. Adding the penalty term to the loss function further prevents overfitting of the model.
The XGBoost model construction, model iteration and other processes use the mature XGBoost modeling and prediction methods well known to those skilled in the art; the structure and construction method are described in detail in the common API documentation and official documents, and are not repeated here.
In this embodiment, on the basis of a loss function that is distinguished in steps according to the difference between the predicted value and the true value, a regular term reflecting the complexity of the model is added, which effectively avoids overfitting and improves the accuracy of the prediction. A comparison between the predicted values and the true values is shown in Fig. 2.
In summary, the student physical ability prediction method provided by the invention uses the XGBoost algorithm: multiple feature factors that influenced a student's physical ability over a preceding period are input for machine learning, and the student's physical ability state in a recent period is then predicted from the student's data for that recent period. Prediction accuracy is high, computation is fast and results are timely, which effectively solves the technical problem that existing student physical ability test methods involve many test items and cannot obtain a student's physical ability state in time. Furthermore, by optimizing the loss function in steps according to the difference between the true value and the predicted value, the invention addresses the technical problem that XGBoost is highly sensitive to outlying physical ability data, effectively reduces the number of non-differentiable data points, is robust to abnormal values, prevents overfitting of the XGBoost model, and improves the accuracy of physical ability prediction.
A second embodiment of the invention relates to a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the student physical ability score prediction method based on a segmented loss function described above.
The third embodiment of the invention relates to a student physical ability score prediction system based on a segmented loss function, as shown in Fig. 3. The system includes one or more processors; a memory; and one or more programs stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising a data acquisition module, a data interpolation module, a data feature construction module and an XGBoost prediction module, which execute the instructions of the student physical ability prediction method based on a segmented loss function provided by the first embodiment.
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples for carrying out the invention, and that various changes in form and details may be made therein without departing from the spirit and scope of the invention in practice.

Claims (10)

1. A student physical ability score prediction method based on a segmented loss function, characterized by comprising the following steps:
step S1, collecting data D11, data D12 and data D13 required for student physical ability prediction in a first time period, and data D22 and data D23 required for student physical ability prediction in a second time period, wherein the data D11, the data D12 and the data D22 are high-density data, and the data D13 and the data D23 are sparse data;
step S2, filling the data D13 and the data D23 collected in step S1 to obtain data D14 and data D24;
step S3, constructing a first time period feature module using the data D11, the data D12 and the data D14, and constructing a second time period feature module using the data D22 and the data D24;
step S4, based on the first time period feature module and the second time period feature module of step S3, predicting the student's physical ability score in the second time period by XGBoost modeling;
wherein the segmented loss function is segmented based on a hyperparameter and emphasizes the root mean square error or the mean absolute error in different intervals.
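Steps S2 and S3 can be sketched as follows. This is only an illustrative sketch: the claim does not state how the sparse data are filled, so mean filling is assumed here, and the function names, field names and the choice of a per-series average as the derived feature are all hypothetical.

```python
from statistics import mean
from typing import Dict, List, Optional


def fill_missing(series: List[Optional[float]]) -> List[float]:
    """Step S2 (assumed method): replace missing entries with the mean of the observed ones."""
    observed = [v for v in series if v is not None]
    if not observed:
        raise ValueError("cannot fill a series with no observed values")
    fill_value = mean(observed)
    return [fill_value if v is None else v for v in series]


def build_features(dense: Dict[str, float],
                   filled: Dict[str, List[float]]) -> Dict[str, float]:
    """Step S3 (assumed form): merge dense fields with averages of the filled sparse series."""
    features = dict(dense)
    for name, values in filled.items():
        features[f"{name}_mean"] = mean(values)
    return features


if __name__ == "__main__":
    # Hypothetical sparse sleep record with one missing day.
    sleep = fill_missing([8.0, None, 6.0])
    print(build_features({"bmi": 20.0}, {"sleep_hours": sleep}))
```

The resulting feature dictionaries for the two time periods would then be the inputs to the model of step S4.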
2. The student physical ability score prediction method based on a segmented loss function of claim 1, wherein the segmented loss function l1(y, f(x)) is:
[Equation image: definition of l1(y, f(x)); not reproduced in this text]
wherein δ is a hyperparameter, f(x) represents the predicted value of the physical ability score, and y represents the true value of the physical ability score.
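The equation for l1 survives only as an image reference in this text, so its exact form cannot be confirmed here. One well-known loss that matches the description in claim 1 (a hyperparameter δ separating a squared-error interval from an absolute-error interval) is the Huber loss. It is given below purely as an assumption about the intended form, not as the claimed formula:

```latex
l_1\bigl(y, f(x)\bigr) =
\begin{cases}
\dfrac{1}{2}\bigl(y - f(x)\bigr)^2, & \lvert y - f(x) \rvert \le \delta, \\[6pt]
\delta\,\lvert y - f(x) \rvert - \dfrac{1}{2}\delta^2, & \text{otherwise.}
\end{cases}
```

Under this assumed form the two branches agree in value and in first derivative at |y − f(x)| = δ, which is consistent with the description's statement that the number of non-differentiable points is reduced.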
3. The student physical ability score prediction method based on a segmented loss function of claim 1, wherein: the data D11 includes physical ability evaluation data and health data; the data D12 and the data D22 include course data and disease data; the data D13 and the data D23 include diet data, exercise data and sleep data; and the first time period is earlier than the second time period.
4. The student physical ability score prediction method based on a segmented loss function of claim 3, wherein: the physical ability evaluation data in the data D11 comprises the physical ability test types and the test score for each physical ability test type, and the health data comprises the student's age, BMI, metabolic syndrome classification, obesity classification, myopia, astigmatism and eyeglass prescription; the course data in the data D12 comprises the number of physical education classes, the number of academic classes, the physical education scores and the average academic score; the disease data in the data D12 comprises whether the student has been ill, the frequency of illness, the severity of illness, the disease type, the number of absences and the number of days absent; the diet data in the data D13 comprises the average daily energy intake and the energy required by the student, the exercise data comprises the average daily exercise amount, the exercise duration and the number of in-school class exercise sessions, and the sleep data comprises the average daily sleep duration; the course arrangement data in the data D22 comprises the number of physical education classes and the number of academic classes; the disease data in the data D22 comprises whether the student has been ill, the frequency of illness, the severity of illness, the disease type, the number of absences and the number of days absent; the diet data in the data D23 comprises the average daily energy intake and the energy required by the student, the exercise data comprises the average daily exercise amount, the exercise duration and the number of in-school class exercise sessions, and the sleep data comprises the average daily sleep duration.
5. A student physical ability score prediction method based on a segmented loss function, characterized by comprising the following steps:
step S1, collecting data D11, data D12 and data D13 required for student physical ability prediction in a first time period, and data D22 and data D23 required for student physical ability prediction in a second time period, wherein the data D11, the data D12 and the data D22 are high-density data, and the data D13 and the data D23 are sparse data;
step S2, filling the data D13 and the data D23 collected in step S1 to obtain data D14 and data D24;
step S3, constructing a first time period feature module using the data D11, the data D12 and the data D14, and constructing a second time period feature module using the data D22 and the data D24;
step S4, based on the first time period feature module and the second time period feature module of step S3, predicting the student's physical ability score in the second time period by XGBoost modeling;
wherein the segmented loss function is segmented based on a hyperparameter and is a combination of a regularization term and a piecewise function that emphasizes the root mean square error or the mean absolute error in different intervals.
6. The student physical ability score prediction method based on a segmented loss function of claim 5, wherein the segmented loss function l2(y, f(x)) is:
[Equation image: definition of l2(y, f(x)); not reproduced in this text]
wherein δ is a hyperparameter, f(x) represents the predicted value of the physical ability score, y represents the true value of the physical ability score, T is the number of leaf nodes, wj represents the predicted value of the j-th leaf node, and two further symbols (rendered as images in the source and not reproduced here) are coefficients.
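The symbols listed for l2 (the number of leaf nodes T, the leaf values wj, and two coefficients) are consistent with the regularization term commonly used with XGBoost, a penalty on the number of leaves plus a squared penalty on the leaf values. The sketch below assumes that form together with a Huber-style data term; the coefficient names `gamma` and `lam`, the function names, and the form itself are assumptions for illustration, since the equation image is not available in this text.

```python
from typing import Sequence


def huber(y: float, pred: float, delta: float) -> float:
    """Assumed data term: quadratic within delta of the true value, linear beyond it."""
    r = abs(y - pred)
    return 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)


def regularization(leaf_weights: Sequence[float], gamma: float, lam: float) -> float:
    """Assumed regular term: gamma * T + 0.5 * lam * sum of squared leaf values."""
    t = len(leaf_weights)
    return gamma * t + 0.5 * lam * sum(w * w for w in leaf_weights)


def objective(ys: Sequence[float], preds: Sequence[float],
              leaf_weights: Sequence[float],
              delta: float, gamma: float, lam: float) -> float:
    """Sum of the per-sample data terms plus the regular term for one tree."""
    data_term = sum(huber(y, p, delta) for y, p in zip(ys, preds))
    return data_term + regularization(leaf_weights, gamma, lam)


if __name__ == "__main__":
    # Hypothetical tree with two leaves and a single training sample.
    print(objective([10.0], [9.0], [1.0, -2.0], delta=1.0, gamma=0.5, lam=2.0))
```

Larger values of either coefficient penalize deeper or more extreme trees more heavily, which is one way the regular term can help limit over-fitting.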
7. The student physical ability score prediction method based on a segmented loss function of claim 5, wherein: the data D11 includes physical ability evaluation data and health data; the data D12 and the data D22 include course data and disease data; the data D13 and the data D23 include diet data, exercise data and sleep data; and the first time period is earlier than the second time period.
8. The student physical ability score prediction method based on a segmented loss function of claim 7, wherein: the physical ability evaluation data in the data D11 comprises the physical ability test types and the test score for each physical ability test type, and the health data comprises the student's age, BMI, metabolic syndrome classification, obesity classification, myopia, astigmatism and eyeglass prescription; the course data in the data D12 comprises the number of physical education classes, the number of academic classes, the physical education scores and the average academic score; the disease data in the data D12 comprises whether the student has been ill, the frequency of illness, the severity of illness, the disease type, the number of absences and the number of days absent; the diet data in the data D13 comprises the average daily energy intake and the energy required by the student, the exercise data comprises the average daily exercise amount, the exercise duration and the number of in-school class exercise sessions, and the sleep data comprises the average daily sleep duration; the course arrangement data in the data D22 comprises the number of physical education classes and the number of academic classes; the disease data in the data D22 comprises whether the student has been ill, the frequency of illness, the severity of illness, the disease type, the number of absences and the number of days absent; the diet data in the data D23 comprises the average daily energy intake and the energy required by the student, the exercise data comprises the average daily exercise amount, the exercise duration and the number of in-school class exercise sessions, and the sleep data comprises the average daily sleep duration.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.
10. A student fitness score prediction system based on a segmented loss function, the system comprising one or more processors; a memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing the method of any of claims 1-8.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111023318.8A CN113469469A (en) 2021-09-02 2021-09-02 Student physical ability score prediction method based on sectional loss function

Publications (1)

Publication Number Publication Date
CN113469469A true CN113469469A (en) 2021-10-01

Family

ID=77867148

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111023318.8A Pending CN113469469A (en) 2021-09-02 2021-09-02 Student physical ability score prediction method based on sectional loss function

Country Status (1)

Country Link
CN (1) CN113469469A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110111886A (en) * 2019-05-16 2019-08-09 闻康集团股份有限公司 A kind of intelligent interrogation system and method based on XGBoost disease forecasting
CN110533214A (en) * 2019-07-12 2019-12-03 北京航空航天大学 A kind of subway passenger flow Forecasting Approach for Short-term based on XGBoost algorithm
CN112990284A (en) * 2021-03-04 2021-06-18 安徽大学 Individual trip behavior prediction method, system and terminal based on XGboost algorithm

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
FPGA开发圈 (FPGA Developer Circle): "Five loss functions every machine learning practitioner should know", 《HTTPS://WWW.SOHU.COM/A/237688127_292853》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211001