WO2018107906A1 - Method for training a classification model, and method and device for data classification - Google Patents

Method for training a classification model, and method and device for data classification

Info

Publication number
WO2018107906A1
Authority
WO
WIPO (PCT)
Prior art keywords
classification
actual
iteration
classification model
category
Prior art date
Application number
PCT/CN2017/107626
Other languages
English (en)
French (fr)
Inventor
尹红军
Original Assignee
Tencent Technology (Shenzhen) Company Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Company Limited
Publication of WO2018107906A1
Priority to US16/286,894 (US11386353B2)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound

Definitions

  • the present application relates to the field of data processing technologies, and in particular to a method for training a classification model and a method and device for data classification.
  • eXtreme Gradient Boosting (Xgboost) is an ensemble learning model for classification built on the principle of the Gradient Boosting Decision Tree (GBDT). It is characterized by the use of Central Processing Unit (CPU) multi-threading to achieve classification with high precision and fast computing speed.
  • Xgboost may make classification errors when classifying; for example, classifying a primary school student as a junior high school student or classifying a primary school student as a doctoral student are both classification errors, that is, there is a problem of imprecise classification.
  • in addition, during model training a cost penalty is applied to classification errors, so as to gradually improve the classification accuracy of the model.
  • however, at present all classification errors are given the same cost penalty, which is not conducive to quickly improving the classification accuracy of the model.
  • the embodiment of the present application provides a method for training a classification model: by introducing, into the gradient loss function of the initial classification model, a distance factor characterizing the gap between the category to which the actual classification belongs and the category to which the predicted classification belongs, residuals of different sizes can be produced for different classification errors, so that the classification accuracy of the classification model can be improved quickly.
  • the embodiment of the present application also provides a corresponding data classification method, which can improve the accuracy of data classification.
  • the embodiment of the present application also provides a corresponding device.
  • the embodiment of the present application provides a method for training a classification model, including:
  • acquiring a training sample, where the training sample includes a training parameter and an actual classification corresponding to the training parameter;
  • performing classification training on an initial classification model by using the training parameter, to obtain a predicted classification;
  • determining, according to a gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification, the gradient loss function including a distance factor characterizing a gap between a first category and a second category, the first category being the category to which the predicted classification belongs, and the second category being the category to which the actual classification belongs;
  • correcting the initial classification model according to the residual, to obtain a final classification model.
  • in a possible implementation, the performing classification training on the initial classification model by using the training parameter, to obtain a predicted classification includes:
  • performing iterative computation on the initial classification model by using the training parameter, to obtain a predicted classification generated by the classification model used in each round of iteration;
  • correspondingly, the determining, according to the gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification includes: determining, according to the gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification generated in each round of iteration;
  • correspondingly, the correcting the initial classification model according to the residual, to obtain a final classification model includes:
  • correcting, according to the residual determined in the M-th round of iteration, the classification model used in the M-th round of iteration, to obtain the classification model used in the (M+1)-th round of iteration, the final classification model being obtained after at least one round of iterative correction, where the classification model used in the M-th round of iteration is obtained by correcting, according to the residual determined in the (M-1)-th round of iteration, the classification model used in the (M-1)-th round of iteration, and M is a positive integer greater than 1.
  • determining a residual between the actual classification and the prediction classification generated by each iteration according to the gradient loss function included in the initial classification model including:
  • a residual between the actual classification and each round of the predicted classification is determined according to the training parameter, the actual classification, and a distance factor that characterizes the difference between the category to which the actual classification belongs and the category to which each of the prediction categories belongs.
  • in a possible implementation, the determining the residual between the actual classification and each round of predicted classification according to the training parameter, the actual classification, and a distance factor characterizing the gap between the category to which the actual classification belongs and the category to which each round of predicted classification belongs includes: determining the residual between the predicted classification generated in the k-th round of iteration and the actual classification by using the following formulas:
  • ỹ_ik = y_ik − p_k(x_i)
  • p_k(x_i) = exp(F′_k(x_i)) / Σ_{l=1}^{K} exp(F′_l(x_i))
  • F′_k(x_i) = D_yk · F_k(x_i)
  • where x_i is the training parameter;
  • i is a positive integer greater than 1;
  • y_ik is the actual classification;
  • ỹ_ik is the residual between the predicted classification generated in the k-th round of iteration and the actual classification;
  • p_k(x_i) is the prediction probability function of the k-th round of iteration;
  • F_k(x_i) is the prediction function of the k-th round of iteration;
  • D_yk is the distance factor characterizing the gap between the category to which the actual classification belongs and the category to which the k-th round predicted classification belongs;
  • F′_k(x_i) is the modified prediction function of the k-th round of iteration;
  • F′_l(x_i) is the modified prediction function of the l-th round of iteration, where l takes values from 1 to K, and K is the number of categories of the actual classification.
  • in this way, the initial classification model can be corrected in a targeted manner according to residuals of different sizes, so the accuracy of the classification model can be improved quickly.
  • the embodiment of the present application provides a data classification method, including:
  • receiving data to be classified; classifying the data to be classified by using a target classification model, to obtain a classification result, where the target classification model is the final classification model obtained according to any one of the foregoing methods for training a classification model; and outputting the classification result.
  • An embodiment of the present application provides an apparatus for training a classification model, including:
  • a sample obtaining unit configured to acquire a training sample, where the training sample includes a training parameter and an actual classification corresponding to the training parameter;
  • a model training unit configured to perform classification training on the initial classification model by using the training parameter obtained by the sample acquisition unit, to obtain a prediction classification
  • a residual determining unit, configured to determine, according to a gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification trained by the model training unit, the gradient loss function including a distance factor characterizing a gap between a first category and a second category, the first category being the category to which the predicted classification belongs, and the second category being the category to which the actual classification belongs;
  • the model correction unit is configured to correct the initial classification model according to the residual determined by the residual determination unit to obtain a final classification model.
  • the model training unit is configured to perform an iterative calculation on the initial classification model using the training parameter, and obtain a prediction classification generated by the classification model used in each iteration;
  • the residual determination unit is configured to determine a residual between the actual classification and the prediction classification generated by each iteration according to the gradient loss function included in the initial classification model;
  • the model correction unit is configured to correct, according to the residual determined in the M-th round of iteration, the classification model used in the M-th round of iteration, to obtain the classification model used in the (M+1)-th round of iteration, the final classification model being obtained after at least one round of iterative correction, where the classification model used in the M-th round of iteration is obtained by correcting, according to the residual determined in the (M-1)-th round of iteration, the classification model used in the (M-1)-th round of iteration, and M is a positive integer greater than 1.
  • the residual determining unit is configured to determine the residual between the actual classification and each round of predicted classification according to the training parameter, the actual classification, and a distance factor characterizing the gap between the category to which the actual classification belongs and the category to which each round of predicted classification belongs.
  • the residual determining unit is configured to determine the residual between the predicted classification generated in the k-th round of iteration and the actual classification by using the following formulas:
  • ỹ_ik = y_ik − p_k(x_i)
  • p_k(x_i) = exp(F′_k(x_i)) / Σ_{l=1}^{K} exp(F′_l(x_i))
  • F′_k(x_i) = D_yk · F_k(x_i)
  • where x_i is the training parameter;
  • i is a positive integer greater than 1;
  • y_ik is the actual classification;
  • ỹ_ik is the residual between the predicted classification generated in the k-th round of iteration and the actual classification;
  • p_k(x_i) is the prediction probability function of the k-th round of iteration;
  • F_k(x_i) is the prediction function of the k-th round of iteration;
  • D_yk is the distance factor characterizing the gap between the category to which the actual classification belongs and the category to which the k-th round predicted classification belongs;
  • F′_k(x_i) is the modified prediction function of the k-th round of iteration;
  • F′_l(x_i) is the modified prediction function of the l-th round of iteration, where l takes values from 1 to K, and K is the number of categories of the actual classification.
  • for the beneficial effects of possible implementations of the various parts of the device for training a classification model provided by the embodiments of the present application, see the foregoing beneficial effects of the method for training a classification model.
  • An embodiment of the present application provides an apparatus for data classification, including:
  • a data receiving unit configured to receive data to be classified
  • a data classification unit configured to classify the data to be classified received by the data receiving unit by using a target classification model, and obtain a classification result;
  • where the target classification model is the final classification model obtained by any one of the foregoing devices for training a classification model;
  • a data output unit configured to output the classification result obtained by the data classification unit.
  • for the beneficial effects of possible implementations of the various parts of the device for data classification provided by the embodiments of the present application, see the beneficial effects of the corresponding data classification method.
  • An embodiment of the present application provides an apparatus for training a classification model, including:
  • the memory is configured to store program code and transmit the program code to the processor
  • the processor is configured to perform, according to instructions in the program code, the method for training a classification model according to any one of the foregoing methods for training a classification model.
  • An embodiment of the present application provides a device for data classification, where the device includes:
  • the memory is configured to store program code and transmit the program code to the processor
  • the processor is configured to perform, according to instructions in the program code, the data classification method according to any one of the foregoing data classification methods.
  • for the beneficial effects of possible implementations of the various parts of the device for data classification provided by the embodiments of the present application, see the beneficial effects of the corresponding data classification method.
  • the embodiment of the present application provides a storage medium for storing program code, the program code being used to perform the method for training a classification model according to any one of the foregoing methods for training a classification model.
  • for the beneficial effects of possible implementations of the storage medium, see the beneficial effects of the corresponding method for training a classification model.
  • the embodiment of the present application provides a storage medium for storing program code, the program code being used to perform the data classification method according to any one of the foregoing data classification methods.
  • for the beneficial effects of possible implementations of the storage medium, see the beneficial effects of the corresponding data classification method.
  • the embodiment of the present application provides a computer program product including instructions which, when run on a computer, cause the computer to perform the method for training a classification model according to any one of the foregoing methods for training a classification model.
  • for the beneficial effects of possible implementations of the computer program product including instructions, see the beneficial effects of the corresponding method for training a classification model.
  • the embodiment of the present application provides a computer program product including instructions which, when run on a computer, cause the computer to perform the data classification method according to any one of the foregoing data classification methods.
  • for the beneficial effects of possible implementations of the computer program product including instructions, see the beneficial effects of the corresponding data classification method.
  • the embodiment of the present application provides a method for training a classification model, including:
  • the terminal acquires a training sample, where the training sample includes a training parameter and an actual classification corresponding to the training parameter;
  • the terminal uses the training parameter to perform classification training on the initial classification model to obtain a prediction classification
  • the terminal determines, according to a gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification, the gradient loss function including a distance factor characterizing a gap between a first category and a second category, the first category being the category to which the predicted classification belongs, and the second category being the category to which the actual classification belongs;
  • the terminal corrects the initial classification model according to the residual, and obtains a final classification model.
  • the terminal uses the training parameter to perform classification training on the initial classification model to obtain a prediction classification, including:
  • the terminal uses the training parameter to perform an iterative calculation on the initial classification model, and obtains a prediction classification generated by the classification model used in each iteration;
  • the terminal determines a residual between the actual classification and the predicted classification according to a gradient loss function included in the initial classification model, including:
  • the terminal determines a residual between the actual classification and the prediction classification generated by each iteration according to the gradient loss function included in the initial classification model;
  • the terminal corrects the initial classification model according to the residual, and obtains a final classification model, including:
  • the terminal corrects, according to the residual determined in the M-th round of iteration, the classification model used in the M-th round of iteration, to obtain the classification model used in the (M+1)-th round of iteration, and obtains the final classification model after at least one round of iterative correction, where the classification model used in the M-th round of iteration is obtained by correcting, according to the residual determined in the (M-1)-th round of iteration, the classification model used in the (M-1)-th round of iteration, and M is a positive integer greater than 1.
  • the terminal determines a residual between the actual classification and the prediction classification generated by each iteration according to the gradient loss function included in the initial classification model, including:
  • the terminal determines the residual between the actual classification and each round of the predicted classification according to the training parameter, the actual classification, and a distance factor that characterizes the difference between the category to which the actual classification belongs and the category to which each of the prediction categories belongs.
  • the terminal determining the residual between the actual classification and each round of predicted classification according to the training parameter, the actual classification, and a distance factor characterizing the gap between the category to which the actual classification belongs and the category to which each round of predicted classification belongs includes:
  • the terminal determines the residual between the predicted classification generated in the k-th round of iteration and the actual classification by using the following formulas:
  • ỹ_ik = y_ik − p_k(x_i)
  • p_k(x_i) = exp(F′_k(x_i)) / Σ_{l=1}^{K} exp(F′_l(x_i))
  • F′_k(x_i) = D_yk · F_k(x_i)
  • where x_i is the training parameter;
  • i is a positive integer greater than 1;
  • y_ik is the actual classification;
  • ỹ_ik is the residual between the predicted classification generated in the k-th round of iteration and the actual classification;
  • p_k(x_i) is the prediction probability function of the k-th round of iteration;
  • F_k(x_i) is the prediction function of the k-th round of iteration;
  • D_yk is the distance factor characterizing the gap between the category to which the actual classification belongs and the category to which the k-th round predicted classification belongs;
  • F′_k(x_i) is the modified prediction function of the k-th round of iteration;
  • F′_l(x_i) is the modified prediction function of the l-th round of iteration, where l takes values from 1 to K, and K is the number of categories of the actual classification.
  • the embodiment of the present application provides a data classification method, including:
  • the terminal receives the data to be classified
  • the terminal classifies the data to be classified by using a target classification model to obtain a classification result; wherein the target classification model is a final classification model obtained by the method of training the classification model according to any one of the foregoing;
  • the terminal outputs the classification result.
  • in the method for training a classification model provided by the embodiments of the present application, the training sample includes a training parameter and an actual classification, where the actual classification is the true classification associated with the training parameter. A distance factor may be introduced into the gradient loss function of the initial classification model, the distance factor indicating the gap between the category to which the actual classification belongs and the category to which the predicted classification belongs. In this way, when different classification errors occur, that is, when the degree of difference between the predicted classification and the actual classification differs, the corresponding distance factors differ, so the gradient loss function differs, and the residual between the actual classification and the predicted classification determined according to the gradient loss function differs. Since residuals of different sizes correspond to classification errors of different degrees, the initial classification model can be corrected in a targeted manner according to the residuals of different sizes, which can quickly improve the accuracy of the classification model. Further, because the classification accuracy of the classification model is improved, the accuracy of data classification is also improved when the classification model is used for data classification.
  • FIG. 1 is a schematic diagram of an embodiment of a method for training a classification model in an embodiment of the present application
  • FIG. 2 is a schematic diagram of an embodiment of a method for data classification in an embodiment of the present application
  • FIG. 3 is a schematic diagram of an embodiment of an apparatus for training a classification model in an embodiment of the present application
  • FIG. 4 is a schematic diagram of an embodiment of an apparatus for classifying data in an embodiment of the present application.
  • FIG. 5 is a schematic diagram of another embodiment of an apparatus for training a classification model in an embodiment of the present application.
  • FIG. 6 is a schematic diagram of another embodiment of an apparatus for classifying data in an embodiment of the present application.
  • the embodiment of the present application provides a method for training a classification model by introducing a distance factor into the gradient loss function of an initial classification model, so that when different classification errors occur, the corresponding distance factors differ, the gradient loss function differs, and therefore the residual between the actual classification and the predicted classification determined according to the gradient loss function differs. Since residuals of different sizes correspond to classification errors of different degrees, the initial classification model can be corrected in a targeted manner according to the residuals of different sizes, so the accuracy of the classification model can be improved quickly.
  • the embodiment of the present application further provides a corresponding data classification method, and the classification model trained by the foregoing method is used for data classification, which can improve the accuracy of data classification.
  • the embodiment of the present application also provides a corresponding device. The details are described below separately.
  • Data classification is usually the grouping of data with common attributes or characteristics, and it is widely used in many fields. For example, in information push, a user's educational background or age can be classified according to the user's historical browsing information on the network, so that information suited to that educational background or age can be pushed to the user, achieving accurate push.
  • xgboost is a classification model with high classification accuracy
  • eXtreme Gradient Boosting is an ensemble learning model. Its basic idea is to combine hundreds of tree models with low classification accuracy into a model with high accuracy. The model iterates continuously, generating a new tree in each iteration.
  • when generating a new tree in each iteration, the xgboost model uses the idea of gradient descent: based on all the trees generated in previous iterations, it continues to iterate in the direction that minimizes the given objective function.
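  • As general gradient-boosting background (a standard formulation, not a formula taken from this application), each round fits a new tree h_m toward the negative gradient of the objective L and adds it to the running model, scaled by a learning rate ν:

```latex
% Standard gradient-boosting update (background, not the patent's formula):
% each round adds a tree fit toward the negative gradient of the objective.
F_m(x) = F_{m-1}(x) + \nu \, h_m(x),
\qquad
h_m(x) \approx - \left[ \frac{\partial L\bigl(y, F(x)\bigr)}{\partial F(x)} \right]_{F = F_{m-1}}
```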
  • the predicted classifications obtained by using the training samples may differ, that is, different classification errors occur; however, in the current method of training the xgboost classification model, different classification errors yield the same residual.
  • for example, academic qualifications can be divided into seven categories: doctoral, master's, undergraduate, junior college, high school, junior high school, and primary school. Classifying a primary school student's training sample as junior high school and classifying it as doctoral are different classification errors for the same training sample, but the residuals they generate are equal, so it is not easy to determine the direction in which the classification model should be corrected.
  • the embodiment of the present application provides a method for training a classification model that can quickly train a high-accuracy classification model.
  • the classification model provided by the embodiment of the present application is actually a classification regression model; it is not limited to data classification and may also be applied to data regression.
  • an embodiment of a method for training a classification model provided by an embodiment of the present application includes:
  • the training classification model requires a large number of training samples.
  • Each training sample may include training parameters and actual classifications for training the classification model.
  • the actual classification may be the classification direction corresponding to the training parameters and is accurate; the training parameters may be a series of parameters associated with the actual classification, so the training parameters correspond to the actual classification.
  • for example, for age classification, the training parameters may be various types of parameters such as favorite colors, sports types, dietary preferences, and dressing preferences, and the actual classifications may be ages such as 18, 30, and 50.
  • the training parameters may be the types of reading, the type of participation activities, and the type of public interest.
  • the actual classification may be doctoral, master's, undergraduate, junior college, high school, junior high school, and elementary school.
  • the initial classification model may be developed in advance by developers and stored in a computer; after the training parameters are input into the computer, the initial classification model can start the iterative process.
  • each round of iteration can generate a prediction classification generated by the classification model used in the iteration of the round, and the prediction classification of each iteration can be used to optimize the classification model used in the next iteration.
  • the gradient loss function includes a distance factor characterizing a gap between a first category and a second category, where the first category is the category to which the predicted classification belongs, and the second category is the category to which the actual classification belongs.
  • the categories in the embodiment of the present application may be represented by a numerical label.
  • the category label corresponding to the academic category is as shown in Table 1 below:
  • Table 1 Category Label Table

    Category          Category label
    Doctoral          0
    Master            1
    Undergraduate     2
    Junior college    3
    High school       4
    Junior high       5
    Primary school    6
  • Table 1 is only an example; the classification of academic qualifications is not limited to these categories. There could also be categories such as kindergarten, post-doctoral, and secondary school. However, no matter how many categories there are, the principle is the same: each category corresponds to a category label.
  • for example, if the training parameters belong to a primary school student, the actual classification is primary school, whose label value is 6. If the student is classified as a doctoral student, the predicted classification is doctoral, whose label value is 0, and the distance factor characterizing the gap between primary school and doctoral is 6. If the student is classified as a junior high school student, the predicted classification is junior high school, whose label value is 5, and the distance factor characterizing the gap between primary school and junior high school is 1.
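  • Written as a formula (an assumption: the text defines the distance factor only through the label examples above), the distance factor between the actual category y and the k-th category is the absolute difference of their numeric labels:

```latex
% Assumed form of the distance factor, read off the examples above:
% |6 - 0| = 6 for primary school vs. doctoral, |6 - 5| = 1 vs. junior high.
D_{yk} = \left| \,\mathrm{label}(y) - \mathrm{label}(k)\, \right|
```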
  • the initial classification model can be modified according to the residuals of different sizes.
  • in the method for training a classification model provided by the embodiments of the present application, the training sample includes a training parameter and an actual classification, where the actual classification is the true classification associated with the training parameter. A distance factor may be introduced into the gradient loss function of the initial classification model, the distance factor indicating the gap between the category to which the actual classification belongs and the category to which the predicted classification belongs. In this way, when different classification errors occur, that is, when the degree of difference between the predicted classification and the actual classification differs, the corresponding distance factors differ, so the gradient loss function differs, and the residual between the actual classification and the predicted classification determined according to the gradient loss function differs. Since residuals of different sizes correspond to classification errors of different degrees, the initial classification model can be corrected in a targeted manner according to the residuals of different sizes, and the classification accuracy of the model can be improved quickly.
  • the classification training is performed on the initial classification model by using the training parameters to obtain a prediction classification, which may include:
  • determining the residual between the actual classification and the prediction classification according to the gradient loss function included in the initial classification model may include:
  • the initial classification model is modified according to the residual to obtain a final classification model, which may include:
  • correcting, according to the residual determined in the M-th round of iteration, the classification model used in the M-th round of iteration, to obtain the classification model used in the (M+1)-th round of iteration, the final classification model being obtained after at least one round of iterative correction, where the classification model used in the M-th round of iteration is obtained by correcting, according to the residual determined in the (M-1)-th round of iteration, the classification model used in the (M-1)-th round of iteration, and M is a positive integer greater than 1.
  • in each round of iteration, the classification model used in that round generates a predicted classification; for example, in the M-th round, the classification model used in the M-th round of iteration generates a predicted classification, and the residual between that predicted classification and the actual classification is determined. For example, after the first round of iteration, the initial classification model is optimized according to the residual to obtain the classification model used in the second round of iteration, and then the second round of iterative computation is performed.
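  • A minimal sketch of this round-by-round correction, assuming a softmax prediction probability and with a stub standing in for the tree-fitting step (softmax, fit_corrector, and lr are illustrative names, not from the source; the distance factor enters through the residual computation described next):

```python
import numpy as np

def softmax(F):
    """p_k(x_i) from the prediction functions F, computed row-wise."""
    e = np.exp(F - F.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def fit_corrector(X, residual, lr=0.5):
    """Stub: a real implementation would fit a regression tree to
    (X, residual); here the corrector just moves the scores toward the
    residual, which is enough to show the control flow."""
    return lambda X_: lr * residual

def train(X, y_onehot, rounds=10):
    F = np.zeros_like(y_onehot, dtype=float)  # initial prediction functions
    correctors = []
    for m in range(rounds):
        residual = y_onehot - softmax(F)      # residual of round m
        h = fit_corrector(X, residual)        # correction used in round m+1
        correctors.append(h)
        F = F + h(X)                          # corrected classification model
    return F, correctors

# toy run: 3 samples, 7 categories labeled as in Table 1
X = np.arange(3).reshape(-1, 1)
y = np.eye(7)[[6, 5, 0]]
F, _ = train(X, y)
print(np.round(softmax(F), 2))  # probabilities concentrate on the true labels
```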
  • the determining a residual between the actual classification and the prediction classification generated by each iteration according to the gradient loss function included in the initial classification model includes:
  • a residual between the actual classification and each round of the predicted classification is determined according to the training parameter, the actual classification, and a distance factor that characterizes the difference between the category to which the actual classification belongs and the category to which each of the prediction categories belongs.
  • when the residual between the actual classification and each round of predicted classification is determined according to the training parameter, the actual classification, and the distance factor characterizing the gap between the category to which the actual classification belongs and the category to which each round of predicted classification belongs, the following formulas may be used to determine the residual between the predicted classification generated in the k-th round of iteration and the actual classification:
  • ỹ_ik = y_ik − p_k(x_i)
  • p_k(x_i) = exp(F′_k(x_i)) / Σ_{l=1}^{K} exp(F′_l(x_i))
  • F′_k(x_i) = D_yk · F_k(x_i)
  • where x_i is the training parameter;
  • i is a positive integer greater than 1;
  • y_ik is the actual classification;
  • ỹ_ik is the residual between the predicted classification generated in the k-th round of iteration and the actual classification;
  • p_k(x_i) is the prediction probability function of the k-th round of iteration;
  • F_k(x_i) is the prediction function of the k-th round of iteration;
  • D_yk is the distance factor characterizing the gap between the category to which the actual classification belongs and the category to which the k-th round predicted classification belongs;
  • F′_k(x_i) is the modified prediction function of the k-th round of iteration;
  • F′_l(x_i) is the modified prediction function of the l-th round of iteration, where l takes values from 1 to K, and K is the number of categories of the actual classification.
  • by contrast, when the xgboost classification model is trained without introducing the distance factor into the gradient loss function (the original gradient loss function), the prediction probability function of the k-th round of iteration is p_k(x_i) = exp(F_k(x_i)) / Σ_{l=1}^{K} exp(F_l(x_i)). The following uses the original gradient loss function as an example to illustrate the residual calculation process in the case of a classification error.
  • the embodiment of the present application provides different residuals for different classification errors, that is, different cost penalties are provided, thereby improving the accuracy of the xgboost classification model as a whole.
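  • The contrast can be illustrated with a short sketch on the primary-school example. The concrete form of the distance factor, D_yk = |label(actual) − label(k)|, is an assumption read off the label examples above; the document defines D_yk only through those examples:

```python
import numpy as np

K = 7        # labels per Table 1: doctoral=0 ... junior high=5, primary=6
ACTUAL = 6   # actual classification: primary school

def residuals(F, actual, with_distance):
    """Per-class residual y_ik - p_k(x_i) for one sample."""
    F = np.asarray(F, dtype=float)
    if with_distance:
        D = np.abs(np.arange(K) - actual)  # assumed D_yk = |label gap|
        F = D * F                          # F'_k = D_yk * F_k
    p = np.exp(F - F.max())
    p /= p.sum()                           # p_k(x_i), softmax
    y = np.eye(K)[actual]                  # one-hot actual classification
    return y - p

# two wrong predictions: scores peaked at doctoral (far) vs. junior high (near)
far  = [3.0, 0.5, 0.5, 0.5, 0.5, 0.5, 1.0]
near = [0.5, 0.5, 0.5, 0.5, 0.5, 3.0, 1.0]

for name, F in (("far", far), ("near", near)):
    print(name, "plain   :", np.round(residuals(F, ACTUAL, False), 3))
    print(name, "weighted:", np.round(residuals(F, ACTUAL, True), 3))
# with the plain softmax, both errors leave the same residual at the actual
# class; with the distance factor, the far error draws a much larger residual
```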
  • This technical solution can be used for ordered classification such as age and educational background.
  • an embodiment of a method for data classification provided by an embodiment of the present application includes:
  • the target classification model is used to classify the to-be-classified data to obtain a classification result.
  • the target classification model is a final classification model obtained by using the method for training a classification model in the foregoing embodiment.
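  • A small usage sketch, assuming the final classification model exposes per-category scores for the data to be classified (classify and final_scores are illustrative names, not from the source):

```python
import numpy as np

def classify(final_scores):
    """Return the label of the most probable category (labels per Table 1)."""
    p = np.exp(final_scores - np.max(final_scores))
    p /= p.sum()
    return int(np.argmax(p))

print(classify(np.array([0.1, 0.2, 0.1, 0.3, 0.5, 2.0, 2.2])))  # 6 -> primary school
```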
  • in the method for training a classification model provided by the embodiments of the present application, the training sample includes a training parameter and an actual classification, where the actual classification is the true classification associated with the training parameter. A distance factor may be introduced into the gradient loss function of the initial classification model, the distance factor indicating the gap between the category to which the actual classification belongs and the category to which the predicted classification belongs. In this way, when different classification errors occur, that is, when the degree of difference between the predicted classification and the actual classification differs, the corresponding distance factors differ, so the gradient loss function differs, and the residual between the actual classification and the predicted classification determined according to the gradient loss function differs. Since residuals of different sizes correspond to classification errors of different degrees, the initial classification model can be corrected in a targeted manner according to the residuals of different sizes, which can quickly improve the accuracy of the classification model. Further, because the classification accuracy of the classification model is improved, the accuracy of data classification is also improved when the classification model is used for data classification.
  • an embodiment of an apparatus 30 for training a classification model provided by an embodiment of the present application includes:
  • a sample obtaining unit 301 configured to acquire a training sample, where the training sample includes a training parameter and an actual classification corresponding to the training parameter;
  • the model training unit 302 is configured to perform classification training on the initial classification model by using the training parameters acquired by the sample obtaining unit 301 to obtain a prediction classification.
  • a residual determining unit 303, configured to determine, according to a gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification trained by the model training unit 302, the gradient loss function including a distance factor characterizing a gap between a first category and a second category, the first category being the category to which the predicted classification belongs, and the second category being the category to which the actual classification belongs;
  • the model modification unit 304 is configured to correct the initial classification model according to the residual determined by the residual determination unit 303 to obtain a final classification model.
  • in an embodiment of the present application, the sample obtaining unit 301 acquires a training sample for training a classification model, where the training sample includes a training parameter and an actual classification corresponding to the training parameter; the model training unit 302 performs classification training on the initial classification model by using the training parameter acquired by the sample obtaining unit 301, to obtain a predicted classification; the residual determining unit 303 determines, according to the gradient loss function included in the initial classification model, the residual between the actual classification and the predicted classification trained by the model training unit 302; and the model correction unit 304 corrects the initial classification model according to the residual determined by the residual determining unit 303, to obtain a final classification model.
  • in the method for training a classification model provided by the embodiments of the present application, the training sample includes a training parameter and an actual classification, where the actual classification is the true classification associated with the training parameter. A distance factor may be introduced into the gradient loss function of the initial classification model, the distance factor indicating the gap between the category to which the actual classification belongs and the category to which the predicted classification belongs. In this way, when different classification errors occur, that is, when the degree of difference between the predicted classification and the actual classification differs, the corresponding distance factors differ, so the gradient loss function differs, and the residual between the actual classification and the predicted classification determined according to the gradient loss function differs. Since residuals of different sizes correspond to classification errors of different degrees, the initial classification model can be corrected in a targeted manner according to the residuals of different sizes, which can quickly improve the accuracy of the classification model. Further, because the classification accuracy of the classification model is improved, the accuracy of data classification is also improved when the classification model is used for data classification.
  • the model training unit is configured to perform iterative calculation on the initial classification model using the training parameters, and obtain a prediction classification generated by the classification model used in each iteration;
  • the residual determining unit is configured to determine a residual between the actual classification and the predicted classification generated by each iteration according to the gradient loss function included in the initial classification model;
  • the model correction unit is configured to correct, according to the residual determined in the M-th round of iteration, the classification model used in the M-th round of iteration, to obtain the classification model used in the (M+1)-th round of iteration, the final classification model being obtained after at least one round of iterative correction, where the classification model used in the M-th round of iteration is obtained by correcting, according to the residual determined in the (M-1)-th round of iteration, the classification model used in the (M-1)-th round of iteration, and M is a positive integer greater than 1.
  • the residual determining unit is configured to determine, between the actual classification and each round of prediction classification, according to the training parameter, the actual classification, and a distance factor that characterizes a gap between a category to which the actual classification belongs and a category to which each of the prediction categories belongs Residual.
  • the residual determining unit is further configured to determine the residual between the predicted classification generated in the k-th round of iteration and the actual classification by using the following formulas:
  • ỹ_ik = y_ik − p_k(x_i)
  • p_k(x_i) = exp(F′_k(x_i)) / Σ_{l=1}^{K} exp(F′_l(x_i))
  • F′_k(x_i) = D_yk · F_k(x_i)
  • where x_i is the training parameter;
  • i is a positive integer greater than 1;
  • y_ik is the actual classification;
  • ỹ_ik is the residual between the predicted classification generated in the k-th round of iteration and the actual classification;
  • p_k(x_i) is the prediction probability function of the k-th round of iteration;
  • F_k(x_i) is the prediction function of the k-th round of iteration;
  • D_yk is the distance factor characterizing the gap between the category to which the actual classification belongs and the category to which the k-th round predicted classification belongs;
  • F′_k(x_i) is the modified prediction function of the k-th round of iteration;
  • F′_l(x_i) is the modified prediction function of the l-th round of iteration, where l takes values from 1 to K, and K is the number of categories of the actual classification.
  • the device for training a classification model provided by the embodiment of the present application can be understood with reference to the description in the foregoing method section; details are not repeated here.
  • an embodiment of the apparatus 40 for data classification provided by the embodiment of the present application includes:
  • a data receiving unit 401 configured to receive data to be classified
  • a data classification unit 402, configured to classify the to-be-classified data received by the data receiving unit 401 by using a target classification model, to obtain a classification result, where the target classification model is the final classification model obtained by the foregoing device for training a classification model;
  • the data output unit 403 is configured to output the classification result obtained by the data classification unit 402.
  • in an embodiment of the present application, the data receiving unit 401 receives the data to be classified; the data classification unit 402 classifies the data to be classified received by the data receiving unit 401 by using a target classification model, to obtain a classification result, where the target classification model is the final classification model obtained by the foregoing device for training a classification model; and the data output unit 403 outputs the classification result obtained by the data classification unit 402.
  • the apparatus for classifying data provided by the embodiment of the present application improves the classification accuracy of the classification model, thereby improving the accuracy of data classification.
  • the target classification model in this embodiment may be obtained according to any embodiment of FIG. 3; therefore, the device 40 in this embodiment may include the units included in any embodiment of FIG. 3.
  • the functions of the device for training a classification model described above may be implemented by a computing device such as a computer.
  • the following describes the process by which the computing device trains the classification model with reference to the form of the computing device.
  • FIG. 5 is a schematic structural diagram of an apparatus 50 for training a classification model according to an embodiment of the present application.
  • the apparatus 50 for training a classification model includes a processor 510, a memory 550, and a transceiver 530.
  • the memory 550 can include read only memory and random access memory, and provides operational instructions and data to the processor 510.
  • a portion of the memory 550 may also include non-volatile random access memory (NVRAM).
  • the memory 550 stores the following elements: executable modules or data structures, or a subset thereof, or an extended set thereof:
  • operation instructions: may be stored in an operating system;
  • in an embodiment of the present application, by invoking the operation instructions stored in the memory 550, the processor 510: acquires a training sample through the transceiver 530, where the training sample includes a training parameter and an actual classification corresponding to the training parameter; performs classification training on an initial classification model by using the training parameter, to obtain a predicted classification; determines, according to a gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification, the gradient loss function including a distance factor characterizing a gap between a first category and a second category, where the first category is the category to which the predicted classification belongs, and the second category is the category to which the actual classification belongs; and corrects the initial classification model according to the residual, to obtain a final classification model.
  • in the method for training a classification model provided by the embodiments of the present application, the training sample includes a training parameter and an actual classification, where the actual classification is the true classification associated with the training parameter. A distance factor may be introduced into the gradient loss function of the initial classification model, the distance factor indicating the gap between the category to which the actual classification belongs and the category to which the predicted classification belongs. In this way, when different classification errors occur, that is, when the degree of difference between the predicted classification and the actual classification differs, the corresponding distance factors differ, so the gradient loss function differs, and the residual between the actual classification and the predicted classification determined according to the gradient loss function differs. Since residuals of different sizes correspond to classification errors of different degrees, the initial classification model can be corrected in a targeted manner according to the residuals of different sizes, which can quickly improve the accuracy of the classification model. Further, because the classification accuracy of the classification model is improved, the accuracy of data classification is also improved when the classification model is used for data classification.
  • the processor 510 controls the operation of the device 50 for training the classification model; the processor 510 may also be referred to as a CPU (Central Processing Unit).
  • Memory 550 can include read only memory and random access memory and provides instructions and data to processor 510. A portion of the memory 550 may also include non-volatile random access memory (NVRAM).
  • in a specific application, the various components of the device 50 for training the classification model are coupled together by a bus system 520. In addition to a data bus, the bus system 520 may include a power bus, a control bus, a status signal bus, and the like; however, for clarity of description, the various buses are labeled as the bus system 520 in the figure.
  • Processor 510 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the above method may be completed by an integrated logic circuit of hardware in the processor 510 or an instruction in a form of software.
  • the processor 510 described above may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the methods, steps, and logical block diagrams disclosed in the embodiments of the present application can be implemented or executed.
  • the general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly implemented by the hardware decoding processor, or may be performed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
  • the storage medium is located in the memory 550, and the processor 510 reads the information in the memory 550 and performs the steps of the above method in combination with its hardware.
  • processor 510 is configured to:
  • correct, according to the residual determined in the M-th round of iteration, the classification model used in the M-th round of iteration, to obtain the classification model used in the (M+1)-th round of iteration, the final classification model being obtained after at least one round of iterative correction, where the classification model used in the M-th round of iteration is obtained by correcting, according to the residual determined in the (M-1)-th round of iteration, the classification model used in the (M-1)-th round of iteration, and M is a positive integer greater than 1.
  • processor 510 is configured to:
  • a residual between the actual classification and each round of the predicted classification is determined according to the training parameter, the actual classification, and a distance factor that characterizes the difference between the category to which the actual classification belongs and the category to which each of the prediction categories belongs.
  • the processor 510 may determine the residual between the predicted classification generated in the k-th round of iteration and the actual classification by using the following formulas:
  • ỹ_ik = y_ik − p_k(x_i)
  • p_k(x_i) = exp(F′_k(x_i)) / Σ_{l=1}^{K} exp(F′_l(x_i))
  • F′_k(x_i) = D_yk · F_k(x_i)
  • where x_i is the training parameter;
  • i is a positive integer greater than 1;
  • y_ik is the actual classification;
  • ỹ_ik is the residual between the predicted classification generated in the k-th round of iteration and the actual classification;
  • p_k(x_i) is the prediction probability function of the k-th round of iteration;
  • F_k(x_i) is the prediction function of the k-th round of iteration;
  • D_yk is the distance factor characterizing the gap between the category to which the actual classification belongs and the category to which the k-th round predicted classification belongs;
  • F′_k(x_i) is the modified prediction function of the k-th round of iteration;
  • F′_l(x_i) is the modified prediction function of the l-th round of iteration, where l takes values from 1 to K, and K is the number of categories of the actual classification.
  • the apparatus for training the classification model provided by the embodiment of the present application can be understood by referring to the related description in the parts of FIG. 1 to FIG. 4, and details are not repeated herein.
  • FIG. 6 is a schematic structural diagram of an apparatus 60 for data classification provided by an embodiment of the present application.
  • the device 60 for data classification includes a processor 610, a memory 650, and a transceiver 630. The memory 650 may include a read-only memory and a random access memory, and provides operation instructions and data to the processor 610.
  • a portion of the memory 650 can also include non-volatile random access memory (NVRAM).
  • the memory 650 stores the following elements, executable modules or data structures, or a subset thereof, or their extended set:
  • operation instructions: may be stored in an operating system;
  • in an embodiment of the present application, by invoking the operation instructions stored in the memory 650, the processor 610: receives, through the transceiver 630, data to be classified; classifies the data to be classified by using a target classification model, to obtain a classification result, where the target classification model is the final classification model obtained by the device 50 for training the classification model according to the foregoing embodiment; and outputs the classification result through the transceiver 630.
  • Compared with the data classification accuracy in the prior art, the device for data classification provided by the embodiment of the present application improves the classification accuracy of the classification model, thereby improving the accuracy of data classification.
  • the processor 610 controls the operation of the device 60 for data classification; the processor 610 may also be referred to as a CPU (Central Processing Unit).
  • Memory 650 can include read only memory and random access memory and provides instructions and data to processor 610. A portion of the memory 650 can also include non-volatile random access memory (NVRAM).
  • the various components of the device 60 for data classification in a particular application are coupled together by a bus system 620, which may include, in addition to the data bus, a power bus, a control bus, a status signal bus, and the like. However, for clarity of description, various buses are labeled as bus system 620 in the figure.
  • Processor 610 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the foregoing method may be completed by an integrated logic circuit of hardware in the processor 610 or an instruction in a form of software.
  • the processor 610 described above may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the methods, steps, and logical block diagrams disclosed in the embodiments of the present application can be implemented or executed.
  • the general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly implemented by the hardware decoding processor, or may be performed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
  • the storage medium is located in the memory 650, and the processor 610 reads the information in the memory 650 and performs the steps of the above method in combination with its hardware.
  • the target classification model in this embodiment may be obtained according to any embodiment of FIG. 5; therefore, the processor 610 in this embodiment may execute the operation instructions described in any embodiment of FIG. 5.
  • the apparatus for classifying data provided by the embodiment of the present application can be understood by referring to the related description in the parts of FIG. 1 to FIG. 4, and details are not repeatedly described herein.
  • the embodiment of the present application also provides a computer program product including instructions which, when run on a computer, cause the computer to perform the method for training a classification model according to any one of the foregoing embodiments.
  • the embodiment of the present application also provides a computer program product including instructions which, when run on a computer, cause the computer to perform the data classification method according to any one of the foregoing embodiments.
  • the program may be stored in a computer readable storage medium, and the storage medium may include: ROM, RAM, disk or CD.


Abstract

The present application discloses a method and apparatus for training a classification model. A distance factor is introduced into the gradient loss function of an initial classification model, the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification belongs. In this way, when different classification errors occur, that is, when the degree of difference between the predicted classification and the actual classification differs, the corresponding distance factors differ, so that the gradient loss functions differ, and in turn the residuals between the actual classification and the predicted classification determined from the gradient loss function differ. Since residuals of different magnitudes correspond to classification errors of different severity, the initial classification model can be corrected in a targeted manner according to the residuals of different magnitudes, which can quickly improve the accuracy of the classification model. Embodiments of the present application further provide a corresponding data classification method and apparatus.

Description

Method for training a classification model, and data classification method and apparatus
This application claims priority to Chinese Patent Application No. 201611139498.5, filed with the Chinese Patent Office on December 12, 2016 and entitled "Method for training a classification model, and data classification method and apparatus", which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of data processing technologies, and in particular to a method and apparatus for training a classification model and for classifying data.
Background
The eXtreme Gradient Boosting (Xgboost) ensemble tree model is an ensemble learning model for classification implemented in C++ on the principle of the Gradient Boosting Decision Tree (GBDT). It is characterized by using the multi-threading of a Central Processing Unit (CPU) to achieve classification with high precision and fast computation.
Xgboost can make classification errors when classifying. For example, classifying a primary school student as a junior high school student, or classifying a primary school student as a doctoral student, are both classification errors; that is, there is a problem of inaccurate classification. In addition, in the model training stage, classification errors are given a cost penalty, so that the classification accuracy of the model is gradually improved. At present, however, all classification errors are given the same cost penalty, which is not conducive to quickly improving the classification accuracy of the model.
Summary
To solve the problems in the prior art of inaccurate classification model training and a slow training speed, an embodiment of the present application provides a method for training a classification model. By introducing, into the gradient loss function of an initial classification model, a distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification belongs, residuals of different magnitudes can be produced for different classification errors, so that the classification accuracy of the classification model can be improved quickly. An embodiment of the present application further provides a corresponding data classification method, which can improve the accuracy of data classification, and corresponding apparatuses.
An embodiment of the present application provides a method for training a classification model, including:
obtaining a training sample, the training sample including a training parameter and an actual classification corresponding to the training parameter;
performing classification training on an initial classification model by using the training parameter, to obtain a predicted classification;
determining, according to a gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification, the gradient loss function including a distance factor representing a gap between a first category and a second category, the first category being the category to which the predicted classification belongs, and the second category being the category to which the actual classification belongs; and
correcting the initial classification model according to the residual, to obtain a final classification model.
In a possible implementation, the performing classification training on an initial classification model by using the training parameter, to obtain a predicted classification includes:
performing iterative computation on the initial classification model by using the training parameter, to obtain a predicted classification produced by the classification model used in each round of iteration;
correspondingly, the determining, according to the gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification includes:
determining, according to the gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification produced in each round of iteration; and
correspondingly, the correcting the initial classification model according to the residual, to obtain a final classification model includes:
correcting, according to a residual determined in round M of the iteration, the classification model used in round M, to obtain a classification model used in round M+1, and obtaining the final classification model after at least one round of iterative correction, where the classification model used in round M is obtained by correcting the classification model used in round M-1 according to a residual determined in round M-1, and M is a positive integer greater than 1.
In a possible implementation, the determining, according to the gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification produced in each round of iteration includes:
determining the residual between the actual classification and the predicted classification of each round according to the training parameter, the actual classification, and a distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of each round belongs.
In a possible implementation, the determining the residual between the actual classification and the predicted classification of each round according to the training parameter, the actual classification, and the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of each round belongs includes:
determining the residual between the predicted classification produced in round k of the iteration and the actual classification by using the following formulas:
ỹ_ik = y_ik - p_k(x_i)
p_k(x_i) = exp(F′_k(x_i)) / Σ_{l=1}^{K} exp(F′_l(x_i))
F′_k(x_i) = D_yk * F_k(x_i)
where x_i is the training parameter, i is a positive integer greater than 1, y_ik is the actual classification, ỹ_ik is the residual between the predicted classification produced in round k of the iteration and the actual classification, p_k(x_i) is the prediction probability function of round k, F_k(x_i) is the prediction function of round k, D_yk is the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of round k belongs, F′_k(x_i) is the corrected prediction function of round k, F′_l(x_i) is the corrected prediction function of round l, and l ranges from 1 to K, where K is the number of classes of the actual classification.
In this implementation, a distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification belongs is introduced into the gradient loss function of the initial classification model. In this way, when different classification errors occur, that is, when the degree of difference between the predicted classification and the actual classification differs, the corresponding distance factors differ, so that the gradient loss functions differ, and in turn the residuals between the actual classification and the predicted classification determined from the gradient loss function differ. Since residuals of different magnitudes correspond to classification errors of different severity, the initial classification model can be corrected in a targeted manner according to the residuals of different magnitudes, which can quickly improve the accuracy of the classification model.
An embodiment of the present application provides a data classification method, including:
receiving data to be classified;
classifying the data to be classified by using a target classification model, to obtain a classification result, where the target classification model is a final classification model obtained by any one of the foregoing methods for training a classification model; and
outputting the classification result.
When data classification is performed by using the final classification model obtained by any one of the foregoing methods for training a classification model, the accuracy of data classification is improved because the classification accuracy of the final classification model is high.
An embodiment of the present application provides an apparatus for training a classification model, including:
a sample obtaining unit, configured to obtain a training sample, the training sample including a training parameter and an actual classification corresponding to the training parameter;
a model training unit, configured to perform classification training on an initial classification model by using the training parameter obtained by the sample obtaining unit, to obtain a predicted classification;
a residual determining unit, configured to determine, according to a gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification trained by the model training unit, the gradient loss function including a distance factor representing a gap between a first category and a second category, the first category being the category to which the predicted classification belongs, and the second category being the category to which the actual classification belongs; and
a model correcting unit, configured to correct the initial classification model according to the residual determined by the residual determining unit, to obtain a final classification model.
In a possible implementation, the model training unit is configured to perform iterative computation on the initial classification model by using the training parameter, to obtain a predicted classification produced by the classification model used in each round of iteration;
the residual determining unit is configured to determine, according to the gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification produced in each round of iteration; and
the model correcting unit is configured to correct, according to a residual determined in round M of the iteration, the classification model used in round M, to obtain a classification model used in round M+1, and obtain the final classification model after at least one round of iterative correction, where the classification model used in round M is obtained by correcting the classification model used in round M-1 according to a residual determined in round M-1, and M is a positive integer greater than 1.
In a possible implementation, the residual determining unit is configured to determine the residual between the actual classification and the predicted classification of each round according to the training parameter, the actual classification, and a distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of each round belongs.
In a possible implementation, the residual determining unit is configured to determine the residual between the predicted classification produced in round k of the iteration and the actual classification by using the following formulas:
ỹ_ik = y_ik - p_k(x_i)
p_k(x_i) = exp(F′_k(x_i)) / Σ_{l=1}^{K} exp(F′_l(x_i))
F′_k(x_i) = D_yk * F_k(x_i)
where x_i is the training parameter, i is a positive integer greater than 1, y_ik is the actual classification, ỹ_ik is the residual between the predicted classification produced in round k of the iteration and the actual classification, p_k(x_i) is the prediction probability function of round k, F_k(x_i) is the prediction function of round k, D_yk is the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of round k belongs, F′_k(x_i) is the corrected prediction function of round k, F′_l(x_i) is the corrected prediction function of round l, and l ranges from 1 to K, where K is the number of classes of the actual classification.
For beneficial effects of possible implementations of the parts of the apparatus for training a classification model provided in this embodiment of the present application, refer to the beneficial effects of the corresponding methods in the foregoing method for training a classification model.
An embodiment of the present application provides an apparatus for data classification, including:
a data receiving unit, configured to receive data to be classified;
a data classification unit, configured to classify, by using a target classification model, the data to be classified received by the data receiving unit, to obtain a classification result, where the target classification model is a final classification model obtained by any one of the foregoing apparatuses for training a classification model; and
a data output unit, configured to output the classification result obtained by the data classification unit.
For beneficial effects of possible implementations of the parts of the apparatus for data classification provided in this embodiment of the present application, refer to the beneficial effects of the corresponding methods in the foregoing data classification method.
An embodiment of the present application provides a device for training a classification model, including:
a processor and a memory;
the memory being configured to store program code and transmit the program code to the processor; and
the processor being configured to perform, according to instructions in the program code, any one of the foregoing methods for training a classification model.
An embodiment of the present application provides a device for data classification, the device including:
a processor and a memory;
the memory being configured to store program code and transmit the program code to the processor; and
the processor being configured to perform, according to instructions in the program code, any one of the foregoing data classification methods.
For beneficial effects of possible implementations of the parts of the device for data classification provided in this embodiment of the present application, refer to the beneficial effects of the corresponding methods in the foregoing data classification method.
An embodiment of the present application provides a storage medium, the storage medium being configured to store program code, the program code being used to perform any one of the foregoing methods for training a classification model.
For beneficial effects of possible implementations of the parts of this storage medium provided in this embodiment of the present application, refer to the beneficial effects of the corresponding methods in the foregoing method for training a classification model.
An embodiment of the present application provides a storage medium, the storage medium being configured to store program code, the program code being used to perform any one of the foregoing data classification methods.
For beneficial effects of possible implementations of the parts of this storage medium provided in this embodiment of the present application, refer to the beneficial effects of the corresponding methods in the foregoing data classification method.
An embodiment of the present application provides a computer program product including instructions which, when run on a computer, cause the computer to perform any one of the foregoing methods for training a classification model.
For beneficial effects of possible implementations of the parts of this computer program product provided in this embodiment of the present application, refer to the beneficial effects of the corresponding methods in the foregoing method for training a classification model.
An embodiment of the present application provides a computer program product including instructions which, when run on a computer, cause the computer to perform any one of the foregoing data classification methods.
For beneficial effects of possible implementations of the parts of this computer program product provided in this embodiment of the present application, refer to the beneficial effects of the corresponding methods in the foregoing data classification method.
An embodiment of the present application provides a method for training a classification model, including:
obtaining, by a terminal, a training sample, the training sample including a training parameter and an actual classification corresponding to the training parameter;
performing, by the terminal, classification training on an initial classification model by using the training parameter, to obtain a predicted classification;
determining, by the terminal according to a gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification, the gradient loss function including a distance factor representing a gap between a first category and a second category, the first category being the category to which the predicted classification belongs, and the second category being the category to which the actual classification belongs; and
correcting, by the terminal, the initial classification model according to the residual, to obtain a final classification model.
In a possible implementation, the performing, by the terminal, classification training on an initial classification model by using the training parameter, to obtain a predicted classification includes:
performing, by the terminal, iterative computation on the initial classification model by using the training parameter, to obtain a predicted classification produced by the classification model used in each round of iteration;
correspondingly, the determining, by the terminal according to the gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification includes:
determining, by the terminal according to the gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification produced in each round of iteration; and
correspondingly, the correcting, by the terminal, the initial classification model according to the residual, to obtain a final classification model includes:
correcting, by the terminal according to a residual determined in round M of the iteration, the classification model used in round M, to obtain a classification model used in round M+1, and obtaining the final classification model after at least one round of iterative correction, where the classification model used in round M is obtained by correcting the classification model used in round M-1 according to a residual determined in round M-1, and M is a positive integer greater than 1.
In a possible implementation, the determining, by the terminal according to the gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification produced in each round of iteration includes:
determining, by the terminal, the residual between the actual classification and the predicted classification of each round according to the training parameter, the actual classification, and a distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of each round belongs.
In a possible implementation, the determining, by the terminal, the residual between the actual classification and the predicted classification of each round according to the training parameter, the actual classification, and the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of each round belongs includes:
determining, by the terminal, the residual between the predicted classification produced in round k of the iteration and the actual classification by using the following formulas:
ỹ_ik = y_ik - p_k(x_i)
p_k(x_i) = exp(F′_k(x_i)) / Σ_{l=1}^{K} exp(F′_l(x_i))
F′_k(x_i) = D_yk * F_k(x_i)
where x_i is the training parameter, i is a positive integer greater than 1, y_ik is the actual classification, ỹ_ik is the residual between the predicted classification produced in round k of the iteration and the actual classification, p_k(x_i) is the prediction probability function of round k, F_k(x_i) is the prediction function of round k, D_yk is the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of round k belongs, F′_k(x_i) is the corrected prediction function of round k, F′_l(x_i) is the corrected prediction function of round l, and l ranges from 1 to K, where K is the number of classes of the actual classification.
An embodiment of the present application provides a data classification method, including:
receiving, by a terminal, data to be classified;
classifying, by the terminal, the data to be classified by using a target classification model, to obtain a classification result, where the target classification model is a final classification model obtained by any one of the foregoing methods for training a classification model; and
outputting, by the terminal, the classification result.
Compared with the prior art, in which classification model training is inaccurate and slow, in the method for training a classification model provided in this embodiment of the present application, the training sample includes a training parameter and an actual classification, the actual classification being the classification that actually corresponds to the training parameter. After classification training is performed on the initial classification model to obtain a predicted classification, since the predicted classification may differ from the actual classification, a distance factor can be introduced into the gradient loss function of the initial classification model, the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification belongs. In this way, when different classification errors occur, that is, when the degree of difference between the predicted classification and the actual classification differs, the corresponding distance factors differ, so that the gradient loss functions differ, and in turn the residuals between the actual classification and the predicted classification determined from the gradient loss function differ. Since residuals of different magnitudes correspond to classification errors of different severity, the initial classification model can be corrected in a targeted manner according to the residuals of different magnitudes, which can quickly improve the accuracy of the classification model. Further, once the classification accuracy of the classification model is improved, the accuracy of data classification is also improved when the classification model is used to classify data.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of an embodiment of a method for training a classification model according to an embodiment of the present application;
FIG. 2 is a schematic diagram of an embodiment of a data classification method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an embodiment of an apparatus for training a classification model according to an embodiment of the present application;
FIG. 4 is a schematic diagram of an embodiment of an apparatus for data classification according to an embodiment of the present application;
FIG. 5 is a schematic diagram of another embodiment of the apparatus for training a classification model according to an embodiment of the present application;
FIG. 6 is a schematic diagram of another embodiment of the apparatus for data classification according to an embodiment of the present application.
Detailed Description
An embodiment of the present application provides a method for training a classification model in which a distance factor is introduced into the gradient loss function of an initial classification model. In this way, when different classification errors occur, the corresponding distance factors differ, so that the gradient loss functions differ, and in turn the residuals between the actual classification and the predicted classification determined from the gradient loss function differ. Since residuals of different magnitudes correspond to classification errors of different severity, the initial classification model can be corrected in a targeted manner according to the residuals of different magnitudes, which can quickly improve the accuracy of the classification model. An embodiment of the present application further provides a corresponding data classification method that uses a classification model trained by the foregoing method to classify data, which can improve the accuracy of data classification, and corresponding apparatuses. Detailed descriptions are given below.
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings in the embodiments of the present application. Apparently, the described embodiments are merely some rather than all of the embodiments of the present application.
Data classification generally means grouping together data having common attributes or features. Data classification is widely used in many fields. For example, in information promotion, a user's educational background can be classified according to the user's historical browsing information on the network, or the user's age can be classified according to the user's historical browsing information on the network, so that information suitable for that educational background or age can be conveniently pushed to the user, achieving precise push.
Through research, the applicant found that when classifying data, a pre-trained classification model is usually used, and xgboost is a classification model with relatively high classification accuracy in current use; the full name of xgboost is eXtreme Gradient Boosting. A Boosting classifier belongs to ensemble learning models. Its basic idea is to combine hundreds or thousands of tree models, each with low classification accuracy, into a model with very high accuracy. The model iterates continuously, generating a new tree in each iteration. The xgboost model applies the idea of gradient descent when generating a new tree in each iteration: on the basis of all the trees generated in previous iterations, it continues iterating in the direction that minimizes a given objective function.
When the xgboost classification model is trained with the current method, the predicted classifications obtained from the training samples may differ, that is, different classification errors occur; however, the current method of training the xgboost classification model produces the same residual for different classification errors. For example, taking educational background as an example, it can be divided into seven categories: doctoral, master's, bachelor's, junior college, senior high school, junior high school, and primary school. Classifying the training sample of a primary school student as junior high school and classifying the training sample of a primary school student as doctoral are different classification errors made on the same training sample, yet the residuals they produce are equal, which makes it difficult to determine the direction in which the classification model should be corrected. Therefore, to solve the problem that the residuals produced by classifying a training sample into different classes are equal, so that different classification errors cannot be corrected in a targeted manner according to the residuals, resulting in a low training speed of the classification model, an embodiment of the present application provides a method for training a classification model that can quickly train a high-accuracy classification model.
Since classification and regression are essentially the same in a mathematical model, the difference being that classification deals with discrete data while regression deals with continuous data, the classification model provided in the embodiments of the present application is actually a classification-regression model; the classification model is not limited to data classification and can also be applied to data regression.
Referring to FIG. 1, an embodiment of the method for training a classification model provided in an embodiment of the present application includes the following steps:
101. Obtain a training sample, where the training sample includes a training parameter and an actual classification corresponding to the training parameter.
Training a classification model requires a large number of training samples. Each training sample may include a training parameter used to train the classification model and an actual classification. The actual classification may be the classification direction corresponding to the training parameter and is accurate; the training parameter may be a series of parameters associated with the actual classification, and the training parameter corresponds to the actual classification. Taking the training of an age classification model as an example, the training parameters may be parameters such as favorite colors, types of sports, dietary preferences, and dress preferences, and the actual classifications may be age values such as 18, 30, and 50. Taking educational background as an example, the training parameters may be parameters such as the types of books read, the types of activities participated in, and the types of official accounts followed, and the actual classifications may be doctoral, master's, bachelor's, junior college, senior high school, junior high school, and primary school.
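For concreteness, a training sample can be pictured as a pair of training parameters and an actual classification. The following minimal sketch is illustrative only; the field names and values are hypothetical and do not come from the application:

```python
# Hypothetical training samples for an educational-background classifier:
# each sample pairs training parameters (features) with the actual
# classification that truly corresponds to those parameters.
training_samples = [
    ({"book_type": "picture books", "activity": "playground games"}, "primary school"),
    ({"book_type": "textbooks", "activity": "science club"}, "junior high school"),
    ({"book_type": "journals", "activity": "academic conferences"}, "doctoral"),
]
```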
102. Perform classification training on an initial classification model by using the training parameter, to obtain a predicted classification.
The initial classification model may be developed in advance by developers and stored in a computer; once the training parameter is input into the computer, the initial classification model can start the iteration process.
In this embodiment of the present application, each round of iteration can produce the predicted classification produced by the classification model used in that round, and the predicted classification of each round can be used to optimize the classification model used in the next round.
103. Determine, according to the gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification, where the gradient loss function includes a distance factor representing a gap between a first category and a second category, the first category being the category to which the predicted classification belongs, and the second category being the category to which the actual classification belongs.
The categories in this embodiment of the present application can be represented in the form of numeric labels. For example, taking educational background as the category, the category labels corresponding to the educational background categories are shown in Table 1 below:
Table 1: Category label table
Category               Label
Doctoral               0
Master's               1
Bachelor's             2
Junior college         3
Senior high school     4
Junior high school     5
Primary school         6
Of course, Table 1 here is merely an example. The division of educational background categories is not limited to these; there may also be categories such as kindergarten, postdoctoral, and technical secondary school. However, no matter how many categories there are, the principle is the same: each category corresponds to a category label.
If the training parameter corresponds to a primary school student, the actual classification of the primary school student is primary school, whose label value is 6. If the primary school student is classified as doctoral, the predicted classification is doctoral, whose label value is 0, and the distance factor representing the gap between primary school and doctoral takes the value 6. If the primary school student is classified as a junior high school student, the predicted classification is junior high school, whose label value is 5, and the distance factor representing the gap between primary school and junior high school takes the value 1.
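The values worked out above (6 between primary school and doctoral, 1 between primary school and junior high school) are consistent with taking the absolute difference of the numeric labels in Table 1. The following is a minimal sketch under that assumption, not part of the original application:

```python
# Numeric labels from Table 1 (0 = doctoral ... 6 = primary school).
LABELS = {"doctoral": 0, "master": 1, "bachelor": 2, "junior college": 3,
          "senior high": 4, "junior high": 5, "primary school": 6}

def distance_factor(actual: str, predicted: str) -> int:
    """Distance factor between the actual and predicted categories,
    assumed here to be the absolute difference of their labels."""
    return abs(LABELS[actual] - LABELS[predicted])

assert distance_factor("primary school", "doctoral") == 6     # severe error
assert distance_factor("primary school", "junior high") == 1  # mild error
```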
It can be seen that, when the classification model is used, the two classification errors of misclassifying a primary school student as junior high school and misclassifying a primary school student as doctoral produce different distance factors, so that their gradient loss functions differ, and in turn the residuals between the actual classification and the predicted classification determined from the gradient loss function differ. Since residuals of different magnitudes correspond to classification errors of different severity, the initial classification model can then be corrected in a targeted manner according to the residuals of different magnitudes.
104. Correct the initial classification model according to the residual, to obtain the final classification model.
Compared with the prior art, in which classification model training is inaccurate and slow, in the method for training a classification model provided in this embodiment of the present application, the training sample includes a training parameter and an actual classification, the actual classification being the classification that actually corresponds to the training parameter. After classification training is performed on the initial classification model to obtain a predicted classification, since the predicted classification may differ from the actual classification, a distance factor can be introduced into the gradient loss function of the initial classification model, the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification belongs. In this way, when different classification errors occur, that is, when the degree of difference between the predicted classification and the actual classification differs, the corresponding distance factors differ, so that the gradient loss functions differ, and in turn the residuals between the actual classification and the predicted classification determined from the gradient loss function differ. Since residuals of different magnitudes correspond to classification errors of different severity, the initial classification model can be corrected in a targeted manner according to the residuals of different magnitudes, which can quickly improve the accuracy of the classification model. Further, once the classification accuracy of the classification model is improved, the accuracy of data classification is also improved when the classification model is used to classify data.
In this embodiment, since the final classification model is obtained by continuously iterating and correcting the initial classification model with the training samples, the performing classification training on an initial classification model by using the training parameter, to obtain a predicted classification may include:
performing iterative computation on the initial classification model by using the training parameter, to obtain a predicted classification produced by the classification model used in each round of iteration;
correspondingly, the determining, according to the gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification may include:
determining, according to the gradient loss function included in the initial classification model, a residual between the actual classification and the prediction result of each round; and
correspondingly, the correcting the initial classification model according to the residual, to obtain a final classification model may include:
correcting, according to a residual determined in round M of the iteration, the classification model used in round M, to obtain a classification model used in round M+1, and obtaining the final classification model after at least one round of iterative correction, where the classification model used in round M is obtained by correcting the classification model used in round M-1 according to a residual determined in round M-1, and M is a positive integer greater than 1.
In this embodiment of the present application, each round of iteration can obtain the predicted classification produced by the classification model used in that round. For example, if the current round is round M, the residual of round M can be determined according to the predicted classification produced by the classification model used in round M and the actual classification, and the residual of round M is used to correct the classification model used in round M, achieving one optimization of the classification model. For example, if M = 1, the initial classification model is trained with the training parameter to produce the predicted classification of the first round, the residual of the first round is determined according to the predicted classification of the first round and the actual classification, the initial classification model is optimized by using the residual of the first round to obtain the classification model used in round 2, and then the iteration of round 2 is performed.
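As a rough sketch of the round-by-round correction just described (the residual of round M yields the model used in round M+1), a generic gradient-boosting loop might look as follows. The helper functions are hypothetical placeholders for the residual computation and tree-fitting steps, not APIs from the application:

```python
def train_boosted_classifier(samples, labels, n_rounds,
                             compute_residuals, fit_tree):
    """Each round fits a new tree to the residuals of the current
    ensemble and adds it, correcting the model used in that round."""
    trees = []  # the ensemble; its accumulated scores play the role of F_k(x)
    for m in range(n_rounds):
        # Residual between the actual classification and the predicted
        # classification produced by the model used in round m.
        residuals = compute_residuals(trees, samples, labels)
        # The corrected model (trees + new tree) is used in round m + 1.
        trees.append(fit_tree(samples, residuals))
    return trees
```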
As an example, the determining, according to the gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification produced in each round of iteration includes:
determining the residual between the actual classification and the predicted classification of each round according to the training parameter, the actual classification, and the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of each round belongs.
During implementation, when the residual between the actual classification and the predicted classification of each round is determined according to the training parameter, the actual classification, and the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of each round belongs, the following formulas may be used to determine the residual between the predicted classification produced in round k of the iteration and the actual classification:
ỹ_ik = y_ik - p_k(x_i)
p_k(x_i) = exp(F′_k(x_i)) / Σ_{l=1}^{K} exp(F′_l(x_i))
F′_k(x_i) = D_yk * F_k(x_i)
where x_i is the training parameter, i is a positive integer greater than 1, y_ik is the actual classification, ỹ_ik is the residual between the predicted classification produced in round k of the iteration and the actual classification, p_k(x_i) is the prediction probability function of round k, F_k(x_i) is the prediction function of round k, D_yk is the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of round k belongs, F′_k(x_i) is the corrected prediction function of round k, F′_l(x_i) is the corrected prediction function of round l, and l ranges from 1 to K, where K is the number of classes of the actual classification.
It should be noted that when the xgboost classification model is trained with no distance factor introduced into the gradient loss function (the original gradient loss function), the prediction probability function of round k is:
p_k(x_i) = exp(F_k(x_i)) / Σ_{l=1}^{K} exp(F_l(x_i))
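Expressed in code, the formulas above might be implemented as the following sketch, which computes the round-k residual ỹ_ik = y_ik - p_k(x_i) from the corrected scores F′_k(x_i) = D_yk * F_k(x_i). Applying the distance factors element-wise across all K classes is an assumption made here for illustration; passing no distance factors reduces the computation to the original softmax:

```python
import math

def round_k_residuals(scores, one_hot_label, distances=None):
    """y_ik - p_k(x_i), where p_k is the softmax of the corrected scores
    F'_k(x_i) = D_yk * F_k(x_i). With distances=None, the distance factors
    are all 1 and this is the original prediction probability function."""
    if distances is None:
        distances = [1.0] * len(scores)
    corrected = [d * f for d, f in zip(distances, scores)]
    exps = [math.exp(f) for f in corrected]
    total = sum(exps)
    probs = [e / total for e in exps]
    return [y - p for y, p in zip(one_hot_label, probs)]

# Primary-school sample y1 from Table 2 with the junior-high prediction:
y1 = (0, 0, 0, 0, 0, 0, 1)
print(round_k_residuals((0, 0, 0, 0.3, 0, 0.8, 0), y1))
# ~ (-0.12, -0.12, -0.12, -0.16, -0.12, -0.26, 0.88), as in the text
```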
The following uses the original gradient loss function as an example to describe the residual computation process when a classification error occurs.
Still taking the educational background classification in Table 1 as an example, there are three training samples, as shown in Table 2:
Table 2: Educational background classification training samples
Sample    Label    Education             Xgboost label vector
y1        6        Primary school        y1 = (0,0,0,0,0,0,1)
y2        5        Junior high school    y2 = (0,0,0,0,0,1,0)
y3        0        Doctoral              y3 = (1,0,0,0,0,0,0)
Taking the prediction process for the primary school student training sample y1 as an example: suppose the predicted classification of the (k-1)-th tree model is F_{k-1}(x) = (0,0,0,0.3,0,0.8,0). This prediction classifies the primary school student as a junior high school student, and the residual corresponding to the k-th tree model is:
Target_k = y1 - p_{k-1}
= (0,0,0,0,0,0,1) - (0.12,0.12,0.12,0.16,0.12,0.26,0.12)
= (-0.12,-0.12,-0.12,-0.16,-0.12,-0.26,0.88)
Suppose instead that the predicted classification is F_{k-1}(x) = (0.8,0,0,0.3,0,0,0). This prediction classifies the primary school student as a doctoral student, and the residual corresponding to the k-th tree model is:
Target_k = y1 - p_{k-1}
= (0,0,0,0,0,0,1) - (0.26,0.12,0.12,0.16,0.12,0.12,0.12)
= (-0.26,-0.12,-0.12,-0.16,-0.12,-0.12,0.88)
From the above two results, it can be seen that the residual obtained when the predicted classification is junior high school and the residual obtained when the predicted classification is doctoral are numerically equal; they differ only in their positions in the vector.
Continuing with the prediction process for training sample y1 in Table 2, if the gradient loss function in this embodiment of the present application is used to compute the residual when a classification error occurs, the residual computation process is as follows:
When the primary school student is predicted as a junior high school student, the prediction probability is p_{k-1}(x) = (0.12,0.12,0.12,0.16,0.12,0.26,0.12), and the resulting residual is:
Target_k = y1 - p_{k-1}
= (0,0,0,0,0,0,1) - (0.12,0.12,0.12,0.16,0.12,0.26,0.12)
= (-0.12,-0.12,-0.12,-0.16,-0.12,-0.26,0.88)
When the primary school student is predicted as a doctoral student, the prediction probability is p_{k-1}(x) = (0.95,0.008,0.008,0.01,0.008,0.008,0.008), and the resulting residual is:
Target_k = y1 - p_{k-1}
= (0,0,0,0,0,0,1) - (0.95,0.008,0.008,0.01,0.008,0.008,0.008)
= (-0.95,-0.008,-0.008,-0.01,-0.008,-0.008,0.992)
Target_k in the above examples of the present application is the residual ỹ_ik in the foregoing formulas.
Comparing the above two results shows that when the gradient loss function in this embodiment of the present application is used to compute the residual, different residuals are produced for different classification errors; that is, the residual obtained when the predicted classification is junior high school differs from the residual obtained when the predicted classification is doctoral. This makes the correction target clear and facilitates fast optimization of the classification model.
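The probability vectors in the examples above can be checked numerically. The figures in the doctoral-prediction example are consistent with scaling only the top-scoring class by its distance factor before applying the softmax; the sketch below adopts that reading purely to verify the arithmetic, not as the definitive form of the method:

```python
import math

def example_probabilities(scores, actual_label):
    """Scale the top-scoring class by its distance from the actual class,
    then apply the softmax (one reading of the worked example above)."""
    predicted = max(range(len(scores)), key=lambda k: scores[k])
    corrected = list(scores)
    corrected[predicted] *= abs(actual_label - predicted)  # distance factor
    exps = [math.exp(f) for f in corrected]
    total = sum(exps)
    return [round(e / total, 3) for e in exps]

# Primary school student (label 6) predicted as doctoral (label 0):
print(example_probabilities([0.8, 0, 0, 0.3, 0, 0, 0], actual_label=6))
# -> [0.95, 0.008, 0.008, 0.011, 0.008, 0.008, 0.008], matching the text
#    up to rounding (the text shows 0.01 for the fourth entry).
```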
This embodiment of the present application provides different residuals, that is, different cost penalties, for different classification errors, thereby improving the accuracy of the xgboost classification model as a whole. The technical solution can be used for ordinal classification, such as by age or educational background.
Referring to FIG. 2, an embodiment of the data classification method provided in an embodiment of the present application includes the following steps:
201. Receive data to be classified.
202. Classify the data to be classified by using a target classification model, to obtain a classification result, where the target classification model is the final classification model obtained by using the method for training a classification model in the foregoing embodiments.
203. Output the classification result.
Compared with the prior art, in which classification model training is inaccurate and slow, in the method for training a classification model provided in this embodiment of the present application, the training sample includes a training parameter and an actual classification, the actual classification being the classification that actually corresponds to the training parameter. After classification training is performed on the initial classification model to obtain a predicted classification, since the predicted classification may differ from the actual classification, a distance factor can be introduced into the gradient loss function of the initial classification model, the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification belongs. In this way, when different classification errors occur, that is, when the degree of difference between the predicted classification and the actual classification differs, the corresponding distance factors differ, so that the gradient loss functions differ, and in turn the residuals between the actual classification and the predicted classification determined from the gradient loss function differ. Since residuals of different magnitudes correspond to classification errors of different severity, the initial classification model can be corrected in a targeted manner according to the residuals of different magnitudes, which can quickly improve the accuracy of the classification model. Further, once the classification accuracy of the classification model is improved, the accuracy of data classification is also improved when the classification model is used to classify data.
Referring to FIG. 3, an embodiment of the apparatus 30 for training a classification model provided in an embodiment of the present application includes:
a sample obtaining unit 301, configured to obtain a training sample, the training sample including a training parameter and an actual classification corresponding to the training parameter;
a model training unit 302, configured to perform classification training on an initial classification model by using the training parameter obtained by the sample obtaining unit 301, to obtain a predicted classification;
a residual determining unit 303, configured to determine, according to a gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification trained by the model training unit 302, the gradient loss function including a distance factor representing a gap between a first category and a second category, the first category being the category to which the predicted classification belongs, and the second category being the category to which the actual classification belongs; and
a model correcting unit 304, configured to correct the initial classification model according to the residual determined by the residual determining unit 303, to obtain a final classification model.
In this embodiment of the present application, the sample obtaining unit 301 obtains a training sample used to train the classification model, the training sample including a training parameter and an actual classification corresponding to the training parameter; the model training unit 302 performs classification training on the initial classification model by using the training parameter obtained by the sample obtaining unit 301, to obtain a predicted classification; the residual determining unit 303 determines, according to the gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification trained by the model training unit 302, the gradient loss function including a distance factor representing a gap between a first category and a second category, the first category being the category to which the predicted classification belongs, and the second category being the category to which the actual classification belongs; and the model correcting unit 304 corrects the initial classification model according to the residual determined by the residual determining unit 303, to obtain a final classification model.
Compared with the prior art, in which classification model training is inaccurate and slow, in the training of a classification model provided in this embodiment of the present application, the training sample includes a training parameter and an actual classification, the actual classification being the classification that actually corresponds to the training parameter. After classification training is performed on the initial classification model to obtain a predicted classification, since the predicted classification may differ from the actual classification, a distance factor can be introduced into the gradient loss function of the initial classification model, the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification belongs. In this way, when different classification errors occur, that is, when the degree of difference between the predicted classification and the actual classification differs, the corresponding distance factors differ, so that the gradient loss functions differ, and in turn the residuals between the actual classification and the predicted classification determined from the gradient loss function differ. Since residuals of different magnitudes correspond to classification errors of different severity, the initial classification model can be corrected in a targeted manner according to the residuals of different magnitudes, which can quickly improve the accuracy of the classification model. Further, once the classification accuracy of the classification model is improved, the accuracy of data classification is also improved when the classification model is used to classify data.
As an example, in another embodiment of the apparatus 30 for training a classification model provided in an embodiment of the present application,
the model training unit is configured to perform iterative computation on the initial classification model by using the training parameter, to obtain a predicted classification produced by the classification model used in each round of iteration;
the residual determining unit is configured to determine, according to the gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification produced in each round of iteration; and
the model correcting unit is configured to correct, according to a residual determined in round M of the iteration, the classification model used in round M, to obtain a classification model used in round M+1, and obtain the final classification model after at least one round of iterative correction, where the classification model used in round M is obtained by correcting the classification model used in round M-1 according to a residual determined in round M-1, and M is a positive integer greater than 1.
As an example, in another embodiment of the apparatus 30 for training a classification model provided in an embodiment of the present application,
the residual determining unit is configured to determine the residual between the actual classification and the predicted classification of each round according to the training parameter, the actual classification, and the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of each round belongs; and
the residual determining unit is further configured to determine the residual between the predicted classification produced in round k of the iteration and the actual classification by using the following formulas:
ỹ_ik = y_ik - p_k(x_i)
p_k(x_i) = exp(F′_k(x_i)) / Σ_{l=1}^{K} exp(F′_l(x_i))
F′_k(x_i) = D_yk * F_k(x_i)
where x_i is the training parameter, i is a positive integer greater than 1, y_ik is the actual classification, ỹ_ik is the residual between the predicted classification produced in round k of the iteration and the actual classification, p_k(x_i) is the prediction probability function of round k, F_k(x_i) is the prediction function of round k, D_yk is the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of round k belongs, F′_k(x_i) is the corrected prediction function of round k, F′_l(x_i) is the corrected prediction function of round l, and l ranges from 1 to K, where K is the number of classes of the actual classification.
The apparatus for training a classification model provided in this embodiment of the present application can be understood with reference to the descriptions in the foregoing method parts, and details are not repeated here.
Referring to FIG. 4, an embodiment of the apparatus 40 for data classification provided in an embodiment of the present application includes:
a data receiving unit 401, configured to receive data to be classified;
a data classification unit 402, configured to classify, by using a target classification model, the data to be classified received by the data receiving unit 401, to obtain a classification result, where the target classification model is the final classification model obtained by the foregoing apparatus for training a classification model; and
a data output unit 403, configured to output the classification result obtained by the data classification unit 402.
In this embodiment of the present application, the data receiving unit 401 receives data to be classified; the data classification unit 402 classifies, by using a target classification model, the data to be classified received by the data receiving unit 401, to obtain a classification result, where the target classification model is the final classification model obtained by the foregoing apparatus for training a classification model; and the data output unit 403 outputs the classification result obtained by the data classification unit 402. In the apparatus for data classification provided in this embodiment of the present application, since the classification accuracy of the classification model is improved, the accuracy of data classification is also improved.
It should be noted that the target classification model in this embodiment may be obtained according to any embodiment of FIG. 3; therefore, the apparatus 40 in this embodiment may include the units included in any embodiment of FIG. 3.
In this embodiment of the present application, the apparatus for training a classification model may be implemented by a computing device such as a computer. The following describes, with reference to the form of the computing device, the process in which the computing device is used to train the classification model.
FIG. 5 is a schematic structural diagram of an apparatus 50 for training a classification model according to an embodiment of the present application. The apparatus 50 for training a classification model includes a processor 510, a memory 550, and a transceiver 530. The memory 550 may include a read-only memory and a random access memory, and provides operation instructions and data to the processor 510. A portion of the memory 550 may further include a non-volatile random access memory (NVRAM).
In some implementations, the memory 550 stores the following elements: executable modules or data structures, or a subset thereof, or an extended set thereof.
In this embodiment of the present application, by invoking the operation instructions stored in the memory 550 (the operation instructions may be stored in an operating system), the apparatus:
obtains a training sample through the transceiver 530, the training sample including a training parameter and an actual classification corresponding to the training parameter;
performs classification training on an initial classification model by using the training parameter, to obtain a predicted classification;
determines, according to a gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification, the gradient loss function including a distance factor representing a gap between a first category and a second category, the first category being the category to which the predicted classification belongs, and the second category being the category to which the actual classification belongs; and
corrects the initial classification model according to the residual, to obtain a final classification model.
Compared with the prior art, in which classification model training is inaccurate and slow, in the method for training a classification model provided in this embodiment of the present application, the training sample includes a training parameter and an actual classification, the actual classification being the classification that actually corresponds to the training parameter. After classification training is performed on the initial classification model to obtain a predicted classification, since the predicted classification may differ from the actual classification, a distance factor can be introduced into the gradient loss function of the initial classification model, the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification belongs. In this way, when different classification errors occur, that is, when the degree of difference between the predicted classification and the actual classification differs, the corresponding distance factors differ, so that the gradient loss functions differ, and in turn the residuals between the actual classification and the predicted classification determined from the gradient loss function differ. Since residuals of different magnitudes correspond to classification errors of different severity, the initial classification model can be corrected in a targeted manner according to the residuals of different magnitudes, which can quickly improve the accuracy of the classification model. Further, once the classification accuracy of the classification model is improved, the accuracy of data classification is also improved when the classification model is used to classify data.
The processor 510 controls the operation of the apparatus 50 for training a classification model; the processor 510 may also be referred to as a CPU (Central Processing Unit). The memory 550 may include a read-only memory and a random access memory, and provides instructions and data to the processor 510. A portion of the memory 550 may further include a non-volatile random access memory (NVRAM). In a specific application, the components of the apparatus 50 for training a classification model are coupled together by a bus system 520, where the bus system 520 may include, in addition to a data bus, a power bus, a control bus, a status signal bus, and the like. However, for clarity of description, the various buses are all labeled as the bus system 520 in the figure.
The methods disclosed in the foregoing embodiments of the present application may be applied to the processor 510, or implemented by the processor 510. The processor 510 may be an integrated circuit chip with signal processing capability. During implementation, the steps of the foregoing methods may be completed by an integrated logic circuit of hardware in the processor 510 or by instructions in the form of software. The processor 510 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the methods disclosed in the embodiments of the present application may be directly executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 550; the processor 510 reads the information in the memory 550 and completes the steps of the foregoing methods in combination with its hardware.
As an example, the processor 510 is configured to:
perform iterative computation on the initial classification model by using the training parameter, to obtain a predicted classification produced by the classification model used in each round of iteration;
determine, according to the gradient loss function included in the initial classification model, a residual between the actual classification and the predicted classification produced in each round of iteration; and
correct, according to a residual determined in round M of the iteration, the classification model used in round M, to obtain a classification model used in round M+1, and obtain the final classification model after at least one round of iterative correction, where the classification model used in round M is obtained by correcting the classification model used in round M-1 according to a residual determined in round M-1, and M is a positive integer greater than 1.
As an example, the processor 510 is configured to:
determine the residual between the actual classification and the predicted classification of each round according to the training parameter, the actual classification, and the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of each round belongs.
During implementation, the processor 510 may use the following formulas to determine the residual between the predicted classification produced in round k of the iteration and the actual classification:
ỹ_ik = y_ik - p_k(x_i)
p_k(x_i) = exp(F′_k(x_i)) / Σ_{l=1}^{K} exp(F′_l(x_i))
F′_k(x_i) = D_yk * F_k(x_i)
where x_i is the training parameter, i is a positive integer greater than 1, y_ik is the actual classification, ỹ_ik is the residual between the predicted classification produced in round k of the iteration and the actual classification, p_k(x_i) is the prediction probability function of round k, F_k(x_i) is the prediction function of round k, D_yk is the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of round k belongs, F′_k(x_i) is the corrected prediction function of round k, F′_l(x_i) is the corrected prediction function of round l, and l ranges from 1 to K, where K is the number of classes of the actual classification.
The apparatus for training a classification model provided in this embodiment of the present application can be understood with reference to the related descriptions in the parts of FIG. 1 to FIG. 4, and details are not repeated here.
FIG. 6 is a schematic structural diagram of an apparatus 60 for data classification according to an embodiment of the present application. The apparatus 60 for data classification includes a processor 610, a memory 650, and a transceiver 630. The memory 650 may include a read-only memory and a random access memory, and provides operation instructions and data to the processor 610. A portion of the memory 650 may further include a non-volatile random access memory (NVRAM).
In some implementations, the memory 650 stores the following elements: executable modules or data structures, or a subset thereof, or an extended set thereof.
In this embodiment of the present application, by invoking the operation instructions stored in the memory 650 (the operation instructions may be stored in an operating system), the apparatus:
receives data to be classified through the transceiver 630;
classifies the data to be classified by using a target classification model, to obtain a classification result, where the target classification model is the final classification model obtained by the apparatus 50 for training a classification model in the foregoing embodiments; and
outputs the classification result through the transceiver 630.
Compared with the prior art, in which the accuracy of data classification is not high enough, the apparatus for data classification provided in this embodiment of the present application improves the accuracy of data classification because the classification accuracy of the classification model is improved.
The processor 610 controls the operation of the apparatus 60 for data classification; the processor 610 may also be referred to as a CPU (Central Processing Unit). The memory 650 may include a read-only memory and a random access memory, and provides instructions and data to the processor 610. A portion of the memory 650 may further include a non-volatile random access memory (NVRAM). In a specific application, the components of the apparatus 60 for data classification are coupled together by a bus system 620, where the bus system 620 may include, in addition to a data bus, a power bus, a control bus, a status signal bus, and the like. However, for clarity of description, the various buses are all labeled as the bus system 620 in the figure.
The methods disclosed in the foregoing embodiments of the present application may be applied to the processor 610, or implemented by the processor 610. The processor 610 may be an integrated circuit chip with signal processing capability. During implementation, the steps of the foregoing methods may be completed by an integrated logic circuit of hardware in the processor 610 or by instructions in the form of software. The processor 610 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the methods disclosed in the embodiments of the present application may be directly executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 650; the processor 610 reads the information in the memory 650 and completes the steps of the foregoing methods in combination with its hardware.
It should be noted that the target classification model in this embodiment may be obtained according to any embodiment of FIG. 5; therefore, the processor 610 in this embodiment may execute the operation instructions executed in any embodiment of FIG. 5.
The apparatus for data classification provided in this embodiment of the present application can be understood with reference to the related descriptions in the parts of FIG. 1 to FIG. 4, and details are not repeated here.
An embodiment of the present application further provides a computer program product including instructions which, when run on a computer, cause the computer to perform the method for training a classification model according to any one of the foregoing embodiments.
An embodiment of the present application further provides a computer program product including instructions which, when run on a computer, cause the computer to perform the data classification method according to any one of the foregoing embodiments.
A person of ordinary skill in the art can understand that all or some of the steps of the methods in the foregoing embodiments may be completed by a program instructing relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include a ROM, a RAM, a magnetic disk, or an optical disc.
The method for training a classification model, the data classification method, and the apparatuses provided in the embodiments of the present application are described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the descriptions of the above embodiments are merely intended to help understand the method of the present application and its core idea. Meanwhile, a person of ordinary skill in the art may make changes to the specific implementations and the application scope according to the idea of the present application. In conclusion, the content of this specification should not be construed as a limitation on the present application.

Claims (21)

  1. A method for training a classification model, comprising:
    obtaining a training sample, the training sample comprising a training parameter and an actual classification corresponding to the training parameter;
    performing classification training on an initial classification model by using the training parameter, to obtain a predicted classification;
    determining, according to a gradient loss function comprised in the initial classification model, a residual between the actual classification and the predicted classification, the gradient loss function comprising a distance factor representing a gap between a first category and a second category, the first category being the category to which the predicted classification belongs, and the second category being the category to which the actual classification belongs; and
    correcting the initial classification model according to the residual, to obtain a final classification model.
  2. The method according to claim 1, wherein the performing classification training on an initial classification model by using the training parameter, to obtain a predicted classification comprises:
    performing iterative computation on the initial classification model by using the training parameter, to obtain a predicted classification produced by the classification model used in each round of iteration;
    correspondingly, the determining, according to the gradient loss function comprised in the initial classification model, a residual between the actual classification and the predicted classification comprises:
    determining, according to the gradient loss function comprised in the initial classification model, a residual between the actual classification and the predicted classification produced in each round of iteration; and
    correspondingly, the correcting the initial classification model according to the residual, to obtain a final classification model comprises:
    correcting, according to a residual determined in round M of the iteration, the classification model used in round M, to obtain a classification model used in round M+1, and obtaining the final classification model after at least one round of iterative correction, wherein the classification model used in round M is obtained by correcting the classification model used in round M-1 according to a residual determined in round M-1, and M is a positive integer greater than 1.
  3. The method according to claim 2, wherein the determining, according to the gradient loss function comprised in the initial classification model, a residual between the actual classification and the predicted classification produced in each round of iteration comprises:
    determining the residual between the actual classification and the predicted classification of each round according to the training parameter, the actual classification, and a distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of each round belongs.
  4. The method according to claim 3, wherein the determining the residual between the actual classification and the predicted classification of each round according to the training parameter, the actual classification, and the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of each round belongs comprises:
    determining the residual between the predicted classification produced in round k of the iteration and the actual classification by using the following formulas:
    ỹ_ik = y_ik - p_k(x_i)
    p_k(x_i) = exp(F′_k(x_i)) / Σ_{l=1}^{K} exp(F′_l(x_i))
    F′_k(x_i) = D_yk * F_k(x_i)
    wherein x_i is the training parameter, i is a positive integer greater than 1, y_ik is the actual classification, ỹ_ik is the residual between the predicted classification produced in round k of the iteration and the actual classification, p_k(x_i) is the prediction probability function of round k, F_k(x_i) is the prediction function of round k, D_yk is the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of round k belongs, F′_k(x_i) is the corrected prediction function of round k, F′_l(x_i) is the corrected prediction function of round l, and l ranges from 1 to K, wherein K is the number of classes of the actual classification.
  5. A data classification method, comprising:
    receiving data to be classified;
    classifying the data to be classified by using a target classification model, to obtain a classification result, wherein the target classification model is a final classification model obtained by the method according to any one of claims 1 to 4; and
    outputting the classification result.
  6. An apparatus for training a classification model, comprising:
    a sample obtaining unit, configured to obtain a training sample, the training sample comprising a training parameter and an actual classification corresponding to the training parameter;
    a model training unit, configured to perform classification training on an initial classification model by using the training parameter obtained by the sample obtaining unit, to obtain a predicted classification;
    a residual determining unit, configured to determine, according to a gradient loss function comprised in the initial classification model, a residual between the actual classification and the predicted classification trained by the model training unit, the gradient loss function comprising a distance factor representing a gap between a first category and a second category, the first category being the category to which the predicted classification belongs, and the second category being the category to which the actual classification belongs; and
    a model correcting unit, configured to correct the initial classification model according to the residual determined by the residual determining unit, to obtain a final classification model.
  7. The apparatus according to claim 6, wherein
    the model training unit is configured to perform iterative computation on the initial classification model by using the training parameter, to obtain a predicted classification produced by the classification model used in each round of iteration;
    the residual determining unit is configured to determine, according to the gradient loss function comprised in the initial classification model, a residual between the actual classification and the predicted classification produced in each round of iteration; and
    the model correcting unit is configured to correct, according to a residual determined in round M of the iteration, the classification model used in round M, to obtain a classification model used in round M+1, and obtain the final classification model after at least one round of iterative correction, wherein the classification model used in round M is obtained by correcting the classification model used in round M-1 according to a residual determined in round M-1, and M is a positive integer greater than 1.
  8. The apparatus according to claim 7, wherein
    the residual determining unit is configured to determine the residual between the actual classification and the predicted classification of each round according to the training parameter, the actual classification, and a distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of each round belongs.
  9. The apparatus according to claim 8, wherein
    the residual determining unit is configured to determine the residual between the predicted classification produced in round k of the iteration and the actual classification by using the following formulas:
    ỹ_ik = y_ik - p_k(x_i)
    p_k(x_i) = exp(F′_k(x_i)) / Σ_{l=1}^{K} exp(F′_l(x_i))
    F′_k(x_i) = D_yk * F_k(x_i)
    wherein x_i is the training parameter, i is a positive integer greater than 1, y_ik is the actual classification, ỹ_ik is the residual between the predicted classification produced in round k of the iteration and the actual classification, p_k(x_i) is the prediction probability function of round k, F_k(x_i) is the prediction function of round k, D_yk is the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of round k belongs, F′_k(x_i) is the corrected prediction function of round k, F′_l(x_i) is the corrected prediction function of round l, and l ranges from 1 to K, wherein K is the number of classes of the actual classification.
  10. An apparatus for data classification, comprising:
    a data receiving unit, configured to receive data to be classified;
    a data classification unit, configured to classify, by using a target classification model, the data to be classified received by the data receiving unit, to obtain a classification result, wherein the target classification model is a final classification model obtained by the apparatus according to any one of claims 6 to 9; and
    a data output unit, configured to output the classification result obtained by the data classification unit.
  11. A device for training a classification model, the device comprising:
    a processor and a memory;
    the memory being configured to store program code and transmit the program code to the processor; and
    the processor being configured to perform, according to instructions in the program code, the method for training a classification model according to any one of claims 1 to 4.
  12. A device for data classification, the device comprising:
    a processor and a memory;
    the memory being configured to store program code and transmit the program code to the processor; and
    the processor being configured to perform, according to instructions in the program code, the data classification method according to claim 5.
  13. A storage medium, configured to store program code, wherein the program code is used to perform the method for training a classification model according to any one of claims 1 to 4.
  14. A storage medium, configured to store program code, wherein the program code is used to perform the data classification method according to claim 5.
  15. A computer program product comprising instructions which, when run on a computer, cause the computer to perform the method for training a classification model according to any one of claims 1 to 4.
  16. A computer program product comprising instructions which, when run on a computer, cause the computer to perform the data classification method according to claim 5.
  17. A method for training a classification model, comprising:
    obtaining, by a terminal, a training sample, the training sample comprising a training parameter and an actual classification corresponding to the training parameter;
    performing, by the terminal, classification training on an initial classification model by using the training parameter, to obtain a predicted classification;
    determining, by the terminal according to a gradient loss function comprised in the initial classification model, a residual between the actual classification and the predicted classification, the gradient loss function comprising a distance factor representing a gap between a first category and a second category, the first category being the category to which the predicted classification belongs, and the second category being the category to which the actual classification belongs; and
    correcting, by the terminal, the initial classification model according to the residual, to obtain a final classification model.
  18. The method according to claim 17, wherein the performing, by the terminal, classification training on an initial classification model by using the training parameter, to obtain a predicted classification comprises:
    performing, by the terminal, iterative computation on the initial classification model by using the training parameter, to obtain a predicted classification produced by the classification model used in each round of iteration;
    correspondingly, the determining, by the terminal according to the gradient loss function comprised in the initial classification model, a residual between the actual classification and the predicted classification comprises:
    determining, by the terminal according to the gradient loss function comprised in the initial classification model, a residual between the actual classification and the predicted classification produced in each round of iteration; and
    correspondingly, the correcting, by the terminal, the initial classification model according to the residual, to obtain a final classification model comprises:
    correcting, by the terminal according to a residual determined in round M of the iteration, the classification model used in round M, to obtain a classification model used in round M+1, and obtaining the final classification model after at least one round of iterative correction, wherein the classification model used in round M is obtained by correcting the classification model used in round M-1 according to a residual determined in round M-1, and M is a positive integer greater than 1.
  19. The method according to claim 18, wherein the determining, by the terminal according to the gradient loss function comprised in the initial classification model, a residual between the actual classification and the predicted classification produced in each round of iteration comprises:
    determining, by the terminal, the residual between the actual classification and the predicted classification of each round according to the training parameter, the actual classification, and a distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of each round belongs.
  20. The method according to claim 19, wherein the determining, by the terminal, the residual between the actual classification and the predicted classification of each round according to the training parameter, the actual classification, and the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of each round belongs comprises:
    determining, by the terminal, the residual between the predicted classification produced in round k of the iteration and the actual classification by using the following formulas:
    ỹ_ik = y_ik - p_k(x_i)
    p_k(x_i) = exp(F′_k(x_i)) / Σ_{l=1}^{K} exp(F′_l(x_i))
    F′_k(x_i) = D_yk * F_k(x_i)
    wherein x_i is the training parameter, i is a positive integer greater than 1, y_ik is the actual classification, ỹ_ik is the residual between the predicted classification produced in round k of the iteration and the actual classification, p_k(x_i) is the prediction probability function of round k, F_k(x_i) is the prediction function of round k, D_yk is the distance factor representing the gap between the category to which the actual classification belongs and the category to which the predicted classification of round k belongs, F′_k(x_i) is the corrected prediction function of round k, F′_l(x_i) is the corrected prediction function of round l, and l ranges from 1 to K, wherein K is the number of classes of the actual classification.
  21. A data classification method, comprising:
    receiving, by a terminal, data to be classified;
    classifying, by the terminal, the data to be classified by using a target classification model, to obtain a classification result, wherein the target classification model is a final classification model obtained by the method according to any one of claims 17 to 20; and
    outputting, by the terminal, the classification result.
PCT/CN2017/107626 2016-12-12 2017-10-25 Method for training classification model, and data classification method and apparatus WO2018107906A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/286,894 US11386353B2 (en) 2016-12-12 2019-02-27 Method and apparatus for training classification model, and method and apparatus for classifying data

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201611139498.5A CN108615044A (zh) 2016-12-12 2016-12-12 一种分类模型训练的方法、数据分类的方法及装置
CN201611139498.5 2016-12-12

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/286,894 Continuation US11386353B2 (en) 2016-12-12 2019-02-27 Method and apparatus for training classification model, and method and apparatus for classifying data

Publications (1)

Publication Number Publication Date
WO2018107906A1 true WO2018107906A1 (zh) 2018-06-21

Family

ID=62557907

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/107626 WO2018107906A1 (zh) 2016-12-12 2017-10-25 一种训练分类模型的方法、数据分类的方法及装置

Country Status (3)

Country Link
US (1) US11386353B2 (zh)
CN (1) CN108615044A (zh)
WO (1) WO2018107906A1 (zh)

Families Citing this family (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109189937B (zh) * 2018-08-22 2021-02-09 创新先进技术有限公司 一种特征关系推荐方法及装置、一种计算设备及存储介质
JP7176359B2 (ja) 2018-11-05 2022-11-22 株式会社リコー 学習装置および学習方法
CN109829490B (zh) * 2019-01-22 2022-03-22 上海鹰瞳医疗科技有限公司 修正向量搜索方法、目标分类方法及设备
CN109858558B (zh) * 2019-02-13 2022-01-21 北京达佳互联信息技术有限公司 分类模型的训练方法、装置、电子设备及存储介质
CN110210233B (zh) * 2019-04-19 2024-05-24 平安科技(深圳)有限公司 预测模型的联合构建方法、装置、存储介质及计算机设备
CN110263638B (zh) * 2019-05-16 2023-04-18 山东大学 一种基于显著信息的视频分类方法
US11750436B2 (en) * 2019-05-30 2023-09-05 Nokia Technologies Oy Learning in communication systems
CN112149706B (zh) * 2019-06-28 2024-03-15 北京百度网讯科技有限公司 模型训练方法、装置、设备和介质
CN110378306B (zh) * 2019-07-25 2021-11-02 厦门美图之家科技有限公司 年龄预测方法、装置及图像处理设备
CN112396445A (zh) * 2019-08-16 2021-02-23 京东数字科技控股有限公司 用于识别用户身份信息的方法和装置
CN110751197A (zh) * 2019-10-14 2020-02-04 上海眼控科技股份有限公司 图片分类方法、图片模型训练方法及设备
CN111224890A (zh) * 2019-11-08 2020-06-02 北京浪潮数据技术有限公司 一种云平台的流量分类方法、系统及相关设备
CN111695593A (zh) * 2020-04-29 2020-09-22 平安科技(深圳)有限公司 基于XGBoost的数据分类方法、装置、计算机设备及存储介质
CN111696636B (zh) * 2020-05-15 2023-09-22 平安科技(深圳)有限公司 一种基于深度神经网络的数据处理方法及装置
US20220083571A1 (en) * 2020-09-16 2022-03-17 Synchrony Bank Systems and methods for classifying imbalanced data
CN112270547A (zh) * 2020-10-27 2021-01-26 上海淇馥信息技术有限公司 基于特征构造的金融风险评估方法、装置和电子设备
CN113762005A (zh) * 2020-11-09 2021-12-07 北京沃东天骏信息技术有限公司 特征选择模型的训练、对象分类方法、装置、设备及介质
CN112508062A (zh) * 2020-11-20 2021-03-16 普联国际有限公司 一种开集数据的分类方法、装置、设备及存储介质
CN114519114A (zh) * 2020-11-20 2022-05-20 北京达佳互联信息技术有限公司 多媒体资源分类模型构建方法、装置、服务器及存储介质
CN112418520B (zh) * 2020-11-22 2022-09-20 同济大学 一种基于联邦学习的信用卡交易风险预测方法
CN112465043B (zh) * 2020-12-02 2024-05-14 平安科技(深圳)有限公司 模型训练方法、装置和设备
CN112651458B (zh) * 2020-12-31 2024-04-02 深圳云天励飞技术股份有限公司 分类模型的训练方法、装置、电子设备及存储介质
CN112633407B (zh) * 2020-12-31 2023-10-13 深圳云天励飞技术股份有限公司 分类模型的训练方法、装置、电子设备及存储介质
CN113011529B (zh) * 2021-04-28 2024-05-07 平安科技(深圳)有限公司 文本分类模型的训练方法、装置、设备及可读存储介质
CN113065614B (zh) * 2021-06-01 2021-08-31 北京百度网讯科技有限公司 分类模型的训练方法和对目标对象进行分类的方法
CN116343201B (zh) * 2023-05-29 2023-09-19 安徽高哲信息技术有限公司 谷粒类别识别方法、装置及计算机设备

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080101705A1 (en) * 2006-10-31 2008-05-01 Motorola, Inc. System for pattern recognition with q-metrics
US20100250523A1 (en) * 2009-03-31 2010-09-30 Yahoo! Inc. System and method for learning a ranking model that optimizes a ranking evaluation metric for ranking search results of a search query
US20150006259A1 (en) * 2013-06-27 2015-01-01 Kyruus, Inc. Methods and systems for providing performance improvement recommendations to professionals
US20150006422A1 (en) * 2013-07-01 2015-01-01 Eharmony, Inc. Systems and methods for online employment matching
US20150302755A1 (en) * 2014-04-22 2015-10-22 Google Inc. Measurement of educational content effectiveness
US9564123B1 (en) * 2014-05-12 2017-02-07 Soundhound, Inc. Method and system for building an integrated user profile
US9552549B1 (en) * 2014-07-28 2017-01-24 Google Inc. Ranking approach to train deep neural nets for multilabel image annotation

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050100209A1 (en) * 2003-07-02 2005-05-12 Lockheed Martin Corporation Self-optimizing classifier
CN104850531A (zh) * 2014-02-19 2015-08-19 日本电气株式会社 一种建立数学模型的方法和装置
CN104102705A (zh) * 2014-07-09 2014-10-15 南京大学 一种基于大间隔分布学习的数字媒体对象分类方法
CN105787046A (zh) * 2016-02-28 2016-07-20 华东理工大学 一种基于单边动态下采样的不平衡数据分类系统

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020082865A1 (zh) * 2018-10-24 2020-04-30 阿里巴巴集团控股有限公司 用于构建机器学习模型的特征选取方法、装置以及设备
CN111680754A (zh) * 2020-06-11 2020-09-18 北京字节跳动网络技术有限公司 图像分类方法、装置、电子设备及计算机可读存储介质
CN111680754B (zh) * 2020-06-11 2023-09-19 抖音视界有限公司 图像分类方法、装置、电子设备及计算机可读存储介质
CN112528109A (zh) * 2020-12-01 2021-03-19 中科讯飞互联(北京)信息科技有限公司 一种数据分类方法、装置、设备及存储介质
CN112528109B (zh) * 2020-12-01 2023-10-27 科大讯飞(北京)有限公司 一种数据分类方法、装置、设备及存储介质
CN112883193A (zh) * 2021-02-25 2021-06-01 中国平安人寿保险股份有限公司 一种文本分类模型的训练方法、装置、设备以及可读介质
CN114663714A (zh) * 2022-05-23 2022-06-24 阿里巴巴(中国)有限公司 图像分类、地物分类方法和装置
CN115130592A (zh) * 2022-07-01 2022-09-30 中昊芯英(杭州)科技有限公司 一种样本生成芯片
CN117274724A (zh) * 2023-11-22 2023-12-22 电子科技大学 基于可变类别温度蒸馏的焊缝缺陷分类方法
CN117274724B (zh) * 2023-11-22 2024-02-13 电子科技大学 基于可变类别温度蒸馏的焊缝缺陷分类方法

Also Published As

Publication number Publication date
CN108615044A (zh) 2018-10-02
US20190197429A1 (en) 2019-06-27
US11386353B2 (en) 2022-07-12

Similar Documents

Publication Publication Date Title
WO2018107906A1 (zh) 一种训练分类模型的方法、数据分类的方法及装置
US11373090B2 (en) Techniques for correcting linguistic training bias in training data
CN110147456B (zh) 一种图像分类方法、装置、可读存储介质及终端设备
US10922628B2 (en) Method and apparatus for machine learning
Gu et al. Incremental support vector learning for ordinal regression
CN108897829B (zh) 数据标签的修正方法、装置和存储介质
Lan et al. Sparse factor analysis for learning and content analytics
US10262272B2 (en) Active machine learning
Ardia Financial Risk Management with Bayesian Estimation of GARCH Models Theory and Applications
Doppa et al. HC-Search: A learning framework for search-based structured prediction
Stempfel et al. Learning SVMs from sloppily labeled data
CN109086654B (zh) 手写模型训练方法、文本识别方法、装置、设备及介质
Johansson et al. Conformal prediction using decision trees
WO2021174723A1 (zh) 训练样本扩充方法、装置、电子设备及存储介质
CN103559504A (zh) 图像目标类别识别方法及装置
CN111026544B (zh) 图网络模型的节点分类方法、装置及终端设备
JP6807909B2 (ja) データ評価方法、装置、機器及び読み取り可能な記憶媒体
WO2022052484A1 (zh) 文本情绪识别方法、装置、终端设备和存储介质
US20210406464A1 (en) Skill word evaluation method and device, electronic device, and non-transitory computer readable storage medium
WO2020143301A1 (zh) 一种训练样本有效性检测方法、计算机设备及计算机非易失性存储介质
CN112420125A (zh) 分子属性预测方法、装置、智能设备和终端
Yousefnezhad et al. A new selection strategy for selective cluster ensemble based on diversity and independency
JP2019067299A (ja) ラベル推定装置及びラベル推定プログラム
WO2021147405A1 (zh) 客服语句质检方法及相关设备
Sivasubramanian et al. Adaptive mixing of auxiliary losses in supervised learning

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17881373

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17881373

Country of ref document: EP

Kind code of ref document: A1