CN111222553A

CN111222553A - Training data processing method and device of machine learning model and computer equipment

Info

Publication number: CN111222553A
Application number: CN201911403575.7A
Authority: CN
Inventors: 饶慧林
Original assignee: Guangzhou Huaduo Network Technology Co Ltd
Current assignee: Guangzhou Cubesili Information Technology Co Ltd
Priority date: 2019-12-30
Filing date: 2019-12-30
Publication date: 2020-06-02
Anticipated expiration: 2039-12-30
Also published as: CN111222553B

Abstract

The application provides a training data processing method and apparatus for a machine learning model, and computer equipment, and relates to the technical field of machine learning model training. The training data processing method comprises the following steps: acquiring a feature parameter set of the updated training data of the machine learning model, wherein the feature parameter set comprises a plurality of candidate feature parameters; determining the range of the feature parameters to be selected according to the type of the machine learning model and the type of the feature parameters; within that range, selecting feature parameters from the feature parameter set in sequence and inputting them into the machine learning model for training; and acquiring an output result of the machine learning model, calculating an AUC value of the output result, and selecting a target feature parameter from the feature parameter set as training data according to the AUC value. The scheme can improve the efficiency of model training.

Description

Training data processing method and device of machine learning model and computer equipment

Technical Field

The application relates to the technical field of machine learning model training, in particular to a training data processing method and device of a machine learning model and computer equipment.

Background

When a machine learning model is trained, features often need to be added to or modified in the model. To enlarge the training samples, different variables must be added to the features, or different feature combinations must be input into the machine learning model one by one. Each such run requires tedious training and waiting, so training efficiency is low.

Disclosure of Invention

In order to overcome the problem of low training efficiency of the current machine learning model, the following technical scheme is specially provided:

in a first aspect, the present application provides a method for processing training data of a machine learning model, including the following steps:

acquiring a characteristic parameter set of the updated training data of the machine learning model; wherein the feature parameter set comprises a plurality of candidate feature parameters;

determining the range of the characteristic parameters to be selected according to the type of the machine learning model and the type of the characteristic parameters;

in the range of the characteristic parameters, selecting characteristic parameters from the characteristic parameter set in sequence and inputting the characteristic parameters into the machine learning model for training;

and acquiring an output result of the machine learning model, calculating an AUC value of the output result, and selecting a target characteristic parameter from the characteristic parameter set as training data according to the AUC value.

In one embodiment, the step of obtaining the feature parameter set of the training data of the updated machine learning model includes:

and acquiring newly added or modified characteristic parameters of the training data of the machine learning model, and updating a characteristic parameter set.

In one embodiment, the step of sequentially selecting feature parameters from the feature parameter set within the range of the feature parameters and inputting the selected feature parameters into the machine learning model for training includes:

determining, according to the granularity of the characteristic parameters, the interval between two successively obtained characteristic parameters within the range of the characteristic parameters;

and sequentially acquiring each characteristic parameter within the range of the characteristic parameters according to the interval of the characteristic parameters.

In one embodiment, the step of determining the range of the feature parameters to be selected according to the type of the machine learning model and the type of the feature parameters includes:

confirming the value characteristics of the newly added or modified characteristic parameters according to the type of the machine learning model;

and determining the range of the characteristic parameter to be selected according to the value characteristics of the newly added or modified characteristic parameter.

In one embodiment, the step of sequentially obtaining each feature parameter within the range of the feature parameter according to the interval of the feature parameter includes:

when the characteristic parameters newly added or modified by the training data of the machine learning model are continuous characteristic parameters, sequentially acquiring each characteristic parameter according to the interval of the characteristic parameters in the range of the characteristic parameters;

and inputting each characteristic parameter into the machine learning model for training;

when the training of the machine learning model requires a plurality of characteristic parameters, the characteristic parameters comprise a discrete feature quantity;

and in the range of the characteristic parameters, sequentially acquiring the combination of the characteristic parameters corresponding to the characteristic quantity from the characteristic parameter set, and inputting the combination into the machine learning model for training.

In one embodiment, the step of sequentially acquiring combinations of feature parameters corresponding to feature quantities from the feature parameter set within the range of the feature parameters and inputting the combinations into the machine learning model for training includes:

and according to the corresponding feature quantity, sequentially acquiring all combinations of feature parameters from the feature parameter set, and inputting the combinations of feature parameters into the machine learning model one by one for training.

In a second aspect, the present application further provides a training data processing apparatus for a machine learning model, including:

the acquisition module is used for acquiring a characteristic parameter set of the updated training data of the machine learning model; wherein the feature parameter set comprises a plurality of candidate feature parameters;

the range determining module is used for determining the range of the characteristic parameters needing to be selected according to the type of the machine learning model and the type of the characteristic parameters;

the training module is used for sequentially selecting characteristic parameters from the characteristic parameter set in the range of the characteristic parameters and inputting the characteristic parameters into the machine learning model for training;

and the selection module is used for acquiring the output result of the machine learning model, calculating an AUC value of the output result, and selecting a target characteristic parameter from the characteristic parameter set as training data according to the AUC value.

In a third aspect, the present application further provides a computer device, comprising:

one or more processors;

a memory;

one or more computer programs, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, the one or more computer programs being configured to perform any of the training data processing methods of the machine learning model provided in the first aspect.

In a fourth aspect, the present application further provides a computer-readable storage medium, wherein the computer-readable storage medium stores thereon a computer program, and the computer program, when executed by a processor, implements the method for processing training data of a machine learning model according to any one of the aspects provided in the first aspect.

The training data processing method, the training data processing device and the computer equipment of the machine learning model have the beneficial effects that:

With the training data processing method, apparatus and computer device described above, the computer device sets the range of the feature parameters while updating the feature set and training the machine learning model, automatically acquires the corresponding training data one by one within that range, and selects the optimal feature parameters as training data according to the AUC value of each training result. This avoids the low efficiency of the current practice, in which training samples of feature parameters are chosen from manual experience and then trained and waited on one by one, and so improves the efficiency of model training in machine learning.

Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.

Drawings

The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:

FIG. 1 is a schematic flowchart of a training data processing method of a machine learning model according to an embodiment of the present application;

FIG. 2 is a detailed flowchart illustrating a step S130 of a training data processing method of a machine learning model according to an embodiment of the present application;

FIG. 3 is a schematic flow chart illustrating a training data processing method of a machine learning model according to another embodiment of the present disclosure;

FIG. 4 is a flow chart illustrating a training data processing method for a machine learning model according to another embodiment of the present disclosure;

FIG. 5 is a schematic flow chart corresponding to the embodiment of FIG. 4;

FIG. 6 is a flowchart illustrating a method for processing training data of a machine learning model according to yet another embodiment of the present application;

FIG. 7 is a schematic flow chart corresponding to the embodiment of FIG. 6;

FIG. 8 is a block diagram of a training data processing apparatus of a machine learning model provided in an embodiment of the present application;

FIG. 9 is a schematic diagram of an internal structure of a computer device provided in an embodiment of the present application.

Detailed Description

Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present application.

It will be understood by those within the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

Referring to FIG. 1, FIG. 1 is a flowchart illustrating a training data processing method of a machine learning model according to an embodiment of the present application.

The training data processing method of the machine learning model comprises the following steps:

S110, acquiring a characteristic parameter set of training data of the machine learning model to be updated; wherein the feature parameter set comprises a plurality of candidate feature parameters;

S120, determining the range of the characteristic parameters needing to be selected according to the type of the machine learning model and the type of the characteristic parameters;

S130, sequentially selecting characteristic parameters from the characteristic parameter set in the range of the characteristic parameters, and inputting the characteristic parameters into the machine learning model for training;

S140, obtaining an output result of the machine learning model, calculating an AUC value of the output result, and selecting a target characteristic parameter from the characteristic parameter set as training data according to the AUC value.

In steps S110 to S140, during training of the machine learning model, candidate feature parameters are added to the model according to its purpose of use, requirements and test needs, and the training data of the model is updated with these candidate feature parameters. The feature parameters of the training data form feature parameter combinations. The feature parameter set includes both the original and the updated feature parameters of the training data, and provides the candidate feature parameters for training the machine learning model.

The range of the feature parameters differs with the type of the machine learning model and the type of the feature parameters to be updated. Besides the interval bounded by a minimum value and a maximum value, the range also covers the combinations of feature parameters that fall within that interval. The combination form may be discrete or continuous.

And according to the type and the combination form of the characteristic parameters, sequentially selecting the characteristic parameters from the characteristic parameter set in the range of the characteristic parameters, and inputting the characteristic parameters obtained each time into a machine learning model for training. The range of the characteristic parameter may be a value only for the characteristic parameter, or may be obtained by determining other characteristic parameters according to the value of the characteristic parameter.

The model is trained once for each selected feature parameter, giving one output result per training run. An AUC value is computed for each output result, the AUC values are compared, and the feature parameter with the best AUC value is selected from the feature parameter set as the target feature parameter and input into the updated machine learning model as training data.
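The AUC comparison described here can be sketched as follows. This is only an illustration under stated assumptions: the application does not say how the AUC is computed, so the sketch uses the standard pairwise definition (the fraction of positive/negative sample pairs ranked correctly, with ties counted as one half), and the names auc and select_target are hypothetical. "Best" is taken to mean the highest AUC, which agrees with the statement elsewhere in the description that the value closest to 1 is chosen.

```python
from typing import Dict, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels are 0 (negative) or 1 (positive); scores are the model outputs.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


def select_target(auc_by_candidate: Dict[Tuple[str, ...], float]) -> Tuple[str, ...]:
    """Return the candidate feature-parameter combination with the highest AUC."""
    return max(auc_by_candidate, key=auc_by_candidate.get)
```

The pairwise form is quadratic in the number of samples; it is used here for readability, and a rank-based computation would be preferable for large outputs.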

With this method, while the computer equipment trains the machine learning model and updates the feature parameter set, it sets the range of the feature parameters and, within that range, selects feature parameters from the feature parameter set in sequence according to their type and inputs them into the machine learning model for training, thereby obtaining the target feature parameter. This addresses the low training efficiency of existing approaches, in which training samples of feature parameters are chosen from operator experience and then trained and waited on one by one.

The step S110 may further include:

and S111, acquiring newly added or modified characteristic parameters of the training data of the machine learning model, and updating a characteristic parameter set.

During training, the feature parameter set of the original training data is updated according to the purpose of use, requirements and test needs of the machine learning model. The update either modifies feature parameters in the original feature parameter set or adds new feature parameters to it. Feature parameters are then selected in sequence from the updated set, within the range of the feature parameters, and input into the machine learning model for training.

Referring to FIG. 2, FIG. 2 is a detailed flowchart illustrating step S130 of a training data processing method of a machine learning model according to an embodiment of the present application.

For step S130, it may further include:

S131, determining, according to the granularity of the characteristic parameters, the interval between two successively obtained characteristic parameters within the range of the characteristic parameters;

and S132, sequentially acquiring each characteristic parameter within the range of the characteristic parameters according to the interval of the characteristic parameters.

In steps S131 to S132, the feature parameter is expressed as a numerical value. The granularity of the feature parameter is determined from the test requirements, and the interval between the feature parameters used in two adjacent training runs, within the range of the feature parameters, is determined from that granularity. The granularity must suit the type of the feature parameter: if the feature parameter represents a weight whose range is (0,1), the granularity is greater than 0 and less than 1; if it represents the number of feature parameters obtained for training, the granularity is a positive integer.

Following this interval, the corresponding feature parameters are obtained in sequence within the range of the feature parameters and input into the machine learning model for training.
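As a minimal sketch of steps S131 to S132, the values visited within a range at a given interval could be generated as below. The function name stepped_values and its include_ends flag are hypothetical; the application does not prescribe an implementation. The sketch assumes that bounds and interval are supplied as integers or decimal strings, and it uses exact fractions so that repeated addition of an interval such as 0.02 does not drift.

```python
from fractions import Fraction
from typing import List, Union

Number = Union[int, str]


def stepped_values(low: Number, high: Number, step: Number,
                   include_ends: bool = False) -> List[float]:
    """List the values from low to high at a fixed interval.

    include_ends=False models an open range such as (0, 1);
    include_ends=True models a closed range such as [1, 4].
    """
    lo, hi, st = Fraction(low), Fraction(high), Fraction(step)
    if st <= 0:
        raise ValueError("step must be positive")
    values = []
    current = lo if include_ends else lo + st
    while current < hi or (include_ends and current == hi):
        values.append(float(current))
        current += st
    return values


# Weight-type parameter: open range (0, 1), granularity 0.02.
weights = stepped_values("0", "1", "0.02")
# Count-type parameter: closed range [1, 4], granularity 1.
counts = stepped_values(1, 4, 1, include_ends=True)
```

Each value returned would then be used for one training run of the model.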

Referring to FIG. 3, FIG. 3 is a flowchart illustrating a training data processing method of a machine learning model according to another embodiment of the present application.

On the basis of the expansion of step S110, step S120 may further include:

S121, confirming the value-taking characteristics of the newly added or modified characteristic parameters according to the type of the machine learning model;

and S122, determining the range of the characteristic parameter to be selected according to the value characteristics of the newly added or modified characteristic parameter.

In steps S121-S122, the computer device obtains the model file of the machine learning model and thereby the type of the model. The updated feature parameters, which comprise newly added and modified feature parameters, are determined from the model type and the purpose of training. The value characteristics are then determined from the type of the newly added or modified feature parameters. They include the value range of the feature parameter and may also include the number of other feature parameters to be obtained from the training data.

And determining the range of the characteristic parameter to be selected according to the value characteristic of the newly added or modified characteristic parameter.

Referring to FIG. 4, FIG. 4 is a flowchart illustrating a training data processing method of a machine learning model according to yet another embodiment of the present application.

On this basis, step S132 may further include:

S11, when the newly added or modified characteristic parameters of the training data of the machine learning model are continuous characteristic parameters, sequentially acquiring each characteristic parameter according to the interval of the characteristic parameters in the range of the characteristic parameters;

and S12, inputting each characteristic parameter into the machine learning model for training.

In steps S11-S12, when the feature parameters added or modified in the training data of the machine learning model are continuous, each feature parameter is sequentially obtained according to the interval between two adjacent feature parameters of the feature parameter within the range of the feature parameters, and each feature parameter is input into the machine learning model for training.

In order to more clearly illustrate the execution of the steps S11-S12, a specific embodiment is described below:

Spam is detected using the LS model. The specific function of the LS model is x1·y1 + x2·y2 + … + xn·yn, and the value of the sum lies in (0, 1). For spam, the result of the function tends to 1; for non-spam, it tends to 0, so spam is detected from the computed result. Here x1, x2, …, xn are variables and y1, y2, …, yn are the sample values evaluated for different parts of the mail; for example, y1, y2, …, yn may represent the sender, subject, body, and so on. To measure the influence of each part of the mail on spam detection, a weight feature parameter is added to the model: the feature parameter is input into the model, different weights are assigned to different parts of the mail, and the result of the function is compared with the actual condition of the prediction samples.

In this example, the computer device assigns weights a, b and c to different parts of the mail, where 0 < a + b + c < 1 and each of a, b and c takes values in (0,1). Based on detection experience, the interval of the feature parameter is 0.02 for each of a, b and c, so the smallest value each can take is 0.02. The weights are combined into the feature parameter (a, b, c), of which a, b and c are sub-parameters, each a continuous feature parameter. Any combination (a, b, c) that satisfies the above conditions is allowed. Each sub-parameter starts at 0.02 and increases in steps toward 1; for example, a takes the values 0.02, 0.04, 0.06, …, and b and c take values in the same way. All combinations of a, b and c that satisfy 0 < a + b + c < 1 are obtained and input into the machine learning model for training.
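The enumeration in this example can be sketched as follows, assuming the interval of 0.02 and the condition 0 < a + b + c < 1 stated above. The function name weight_triples is hypothetical. Weights are counted in hundredths as integers so that the sum condition is checked exactly rather than with floating-point arithmetic.

```python
from itertools import product
from typing import Iterator, Tuple


def weight_triples(step_hundredths: int = 2) -> Iterator[Tuple[float, float, float]]:
    """Yield every (a, b, c) on the grid with 0 < a + b + c < 1.

    Each weight runs over step, 2*step, ... below 1, expressed in hundredths.
    """
    ticks = range(step_hundredths, 100, step_hundredths)  # 2, 4, ..., 98
    for a, b, c in product(ticks, repeat=3):
        if a + b + c < 100:  # exact integer check of a + b + c < 1
            yield (a / 100, b / 100, c / 100)


triples = list(weight_triples())
print(len(triples))   # number of weight combinations to train on
print(triples[0])     # first combination on the grid
```

Under these assumptions the grid contains 18424 combinations, which illustrates why selecting them by hand is impractical.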

Referring to FIG. 5, FIG. 5 is a flowchart corresponding to the embodiment of FIG. 4.

The process of training data processing and training for the machine learning model may include the steps of:

S51, adding characteristic parameters according to the training purpose of the model;

S52, if the added characteristic parameter is continuous, sequentially forming each characteristic parameter combination according to the value condition which is in the range of the characteristic parameter and accords with the added characteristic parameter;

S53, inputting the characteristic parameter combinations to the machine learning model one by one;

and S54, obtaining an output result of each training of the machine learning model, and calculating an AUC value corresponding to the output result.

If the modified characteristic parameters are continuous characteristic parameters, the computer equipment can continuously value the modified characteristic parameters within the range of the characteristic parameters according to instructions and input the modified characteristic parameters into the machine learning model one by one for training.

As the above embodiment shows, machine learning model training involves a large number of feature parameters, and the amount of data grows further when several sub-parameters of a feature parameter are combined. If an operator had to obtain the feature parameters one by one from experience, omissions would be likely and the result of model training would be affected. In this application, the computer equipment obtains continuous feature parameters in sequence, within the range of the feature parameters and at the given interval, and inputs them into the machine learning model for training, which improves both the efficiency and the completeness of model training.

Referring to FIG. 6, FIG. 6 is a flowchart illustrating a training data processing method of a machine learning model according to still another embodiment of the present application.

In addition, step S132 may further include:

S21, when the training of the machine learning model requires a plurality of characteristic parameters, the characteristic parameters comprise a discrete feature quantity;

and S22, sequentially acquiring combinations of the characteristic parameters corresponding to the characteristic quantities from the characteristic parameter set in the range of the characteristic parameters, and inputting the combinations into the machine learning model for training.

In steps S21-S22, the feature parameter set required for training the machine learning model includes a plurality of candidate feature parameters and a discrete feature quantity, which is the number of other feature parameters to be extracted during training. For example, suppose the feature parameters of the model include 5 candidate feature parameters in addition to the discrete feature quantity, and only some of them are extracted for each training run. The discrete feature quantity then specifies whether 1 to 3 or 2 to 5 feature parameters are extracted, and its range is [1,3] or [2,5] respectively. The values within this range are discrete integers.

On this basis, the step S22 may further include:

and S221, sequentially acquiring all combinations of the characteristic parameters from the characteristic parameter set according to the corresponding characteristic quantity, and inputting the combinations into the machine learning model one by one for training.

In step S221, combinations of feature parameters corresponding to the feature quantity are obtained in sequence from the feature parameter set according to the range of the feature parameters. Taking the range [1,3] as an example, the feature quantity may be 1, 2 or 3; that is, 1, 2 or 3 feature parameters are extracted from the feature parameter set, and the machine learning model is trained on the extracted feature parameters.
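Step S221 amounts to enumerating every subset of the candidate feature parameters whose size lies in the given range. A minimal sketch using the standard library follows; the function name feature_combinations and the placeholder feature names f1 to f5 are hypothetical and stand in for whatever candidates the model actually has.

```python
from itertools import combinations
from typing import Iterator, Sequence, Tuple


def feature_combinations(candidates: Sequence[str], min_count: int,
                         max_count: int) -> Iterator[Tuple[str, ...]]:
    """Yield every combination of candidates of size min_count..max_count."""
    for count in range(min_count, max_count + 1):
        yield from combinations(candidates, count)


# Five candidate feature parameters, discrete feature quantity in [1, 3].
candidates = ["f1", "f2", "f3", "f4", "f5"]
combos = list(feature_combinations(candidates, 1, 3))
print(len(combos))  # 5 + 10 + 10 combinations
```

Each combination yielded would be input into the model for one training run.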

In order to more clearly illustrate the execution of steps S21-S22 when the characteristic parameter includes a discrete number of characteristics, a specific embodiment is described below:

A model is used to detect the factors that influence how long viewers stay online in a live broadcast room. In this embodiment, the machine learning model may include 4 feature parameters: the layout theme of the live broadcast room, the anchor's live broadcast item, the live broadcast time period, and the type of anchor.

The discrete feature quantity is at most 4, so, following the training requirements of a general machine learning model, its range is [1,4]. During training, the number of feature parameters given by each value in this range is extracted in turn, and each resulting combination is input into the machine learning model for training. Each extraction yields a different combination.

Moreover, the feature parameter set may also include necessary feature parameters, such as the viewer's online frequency, online duration and historical preferences. In this case, the feature parameter set of the machine learning model includes both the necessary and the candidate feature parameters, and the discrete feature quantity applies only to the number of candidate feature parameters.

The feature parameter combinations formed by the extracted candidate feature parameters and the necessary feature parameters are input into the machine learning model one by one for training.

Referring to FIG. 7, FIG. 7 is a flowchart corresponding to the embodiment of FIG. 6.

S71, acquiring necessary characteristic parameters and alternative characteristic parameters according to the adjustment purpose of the model;

S72, determining the characteristic parameters of discrete characteristic quantity according to the alternative characteristic parameters;

S73, determining the minimum value and the maximum value in the characteristic parameters of the discrete characteristic quantity by referring to the empirical value, and forming the range of the characteristic parameters of the discrete characteristic quantity;

S74, according to the range, obtaining alternative characteristic parameters corresponding to the characteristic quantity in the characteristic parameter combination in sequence;

S75, combining the optional characteristic parameters extracted each time with the necessary characteristic parameters, and inputting the combined characteristic parameters into the machine learning model one by one;

S76, obtaining an output result of each training of the machine learning model, and calculating an AUC value corresponding to the output result;

and S77, selecting target characteristic parameters from the characteristic parameter set according to the AUC values to serve as training data.
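Steps S71 to S77 can be sketched as a single loop. This is an illustrative outline only, not the implementation of the application: `select_target_features` and `placeholder_train_and_score` are hypothetical names, and the placeholder scoring function returns arbitrary made-up numbers in place of a real training run followed by an AUC calculation.

```python
from itertools import combinations


def select_target_features(necessary, candidates, count_range, train_and_score):
    """Try every candidate combination whose size lies in count_range and
    return the feature combination with the highest AUC."""
    low, high = count_range                        # S73: range of the discrete feature number
    results = []
    for k in range(low, high + 1):                 # S74: step through the range
        for chosen in combinations(candidates, k):
            features = tuple(necessary) + chosen   # S75: merge with necessary features
            results.append((train_and_score(features), features))  # S76: one run, one AUC
    best_auc, best_features = max(results, key=lambda item: item[0])  # S77
    return best_features, best_auc


# S71/S72: necessary and candidate features (candidate names are placeholders).
NECESSARY = ("online_frequency", "online_duration")
CANDIDATES = ("feature_a", "feature_b", "feature_c", "feature_d")


def placeholder_train_and_score(features):
    """Stand-in for 'train the model, then compute AUC'. Values are arbitrary."""
    bonus = {"feature_b": 0.10, "feature_c": 0.05}
    return 0.70 + sum(bonus.get(name, -0.01) for name in features if name in CANDIDATES)


best_features, best_auc = select_target_features(
    NECESSARY, CANDIDATES, (1, 4), placeholder_train_and_score
)
print(best_features)
```

Because AUC lies in [0, 1], taking the maximum is equivalent to choosing the value closest to 1. In practice `train_and_score` would be replaced by a function that trains the actual model on the given features and evaluates its output.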

According to the above training data processing method of the machine learning model, the computer device limits the range of the feature number used for model training, extracts candidate feature parameters according to that range, and trains the machine learning model on them one by one. In this way, the computer device can obtain every feature parameter combination available for training within the allowable number of candidate feature parameters and input each combination into the machine learning model, so that training data can be acquired quickly and the model trained. After the AUC values of the training results are compared, the feature parameters corresponding to the AUC value closest to 1 are taken as the target feature parameters and used as the training data for the model's subsequent detection of data.
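The comparison by AUC value can be illustrated with a small, dependency-free sketch. It assumes binary labels (0 or 1) and computes AUC as the fraction of positive/negative pairs in which the positive sample receives the higher score, counting ties as one half. The function name `auc_score`, the labels, and the model outputs below are all hypothetical illustration data, not results from the application.

```python
def auc_score(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical outputs of two training runs on the same labelled samples.
labels = [0, 0, 1, 1]
outputs = {
    "combination_1": [0.1, 0.4, 0.35, 0.8],
    "combination_2": [0.2, 0.3, 0.6, 0.9],
}

auc_by_combination = {name: auc_score(labels, s) for name, s in outputs.items()}

# Target: the combination whose AUC is closest to 1.
target = min(auc_by_combination, key=lambda name: abs(1.0 - auc_by_combination[name]))
print(target, auc_by_combination[target])
```

The pairwise loop is quadratic in the number of samples; for large evaluation sets a rank-based or library implementation would be preferable.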

Referring to fig. 8, fig. 8 is a schematic structural diagram of a training data processing apparatus of a machine learning model according to an embodiment of the present application.

Based on the same inventive concept as the training data processing method of the machine learning model, the embodiment of the present application further provides a training data processing apparatus of a machine learning model, including:

an obtaining module 81, configured to obtain a feature parameter set of the updated training data of the machine learning model; wherein the feature parameter set comprises a plurality of candidate feature parameters;

a range determining module 82, configured to determine a range of the feature parameter to be selected according to the type of the machine learning model and the type of the feature parameter;

a training module 83, configured to select feature parameters from the feature parameter set in sequence within the range of the feature parameters, and to input the selected feature parameters into the machine learning model for training;

and a selecting module 84, configured to obtain an output result of the machine learning model, calculate an AUC value of the output result, and select a target feature parameter from the feature parameter set as training data according to the AUC value.

Referring to fig. 9, fig. 9 is a schematic diagram of the internal structure of a computer device according to an embodiment of the present application. As shown in fig. 9, the computer device includes a processor 91, a storage medium 92, a memory 93, and a network interface 94, which are connected by a system bus. The storage medium 92 of the computer device stores an operating system, a database, and computer-readable instructions. The database may store control information sequences, and the computer-readable instructions, when executed by the processor 91, may cause the processor 91 to implement the training data processing method of the machine learning model; the processor 91 may thereby implement the functions of the obtaining module 81, the range determining module 82, the training module 83, and the selecting module 84 of the training data processing apparatus in the embodiment shown in fig. 8. The processor 91 provides computing and control capabilities and supports the operation of the entire computer device. The memory 93 of the computer device may also store computer-readable instructions that, when executed by the processor 91, cause the processor 91 to perform the training data processing method of the machine learning model. The network interface 94 of the computer device is used to connect to and communicate with a terminal. Those skilled in the art will appreciate that the architecture shown in fig. 9 is merely a block diagram of the parts of the structure relevant to the disclosed solution and does not limit the computer devices to which the disclosed solution applies; a particular computer device may include more or fewer components than shown, may combine certain components, or may have a different arrangement of components.

In one embodiment, the present application also proposes a storage medium storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of: acquiring a feature parameter set of the updated training data of the machine learning model, wherein the feature parameter set comprises a plurality of candidate feature parameters; determining the range of the feature parameters to be selected according to the type of the machine learning model and the type of the feature parameters; within the range of the feature parameters, selecting feature parameters from the feature parameter set in sequence and inputting them into the machine learning model for training; and acquiring an output result of the machine learning model, calculating an AUC value of the output result, and selecting a target feature parameter from the feature parameter set as training data according to the AUC value.

Taking the above embodiments together, the present application has the following main beneficial effects:

According to the training data processing method of the machine learning model, when the computer device updates the feature set and trains the machine learning model, it sets the range of the feature parameters, automatically acquires the corresponding training data one by one according to that setting, and obtains the optimal feature parameters as training data according to the AUC value corresponding to the result of each training run. This solves the problem of low training efficiency that currently arises because training samples for each feature parameter must be submitted for training one by one, based on manual experience, with waiting between runs, and thereby improves the efficiency of model training in machine learning.

Further, the updated feature parameters include newly added or modified feature parameters.

The interval between two successive acquisitions of the feature parameter within the range of the feature parameters is determined according to the granularity of the feature parameter.

When the newly added or modified feature parameters of the training data of the machine learning model are continuous feature parameters, each feature parameter can be acquired in sequence within the range of the feature parameters according to the interval of the feature parameters, and the feature parameters are input into the machine learning model one by one for training. This approach suits cases where a feature parameter corresponds to a large number of training samples, and it also avoids the omissions that can occur with one-by-one manual input. While improving the efficiency of model training in machine learning, it can therefore also improve the accuracy of model training.
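Stepping through a continuous feature parameter can be sketched as below. This is a minimal example under stated assumptions: the range [0.1, 0.5] and the interval 0.1 are hypothetical values chosen for illustration, the function name `continuous_values` is not from the application, and `decimal.Decimal` is used only to avoid the accumulated rounding error of repeated floating-point addition.

```python
from decimal import Decimal


def continuous_values(low, high, interval):
    """Return low, low + interval, low + 2 * interval, ... not exceeding high."""
    start, stop, step = Decimal(str(low)), Decimal(str(high)), Decimal(str(interval))
    if step <= 0:
        raise ValueError("interval must be positive")
    values = []
    current = start
    while current <= stop:
        values.append(float(current))
        current += step
    return values


# Hypothetical range and granularity for one continuous feature parameter.
values = continuous_values(0.1, 0.5, 0.1)
print(values)
```

Each value in `values` would then be used for one training run, so a finer granularity increases the number of runs proportionally.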

When training the machine learning model requires a plurality of feature parameters, a feature parameter of the discrete feature number is added. Combinations of feature parameters corresponding to each feature number can then be acquired in sequence from the feature parameter set according to the range of the feature parameters, and the different feature parameter combinations are input into the machine learning model for training. In this way, all combinations of the feature parameters can be obtained quickly and every combination is trained, which makes the training more comprehensive and improves both training efficiency and accuracy.
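As a rough guide to how many training runs this implies, and assuming that combinations are unordered subsets of n candidate feature parameters with the discrete feature number ranging over [a, b] (the symbols n, a, b, and N are introduced here for illustration only), the number of combinations N is:

```latex
N = \sum_{k=a}^{b} \binom{n}{k},
\qquad
N\big|_{n=4,\;a=1,\;b=4}
  = \binom{4}{1} + \binom{4}{2} + \binom{4}{3} + \binom{4}{4}
  = 4 + 6 + 4 + 1 = 15
```

The second expression uses the earlier example of a discrete feature number of 4 with range [1, 4]. Since N grows quickly with n, narrowing the range [a, b] directly reduces the number of training runs.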

The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.

The above-mentioned embodiments express only several implementations of the present application, and although their description is relatively specific and detailed, it should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

The foregoing describes only some embodiments of the present application. It should be noted that those skilled in the art can make several improvements and refinements without departing from the principle of the present application, and these improvements and refinements should also be regarded as falling within the protection scope of the present application.

Claims

1. A training data processing method of a machine learning model is characterized by comprising the following steps:

2. The method of claim 1,

the step of obtaining the feature parameter set of the updated training data of the machine learning model includes:

3. The method of claim 2,

and in the range of the characteristic parameters, sequentially selecting the characteristic parameters from the characteristic parameter set and inputting the characteristic parameters into the machine learning model for training, wherein the method comprises the following steps of:

4. The method of claim 3,

the step of determining the range of the feature parameters to be selected according to the type of the machine learning model and the type of the feature parameters includes:

5. The method of claim 3,

the step of sequentially acquiring each characteristic parameter within the range of the characteristic parameter according to the interval of the characteristic parameter comprises the following steps:

6. The method of claim 4,

7. The method of claim 6,

the step of sequentially acquiring the combination of the characteristic parameters corresponding to the characteristic quantity from the characteristic parameter set in the range of the characteristic parameters and inputting the combination of the characteristic parameters into the machine learning model for training comprises the following steps:

8. A training data processing apparatus for a machine learning model, comprising:

9. A computer device, comprising:

one or more processors;

a memory;

one or more computer programs, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, the one or more computer programs being configured to perform the training data processing method of the machine learning model according to any of claims 1 to 7.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the training data processing method of the machine learning model of any one of claims 1 to 7.