CN115618219A - Model training method and device, electronic equipment and storage medium - Google Patents
Model training method and device, electronic equipment and storage medium
- Publication number
- CN115618219A (application CN202110720159.0A)
- Authority
- CN
- China
- Prior art keywords
- parameter
- model
- leaf node
- tree model
- threshold
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The application discloses a model training method, a device, an electronic device and a storage medium, wherein the model training method comprises the following steps: under the condition that a first leaf node is obtained by branching in the process of training a tree model, determining a first parameter and a second parameter of the first leaf node; if the first parameter and the second parameter meet a first set condition, performing branch processing on the first leaf node; finishing the training of the tree model based on the first leaf node after the branch processing to obtain the trained tree model; loading the trained tree model to a memory of a terminal device; wherein the first parameter characterizes a number of samples classified to the first leaf node; the second parameter characterizes a weight parameter of the first leaf node; the first setting condition is characterized in that the first parameter is larger than a first threshold value and the second parameter is larger than a second threshold value.
Description
Technical Field
The present application relates to the field of computer technologies, and in particular, to a model training method and apparatus, an electronic device, and a storage medium.
Background
As the number of networked terminal devices increases, machine learning models are deployed to more and more terminal devices, such as drones, home service robots, and the like. However, machine learning models consume a large amount of memory while the memory space of terminal devices is limited, so some models cannot be loaded into memory for use, which restricts the application scenarios of such models.
Disclosure of Invention
In view of this, embodiments of the present application provide a model training method and apparatus, an electronic device, and a storage medium, so as to at least solve the problem in the related art that the application scenarios of models are limited.
The technical scheme of the embodiment of the application is realized as follows:
the embodiment of the application provides a model training method, which comprises the following steps:
under the condition that a first leaf node is obtained by branching in the process of training a tree model, determining a first parameter and a second parameter of the first leaf node;
if the first parameter and the second parameter meet a first set condition, performing branch processing on the first leaf node;
finishing the training of the tree model based on the first leaf node after the branch processing to obtain the trained tree model;
loading the trained tree model to a memory of the terminal equipment; wherein,
the first parameter characterizes a number of samples classified to the first leaf node; the second parameter characterizes a weight parameter of the first leaf node; the first setting condition is characterized in that the first parameter is larger than a first threshold value and the second parameter is larger than a second threshold value.
In the above scheme, the method further includes:
if the first parameters and the second parameters of all leaf nodes of the tree model do not meet the first set condition, judging whether the model parameters corresponding to the tree model meet the second set condition;
and under the condition that the model parameters corresponding to the tree model meet the second set condition, updating the first threshold value and/or the second threshold value, and training the tree model again.
In the above solution, the model parameter represents a file memory of the tree model, the second setting condition represents that the file memory of the tree model is larger than a set memory size, and the updating the first threshold and/or the second threshold includes at least one of:
increasing the first threshold by a first set value;
increasing the second threshold by a second set value.
In the above solution, the second set value is determined based on the second threshold and the number of samples of the second leaf node of each decision tree in the tree model; the second leaf node characterizes the leaf node to which the fewest samples are classified.
In the foregoing solution, the model parameter represents a detection rate of the tree model, the second setting condition represents that the detection rate of the tree model is smaller than a set threshold, and the updating the first threshold and/or the second threshold includes at least one of:
decreasing the first threshold by a third set value;
reducing the second threshold by a fourth set value.
In the foregoing solution, when the first parameter and the second parameter of the first leaf node are determined, the method includes:
determining the second parameter based on third parameters of all samples classified to the first leaf node; the third parameter characterizes a corresponding sample weight parameter.
In the above scheme, the tree model is a tree model of an XGBoost algorithm.
The embodiment of the present application further provides a model training device, including:
the first processing unit is used for determining a first parameter and a second parameter of a first leaf node under the condition that the first leaf node is obtained by branching in the process of training the tree model;
the second processing unit is used for performing branch processing on the first leaf node if the first parameter and the second parameter meet a first set condition;
the third processing unit is used for finishing the training of the tree model based on the first leaf node after the branch processing to obtain the trained tree model;
the loading unit is used for loading the trained tree model to a memory of the terminal equipment; wherein,
the first parameter characterizes a number of samples classified to the first leaf node; the second parameter characterizes a weight parameter of the first leaf node; the first setting condition is characterized in that the first parameter is larger than a first threshold value and the second parameter is larger than a second threshold value.
An embodiment of the present application further provides an electronic device, including: a processor and a memory for storing a computer program capable of running on the processor,
wherein the processor is configured to execute the steps of the model training method when running the computer program.
An embodiment of the present application further provides a storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the above model training method.
In the embodiment of the application, in the process of training the tree model, under the condition that a first leaf node is obtained through branching, a first parameter and a second parameter of the first leaf node are determined, and whether the determined first parameter and second parameter meet a first set condition is judged. If the first parameter and the second parameter meet the first set condition, branching processing is performed on the first leaf node; training of the tree model is completed based on the first leaf node after the branching processing to obtain the trained tree model, and the trained tree model is loaded to a memory of a terminal device. The first parameter represents the number of samples classified into the first leaf node, the second parameter represents a weight parameter of the first leaf node, and the first set condition represents that the first parameter is greater than a first threshold and the second parameter is greater than a second threshold. In this way, the size relationship between the first parameter and the second parameter and the corresponding thresholds is used as the condition for whether to continue branching the first leaf node, and the trained tree model is loaded to the memory of the terminal device. This reduces the memory space of the terminal device that the tree model needs to occupy, so that the memory space occupied by the trained tree model is not larger than the memory of the terminal device, and the model can be widely applied to various terminal devices and various scenarios.
Drawings
Fig. 1 is a schematic flowchart of a model training method according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a model training method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of threshold adjustment provided by an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a model training apparatus according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
As the number of networked terminal devices increases, machine learning models are deployed to more and more terminal devices, such as drones, home service robots, and the like. In specific environments the model is strictly limited by the memory size of the device; when the memory consumption of the machine learning model is large and the memory space of the terminal device is limited, the model cannot be loaded into memory for use, and the application scenarios of the model are therefore limited.

Meanwhile, in more and more application scenarios, multiple models need to be placed into the limited memory of a device to realize different types of detection functions, and models with high memory consumption are likewise restricted in such scenarios.

In the related art, the complexity of a model can be reduced by means of model distillation, weight pruning, model quantization and the like, so that the memory consumed by the model is reduced and model compression is realized. However, model distillation and weight pruning are suitable for deep models and are not suitable for tree model structures. Model quantization replaces high-precision numerical values in the model with low-precision ones to reduce the occupied memory space, but this reduces model accuracy considerably.

Based on this, in various embodiments of the present application, in the process of training a tree model, under the condition that a first leaf node is obtained through branching, a first parameter and a second parameter of the first leaf node are determined, and whether the determined first parameter and second parameter meet a first set condition is judged. If the first parameter and the second parameter meet the first set condition, branching processing is performed on the first leaf node; training of the tree model is completed based on the first leaf node after the branching processing to obtain the trained tree model, and the trained tree model is loaded to a memory of a terminal device. The first parameter represents the number of samples classified into the first leaf node, the second parameter represents a weight parameter of the first leaf node, and the first set condition represents that the first parameter is greater than a first threshold and the second parameter is greater than a second threshold. In this way, the size relationship between the first parameter and the second parameter and the corresponding thresholds is used as the condition for whether to continue branching the first leaf node, and the trained tree model is loaded to the memory of the terminal device. This reduces the memory space of the terminal device that the tree model needs to occupy, so that the memory space occupied by the trained tree model is not larger than the memory of the terminal device, and the model can be widely applied to various types of terminal devices and various scenarios.
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Fig. 1 is a schematic view of an implementation flow of the model training method provided in an embodiment of the present application, where the execution subject may be an electronic device, including a terminal. As shown in fig. 1, the model training method includes:
step 101: and under the condition that a first leaf node is obtained by branching in the process of training the tree model, determining a first parameter and a second parameter of the first leaf node.
Wherein the first parameter characterizes a number of samples classified to a first leaf node; the second parameter characterizes a weight parameter of the first leaf node.
In this embodiment, in the process of training the tree model, under the condition that a node is branched to obtain first leaf nodes, the first parameter and the second parameter corresponding to each obtained first leaf node are determined. Here, the first parameter characterizes the number of samples of the set sample set classified to the first leaf node, and the second parameter characterizes a weight parameter of the first leaf node. The number of samples of a node refers to the number of samples that fall into that node when the set sample set is predicted using the tree. The set sample set can be specified as needed, or selected from an existing database.
Step 102: and if the first parameter and the second parameter meet a first set condition, performing branch processing on the first leaf node.
Wherein the first setting condition is characterized in that the first parameter is larger than a first threshold value and the second parameter is larger than a second threshold value.
And if the first parameter and the second parameter of the first leaf node meet the first set condition, performing branching processing on the first leaf node, wherein the first leaf node becomes a parent node at the moment, and branching to obtain a new first leaf node. Here, the first setting condition indicates that the determined first parameter of the first leaf node is greater than the first threshold and the second parameter is greater than the second threshold, and the first threshold and the second threshold may be set as needed. The first threshold may characterize a sample number threshold for a leaf node and the second threshold may characterize a weight parameter threshold for the leaf node.
And when the determined first parameter and the second parameter of the first leaf node do not meet the first set condition, the first leaf node is not subjected to branch processing.
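For illustration only, a minimal sketch of this branching judgment follows (Python is assumed; the helper names leaf_weight and should_branch are hypothetical and not part of this application):

```python
import numpy as np

def leaf_weight(sample_weights):
    # Second parameter of a leaf node: the sum of the weight parameters of
    # the samples classified to that node.
    return float(np.sum(sample_weights))

def should_branch(sample_weights, first_threshold, second_threshold):
    # First parameter: the number of samples classified to the leaf node.
    n = len(sample_weights)
    w = leaf_weight(sample_weights)
    # First set condition: branch only if both parameters exceed their thresholds.
    return n > first_threshold and w > second_threshold

# A leaf with 12 samples of equal weight 1.0 keeps branching; one with 5 does not.
print(should_branch(np.ones(12), first_threshold=10, second_threshold=8.0))  # True
print(should_branch(np.ones(5), first_threshold=10, second_threshold=8.0))   # False
```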
Step 103: and finishing the training of the tree model based on the first leaf node after the branch processing to obtain the trained tree model.
And finishing the training of the tree model based on the first leaf node obtained after the branch processing to obtain the trained tree model.
Step 104: and loading the trained tree model to a memory of the terminal equipment.
Here, the terminal device may be a terminal device that is the subject of the model training method execution, or may be another terminal device than the subject of the model training method execution. Based on the model training method, the memory space occupied by the trained tree model is not larger than the memory of the terminal equipment, and the terminal equipment can store and use the trained tree model. In this way, the size relationship between the first parameter and the second parameter and the corresponding threshold is used as a condition for whether to continue the branch processing on the first leaf node, the trained tree model is loaded to the memory of the terminal device, the number of nodes of the tree model is reduced, the model compression effect of reducing the complexity and the size of the model is realized, the memory space of the terminal device, which needs to be occupied by the trained tree model, is reduced, and the memory space occupied by the trained tree model is not larger than the memory of the terminal device, so that the model can be widely applied to various types of terminal devices or various scenes.
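As a sketch of the loading step (the file path, memory budget, and function name are assumptions for illustration; the application does not prescribe a particular check), the file memory of the trained tree model can be compared against the memory of the terminal device before loading:

```python
import os

def fits_in_memory(model_path, terminal_memory_bytes):
    # The trained tree model is loaded only if its file memory does not
    # exceed the memory available on the terminal device.
    return os.path.getsize(model_path) <= terminal_memory_bytes

model_path = "trained_tree_model.json"   # hypothetical path to the trained model file
memory_budget = 4 * 1024 * 1024          # e.g. a 4 MB budget on the terminal device
if fits_in_memory(model_path, memory_budget):
    print("load the trained tree model into the terminal memory")
else:
    print("adjust the thresholds and train the tree model again")
```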
Wherein, in an embodiment, the method further comprises:
if the first parameters and the second parameters of all leaf nodes of the tree model do not meet the first set condition, judging whether the model parameters corresponding to the tree model meet the second set condition;
and under the condition that the model parameters corresponding to the tree model meet the second set condition, updating the first threshold value and/or the second threshold value, and training the tree model again.
When the first parameters and the second parameters of all leaf nodes of the tree model do not meet the first set condition, whether the model parameters corresponding to the tree model meet the second set condition or not is judged, when the model parameters corresponding to the tree model meet the second set condition, the first threshold value and/or the second threshold value are/is updated, and the tree model is trained again based on the updated first threshold value and the updated second threshold value. Here, the model parameter characterizes at least one characteristic parameter of the tree model, and the second set condition characterizes a condition for stopping training of the tree model.
In this way, by setting the second setting condition as a condition for determining whether the tree model is trained again, feedback adjustment of the generated model can be realized, so that the model can be generally applied to various terminals or various scenes.
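A minimal sketch of this feedback loop is given below; the training step is replaced by a stand-in that only returns a file size, so the example illustrates the control flow rather than the actual tree training:

```python
def train_tree_model(first_threshold, second_threshold):
    # Stand-in for the real training step: the "model" here is just its file
    # memory, which shrinks as the branching thresholds grow (illustration only).
    return {"file_bytes": int(8_000_000 / (1 + 0.1 * first_threshold + 0.05 * second_threshold))}

def train_with_feedback(first_threshold, second_threshold, memory_limit,
                        first_set_value=5, second_set_value=2.0, max_rounds=20):
    for _ in range(max_rounds):
        model = train_tree_model(first_threshold, second_threshold)
        # Second set condition: file memory of the tree model larger than the set memory size.
        if model["file_bytes"] <= memory_limit:
            return model, first_threshold, second_threshold
        # Update the first threshold and/or the second threshold, then train again.
        first_threshold += first_set_value
        second_threshold += second_set_value
    return model, first_threshold, second_threshold

print(train_with_feedback(first_threshold=10, second_threshold=8.0, memory_limit=4_000_000))
```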
In an embodiment, the model parameter represents a file memory of the tree model, the second setting condition represents that the file memory of the tree model is larger than a set memory size, and the updating the first threshold and/or the second threshold includes at least one of:
increasing the first threshold by a first set value;
increasing the second threshold by a second set value.
The second setting condition is set as the file memory of the tree model being larger than the set memory size; when the model parameters corresponding to the tree model meet the second setting condition, that is, the file memory corresponding to the tree model is larger than the set memory size, the first threshold and/or the second threshold are updated. Specifically, the updating may be to increase only the first threshold by the first set value, to increase only the second threshold by the second set value, or to increase the first threshold by the first set value and the second threshold by the second set value. Here, the first set value and the second set value may be set as needed.

By judging whether the file memory of the tree model is larger than the set memory size, it is determined whether to train the tree model again; when retraining is needed, the branching condition is adjusted until a tree model whose file memory meets the requirement is obtained. In this way, the memory space occupied by the tree model can be reduced, so that the tree model can be widely applied to various terminals and various scenarios.
In an embodiment, the second set value is determined based on the second threshold and the number of samples of the second leaf node of each decision tree in the tree model; the second leaf node characterizes the leaf node to which the fewest samples are classified.

That is, the second set value is determined based on the second threshold and the number of samples of the second leaf node of each decision tree in the tree model, where the second leaf node is the leaf node to which the fewest samples are classified.
The second set value may be determined in the following manner:

First, the sample average weight avg_w is determined by equation (1), wherein:

α_i is the number of samples of the leaf node with the fewest samples in the i-th decision tree of the tree model;

W_2 is the second threshold, i.e. the weight parameter threshold of a leaf node;

tree_num is the total number of decision trees in the tree model.

Then, the second set value W'_2 is determined by equation (2):

W'_2 = λ * (avg_w * N) - W_2    (2)

wherein:

N is the total number of samples in the set sample set;

λ is an adjustment parameter.

In this way, the second threshold is adjusted based on the sample average weight avg_w obtained from the number of samples of the leaf nodes, which can reduce the number of model training runs and save training cost.
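Assuming the sample average weight avg_w has already been estimated via equation (1), a sketch of the second set value computed from equation (2) could read as follows (variable names are illustrative):

```python
def second_set_value(avg_w, total_samples, second_threshold, adjustment=1.0):
    # Equation (2): W'_2 = lambda * (avg_w * N) - W_2
    # avg_w            : sample average weight estimated from the leaf nodes (equation (1))
    # total_samples    : N, the total number of samples in the set sample set
    # second_threshold : W_2, the current leaf node weight parameter threshold
    # adjustment       : lambda, the adjustment parameter
    return adjustment * (avg_w * total_samples) - second_threshold

# Illustrative values only, not taken from this application:
print(second_set_value(avg_w=0.02, total_samples=10_000, second_threshold=150.0))  # 50.0
```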
In an embodiment, the model parameter characterizes a detection rate of the tree model, the second set condition characterizes that the detection rate of the tree model is smaller than a set threshold, and the updating the first threshold and/or the second threshold includes at least one of:
reducing the first threshold by a third set value;
reducing the second threshold by a fourth set value.
The second setting condition is set as the black detection rate of the tree model, at a determined white false alarm rate, being smaller than the set threshold; when the model parameters corresponding to the tree model meet the second setting condition, that is, the black detection rate corresponding to the tree model is smaller than the set threshold, the first threshold and/or the second threshold are updated. Specifically, the updating may be to reduce only the first threshold by the third set value, to reduce only the second threshold by the fourth set value, or to reduce the first threshold by the third set value and the second threshold by the fourth set value. Here, the third set value and the fourth set value may be set as needed. The white false alarm rate represents the number of false alarms of the model on positive samples divided by the number of positive samples, and the black detection rate represents the number of detected negative samples divided by the number of negative samples.

By judging whether the black detection rate of the tree model is smaller than the set threshold, it is determined whether to train the tree model again; when retraining is needed, the branching condition is adjusted until the black detection rate meets the requirement. In this way, the detection rate of the model for negative samples can be improved, and the classification effect of the model is improved.
In one embodiment, in the determining the first and second parameters of the first leaf node, the method comprises:
determining the second parameter based on third parameters of all samples classified to the first leaf node; the third parameter characterizes a corresponding sample weight parameter.
When the first parameter and the second parameter of the first leaf node are determined, the second parameter of the first leaf node is determined based on the sample weight parameters corresponding to all the samples classified into the first leaf node. Here, all the sample weight parameters may be set to the same value at the time of initial training.
The second parameter w may be determined by equation (3):

w = Σ β_i    (3)

wherein β_i is the sample weight parameter corresponding to each sample classified to the leaf node.
In an embodiment, the tree model is a tree model of an XGBoost algorithm.
The XGBoost algorithm is used for the tree model to be trained. For an XGBoost tree model, the weight parameter of each sample can change during training, which improves the generalization ability of the model and yields better prediction capability.
The present application will be described in further detail with reference to the following application examples.
With reference to fig. 2, the corresponding model training method includes tree node information statistics, pruning judgment, and threshold feedback adjustment. Whether a node is split is judged through set thresholds; after the model is trained, if the compression effect is not good, the distribution of sample numbers and sample weights over the nodes of the model is evaluated, and the thresholds are adjusted according to the evaluation result to find better thresholds.
(1) And (5) counting tree node information.
The XGBoost model comprises a plurality of decision trees, and node information statistics need to be collected for each decision tree during training, including the number of samples at each node and the weight parameters of those samples.
(2) And (6) pruning judgment.
In order to ensure the effect of the model, the nodes with the least influence on the model need to be pruned; nodes with a small number of samples and a small sample weight are selected for pruning. Pruning means analyzing the tree model structure and removing leaf nodes that have side effects on, or little significance for, the classification results.
According to the counted number n of node samples and the weight parameters β_i of the samples classified to the leaf node, the weight parameter w representing the leaf node is obtained, and whether to branch is judged according to the node sample number n and the weight parameter w. Let the threshold of the number of node samples be N_1 and the node sample weight threshold be W_2:

1) when n < N_1 or w < W_2, the node does not branch;

2) when n > N_1 and w > W_2, the model branches normally.
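The rule above is conceptually close to constraining splits during gradient boosting. As a rough approximation only (not the method of this application, which uses both a sample-number threshold and a weight threshold), a comparable effect on tree size can be obtained with the xgboost library's built-in min_child_weight parameter, which requires the summed instance weight in a child to reach a minimum before a split is kept:

```python
import numpy as np
import xgboost as xgb

# Toy data standing in for the set sample set.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
dtrain = xgb.DMatrix(X, label=y)

params = {
    "objective": "binary:logistic",
    "max_depth": 6,
    # A larger min_child_weight keeps fewer splits, i.e. smaller trees,
    # playing a role loosely comparable to the W_2 threshold described above.
    "min_child_weight": 10,
    "eta": 0.3,
}
model = xgb.train(params, dtrain, num_boost_round=50)
model.save_model("pruned_tree_model.json")  # the file memory can then be checked against the device budget
```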
(3) Threshold feedback adjustment
For a new sample set, the thresholds cannot be set accurately in advance, which leads to a poor model compression effect. Here, as shown in fig. 3, through the feedback adjustment mechanism, the model is evaluated after training and the thresholds are readjusted based on the evaluation result. Here, the first version of the model represents the model obtained after the first training.

The trained model is used to make predictions on the sample set, and the number of samples in the leaf nodes of each decision tree in the tree model is counted. The sample average weight avg_w of the sample set is estimated based on the set leaf node sample weight threshold W_2; the sample average weight avg_w can reflect the number of samples in the sample library. Let the number of samples of the leaf node with the fewest samples in the i-th decision tree be α_i and the total number of decision trees be tree_num; the sample average weight avg_w can then be obtained according to equation (1).

If the compression effect of the model is insufficient and the required file memory is larger than the set memory size, the sample average weight avg_w is calculated according to the number of samples in the leaf nodes, and the second threshold is increased accordingly.

If the model is compressed too much so that its classification effect becomes poor, for example the black detection rate is lower than 90% at a determined white false alarm rate (such as 1‰), the thresholds are reduced, which increases the number of leaf nodes of the tree model. The white false alarm rate represents the number of false alarms of the model on positive samples divided by the number of positive samples, and the black detection rate represents the number of detected negative samples divided by the number of negative samples.
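For clarity, a small sketch of these two rates follows, assuming an encoding in which 1 marks a positive (white) sample and 0 marks a negative (black) sample for both labels and predictions; the encoding and function names are assumptions, not part of this application:

```python
import numpy as np

def white_false_alarm_rate(y_true, y_pred):
    # False alarms on positive (white) samples / number of positive samples.
    positives = (y_true == 1)
    return float(np.sum(positives & (y_pred == 0)) / np.sum(positives))

def black_detection_rate(y_true, y_pred):
    # Detected negative (black) samples / number of negative samples.
    negatives = (y_true == 0)
    return float(np.sum(negatives & (y_pred == 0)) / np.sum(negatives))

y_true = np.array([1, 1, 1, 0, 0, 0, 0, 1])
y_pred = np.array([1, 1, 0, 0, 0, 1, 0, 1])
print(white_false_alarm_rate(y_true, y_pred))  # 0.25
print(black_detection_rate(y_true, y_pred))    # 0.75
# At a fixed white false alarm rate, a black detection rate below 90% would
# trigger a reduction of the thresholds so the tree model keeps more leaf nodes.
```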
In order to reduce the memory consumption of the model, this application embodiment provides a method for reducing the size of the XGBoost model based on leaf node information: the number of samples and the weight parameters of the leaf nodes are counted and thresholds are set; when the number of samples and the weight parameter of a leaf node in the model are smaller than the set thresholds, the node is not branched. In this way, the model is pruned during training, which effectively reduces the number of nodes of the tree model and the memory consumed by the model; the model is compressed by more than 20% while the change in detection rate is controlled within 1%.
In order to implement the method according to the embodiment of the present application, an embodiment of the present application further provides a model training apparatus, as shown in fig. 4, the apparatus includes:
a first processing unit 401, configured to determine a first parameter and a second parameter of a first leaf node when a branch is obtained in a process of training a tree model to obtain the first leaf node;
a second processing unit 402, configured to perform branch processing on the first leaf node if the first parameter and the second parameter satisfy a first set condition;
a third processing unit 403, configured to complete training of the tree model based on the first leaf node after the branch processing, to obtain the trained tree model;
a loading unit 404, configured to load the trained tree model into a memory of a terminal device; wherein,
the first parameter characterizes a number of samples classified to a first leaf node; the second parameter represents a weight parameter of the first leaf node; the first setting condition characterizes that the first parameter is greater than a first threshold and the second parameter is greater than a second threshold.
Wherein, in one embodiment, the apparatus further comprises:
the fourth processing unit is used for judging whether the model parameters corresponding to the tree model meet second set conditions or not if the first parameters and the second parameters of all leaf nodes of the tree model do not meet the first set conditions;
and the fifth processing unit is used for updating the first threshold value and/or the second threshold value and training the tree model again when the model parameters corresponding to the tree model meet the second set condition.
In an embodiment, the model parameter represents a file memory of the tree model, the second set condition represents that the file memory of the tree model is larger than a set memory size, and the updating the first threshold and/or the second threshold includes at least one of:
increasing the first threshold by a first set value;
increasing the second threshold by a second set value.
In one embodiment, the second set value is determined based on the second threshold and the number of samples of the second leaf node of each decision tree in the tree model; the second leaf node characterizes the leaf node to which the fewest samples are classified.
In one embodiment, the model parameter characterizes a detection rate of the tree model, the second set condition characterizes the detection rate of the tree model being less than a set threshold, and the updating the first threshold and/or the second threshold includes at least one of:
decreasing the first threshold by a third set value;
reducing the second threshold by a fourth set value.
In one embodiment, the first processing unit 401 is configured to:
determining the second parameter based on third parameters of all samples classified to the first leaf node; the third parameter characterizes a corresponding sample weight parameter.
In one embodiment, the tree model is a tree model of the XGBoost algorithm.
In practical applications, the first processing unit 401, the second processing unit 402, the third processing unit 403, the loading unit 404, the fourth processing unit, and the fifth processing unit may be implemented by a processor in the model training device, such as a Central Processing Unit (CPU), a Digital Signal Processor (DSP), a Microcontroller Unit (MCU), or a Field-Programmable Gate Array (FPGA).
It should be noted that: in the model training apparatus provided in the above embodiment, only the division of the program modules is exemplified when performing model training, and in practical applications, the processing may be distributed to different program modules according to needs, that is, the internal structure of the apparatus may be divided into different program modules to complete all or part of the processing described above. In addition, the model training device and the model training method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments in detail and are not described herein again.
Based on the hardware implementation of the program module, and in order to implement the model training method according to the embodiment of the present application, an embodiment of the present application further provides an electronic device, as shown in fig. 5, where the electronic device 500 includes:
a communication interface 510 capable of performing information interaction with other devices, such as network devices and the like;
and the processor 520 is connected with the communication interface 510 to realize information interaction with other devices, and is used for executing the method provided by one or more technical solutions when running a computer program. And the computer program is stored on the memory 530.
Specifically, the processor 520 is configured to:
under the condition that a first leaf node is obtained by branching in the process of training a tree model, determining a first parameter and a second parameter of the first leaf node;
if the first parameter and the second parameter meet a first set condition, performing branch processing on the first leaf node;
finishing the training of the tree model based on the first leaf node after the branch processing to obtain the trained tree model;
loading the trained tree model to a memory of the terminal equipment; wherein,
the first parameter characterizes a number of samples classified to the first leaf node; the second parameter characterizes a weight parameter of the first leaf node; the first setting condition is characterized in that the first parameter is larger than a first threshold value and the second parameter is larger than a second threshold value.
Wherein, in one embodiment, the processor 520 is configured to:
if the first parameters and the second parameters of all leaf nodes of the tree model do not meet the first set condition, judging whether the model parameters corresponding to the tree model meet the second set condition;
and under the condition that the model parameters corresponding to the tree model meet the second set condition, updating the first threshold value and/or the second threshold value, and training the tree model again.
In an embodiment, the model parameter represents a file memory of the tree model, the second set condition represents that the file memory of the tree model is larger than a set memory size, and the updating the first threshold and/or the second threshold includes at least one of:
increasing the first threshold by a first set value;
increasing the second threshold by a second set value.
In one embodiment, the second set value is determined based on the second threshold and the number of samples of the second leaf node of each decision tree in the tree model; the second leaf node characterizes the leaf node to which the fewest samples are classified.
In one embodiment, the model parameter characterizes a detection rate of the tree model, the second set condition characterizes the detection rate of the tree model being less than a set threshold, and the updating the first threshold and/or the second threshold includes at least one of:
decreasing the first threshold by a third set value;
reducing the second threshold by a fourth set value.
In one embodiment, the processor 520 is configured to:
determining the second parameter based on third parameters of all samples classified to the first leaf node; the third parameter characterizes a corresponding sample weight parameter.
In one embodiment, the tree model is a tree model of the XGBoost algorithm.
Of course, in practice, the various components in the electronic device 500 are coupled together by a bus system 540. It is understood that the bus system 540 is used to enable communications among the components. The bus system 540 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 540 in fig. 5.
The memory 530 in the embodiments of the present application is used to store various types of data to support the operation of the electronic device 500. Examples of such data include: any computer program for operating on the electronic device 500.
It will be appreciated that the memory 530 can be either volatile memory or nonvolatile memory, and can include both volatile and nonvolatile memory. The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Ferroelectric Random Access Memory (FRAM), a flash memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory may be disk storage or tape storage. The volatile memory can be Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDRSDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM). The memory 530 described in the embodiments herein is intended to comprise, without being limited to, these and any other suitable types of memory.
The method disclosed in the embodiments of the present application may be applied to the processor 520, or implemented by the processor 520. Processor 520 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 520. The processor 520 described above may be a general purpose processor, a DSP, or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. Processor 520 may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software modules may be located in a storage medium located in the memory 530, and the processor 520 reads the program in the memory 530 and performs the steps of the aforementioned methods in conjunction with its hardware.
Optionally, when the processor 520 executes the program, the corresponding process implemented by the electronic device in each method of the embodiment of the present application is implemented, and for brevity, no further description is given here.
In an exemplary embodiment, the present application further provides a storage medium, i.e., a computer storage medium, specifically a computer readable storage medium, for example, a memory 530 storing a computer program, which is executable by a processor 520 of an electronic device to perform the steps of the foregoing method. The computer readable storage medium may be Memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash Memory, magnetic surface Memory, optical disk, or CD-ROM.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus, electronic device and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, and when executed, the program performs the steps including the method embodiments; and the aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic or optical disk, or various other media that can store program code.
Alternatively, the integrated units described above in the present application may be stored in a computer-readable storage medium if they are implemented in the form of software functional modules and sold or used as independent products. Based on such understanding, the technical solutions of the embodiments of the present application may be essentially implemented or portions thereof contributing to the prior art may be embodied in the form of a software product stored in a storage medium, and including several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic or optical disk, or various other media that can store program code.
The technical means described in the embodiments of the present application may be arbitrarily combined without conflict. Unless otherwise specified and limited, the term "coupled" is to be construed broadly, e.g., as meaning electrical connections, communications between two elements, direct connections, indirect connections through intermediary media, and the like, as well as the specific meaning of the terms as used herein.
In addition, in the examples of the present application, "first", "second", and the like are used for distinguishing similar objects, and are not necessarily used for describing a specific order or a sequential order. It should be understood that "first \ second \ third" distinct objects may be interchanged under appropriate circumstances such that the embodiments of the application described herein may be implemented in an order other than those illustrated or described herein.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Various combinations of the specific features in the embodiments described in the detailed description may be made without contradiction, for example, different embodiments may be formed by different combinations of the specific features, and in order to avoid unnecessary repetition, various possible combinations of the specific features in the present application will not be described separately.
Claims (10)
1. A method of model training, the method comprising:
under the condition that a first leaf node is obtained by branching in the process of training a tree model, determining a first parameter and a second parameter of the first leaf node;
if the first parameter and the second parameter meet a first set condition, performing branch processing on the first leaf node;
finishing the training of the tree model based on the first leaf node after the branch processing to obtain the trained tree model;
loading the trained tree model to a memory of the terminal equipment; wherein,
the first parameter characterizes a number of samples classified to the first leaf node; the second parameter characterizes a weight parameter of the first leaf node; the first setting condition is characterized in that the first parameter is larger than a first threshold value and the second parameter is larger than a second threshold value.
2. The method of claim 1, further comprising:
if the first parameters and the second parameters of all leaf nodes of the tree model do not meet the first set condition, judging whether the model parameters corresponding to the tree model meet the second set condition;
and under the condition that the model parameters corresponding to the tree model meet the second set condition, updating the first threshold value and/or the second threshold value, and training the tree model again.
3. The model training method according to claim 2, wherein the model parameter represents a file memory of the tree model, the second set condition represents that the file memory of the tree model is larger than a set memory size, and the updating the first threshold and/or the second threshold comprises at least one of:
increasing the first threshold by a first set value;
increasing the second threshold by a second set value.
4. A model training method according to claim 3, wherein the second set value is determined based on the second threshold and a sample number of the second leaf node of each decision tree in the tree model; the second leaf node characterizes the leaf node to which the fewest samples are classified.
5. The model training method according to claim 2, wherein the model parameter characterizes a detection rate of the tree model, the second set condition characterizes the detection rate of the tree model being smaller than a set threshold, and the updating the first threshold and/or the second threshold comprises at least one of:
decreasing the first threshold by a third set value;
reducing the second threshold by a fourth set value.
6. The method of model training of claim 1, wherein in said determining a first parameter and a second parameter of the first leaf node, the method comprises:
determining the second parameter based on third parameters of all samples classified to the first leaf node; the third parameter characterizes a corresponding sample weight parameter.
7. The model training method according to any one of claims 1 to 6, wherein the tree model is a tree model of an XGBoost algorithm.
8. A model training apparatus, comprising:
the first processing unit is used for determining a first parameter and a second parameter of a first leaf node under the condition that the first leaf node is obtained by branching in the process of training the tree model;
the second processing unit is used for performing branch processing on the first leaf node if the first parameter and the second parameter meet a first set condition;
the third processing unit is used for finishing the training of the tree model based on the first leaf node after the branch processing to obtain the trained tree model;
the loading unit is used for loading the trained tree model to a memory of the terminal equipment; wherein,
the first parameter characterizes a number of samples classified to the first leaf node; the second parameter characterizes a weight parameter of the first leaf node; the first setting condition is characterized in that the first parameter is larger than a first threshold value and the second parameter is larger than a second threshold value.
9. An electronic device, comprising: a processor and a memory for storing a computer program capable of running on the processor,
wherein the processor is adapted to perform the steps of the model training method of any one of claims 1 to 7 when running the computer program.
10. A storage medium on which a computer program is stored which, when being executed by a processor, carries out the steps of the model training method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110720159.0A CN115618219A (en) | 2021-06-28 | 2021-06-28 | Model training method and device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115618219A true CN115618219A (en) | 2023-01-17 |
Family
ID=84855518
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110720159.0A Pending CN115618219A (en) | 2021-06-28 | 2021-06-28 | Model training method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115618219A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116383655A (en) * | 2023-04-07 | 2023-07-04 | 北京百度网讯科技有限公司 | Sample generation method, model training method, text processing method and device |
CN116383655B (en) * | 2023-04-07 | 2024-01-05 | 北京百度网讯科技有限公司 | Sample generation method, model training method, text processing method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |