CN117195682A - Construction method of hardware XGBoost model and data prediction method based on hardware XGBoost model - Google Patents


Info

Publication number
CN117195682A
Authority
CN
China
Prior art keywords
hardware
sub
decision tree
xgboost
xgboost model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310922799.9A
Other languages
Chinese (zh)
Inventor
曾琳 (Zeng Lin)
梁志远 (Liang Zhiyuan)
朱家鑫 (Zhu Jiaxin)
尚禹宏 (Shang Yuhong)
兰军 (Lan Jun)
何滇 (He Dian)
张野 (Zhang Ye)
宋子奇 (Song Ziqi)
查梦凡 (Zha Mengfan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhu Research Institute of Xidian University
Original Assignee
Wuhu Research Institute of Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhu Research Institute of Xidian University filed Critical Wuhu Research Institute of Xidian University
Priority to CN202310922799.9A priority Critical patent/CN117195682A/en
Publication of CN117195682A publication Critical patent/CN117195682A/en
Pending legal-status Critical Current

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a construction method of a hardware XGBoost model and a data prediction method based on the hardware XGBoost model. The construction method of the hardware XGBoost model comprises the following steps: determining the number of hardware sub-modules participating in decision tree construction, and the number of judgment node layers in the sub-decision tree corresponding to each hardware sub-module, according to the depth of the XGBoost decision tree to be constructed and the maximum number of judgment node layers executable by a hardware sub-module in the hardware processor; constructing the sub-decision tree in each hardware sub-module; and constructing signal transmission links between the hardware sub-modules, so that each hardware sub-module is enabled in turn and passes the feature vector on to the next, until the last hardware sub-module, thereby completing construction of the XGBoost decision tree. The sub-decision tree in the last hardware sub-module is a general tree with the maximum number of judgment node layers and including all branches and leaf nodes. By executing this method, an XGBoost model with higher computational efficiency and better generality can be obtained.

Description

Construction method of hardware XGBoost model and data prediction method based on hardware XGBoost model
Technical Field
The invention relates to the technical field of integrated circuit design, in particular to a construction method of a hardware XGBoost model and a data prediction method based on the hardware XGBoost model.
Background
XGBoost is an ensemble machine learning algorithm based on decision trees, which offers high efficiency and accuracy on large-scale data sets. However, XGBoost requires a large amount of computation, which limits its application in real-time, low-latency scenarios. To solve this problem, researchers have proposed schemes for hardware-optimized acceleration of XGBoost.
Common approaches to hardware-optimized acceleration of the XGBoost algorithm include optimizing computation with SIMD instruction sets, accelerating computation with a GPU, and so on. When a SIMD instruction set is used to optimize computation, data continuity must be ensured, i.e. the data must be arranged in memory in a certain order, which poses a challenge for irregular data structures. When a GPU is used to accelerate computation, data must be transferred from host memory to GPU device memory, which takes a certain amount of time and bandwidth; if the amount of data is small, this transfer overhead may offset the computational advantage of the GPU. In addition, GPU-accelerated computation must handle thread synchronization, data consistency and similar problems, and is complex to implement.
Therefore, how to improve the computational efficiency of the XGBoost algorithm while reducing its computing resource consumption is a problem to be solved.
Disclosure of Invention
In view of the above, the present invention provides a method for constructing an XGBoost model directly on a hardware platform, within a plurality of hardware sub-modules, so as to improve the computational efficiency and reduce the computing resource consumption of the constructed XGBoost model; it correspondingly provides a generic method of using the XGBoost model.
According to a first aspect, the present invention provides a method for constructing a hardware XGBoost model, including the following steps:
determining the number of hardware sub-modules participating in decision tree construction, and the number of judgment node layers in the sub-decision tree corresponding to each hardware sub-module, according to the depth of the XGBoost decision tree to be constructed and the maximum number of judgment node layers executable by a hardware sub-module in the hardware processor;
constructing the sub-decision tree in each hardware sub-module;
constructing signal transmission links between the hardware sub-modules, so that each hardware sub-module is enabled in turn and passes the feature vector on to the next, until the last hardware sub-module, thereby completing construction of the XGBoost decision tree; the sub-decision tree in the last hardware sub-module is a general tree with the maximum number of judgment node layers and including all branches and leaf nodes.
In an alternative embodiment, before the step of constructing the sub-decision tree within each hardware sub-module, the method further comprises the steps of:
writing a header file in the hardware processor according to all decision tree types corresponding to the maximum number of judgment node layers; the header file contains a plurality of functions, each function corresponding to one decision tree type;
the step of constructing the sub-decision tree in each hardware sub-module specifically comprises the following step:
calling the functions in the header file to construct the sub-decision tree in each hardware sub-module.
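The header-file idea can be sketched in software form as a dispatch table mapping decision-tree types to construction functions, so that each sub-decision tree is built by a single call. This is only an illustration: the function names, the data representation, and the assignment of layer counts to types are assumptions, not disclosed in the patent.

```python
# Minimal sketch of the header-file dispatch idea. All names and data
# structures here are illustrative assumptions, not the patent's API.
def build_full_2layer(thresholds):
    # A full sub-tree with 2 judgment node layers (3 judgment nodes).
    return {"layers": 2, "thresholds": thresholds[:3]}

def build_full_3layer(thresholds):
    # The "general tree": 3 judgment node layers (7 judgment nodes).
    return {"layers": 3, "thresholds": thresholds[:7]}

# One entry per decision-tree type; per the description, the real header
# file would hold 26 functions (8 of type A, 16 of type B, 1 each of C, D).
TREE_BUILDERS = {
    "C": build_full_2layer,
    "D": build_full_3layer,
}

def build_subtree(tree_type, thresholds):
    """Build the sub-decision tree for one hardware sub-module."""
    return TREE_BUILDERS[tree_type](thresholds)
```

A caller would then construct, for example, the last sub-module's general tree with `build_subtree("D", thresholds)`.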
In an alternative embodiment, the hardware processor is an FPGA processor, and each hardware sub-module is a state machine in the FPGA processor.
In an alternative embodiment, the maximum number of judgment node layers is 3, and the number of functions is 26.
According to a second aspect, the present invention also provides a data prediction method based on a hardware XGBoost model, including the following steps:
obtaining the feature vector to be predicted corresponding to the data to be predicted, and adding redundancy parameters to the feature vector to be predicted according to the general tree in the XGBoost decision tree constructed by the method of any embodiment of the first aspect, to obtain a simulation vector;
and inputting the simulation vector into the XGBoost decision tree to obtain a prediction result corresponding to the feature vector to be predicted.
In an alternative embodiment, the redundancy parameter is set according to the difference between the actual decision tree corresponding to the feature vector to be predicted and the XGBoost decision tree.
In an alternative embodiment, the data prediction method based on the hardware XGBoost model further includes the following steps:
and inputting the prediction result into a sigmoid function to obtain a prediction probability value.
The technical scheme provided by the invention has the following advantages:
1. According to the construction method of the hardware XGBoost model, the XGBoost model is constructed directly on the hardware platform and is built in a plurality of hardware sub-modules according to the resource limitations of the hardware processor, so that the performance of the hardware processor can be fully exploited and the computational efficiency of the XGBoost model improved; meanwhile, the sub-decision tree in the last hardware sub-module on the feature vector transfer link is a general tree, which provides a foundation for the general application of the XGBoost model.
2. According to the construction method of the hardware XGBoost model, the functions are written in a header file of the hardware processor, so that the sub-decision tree in each hardware sub-module can be constructed simply by calling a function in the header file; this makes the XGBoost model convenient to modify and improves the code readability and maintainability of the constructed XGBoost model.
3. According to the data prediction method based on the hardware XGBoost model, redundancy parameters are added to the feature vector to be predicted to obtain a simulation vector, so that once the simulation vector is input into the general tree, a specific decision tree is simulated on the general tree. This realizes generalized use of the XGBoost model, reduces the complexity and repetitiveness of XGBoost model design, improves algorithm design efficiency, reduces the pruning workload during construction of the XGBoost model, and lowers the computing resource consumption of the hardware algorithm.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings described below show some embodiments of the present invention, and that a person skilled in the art can derive other drawings from them without inventive effort.
FIG. 1 is a flow chart of a method for constructing a hardware XGBoost model according to embodiment 1 of the present invention;
FIG. 2 is a flowchart of another method for constructing a hardware XGBoost model according to embodiment 1 of the present invention;
FIG. 3 is a diagram showing an example of the type of sub-decision tree in the hardware sub-module provided in embodiment 1 of the invention;
FIG. 4 is a flow chart of a method of data prediction method based on the hardware XGBoost model according to embodiment 2 of the present invention;
FIG. 5A is an exemplary diagram of an actual decision tree provided by an embodiment of the present invention;
FIG. 5B is an exemplary diagram of an implementation of the actual decision tree of FIG. 5A in a general tree.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments are described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
In the description of the present invention, it should be noted that the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
Example 1
Fig. 1 shows a flowchart of a method for constructing a hardware XGBoost model in an embodiment of the present invention, specifically, as shown in fig. 1, the method may include the following steps:
S101: determining the number of hardware sub-modules participating in decision tree construction, and the number of judgment node layers in the sub-decision tree corresponding to each hardware sub-module, according to the depth of the XGBoost decision tree to be constructed and the maximum number of judgment node layers executable by a hardware sub-module in the hardware processor.
In this embodiment, taking 3 as the maximum number of judgment node layers executable by a hardware sub-module: if the depth of the XGBoost decision tree to be constructed is 8, the corresponding number of judgment node layers is 7, and the 7 judgment node layers can be split into 3 parts of 2 layers, 2 layers and 3 layers. That is, the number of hardware sub-modules in the hardware processor participating in decision tree construction is 3, and the numbers of judgment node layers in the sub-decision trees of these 3 hardware sub-modules are 2, 2 and 3 respectively.
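The decomposition in this example (depth 8, hence 7 judgment node layers, split across sub-modules of 2, 2 and 3 layers) can be sketched as follows. The function name and the even-split heuristic for the leading sub-modules are assumptions, chosen so that the sketch reproduces the example in the text; the patent does not specify the splitting rule.

```python
import math

def split_layers(tree_depth: int, max_layers: int = 3) -> list[int]:
    """Split the judgment node layers of a decision tree of depth
    `tree_depth` across hardware sub-modules, each executing at most
    `max_layers` layers. The last sub-module always gets `max_layers`
    layers so it can hold the general tree with all branches and leaves.
    """
    layers = tree_depth - 1              # a tree of depth d has d-1 judgment layers
    if layers <= max_layers:
        return [layers]
    n = math.ceil(layers / max_layers)   # number of hardware sub-modules
    head = layers - max_layers           # layers handled before the last sub-module
    q, r = divmod(head, n - 1)           # spread them as evenly as possible
    return [q + 1 if i < r else q for i in range(n - 1)] + [max_layers]
```

For example, `split_layers(8, 3)` yields `[2, 2, 3]`, matching the 3-sub-module example above.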
S102: and constructing sub-decision trees in all hardware sub-modules.
S103: a signal transmission link between each hardware sub-module is constructed, so that each hardware sub-module sequentially passes through the enabling signal transmission feature vector until the last hardware sub-module, and the construction of the XGBoost decision tree is completed; the sub-decision tree in the last hardware sub-module is a general tree with the maximum number of judgment node layers and including all the branches and leaf nodes.
In a specific implementation of this embodiment, in order to make construction of the XGBoost decision tree more convenient and to improve the code readability and maintainability of the XGBoost model, the method for constructing the hardware XGBoost model may also include the following steps:
S201: determining the number of hardware sub-modules participating in decision tree construction, and the number of judgment node layers in the sub-decision tree corresponding to each hardware sub-module, according to the depth of the XGBoost decision tree to be constructed and the maximum number of judgment node layers executable by a hardware sub-module in the hardware processor.
S202: writing a header file in the hardware processor according to all decision tree types corresponding to the maximum number of judgment node layers; the header file contains a plurality of functions, each function corresponding to one decision tree type.
S203: and calling a function in the header file to construct a sub-decision tree in each hardware sub-module.
S204: a signal transmission link between each hardware sub-module is constructed, so that each hardware sub-module sequentially passes through the enabling signal transmission feature vector until the last hardware sub-module, and the construction of the XGBoost decision tree is completed; the sub-decision tree in the last hardware sub-module is a general tree with the maximum number of judgment node layers and including all the branches and leaf nodes.
In this embodiment, taking the case where the depth of the XGBoost decision tree to be constructed is 8, the number of hardware sub-modules is 3, and the numbers of judgment node layers in the sub-decision trees are 2, 2 and 3 as an example, the sub-decision tree types in the hardware sub-modules mainly include the four types A, B, C and D shown in fig. 3 (the general tree in the last hardware sub-module is the type D tree). Type A corresponds to 8 branch graphs, type B to 16 branch graphs, type C to 1 branch graph and type D to 1 branch graph, 26 branch graphs in total, so 26 functions can be written in the header file.
In this embodiment, each judgment node in a hardware sub-module may be implemented with nested conditional judgment statements, and each sub-module may use a state machine to implement the selection of decision tree nodes and the transfer of the feature vector.
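As an illustration of the nested-conditional structure, the full 3-layer general tree (7 judgment nodes, 8 leaves) can be sketched in software form as follows; in hardware each comparison would become a conditional branch driven by the sub-module's state machine. The level-order node indexing and argument names are assumptions for the sketch.

```python
def general_tree(x, thr, leaves):
    """Evaluate a full 3-layer judgment tree with nested conditionals.

    x: feature values routed to the 7 judgment nodes (level order);
    thr: the 7 node thresholds (level order);
    leaves: the 8 leaf values, left to right.
    """
    if x[0] < thr[0]:                     # root (layer 1)
        if x[1] < thr[1]:                 # layer 2, left subtree
            return leaves[0] if x[3] < thr[3] else leaves[1]
        else:
            return leaves[2] if x[4] < thr[4] else leaves[3]
    else:
        if x[2] < thr[2]:                 # layer 2, right subtree
            return leaves[4] if x[5] < thr[5] else leaves[5]
        else:
            return leaves[6] if x[6] < thr[6] else leaves[7]
```

Each of the 26 header-file functions would hold one such fixed branching shape, with pruned branches omitted.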
In this embodiment, the hardware processor may be an FPGA processor.
In summary, in the method for constructing the hardware XGBoost model of this embodiment, the XGBoost model is constructed directly on the hardware platform and built in a plurality of hardware sub-modules according to the resource limitations of the hardware processor, so that the performance of the hardware processor can be fully exploited and the computational efficiency of the XGBoost model improved; meanwhile, by making the sub-decision tree in the last hardware sub-module on the feature vector transfer link a general tree, an XGBoost model that can be applied generically is obtained.
Example 2
FIG. 4 shows a flowchart of a data prediction method based on a hardware XGBoost model in an embodiment of the present invention, specifically, as shown in FIG. 4, the method may include the following steps:
S401: obtaining the feature vector to be predicted corresponding to the data to be predicted, and adding redundancy parameters to the feature vector to be predicted according to the general tree in the XGBoost decision tree constructed by the method of embodiment 1, to obtain a simulation vector.
S402: and inputting the simulation vector into the XGBoost decision tree to obtain a prediction result corresponding to the feature vector to be predicted.
In this embodiment, the redundancy parameter is set according to the difference between the actual decision tree corresponding to the feature vector to be predicted and the general tree in the XGBoost model.
Specifically, taking the decision tree shown in fig. 5A as the actual decision tree corresponding to the feature vector to be predicted, a condition judgment feature corresponding to the right branch of the general tree in fig. 5B needs to be added to the feature vector to be predicted to obtain the corresponding simulation vector; the judgment conditions applied to the simulation vector in the general tree are shown in fig. 5B.
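Since the figures are not reproduced here, the padding idea can only be sketched under assumptions: judgment nodes absent from the actual tree are filled with a redundant feature value that always satisfies the padded condition, so the general tree follows the same path as the actual tree. The names, the padding value and the level-order node indexing below are all illustrative, not taken from the patent.

```python
# Redundant feature value; paired with a padded threshold of 1.0 in the
# general tree, the comparison 0.0 < 1.0 always takes the left branch.
ALWAYS_LEFT = 0.0

def make_simulation_vector(features, missing_nodes, total_nodes=7):
    """Pad the feature vector of a shallower actual tree so it can be
    evaluated on the full general tree.

    features: feature values for the judgment nodes the actual tree
    really has (level order); missing_nodes: level-order indices of
    judgment nodes absent from the actual tree.
    """
    real = iter(features)
    return [ALWAYS_LEFT if i in missing_nodes else next(real)
            for i in range(total_nodes)]
```

For example, `make_simulation_vector([0.2, 0.7, 0.4], {2, 4, 5, 6})` pads nodes 2, 4, 5 and 6 with the redundant value, yielding a 7-element simulation vector.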
In an alternative implementation, on the basis of steps S401 and S402, the data prediction method based on the hardware XGBoost model in this embodiment may further include the following step:
S403: inputting the prediction result into a sigmoid function to obtain a prediction probability value.
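The sigmoid step maps the raw prediction (the accumulated leaf values) to a probability in (0, 1); a minimal sketch:

```python
import math

def sigmoid(margin: float) -> float:
    """Map a raw XGBoost prediction to a prediction probability value."""
    return 1.0 / (1.0 + math.exp(-margin))
```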
In summary, in the data prediction method based on the hardware XGBoost model of this embodiment, redundancy parameters are added to the feature vector to be predicted to obtain a simulation vector, so that once the simulation vector is input into the general tree, a specific decision tree is simulated on the general tree. This realizes generalized use of the XGBoost model, reduces the complexity and repetitiveness of XGBoost model design, improves algorithm design efficiency, reduces the pruning workload during construction of the XGBoost model, and lowers the computing resource consumption of the hardware algorithm.
It is apparent that the above examples are given by way of illustration only and are not limiting of the embodiments. Other variations or modifications in different forms may be made by those of ordinary skill in the art on the basis of the above description; it is neither necessary nor possible to enumerate all embodiments here. Obvious variations or modifications derived therefrom remain within the protection scope of the invention.

Claims (7)

1. A construction method of a hardware XGBoost model, characterized by comprising the following steps:
determining the number of hardware sub-modules participating in decision tree construction, and the number of judgment node layers in the sub-decision tree corresponding to each hardware sub-module, according to the depth of the XGBoost decision tree to be constructed and the maximum number of judgment node layers executable by the hardware sub-modules in the hardware processor;
constructing the sub-decision tree in each of the hardware sub-modules;
constructing signal transmission links among the hardware sub-modules, so that each hardware sub-module is enabled in turn and passes the feature vector on to the next, until the last hardware sub-module, to finish construction of the XGBoost decision tree; the sub-decision tree in the last hardware sub-module is a general tree with the maximum number of judgment node layers and including all branches and leaf nodes.
2. The method for constructing a hardware XGBoost model according to claim 1, further comprising, before the step of constructing a sub-decision tree within each of the hardware sub-modules, the steps of:
writing a header file in the hardware processor according to all decision tree types corresponding to the maximum number of judgment node layers; the header file contains a plurality of functions, each function corresponding to one decision tree type;
the step of constructing the sub-decision tree in each hardware sub-module specifically comprises the following steps:
and calling a function in the header file to construct a sub-decision tree in each hardware sub-module.
3. The method for constructing a hardware XGBoost model according to claim 2, wherein the hardware processor is an FPGA processor, and each of the hardware sub-modules is a state machine in the FPGA processor.
4. The method for constructing a hardware XGBoost model according to claim 3, wherein the maximum number of judgment node layers is 3, and the number of functions is 26.
5. A data prediction method based on a hardware XGBoost model, characterized by comprising the following steps:
obtaining a feature vector to be predicted corresponding to data to be predicted, and adding redundancy parameters into the feature vector to be predicted according to a general tree in the XGBoost decision tree constructed by the method of any one of claims 1-4 to obtain a simulation vector;
and inputting the simulation vector into the XGBoost decision tree to obtain a prediction result corresponding to the feature vector to be predicted.
6. The data prediction method based on the hardware XGBoost model according to claim 5, wherein the redundancy parameters are set according to the difference between the actual decision tree corresponding to the feature vector to be predicted and the XGBoost decision tree.
7. The hardware XGBoost model-based data prediction method according to claim 5 or 6, further comprising the steps of:
and inputting the prediction result into a sigmoid function to obtain a prediction probability value.
CN202310922799.9A 2023-07-24 2023-07-24 Construction method of hardware XGBoost model and data prediction method based on hardware XGBoost model Pending CN117195682A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310922799.9A CN117195682A (en) 2023-07-24 2023-07-24 Construction method of hardware XGBoost model and data prediction method based on hardware XGBoost model


Publications (1)

Publication Number Publication Date
CN117195682A true CN117195682A (en) 2023-12-08

Family

ID=88995039



Similar Documents

Publication Publication Date Title
CN112052958A (en) Method, apparatus, device and computer-readable storage medium for model training
CN116883229A (en) Pipeline parallel method to accelerate neural network training in heterogeneous GPU clusters
JP7246447B2 (en) Model training method, apparatus, electronic device, storage medium, development system and program
US20240422067A1 (en) Data synchronization method and apparatus, and device and storage medium
CN117271101B (en) Operator fusion method and device, electronic equipment and storage medium
CN115062786B (en) Quantum bit mapping and quantum gate scheduling method for quantum computer
CN116151374A (en) Distributed model reasoning method, device, equipment, storage medium and program product
CN115879529A (en) Method, medium and device for automatic parallel strategy search based on network-level simulation
CN114610648A (en) Test method, device and equipment
CN116991560A (en) Parallel scheduling method, device, equipment and storage medium for language model
US20090064120A1 (en) Method and apparatus to achieve maximum outer level parallelism of a loop
CN119046022B (en) A method, device, equipment and medium for determining a distributed parallel solution
CN114398949B (en) Training method of impulse neural network model, storage medium and computing equipment
CN118331591B (en) Method, device, storage medium and equipment for deploying intelligent algorithm on satellite
CN117829242B (en) Model processing method and related equipment
CN116303219B (en) A method, device and electronic device for obtaining grid files
CN117195682A (en) Construction method of hardware XGBoost model and data prediction method based on hardware XGBoost model
CN115016943B (en) A parallel computing method, system, device and storage medium
CN117350384A (en) Model parallel reasoning method and device, electronic equipment and storage medium
WO2022057459A1 (en) Tensorcore-based int4 data type processing method and system, device, and medium
CN116402091A (en) Hybrid engine intelligent computing method and device for artificial intelligent chip
CN110415162B (en) Adaptive Graph Partitioning Method for Heterogeneous Fusion Processors in Big Data
CN114912570A (en) Method, apparatus, device and readable medium for accelerating neural network model optimization
CN118113342B (en) A heterogeneous acceleration method, device, electronic device and storage medium
CN118626275B (en) Heterogeneous computing resource virtualization processing method, electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination