WO2020156000A1

WO2020156000A1 - Computer implemented event risk assessment method and device

Info

Publication number: WO2020156000A1
Application number: PCT/CN2019/129863
Authority: WO
Inventors: 李彬; 张可尊
Original assignee: 阿里巴巴集团控股有限公司
Priority date: 2019-02-01
Filing date: 2019-12-30
Publication date: 2020-08-06
Also published as: CN110008349B; CN110008349A; TWI723528B; TW202030685A

Abstract

A computer implemented event risk assessment method and device. In the method above, first a natural language processing model is used to extract a plurality of sample events from a content text library, which comprises identifying a first sample event and event type corresponding thereto and extracting a first event element of the first sample event according to the event type. Then, in a knowledge graph associated with the first sample event, a first associated element associated with the first event element is obtained. Next, according to the event type, the first event element and the first associated element, event features of the first sample event are determined. A GBDT model may be trained on the basis of the event features of each sample event among the plurality of sample events and a calibrated risk value of each sample event. Therefore, the trained GBDT model may be used to assess the risk value of a second event to be analyzed, and the features of the assessed risk value may also be expounded upon.

Description

Method and device for computer-executed event risk assessment

Technical field

One or more embodiments of this specification relate to the field of machine learning, and more particularly to methods and devices for assessing event risk using machine learning.

Background technique

With the development of computer technology, machine learning has been applied to various technical fields for analyzing and predicting various business data. In many application scenarios, it is necessary to analyze and predict various business events, especially the risk of various events, such as public opinion risk, security risk, etc., in order to provide early warning and assist relevant business personnel in business preparation.

Therefore, it is hoped to provide an improved scheme that can effectively evaluate the risk of an event.

Summary of the invention

One or more embodiments of this specification describe a computer-implemented event risk assessment method and device, which construct event features by expanding the elements of an event and train a GBDT model to assess the risk of the event effectively, and which can provide a corresponding feature explanation for the estimated risk value.

According to the first aspect, a computer-executed event risk assessment method is provided, including:

Using a natural language processing model, extracting multiple sample events from the content text database, the multiple sample events including a first sample event, wherein extracting the multiple sample events includes identifying the first sample event and its corresponding first event type, and extracting at least one first event element of the first sample event according to the first event type;

Acquiring at least one first associated element associated with the at least one first event element in at least one knowledge graph corresponding to at least one field associated with the first sample event;

Determining the event characteristics of the first sample event according to the first event type, the at least one first event element, and the at least one first associated element;

Training the gradient boosting decision tree GBDT model according to the event characteristics of each sample event in the multiple sample events and the calibration risk value of each sample event to obtain the trained GBDT model;

Using the trained GBDT model, the risk assessment of the second event to be analyzed is performed.

In an embodiment, at least one event element of the first sample event is extracted in the following manner:

Determine a first template corresponding to the first event type; use the first template to extract at least one first event element of the first sample event from the content text library.

In an embodiment, the at least one first event element includes at least one of the following: event time, event location, implementation subject, event object, fact type, and event level.

According to one embodiment, the related elements are obtained in the following way:

The at least one first event element is mapped to the first node in the at least one knowledge graph; the node directly connected to the first node in the at least one knowledge graph is used as the at least one associated element.

In an embodiment, the above-mentioned knowledge graph may include: enterprise knowledge graph, product knowledge graph, character knowledge graph, information knowledge graph, stock knowledge graph, fund knowledge graph, and institution knowledge graph.

According to one embodiment, after training the GBDT model, performing risk assessment on the second event to be analyzed specifically includes:

Acquiring the event type of the second event and at least one second event element;

In the at least one knowledge graph, acquiring at least one second associated element associated with the at least one second event element;

Determining the event characteristics of the second event according to the event type of the second event, the at least one second event element, and the at least one second associated element;

The event characteristics of the second event are input into the trained GBDT model, and the risk value of the second event is determined according to the model output.

Further, in one embodiment, the second event element is obtained in the following manner:

Identifying the second event and the second event type from the input text;

According to the second event type, the at least one second event element is extracted from the input text.

Alternatively, the input second event and the at least one second event element may be directly received.

In one embodiment, the trained GBDT model includes at least one decision tree, the decision tree includes branch nodes and leaf nodes, and each branch node corresponds to a feature and has a risk score and a node weight obtained by training, wherein the node weight is determined based on the respective node loss values of the branch node and of the nodes after the split, and the node loss value is determined based on the difference between the calibrated risk values of the sample events falling into the node and the risk score of the node. In this case, the risk assessment of the second event to be analyzed also includes:

Determining the decision path of the second event in the decision tree according to the event characteristics of the second event;

Determine each branch node passed by the decision path, and obtain the feature and node weight corresponding to each branch node;

For a first feature included in the event features of the second event, determining the feature weight of the first feature according to the node weight of at least one branch node corresponding to the first feature among the branch nodes, as the importance of the first feature to the risk value.

According to another embodiment, the GBDT model obtained by training includes at least one decision tree, and the decision tree includes branch nodes and leaf nodes; after obtaining such a GBDT model, performing risk assessment on the second event to be analyzed specifically includes:

Acquiring at least one second event element of the second event;

Dividing the second event in the decision tree according to the at least one second event element, and determining a subtree of the decision tree based on the node at which the division stops;

Determining a first leaf node in the subtree that meets a predetermined condition, and a conditional path from the root node to the first leaf node;

Acquiring the feature combination corresponding to the branch nodes included in the conditional path, and using the feature combination as the influence feature of the second event under the predetermined condition.

Further, in one embodiment, each leaf node in the decision tree obtains a risk score through training, and each branch node corresponds to a feature and has a risk score and a node weight obtained by training, wherein the node weight is determined based on the node loss values of the branch node and of the nodes after the split, and the node loss value is determined based on the difference between the calibrated risk values of the sample events falling into the node and the risk score of the node. Accordingly, in one embodiment, the risk assessment of the second event to be analyzed also includes one or more of the following:

Determining the first risk score corresponding to the first leaf node as the risk value of the second event under the predetermined condition;

The importance of each feature corresponding to each branch node in the feature combination is determined according to the node weight of each branch node in the conditional path.

According to a second aspect, a computer-executed event risk assessment device is provided, including:

The extraction unit is configured to use a natural language processing model to extract a plurality of sample events from the content text library, the plurality of sample events including a first sample event, wherein extracting the plurality of sample events includes identifying the first sample event and its corresponding first event type, and extracting at least one first event element of the first sample event according to the first event type;

An associating unit configured to obtain at least one first associated element associated with the at least one first event element in at least one knowledge graph corresponding to at least one field associated with the first sample event;

A feature determining unit configured to determine the event feature of the first sample event according to the first event type, the at least one first event element, and the at least one first associated element;

The training unit is configured to train the gradient boosting decision tree GBDT model according to the event characteristics of each sample event in the multiple sample events and the calibration risk value of each sample event to obtain the trained GBDT model;

The evaluation unit is configured to use the trained GBDT model to perform risk evaluation on the second event to be analyzed.

According to a third aspect, there is provided a computer-readable storage medium having a computer program stored thereon, and when the computer program is executed in a computer, the computer is caused to execute the method of the first aspect.

According to a fourth aspect, there is provided a computing device, including a memory and a processor, characterized in that executable code is stored in the memory, and when the processor executes the executable code, the method of the first aspect is implemented.

According to the method and device provided by the embodiments of this specification, more comprehensive event features are constructed by expanding the event elements in knowledge graphs of related fields. Based on the event features and the calibrated risk values of the sample events, a GBDT model including a decision tree can be trained. Using such a GBDT model, not only can a risk value be estimated for an event whose risk is unknown, but the estimated risk value can also be explained in terms of features. In this way, while quantitative prediction is achieved, the prediction result also gains clearer logical expression and interpretability.

Description of the drawings

In order to explain the technical solutions of the embodiments of the present invention more clearly, the following will briefly introduce the drawings used in the description of the embodiments. Obviously, the drawings in the following description are only some embodiments of the present invention. A person of ordinary skill in the art can obtain other drawings based on these drawings without creative work.

Figure 1 is a schematic diagram of the implementation process of an embodiment disclosed in this specification;

Figure 2 shows a flowchart of an event risk assessment method according to an embodiment;

Figure 3 shows a decision tree trained according to an embodiment;

Figure 4 shows a flow chart of performing risk assessment on a second event in an embodiment;

Figure 5 shows the division process of the second event in the decision tree in one embodiment;

Fig. 6 shows a flow of steps for feature interpretation in an embodiment;

Fig. 7 shows a flowchart of steps for evaluating a second event according to an embodiment;

Fig. 8 shows a schematic block diagram of an event evaluation device according to an embodiment.

Detailed description

The following describes the solutions provided in this specification with reference to the drawings.

As mentioned earlier, in a variety of application scenarios, various events need to be studied and their risks assessed, for example, to determine the impact and risk to network security of a user information leakage incident at a certain Internet company. Generally speaking, the analysis methods in this field of event research fall into two types: quantitative methods and qualitative methods. Quantitative methods often mine public opinion factors quantitatively and construct quantified public opinion factors based on AI algorithms; that is, the event is first factorized, and some quantitative indicators, such as the level of historical investment return within a predetermined time after the event, are then used to measure the impact and risk of the event. However, such schemes often lack a fine-grained division of event types, lose the logical context of the event, and are poorly interpretable. In addition, the impact and risk of an event depend on the granularity of the event during factorization. Often, the event definition fails to distinguish a certain key attribute of the event, which makes it difficult to discover truly meaningful factors or features.

Qualitative methods often use manual labeling to manually complete event definition and risk analysis. This process requires strong professional analysis and individual analysis of each event. Failure to systematize and automate the analysis results in low analysis efficiency. Moreover, whether the analysis result is correct depends on whether the subjective experience of the analyst can cover the key attributes of the event. In addition, the conclusions of qualitative analysis can only be judged in the positive and negative directions, and the judgment of the degree of influence cannot be quantified, and it is highly subjective.

On this basis, the embodiments of this specification provide an improved scheme for assessing event risk, which provides objective and quantitative predictive analysis while also making the prediction result more interpretable. Figure 1 is a schematic diagram of the implementation process of an embodiment disclosed in this specification. As shown in Fig. 1, according to the solution of the embodiment, sample events are first extracted, and features are constructed for the sample events. When the features of an event are constructed, not only are the elements of the event itself considered, but knowledge graphs of related fields are also used to mine associated elements, which together with the event's own elements form the event features, making the event features more comprehensive and richer. On this basis, the gradient boosting decision tree (GBDT) model is trained using the event features and calibrated risk values of multiple sample events, and a decision tree is obtained through training. In this decision tree, the path from the root node to a leaf node corresponds to a combination of features. In this way, not only can the trained GBDT model be used to evaluate the risk of the event to be analyzed, but the feature combination corresponding to the decision path in the decision tree can also be used to explain the contribution and impact of the various features on the event risk, so that the event analysis has a stronger logical context and interpretability. The implementation of the above concept is described in detail below.

Fig. 2 shows a flowchart of an event risk assessment method according to an embodiment. It can be understood that the method can be executed by any apparatus, device, platform, or device cluster with computing and processing capabilities. As shown in Figure 2, the risk assessment method includes at least the following steps:

Step 21, using a natural language processing model, extract a plurality of sample events from the content text database, the plurality of sample events including a first sample event, where extracting the plurality of sample events includes identifying the first sample event and its corresponding first event type, and extracting at least one first event element of the first sample event according to the first event type;

Step 22, in at least one knowledge graph corresponding to at least one field associated with the first sample event, acquire at least one first associated element associated with the at least one first event element;

Step 23, determine the event feature of the first sample event according to the first event type, the at least one first event element, and the at least one first associated element;

Step 24, train the gradient boosting decision tree (GBDT) model according to the event feature of each sample event in the plurality of sample events and the calibrated risk value of each sample event, to obtain the trained GBDT model;

Step 25, use the trained GBDT model to perform risk assessment on the second event to be analyzed.

It can be understood that in the above steps, steps 21-24 involve the training process of the GBDT model for event evaluation, and step 25 involves the process of prediction and evaluation using the trained model. The following describes the implementation of the above steps in conjunction with specific examples.

First, in step 21, a natural language processing model is used to extract multiple events from the content text database as sample events for model training. According to the field of the event to be analyzed, the above-mentioned content text library can include financial news, technology news, scientific research articles, and so on. It can be understood that there have been many event extraction models based on natural language processing, and these models can all be used for event extraction in step 21.

Generally, the event extraction process includes at least the following steps. First, based on natural language processing, preprocessing such as word segmentation and stop-word removal is performed on the sentences in the text to obtain a set of segmented words; optionally, entity recognition is performed on the words in the set. Then, the trigger word of the event is determined from the set. Generally, the trigger word type corresponds to the event type, so once the trigger word and its type are determined, the event type can be determined. Furthermore, in order to express the event, the argument words serving as arguments, and the role of each argument word, are also determined from the set. By extracting and determining trigger words and argument words, an event can be identified and its event type determined.
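The trigger-word step described above can be sketched as follows. This is only a minimal illustration and not the extraction model of the embodiment: the trigger lexicon `TRIGGERS`, the stop-word list `STOP_WORDS`, and the regular-expression tokenizer are hypothetical stand-ins for a trained natural language processing model, and argument roles are not assigned.

```python
import re

# Hypothetical stop words and trigger lexicon (trigger word -> event type).
STOP_WORDS = {"the", "a", "an", "of", "in", "was", "for"}
TRIGGERS = {"fraud": "product fraud", "increased": "executive holdings increase"}

def tokenize(sentence):
    """Lowercase, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def identify_event(sentence):
    """Return (trigger, event_type, argument_words), or None if no trigger is found."""
    tokens = tokenize(sentence)
    for tok in tokens:
        if tok in TRIGGERS:
            # All remaining tokens are treated as candidate argument words.
            args = [t for t in tokens if t != tok]
            return tok, TRIGGERS[tok], args
    return None

result = identify_event("XY company was reported for vaccine fraud")
```

A production system would replace the lexicon lookup with a learned trigger classifier; the sketch only shows how the trigger type fixes the event type.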

According to the embodiment of the present specification, in step 21, extracting each event also includes extracting elements of each event. Take any one of these events, referred to as the first sample event below as an example, to describe the process of extracting event elements. It should be understood that the descriptions of "first" and "second" in this article are only used to distinguish similar objects and do not have other limiting meanings.

As mentioned above, by extracting and determining trigger words and argument words from the content text database, the first sample event can be identified and the event type of the first sample event can be determined. Correspondingly, according to the event type of the first sample event, hereinafter referred to as the first event type, the event elements of the first sample event are extracted from the aforementioned content text library. Event elements can include event time, event location, implementation subject, event object, fact type, event level, and so on. According to one embodiment, the event elements to be extracted are related to event types, and different event types correspond to different event elements.

For example, in a specific example, the first sample event identified from the content text database is "XY company vaccine fraud event", and the event type corresponding to this event is "product fraud". For such event types, the event elements that need to be extracted can include implementation subjects, product categories, event levels, and so on.

In another specific example, the identified first sample event is "a certain person increasing holdings of AB company stock", and the event type corresponding to this event is "executive holdings increase". For such event types, the event elements that need to be extracted can include the event time, the person involved, the fact type, numerical elements (holding ratio), and so on.

According to an embodiment, an element template may be provided for each event type in advance, and the element template may define each element to be extracted under the corresponding event type. Optionally, the element template can also define the data format of each element. Therefore, for the first sample event, the element template corresponding to the first event type can be determined; the element template is used to extract the event element of the first sample event from the content text library.
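As a rough sketch of such element templates, the snippet below keys a set of extraction patterns by event type. The dictionary `ELEMENT_TEMPLATES`, the element names, and the regular expressions are all hypothetical; the specification does not say how a template is encoded.

```python
import re

# Hypothetical element templates: event type -> {element name: regex with group "v"}.
ELEMENT_TEMPLATES = {
    "product fraud": {
        "implementation_subject": r"(?P<v>[A-Z]{2} company)",
        "product_category": r"(?P<v>vaccine|medicine|food)",
    },
    "executive holdings increase": {
        "event_time": r"(?P<v>\d{4}-\d{2}-\d{2})",
        "holding_ratio": r"(?P<v>\d+(?:\.\d+)?)%",
    },
}

def extract_elements(event_type, text):
    """Apply the template of the given event type and return the elements found."""
    elements = {}
    for name, pattern in ELEMENT_TEMPLATES[event_type].items():
        match = re.search(pattern, text)
        if match:
            elements[name] = match.group("v")
    return elements

fraud_elements = extract_elements("product fraud", "XY company vaccine fraud event")
holding_elements = extract_elements(
    "executive holdings increase",
    "On 2019-01-15 the director raised the stake by 2.5%",
)
```

Keeping the templates in a mapping means a new event type only needs a new entry, which matches the idea that different event types correspond to different event elements.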

In this way, the first sample event and the corresponding event type are identified from the content text database, and each event element corresponding to the event type is extracted. Hereinafter, the event element of the first sample event extracted from the content text library is called the first event element.

In order to characterize the first sample event more comprehensively and abundantly, in step 22, in at least one knowledge graph corresponding to the field associated with the first sample event, the associated element associated with the first event element is obtained.

It can be understood that in the prior art, various forms of knowledge graphs have been organized for various fields or various topics. These knowledge graphs can include corporate knowledge graphs, product knowledge graphs, character knowledge graphs, information knowledge graphs, stock knowledge graphs, fund knowledge graphs, institutional knowledge graphs, etc. In step 22, at least one knowledge graph can be selected according to the field associated with the first sample event. For example, when the first sample event is a "product fraud" type of event, the available knowledge graphs of related fields include enterprise knowledge graphs, institutional knowledge graphs, product knowledge graphs, and so on. When the first sample event is an event of "executive holdings increase", the available knowledge graphs of related fields may include person knowledge graphs, corporate knowledge graphs, stock knowledge graphs, fund knowledge graphs, and so on.

In this way, after the knowledge graph corresponding to the field associated with the first sample event is determined, the event elements can be expanded in these knowledge graphs to obtain the associated elements associated with the first event element extracted in step 21.

Generally, the knowledge graph can be organized into the form of a node connection graph, which includes multiple nodes, each node corresponds to a knowledge point, and the nodes corresponding to the knowledge points with the association relationship are connected by connecting edges. Starting from a certain node, the node that can be reached through a connecting edge is called the first degree associated node of the node, and the node that can be reached through at least k connecting edges is called the k degree associated node, or the k-order neighbor node.

Based on this, in step 22, the first event element extracted in step 21 can be mapped to a node in the above-mentioned knowledge graph, which is called the first node; then, starting from the first node, the nodes in the knowledge graph that are associated with the first node are taken as the associated elements of the first sample event.

Specifically, in an embodiment, a node directly connected to the first node, that is, a first-degree associated node, can be selected as the associated element. In another embodiment, nodes associated with the first node within at most k degrees can also be selected as associated elements, where the value of k can be preset as required, for example, k=3.

For example, assume that the first sample event is a "product fraud" event, and the extracted event elements include the implementation subject (a company), the product category (medicine), and so on. For the event element "company", its first-degree associated nodes, such as "sector" and "region", can be determined in the corporate knowledge graph. For the event element "medicine", its first-degree associated nodes, such as "side effects", can be determined in the product knowledge graph. Therefore, the above associated nodes, "sector", "region", "side effects", etc., can be used as the associated elements of the first sample event.
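The expansion from an event element to its associated nodes can be sketched as a breadth-first search bounded by k edges. The toy graph `GRAPH` and its node names are hypothetical, and the adjacency-list encoding is an assumption; a real knowledge graph would typically sit in a graph store.

```python
from collections import deque

# Hypothetical toy knowledge graph as an undirected adjacency list.
GRAPH = {
    "XY company": ["pharmaceutical sector", "East region"],
    "pharmaceutical sector": ["XY company", "sector index"],
    "East region": ["XY company"],
    "sector index": ["pharmaceutical sector"],
}

def associated_nodes(graph, start, k=1):
    """Return the nodes reachable from `start` through at most k connecting edges."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, depth = queue.popleft()
        if depth == k:
            continue  # do not expand beyond the k-th degree
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                found.append(neighbor)
                queue.append((neighbor, depth + 1))
    return found

first_degree = associated_nodes(GRAPH, "XY company", k=1)
within_two = associated_nodes(GRAPH, "XY company", k=2)
```

With k=1 only the directly connected nodes are returned; raising k widens the set of associated elements at the cost of pulling in less specific knowledge points.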

In this way, through the knowledge graph of related fields, the element expression of the first sample event is expanded.

Next, in step 23, the event characteristics of the first sample event are determined according to the event type of the first sample event, the first event element extracted in step 21, and the associated element expanded in step 22.

Specifically, in an embodiment, the event feature of the first sample event may be represented by a feature vector F, F=<f1, f2, f3,..., fn>. The n features f1-fn in the feature vector F include the event type of the first sample event, the features corresponding to the first event elements extracted in step 21, and the features corresponding to the associated elements obtained in step 22. These features can be either discrete features or continuous features. In this way, a comprehensive event feature is constructed for the first sample event.
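One possible way to assemble such a feature vector is sketched below. The vocabularies `EVENT_TYPES` and `SECTORS`, the choice of four features, and the integer encoding of discrete values are illustrative assumptions, not the encoding used by the embodiment.

```python
# Hypothetical vocabularies for discrete features; unseen values map to 0.
EVENT_TYPES = {"product fraud": 1, "executive holdings increase": 2}
SECTORS = {"pharmaceutical": 1, "finance": 2}

def build_feature_vector(event_type, elements, associated):
    """Assemble F = <f1, ..., fn> from the event type, event elements and associated elements."""
    return [
        float(EVENT_TYPES.get(event_type, 0)),            # f1: event type (discrete)
        float(elements.get("event_level", 0)),            # f2: event level (discrete)
        float(elements.get("holding_ratio", 0.0)),        # f3: numerical element (continuous)
        float(SECTORS.get(associated.get("sector"), 0)),  # f4: sector from the knowledge graph
    ]

F = build_feature_vector(
    "product fraud",
    {"event_level": 3},
    {"sector": "pharmaceutical"},
)
```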

On the other hand, the calibrated risk value of the first sample event can also be obtained as the label of the sample, and the calibrated risk value is used to reflect the actual degree of influence that the first sample event had historically. In one embodiment, the calibrated risk value is determined by manual labeling, that is, the degree of influence caused by the first sample event is assessed manually, and a grade or score of the degree of influence or degree of risk is given. In another embodiment, some existing index values are used as calibrated risk values. For example, for events in the economic field, the impact of the event can be reflected by the changes in the corresponding company's stock price, and correspondingly, some stock price indicators can be used as the calibrated risk value. More specifically, for example, the cumulative stock price increase/decrease within 3 days after the occurrence of the event can be used as the calibrated risk value, or the maximum drawdown within 5 days after the event occurs can be used as the calibrated risk value.
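The two stock-price labels mentioned above can be computed as in the sketch below. It assumes that `prices[0]` is the closing price on the event day and that later entries are consecutive trading days; that convention and the sample prices are hypothetical. The second function computes the largest peak-to-trough decline, commonly called maximum drawdown (sometimes rendered as maximum retracement).

```python
def cumulative_return(prices, days=3):
    """Cumulative price change over `days` trading days after the event.

    Assumes prices[0] is the close on the event day.
    """
    return prices[days] / prices[0] - 1.0

def max_drawdown(prices, days=5):
    """Largest peak-to-trough decline within `days` trading days after the event."""
    window = prices[: days + 1]
    peak = window[0]
    worst = 0.0
    for price in window:
        peak = max(peak, price)
        worst = max(worst, (peak - price) / peak)
    return worst

closes = [100.0, 90.0, 95.0, 80.0, 88.0, 92.0]  # hypothetical closing prices
label_3d = cumulative_return(closes)   # change from day 0 to day 3
label_dd = max_drawdown(closes)        # deepest decline within 5 days
```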

In this way, the calibrated risk value of the first sample event is also obtained as the label of the sample. The event feature and label of the first sample event together constitute a training sample.

As mentioned above, the first sample event is any one of the aforementioned multiple sample events. Therefore, for each of the above-mentioned multiple sample events, the aforementioned steps 21-23 can be used to determine the event characteristics of each sample event and the calibration risk value of each sample event, so as to obtain multiple training samples.

Therefore, in step 24, the gradient boosting decision tree GBDT model is trained according to the event characteristics of each sample event mentioned above and the calibrated risk value of each sample event.

The GBDT model includes at least one decision tree, which is trained through the following process. First, according to the previous steps, the training sample set has been obtained:

$$D_1 = \{(F^{(i)}, y^{(i)})\}_{i=1}^{N}$$

where N is the number of sample events, $F^{(i)}$ is the feature vector of the i-th sample event, which is, for example, an n-dimensional vector $F = (f_1, f_2, \ldots, f_n)$, and $y^{(i)}$ is the calibrated risk value of the i-th sample event. Then, the N sample events are partitioned through the decision tree: a split feature and a feature threshold are set at each branch node of the decision tree, the corresponding feature of each sample event is compared with the feature threshold at the branch node, and the sample event is assigned to the corresponding child node. Through this process, the N sample events are finally divided among the leaf nodes. The score of each leaf node can then be obtained, that is, the average of the calibrated risk values $y^{(i)}$ of the sample events in the leaf node.

On this basis, further decision trees can be trained in the direction in which the residuals decrease. That is, after the above-mentioned decision tree is obtained, the residual $r^{(i)}$ of each sample event is obtained as the difference between the calibrated risk value of the sample event and the leaf node score of that sample event in the aforementioned decision tree, and the set

$$\{(F^{(i)}, r^{(i)})\}_{i=1}^{N}$$

is taken as a new training set, which corresponds to the same set of sample events as $D_1$. In the same way as above, a further decision tree can be obtained. In this decision tree, the N sample events are also divided among the leaf nodes, and the score of each leaf node is the mean of the residuals of the sample events in that leaf node. Similarly, multiple decision trees can be obtained sequentially, each based on the residuals of the previous decision tree. Thus, a GBDT model including multiple decision trees can be obtained.
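The residual-fitting loop described above can be illustrated with a deliberately small sketch. It assumes depth-1 trees (stumps), a squared-error split criterion, and no learning rate; none of these choices is stated in the specification, and the helper names `fit_stump`, `fit_gbdt`, and `predict_gbdt` are hypothetical.

```python
def fit_stump(X, targets):
    """Fit a depth-1 regression tree; leaf scores are the means of the targets."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            left = [t for row, t in zip(X, targets) if row[j] <= thr]
            right = [t for row, t in zip(X, targets) if row[j] > thr]
            if not left or not right:
                continue  # skip splits that leave one side empty
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
            if best is None or err < best[0]:
                best = (err, j, thr, lm, rm)
    if best is None:  # no valid split: predict the overall mean everywhere
        mean = sum(targets) / len(targets)
        return (0, float("inf"), mean, mean)
    return best[1:]

def predict_stump(stump, row):
    j, thr, left_score, right_score = stump
    return left_score if row[j] <= thr else right_score

def fit_gbdt(X, y, n_trees=3):
    """Train trees sequentially, each on the residuals left by the previous ones."""
    trees, residuals = [], list(y)
    for _ in range(n_trees):
        stump = fit_stump(X, residuals)
        trees.append(stump)
        residuals = [r - predict_stump(stump, row) for r, row in zip(residuals, X)]
    return trees

def predict_gbdt(trees, row):
    return sum(predict_stump(t, row) for t in trees)

X = [[0.0, 1.0], [0.0, 2.0], [1.0, 1.0], [1.0, 2.0]]
y = [1.0, 2.0, 3.0, 4.0]
model = fit_gbdt(X, y, n_trees=2)
```

On this toy data the first tree splits on feature 0 and the second on feature 1, after which the summed leaf scores reproduce the labels exactly. Practical GBDT libraries add shrinkage, deeper trees, and regularization.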

Fig. 3 shows a decision tree trained according to an embodiment. As shown in Figure 3, the trained decision tree includes branch nodes and leaf nodes. Each branch node is set with a split feature and a feature threshold. At a branch node, the split feature of each sample event is compared with the feature threshold, and the sample event enters the next branch node accordingly, until it is finally assigned to a leaf node. For example, the arrow from node 0 to node 1 is marked with "f1≤0.5", and the arrow from node 0 to node 2 is marked with "f1>0.5", where f1 represents feature 1 (more specifically, feature 1 is, for example, the "event type"), which is the split feature of node 0, and 0.5 is the split threshold of node 0.

It can be seen that in the decision tree obtained by training, the path from the root node to a leaf node passes through several branch nodes, each of which corresponds to a split feature. The path therefore corresponds to a feature combination, and the feature combination reflects the features on the basis of which a sample event is assigned to the corresponding leaf node.
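How a root-to-leaf path yields a feature combination can be sketched as follows. The nested-dictionary encoding of the tree, its thresholds, and its leaf scores are hypothetical and do not reproduce the tree of Figure 3.

```python
# Hypothetical tree encoded as nested dicts; "f" is a feature index into F.
TREE = {
    "f": 0, "thr": 0.5,
    "le": {"f": 1, "thr": 2.0, "le": {"score": 0.1}, "gt": {"score": 0.4}},
    "gt": {"score": 0.9},
}

def decision_path(tree, F):
    """Return (leaf score, [(feature index, comparison, threshold), ...]) for F."""
    path = []
    node = tree
    while "score" not in node:  # branch nodes have no "score" key in this encoding
        if F[node["f"]] <= node["thr"]:
            path.append((node["f"], "<=", node["thr"]))
            node = node["le"]
        else:
            path.append((node["f"], ">", node["thr"]))
            node = node["gt"]
    return node["score"], path

score, path = decision_path(TREE, [0.0, 3.0])
```

The returned list of conditions is the feature combination for that leaf; here it reads as "feature 0 at most 0.5 and feature 1 above 2.0".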

Generally, a leaf node in a decision tree will obtain a corresponding score through training. The score is, for example, the average value of the calibration risk value of each sample event in the leaf node, or the average value of the residual.

According to the embodiment of the present specification, each branch node is also assigned a certain score, and the score is determined based on the score of the leaf node covered by the branch node. For example, in one embodiment, the score of a branch node may be determined as the average value of the scores of the leaf nodes covered by the branch node.

In another embodiment, the score of the branch node is determined based on the following formula:

$$S_p = \frac{N_{c1}\, S_{c1} + N_{c2}\, S_{c2}}{N_{c1} + N_{c2}}$$

Here, S_p is the score of the branch (parent) node, S_c1 and S_c2 are the scores of its two child nodes c1 and c2, and N_c1 and N_c2 are the numbers of samples that fall into child nodes c1 and c2, respectively, during model training. That is, the score of the parent node is the weighted average of the scores of its two child nodes, and the weight of each child node is the number of samples that fall into it during model training. In this way, starting from the leaf nodes, the score of each branch node can be determined layer by layer.
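The layer-by-layer, sample-weighted scoring can be illustrated with a short recursive sketch. The nested-dictionary tree layout, the function name `assign_scores`, and the leaf scores and sample counts are assumptions made for this example only.

```python
def assign_scores(node):
    """Fill in node['score'] and node['n'] bottom-up and return them.
    Leaves already carry both values; a branch node receives the
    sample-weighted average of the scores of its two children."""
    if "children" not in node:
        return node["score"], node["n"]
    (s1, n1), (s2, n2) = (assign_scores(child) for child in node["children"])
    node["n"] = n1 + n2
    node["score"] = (n1 * s1 + n2 * s2) / (n1 + n2)
    return node["score"], node["n"]

# Hypothetical tree: one leaf and one branch node under the root.
tree = {"children": [
    {"score": 0.2, "n": 30},
    {"children": [{"score": 0.5, "n": 10}, {"score": 0.9, "n": 10}]},
]}
root_score, root_n = assign_scores(tree)
```

In this example the inner branch node receives (10·0.5 + 10·0.9)/20 = 0.7, and the root receives (30·0.2 + 20·0.7)/50 = 0.4.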

For the purpose of example, Figure 3 shows the scores of some nodes beneath the respective nodes, where the score of each branch node is the average of the scores of the leaf nodes it covers.

In this way, each branch node is also assigned a corresponding score. The above score can also be called the risk score of the node.

On this basis, a node weight can also be assigned to each branch node through the training process. For a branch node A, the node weight can be determined based on the node loss values of the nodes before and after branch node A is split. The node loss value of a node is determined based on the difference between the calibrated risk values of the sample events falling into the node and the risk score of the node.

Specifically, assume that the branch node A is split into two child nodes, L and R (L and R can be leaf nodes or branch nodes). Then, the weight of node A can be defined as:

weight of node A = (loss value of node L) + (loss value of node R) - (loss value of node A).

Here, the loss value of node L is determined based on the difference between the calibrated risk values of the sample events falling into node L and the risk score of node L. More specifically, the loss value may be the sum of the squares of the differences between the calibrated risk value of each sample and the risk score of the node. In other examples, it may instead be the root mean square of the differences. The loss values of node R and node A can be obtained in the same way, and the weight of node A then follows.

Through the above method, each branch node is given a node weight. Since each branch node also corresponds to a feature, the node weight reflects, in a certain sense, the role played by that feature in the split and, to a certain extent, the contribution of the feature to the decision path.
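As a worked illustration of the node weight, the sketch below uses the sum-of-squares loss and the combination of loss values exactly as stated in the text. The function names and all numeric values are hypothetical.

```python
def node_loss(risk_values, score):
    """Sum of squared differences between the calibrated risk values
    of the samples in a node and the node's risk score."""
    return sum((y - score) ** 2 for y in risk_values)

def node_weight(left_values, right_values, score_l, score_r, score_a):
    """Weight of branch node A as stated in the text:
    loss(L) + loss(R) - loss(A), where A holds the samples of L and R."""
    parent_values = left_values + right_values
    return (node_loss(left_values, score_l)
            + node_loss(right_values, score_r)
            - node_loss(parent_values, score_a))

# Hypothetical calibrated risk values falling into child nodes L and R.
left, right = [0.1, 0.3], [0.7, 0.9]
w = node_weight(left, right, score_l=0.2, score_r=0.8, score_a=0.5)
```

Here loss(L) = loss(R) = 0.02 and loss(A) = 0.40, so the expression evaluates to -0.36. When each score is the mean of the node's own samples, the children's combined squared loss cannot exceed the parent's, so under this sign convention it is the magnitude of the value that indicates how much the split reduces the loss.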

Based on the GBDT model obtained from the above training, the risk assessment of events with unknown results can be carried out. Moreover, due to the characteristics of the decision tree in the above GBDT model, the risk assessment results can be better explained.

The following describes the process of risk assessment using the GBDT model. That is, in step 25 of FIG. 2, the GBDT model obtained by training is used to perform risk assessment on the event to be analyzed. For clarity and simplicity of description, the event to be analyzed is called the second event.

Fig. 4 shows a flow chart of performing risk assessment on the second event in an embodiment, that is, the sub-steps of step 25 above. It can be understood that in order to evaluate the second event, the event feature of the second event must first be constructed, and the construction process of the event feature corresponds to the construction method of the event feature of the sample event in the GBDT model training phase.

Specifically, in step 251, the event type of the second event and at least one second event element are acquired.

In one embodiment, the event type and event elements of the second event may be input directly by the user. For example, a user who wants to query or evaluate the risk or impact of an event can enter a description of the second event in the query interface, such as "FF Company User Data Leakage", and select the event type "Information Leakage". Then, in the element template provided for that event type, the user enters the event elements of the event, such as the implementation subject, data category, and event level.

In another embodiment, the text describing the second event may be input to the evaluation system, and the evaluation system performs event identification and element extraction. The above-mentioned input text can be, for example, news reports such as financial information, or various articles on the Internet. The process of event recognition and element extraction is similar to the aforementioned step 21. That is, the natural language processing model is used to identify the second event and the second event type from the input text; and according to the second event type, the event element of the second event is extracted from the input text.

After the event element of the second event is obtained, in step 252, in at least one knowledge graph related to the field of the second event, the associated element associated with the event element of the second event is obtained. Specifically, in the knowledge graph, the event element of the second event may be mapped to the second node, and then the node associated with the second node may be used as the associated element. This process is similar to the aforementioned step 22 and will not be repeated here.
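The lookup in step 252 can be pictured with a small adjacency-list sketch. It assumes that an event element is mapped to a graph node by exact name match and that directly connected nodes serve as associated elements; the graph contents and the function name `associated_elements` are hypothetical.

```python
# Hypothetical enterprise knowledge graph stored as an adjacency list.
graph = {
    "FF Company": ["Industry: e-commerce", "Parent: GG Group"],
    "GG Group": ["FF Company"],
}

def associated_elements(graph, event_elements):
    """Map each event element to a node (exact name match, an assumption)
    and collect the nodes directly connected to it."""
    found = []
    for element in event_elements:
        for neighbour in graph.get(element, []):   # element not in graph: no neighbours
            if neighbour not in found and neighbour not in event_elements:
                found.append(neighbour)
    return found

assoc = associated_elements(graph, ["FF Company", "user data"])
```

An element that has no node in the graph, such as "user data" here, simply contributes no associated elements.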

Then, in step 253, the event feature of the second event, hereinafter referred to as the second event feature, is determined according to the event type, event elements, and associated elements of the second event. The second event feature can be expressed as a feature vector V. In this way, an event feature is constructed for the second event.

Next, in step 254, the event feature V of the second event is input to the GBDT model obtained by the aforementioned training, and the risk value of the second event is determined according to the model output.

As mentioned above, the GBDT model obtained by training includes at least one decision tree, and each branch node in the decision tree corresponds to a split feature and a feature threshold. After the second event feature V is input into the GBDT model, at each branch node i of the decision tree, the value in the feature vector V of the feature corresponding to the split feature of that branch node is compared with the feature threshold. According to the comparison result, the second event is passed to a node at the next level, until it reaches a leaf node.

FIG. 5 shows the division process of the second event in the decision tree in one embodiment, which is the same as the decision tree shown in FIG. 3. Specifically, assume that the split feature at node 0 is f1 "event type", and the feature threshold is 0.5; the split feature at node 2 is f3 "implementing subject", and the feature threshold is 0.6. The event feature vector V of the second event is input into the decision tree. At node 0, suppose that in the second event feature V, the feature value corresponding to the "event type" is 0.8, which is greater than the feature threshold value 0.5 of the split feature, so the second event is divided from node 0 to node 2. Next, at node 2, judge the split feature "implementation subject". Assuming that the feature value of the feature "implementing subject" in the second event feature vector V is 0.2, which is smaller than the feature threshold of the split feature 0.6, the second event is then divided into node 5. This continues until the second event is divided into the leaf node 16.
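The division process in this example can be traced with a minimal sketch. Only the splits at nodes 0 and 2, the node numbers on the path, and the score of leaf node 16 come from the text; the splits at nodes 5 and 11 (features `f4` and `f5` and their thresholds) are hypothetical placeholders added so that the walk can reach leaf node 16.

```python
# node id -> (split feature, threshold, child if value <= threshold, child if value > threshold)
BRANCHES = {
    0: ("event_type", 0.5, 1, 2),
    2: ("implementing_subject", 0.6, 5, 6),
    5: ("f4", 0.5, 11, 12),      # hypothetical split
    11: ("f5", 0.5, 15, 16),     # hypothetical split
}
LEAF_SCORES = {16: 0.062}        # score of leaf node 16 as quoted for FIG. 5

def decision_path(features, node=0):
    """Follow the splits from the root until a node without a split is reached."""
    path = [node]
    while node in BRANCHES:
        name, threshold, low, high = BRANCHES[node]
        node = low if features[name] <= threshold else high
        path.append(node)
    return path

# Feature vector V; the values for f4 and f5 are hypothetical.
V = {"event_type": 0.8, "implementing_subject": 0.2, "f4": 0.3, "f5": 0.9}
path = decision_path(V)
risk_value = LEAF_SCORES[path[-1]]
```

With these values the event passes through nodes 0, 2, 5, and 11 and lands in leaf node 16.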

As mentioned above, each leaf node obtains a corresponding score through training, so the GBDT model can output the score of the leaf node to which the second event is assigned. In step 254, the leaf node score output by the model can therefore be used as the risk value of the second event. For example, the score 0.062 of leaf node 16 in FIG. 5 can be used as the risk value of the second event. When the GBDT model includes multiple decision trees, the second event is assigned to a leaf node in each decision tree. In this case, the GBDT model determines the score of the leaf node in which the second event lands in each decision tree and outputs the sum of these scores, that is, the total score. The total score output by the GBDT model can then be used as the risk value of the second event.
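For the multi-tree case, the total score can be sketched as follows. The nested-dictionary tree layout (in which only leaves carry a `score` key), the function names, and all scores and thresholds are assumptions for illustration.

```python
def leaf_score(tree, features):
    """Descend one tree and return the score of the leaf that is reached.
    In this sketch only leaf nodes carry a 'score' key."""
    while "score" not in tree:
        side = "low" if features[tree["feature"]] <= tree["threshold"] else "high"
        tree = tree[side]
    return tree["score"]

def gbdt_risk_value(trees, features):
    """Total score: the sum of the leaf scores reached in every tree."""
    return sum(leaf_score(t, features) for t in trees)

# Two hypothetical single-split trees.
trees = [
    {"feature": "event_type", "threshold": 0.5,
     "low": {"score": 0.10}, "high": {"score": 0.40}},
    {"feature": "implementing_subject", "threshold": 0.6,
     "low": {"score": 0.05}, "high": {"score": -0.02}},
]
risk = gbdt_risk_value(trees, {"event_type": 0.8, "implementing_subject": 0.2})
```

The event reaches the 0.40 leaf in the first tree and the 0.05 leaf in the second, giving a total of 0.45.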

Above, by inputting the event characteristics of the second event into the trained GBDT model, the risk value of the second event can be determined according to the model output, so as to perform a quantitative risk assessment of the second event.

In addition, in one embodiment, performing risk assessment on the second event in step 25 may also include, after the risk value of the second event is given in step 254, performing feature interpretation of the risk value of the second event.

Fig. 6 shows a flow of steps for feature interpretation in an embodiment. As shown in FIG. 6, in step 61, the decision path of the second event in the decision tree is determined according to the event characteristics of the second event. As mentioned above, in order to give the risk value of the second event, at each branch node of the decision tree, the second event is divided into child nodes according to the characteristic value of the corresponding feature of the second event until the leaf node is reached. In this way, the path taken from the root node to the leaf node to which the second event is divided in the decision tree is the decision path.

For example, as shown in FIG. 5, the second event is finally divided into leaf nodes 16, and the path from root node 0 through node 2, node 5, node 11 to node 16 is the decision path of the second event.

It can be understood that when the GBDT model contains multiple decision trees, the corresponding decision path can be determined in each decision tree.

Next, in step 62, each branch node through which the decision path passes is determined, and the characteristics and node weights corresponding to each branch node are obtained.

It can be understood that the starting point of the decision path is the root node of the decision tree, and the ending point is the leaf node to which the second event is divided, and nodes other than the leaf nodes can be used as branch nodes. In this way, each branch node included in the decision path can be determined. In the case where the decision path is multiple paths, each branch node included in the multiple paths is determined.

As mentioned above, according to the embodiment of this specification, each branch node in the decision tree is given a certain node weight. In this way, the node weight of each branch node in the decision path can be determined.

Then, in step 63, for a given feature included in the event feature of the second event, referred to as the first feature, the feature weight of the first feature is determined according to the node weight of at least one branch node that corresponds to the first feature among the above-mentioned branch nodes. This feature weight is used as the importance of the first feature to the risk value.

It should be understood that each branch node in a decision tree corresponds to one feature, but a feature can appear in multiple branch nodes of multiple decision trees, or even in multiple branch nodes of the same decision tree. Therefore, for the above-mentioned first feature, at least one branch node corresponding to the first feature can be determined from the branch nodes included in the decision path, the node weight of the at least one branch node can be obtained, and the feature weight of the first feature can be determined accordingly. Specifically, in an example, the feature weight of the first feature may be the average of the node weights of the at least one branch node corresponding to the first feature. The feature weight obtained in this way reflects the contribution or importance of the first feature to the risk value of the second event. Correspondingly, the feature weight of each feature in the event feature of the second event can be obtained as its contribution or importance to the risk value of the second event.

In one embodiment, the corresponding features can be ranked according to the ranking of the feature weights of each feature, thereby indicating the importance ranking of the features that affect the risk value of the second event.
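The averaging and ranking of feature weights can be sketched as below. The input is assumed to be a list of (split feature, node weight) pairs for the branch nodes on the decision path or paths; the function name and the weights are hypothetical, and the sketch assumes that a larger weight indicates a larger contribution.

```python
from collections import defaultdict
from statistics import mean

def feature_importance(path_nodes):
    """path_nodes: (split feature, node weight) for every branch node on the
    decision path(s). Returns features ranked by the mean node weight of
    the branch nodes that split on them, largest first."""
    by_feature = defaultdict(list)
    for feature, weight in path_nodes:
        by_feature[feature].append(weight)
    weights = {f: mean(ws) for f, ws in by_feature.items()}
    return sorted(weights.items(), key=lambda item: item[1], reverse=True)

# Hypothetical node weights gathered from the decision paths in two trees.
path_nodes = [
    ("penalty type", 0.9),
    ("fact type", 0.5),
    ("penalty type", 0.7),
    ("stock performance", 0.3),
]
ranking = feature_importance(path_nodes)
```

Here "penalty type" appears at two branch nodes, so its feature weight is the average of 0.9 and 0.7, which places it first in the ranking.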

For example, in a specific example, the second event is "listed company historical financial fraud." Through the method of the above embodiment, it can be concluded that the characteristics that have an impact on the risk value of the event are in order of importance: "penalty type", "fact type", "stock performance" and "penalty organization".

In short, in each decision tree included in the GBDT model, the second event is assigned to a leaf node via the decision path, and the risk value of the second event is determined by the score of that leaf node. In addition, the decision path passes through multiple branch nodes, each of which corresponds to a feature, so the decision path corresponds to the combination of the split features of the branch nodes it passes. Through the node weight of each branch node, the contribution or importance of the corresponding feature to the final risk value can be measured, that is, a feature interpretation of the risk value can be performed. In the above process, therefore, the GBDT model not only determines the risk value of the second event but also supports a feature interpretation of that value, indicating how large a role each feature played in arriving at the risk value.

The above describes how, for the second event to be evaluated, the event elements are expanded through the knowledge graph to obtain comprehensive event features, and how these event features are input into the trained GBDT model to obtain the risk value of the second event. On this basis, the parameters in the GBDT model can also be used to provide a feature interpretation of the obtained risk value. This evaluation process is applicable where the elements of the second event can be obtained and the event features can therefore be constructed.

According to an implementation manner, the GBDT model obtained through the above training can also be applied to conditional prediction for events whose complete event features cannot be obtained. That is, when only a few elements of the event are available, the model is used to assess the different risk trends of the event under different conditions or circumstances.

For example, suppose one wants to evaluate the possible impact of the event "vaccine fraud of a company". Assume that only the event type, "product fraud", and the implementation subject, a certain company, can be obtained, and that other elements are difficult to obtain. In this case, the trained GBDT model can still be used to assess the risk trend of the event in different situations, for example, under which conditions the event will have a great impact on public opinion risk, and under which conditions its impact will be minimal. The evaluation process for such a second event is described below.

Fig. 7 shows a flowchart of steps for evaluating a second event according to an embodiment.

As shown in Fig. 7, first, in step 71, at least one event element of the second event is acquired. As mentioned above, this process is suited to the case where the elements of the second event are incomplete. The event elements obtained in step 71 may therefore be few and incomplete, for example, only the implementation subject, or even only the event type. For the above-mentioned "vaccine fraud of a certain company" event, for instance, it is assumed that only the event type, "product fraud", and the implementation subject, a certain company, can be obtained.

Next, in step 72, the second event is divided in the decision tree according to the at least one event element, and the subtree of the decision tree is determined based on the divided stop nodes.

It can be understood that, because the event elements and hence the event features are incomplete, a complete decision path from the root node to a leaf node often cannot be obtained in the decision tree. In this case, the second event is divided in the decision tree according to the elements that have been obtained, the stop node at which division can no longer proceed is determined, and a subtree of the decision tree is determined based on the stop node. The subtree is the node area covered by the stop node.

This is described with reference to the schematic decision tree in FIG. 3. First, at node 0, the split feature "event type" is evaluated. Assuming that the feature value corresponding to the event type of the second event "vaccine fraud of a company" is 0.3, which is less than the feature threshold 0.5, the second event is assigned to node 1. The split feature at node 1 is f2 "penalty type". However, as described above, because the elements of the second event are incomplete, this feature cannot be obtained, so the second event cannot be divided further, and node 1 is the stop node. The node area covered by node 1 is the aforementioned subtree.
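Finding the stop node and the subtree it covers can be sketched as follows. The node numbering follows the example in the text; the threshold at node 1 and the splits at nodes 3 and 4 (features `f_a` and `f_b`) are hypothetical, and node 2 is left unexpanded because the example never visits it.

```python
# node id -> (split feature, threshold, child if value <= threshold, child if value > threshold)
BRANCHES = {
    0: ("event_type", 0.5, 1, 2),
    1: ("penalty_type", 0.5, 3, 4),   # threshold is a placeholder
    3: ("f_a", 0.5, 7, 8),            # hypothetical split
    4: ("f_b", 0.5, 9, 10),           # hypothetical split
}

def stop_node(known, node=0):
    """Descend while the split feature is among the known features;
    return the node at which division has to stop."""
    while node in BRANCHES:
        name, threshold, low, high = BRANCHES[node]
        if name not in known:
            break                      # this split cannot be evaluated
        node = low if known[name] <= threshold else high
    return node

def subtree_leaves(node):
    """Leaf nodes in the area covered by `node`."""
    if node not in BRANCHES:
        return [node]
    _, _, low, high = BRANCHES[node]
    return subtree_leaves(low) + subtree_leaves(high)

stop = stop_node({"event_type": 0.3})   # only the event type is known
leaves = subtree_leaves(stop)
```

With only the event type available, division stops at node 1, and the subtree under node 1 contains leaf nodes 7, 8, 9, and 10.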

Then, in step 73, the first leaf node in the aforementioned subtree that meets the predetermined condition and the conditional path from the root node to the first leaf node are determined.

The above-mentioned predetermined conditions can be set according to evaluation needs, for example, the risk is the largest, the risk is the smallest, the risk value meets a certain threshold, and so on.

If the predetermined condition is that the risk is the greatest, then the leaf node with the largest score is selected from the leaf nodes included in the subtree as the first leaf node. The path from the root node to the leaf node is the above conditional path.

Continuing the above example with reference to Figure 3, the stop node is node 1, and the determined subtree contains leaf nodes 7, 8, 9, and 10. Assuming that node 8 has the largest score, node 8 is determined as the leaf node satisfying the maximum-risk condition, and the path from node 0 to node 8, that is, the path containing nodes 0, 1, 3, and 8, is used as the above conditional path.

In the case of other predetermined conditions, the corresponding leaf node is selected as the first leaf node according to the score of each leaf node.

Next, in step 74, the feature combination corresponding to the branch nodes included in the conditional path is obtained, and the feature combination is used as the influence feature of the second event under the predetermined condition.

It can be understood that the conditional path corresponds to the division path that the second event would follow if the predetermined condition were to occur. Therefore, the feature combination corresponding to the branch nodes included in the path consists of the features that influence the second event so that it meets the predetermined condition. For example, if the predetermined condition is the maximum risk, the feature combination corresponding to the conditional path consists of the impact features that lead the second event to the maximum risk. In this way, conditional prediction and interpretation of the second event are performed, and the different impact features under different conditions are given to help predict the subsequent trend of the event.
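Steps 73 and 74 can be sketched together: select the leaf in the subtree that satisfies the predetermined condition, walk back to the root to obtain the conditional path, and read off the split features along it. The leaf scores, the features `f_a` and `f_b`, and the function names are hypothetical; thresholds are omitted because they are not needed for this step.

```python
# node id -> (split feature, low child, high child)
BRANCHES = {0: ("event_type", 1, 2), 1: ("penalty_type", 3, 4),
            3: ("f_a", 7, 8), 4: ("f_b", 9, 10)}        # f_a, f_b are hypothetical
LEAF_SCORES = {7: 0.10, 8: 0.35, 9: 0.20, 10: 0.05}     # hypothetical leaf scores
PARENT = {child: parent
          for parent, (_, low, high) in BRANCHES.items()
          for child in (low, high)}

def leaves_under(node):
    """Leaf nodes in the subtree covered by `node`."""
    if node not in BRANCHES:
        return [node]
    _, low, high = BRANCHES[node]
    return leaves_under(low) + leaves_under(high)

def conditional_path(stop, choose=max):
    """Pick the leaf under `stop` with the largest (choose=max) or smallest
    (choose=min) score and return the path from the root to that leaf."""
    leaf = choose(leaves_under(stop), key=LEAF_SCORES.get)
    path = [leaf]
    while path[0] in PARENT:
        path.insert(0, PARENT[path[0]])
    return path

path = conditional_path(stop=1)                          # maximum-risk condition
feature_combination = [BRANCHES[n][0] for n in path if n in BRANCHES]
```

With the assumed scores, the maximum-risk leaf is node 8, the conditional path is 0, 1, 3, 8, and the feature combination is the split features of nodes 0, 1, and 3. Passing `choose=min` selects the minimum-risk leaf instead.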

Further, according to an implementation manner, the following information can also be provided as an assessment of the second event. For example, in one embodiment, the score of the above-mentioned first leaf node may be provided as the risk value of the second event under predetermined conditions. For example, in the case where the predetermined condition is the maximum risk, the score of node 8 may be provided as the possible maximum risk value of the second event.

In an embodiment, the importance of each feature in the above-mentioned feature combination may be determined according to the node weight of the branch node in the above-mentioned conditional path. This process is similar to the aforementioned step 63.

Through the above method, a second event with few elements and incomplete features can be evaluated, and the feature conditions that the second event would satisfy under different risk outcomes are given, so that the characteristics of the GBDT model are better used to explain and predict the future risk of the event.

According to another embodiment, a device for event risk assessment is provided. The device can be deployed in any device, platform or device cluster with computing and processing capabilities. Fig. 8 shows a schematic block diagram of an event evaluation device according to an embodiment. As shown in Figure 8, the evaluation device 800 includes:

The extraction unit 81 is configured to use a natural language processing model to extract a plurality of sample events from the content text library, the plurality of sample events including a first sample event, wherein extracting the plurality of sample events includes identifying the first sample event and its corresponding first event type, and extracting at least one first event element of the first sample event according to the first event type;

The associating unit 82 is configured to obtain at least one first associated element associated with the at least one first event element in at least one knowledge graph corresponding to at least one field associated with the first sample event;

The determining unit 83 is configured to determine the event feature of the first sample event according to the first event type, the at least one first event element, and the at least one first associated element;

The training unit 84 is configured to train the gradient boosting decision tree GBDT model according to the event characteristics of each sample event in the multiple sample events and the calibration risk value of each sample event to obtain the trained GBDT model;

The evaluation unit 85 is configured to use the trained GBDT model to perform risk evaluation on the second event to be analyzed.

In an embodiment, the extraction unit 81 is specifically configured to: determine a first template corresponding to the first event type; and use the first template to extract at least one first event element of the first sample event from the content text library.

According to one embodiment, the aforementioned first event element includes at least one of the following: event time, event location, implementation subject, event object, fact type, and event level.

In an embodiment, the associating unit 82 is specifically configured to:

map the at least one first event element to a first node in the at least one knowledge graph; and use a node directly connected to the first node in the at least one knowledge graph as the at least one first associated element.

According to an embodiment, the above-mentioned knowledge graph may include one or more of the following: enterprise knowledge graph, product knowledge graph, character knowledge graph, information knowledge graph, stock knowledge graph, fund knowledge graph, and institution knowledge graph.

According to an embodiment, the evaluation unit 85 includes:

The element acquisition module 851 is configured to acquire the event type of the second event and at least one second event element;

The element association module 852 is configured to obtain at least one second associated element associated with the at least one second event element in the at least one knowledge graph;

The first determining module 853 is configured to determine the event feature of the second event according to the event type of the second event, the at least one second event element, and the at least one second associated element;

The second determining module 854 is configured to input the event characteristics of the second event into the trained GBDT model, and determine the risk value of the second event according to the model output.

Specifically, in one embodiment, the element acquisition module 851 is configured to:

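identify the second event and the second event type from the input text; and extract, according to the second event type, the at least one second event element from the input text.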

In another embodiment, the element acquisition module 851 is configured to:

receive the input second event and the at least one second event element.

According to one embodiment, the GBDT model obtained by training includes at least one decision tree, the decision tree includes branch nodes and leaf nodes, and each branch node corresponds to a feature and has a risk score and a node weight obtained through training, wherein the node weight is determined based on the respective node loss values of the branch node and of its child nodes after splitting, and the node loss value is determined based on the difference between the calibrated risk values of the sample events falling into the node and the risk score of the node;

Correspondingly, in an embodiment, the evaluation unit 85 further includes (not shown):

A decision path determining module, configured to determine the decision path of the second event in the decision tree according to the event characteristics of the second event;

A node weight determination module, configured to determine each branch node through which the decision path passes, and obtain the characteristics and node weights corresponding to each branch node;

The importance determination module is configured to determine, for a first feature included in the event feature of the second event, the feature weight of the first feature according to the node weight of at least one branch node that corresponds to the first feature among the branch nodes, the feature weight being used as the importance of the first feature to the risk value.

According to another embodiment, the evaluation unit 85 includes (not shown):

An element acquisition module configured to acquire at least one second event element of the second event;

A subtree determining module, configured to divide a second event in the decision tree according to the at least one second event element, and determine a subtree of the decision tree based on the divided stop nodes;

A conditional path determining module, configured to determine a first leaf node in the subtree that meets a predetermined condition and a conditional path from the root node to the first leaf node;

The feature determination module is configured to obtain a feature combination corresponding to a branch node included in the conditional path, and use the feature combination as an impact feature of the second event under the predetermined condition.

In an embodiment, each leaf node in the decision tree obtains a risk score through training, and each branch node corresponds to a feature and has a risk score and a node weight obtained through training, wherein the node weight is determined based on the node loss values of the branch node and of its child nodes after splitting, and the node loss value is determined based on the difference between the calibrated risk values of the sample events falling into the node and the risk score of the node;

Correspondingly, the evaluation unit further includes one or more of the following:

A third determining module, configured to determine the first risk score corresponding to the first leaf node as the risk value of the second event under the predetermined condition;

The fourth determining module is configured to determine the importance of each feature corresponding to each branch node in the feature combination according to the node weight of each branch node in the conditional path.

Through the above devices, the training and use of the GBDT model can be realized, and the event risk can be effectively evaluated and explained.

According to another embodiment, there is also provided a computer-readable storage medium having a computer program stored thereon, and when the computer program is executed in a computer, the computer is caused to execute the method described in conjunction with FIG. 2.

According to an embodiment of still another aspect, there is also provided a computing device including a memory and a processor, wherein the memory stores executable code, and when the processor executes the executable code, the method described in conjunction with FIGS. 2 and 4 is implemented.

Those skilled in the art should be aware that in one or more of the above examples, the functions described in the present invention can be implemented by hardware, software, firmware or any combination thereof. When implemented by software, these functions can be stored in a computer-readable medium or transmitted as one or more instructions or codes on the computer-readable medium.

The specific embodiments described above explain the purpose, technical solutions, and beneficial effects of the present invention in further detail. It should be understood that the above descriptions are only specific embodiments of the present invention and are not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, improvement, etc. made on the basis of the technical solution of the present invention shall be included in the protection scope of the present invention.

Claims

A computer-executed event risk assessment method includes:

Using a natural language processing model, extracting multiple sample events from a content text library, the multiple sample events including a first sample event, wherein extracting the multiple sample events includes identifying the first sample event and its corresponding first event type, and extracting at least one first event element of the first sample event according to the first event type;

Acquiring at least one first associated element associated with the at least one first event element in at least one knowledge graph corresponding to at least one field associated with the first sample event;

Determining the event characteristics of the first sample event according to the first event type, the at least one first event element, and the at least one first associated element;

Training the gradient boosting decision tree GBDT model according to the event characteristics of each sample event in the multiple sample events and the calibration risk value of each sample event to obtain the trained GBDT model;

Using the trained GBDT model, the risk assessment of the second event to be analyzed is performed.
The method according to claim 1, wherein said extracting at least one first event element of said first sample event according to a first event type comprises:

Determine the first template corresponding to the first event type;

Using the first template, extract at least one first event element of the first sample event from the content text library.
The method according to claim 1 or 2, wherein the at least one first event element includes at least one of the following: event time, event location, implementation subject, event object, fact type, event level.
The method according to claim 1, wherein acquiring at least one first associated element associated with the at least one first event element comprises:

Mapping the at least one first event element to the first node in the at least one knowledge graph;

A node directly connected to the first node in the at least one knowledge graph is used as the at least one first associated element.
The method according to claim 1 or 4, wherein the at least one knowledge graph comprises an enterprise knowledge graph, a product knowledge graph, a character knowledge graph, an information knowledge graph, a stock knowledge graph, a fund knowledge graph, and an institution knowledge graph.
The method according to claim 1, wherein using the trained GBDT model to perform risk assessment on the second event to be analyzed comprises:

Acquiring the event type of the second event and at least one second event element;

In the at least one knowledge graph, acquiring at least one second associated element associated with the at least one second event element;

Determining event features of the second event according to the event type of the second event, the at least one second event element, and the at least one second associated element;

Inputting the event features of the second event into the trained GBDT model, and determining the risk value of the second event according to the model output.
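One possible way to turn an event type, event elements, and associated elements into a fixed-length feature vector is sketched below. The encoding is an assumption made for illustration: the list of event types, the `event_level` element, and the `listed_entities` set are hypothetical, and the claims do not specify how features are encoded.

```python
from typing import Dict, List, Set

EVENT_TYPES = ["penalty", "lawsuit", "executive_change"]  # hypothetical type list
FEATURE_NAMES = [f"type={t}" for t in EVENT_TYPES] + [
    "event_level",
    "num_associated",
    "has_listed_associate",
]


def build_features(
    event_type: str,
    elements: Dict[str, str],
    associated: List[str],
    listed_entities: Set[str],
) -> List[float]:
    """Encode type (one-hot), one numeric element, and two graph-derived features."""
    one_hot = [1.0 if event_type == t else 0.0 for t in EVENT_TYPES]
    level = float(elements.get("event_level", 0))
    has_listed = 1.0 if any(a in listed_entities for a in associated) else 0.0
    return one_hot + [level, float(len(associated)), has_listed]


features = build_features(
    "penalty",
    {"event_level": "3", "event_object": "Acme Corp"},
    ["Acme Pay", "Jane Doe"],
    listed_entities={"Acme Pay"},
)
# `features` is aligned with FEATURE_NAMES and would be passed to the trained model.
```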
The method according to claim 6, wherein acquiring the event type of the second event and at least one second event element comprises:

Identifying the second event and the second event type from the input text;

Extracting the at least one second event element from the input text according to the second event type.
The method according to claim 6, wherein acquiring the event type of the second event and at least one second event element comprises:

Receiving the second event and the at least one second event element as input.
The method according to claim 6, wherein the trained GBDT model includes at least one decision tree, the decision tree includes branch nodes and leaf nodes, each branch node corresponds to a feature and has a node weight obtained by training, the node weight is determined based on the respective node loss values of the branch node and of the nodes resulting from its split, and the node loss value is determined based on the difference between the calibrated risk values of the sample events falling into the node and the risk score of the node;

Said using the trained GBDT model to perform risk assessment on the second event to be analyzed further comprises:

Determining the decision path of the second event in the decision tree according to the event features of the second event;

Determining each branch node through which the decision path passes, and acquiring the feature and the node weight corresponding to each such branch node;

For a first feature included in the event features of the second event, determining a feature weight of the first feature according to the node weight of at least one branch node, among the branch nodes, that corresponds to the first feature, the feature weight serving as the importance of the first feature to the risk value.
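The path-based feature weighting above can be illustrated as follows. The sketch assumes that each branch node already carries a trained node weight and that the weights of branch nodes sharing a feature are summed; the claim only requires that the feature weight be determined from those node weights. The tree, its thresholds, and its weights are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical trained tree: branch nodes carry a feature, a threshold and a node
# weight; leaf nodes carry only a risk score.
TREE = {
    "feature": "event_level", "threshold": 2.0, "weight": 0.30,
    "left": {"score": 0.1},
    "right": {
        "feature": "num_associated", "threshold": 3.0, "weight": 0.12,
        "left": {"score": 0.5},
        "right": {
            "feature": "event_level", "threshold": 4.0, "weight": 0.05,
            "left": {"score": 0.7},
            "right": {"score": 0.9},
        },
    },
}


def decision_path(tree: dict, features: Dict[str, float]) -> Tuple[List[dict], float]:
    """Walk from the root to a leaf; return the branch nodes visited and the leaf score."""
    path = []
    node = tree
    while "score" not in node:
        path.append(node)
        go_left = features[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return path, node["score"]


def feature_weights(trees: List[dict], features: Dict[str, float]) -> Dict[str, float]:
    """Sum the node weights of the visited branch nodes, grouped by feature."""
    weights: Dict[str, float] = defaultdict(float)
    for tree in trees:
        path, _ = decision_path(tree, features)
        for branch in path:
            weights[branch["feature"]] += branch["weight"]
    return dict(weights)


event = {"event_level": 5.0, "num_associated": 4.0}
path, leaf_score = decision_path(TREE, event)
importance = feature_weights([TREE], event)
```

In this toy tree the event passes two branch nodes on `event_level` and one on `num_associated`, so `event_level` receives the larger feature weight.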
The method according to claim 1, wherein the trained GBDT model includes at least one decision tree, and the decision tree includes branch nodes and leaf nodes;

Said using the trained GBDT model to perform risk assessment on the second event to be analyzed comprises:

Acquiring at least one second event element of the second event;

Partitioning the second event in the decision tree according to the at least one second event element, and determining a subtree of the decision tree based on the node at which the partitioning stops;

Determining a first leaf node in the subtree that meets a predetermined condition, and a conditional path from the root node to the first leaf node;

Acquiring the feature combination corresponding to the branch nodes included in the conditional path, and using the feature combination as the influence features of the second event under the predetermined condition.
The method according to claim 10, wherein each leaf node in the decision tree has a risk score obtained by training, each branch node corresponds to a feature and has a risk score and a node weight obtained by training, the node weight is determined based on the respective node loss values of the branch node and of the nodes resulting from its split, and the node loss value is determined based on the difference between the calibrated risk values of the sample events falling into the node and the risk score of the node;

Said using the trained GBDT model to perform risk assessment on the second event to be analyzed further comprises one or more of the following:

Determining the first risk score corresponding to the first leaf node as the risk value of the second event under the predetermined condition;

Determining the importance of each feature corresponding to each branch node in the feature combination according to the node weight of each branch node in the conditional path.
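The subtree and conditional-path steps can be sketched as below for an event whose elements are only partly known. Several assumptions are made: the partitioning stops at the first branch node whose feature is unknown, the predetermined condition is taken to be "highest risk score in the subtree", and the toy tree does not reuse a known feature below the stop node, so no consistency pruning is needed. The tree and all names are hypothetical.

```python
from typing import Dict, Iterator, List, Tuple

Step = Tuple[str, str, float, float]  # (feature, comparison, threshold, node weight)

# Hypothetical trained tree; leaves carry a risk score.
TREE = {
    "feature": "event_level", "threshold": 2.0, "weight": 0.30,
    "left": {"score": 0.1},
    "right": {
        "feature": "num_associated", "threshold": 3.0, "weight": 0.12,
        "left": {"score": 0.5},
        "right": {
            "feature": "has_listed_associate", "threshold": 0.5, "weight": 0.05,
            "left": {"score": 0.7},
            "right": {"score": 0.9},
        },
    },
}


def descend(tree: dict, known: Dict[str, float]) -> Tuple[List[Step], dict]:
    """Follow splits while the split feature is known; return the prefix and stop node."""
    prefix: List[Step] = []
    node = tree
    while "score" not in node and node["feature"] in known:
        go_left = known[node["feature"]] <= node["threshold"]
        prefix.append(
            (node["feature"], "<=" if go_left else ">", node["threshold"], node["weight"])
        )
        node = node["left"] if go_left else node["right"]
    return prefix, node


def leaves(node: dict, path: Tuple[Step, ...] = ()) -> Iterator[Tuple[float, List[Step]]]:
    """Enumerate every leaf of the subtree together with the path leading to it."""
    if "score" in node:
        yield node["score"], list(path)
        return
    f, t, w = node["feature"], node["threshold"], node["weight"]
    yield from leaves(node["left"], path + ((f, "<=", t, w),))
    yield from leaves(node["right"], path + ((f, ">", t, w),))


def highest_risk_path(tree: dict, known: Dict[str, float]) -> Tuple[float, List[Step]]:
    """Pick the leaf with the highest risk score and return its root-to-leaf path."""
    prefix, stop_node = descend(tree, known)
    score, suffix = max(leaves(stop_node), key=lambda item: item[0])
    return score, prefix + suffix


risk, conditional_path = highest_risk_path(TREE, {"event_level": 3.0})
feature_combination = [step[0] for step in conditional_path]
importance = {step[0]: step[3] for step in conditional_path}
```

Here the leaf score plays the role of the risk value under the predetermined condition, and the node weights along the conditional path give the importance of each feature in the combination.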
A computer-executed event risk assessment device, comprising:

An extraction unit, configured to use a natural language processing model to extract a plurality of sample events from a content text library, the plurality of sample events including a first sample event, wherein extracting the plurality of sample events includes identifying the first sample event and its corresponding first event type, and extracting at least one first event element of the first sample event according to the first event type;

An associating unit configured to obtain at least one first associated element associated with the at least one first event element in at least one knowledge graph corresponding to at least one field associated with the first sample event;

A determining unit, configured to determine event features of the first sample event according to the first event type, the at least one first event element, and the at least one first associated element;

A training unit, configured to train a gradient boosting decision tree (GBDT) model according to the event features of each sample event in the plurality of sample events and the calibrated risk value of each sample event, to obtain a trained GBDT model;

An evaluation unit, configured to use the trained GBDT model to perform risk assessment on a second event to be analyzed.
The device according to claim 12, wherein the extraction unit is configured to:

Determine the first template corresponding to the first event type;

Using the first template, extract at least one first event element of the first sample event from the content text library.
The device according to claim 12 or 13, wherein the at least one first event element includes at least one of the following: event time, event location, implementation subject, event object, fact type, and event level.
The device according to claim 12, wherein the associating unit is configured to:

Map the at least one first event element to a first node in the at least one knowledge graph;

Use a node directly connected to the first node in the at least one knowledge graph as the at least one first associated element.
The device according to claim 12 or 15, wherein the at least one knowledge graph comprises an enterprise knowledge graph, a product knowledge graph, a person knowledge graph, an information knowledge graph, a stock knowledge graph, a fund knowledge graph, and an institution knowledge graph.
The device according to claim 12, wherein the evaluation unit comprises:

The element acquisition module is configured to acquire the event type of the second event and at least one second event element;

An element association module, configured to obtain at least one second association element associated with the at least one second event element in the at least one knowledge graph;

A first determining module, configured to determine event features of the second event according to the event type of the second event, the at least one second event element, and the at least one second associated element;

A second determining module, configured to input the event features of the second event into the trained GBDT model, and determine the risk value of the second event according to the model output.
The device according to claim 17, wherein the element acquisition module is configured to:

Identify the second event and the second event type from the input text;

Extract the at least one second event element from the input text according to the second event type.
The device according to claim 17, wherein the element acquisition module is configured to:

Receive the second event and the at least one second event element as input.
The device according to claim 17, wherein the trained GBDT model includes at least one decision tree, the decision tree includes branch nodes and leaf nodes, each branch node corresponds to a feature and has a node weight obtained by training, the node weight is determined based on the respective node loss values of the branch node and of the nodes resulting from its split, and the node loss value is determined based on the difference between the calibrated risk values of the sample events falling into the node and the risk score of the node;

The evaluation unit further includes:

A decision path determining module, configured to determine the decision path of the second event in the decision tree according to the event features of the second event;

A node weight determination module, configured to determine each branch node through which the decision path passes, and acquire the feature and the node weight corresponding to each such branch node;

An importance determination module, configured to determine, for a first feature included in the event features of the second event, a feature weight of the first feature according to the node weight of at least one branch node, among the branch nodes, that corresponds to the first feature, the feature weight serving as the importance of the first feature to the risk value.
The device according to claim 12, wherein the trained GBDT model includes at least one decision tree, and the decision tree includes branch nodes and leaf nodes;

The evaluation unit includes:

An element acquisition module configured to acquire at least one second event element of the second event;

A subtree determining module, configured to partition the second event in the decision tree according to the at least one second event element, and determine a subtree of the decision tree based on the node at which the partitioning stops;

A conditional path determining module, configured to determine a first leaf node in the subtree that meets a predetermined condition and a conditional path from the root node to the first leaf node;

A feature determination module, configured to acquire the feature combination corresponding to the branch nodes included in the conditional path, and use the feature combination as the influence features of the second event under the predetermined condition.
The device according to claim 21, wherein each leaf node in the decision tree has a risk score obtained by training, each branch node corresponds to a feature and has a risk score and a node weight obtained by training, the node weight is determined based on the respective node loss values of the branch node and of the nodes resulting from its split, and the node loss value is determined based on the difference between the calibrated risk values of the sample events falling into the node and the risk score of the node;

The evaluation unit further includes one or more of the following:

A third determining module, configured to determine the first risk score corresponding to the first leaf node as the risk value of the second event under the predetermined condition;

A fourth determining module, configured to determine the importance of each feature corresponding to each branch node in the feature combination according to the node weight of each branch node in the conditional path.
A computer-readable storage medium having a computer program stored thereon, and when the computer program is executed in a computer, the computer is caused to execute the method of any one of claims 1-11.
A computing device, comprising a memory and a processor, characterized in that executable code is stored in the memory, and when the processor executes the executable code, the method according to any one of claims 1-11 is implemented.