CN112085589B - Method and device for determining safety of rule model and server - Google Patents

Publication number
CN112085589B
Authority
CN
China
Prior art keywords
rule
probability
rule model
hit
miss
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010908614.5A
Other languages
Chinese (zh)
Other versions
CN112085589A (en
Inventor
张文彬
李漓春
殷山
李翰林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202010908614.5A priority Critical patent/CN112085589B/en
Publication of CN112085589A publication Critical patent/CN112085589A/en
Application granted granted Critical
Publication of CN112085589B publication Critical patent/CN112085589B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/067 Enterprise or organisation modelling

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • General Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Physics & Mathematics (AREA)
  • Finance (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Accounting & Taxation (AREA)
  • Educational Administration (AREA)
  • Technology Law (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The specification provides a method, a device and a server for determining the safety of a rule model. Based on the method, the rule model to be detected can be converted into a binary tree structure according to a preset conversion rule to obtain a rule model of the binary tree structure; the rule model of the binary tree structure comprises a plurality of nodes, and each node corresponds to a rule or a rule connecting word respectively; and then, carrying out structural splitting on the rule model of the binary tree structure by utilizing the structural characteristics of the binary tree, and determining the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of no hit of the rule model by combining the data value distribution of the attribute through recursive calculation so as to determine whether the rule model has safety risks. Therefore, the safety of the rule model can be determined efficiently and accurately, and the risk of data leakage caused by the fact that the data provider runs the unsafe rule model is reduced.

Description

Method and device for determining safety of rule model and server
Technical Field
The specification belongs to the technical field of internet, and particularly relates to a method, a device and a server for determining security of a rule model.
Background
In some data processing scenarios, the model generator is often separate from the data provider.
Usually, the data provider can respond to the request of the model generator, and run the rule model provided by the model generator by using the data resource owned by the own party to obtain a corresponding processing result; and feeding back the processing result to the model generator. Therefore, the model generator can obtain a corresponding processing result on the premise of not contacting the data resources owned by the data provider; and can carry out specific data processing according to the processing result.
However, if the rule model itself is not secure, the data provider may leak the data resources owned by the data provider in the course of running the rule model.
Therefore, a method for determining the security of the rule model more efficiently and accurately is needed.
Disclosure of Invention
The specification provides a method, a device and a server for determining the safety of a rule model, so that the safety of the rule model can be determined efficiently and accurately, and the risk of data leakage caused by unsafe operation of the rule model by a data provider is reduced.
The method, the device and the server for determining the safety of the rule model are realized as follows:
a method of determining security of a rule model, comprising: acquiring a rule model and data value distribution of attributes; wherein the rule model comprises a rule set, the rule set comprises a plurality of rules, and the rules are connected through rule connecting words; converting the rule model into a binary tree structure according to a preset conversion rule to obtain a rule model of the binary tree structure; the rule model of the binary tree structure comprises a plurality of nodes, and each node in the plurality of nodes corresponds to a rule or a rule connecting word respectively; according to the rule model of the binary tree structure and the data value distribution of the attributes, determining the preset guessing probability of the attributes under the condition of hit of the rule model and the preset guessing probability of the attributes under the condition of miss of the rule model through recursive calculation; and determining whether the rule model has a safety risk or not according to the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of no hit of the rule model.
A device for determining security of a rule model, comprising: the acquisition module is used for acquiring the rule model and the data value distribution of the attribute; the rule model comprises a rule set, wherein the rule set comprises a plurality of rules which are connected through rule connecting words; the conversion module is used for converting the rule model into a binary tree structure according to a preset conversion rule to obtain a rule model of the binary tree structure; the rule model of the binary tree structure comprises a plurality of nodes, and each node in the plurality of nodes corresponds to a rule or a rule connecting word respectively; the calculation module is used for determining the preset guessing probability of the attribute under the condition of the hit of the rule model and the preset guessing probability of the attribute under the condition of the miss of the rule model through recursive calculation according to the rule model of the binary tree structure and the data value distribution of the attribute; and the determining module is used for determining whether the rule model has a safety risk or not according to the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of miss of the rule model.
A method of determining security of a rule model, comprising: acquiring a rule model and data value distribution of attributes; wherein the rule model comprises a rule set, the rule set comprises a plurality of rules, and the rules are connected through rule connecting words; converting the rule model into a preset tree structure according to a preset conversion rule to obtain a rule model of the preset tree structure; the preset rule model of the tree structure comprises a plurality of nodes, and each node in the plurality of nodes corresponds to a rule or a rule connecting word respectively; according to the preset rule model of the tree structure and the data value distribution of the attributes, determining the preset guessing probability of the attributes under the condition of hit of the rule model and the preset guessing probability of the attributes under the condition of miss of the rule model through recursive calculation; and determining whether the rule model has a safety risk or not according to the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of miss of the rule model.
A method of determining security of a rule model, comprising: acquiring a rule model and data value distribution of attributes; wherein the rule model comprises a rule set, the rule set comprises a plurality of rules, and the rules are connected through rule connecting words; determining a preset guess probability of the attribute under the condition of hit of the rule model and a preset guess probability of the attribute under the condition of miss of the rule model according to the rule model and the data value distribution of the attribute; and determining whether the rule model has a safety risk or not according to the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of miss of the rule model.
A server comprising a processor and a memory for storing processor-executable instructions, wherein the instructions, when executed by the processor, implement: acquiring a rule model and data value distribution of attributes; wherein the rule model comprises a rule set, the rule set comprises a plurality of rules, and the rules are connected through rule connecting words; converting the rule model into a binary tree structure according to a preset conversion rule to obtain a rule model of the binary tree structure; the rule model of the binary tree structure comprises a plurality of nodes, and each node in the plurality of nodes corresponds to a rule or a rule connecting word respectively; determining a preset guess probability of the attribute under the condition of hit of the rule model and a preset guess probability of the attribute under the condition of miss of the rule model through recursive calculation according to the rule model of the binary tree structure and the data value distribution of the attribute; and determining whether the rule model has a safety risk or not according to the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of miss of the rule model.
A computer readable storage medium having stored thereon computer instructions that, when executed, implement obtaining a rule model, and a data value distribution for an attribute; the rule model comprises a rule set, wherein the rule set comprises a plurality of rules which are connected through rule connecting words; converting the rule model into a binary tree structure according to a preset conversion rule to obtain a rule model of the binary tree structure; the rule model of the binary tree structure comprises a plurality of nodes, and each node in the plurality of nodes corresponds to a rule or a rule connecting word respectively; according to the rule model of the binary tree structure and the data value distribution of the attributes, determining the preset guessing probability of the attributes under the condition of hit of the rule model and the preset guessing probability of the attributes under the condition of miss of the rule model through recursive calculation; and determining whether the rule model has a safety risk or not according to the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of miss of the rule model.
According to the method, the device and the server for determining the safety of the rule model, the rule model to be detected is converted into the binary tree structure according to the preset conversion rule, so that the rule model with the binary tree structure is obtained; the rule model of the binary tree structure comprises a plurality of nodes, and each node corresponds to a rule or a rule connecting word respectively; and then, carrying out structural splitting on the rule model of the binary tree structure by utilizing the structural characteristics of the binary tree, determining the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of no hit of the rule model by combining the data value distribution of the attribute through recursive calculation, and determining whether the rule model has a safety risk or not based on the guessing probabilities. Therefore, the safety of the rule model can be determined efficiently and accurately, and the risk of data leakage caused by the fact that the data provider runs the unsafe rule model is reduced.
Drawings
In order to more clearly illustrate the embodiments of the present specification, the drawings needed to be used in the embodiments will be briefly described below, and the drawings in the following description are only some of the embodiments described in the present specification, and it is obvious to those skilled in the art that other drawings can be obtained according to the drawings without any creative effort.
FIG. 1 is a schematic diagram of the structural composition of a system to which a method for determining security of a rule model provided in an embodiment of the present specification is applied;
FIG. 2 is a flow diagram illustrating a method for determining security of a rule model provided in one embodiment of the present description;
FIG. 3 is a diagram illustrating an embodiment of a method for determining security of a rule model using an embodiment of the present specification, in an example scenario;
FIG. 4 is a diagram illustrating an embodiment of a method for determining security of a rule model using an embodiment of the present specification, in an example scenario;
FIG. 5 is a diagram illustrating an embodiment of a method for determining security of a rule model using an embodiment of the present specification, in an example scenario;
FIG. 6 is a diagram illustrating an embodiment of a method for determining security of a rule model using an embodiment of the present specification, in an example scenario;
FIG. 7 is a flowchart illustrating a method for determining security of a rule model provided in an embodiment of the present description;
FIG. 8 is a schematic structural component diagram of a server provided in an embodiment of the present specification;
FIG. 9 is a schematic structural component diagram of a device for determining security of a rule model according to an embodiment of the present specification.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the present specification, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only a part of the embodiments of the present specification, and not all of the embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments in the present specification without any inventive step should fall within the scope of protection of the present specification.
The embodiment of the specification provides a method for determining the safety of a rule model, and the method can be particularly applied to a system comprising a first server, a second server and a third server.
Specifically, refer to FIG. 1. The first server may include a server disposed on the model generator side. The second server may include a server disposed on the data provider side. The third server may include a third-party-side server that is responsible for detecting the security of the rule model. The third party may be understood as a service provider that is trusted by both the model generator and the data provider and is responsible for detecting the security of the rule model.
In an implementation, the first server may configure and construct a rule model including only one rule set in order to perform corresponding data processing (for example, determining credit risk of the user) by using data resources owned by the data provider. The rule set may specifically include a plurality of rules, and the rules may be connected by a rule connection word (e.g., "and", "or").
The first server sends the rule model to the second server, sends the rule model to the third server for detection, and sends a detection request about the safety of the rule model to the third server. The third server has authority to disassemble and read specific rules contained in the rule model.
The third server may receive and respond to the detection request, obtain data value distribution for detecting an attribute of the rule model, and detect whether the rule model has a security risk according to the data value distribution of the attribute. The data value distribution of the attribute may be provided by the second server, or may be generated by the third server itself.
When the security of the rule model is specifically detected, the third server may convert the rule model into a binary tree structure according to a preset conversion rule to obtain a rule model of the binary tree structure; the rule model of the binary tree structure comprises a plurality of nodes, and each node in the plurality of nodes corresponds to one rule or one rule connecting word respectively. And determining the preset guess probability of the attribute under the condition of hit of the rule model and the preset guess probability of the attribute under the condition of miss of the rule model through recursive calculation according to the rule model of the binary tree structure and the data value distribution of the attribute. And finally, determining whether the rule model has a safety risk or not according to the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of miss of the rule model.
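The recursive idea can be illustrated with a simplified quantity, the probability that the rule model as a whole is hit. The following is a minimal sketch, not the guess-probability calculation of this specification: it assumes that each leaf node already carries the hit probability of its rule (derived from the data value distribution) and that the attributes are statistically independent, which is an added assumption on top of the statement that each attribute appears only once in the rule set. The names Node and hit_probability are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    label: str                      # "AND", "OR", or a rule name
    hit_probability: float = 0.0    # meaningful for leaf (rule) nodes only
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def hit_probability(node: Node) -> float:
    """Recursively combine leaf hit probabilities up to the root.

    Assumes the rules in the left and right subtrees are independent.
    """
    if node.left is None and node.right is None:
        return node.hit_probability
    p_left = hit_probability(node.left)
    p_right = hit_probability(node.right)
    if node.label == "AND":
        return p_left * p_right
    if node.label == "OR":
        return p_left + p_right - p_left * p_right
    raise ValueError(f"unsupported rule connecting word: {node.label}")


# rule 1 AND (rule 2 OR rule 3), with illustrative leaf probabilities
tree = Node(
    "AND",
    left=Node("rule 1", hit_probability=0.5),
    right=Node(
        "OR",
        left=Node("rule 2", hit_probability=0.25),
        right=Node("rule 3", hit_probability=0.5),
    ),
)
print(hit_probability(tree))  # 0.3125
```

The probability of a miss is then one minus this value. Splitting the tree at the root and solving each subtree first is what makes the calculation recursive.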
In the case where it is determined that the rule model does not have a security risk, the third server may generate and send security prompt information to the second server. After receiving the security prompt message, the second server can normally run the rule model by using the own data resource to obtain a corresponding processing result; and feeding back the processing result to the first server. The first server may complete corresponding data processing (e.g., credit risk rating of the user, etc.) according to the processing result.
In the case where it is determined that the rule model is at risk for security, the third server may generate and send risk hint information to the second server. The second server can receive the risk prompt information and refuse to operate the rule model by using the data resource owned by the second server, so that the data resource owned by the data provider can be effectively prevented from being leaked.
In this embodiment, the first server, the second server, and the third server may specifically include a server that is applied to a data processing system side and is capable of implementing functions such as data transmission and data processing. Specifically, the first server, the second server, and the third server may be, for example, an electronic device having data operation, storage, and network interaction functions. Alternatively, the first server, the second server, and the third server may also be software programs that run in the electronic device and support data processing, storage, and network interaction. In this embodiment, the number of the servers included in the first server, the second server, and the third server is not specifically limited. The first server, the second server, and the third server may be specifically one server, or may be several servers, or a server cluster formed by several servers.
Referring to FIG. 2, an embodiment of the present specification provides a method for determining security of a rule model. The method is applied to the third server side. In particular implementations, the method may include the following.
S201: acquiring a rule model and data value distribution of attributes; the rule model comprises a rule set, wherein the rule set comprises a plurality of rules, and the rules are connected through rule connecting words.
In some embodiments, the third server may be specifically understood as a server disposed on the third party side. The third party may be specifically understood as a service provider which is independent of the data provider and the model generator, is trusted by both of them, and is responsible for detecting the security of the rule model.
In some embodiments, where the model generator allows the disclosure of the rule model to the data provider, the method may also be applied to the second server disposed on the data provider side. That is, the data provider may detect the security of the rule model by using this method on the second server.
In some embodiments, in the case that the model generator needs to perform self-checking on the generated rule model, the method may also be applied to a first server disposed on the model generator side. That is, the model generator may also detect the security of the generated rule model by the first server using this method. Specifically, the first server sends the rule model to the second server only when the first server determines that the generated rule model has no security risk through detection.
The embodiments of the present specification will be described specifically mainly by taking an example of applying the method to the third server. For the case of application to the first server, the second server, the following embodiment for application to the third server may be referred to.
In some embodiments, the rule model may specifically include a set of rules. The rule set may further include one or more rules.
In some embodiments, a rule is used to detect whether a certain attribute characteristic of a data object falls within a certain predetermined data value range. The rule may specifically include data elements such as an attribute, an operator, and a data threshold.
The attribute may be specifically understood as parameter data for characterizing a certain attribute characteristic of the data object. For example, the attribute may be monthly income, default rate, height, occupation, and the like. The data threshold may be specifically understood as an upper limit value and/or a lower limit value of a data value set for an attribute in a rule, e.g., 1000 yuan, 15 times, 5%, etc. The operator may be understood as a symbol in a rule defining the decision relation between an attribute and a data threshold, for example, > (greater-than sign), < (less-than sign), ≥ (greater-than-or-equal sign), and the like. Of course, the attributes, operators, and data thresholds listed above are only illustrative.
Specifically, for example, in rule 1 "monthly income of user > 1000 yuan", the attribute is "monthly income", the operator is ">", and the data threshold is "1000 yuan". If a user's monthly income is 2000 yuan, which is greater than 1000 yuan, the user hits rule 1. If a user's monthly income is 500 yuan, which is less than 1000 yuan, the user does not hit rule 1.
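The attribute, operator, and data threshold of a rule can be sketched as a small data type. The following is a minimal illustration only; the class name Rule, the field names, and the key "monthly_income" are hypothetical and are not taken from this specification.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict

# Operator symbols mapped to comparison functions (illustrative subset).
OPERATORS: Dict[str, Callable[[float, float], bool]] = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
}


@dataclass(frozen=True)
class Rule:
    attribute: str     # e.g. "monthly_income"
    op: str            # one of the keys of OPERATORS
    threshold: float   # data threshold, e.g. 1000 (yuan)

    def hits(self, record: Dict[str, float]) -> bool:
        """Return True when the record's attribute value satisfies the rule."""
        return OPERATORS[self.op](record[self.attribute], self.threshold)


rule_1 = Rule(attribute="monthly_income", op=">", threshold=1000)
print(rule_1.hits({"monthly_income": 2000}))  # True  (hit)
print(rule_1.hits({"monthly_income": 500}))   # False (miss)
```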
In some embodiments, the rule set may include only one rule. For example, rule set 1 may contain only one rule, rule 1. If a user hits rule 1, it can be understood that the user hits rule set 1. If a user does not hit rule 1, then it can be understood that the user does not hit rule set 1.
In some embodiments, the rule set may also include a plurality of different rules. The plurality of different rules can be connected together through rule connecting words to form a rule set. The rule connecting words may specifically include connecting words such as "and" (AND) and "or" (OR).
For example, in rule set 2 "number of defaults of the user > 5 (denoted as rule 2), or default rate of the user > 0.5 (denoted as rule 3)", rule 2 and rule 3 are connected together by the rule connecting word "or" to form a rule set, i.e., rule set 2. If a user hits at least one of rule 2 and rule 3, the user hits rule set 2. If a user hits neither rule 2 nor rule 3, the user does not hit rule set 2.
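The hit semantics of rule connecting words can be sketched as follows. This is an illustration under assumptions: the helper connect and the record keys "default_count" and "default_rate" are hypothetical names chosen for the example.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]
Predicate = Callable[[Record], bool]


def rule_2(record: Record) -> bool:
    """Number of defaults of the user > 5."""
    return record["default_count"] > 5


def rule_3(record: Record) -> bool:
    """Default rate of the user > 0.5."""
    return record["default_rate"] > 0.5


def connect(word: str, rules: List[Predicate]) -> Predicate:
    """Combine rules with a rule connecting word ("AND" or "OR")."""
    if word == "OR":
        return lambda record: any(rule(record) for rule in rules)
    if word == "AND":
        return lambda record: all(rule(record) for rule in rules)
    raise ValueError(f"unsupported rule connecting word: {word}")


rule_set_2 = connect("OR", [rule_2, rule_3])
print(rule_set_2({"default_count": 7, "default_rate": 0.1}))  # True  (hits rule 2)
print(rule_set_2({"default_count": 2, "default_rate": 0.2}))  # False (hits neither)
```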
In some embodiments, the rule set may include a variety of different attributes. Where each attribute appears only once in the rule set.
In some embodiments, the model generator may configure corresponding rules according to specific application scenarios and data processing requirements; combining the rules to obtain a rule set; and treat the rule set as a rule model (also referred to as a rule-based model). And the data provider runs the rule model by using the owned data resources to obtain and feed back a corresponding processing result so as to perform specific data processing.
Refer to FIG. 3. The rule model generated by the model generator includes only one rule set, i.e., "rule 1 AND (rule 2 OR rule 3)". The rule set includes three rules: rule 1, rule 2, and rule 3. Specifically, rule 2 and rule 3 are connected together through the rule connecting word OR; rule 1 and the combination of rule 2 and rule 3 are then connected together through the rule connecting word AND to form the rule set.
In some embodiments, the model generator and the data provider tend to be separate. In this case, the model generator may transmit the rule model described above to the data provider. The data provider can use the own data resource, for example, a database containing information data of a large number of data objects, etc., to run the rule model to obtain a corresponding processing result; and feeding the processing result back to the model generator so that the model generator can obtain and utilize the processing result to complete corresponding data processing. This also reduces the risk of data resources owned by the data provider being compromised.
In some embodiments, since some rule models have security risks, when a data provider runs such rule models with own data resources to obtain corresponding processing results, the data provider still has a risk of data leakage, which threatens data security of the data provider.
For example, when generating the rule model, the model generator may intentionally configure rule set n in the rule model as "monthly income of the user = 5000 yuan". Suppose the data provider then uses its own data resources to query the information data of the user L to be detected (for example, the monthly income of user L is 5000 yuan), inputs that information data into the rule model, and feeds the processing result (for example, that user L hits rule set n) back to the model generator. In this case, although the data provider has not directly disclosed to the model generator that the monthly income of user L is 5000 yuan, the model generator can accurately infer this from the processing result. That is, the data resources of the data provider have been leaked.
Therefore, in order to avoid leaking its data resources when running the rule model and to protect its data security, the data provider can entrust a third party to test the security of the rule model before running the rule model on its own data resources. Only when the third party determines that the rule model has no security risk does the data provider use its own data resources to run the rule model.
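The leak in this example can be quantified with a small sketch. It assumes, for illustration only, that the "guess probability" of an attribute is the largest posterior probability of any single data value once the hit or miss outcome is known; this section does not give the exact formula, and the function name and the sample distribution below are hypothetical.

```python
from typing import Callable, Dict


def max_guess_probability(
    distribution: Dict[float, float],
    rule: Callable[[float], bool],
    hit: bool,
) -> float:
    """Largest posterior probability of a single value, given the outcome."""
    matching = {v: p for v, p in distribution.items() if rule(v) == hit}
    total = sum(matching.values())
    if total == 0:
        return 0.0  # this outcome cannot occur under the distribution
    return max(matching.values()) / total


def equality_rule(value: float) -> bool:
    return value == 5000   # "monthly income = 5000 yuan"


def threshold_rule(value: float) -> bool:
    return value > 3000    # "monthly income > 3000 yuan"


# Illustrative data value distribution of the attribute "monthly income".
income_distribution = {3000: 0.25, 4000: 0.25, 5000: 0.25, 6000: 0.25}

print(max_guess_probability(income_distribution, equality_rule, hit=True))   # 1.0
print(max_guess_probability(income_distribution, threshold_rule, hit=True))
```

Under this reading, a hit on the equality rule pins the value down with probability 1.0, whereas a hit on the threshold rule leaves three equally likely values (about 0.33 each).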
In some embodiments, in specific implementation, the first server disposed on the model generator side sends the rule model to the second server, and simultaneously sends a detection request carrying the rule model to the third server to request the third server to perform security detection on the rule model.
The third server may obtain the rule model to be detected from the received detection request. Under a data processing protocol previously established between the third server and the first server, the third server has the authority to disassemble the rule model and acquire data such as the specific rules in the rule model.
In some embodiments, while the first server sends the rule model to the third server, some information related to the rule model, which is allowed to be disclosed to the third server, for example, identification information of a rule set included in the rule model, identification information of attributes in the rule set, occurrence times of each attribute in the rule model, and the like, may also be sent to the third server as basic information of the rule model to assist the third server in performing security detection on the rule model. Accordingly, the third server can obtain the basic information of the rule model through the first server.
The identification information of the rule set may specifically be a name of the rule set, or may be a number of the rule set. The identification information of one rule set corresponds to one rule set. The identification information of the attribute may be specifically a name of the attribute, a number of the attribute, or the like. The identification information of one attribute corresponds to one attribute.
Meanwhile, the third server may further acquire a data value distribution for detecting an attribute of the rule model. The data value distribution of the attribute may specifically include a distribution ratio of different data values of the attribute in the sample data.
In specific implementation, the third server may obtain sample data provided by the second server for detecting the rule model, and calculate the data value distribution of the corresponding attribute according to the sample data. Alternatively, the third server may directly acquire the data value distribution of the attribute disclosed by the second server for detection. The third server may also use a data value distribution of the attribute that it generates itself, for example through data fitting, for the rule model to be detected. The manner in which the third server obtains the data value distribution of the attribute is not limited in this specification.
S202: converting the rule model into a binary tree structure according to a preset conversion rule to obtain a rule model of the binary tree structure; the rule model of the binary tree structure comprises a plurality of nodes, and each node in the plurality of nodes corresponds to one rule or one rule connecting word respectively.
In some embodiments, the binary tree structure may be specifically understood as a recursively defined tree data structure. A binary tree is a finite set of n elements; the set is either empty, or consists of one element called the root node together with two disjoint binary trees that are located on the two sides of the root node and are called the left subtree and the right subtree respectively. When the set is empty, the binary tree may be referred to as an empty binary tree. One node in the binary tree corresponds to one element in the set.
In some embodiments, in specific implementation, referring to fig. 4, the rule model may be converted into a binary tree structure according to a preset conversion rule, so as to obtain the rule model with the binary tree structure.
Specifically, as shown in fig. 5, the rule model of the binary tree structure may comprise a plurality of nodes. Each node in the plurality of nodes corresponds to one rule or one rule connecting word. For example, the node numbered 2 corresponds to rule 1, and the node numbered 1 corresponds to the rule connecting word "AND" (i.e., logical conjunction).
Further, in the rule model of the binary tree structure, nodes without parents (e.g., nodes numbered 1) may be denoted as root nodes, and nodes other than the root nodes (e.g., nodes numbered 2) may be denoted as leaf nodes.
The rule model is firstly converted into the rule model with the binary tree structure, so that the structural characteristics of the binary tree can be utilized subsequently, the rule model is pertinently subjected to structural splitting to obtain a plurality of nodes, and the split nodes are traversed and calculated through a recursive algorithm, so that the detection on the rule model can be efficiently and accurately finished, and the overall processing efficiency is improved.
In some embodiments, the rule model is converted into a binary tree structure according to a preset conversion rule to obtain the rule model with the binary tree structure, and in specific implementation, reference may be made to the following embodiments.
In this embodiment, since the rule model only includes one rule set, the rules in the rule set may be expanded in order from left to right. For example, referring to fig. 6, the expanded rule set in the rule model can be represented as: rule 1 AND (rule 2 OR rule 3).
Furthermore, the expanded rule set can be split into its two first-level composition structures; the rule connecting word between the two first-level composition structures is determined as the root node, and the two first-level composition structures are determined as the left subtree and the right subtree, which are connected to the root node and located on its two sides respectively.
Further, according to a similar manner, the two first-level constituent structures on the left sub-tree and the right sub-tree are respectively split continuously until a single rule is split to serve as a leaf node and further splitting cannot be performed, so that a rule model of a corresponding binary tree structure is obtained.
For example, refer to fig. 5 and fig. 6 together. According to the preset conversion rule, the expanded rule set can be split into the following two first-level composition structures: rule 1, and (rule 2 OR rule 3). Further, the rule connecting word "AND" between these two first-level composition structures may be determined as the root node, and rule 1 and (rule 2 OR rule 3) may be determined as the left subtree and the right subtree connected to the root node, respectively.
The left sub-tree can then be processed to find: the first level on the left sub-tree has a composition structure of rule 1, only contains one rule, and can not be split any more. Rule 1 may then be determined to be a leaf node on the left sub-tree that is connected to the root node, completing the split on one side of the left sub-tree.
Meanwhile, the right subtree is processed to find: the first-level composition structure on the right subtree is (rule 2 OR rule 3), which is not a single rule and can be further split. In this case, the right subtree (rule 2 OR rule 3) may be further split, in the same manner as the earlier split into first-level composition structures, into two second-level composition structures: rule 2 and rule 3. Further, the rule connecting word "OR" between rule 2 and rule 3 may be determined as a leaf node connected to the root node on the right subtree, and the two second-level composition structures are determined as the left subtree and the right subtree that are connected to the leaf node of "OR" and located on its two sides respectively.
Since the above two second levels of constituent structures are single rules that cannot be split any more, rule 2 may be determined as a leaf node on the left sub-tree connected to the leaf node of "OR" and rule 3 may be determined as a leaf node on the right sub-tree connected to the leaf node of "OR".
Thus, the conversion of the rule model is completed, and the rule model of the corresponding binary tree structure is obtained.
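The splitting procedure described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the token-list input format, the `Node` class, and the helper names `strip_outer_parens` and `build_tree` are assumptions introduced here, and the sketch further assumes the expanded rule set is fully parenthesized, so that each level contains exactly one top-level rule connecting word (precedence between AND and OR is not handled).

```python
from dataclasses import dataclass
from typing import List, Optional

CONNECTIVES = ("AND", "OR")


@dataclass
class Node:
    """One node of the binary tree: a rule name or a rule connecting word."""
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def strip_outer_parens(tokens: List[str]) -> List[str]:
    """Drop parentheses that enclose the whole token list, if any."""
    while len(tokens) >= 2 and tokens[0] == "(" and tokens[-1] == ")":
        depth = 0
        for i, tok in enumerate(tokens):
            if tok == "(":
                depth += 1
            elif tok == ")":
                depth -= 1
            if depth == 0 and i < len(tokens) - 1:
                # The opening parenthesis closes early, so it is not an outer pair.
                return tokens
        tokens = tokens[1:-1]
    return tokens


def build_tree(tokens: List[str]) -> Node:
    """Split at the top-level connecting word and recurse on both sides."""
    tokens = strip_outer_parens(tokens)
    if len(tokens) == 1:
        return Node(tokens[0])  # a single rule cannot be split further
    depth = 0
    for i, tok in enumerate(tokens):
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
        elif depth == 0 and tok in CONNECTIVES:
            return Node(tok, build_tree(tokens[:i]), build_tree(tokens[i + 1:]))
    raise ValueError("no top-level rule connecting word found")


# The expanded rule set of fig. 6, written as whitespace-separated tokens.
tree = build_tree("rule1 AND ( rule2 OR rule3 )".split())
```

For the rule set of fig. 6, the resulting root holds "AND", its left child holds rule 1, and its right child holds "OR" with rule 2 and rule 3 beneath it, matching the structure described for fig. 5.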
S203: and determining the preset guess probability of the attribute under the condition of hit of the rule model and the preset guess probability of the attribute under the condition of miss of the rule model through recursive calculation according to the rule model of the binary tree structure and the data value distribution of the attribute.
In some embodiments, the predetermined guess probability may also be referred to as a maximum guess probability. Accordingly, the predetermined guess probability of the attribute in the case of a hit in the rule model may be specifically understood as the maximum value of the guess probability in guessing the specific data value of the attribute in the case of a known hit in the rule model. The predetermined guessing probability of the attribute in the case of a rule model miss may be specifically understood as the maximum value of the guessing probability when the specific data value of the attribute is guessed in the case of a known rule model miss.
Generally, if the value of the preset guessing probability of a certain attribute in the case of a hit in a rule model is larger based on a certain rule model, the data value of the attribute is easier to guess under the condition of known hit in the rule model. Accordingly, the greater the likelihood that running the rule model reveals data resources of the data provider, the higher the security risk of the rule model.
Conversely, if the value of the preset guess probability of a certain attribute in the case of a hit or a miss of a certain rule model is smaller, the data value of the attribute is more difficult to guess under the condition that the hit or miss of the rule model is known. Accordingly, running the rule model is less likely to reveal data resources of the data provider, and the security risk of the rule model is lower.
Therefore, the preset guess probability of the attribute in the case of the hit of the rule model and the preset guess probability of the attribute in the case of the miss of the rule model can be used as a judgment index to determine the safety of the rule model.
In some embodiments, in specific implementation, the structural characteristics of the binary tree structure may be utilized to obtain a plurality of nodes by structurally splitting the rule model of the binary tree structure. And calculating the plurality of nodes one by adopting a recursive algorithm so as to quickly and efficiently calculate the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of miss of the rule model.
In some embodiments, referring to fig. 5, the predetermined guessing probability of the attribute in case of hit of the rule model and the predetermined guessing probability of the attribute in case of no hit of the rule model are determined by recursive calculation according to the rule model of the binary tree structure and the data value distribution of the attribute, and the following may be included in the implementation.
S1: and recursively calculating the hit probability and/or miss probability of the rule at each node and the hit probability and/or miss probability of the rule model from the leaf nodes in the direction to the root node according to the rule model of the binary tree structure and the data value distribution of the attributes.
S2: and recursively calculating the preset guess probability of the attribute under the condition of the hit of the rule model and the preset guess probability of the attribute under the condition of the miss of the rule model from the root node along the direction far away from the root node according to the rule model of the binary tree structure and the hit probability and/or miss probability of the rule model.
In some embodiments, in implementation, the leaf node at the end (i.e., the leaf node not connected to the left sub-tree or the right sub-tree) may be selected from the rule model of the binary tree structure as a starting node, and the hit probability and/or miss probability of the rule of each node is recursively calculated node by node in a direction toward the root node; until the last node in the rule model of the binary tree structure, namely the root node, is recursively calculated; the hit probability and/or miss probability of the rule set (i.e., the rule model) may then be determined based on the hit probability and/or miss probability of the rule computed at the root node.
In some embodiments, taking the current node as an example, the hit probability and/or the miss probability of the rule at the current node may be calculated as follows: detecting the data type corresponding to the current node; in the case that the data type corresponding to the current node is determined to be a rule, obtaining the rule corresponding to the current node as the current rule, and extracting the attribute from the current rule as the current attribute; determining the data value distribution of the current attribute according to the data value distribution of the attributes; and determining the hit probability and/or the miss probability of the current rule according to the data value distribution of the current attribute, as the hit probability and/or the miss probability of the rule at the current node.
The data type corresponding to the current node may specifically be a rule or a rule connecting word. The rule connecting words may further include two different types: AND and OR.
In some embodiments, when it is determined that the data type corresponding to the current node is a rule, the rule corresponding to the current node may be obtained as the current rule; further, an attribute can be extracted from the current rule as a current attribute; calculating the hit probability of the current rule according to the data value distribution of the attributes, wherein the hit probability is used as the hit probability of the rule at the current node; and/or calculating the miss probability of the current rule according to the data value distribution of the attribute, wherein the miss probability is used as the miss probability of the rule at the current node.
Specifically, for example, the rule a corresponding to the current node is: X > 3. The current attribute is X. The data value distribution of X can be obtained from the data value distribution of the attributes; for example, the data values 1, 2, 3, 4 and 5 of X occur in the ratio 1:2:3:3:1. In this case, the hit probability of rule a can be determined as P(hit) = (3+1)/(1+2+3+3+1) = 4/10, and this value may be taken as the hit probability of the rule at the current node. Conversely, the miss probability of rule a can be determined as P(miss) = (1+2+3)/(1+2+3+3+1) = 6/10, and this value may be taken as the miss probability of the rule at the current node.
In some embodiments, the miss probability of the rule at the current node may also be calculated by subtracting the hit probability of the rule at the current node (e.g., P(hit)) from 1, where the hit probability of the rule at the current node may be determined according to the data value distribution of the attribute. Similarly, the hit probability of the rule at the current node may also be calculated by subtracting the miss probability of the rule at the current node (e.g., P(miss)) from 1, where the miss probability of the rule at the current node may be determined according to the data value distribution of the attribute.
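The calculation for a single rule can be sketched in Python as follows. This is only an illustration: the function name `rule_hit_probability`, the dictionary representation of the data value distribution, and the use of a predicate to stand for the rule are assumptions introduced here, and the ratio 1:2:3:3:1 is the distribution implied by the arithmetic of the example for rule a (X > 3).

```python
from fractions import Fraction


def rule_hit_probability(distribution, predicate):
    """Share of the distribution's weight whose data values satisfy the rule.

    distribution: maps each data value to its relative weight (hypothetical format).
    predicate: function returning True when a data value hits the rule.
    """
    total = sum(distribution.values())
    hit = sum(weight for value, weight in distribution.items() if predicate(value))
    return Fraction(hit, total)


# Data values 1..5 of attribute X in the ratio 1:2:3:3:1 (assumed from the example).
dist_x = {1: 1, 2: 2, 3: 3, 4: 3, 5: 1}

p_hit = rule_hit_probability(dist_x, lambda x: x > 3)  # rule a: X > 3
p_miss = 1 - p_hit                                     # complement of the hit probability
```

Exact fractions are used here so that the values can be compared directly with the 4/10 and 6/10 of the worked example; floating-point numbers would serve equally well in practice.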
In some embodiments, under the condition that the data type corresponding to the current node is determined to be a rule connecting word, obtaining the hit probability and/or miss probability of a rule at a node which is connected with the current node and is far away from the root node; and determining the hit probability and/or miss probability of the rule at the current node according to the hit probability and/or miss probability of the rule at the node which is connected with the current node and is far away from the root node.
In some embodiments, in the case that the data type corresponding to the current node is determined to be a rule connecting word, it may be further determined whether the rule connecting word is AND or OR; then, according to the type of the rule connecting word, the hit probability and/or the miss probability of the rule at the current node can be determined by using the hit probability and/or the miss probability of the rules at the two nodes that are connected to the current node and are farther from the root node.
Specifically, the hit probabilities of the rules at two nodes connected to the current node and far from the root node may be denoted as P (1) and P (2), respectively. Accordingly, the miss probabilities of the rules at two nodes connected to the current node and far from the root node can be denoted as 1-P (1) and 1-P (2), respectively.
In the case that the rule connecting word at the current node is determined to be AND, the hit probability of the rule at the current node can be calculated according to the calculation rule matching AND in the following manner: P(hit) = P(1) × P(2); and/or the miss probability of the rule at the current node: P(miss) = 1 - P(1) × P(2).
In the case that the rule connecting word at the current node is determined to be OR, the miss probability of the rule at the current node may be calculated according to the calculation rule matching OR in the following manner: P(miss) = (1 - P(1)) × (1 - P(2)); and/or the hit probability of the rule at the current node: P(hit) = 1 - (1 - P(1)) × (1 - P(2)).
In the above manner, the hit probability and/or the miss probability of the rule at each of the plurality of nodes contained in the rule model of the binary tree structure can be recursively calculated from the leaf nodes toward the root node, traversing all nodes until the hit probability and/or the miss probability of the rule at the root node is calculated; the hit probability and/or the miss probability of the rule at the root node is then determined as the hit probability and/or the miss probability of the rule model. Therefore, the hit probability and/or miss probability of the rule model can be efficiently and accurately calculated by using the structural characteristics of the binary tree.
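The recursion from the leaf nodes toward the root node can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the nested-tuple tree representation, the function name `hit_probability`, and the leaf hit probabilities of 0.5 are assumptions introduced here. The product formulas for AND and OR are the ones given above; they treat the hits of the two child nodes as independent events.

```python
def hit_probability(node, leaf_hit):
    """Recursively compute the hit probability at a node.

    node: either a rule name (str) or a tuple (connecting_word, left, right).
    leaf_hit: maps each rule name to the hit probability of that rule.
    """
    if isinstance(node, str):
        return leaf_hit[node]  # a single rule: look up its own hit probability
    word, left, right = node
    p1 = hit_probability(left, leaf_hit)
    p2 = hit_probability(right, leaf_hit)
    if word == "AND":
        return p1 * p2                    # P(hit) = P(1) x P(2)
    if word == "OR":
        return 1 - (1 - p1) * (1 - p2)    # P(hit) = 1 - (1 - P(1)) x (1 - P(2))
    raise ValueError(f"unknown rule connecting word: {word}")


# rule 1 AND (rule 2 OR rule 3), with purely illustrative leaf probabilities.
tree = ("AND", "rule1", ("OR", "rule2", "rule3"))
leaf_hit = {"rule1": 0.5, "rule2": 0.5, "rule3": 0.5}

model_hit = hit_probability(tree, leaf_hit)
model_miss = 1 - model_hit
```

The value returned for the root node is taken as the hit probability of the rule model, and its complement as the miss probability.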
In some embodiments, the structural features of the binary tree may be reused to traverse each node in the rule model of the binary tree structure from the root node in a direction away from the root node according to the rule model of the binary tree structure, the hit probability and/or the miss probability of the rule model, so as to calculate the preset guess probability of the attribute in case of hit of the rule model and the preset guess probability of the attribute in case of miss of the rule model by a recursive algorithm.
Specifically, the conditional hit probability and/or the conditional miss probability of the rule at each node in the case of hit by the rule model and the conditional hit probability and/or the conditional miss probability of the rule at each node in the case of miss by the rule model may be determined by recursive computation starting from the root node in a direction away from the root node according to the rule model of the binary tree structure and the hit probability and/or the miss probability of the rule at each node in the direction away from the root node; and meanwhile, according to the conditional hit probability and/or the conditional miss probability of the rule at each node under the condition that the rule model is hit, the conditional hit probability and/or the conditional miss probability of the rule at each node under the condition that the rule model is not hit, and the data value distribution of the attribute, the preset guess probability of the attribute under the condition that the rule model is hit and the preset guess probability of the attribute under the condition that the rule model is not hit are determined.
In some embodiments, specifically, two situations can be distinguished: a hit rule model and a miss rule model. According to the two conditions, traversing each node from the root node along the direction far away from the root node, and calculating the conditional hit probability and/or the conditional miss probability of the rule at each node under the condition of hit of the rule model and the conditional hit probability and/or the conditional miss probability of the rule at each node under the condition of miss of the rule model one by one through a recursive algorithm; and then, by combining the data value distribution of the attributes, the preset guessing probability of the attributes under the condition of hit of the rule model and the preset guessing probability of the attributes under the condition of miss of the rule model are calculated.
In the specific calculation, taking the current node as an example, similar to calculating the hit probability and/or miss probability of the rule at each current node, whether the data type corresponding to the current node is a rule or a rule conjunction word may be detected first; according to the data type corresponding to the current node, determining the conditional hit probability and/or the conditional miss probability of the rule at the current node under the condition that the rule model is hit and the conditional hit probability and/or the conditional miss probability of the rule at the current node under the condition that the rule model is not hit according to the rule matched with the corresponding data type; and then determining the preset guess probability of the attribute under the condition that the rule model is hit and the preset guess probability of the attribute under the condition that the rule model is not hit.
The conditional hit probability of the rule at the current node may be specifically understood as a probability value when the rule at the current node is hit under a condition that a hit condition or a miss condition of the rule model is known. The above-mentioned conditional miss probability at the current node may be specifically understood as a probability value when the rule at the current node is determined to miss under the condition that the rule model is known to hit or miss.
In some embodiments, when the data type corresponding to the current node is determined to be a rule by detecting the data type corresponding to the current node, the rule corresponding to the current node may be obtained as the current rule, and the maximum guess probability of the attribute when the current rule hits and/or the maximum guess probability of the attribute when the current rule misses is calculated according to the current rule and the data value distribution of the attribute. Meanwhile, the conditional hit probability and/or the conditional miss probability of the current rule at the current node in the case of a hit of the rule model, and the conditional hit probability and/or the conditional miss probability of the current rule at the current node in the case of a miss of the rule model, are determined. Then, the preset guess probability of the attribute in the case of a hit of the rule model and the preset guess probability of the attribute in the case of a miss of the rule model are determined according to the maximum guess probability of the attribute when the current rule hits and/or the maximum guess probability of the attribute when the current rule misses, the conditional hit probability and/or the conditional miss probability of the current rule at the current node in the case of a hit of the rule model, and the conditional hit probability and/or the conditional miss probability of the current rule at the current node in the case of a miss of the rule model.
Specifically, for example, the rule a corresponding to the current node is: X > 3. The current attribute is X. The data value distribution of X can be obtained from the data value distribution of the attributes; for example, the data values 1, 2, 3, 4 and 5 of X occur in the ratio 1:2:3:3:1.
In the case of a hit of the current rule a, the data value of X can only be one of the data values greater than 3 (i.e., 4 and 5). According to the data value distribution of X, the guess probability that the data value of X is 4 can be calculated as 3/(3+1) = 3/4, and the guess probability that the data value of X is 5 can be calculated as 1/(3+1) = 1/4. Comparing the two probability values shows that 3/4 is greater than 1/4, so 3/4 is determined as the maximum guess probability of X in the case of a hit of rule a.
In a similar manner, in the case of a miss of the current rule a, the data value of X can only be one of the data values less than or equal to 3 (i.e., 1, 2 and 3). According to the data value distribution of X, the guess probability that the data value of X is 1 can be calculated as 1/(1+2+3) = 1/6; the guess probability that the data value of X is 2 can be calculated as 2/(1+2+3) = 2/6; and the guess probability that the data value of X is 3 can be calculated as 3/(1+2+3) = 3/6. Comparing the three probability values shows that 3/6 is greater than both 1/6 and 2/6, so 3/6 can be determined as the maximum guess probability of X in the case of a miss of rule a.
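The maximum guess probability of an attribute for a single rule can be sketched in Python as follows. The function name `max_guess_probability`, the dictionary representation of the distribution, and the predicate standing for the rule are assumptions introduced for illustration; the ratio 1:2:3:3:1 is the distribution implied by the arithmetic of the example for rule a (X > 3).

```python
from fractions import Fraction


def max_guess_probability(distribution, predicate, rule_hit):
    """Largest probability of guessing the attribute's value, given the rule outcome.

    distribution: maps each data value to its relative weight (hypothetical format).
    predicate: function returning True when a data value hits the rule.
    rule_hit: True to condition on the rule hitting, False on the rule missing.
    """
    # Keep only the data values that are consistent with the known outcome.
    candidates = [w for v, w in distribution.items() if predicate(v) == rule_hit]
    if not candidates:
        raise ValueError("no data value is consistent with this outcome")
    total = sum(candidates)
    # The best single guess is the most heavily weighted remaining value.
    return max(Fraction(w, total) for w in candidates)


dist_x = {1: 1, 2: 2, 3: 3, 4: 3, 5: 1}   # assumed ratio 1:2:3:3:1


def rule_a(x):
    """Rule a of the example: X > 3."""
    return x > 3


g_hit = max_guess_probability(dist_x, rule_a, rule_hit=True)    # best guess is X = 4
g_miss = max_guess_probability(dist_x, rule_a, rule_hit=False)  # best guess is X = 3
```

With the assumed distribution, the two values correspond to the 3/4 and 3/6 obtained in the worked example.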
Meanwhile, according to the hit probability and/or the miss probability of the rule model, the conditional hit probability and/or the conditional miss probability of the rule a at the current node under the condition that the rule model is hit and the conditional hit probability and/or the conditional miss probability of the rule a at the current node under the condition that the rule model is missed are determined by combining the connected associated nodes based on the previous hierarchy.
Multiplying the maximum guess probability of X under the condition that the rule a is hit by the conditional hit probability of the rule a at the current node under the condition that the rule model is hit and the conditional hit probability of the rule a at the current node under the condition that the rule model is not hit respectively; and simultaneously multiplying the maximum guess probability of X under the condition that the rule a is not hit by the condition miss probability of the rule a at the current node under the condition that the rule model is hit and the condition miss probability of the rule a at the current node under the condition that the rule model is not hit respectively to obtain corresponding 4 probability values.
Two probability values corresponding to the hit condition of the rule model are selected from the 4 probability values, and the maximum value of the two probability values is selected as the preset guess probability of the attribute X under the hit condition of the rule model (or called the maximum guess probability of the attribute X under the hit condition of the rule model). Two probability values corresponding to the rule model missing condition are selected from the 4 probability values, and the maximum value of the two probability values is selected as the preset guessing probability of the attribute X in the rule model missing condition (or the maximum guessing probability of the attribute X in the rule model missing condition).
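The selection among the 4 probability values can be sketched in Python as follows. The function name, the argument names, and the conditional probabilities used in the example call are hypothetical values chosen only for illustration; the maximum guess probabilities 3/4 and 3/6 are those of rule a in the example above.

```python
from fractions import Fraction


def preset_guess_probabilities(g_hit, g_miss,
                               c_hit_model_hit, c_miss_model_hit,
                               c_hit_model_miss, c_miss_model_miss):
    """Combine rule-level guess probabilities with conditional probabilities.

    g_hit / g_miss: maximum guess probability of the attribute when the rule
        hits / misses.
    c_*_model_hit: conditional hit / miss probability of the rule given that
        the rule model hits.
    c_*_model_miss: conditional hit / miss probability of the rule given that
        the rule model misses.
    Returns (preset guess probability under model hit, under model miss).
    """
    # The two products that correspond to a hit of the rule model.
    under_model_hit = max(g_hit * c_hit_model_hit, g_miss * c_miss_model_hit)
    # The two products that correspond to a miss of the rule model.
    under_model_miss = max(g_hit * c_hit_model_miss, g_miss * c_miss_model_miss)
    return under_model_hit, under_model_miss


# Illustrative inputs only: the conditional probabilities below are assumed.
under_hit, under_miss = preset_guess_probabilities(
    g_hit=Fraction(3, 4), g_miss=Fraction(3, 6),
    c_hit_model_hit=Fraction(1), c_miss_model_hit=Fraction(0),
    c_hit_model_miss=Fraction(1, 4), c_miss_model_miss=Fraction(3, 4),
)
```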
In some embodiments, in the case that the data type corresponding to the current node is determined to be a rule connecting word by detecting the data type corresponding to the current node, it may be further determined whether the rule connecting word is AND or OR; further, in a manner matching the rule connecting word, the conditional hit probability and/or the conditional miss probability of the rules at the next-level nodes in both the case of a rule model hit and the case of a rule model miss may be calculated from the conditional hit probability and/or the conditional miss probability of the rule at the current node in the case of a rule model hit, and the conditional hit probability and/or the conditional miss probability of the rule at the current node in the case of a rule model miss.
And because the current node corresponds to the rule connecting word and does not contain the attribute, the preset guessing probability of a certain attribute under the condition of hit of the rule model and the preset guessing probability under the condition of no hit of the rule model do not need to be calculated at the current node. The current node can be used as a correlation node of the next-level node to assist in the recursive computation of the next-level node.
Specifically, when it is determined that the rule connection is an AND, the conditional hit probabilities of the rules at the left AND right child nodes of the next hierarchy connected to the current node may be calculated as a product of the conditional hit probability of the rule at the current node (for example, the conditional hit probability of the rule at the current node in the case of a rule model hit, etc.) multiplied by 1.
And then calculating the conditional miss probability of the rule at the left and right child nodes connected with the current node.
Specifically, for example, the hit probability of the rule at the current node is denoted as P (hit), the hit probability of the rule at the left child node is denoted as P (left), and the hit probability of the rule at the right child node is denoted as P (right). The conditional miss probability of the rule at the left child node is calculated as P (left conditional miss) = (1-P (left))/(1-P (left) × P (right)). Similarly, the conditional miss probability of a rule at the right child node can be calculated as P (right conditional miss) = (1-P (right))/(1-P (left) × P (right)). And multiplying the two probability values by the conditional miss probability value of the rule at the current node respectively to obtain the conditional miss probability of the rule at the left and right child nodes of the next level.
When the rule connection is determined to be OR, the conditional miss probability of the rule at the left and right child nodes of the next hierarchy connected to the current node may be calculated as the product of the conditional miss probability of the rule at the current node multiplied by 1.
And then calculating the conditional hit probability of the rules at the left and right child nodes connected with the current node.
Specifically, for example, the hit probability of the rule at the current node is denoted as P (hit), the hit probability of the rule at the left child node is denoted as P (left), and the hit probability of the rule at the right child node is denoted as P (right). The conditional hit probability of the rule at the left child node is calculated as P (left conditional hit) = P (left)/(1- (1-P (left)) × (1-P (right))). Similarly, the conditional hit probability of the rule at the right child node can be calculated as P (right conditional hit) = P (right)/(1- (1-P (left)) × (1-P (right))). And multiplying the two probability values by the conditional hit probability value of the rule at the current node respectively to obtain the conditional hit probability of the rule at the left and right child nodes of the next level.
Therefore, the corresponding conditional hit probability and/or conditional miss probability of the next-level node of the current node can be calculated.
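The propagation step at a single rule connecting word can be sketched in Python as follows. This is a literal transcription of the formulas given above for AND and OR, not a complete implementation of the traversal: the function name `propagate`, its argument names, and the example probabilities are assumptions introduced here.

```python
from fractions import Fraction


def propagate(word, p_left, p_right, cond_hit, cond_miss):
    """Pass conditional probabilities from a connecting-word node to its children.

    word: "AND" or "OR".
    p_left / p_right: hit probabilities of the rules at the left / right child.
    cond_hit / cond_miss: conditional hit / miss probability at the current node
        (for one fixed case, i.e. rule model hit or rule model miss).
    Returns ((left_cond_hit, left_cond_miss), (right_cond_hit, right_cond_miss)).
    """
    if word == "AND":
        # If an AND node hits, both children hit: multiply by 1.
        p_node_miss = 1 - p_left * p_right  # must be non-zero
        left = (cond_hit * 1, cond_miss * (1 - p_left) / p_node_miss)
        right = (cond_hit * 1, cond_miss * (1 - p_right) / p_node_miss)
    elif word == "OR":
        # If an OR node misses, both children miss: multiply by 1.
        p_node_hit = 1 - (1 - p_left) * (1 - p_right)  # must be non-zero
        left = (cond_hit * p_left / p_node_hit, cond_miss * 1)
        right = (cond_hit * p_right / p_node_hit, cond_miss * 1)
    else:
        raise ValueError(f"unknown rule connecting word: {word}")
    return left, right


half = Fraction(1, 2)
# Root is AND and the rule model is known to miss: cond_hit = 0, cond_miss = 1.
and_left, and_right = propagate("AND", half, half, Fraction(0), Fraction(1))
# Root is OR and the rule model is known to hit: cond_hit = 1, cond_miss = 0.
or_left, or_right = propagate("OR", half, half, Fraction(1), Fraction(0))
```

Calling `propagate` once per connecting-word node, starting from the root node with the pair (1, 0) for the rule model hit case and (0, 1) for the rule model miss case, would yield the conditional probabilities needed at the nodes that correspond to rules.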
According to the mode, the structural characteristics of the binary tree structure can be fully utilized, the multiple nodes are subjected to recursive calculation one by one, and the conditional hit probability and/or the conditional miss probability of the rule at each node under the condition that the rule model is hit and the conditional hit probability and/or the conditional miss probability of the rule at each node under the condition that the rule model is not hit are determined; and determining the preset guess probability of the attribute under the condition of the hit of the rule model and the preset guess probability of the attribute under the condition of the miss of the rule model according to the conditional hit probability and/or the conditional miss probability of the rule at each node under the condition of the hit of the rule model, the conditional hit probability and/or the conditional miss probability of the rule at each node under the condition of the miss of the rule model and the data value distribution of the attribute.
S204: and determining whether the rule model has a safety risk or not according to the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of miss of the rule model.
In some embodiments, in specific implementation, the preset guess probability of the attribute in the case of a hit of the rule model and the preset guess probability of the attribute in the case of a miss of the rule model may each be used as an index parameter and compared with a preset safety threshold, so as to judge how easily the data value of the attribute can be guessed once it is known whether the rule model hit or missed.
If the preset guess probability of the attribute in the case of the hit of the rule model and/or the preset guess probability of the attribute in the case of the miss of the rule model is greater than the preset safety threshold, it can be considered that: under the condition of hit or miss of the known rule model, the real data value of the attribute is relatively easier to guess, and the data resource owned by the data provider is easier to leak; and then the rule model can be judged to have safety risk.
If the preset guess probability of the attribute in case of a hit of the rule model and/or the preset guess probability of the attribute in case of a miss of the rule model is less than or equal to the preset safety threshold, it may be considered that: under the condition of hit or miss of the known rule model, the real data value of the attribute is still difficult to guess, and data resources owned by a data provider are difficult to leak; and then the rule model can be judged to have no security risk.
In some embodiments, in specific implementation, a probability value with the largest value may be further screened out from preset guess probabilities of the attributes in the case of a hit rule model and preset guess probabilities of the attributes in the case of a miss rule model, and the probability value is used as a leakage indication parameter of the data value of the attribute; comparing the leakage indication parameter of the data value of the attribute with a preset probability threshold value to obtain a comparison result; and determining whether the rule model has a safety risk or not according to the comparison result.
In some embodiments, the attribute may specifically include a plurality of attributes. Correspondingly, the attribute of the data value with the leakage indication parameter larger than the preset probability threshold value can be determined as the risk attribute with the leakage risk according to the comparison result. The risk attribute can thus be determined from the rule model in a fine-grained manner and prompted to the first server. Therefore, the first server can pertinently modify the rule containing the risk attribute, so that the modified rule model has no security risk and can be normally operated and used.
In some embodiments, the specific value of the preset safety threshold may be determined according to the sensitivity of the data value change of the attribute, the tolerance of error, and other factors.
Specifically, for example, for some application scenarios with higher accuracy requirements, the tolerance for errors is usually small, and the tolerance range is also relatively small; meanwhile, if the change amplitude of the data value of the attribute is small and the sensitivity is high, the value of the preset safety threshold value can be set to be relatively small. In contrast, for some application scenarios with low precision requirements, the tolerance for errors is usually large, and the tolerance range is also relatively large; meanwhile, if the data value of the attribute has large change amplitude and low sensitivity, the preset safety threshold value can be set to be relatively large so as to reduce the false alarm rate.
In some embodiments, in a case where it is determined that the rule model does not have a security risk, in a specific implementation, the third server may generate and send a security prompt message to the second server to prompt the second server that the rule model does not have a security risk, and may normally operate the rule model.
Correspondingly, the second server can receive the security prompt information and, according to it, run the rule model normally using the data resource it owns.
Specifically, for example, the second server may receive a data processing request from the first server, where the data processing request carries an identity (e.g., a user ID) of a data object to be queried. The second server may retrieve the information data matching the identity of the data object from the data resource it owns (e.g., a user database) according to the identity carried in the data processing request. The second server may then input the information data into the rule model, run the rule model, and output, as the processing result, whether the data object hits the rule model, which specific rule in the rule model is hit, and the like.
The second server may then feed back the processing result to the first server, and the first server may perform corresponding data processing according to the processing result. For example, the first server may determine the specific credit risk of the user according to a preset credit risk rating rule, based on whether the user hits the rule model or on which rule in the rule model the user specifically hits, as indicated in the processing result.
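The request handling described above can be sketched as follows. This is a minimal illustration only: the names `RuleNode`, `ConnectorNode`, `evaluate` and `handle_request`, the dictionary used as the user database, and the shape of the returned result are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass
class RuleNode:
    name: str                          # hypothetical label of the rule
    predicate: Callable[[dict], bool]  # True when the record hits the rule


@dataclass
class ConnectorNode:
    word: str                          # rule connecting word: "and" or "or"
    left: "Node"
    right: "Node"


Node = Union[RuleNode, ConnectorNode]


def evaluate(node: Node, record: dict, hits: List[str]) -> bool:
    """Run the rule model on one record and collect the names of hit rules."""
    if isinstance(node, RuleNode):
        hit = node.predicate(record)
        if hit:
            hits.append(node.name)
        return hit
    # Both sides are always evaluated so that the list of hit rules is complete.
    left = evaluate(node.left, record, hits)
    right = evaluate(node.right, record, hits)
    return (left and right) if node.word == "and" else (left or right)


def handle_request(user_id: str, database: Dict[str, dict], model: Node) -> dict:
    """Second-server side: look up the data object and return only the result."""
    record = database.get(user_id)
    if record is None:
        return {"found": False}
    hits: List[str] = []
    model_hit = evaluate(model, record, hits)
    return {"found": True, "model_hit": model_hit, "rules_hit": hits}
```

Note that in this sketch only the hit or miss result leaves the second server, never the underlying record.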
In some embodiments, in a case where it is determined that the rule model has a security risk, in a specific implementation, the third server may generate and send risk prompt information to the second server to prompt the second server that the rule model has a security risk, and may refuse to run the rule model.
Accordingly, the second server can receive the risk prompt information and, according to it, refuse to run the rule model using the data resource it owns.
In this way, the second server avoids leaking its own data resources, or leaking them beyond the tolerance range, as a result of running an unsafe rule model on those data resources. The data security of the data provider is thus effectively protected, and the risk that the data provider's data resources are leaked is reduced.
As can be seen from the above, in the method for determining the security of the rule model provided in the embodiment of the present specification, the rule model to be detected is converted into the binary tree structure according to the preset conversion rule, so as to obtain the rule model with the binary tree structure; the rule model of the binary tree structure comprises a plurality of nodes, and each node corresponds to a rule or a rule connecting word respectively; and then, carrying out structural splitting on the rule model of the binary tree structure by utilizing the structural characteristics of the binary tree, and determining the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of miss of the rule model by combining the data value distribution of the attribute through recursive calculation so as to determine whether the rule model has a safety risk. Therefore, the safety of the rule model can be determined efficiently and accurately, and the risk of data leakage caused by the fact that the data provider runs the unsafe rule model is reduced.
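As an illustration of the conversion step summarized above, the following sketch turns a flat token list into a binary tree in which every node is either a rule or a rule connecting word. The actual preset conversion rule is not reproduced here; this sketch assumes, purely for illustration, that "and" binds more tightly than "or" and that parentheses may be used for grouping. The function name `to_binary_tree` and the tuple layout are hypothetical.

```python
def to_binary_tree(tokens):
    """Convert tokens such as ["r1", "and", "r2", "or", "r3"] to a binary tree.

    Leaves are ("rule", name); inner nodes are (word, left, right) with
    word in {"and", "or"}. Assumed precedence: "and" before "or".
    """
    pos = 0

    def parse_or():
        nonlocal pos
        node = parse_and()
        while pos < len(tokens) and tokens[pos] == "or":
            pos += 1
            node = ("or", node, parse_and())
        return node

    def parse_and():
        nonlocal pos
        node = parse_atom()
        while pos < len(tokens) and tokens[pos] == "and":
            pos += 1
            node = ("and", node, parse_atom())
        return node

    def parse_atom():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of rule expression")
        token = tokens[pos]
        pos += 1
        if token == "(":
            node = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return node
        if token in ("and", "or", ")"):
            raise ValueError(f"unexpected token: {token!r}")
        return ("rule", token)

    tree = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token: {tokens[pos]!r}")
    return tree
```

For example, `to_binary_tree(["r1", "and", "r2", "or", "r3"])` places the "or" connecting word at the root, with the "and" subtree as its left child and the rule `r3` as its right child.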
In some embodiments, the determining, by recursive computation, the preset guessing probability of the attribute in the case of a hit in the rule model and the preset guessing probability of the attribute in the case of a miss in the rule model according to the rule model of the binary tree structure and the data value distribution of the attribute may include: according to the rule model of the binary tree structure and the data value distribution of the attributes, starting from leaf nodes and along the direction to a root node, recursively calculating the hit probability and/or miss probability of the rule at each node and the hit probability and/or miss probability of the rule model; according to the rule model of the binary tree structure and the hit probability and/or miss probability of the rule model, starting from the root node and along the direction far away from the root node, recursively calculating the preset guess probability of the attribute under the condition of hit of the rule model and the preset guess probability of the attribute under the condition of miss of the rule model.
The preset guessing probability may be a maximum guessing probability. Accordingly, the predetermined guess probability for an attribute in the case of a rule model hit may be specifically understood as the maximum guess probability for guessing the data value of the attribute in the case of a known rule model hit. The predetermined guess probability of an attribute in the event of a rule model miss may be understood in particular as the maximum guess probability of guessing the data value of the attribute in the event of a known rule model miss.
In some embodiments, when the hit probability and/or miss probability of the rule at each node is recursively calculated from the leaf nodes toward the root node according to the rule model of the binary tree structure and the data value distribution of the attributes, the processing of a current node among the plurality of nodes may be taken as an example. The hit probability and/or miss probability of the rule at the current node may be calculated as follows: detecting the data type corresponding to the current node; in the case that the data type corresponding to the current node is determined to be a rule, obtaining the rule corresponding to the current node as the current rule, and extracting the attribute from the current rule as the current attribute; determining the data value distribution of the current attribute according to the data value distribution of the attributes; and determining the hit probability and/or miss probability of the current rule according to the data value distribution of the current attribute, as the hit probability and/or miss probability of the rule at the current node.
The multiple nodes in the rule model of the binary tree structure may be recursively calculated one by one in the above manner of calculating the hit probability and/or the miss probability of the rule at the current node, so as to obtain the hit probability and/or the miss probability of the rule at each node.
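For a node whose data type is a rule, the calculation above amounts to summing the probability mass of the attribute values that satisfy the rule. A minimal sketch follows, assuming the data value distribution is given as a mapping from value to probability and the rule as a predicate on a single attribute value; the function name `rule_probabilities` is hypothetical.

```python
def rule_probabilities(distribution, predicate):
    """Return (hit probability, miss probability) of a single-attribute rule.

    distribution: dict mapping each data value of the attribute to its probability.
    predicate:    function returning True when a data value hits the rule.
    """
    p_hit = sum(p for value, p in distribution.items() if predicate(value))
    return p_hit, 1.0 - p_hit
```

For instance, with an age-band distribution `{"<30": 0.25, "30-50": 0.5, ">50": 0.25}` and a rule that is hit by every band except `">50"`, the hit probability is 0.75 and the miss probability is 0.25.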
In some embodiments, after detecting the data type corresponding to the current node, when the method is implemented specifically, the method may further include: under the condition that the data type corresponding to the current node is determined to be a rule connecting word, acquiring the hit probability and/or miss probability of a rule at a node which is connected with the current node and is far away from the root node; and determining the hit probability and/or miss probability of the rule at the current node according to the hit probability and/or miss probability of the rule at the node which is connected with the current node and is far away from the root node.
In some embodiments, the determining the hit probability and/or the miss probability of the rule model may include: obtaining hit probability and/or miss probability of rules at two nodes connected with a root node; and determining the hit probability and/or the miss probability of the rule at the root node as the hit probability and/or the miss probability of the rule model according to the rule connecting words corresponding to the root node and the hit probability and/or the miss probability of the rule at the two nodes connected with the root node.
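The bottom-up combination at rule connecting word nodes can be sketched as below. The combination formulas are an assumption of this sketch: they hold when the two subtrees of a connector are statistically independent (for example, when they test different, independent attributes), and "and" and "or" are assumed to be the only rule connecting words. The function name `hit_probability` and the tuple layout are hypothetical.

```python
def hit_probability(node):
    """Bottom-up hit probability of a binary rule tree.

    node is ("rule", p) for a leaf whose hit probability p is already known,
    or (word, left, right) with word in {"and", "or"} for a connector node.
    Assumes the left and right subtrees are statistically independent.
    """
    if node[0] == "rule":
        return node[1]
    word, left, right = node
    p_left = hit_probability(left)
    p_right = hit_probability(right)
    if word == "and":
        # Both children must hit.
        return p_left * p_right
    if word == "or":
        # The node misses only when both children miss.
        return 1.0 - (1.0 - p_left) * (1.0 - p_right)
    raise ValueError(f"unknown rule connecting word: {word!r}")
```

Calling the function on the root node yields the hit probability of the rule model; the miss probability is one minus that value.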
In some embodiments, the above recursively calculating, according to the rule model of the binary tree structure, the hit probability and/or the miss probability of the rule model, starting from the root node in a direction away from the root node, the preset guess probability of the attribute in the case of a hit in the rule model and the preset guess probability of the attribute in the case of a miss in the rule model may include: according to the rule model of the binary tree structure and the hit probability and/or miss probability of the rule model, determining the conditional hit probability and/or conditional miss probability of the rule at each node under the condition of the hit of the rule model and the conditional hit probability and/or conditional miss probability of the rule at each node under the condition of the miss of the rule model by recursive computation starting from the root node and along the direction far away from the root node; and determining the preset guess probability of the attribute under the condition of the rule model hit and the preset guess probability of the attribute under the condition of the rule model miss according to the conditional hit probability and/or the conditional miss probability of the rule at each node under the condition of the rule model hit, the conditional hit probability and/or the conditional miss probability of the rule at each node under the condition of the rule model miss and the data value distribution of the attribute.
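The two passes described above can be sketched together as follows. This is an illustrative reading rather than the exact formulas of the specification: it assumes that attributes are mutually independent, that each attribute appears in exactly one leaf rule, and that "and" and "or" are the only rule connecting words. All names (`Node`, `bottom_up`, `top_down`, `max_guess_probabilities`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Node:
    word: Optional[str] = None            # "and"/"or" for connector nodes
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    attribute: Optional[str] = None       # set on rule (leaf) nodes
    predicate: Optional[Callable[[object], bool]] = None
    p_hit: float = 0.0                    # filled in by bottom_up


def ratio(num: float, den: float) -> float:
    """Division that returns 0 when the conditioning event has zero probability."""
    return num / den if den > 0 else 0.0


def bottom_up(node: Node, dists: Dict[str, Dict[object, float]]) -> float:
    """Leaf-to-root pass: unconditional hit probability at every node."""
    if node.word is None:
        dist = dists[node.attribute]
        node.p_hit = sum(p for v, p in dist.items() if node.predicate(v))
    else:
        a, b = bottom_up(node.left, dists), bottom_up(node.right, dists)
        node.p_hit = a * b if node.word == "and" else 1 - (1 - a) * (1 - b)
    return node.p_hit


def top_down(node, c_hit, c_miss, dists, out):
    """Root-to-leaf pass.

    c_hit  = P(node hits | rule model hits)
    c_miss = P(node hits | rule model misses)
    """
    if node.word is None:
        dist = dists[node.attribute]

        def max_posterior(c):
            # P(value | model outcome), split by whether the value hits the rule.
            return max(
                ratio(p, node.p_hit) * c if node.predicate(v)
                else ratio(p, 1 - node.p_hit) * (1 - c)
                for v, p in dist.items()
            )

        out[node.attribute] = (max_posterior(c_hit), max_posterior(c_miss))
        return
    for child, sibling in ((node.left, node.right), (node.right, node.left)):
        if node.word == "and":
            given_hit = 1.0  # an "and" node hits only if both children hit
            given_miss = ratio(child.p_hit * (1 - sibling.p_hit), 1 - node.p_hit)
        else:
            given_hit = ratio(child.p_hit, node.p_hit)
            given_miss = 0.0  # an "or" node misses only if both children miss
        top_down(
            child,
            given_hit * c_hit + given_miss * (1 - c_hit),
            given_hit * c_miss + given_miss * (1 - c_miss),
            dists,
            out,
        )


def max_guess_probabilities(root, dists) -> Dict[str, Tuple[float, float]]:
    """Per attribute: (max guess prob. given hit, max guess prob. given miss)."""
    bottom_up(root, dists)
    out: Dict[str, Tuple[float, float]] = {}
    top_down(root, 1.0, 0.0, dists, out)
    return out
```

The root is started with conditional hit probability 1 given a model hit and 0 given a model miss, because the outcome of the root node is the outcome of the rule model itself.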
In some embodiments, the determining whether the rule model has a security risk according to the preset guessing probability of the attribute in the case of a hit in the rule model and the preset guessing probability of the attribute in the case of a miss in the rule model may include: determining leakage indication parameters of data values of the attributes according to the preset guessing probability of the attributes under the condition of hit of the rule model and the preset guessing probability of the attributes under the condition of miss of the rule model; comparing the leakage indication parameters of the data values of the attributes with a preset probability threshold value to obtain a comparison result; and determining whether the rule model has a safety risk or not according to the comparison result.
In some embodiments, in a specific implementation, the probability value with the largest value may be selected, as the leakage indication parameter of the data value of the attribute, from the preset guess probability of the attribute in the case of a hit of the rule model and the preset guess probability of the attribute in the case of a miss of the rule model.
In some embodiments, the attributes may specifically include a plurality of different attributes.
Specifically, for each attribute, the larger of two probability values, namely the preset guess probability of that attribute in the case of a hit of the rule model and the preset guess probability of the same attribute in the case of a miss of the rule model, may be taken as the leakage indication parameter of the data value of the attribute (also referred to as the maximum guess probability of the attribute based on the rule model).
In some embodiments, the specific value of the preset safety threshold may be determined according to factors such as sensitivity of the data value change of the attribute, and tolerance of error. In the case that the attribute includes a plurality of different attributes, the preset safety threshold may also include a plurality of preset safety thresholds with different values. Wherein each preset safety threshold corresponds to an attribute.
In some embodiments, the determining whether the rule model has a security risk according to the comparison result may include: determining that the rule model has a security risk in the case that the leakage indication parameter of the data value of the attribute is determined, according to the comparison result, to be greater than the preset probability threshold.
In some embodiments, in a case that the attributes include a plurality of different attributes, if the leakage indication parameter of the data value of at least one of the plurality of attributes is greater than the preset safety threshold corresponding to that attribute, it may be determined that the rule model has a security risk.
Further, the attribute may also be determined as a risk attribute. The risk attribute may be specifically understood as an attribute having a risk of data leakage by operating the rule model.
If the leakage indication parameters of the data values of the plurality of attributes are all less than or equal to the corresponding preset safety threshold, it can be determined that the rule model has no safety risk.
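The decision logic for a plurality of attributes can be sketched as follows. The function name `assess_rule_model` and the input layout (one pair of guess probabilities and one threshold per attribute) are hypothetical choices made for this illustration.

```python
def assess_rule_model(guess_probs, thresholds):
    """Decide whether a rule model has a security risk.

    guess_probs: dict attribute -> (guess prob. given hit, guess prob. given miss)
    thresholds:  dict attribute -> preset safety threshold for that attribute
    Returns (has_risk, list of risk attributes).
    """
    risk_attributes = []
    for attribute, (p_hit, p_miss) in guess_probs.items():
        # Leakage indication parameter: the larger of the two guess probabilities.
        leakage_indicator = max(p_hit, p_miss)
        if leakage_indicator > thresholds[attribute]:
            risk_attributes.append(attribute)
    return bool(risk_attributes), risk_attributes
```

A value exactly equal to its threshold is treated as safe, matching the "less than or equal to" condition stated above.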
In some embodiments, in a case where it is determined that the rule model has a security risk, the method may further include, in a specific implementation: generating risk prompt information, where the risk prompt information is used for prompting the data provider to refuse to run the rule model.
On the contrary, under the condition that the rule model is determined to have no security risk, security prompt information can be generated; and the safety prompt information is used for prompting the data provider that the rule model can normally run. The server of the data provider can normally run the rule model by using the data resource owned by the own party only when the server of the data provider receives the safety prompt information and determines that the rule model has no safety risk, so that the data security of the data provider can be effectively protected.
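The exchange of prompt information can be sketched as below. The message fields (`type`, `risk_attributes`) and the function names are hypothetical; the specification does not prescribe a message format.

```python
from typing import List, Optional


def build_prompt(has_risk: bool, risk_attributes: List[str]) -> dict:
    """Detection side: build risk prompt or security prompt information."""
    if has_risk:
        return {"type": "risk", "risk_attributes": list(risk_attributes)}
    return {"type": "security", "risk_attributes": []}


def may_run_rule_model(prompt: Optional[dict]) -> bool:
    """Data provider side: run the rule model only after a security prompt.

    A missing prompt is treated like a risk prompt, so the rule model is
    never run before its security has been confirmed.
    """
    return prompt is not None and prompt.get("type") == "security"
```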
In some embodiments, the method, when implemented, may further include: determining, according to the comparison result, any attribute whose leakage indication parameter is greater than the preset probability threshold as a risk attribute with a leakage risk.
Referring to fig. 7, an embodiment of the present specification further provides a method for determining security of a rule model. When the method is implemented, the following contents may be included.
S701: acquiring a rule model and data value distribution of attributes; wherein the rule model comprises a rule set, the rule set comprises a plurality of rules, and the rules are connected through rule connecting words.
S702: converting the rule model into a preset tree structure according to a preset conversion rule to obtain a rule model of the preset tree structure; the rule model of the preset tree structure comprises a plurality of nodes, and each node in the plurality of nodes corresponds to a rule or a rule connecting word respectively.
S703: and determining the preset guess probability of the attribute under the condition of hit of the rule model and the preset guess probability of the attribute under the condition of miss of the rule model through recursive calculation according to the preset rule model of the tree structure and the data value distribution of the attribute.
S704: and determining whether the rule model has a safety risk or not according to the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of miss of the rule model.
In some embodiments, the preset tree structure may specifically include a multi-way tree. Of course, the multi-way tree listed above is merely illustrative. In specific implementation, according to specific situations and processing requirements, other suitable tree data structures such as a binary tree may also be introduced as the preset tree structure. The present specification is not limited to these.
Thus, the rule model of the preset tree structure can be obtained by converting the rule model to be detected into the preset tree structure; and then, carrying out structural splitting on a rule model of the preset tree structure by utilizing the structural characteristics of the preset tree structure, and determining the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of miss of the rule model by combining the data value distribution of the attribute through recursive calculation so as to finally determine whether the rule model has a safety risk. Therefore, the safety of the rule model can be determined efficiently and accurately, and the risk of data leakage caused by the fact that the data provider runs the unsafe rule model is reduced.
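When the preset tree structure is a multi-way tree, a rule connecting word node may have any number of children. The following sketch shows how the leaf-to-root hit probability calculation could look in that case. As an assumption of this sketch, the children of a connector are treated as statistically independent and only "and" and "or" connecting words are handled; the function name and tuple layout are hypothetical.

```python
from math import prod


def hit_probability_nary(node):
    """Hit probability of a multi-way rule tree.

    node is ("rule", p) for a leaf with known hit probability p,
    or (word, [children]) with word in {"and", "or"}.
    Assumes the children of each connector are statistically independent.
    """
    if node[0] == "rule":
        return node[1]
    word, children = node
    probs = [hit_probability_nary(child) for child in children]
    if word == "and":
        # Every child must hit.
        return prod(probs)
    if word == "or":
        # The node misses only when every child misses.
        return 1.0 - prod(1.0 - p for p in probs)
    raise ValueError(f"unknown rule connecting word: {word!r}")
```

Compared with a binary tree, a chain of identical connecting words collapses into a single node, which shortens the recursion depth but leaves the result unchanged.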
The embodiment of the present specification further provides another method for determining the security of the rule model, which specifically includes the following contents.
S1: acquiring a rule model and data value distribution of attributes; the rule model comprises a rule set, wherein the rule set comprises a plurality of rules, and the rules are connected through rule connecting words.
S2: and determining the preset guess probability of the attribute under the condition of hit of the rule model and the preset guess probability of the attribute under the condition of miss of the rule model according to the rule model and the data value distribution of the attribute.
S3: and determining whether the rule model has a safety risk or not according to the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of miss of the rule model.
In some embodiments, determining a preset guess probability of the attribute in the case of a hit in the rule model and a preset guess probability of the attribute in the case of a miss in the rule model according to the rule model and the data value distribution of the attribute may include: according to the rule model and the data value distribution of the attributes, determining the hit probability and/or miss probability of each rule in the rule model and the hit probability and/or miss probability of the rule model; and calculating the preset guess probability of the attribute in the rule under the condition of the hit of the rule model and the preset guess probability of the attribute under the condition of the miss of the rule model according to the hit probability and/or the miss probability of the rule model and the hit probability and/or the miss probability of each rule.
In some embodiments, the hit probability and/or miss probability of each rule in the rule model is determined according to the rule model and the data value distribution of the attribute, and the specific implementation may include: the hit probability and/or miss probability of the current rule is calculated as follows: extracting attributes from the current rule as current attributes; determining the data value distribution of the current attribute according to the data value distribution of the attribute; and determining the hit probability and/or miss probability of the current rule according to the data value distribution of the current attribute.
In some embodiments, the determining the hit probability and/or the miss probability of the rule model may include: determining rule connecting words among rules in the rule model; and calculating the hit probability and/or the miss probability of the rule model according to the rule connecting words and the hit probability and/or the miss probability of the rules connected by the rule connecting words.
In some embodiments, the calculating, according to the hit probability and/or the miss probability of the rule model and the hit probability and/or the miss probability of each rule, a preset guess probability of the attribute in the rule in case of hit in the rule model and a preset guess probability of the attribute in case of miss in the rule model may include: determining the conditional hit probability and/or the conditional miss probability of each rule under the condition that the rule model is hit and the conditional hit probability and/or the conditional miss probability of each rule under the condition that the rule model is not hit according to the rule connecting words among the rules in the rule model and the hit probability and/or the miss probability of the rule model; and determining the preset guess probability of the attribute under the condition of the rule model hit and the preset guess probability of the attribute under the condition of the rule model miss according to the conditional hit probability and/or the conditional miss probability of each rule under the condition of the rule model hit, the conditional hit probability and/or the conditional miss probability of each rule under the condition of the rule model miss and the data value distribution of the attribute.
In some embodiments, the predetermined probability of guessing the attribute in case of hit in the rule model and the predetermined probability of guessing the attribute in case of miss in the rule model may be determined by recursive calculation according to the rule model and the data value distribution of the attribute.
In some embodiments, in specific implementation, the rule model may be converted into a binary tree structure to obtain a rule model with a binary tree structure; and then according to the rule model of the binary tree structure and the data value distribution of the attributes, the preset guessing probability of the attributes under the condition of hit of the rule model and the preset guessing probability of the attributes under the condition of miss of the rule model are determined more efficiently and conveniently through recursive calculation.
Of course, the above-listed calculation is only an illustrative example. In specific implementation, according to specific conditions and specific structural characteristics of the rule model, other suitable manners can be adopted to calculate the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of miss of the rule model.
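For comparison, one conceivable non-recursive way of obtaining the same quantities is exhaustive enumeration of all attribute value combinations. This is an assumption of this sketch and is not taken from the specification; it assumes mutually independent attributes, treats the rule model as an arbitrary predicate on a record, and its cost grows exponentially with the number of attributes, so it is mainly useful for cross-checking small examples. The function name is hypothetical.

```python
from itertools import product


def brute_force_guess_probabilities(dists, model):
    """Per attribute: (max guess prob. given hit, max guess prob. given miss).

    dists: dict attribute -> {data value: probability}; attributes independent.
    model: function taking a record dict and returning True on a hit.
    """
    attrs = list(dists)
    hit_mass = {a: {} for a in attrs}    # joint mass of (value, model hit)
    miss_mass = {a: {} for a in attrs}   # joint mass of (value, model miss)
    p_model_hit = 0.0
    for combo in product(*(dists[a].items() for a in attrs)):
        record = {a: value for a, (value, _) in zip(attrs, combo)}
        prob = 1.0
        for _, p in combo:
            prob *= p
        is_hit = model(record)
        if is_hit:
            p_model_hit += prob
        target = hit_mass if is_hit else miss_mass
        for a in attrs:
            target[a][record[a]] = target[a].get(record[a], 0.0) + prob
    p_model_miss = 1.0 - p_model_hit
    result = {}
    for a in attrs:
        g_hit = max(hit_mass[a].values(), default=0.0)
        g_miss = max(miss_mass[a].values(), default=0.0)
        result[a] = (
            g_hit / p_model_hit if p_model_hit > 0 else 0.0,
            g_miss / p_model_miss if p_model_miss > 0 else 0.0,
        )
    return result
```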
Embodiments of the present specification further provide a server, including a processor and a memory for storing processor-executable instructions, where the processor, when implemented, may perform the following steps according to the instructions: acquiring a rule model and data value distribution of attributes; wherein the rule model comprises a rule set, the rule set comprises a plurality of rules, and the rules are connected through rule connecting words; converting the rule model into a binary tree structure according to a preset conversion rule to obtain a rule model of the binary tree structure; the rule model of the binary tree structure comprises a plurality of nodes, and each node in the plurality of nodes corresponds to a rule or a rule connecting word respectively; according to the rule model of the binary tree structure and the data value distribution of the attributes, determining the preset guessing probability of the attributes under the condition of hit of the rule model and the preset guessing probability of the attributes under the condition of miss of the rule model through recursive calculation; and determining whether the rule model has a safety risk or not according to the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of miss of the rule model.
In order to more accurately complete the above instructions, referring to fig. 8, the present specification further provides another specific server, where the server includes a network communication port 801, a processor 802, and a memory 803, and the above structures are connected by an internal cable, so that the structures may perform specific data interaction.
The network communication port 801 may be specifically configured to obtain a rule model and data value distribution of an attribute; the rule model comprises a rule set, wherein the rule set comprises a plurality of rules, and the rules are connected through rule connecting words.
The processor 802 may be specifically configured to convert the rule model into a binary tree structure according to a preset conversion rule, so as to obtain a rule model with a binary tree structure; the rule model of the binary tree structure comprises a plurality of nodes, and each node in the plurality of nodes corresponds to a rule or a rule connecting word respectively; according to the rule model of the binary tree structure and the data value distribution of the attributes, determining the preset guessing probability of the attributes under the condition of hit of the rule model and the preset guessing probability of the attributes under the condition of miss of the rule model through recursive calculation; and determining whether the rule model has a safety risk or not according to the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of miss of the rule model.
The memory 803 may be specifically configured to store a corresponding instruction program.
In this embodiment, the network communication port 801 may be a virtual port bound to different communication protocols so as to send or receive different data. For example, the network communication port may be a port responsible for web data communication, a port responsible for FTP data communication, or a port responsible for mail data communication. In addition, the network communication port may also be a physical communication interface or communication chip. For example, it may be a wireless mobile network communication chip, such as a GSM or CDMA chip; it may also be a Wi-Fi chip; it may also be a Bluetooth chip.
In the present embodiment, the processor 802 may be implemented in any suitable manner. For example, the processor may take the form of a microprocessor or processor and a computer-readable medium that stores computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so forth. The present specification is not limited in this respect.
In this embodiment, the memory 803 may exist at multiple levels. In a digital system, anything that can store binary data may serve as a memory; in an integrated circuit, a circuit that has a storage function but no physical form is also called a memory, such as a RAM or a FIFO; in a system, a storage device in physical form is also called a memory, such as a memory module or a TF card.
The present specification further provides a computer storage medium based on the above method for determining the security of the rule model, where the computer storage medium stores computer program instructions that, when executed, implement: acquiring a rule model and data value distribution of attributes; wherein the rule model comprises a rule set, the rule set comprises a plurality of rules, and the rules are connected through rule connecting words; converting the rule model into a binary tree structure according to a preset conversion rule to obtain a rule model of the binary tree structure; the rule model of the binary tree structure comprises a plurality of nodes, and each node in the plurality of nodes corresponds to a rule or a rule connecting word respectively; determining a preset guess probability of the attribute under the condition of hit of the rule model and a preset guess probability of the attribute under the condition of miss of the rule model through recursive calculation according to the rule model of the binary tree structure and the data value distribution of the attribute; and determining whether the rule model has a safety risk or not according to the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of miss of the rule model.
In this embodiment, the storage medium includes, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Cache (Cache), a Hard Disk Drive (HDD), or a Memory Card (Memory Card). The memory may be used to store computer program instructions. The network communication unit may be an interface for performing network connection communication, which is set in accordance with a standard prescribed by a communication protocol.
In this embodiment, the functions and effects specifically realized by the program instructions stored in the computer storage medium may be understood by reference to the other embodiments, and are not described herein again.
Referring to fig. 9, on a software level, the embodiment of the present specification further provides a device for determining security of a rule model, where the device may specifically include the following structural modules.
The obtaining module 901 may be specifically configured to obtain a rule model and data value distribution of an attribute; wherein the rule model comprises a rule set, the rule set comprises a plurality of rules, and the rules are connected through rule connecting words.
The conversion module 902 may be specifically configured to convert the rule model into a binary tree structure according to a preset conversion rule, so as to obtain a rule model with a binary tree structure; the rule model of the binary tree structure comprises a plurality of nodes, and each node in the plurality of nodes corresponds to one rule or one rule connecting word respectively.
The calculating module 903 may be specifically configured to determine, according to the rule model of the binary tree structure and the data value distribution of the attribute, a preset guess probability of the attribute under a condition that the rule model is hit and a preset guess probability of the attribute under a condition that the rule model is not hit through recursive calculation.
The determining module 904 may be specifically configured to determine whether the rule model has a security risk according to the preset guessing probability of the attribute in the case of a hit of the rule model and the preset guessing probability of the attribute in the case of a miss of the rule model.
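The conversion performed by the conversion module 902 can be illustrated with a minimal Python sketch. The specification does not spell out the preset conversion rule in this passage, so the input format (nested tuples of the form `(connective, operand, operand, ...)`), the `Node` class, the function `to_binary_tree`, and the left-folding strategy are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Node:
    """One node of the binary-tree rule model: a rule (leaf) or a connecting word."""
    label: str                      # rule text, or the connecting word "AND" / "OR"
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    @property
    def is_rule(self) -> bool:
        # In this sketch a node without children is a rule.
        return self.left is None and self.right is None


# Hypothetical input format: a rule is a string; a compound expression is a
# tuple (connecting_word, operand, operand, ...) with at least two operands.
Expr = Union[str, tuple]


def to_binary_tree(expr: Expr) -> Node:
    if isinstance(expr, str):
        return Node(expr)
    connective, *operands = expr
    if connective not in ("AND", "OR") or len(operands) < 2:
        raise ValueError(f"unsupported expression: {expr!r}")
    # Left-fold the operands so every connecting-word node has exactly two children.
    tree = to_binary_tree(operands[0])
    for operand in operands[1:]:
        tree = Node(connective, tree, to_binary_tree(operand))
    return tree


model = ("AND", "age > 30", ("OR", "income > 5000", "city == 'Hangzhou'"), "has_loan == 1")
root = to_binary_tree(model)
```

In the resulting tree, every leaf corresponds to one rule and every inner node to one rule connecting word, which matches the node description given for the conversion module.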
In some embodiments, the calculation module 903 may specifically include the following structural units:
the first calculating unit may be specifically configured to recursively calculate, starting from a leaf node and in a direction toward a root node, a hit probability and/or a miss probability of a rule at each node and a hit probability and/or a miss probability of a rule model according to the rule model of the binary tree structure and the data value distribution of the attribute;
the second calculating unit may be specifically configured to recursively calculate, starting from the root node and along a direction away from the root node, a preset guess probability of the attribute in the case of a hit in the rule model and a preset guess probability of the attribute in the case of a miss in the rule model according to the rule model of the binary tree structure and the hit probability and/or the miss probability of the rule model.
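The bottom-up pass of the first calculating unit can be sketched as follows. This is only an illustration: the combination formulas assume that the rules at the leaves are statistically independent and that the connecting words are limited to "AND" and "OR", neither of which is stated in this passage; the names `Node` and `bottom_up` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    label: str                      # rule name, or "AND" / "OR"
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    hit: float = 0.0                # hit probability, filled in by bottom_up


def bottom_up(node: Node, rule_hit: Dict[str, float]) -> float:
    """Compute hit probabilities from the leaves toward the root."""
    if node.left is None and node.right is None:
        node.hit = rule_hit[node.label]          # leaf: probability of the rule itself
    else:
        p_left = bottom_up(node.left, rule_hit)
        p_right = bottom_up(node.right, rule_hit)
        if node.label == "AND":
            node.hit = p_left * p_right          # assumes independent children
        elif node.label == "OR":
            node.hit = p_left + p_right - p_left * p_right
        else:
            raise ValueError(f"unknown connecting word: {node.label}")
    return node.hit


tree = Node("AND", Node("r1"), Node("OR", Node("r2"), Node("r3")))
model_hit = bottom_up(tree, {"r1": 0.5, "r2": 0.2, "r3": 0.5})
model_miss = 1.0 - model_hit
```

The value computed at the root is the hit probability of the rule model as a whole, and the miss probability is its complement.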
In some embodiments, the first calculating unit may be specifically configured to calculate the hit probability and/or the miss probability of the rule at the current node in the following manner: detecting a data type corresponding to a current node; under the condition that the data type corresponding to the current node is determined to be a rule, the rule corresponding to the current node is obtained to serve as the current rule, and attributes are extracted from the current rule to serve as current attributes; determining the data value distribution of the current attribute according to the data value distribution of the attribute; and determining the hit probability and/or the miss probability of the current rule according to the data value distribution of the current attribute, wherein the hit probability and/or the miss probability are used as the hit probability and/or the miss probability of the rule at the current node.
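As a rough illustration of the leaf-node step, the sketch below derives the hit and miss probability of a single rule from the data value distribution of its attribute. The rule representation `(attribute, operator, threshold)`, the dictionary layout of the distributions, and the function name `rule_hit_probability` are assumptions for this example and are not taken from the specification.

```python
import operator
from typing import Dict, Tuple

# Comparison operators a rule may use in this sketch.
OPS = {
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
    "==": operator.eq, "!=": operator.ne,
}


def rule_hit_probability(rule: Tuple[str, str, float],
                         distributions: Dict[str, Dict[float, float]]) -> Tuple[float, float]:
    """Return (hit probability, miss probability) of one rule."""
    attribute, op, threshold = rule            # extract the current attribute from the rule
    distribution = distributions[attribute]    # data value distribution of that attribute
    compare = OPS[op]
    # Sum the probability mass of every value that satisfies the rule.
    hit = sum(p for value, p in distribution.items() if compare(value, threshold))
    return hit, 1.0 - hit


distributions = {
    "age": {20: 0.25, 30: 0.25, 40: 0.25, 50: 0.25},
    "income": {3000: 0.5, 8000: 0.5},
}
hit, miss = rule_hit_probability(("age", ">", 30), distributions)
```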
In some embodiments, the second computing unit is specifically configured to determine, by recursive computation, a conditional hit probability and/or a conditional miss probability of the rule at each node in a case where the rule model is hit and a conditional hit probability and/or a conditional miss probability of the rule at each node in a case where the rule model is missed, based on the rule model of the binary tree structure, the hit probability and/or the miss probability of the rule model, starting from the root node in a direction away from the root node; and determining the preset guess probability of the attribute under the condition of the rule model hit and the preset guess probability of the attribute under the condition of the rule model miss according to the conditional hit probability and/or the conditional miss probability of the rule at each node under the condition of the rule model hit, the conditional hit probability and/or the conditional miss probability of the rule at each node under the condition of the rule model miss and the data value distribution of the attribute.
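The top-down pass can be sketched in Python under stated assumptions. The sketch assumes independent rules, "AND"/"OR" connecting words only, node hit probabilities already filled in by a bottom-up pass, and it interprets the preset guess probability as the largest posterior probability of any single attribute value; the specification may define it differently. All names (`Node`, `child_given_parent`, `top_down`, `guess_probability`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Node:
    label: str                       # rule name, or "AND" / "OR"
    hit: float                       # P(node hit), assumed already computed bottom-up
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    hit_if_model_hit: float = 1.0    # P(node hit | model hit); equals 1 at the root
    hit_if_model_miss: float = 0.0   # P(node hit | model miss); equals 0 at the root


def child_given_parent(parent: Node, child: Node, sibling: Node) -> Tuple[float, float]:
    """Return (P(child hit | parent hit), P(child hit | parent miss))."""
    if parent.label == "AND":
        # AND hits only if both children hit; on a miss, the child can still
        # hit provided the sibling misses.
        miss = 1.0 - parent.hit
        return 1.0, (child.hit * (1.0 - sibling.hit) / miss if miss > 0 else 0.0)
    if parent.label == "OR":
        # OR misses only if both children miss.
        return (child.hit / parent.hit if parent.hit > 0 else 0.0), 0.0
    raise ValueError(f"unknown connecting word: {parent.label}")


def top_down(node: Node) -> None:
    """Propagate conditional hit probabilities from the root toward the leaves."""
    if node.left is None or node.right is None:
        return
    for child, sibling in ((node.left, node.right), (node.right, node.left)):
        given_hit, given_miss = child_given_parent(node, child, sibling)
        # Law of total probability over the parent's outcome.
        child.hit_if_model_hit = (given_hit * node.hit_if_model_hit
                                  + given_miss * (1.0 - node.hit_if_model_hit))
        child.hit_if_model_miss = (given_hit * node.hit_if_model_miss
                                   + given_miss * (1.0 - node.hit_if_model_miss))
        top_down(child)


def guess_probability(dist: Dict[float, float], rule_hits: Callable[[float], bool],
                      p_rule_hit: float, p_rule_hit_given_outcome: float) -> float:
    """Largest posterior probability of any value, given the model outcome."""
    best = 0.0
    for value, p in dist.items():
        if rule_hits(value):
            posterior = p / p_rule_hit * p_rule_hit_given_outcome if p_rule_hit > 0 else 0.0
        else:
            posterior = (p / (1.0 - p_rule_hit) * (1.0 - p_rule_hit_given_outcome)
                         if p_rule_hit < 1 else 0.0)
        best = max(best, posterior)
    return best


r1, r2, r3 = Node("age > 30", 0.5), Node("r2", 0.2), Node("r3", 0.5)
either = Node("OR", 0.6, r2, r3)
root = Node("AND", 0.3, r1, either)
top_down(root)

age_dist = {20: 0.25, 30: 0.25, 40: 0.25, 50: 0.25}
guess_if_hit = guess_probability(age_dist, lambda v: v > 30, r1.hit, r1.hit_if_model_hit)
guess_if_miss = guess_probability(age_dist, lambda v: v > 30, r1.hit, r1.hit_if_model_miss)
```

The recursion relies on the observation that, under the independence assumption, a child's outcome depends on the root's outcome only through its parent's outcome, so each step needs only the parent's two conditional probabilities.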
In some embodiments, the determining module 904 may specifically include the following structural units:
the first determining unit may be specifically configured to determine the leakage indication parameter of the data value of the attribute according to the preset guessing probability of the attribute in the case of hit in the rule model and the preset guessing probability of the attribute in the case of miss in the rule model;
the comparison unit may be specifically configured to compare the leakage indication parameter of the data value of the attribute with a preset probability threshold to obtain a comparison result;
the second determining unit may be specifically configured to determine whether the rule model has a security risk according to the comparison result.
In some embodiments, the first determining unit, when implemented, may be configured to select the largest probability value from among the preset guessing probability of the attribute in the case of a rule model hit and the preset guessing probability of the attribute in the case of a rule model miss, and to use that value as the leakage indication parameter of the data value of the attribute.
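The decision step can be illustrated with a short Python sketch. It takes the larger of the two guessing probabilities as the leakage indication parameter and flags an attribute when that parameter is larger than the threshold. The function names, the input layout, and the example numbers are assumptions chosen for illustration.

```python
from typing import Dict, Tuple


def leakage_parameter(guess_if_hit: float, guess_if_miss: float) -> float:
    """Leakage indication parameter: the larger of the two guessing probabilities."""
    return max(guess_if_hit, guess_if_miss)


def assess(guesses: Dict[str, Tuple[float, float]],
           threshold: float) -> Tuple[bool, Dict[str, float]]:
    """Return (model has security risk, risk attributes with their parameters).

    guesses maps an attribute name to
    (guessing probability if the model hits, guessing probability if it misses).
    """
    risky = {}
    for attribute, (if_hit, if_miss) in guesses.items():
        parameter = leakage_parameter(if_hit, if_miss)
        if parameter > threshold:        # strictly larger than the preset threshold
            risky[attribute] = parameter
    return bool(risky), risky


has_risk, risk_attributes = assess(
    {"age": (0.5, 0.36), "income": (0.9, 0.3)},
    threshold=0.8,
)
```

In this example only `income` exceeds the hypothetical threshold of 0.8, so the model would be reported as having a security risk and `income` as the risk attribute.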
It should be noted that, the units, devices, modules, etc. illustrated in the above embodiments may be implemented by a computer chip or an entity, or implemented by a product with certain functions. For convenience of description, the above devices are described as being divided into various modules by functions, which are described separately. It is to be understood that, in implementing the present specification, functions of each module may be implemented in one or more pieces of software and/or hardware, or a module that implements the same function may be implemented by a combination of a plurality of sub-modules or sub-units, or the like. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
As can be seen from the above, the device for determining the security of the rule model provided in the embodiment of the present specification converts the rule model into the binary tree structure, and can then efficiently and accurately determine the security of the rule model by using the structural characteristics of the binary tree, thereby reducing the risk of data leakage caused by a data provider running an unsafe rule model.
Although the present specification provides method steps as described in the examples or flowcharts, additional or fewer steps may be included based on conventional or non-inventive means. The order of steps recited in the embodiments is merely one manner of performing the steps in a multitude of orders and does not represent the only order of execution. When an apparatus or client product in practice executes, it may execute sequentially or in parallel (e.g., in a parallel processor or multithreaded processing environment, or even in a distributed data processing environment) according to the embodiments or methods shown in the figures. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, the presence of additional identical or equivalent elements in a process, method, article, or apparatus that comprises the recited elements is not excluded. The terms first, second, etc. are used to denote names, but not any particular order.
Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer readable program code, the same functionality can be implemented by logically programming method steps such that the controller is in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a controller may therefore be considered as a hardware component, and the means included therein for performing the various functions may also be considered as a structure within the hardware component. Or even means for performing the functions may be conceived to be both a software module implementing the method and a structure within a hardware component.
This description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, classes, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
From the above description of the embodiments, it is clear to those skilled in the art that the present specification can be implemented by software plus necessary general hardware platform. With this understanding, the technical solutions in the present specification may be essentially embodied in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a mobile terminal, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments in the present specification.
The embodiments in the present specification are described in a progressive manner, and the same or similar parts in the embodiments are referred to each other, and each embodiment focuses on differences from other embodiments. The description is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable electronic devices, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
While the specification has been described with examples, those skilled in the art will appreciate that there are numerous variations and permutations of the specification that do not depart from the spirit of the specification, and it is intended that the appended claims include such variations and modifications that do not depart from the spirit of the specification.

Claims (20)

1. A method of determining security of a rule model, comprising:
acquiring a rule model and data value distribution of attributes; wherein the rule model comprises a rule set, the rule set comprises a plurality of rules, and the rules are connected through rule connecting words;
converting the rule model into a binary tree structure according to a preset conversion rule to obtain a rule model of the binary tree structure; the rule model of the binary tree structure comprises a plurality of nodes, and each node in the plurality of nodes corresponds to a rule or a rule connecting word respectively;
according to the rule model of the binary tree structure and the data value distribution of the attributes, determining the preset guessing probability of the attributes under the condition of hit of the rule model and the preset guessing probability of the attributes under the condition of miss of the rule model through recursive calculation; the method comprises the following steps: according to the rule model of the binary tree structure and the data value distribution of the attributes, starting from leaf nodes and along the direction to a root node, recursively calculating the hit probability and/or miss probability of the rule at each node and the hit probability and/or miss probability of the rule model; according to the rule model of the binary tree structure and the hit probability and/or miss probability of the rule model, starting from the root node and along the direction far away from the root node, recursively calculating the preset guess probability of the attribute under the condition of hit of the rule model and the preset guess probability of the attribute under the condition of miss of the rule model;
and determining whether the rule model has a safety risk or not according to the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of miss of the rule model.
2. The method according to claim 1, wherein recursively calculating hit probabilities and/or miss probabilities of rules at respective nodes starting from leaf nodes in a direction towards a root node according to the rule model of the binary tree structure and the data value distribution of the attributes comprises: the hit probability and/or miss probability of a rule at the current node is calculated as follows:
detecting a data type corresponding to a current node;
under the condition that the data type corresponding to the current node is determined to be a rule, the rule corresponding to the current node is obtained to serve as the current rule, and attributes are extracted from the current rule to serve as current attributes;
determining the data value distribution of the current attribute according to the data value distribution of the attribute;
and determining the hit probability and/or the miss probability of the current rule according to the data value distribution of the current attribute, wherein the hit probability and/or the miss probability are used as the hit probability and/or the miss probability of the rule at the current node.
3. The method of claim 2, after detecting the data type corresponding to the current node, the method further comprising:
under the condition that the data type corresponding to the current node is determined to be a rule connecting word, acquiring the hit probability and/or miss probability of a rule at a node which is connected with the current node and is far away from the root node;
and determining the hit probability and/or miss probability of the rule at the current node according to the hit probability and/or miss probability of the rule at the node which is connected with the current node and is far away from the root node.
4. The method of claim 3, determining a hit probability and/or a miss probability of a rule model, comprising:
obtaining hit probability and/or miss probability of rules at two nodes connected with a root node;
and determining the hit probability and/or the miss probability of the rule at the root node as the hit probability and/or the miss probability of the rule model according to the rule connecting words corresponding to the root node and the hit probability and/or the miss probability of the rule at the two nodes connected with the root node.
5. The method according to claim 1, wherein the recursively calculating, starting from the root node in a direction away from the root node, a preset guess probability for the attribute in case of a rule model hit and a preset guess probability for the attribute in case of a rule model miss, based on the rule model of the binary tree structure, the hit probability and/or the miss probability of the rule model, comprises:
according to the rule model of the binary tree structure and the hit probability and/or miss probability of the rule model, determining the conditional hit probability and/or conditional miss probability of the rule at each node under the condition of hit of the rule model and the conditional hit probability and/or conditional miss probability of the rule at each node under the condition of miss of the rule model by recursive computation starting from the root node and along the direction far away from the root node;
and determining the preset guess probability of the attribute under the condition of the rule model hit and the preset guess probability of the attribute under the condition of the rule model miss according to the conditional hit probability and/or the conditional miss probability of the rule at each node under the condition of the rule model hit, the conditional hit probability and/or the conditional miss probability of the rule at each node under the condition of the rule model miss and the data value distribution of the attribute.
6. The method of claim 1, determining whether the rule model has a security risk based on a preset probability of guessing for the attribute in the case of a rule model hit and a preset probability of guessing for the attribute in the case of a rule model miss, comprising:
determining leakage indication parameters of data values of the attributes according to the preset guessing probability of the attributes under the condition of hit of the rule model and the preset guessing probability of the attributes under the condition of miss of the rule model;
comparing the leakage indication parameter of the data value of the attribute with a preset probability threshold value to obtain a comparison result;
and determining whether the rule model has a safety risk or not according to the comparison result.
7. The method of claim 6, said determining whether the rule model is at risk for security based on the comparison, comprising:
and determining that the rule model has a safety risk under the condition that the leakage indication parameter of the data value of the attribute is determined to be larger than the preset probability threshold according to the comparison result.
8. The method of claim 7, where it is determined that the rule model is at security risk, further comprising:
generating risk prompt information; and the risk prompt information is used for prompting a data provider to refuse to run the rule model.
9. The method of claim 7, further comprising:
and according to the comparison result, determining an attribute whose data value has a leakage indication parameter larger than the preset probability threshold value as a risk attribute with leakage risk.
10. A device for determining security of a rule model, comprising:
the acquisition module is used for acquiring the rule model and the data value distribution of the attribute; the rule model comprises a rule set, wherein the rule set comprises a plurality of rules which are connected through rule connecting words;
the conversion module is used for converting the rule model into a binary tree structure according to a preset conversion rule to obtain a rule model of the binary tree structure; the rule model of the binary tree structure comprises a plurality of nodes, and each node in the plurality of nodes corresponds to a rule or a rule connecting word respectively;
the calculation module is used for determining the preset guessing probability of the attribute under the condition of the hit of the rule model and the preset guessing probability of the attribute under the condition of the miss of the rule model through recursive calculation according to the rule model of the binary tree structure and the data value distribution of the attribute; the calculation module comprises: a first calculating unit, configured to recursively calculate, starting from a leaf node and in a direction toward a root node, a hit probability and/or a miss probability of a rule at each node and a hit probability and/or a miss probability of a rule model according to the rule model of the binary tree structure and the data value distribution of the attribute; a second calculating unit, configured to recursively calculate, starting from the root node and in a direction away from the root node, the preset guess probability of the attribute under the condition of the rule model hit and the preset guess probability of the attribute under the condition of the rule model miss according to the rule model of the binary tree structure and the hit probability and/or miss probability of the rule model;
and the determining module is used for determining whether the rule model has a safety risk or not according to the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of miss of the rule model.
11. The apparatus according to claim 10, wherein the first computing unit is specifically configured to compute the hit probability and/or the miss probability of the rule at the current node in the following manner: detecting a data type corresponding to a current node; under the condition that the data type corresponding to the current node is determined to be a rule, the rule corresponding to the current node is obtained to serve as the current rule, and attributes are extracted from the current rule to serve as current attributes; determining the data value distribution of the current attribute according to the data value distribution of the attribute; and determining the hit probability and/or the miss probability of the current rule according to the data value distribution of the current attribute, wherein the hit probability and/or the miss probability are used as the hit probability and/or the miss probability of the rule at the current node.
12. The apparatus according to claim 10, wherein the second computing unit is specifically configured to determine, based on the rule model of the binary tree structure, the hit probability and/or the miss probability of the rule model, from the root node, through recursive computation in a direction away from the root node, a conditional hit probability and/or a conditional miss probability of the rule at each node in case of a hit in the rule model, and a conditional hit probability and/or a conditional miss probability of the rule at each node in case of a miss in the rule model; and determining the preset guess probability of the attribute under the condition of the rule model hit and the preset guess probability of the attribute under the condition of the rule model miss according to the conditional hit probability and/or the conditional miss probability of the rule at each node under the condition of the rule model hit, the conditional hit probability and/or the conditional miss probability of the rule at each node under the condition of the rule model miss and the data value distribution of the attribute.
13. The apparatus of claim 10, the determining module comprising:
the first determining unit is used for determining leakage indication parameters of the data values of the attributes according to the preset guessing probability of the attributes under the condition of hit of the rule model and the preset guessing probability of the attributes under the condition of miss of the rule model;
the comparison unit is used for comparing the leakage indication parameters of the data values of the attributes with a preset probability threshold value to obtain a comparison result;
and the second determining unit is used for determining whether the rule model has a safety risk or not according to the comparison result.
14. A method of determining security of a rule model, comprising:
acquiring a rule model and data value distribution of attributes; wherein the rule model comprises a rule set, the rule set comprises a plurality of rules, and the rules are connected through rule connecting words;
converting the rule model into a preset tree structure according to a preset conversion rule to obtain a rule model of the preset tree structure; the rule model of the preset tree structure comprises a plurality of nodes, and each node in the plurality of nodes corresponds to a rule or a rule connecting word respectively;
according to the rule model of the preset tree structure and the data value distribution of the attributes, determining the preset guessing probability of the attributes under the condition of hit of the rule model and the preset guessing probability of the attributes under the condition of miss of the rule model through recursive calculation; the method comprises the following steps: according to the rule model of the preset tree structure and the data value distribution of the attributes, starting from a leaf node and along the direction to a root node, recursively calculating the hit probability and/or miss probability of the rule at each node and the hit probability and/or miss probability of the rule model; according to the rule model of the preset tree structure and the hit probability and/or miss probability of the rule model, starting from the root node and along the direction far away from the root node, recursively calculating the preset guess probability of the attribute under the condition of hit of the rule model and the preset guess probability of the attribute under the condition of miss of the rule model;
and determining whether the rule model has a safety risk or not according to the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of miss of the rule model.
15. The method of claim 14, the predetermined tree structure comprising a multi-way tree.
16. A method of determining security of a rule model, comprising:
acquiring a rule model and data value distribution of attributes; the rule model comprises a rule set, wherein the rule set comprises a plurality of rules which are connected through rule connecting words;
determining a preset guess probability of the attribute under the condition of hit of the rule model and a preset guess probability of the attribute under the condition of miss of the rule model according to the rule model and the data value distribution of the attribute; the method comprises the following steps: according to the rule model and the data value distribution of the attributes, determining the hit probability and/or miss probability of each rule in the rule model and the hit probability and/or miss probability of the rule model; calculating the preset guess probability of the attribute in the rule under the condition of the hit of the rule model and the preset guess probability of the attribute under the condition of the miss of the rule model according to the hit probability and/or the miss probability of the rule model and the hit probability and/or the miss probability of each rule;
and determining whether the rule model has a safety risk or not according to the preset guessing probability of the attribute under the condition of hit of the rule model and the preset guessing probability of the attribute under the condition of miss of the rule model.
17. The method of claim 16, determining a hit probability and/or a miss probability for each rule in the rule model based on the rule model and the data value distribution of the attributes, comprising: the hit probability and/or miss probability of the current rule is calculated as follows:
extracting attributes from the current rule as current attributes;
determining the data value distribution of the current attribute according to the data value distribution of the attribute;
and determining the hit probability and/or miss probability of the current rule according to the data value distribution of the current attribute.
18. The method of claim 17, determining a hit probability and/or a miss probability of a rule model, comprising:
determining rule connecting words among rules in the rule model;
and calculating the hit probability and/or the miss probability of the rule model according to the rule connecting words and the hit probability and/or the miss probability of the rules connected by the rule connecting words.
19. The method of claim 18, calculating a preset guess probability for an attribute in a rule in case of a rule model hit and a preset guess probability for an attribute in case of a rule model miss based on hit probabilities and/or miss probabilities of the rule model and hit probabilities and/or miss probabilities of the respective rules, comprising:
according to rule connecting words among the rules in the rule model and the hit probability and/or miss probability of the rule model, determining the conditional hit probability and/or conditional miss probability of each rule under the condition that the rule model is hit, and the conditional hit probability and/or conditional miss probability of each rule under the condition that the rule model is not hit;
and determining the preset guess probability of the attribute under the condition of the rule model hit and the preset guess probability of the attribute under the condition of the rule model miss according to the conditional hit probability and/or the conditional miss probability of each rule under the condition of the rule model hit, the conditional hit probability and/or the conditional miss probability of each rule under the condition of the rule model miss and the data value distribution of the attribute.
20. A server comprising a processor and a memory for storing processor-executable instructions which, when executed by the processor, implement the steps of the method of any one of claims 1 to 9, or 14 to 15, or 16 to 19.
CN202010908614.5A 2020-09-02 2020-09-02 Method and device for determining safety of rule model and server Active CN112085589B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010908614.5A CN112085589B (en) 2020-09-02 2020-09-02 Method and device for determining safety of rule model and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010908614.5A CN112085589B (en) 2020-09-02 2020-09-02 Method and device for determining safety of rule model and server

Publications (2)

Publication Number Publication Date
CN112085589A CN112085589A (en) 2020-12-15
CN112085589B true CN112085589B (en) 2022-11-22

Family

ID=73732474

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010908614.5A Active CN112085589B (en) 2020-09-02 2020-09-02 Method and device for determining safety of rule model and server

Country Status (1)

Country Link
CN (1) CN112085589B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112257098B (en) * 2020-12-21 2021-03-12 蚂蚁智信(杭州)信息技术有限公司 Method and device for determining safety of rule model

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103559422B (en) * 2013-11-25 2017-04-19 中国航空综合技术研究所 Safety probability risk assessment method for multi-failure-mode correlation system
CN107623697B (en) * 2017-10-11 2020-07-14 北京邮电大学 Network security situation assessment method based on attack and defense random game model
CN110728290B (en) * 2018-07-17 2020-07-31 阿里巴巴集团控股有限公司 Method and device for detecting security of data model
CN109558520A (en) * 2018-11-28 2019-04-02 平安科技(深圳)有限公司 A kind of data processing method and device based on user's portrait
CN111461874A (en) * 2020-04-13 2020-07-28 浙江大学 Credit risk control system and method based on federal mode
CN111539021A (en) * 2020-04-26 2020-08-14 支付宝(杭州)信息技术有限公司 Data privacy type identification method, device and equipment
CN111490995A (en) * 2020-06-12 2020-08-04 支付宝(杭州)信息技术有限公司 Model training method and device for protecting privacy, data processing method and server

Also Published As

Publication number Publication date
CN112085589A (en) 2020-12-15

Similar Documents

Publication Publication Date Title
CN113489713B (en) Network attack detection method, device, equipment and storage medium
CN108256322B (en) Security testing method and device, computer equipment and storage medium
CN110851872A (en) Risk assessment method and device for private data leakage
CN114885334B (en) High-concurrency short message processing method
CN109145651B (en) Data processing method and device
CN111010387B (en) Illegal replacement detection method, device, equipment and medium for Internet of things equipment
CN112085588B (en) Method and device for determining safety of rule model and data processing method
CN112085589B (en) Method and device for determining safety of rule model and server
CN112613893A (en) Method, system, equipment and medium for identifying malicious user registration
CN109905366A (en) Terminal device safe verification method, device, readable storage medium storing program for executing and terminal device
CN113783876A (en) Network security situation perception method based on graph neural network and related equipment
CN109413108A (en) A kind of WAF detection method and system based on safety
CN113051571B (en) Method and device for detecting false alarm vulnerability and computer equipment
CN111176567B (en) Storage supply verification method and device for distributed cloud storage
CN112085590B (en) Method and device for determining safety of rule model and server
CN114205816B (en) Electric power mobile internet of things information security architecture and application method thereof
CN115567316A (en) Method and device for detecting abnormality of access data
CN112257098B (en) Method and device for determining safety of rule model
CN111753295B (en) Vulnerability exploitation program detection method based on vulnerability exploitation program characteristics
CN112085369B (en) Safety detection method, device, equipment and system of rule model
CN110532758B (en) Risk identification method and device for group
CN113256256A (en) Work order early warning method, device, equipment and storage medium
CN112182592B (en) Method and device for determining and processing safety of rule set
CN117707653B (en) Parameter monitoring method, device, electronic equipment and computer readable storage medium
CN116915463B (en) Call chain data security analysis method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant