CN112084531A - Data sensitivity grading method, device, equipment and storage medium - Google Patents

Data sensitivity grading method, device, equipment and storage medium

Info

Publication number
CN112084531A
CN112084531A CN202010950325.1A
Authority
CN
China
Prior art keywords
data
knowledge
sensitivity
privacy
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010950325.1A
Other languages
Chinese (zh)
Other versions
CN112084531B (en)
Inventor
李冰
沈俊青
陆克贤
江易
赵尚上
王魁
俞山青
翁漂洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Chinaoly Technology Co ltd
Original Assignee
Hangzhou Chinaoly Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Chinaoly Technology Co ltd filed Critical Hangzhou Chinaoly Technology Co ltd
Priority to CN202010950325.1A priority Critical patent/CN112084531B/en
Publication of CN112084531A publication Critical patent/CN112084531A/en
Application granted granted Critical
Publication of CN112084531B publication Critical patent/CN112084531B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a data sensitivity grading method, apparatus, device, and storage medium, and relates to the technical field of data processing. The method comprises the following steps: performing knowledge extraction on acquired data to be evaluated to obtain at least one knowledge data; determining the node corresponding to each knowledge data in a privacy inference probability tree according to a preset relationship algorithm; obtaining a sensitivity rating corresponding to each knowledge data according to the privacy inference probability tree, the node corresponding to each knowledge data in the tree, and a preset first calculation formula; and obtaining the sensitivity rating of the data to be evaluated according to the sensitivity rating corresponding to each knowledge data and a preset second calculation formula. Compared with the prior art, the method avoids overlooking data that appear non-private but from which private data can be inferred, thereby further enhancing the confidentiality of the data.

Description

Data sensitivity grading method, device, equipment and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for grading data sensitivity.
Background
With the continuous development of big data technology, people have a stronger awareness of protecting private data. Every enterprise or individual has corresponding private data, which can be leaked intentionally, exposed unintentionally through an untrusted third party, lost, and so on; once exposed, the private data harms the interests of the corresponding user.
In order to avoid the above problems, in the prior art, the privacy data is generally encrypted in a data encryption manner, so that the privacy of the user is protected, and the privacy data is prevented from being revealed.
However, a large number of data mining and knowledge inference algorithms now exist, so a user's private data can be obtained by mining and reasoning over seemingly low-value data; the user's privacy can therefore still be leaked through such indirectly sensitive data.
Disclosure of Invention
The present application aims to provide a data sensitivity grading method, apparatus, device, and storage medium, so as to solve the problem in the prior art that seemingly low-value data is not graded for sensitivity, making it possible to obtain a user's private data by mining and reasoning over that data and thereby leak the user's privacy.
In order to achieve the above purpose, the technical solutions adopted in the embodiments of the present application are as follows:
in a first aspect, an embodiment of the present application provides a data sensitivity grading method, where the method includes:
performing knowledge extraction on the acquired data to be evaluated to acquire at least one knowledge data;
determining a node corresponding to each knowledge data in a privacy inference probability tree according to a preset relation algorithm, wherein the privacy inference probability tree comprises: nodes corresponding to different knowledge data and reasoning relation among the nodes;
acquiring a sensitivity rating corresponding to each knowledge data according to the privacy inference probability tree, a node corresponding to each knowledge data in the privacy inference probability tree and a preset first calculation formula;
and acquiring the sensitivity rating of the data to be evaluated according to the sensitivity rating corresponding to each knowledge data and a preset second calculation formula.
Optionally, before extracting knowledge from the acquired data to be evaluated and acquiring at least one piece of knowledge data, the method further includes:
acquiring each private data in a preset knowledge network, other data related to each private data and a deduction relation between each other data and each private data;
and constructing the privacy inference probability tree according to each privacy data in the preset knowledge network, other data related to each privacy data and a deduction relation between each other data and each privacy data, wherein a root node of the privacy inference probability tree is a node corresponding to the privacy data.
Optionally, the knowledge data comprises: entity data, relationship data, and event data.
Optionally, the privacy inference probability tree further comprises: the probability of inferring the root node from each node;
the acquiring the sensitivity rating corresponding to each knowledge data according to the privacy inference probability tree, the node corresponding to each knowledge data in the privacy inference probability tree, and a preset first calculation formula includes:
according to the formula

S_i = Σ_{j=1}^{K} P_j(i)

calculating a sensitivity score S_i corresponding to the knowledge data i, wherein: K is the number of all private data in the preset knowledge network, and P_j(i) represents the cumulative probability corresponding to all occurrences of the knowledge data i in the j-th privacy inference probability tree;
and determining the sensitivity rating corresponding to the knowledge data i according to the sensitivity score S_i corresponding to the knowledge data i and a preset rating rule.
Optionally, the obtaining the sensitivity rating of the data to be evaluated according to the sensitivity rating corresponding to each piece of knowledge data and a preset second calculation formula includes:
according to the formula

S = α · Σ_{i=1}^{L} S_i

calculating the sensitivity score S of the data to be evaluated, wherein L is the number of knowledge data contained in the data to be evaluated, α is a weight coefficient, and S_i is the sensitivity score corresponding to the knowledge data i;
and determining the sensitivity rating corresponding to the data to be evaluated according to the sensitivity rating of the data to be evaluated and a preset rating rule.
Optionally, the method further comprises:
and encrypting the knowledge data meeting the preset conditions according to the sensitivity rating corresponding to each knowledge data, the sensitivity rating of the data to be evaluated and a preset encryption rule.
Optionally, the preset relationship algorithm includes at least one of: similarity algorithms or relational inference algorithms.
In a second aspect, another embodiment of the present application provides a data sensitivity grading apparatus, including: an acquisition module and a determination module, wherein:
the acquisition module is used for extracting knowledge of the acquired data to be evaluated to acquire at least one piece of knowledge data;
the determining module is configured to determine, according to a preset relationship algorithm, a node corresponding to each piece of knowledge data in a privacy inference probability tree, where the privacy inference probability tree includes: nodes corresponding to different knowledge data and reasoning relation among the nodes;
the obtaining module is specifically configured to obtain a sensitivity rating corresponding to each piece of knowledge data according to the privacy inference probability tree, a node corresponding to each piece of knowledge data in the privacy inference probability tree, and a preset first calculation formula; and acquiring the sensitivity rating of the data to be evaluated according to the sensitivity rating corresponding to each knowledge data and a preset second calculation formula.
Optionally, the apparatus further comprises: building a module, wherein:
the acquisition module is specifically configured to acquire each private data in a preset knowledge network, other data related to each private data, and a derivation relationship between each other data and each private data;
the building module is configured to build the privacy inference probability tree according to each piece of privacy data in the preset knowledge network, other data related to each piece of privacy data, and a derivation relationship between each piece of other data and each piece of privacy data, where a root node of the privacy inference probability tree is a node corresponding to the privacy data.
Optionally, the apparatus further comprises: a calculation module, configured to calculate, according to the formula

S_i = Σ_{j=1}^{K} P_j(i),

a sensitivity score S_i corresponding to the knowledge data i, wherein: K is the number of all private data in the preset knowledge network, and P_j(i) represents the cumulative probability corresponding to all occurrences of the knowledge data i in the j-th privacy inference probability tree;
the determining module is specifically configured to determine the sensitivity rating corresponding to the knowledge data i according to the sensitivity score S_i corresponding to the knowledge data i and a preset rating rule.
Optionally, the calculation module is specifically configured to calculate, according to the formula

S = α · Σ_{i=1}^{L} S_i,

the sensitivity score S of the data to be evaluated, wherein L is the number of knowledge data contained in the data to be evaluated, α is a weight coefficient, and S_i is the sensitivity score corresponding to the knowledge data i;
the determining module is specifically configured to determine a sensitivity rating corresponding to the data to be evaluated according to the sensitivity rating of the data to be evaluated and a preset rating rule.
Optionally, the apparatus further comprises: and the encryption module is used for encrypting the knowledge data meeting the preset conditions according to the sensitivity rating corresponding to each knowledge data, the sensitivity rating of the data to be evaluated and a preset encryption rule.
In a third aspect, another embodiment of the present application provides a data sensitivity classification device, including: a processor, a storage medium and a bus, wherein the storage medium stores machine-readable instructions executable by the processor, when the data sensitivity grading device runs, the processor communicates with the storage medium through the bus, and the processor executes the machine-readable instructions to execute the steps of the method according to any one of the first aspect.
In a fourth aspect, another embodiment of the present application provides a storage medium having a computer program stored thereon, where the computer program is executed by a processor to perform the steps of the method according to any one of the above first aspects.
By adopting the data sensitivity grading method provided by the application, after the data to be evaluated is obtained, the node corresponding to each knowledge data of the data to be evaluated can be determined in the privacy inference probability tree according to the preset relationship algorithm, and the sensitivity rating corresponding to each knowledge data is obtained according to the privacy inference probability tree, the corresponding nodes, and the preset first calculation formula; the sensitivity rating of the data to be evaluated is then obtained according to the sensitivity rating corresponding to each knowledge data and the preset second calculation formula. This determines whether the current data to be evaluated is, or contains, data from which private data can be inferred, so that data that may cause privacy leakage can be identified by its sensitivity rating. Data that appear non-private are thus not overlooked, and the confidentiality of the data is further enhanced.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
Fig. 1 is a schematic flowchart of a data sensitivity grading method according to an embodiment of the present application;
FIG. 2 is a schematic flow chart illustrating a data sensitivity grading method according to another embodiment of the present application;
fig. 3 is a schematic structural diagram of private data stored in a predetermined knowledge network according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a privacy inference tree according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a privacy inference probability tree according to an embodiment of the present application;
FIG. 6 is a flowchart illustrating a data sensitivity grading method according to another embodiment of the present application;
FIG. 7 is a flowchart illustrating a data sensitivity grading method according to another embodiment of the present application;
FIG. 8 is a schematic structural diagram of a data sensitivity grading apparatus according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a data sensitivity grading apparatus according to another embodiment of the present application;
fig. 10 is a schematic structural diagram of a data sensitivity grading device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments.
The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
Additionally, the flowcharts used in this application illustrate operations implemented according to some embodiments of the present application. It should be understood that the operations of the flow diagrams may be performed out of order, and steps without logical context may be performed in reverse order or simultaneously. One skilled in the art, under the guidance of this application, may add one or more other operations to, or remove one or more operations from, the flowchart.
The data sensitivity grading method provided by the embodiment of the present application is explained below with reference to a plurality of specific application examples. Fig. 1 is a schematic flow chart of a data sensitivity grading method according to an embodiment of the present application, as shown in fig. 1, the method includes:
s101: and extracting knowledge from the acquired data to be evaluated to acquire at least one piece of knowledge data.
The knowledge extraction is a process of extracting each knowledge point in the data to be evaluated through identification, understanding, screening and formatting. The above process can also be regarded as a process of semantically labeling unstructured data to be evaluated.
Optionally, the data to be evaluated may be data in table, log, or any other text form from which knowledge can be extracted. It may be obtained from a database or acquired in real time, and may correspond to a company user or to an individual user. The source and form of the data to be evaluated are not limited in this application and can be flexibly adjusted according to the user's needs.
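By way of illustration only, the knowledge extraction of step S101 may be sketched as follows; the lexicons, the example names, and the rule that a matched relation yields an event are hypothetical stand-ins for the trained entity/relation-extraction models a real system would use.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Knowledge:
    kind: str    # "entity", "relation", or "event"
    value: str

# Hypothetical lexicons standing in for trained extraction models.
ENTITY_LEXICON = {"Li Hua", "Hotel A"}
RELATION_LEXICON = {"stayed in"}

def extract_knowledge(text: str) -> list[Knowledge]:
    """Extract entity/relation/event knowledge items from one record."""
    items = [Knowledge("entity", e) for e in sorted(ENTITY_LEXICON) if e in text]
    items += [Knowledge("relation", r) for r in sorted(RELATION_LEXICON) if r in text]
    if any(k.kind == "relation" for k in items):
        # treat the full sentence as the extracted event
        items.append(Knowledge("event", text))
    return items
```

Each extracted item then becomes a candidate for matching against nodes of the privacy inference probability trees in the next step.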
S102: and determining the corresponding nodes of the knowledge data in the privacy inference probability tree according to a preset relation algorithm.
Wherein the privacy inference probability tree comprises: nodes corresponding to different knowledge data and the inference relationships among the nodes. There may be one or more privacy inference probability trees. Each knowledge data corresponding to the data to be evaluated may have corresponding nodes in several privacy inference probability trees, or several corresponding nodes within one privacy inference probability tree; the nodes corresponding to different knowledge data may lie in the same privacy inference probability tree or be dispersed across different ones. The specific correspondence between knowledge data and privacy inference probability trees is determined by the actual situation and is not limited to the cases given in the above embodiments.
For example, in some possible embodiments, the preset relationship algorithm is an algorithm preset by the user and may include at least one of a similarity algorithm or a relational inference algorithm; the specific choice of preset relationship algorithm can be flexibly adjusted according to the user's needs, and the application is not limited herein.
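A minimal sketch of such a preset relationship algorithm, using the standard-library difflib string similarity as a stand-in for the similarity algorithm (the threshold and node labels are assumptions; a relational-inference algorithm could be substituted):

```python
import difflib

def match_nodes(knowledge_value: str, tree_node_labels, threshold: float = 0.6):
    """Return the tree nodes whose labels are sufficiently similar to the
    extracted knowledge value; these are the candidate corresponding nodes."""
    return [label for label in tree_node_labels
            if difflib.SequenceMatcher(None, knowledge_value, label).ratio() >= threshold]
```

A knowledge item that matches no node in any tree contributes nothing to the later score and is therefore insensitive under this sketch.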
S103: and acquiring the sensitivity rating corresponding to each knowledge data according to the privacy inference probability tree, the node corresponding to each knowledge data in the privacy inference probability tree and a preset first calculation formula.
Optionally, in one embodiment of the present application, the sensitivity rating may be determined as follows: the higher the probability that the private data can be inferred from a piece of knowledge data, the higher its sensitivity rating; conversely, the lower that probability, the lower its sensitivity rating. Knowledge data whose score is below a preset insensitivity threshold is insensitive data. Some or all of the data may subsequently be encrypted according to the sensitivity rating, and the like, without limitation.
S104: and acquiring the sensitivity rating of the data to be evaluated according to the sensitivity rating corresponding to each knowledge data and a preset second calculation formula.
The method for grading the sensitivity of the data to be evaluated is similar to the method for grading the sensitivity of the knowledge data, and is not described herein again, that is, the data to be evaluated is graded as a whole.
Optionally, in an embodiment of the application, the data to be evaluated may be data that is about to be disclosed. Evaluating and grading the data uniformly before disclosure avoids the problem that, once disclosed, data containing sensitive data may cause user privacy leakage under an inference algorithm.
By adopting the data sensitivity grading method provided by the application, after the data to be evaluated is obtained, the node corresponding to each knowledge data of the data to be evaluated can be determined in the privacy inference probability tree according to the preset relationship algorithm, and the sensitivity rating corresponding to each knowledge data is obtained according to the privacy inference probability tree, the corresponding nodes, and the preset first calculation formula; the sensitivity rating of the data to be evaluated is then obtained according to the sensitivity rating corresponding to each knowledge data and the preset second calculation formula. This determines whether the current data to be evaluated is, or contains, data from which private data can be inferred, so that data that may cause privacy leakage can be identified by its sensitivity rating. Data that appear non-private are thus not overlooked, and the confidentiality of the data is further enhanced.
Optionally, on the basis of the above embodiments, the embodiments of the present application may further provide a data sensitivity grading method, and an implementation process of the above method is exemplified with reference to the following drawings. Fig. 2 is a schematic flowchart of a data sensitivity classification method according to another embodiment of the present application, and fig. 3 is a schematic structural diagram of private data stored in a predetermined knowledge network according to an embodiment of the present application; FIG. 4 is a schematic structural diagram of a privacy inference tree according to an embodiment of the present application; fig. 5 is a schematic structural diagram of a privacy probabilistic inference tree according to an embodiment of the present application, as shown in fig. 2, before S101, the method further includes:
s105: and acquiring each private data, other data related to each private data and a derivation relation between each other data and each private data in a preset knowledge network.
S106: and constructing a privacy inference probability tree according to each privacy data in the preset knowledge network, other data related to each privacy data and the deduction relationship between each other data and each privacy data.
Each piece of privacy data has a corresponding privacy inference probability tree, and a root node in the privacy inference probability tree is a node corresponding to the privacy data.
As shown in fig. 3, the preset knowledge network is a knowledge network pre-constructed based on internal existing knowledge, information, data, and the like, and serves as the knowledge database for evaluating the sensitivity level of subsequent data to be evaluated. The preset knowledge network comprises private data that need to be hidden or encrypted, for example identity card numbers, bank card numbers, and mobile phone numbers, and each private datum has a corresponding private-data label used to subsequently construct the privacy inference tree corresponding to that private datum. A connecting line between data nodes indicates that one node can be inferred from the other with a certain inference probability. As shown in fig. 3, each data node in the preset knowledge network contains the corresponding knowledge data; for example, the private data n1, the other data n2, n3, n4, n5, n6 related to n1, and the connection relationships between n2 to n6 and n1 in the current preset knowledge network are as shown in fig. 3.
For each private data node, that node may be used as the root node to construct the privacy inference tree corresponding to the private datum. For example, according to the connections between knowledge nodes in fig. 3, an inference structure diverging from the private data n1 as the root node is arranged; each parent node can be obtained by inference from its child nodes, and the inference probability is shown on the connecting line, as in fig. 4: the knowledge data nodes n2, n3, and n4 are each connected to n1, so the private data n1 can be inferred from n2, n3, and n4 with probabilities p11, p12, and p13 respectively; similarly, the knowledge data node n2 can be inferred from the knowledge data nodes n3 and n4 with inference probabilities p21 and p22; the knowledge data node n3 can be inferred from the knowledge data node n2 with inference probability p23; and the knowledge data node n4 can be inferred from the knowledge data nodes n5 and n6 with inference probabilities p24 and p25. The privacy inference tree can be constructed from these inference probabilities and inference relationships.
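The tree construction described above can be sketched as follows. The edge-list input format and the breadth-first construction are illustrative assumptions; in the patent's figures a knowledge node may occupy several positions in one tree, which this minimal sketch (first discovery wins) does not reproduce.

```python
from collections import defaultdict

def build_inference_tree(edges, root):
    """edges: (src, dst, p) triples from the preset knowledge network, meaning
    dst can be inferred from src with probability p. Returns the privacy
    inference tree rooted at the private datum as {child: (parent, p)}."""
    inferrers = defaultdict(list)          # node -> [(node that infers it, p)]
    for src, dst, p in edges:
        inferrers[dst].append((src, p))
    tree, frontier = {}, [root]
    while frontier:
        node = frontier.pop(0)
        for src, p in inferrers[node]:
            if src != root and src not in tree:   # first discovery wins
                tree[src] = (node, p)
                frontier.append(src)
    return tree
```

Nodes with no inference path to the private datum are simply absent from the resulting tree.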
Finally, according to the formula

P(i) = Σ_j p(n_ij), where p(n_ij) = Π_{p_k ∈ Path(n_ij, n_root)} p_k

the probability of inferring the private data (i.e., the root node) from each knowledge data node other than the root node is calculated, where n_ij is determined by the j-th position corresponding to the knowledge data node i in the privacy inference tree, p(n_ij) is the probability value corresponding to that position in the privacy inference probability tree, Path(n_ij, n_root) is the path from that position to the root node, and p_k is an inference probability on that path. The privacy inference probability tree generated from the privacy inference tree in this way is shown in fig. 5.
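The path-product computation above can be sketched as follows, under the same assumed {child: (parent, p)} tree representation; positions lists the occurrences of knowledge node i in this tree.

```python
def cumulative_probability(tree, positions, root):
    """Sum, over every position of knowledge node i in the privacy inference
    tree, the product of the inference probabilities p_k along the path from
    that position up to the root (the private datum)."""
    total = 0.0
    for node in positions:
        prob = 1.0
        while node != root:
            parent, p = tree[node]   # one step toward the private datum
            prob *= p
            node = parent
        total += prob
    return total
```

The value returned for one tree corresponds to the per-tree cumulative probability used by the first calculation formula.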
The inference relationships between each piece of private data and the other knowledge data are not necessarily the same as those shown in fig. 3, so the privacy inference tree and the privacy inference probability tree corresponding to each piece of private data are not necessarily the same.
Optionally, in an embodiment of the present application, the knowledge data may include: entity data, relationship data, and event data. The specific division of each kind of data may be configured in advance, and is not limited herein.
The entity data are the specific objects appearing in the data to be evaluated, for example person objects, physical objects, or scene objects; the relationship data are the relationships among the entity data; the event data are the events that specifically occur among the objects. For example, if the content of the data to be evaluated is "Li Hua stayed in a certain hotel yesterday", then after knowledge extraction the corresponding entity data are "Li Hua" and "a certain hotel", the corresponding relationship data is "stayed in", and the corresponding event data is "Li Hua stayed in a certain hotel yesterday". It should be understood that the above embodiment is only an exemplary illustration; the specific knowledge data obtained by performing knowledge extraction on the data to be evaluated depends on the application scenario and the knowledge extraction algorithm, and the application is not limited herein.
Optionally, on the basis of the foregoing embodiment, an embodiment of the present application may further provide a data sensitivity level grading method, and an implementation process of obtaining a sensitivity rating corresponding to knowledge data in the foregoing method is described as follows with reference to the accompanying drawings. Fig. 6 is a schematic flow chart of a data sensitivity classification method according to another embodiment of the present application, where the privacy inference probability tree further includes: reasoning the probability of the root node from each node; as shown in fig. 6, S103 may include:
S107: according to the formula

S_i = Σ_{j=1}^{K} P_j(i)

a sensitivity score S_i corresponding to the knowledge data i is calculated.
The higher the sensitivity score, the higher the probability that the private data can be inferred from the knowledge data, i.e., the more sensitive the current knowledge data; the lower the sensitivity score, the lower that probability, i.e., the less sensitive the current knowledge data. In the formula, K is the number of all private data in the preset knowledge network, and P_j(i) represents the cumulative probability corresponding to all occurrences of the knowledge data i in the j-th privacy inference probability tree.
S108: according to the sensitivity score S corresponding to the knowledge data iiAnd presetting a rating rule, and determining the sensitive rating corresponding to the knowledge data i.
Optionally, on the basis of the foregoing embodiment, an embodiment of the present application may further provide a data sensitivity grading method; an implementation process of obtaining the sensitivity rating of the data to be evaluated in the above method is described below with reference to the accompanying drawings. Fig. 7 is a schematic flowchart of a data sensitivity grading method according to another embodiment of the present application; as shown in fig. 7, S104 may include:
S109: calculating the sensitivity score S of the data to be evaluated according to the formula

S = Σ_{i=1}^{L} α_i · S_i

where L is the number of pieces of knowledge data contained in the data to be evaluated, α_i is the weight coefficient, and S_i is the sensitivity score corresponding to knowledge data i. A higher sensitivity score means a higher probability of inferring the private data from the data to be evaluated, that is, the more sensitive the current data to be evaluated is; a lower sensitivity score means a lower probability of inferring the private data from the data to be evaluated, that is, the less sensitive the current data to be evaluated is.
Optionally, in an embodiment of the present application, the weight coefficients may be determined by weight fitting, for example by a pre-trained neural network, or may be determined according to the probability that sensitive data can be inferred from each piece of knowledge data in the data to be evaluated. The specific manner of determining the weight coefficients may be flexibly adjusted according to user requirements and is not limited to the manners provided in the foregoing embodiment.
S110: determining the sensitivity rating corresponding to the data to be evaluated according to the sensitivity score of the data to be evaluated and a preset rating rule.
The preset rating rule for the data to be evaluated and the preset rating rule for each piece of knowledge data may be the same or different; the application is not limited herein.
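Steps S109 and S110 can be sketched as a weighted sum of the per-knowledge-data scores. The uniform default weights below are an assumption; as noted above, the patent leaves the weight-fitting method open.

```python
def document_sensitivity(scores, weights=None):
    """Sensitivity score of the data to be evaluated: S = sum(alpha_i * S_i)
    over the L pieces of knowledge data it contains. If no fitted weights
    are supplied, uniform weights 1/L are used as a stand-in."""
    if weights is None:
        weights = [1.0 / len(scores)] * len(scores)
    return sum(a * s for a, s in zip(weights, scores))

# S_i for the L = 3 pieces of knowledge data in the running example.
scores = [1.22, 0.10, 0.05]
doc_score = document_sensitivity(scores)
```

The resulting score is then mapped to a rating by the preset rating rule, which may differ from the per-knowledge-data rule.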
Optionally, on the basis of the above embodiments, the embodiments of the present application may further provide a data sensitivity grading method, and an implementation process of the above method is exemplified with reference to the following drawings. Fig. 5 is a schematic flowchart of a data sensitivity grading method according to another embodiment of the present application, and as shown in fig. 5, the method further includes:
s111: and encrypting the knowledge data meeting the preset conditions according to the sensitivity rating corresponding to each knowledge data, the sensitivity rating of the data to be evaluated and a preset encryption rule.
If the sensitivity rating of the data to be evaluated is high, the data to be evaluated as a whole is sensitive data and may subsequently be encrypted in its entirety. The preset encryption rule may be determined according to the sensitivity rating: for example, a low sensitivity rating may correspond to a simpler encryption mode, while a high sensitivity rating may correspond to a more complex encryption mode. If the sensitivity rating of the data to be evaluated is medium or low but it contains knowledge data with a high sensitivity rating, only that highly rated knowledge data may be encrypted, while the other knowledge data is left unencrypted or only simply encrypted. The encryption rules and the preset conditions for encrypting the data to be evaluated and the knowledge data can be flexibly adjusted according to user requirements and are not limited to the encryption method provided in the above embodiment.
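The encryption rule just described can be sketched as a small policy function. The concrete policy below is an illustrative assumption; the patent states only that the rule is configurable and that higher ratings warrant stronger encryption.

```python
def apply_encryption_rule(doc_rating, item_ratings):
    """Illustrative preset encryption rule (the actual policy is configurable):
    - if the record as a whole is rated high, encrypt the entire record
      (a stronger cipher could be selected for higher ratings);
    - otherwise, encrypt only the knowledge data rated high and leave the
      rest unencrypted or simply encrypted."""
    if doc_rating == "high":
        return {"whole_record": True, "items_to_encrypt": list(item_ratings)}
    return {
        "whole_record": False,
        "items_to_encrypt": [k for k, r in item_ratings.items() if r == "high"],
    }

plan = apply_encryption_rule(
    "low", {"Li Hua": "high", "a certain hotel": "low", "check-in": "low"}
)
# Only the highly rated entity "Li Hua" needs to be encrypted here.
```

In the running example this matches the behavior described below: a low overall rating with one high-rated entity leads to encrypting only that entity.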
By adopting the data sensitivity grading method provided by the application, the node corresponding to each piece of knowledge data in the data to be evaluated can be determined in the privacy inference probability tree according to the preset relationship algorithm, and the sensitivity rating corresponding to each piece of knowledge data can be obtained according to the privacy inference probability tree, the node corresponding to each piece of knowledge data in the privacy inference probability tree, and the preset first calculation formula. The sensitivity rating of the data to be evaluated is then obtained according to the sensitivity rating corresponding to each piece of knowledge data and the preset second calculation formula, so as to determine whether the current data to be evaluated is sensitive data or contains sensitive knowledge data from which private data can be inferred; each piece of knowledge data or data to be evaluated that meets the preset conditions is then encrypted according to the preset encryption rule. This avoids the problem in the prior art that, because seemingly low-value data is neither rated for sensitivity nor encrypted, the user's private data can be obtained after the data flows out by mining and reasoning over that seemingly low-value data, leaking the user's privacy.
Taking the data to be evaluated with the content "Li Hua stayed in a certain hotel yesterday" as an example: after knowledge extraction is performed on the data to be evaluated, the corresponding entity data are "Li Hua" and "a certain hotel", the corresponding relation data is "check-in", and the corresponding event data is "Li Hua stayed in a certain hotel yesterday". After the sensitivity rating of each piece of knowledge data and of the data to be evaluated is determined by the method provided in the application, if, for example, the sensitivity rating of the entity data "Li Hua" is determined to be high while the sensitivity ratings of the other data and of the data to be evaluated as a whole are all low, only "Li Hua" in the data to be evaluated needs to be encrypted and stored. If, for example, the sensitivity rating of each piece of knowledge data corresponding to the data to be evaluated is determined to be low but the sensitivity rating of the data to be evaluated is high, the whole data to be evaluated, namely "Li Hua stayed in a certain hotel yesterday", is encrypted and stored. With this method, after the data to be evaluated has been evaluated and encrypted, the user's privacy cannot easily be mined from the encrypted data, thereby ensuring the security of the user's private data.
The data sensitivity grading apparatus provided in the present application is explained below with reference to the accompanying drawings. The apparatus can execute any one of the data sensitivity grading methods of fig. 1 to 7; for its specific implementation and beneficial effects, refer to the above description, which is not repeated below.
Fig. 8 is a schematic structural diagram of a data sensitivity grading apparatus according to an embodiment of the present application, and as shown in fig. 8, the apparatus includes: an obtaining module 201 and a determining module 202, wherein:
the obtaining module 201 is configured to perform knowledge extraction on the obtained data to be evaluated, and obtain at least one knowledge data.
A determining module 202, configured to determine, according to a preset relationship algorithm, a node corresponding to each piece of knowledge data in a privacy inference probability tree, where the privacy inference probability tree includes: nodes corresponding to different knowledge data and reasoning relations among the nodes.
The obtaining module 201 is specifically configured to obtain a sensitivity rating corresponding to each knowledge data according to the privacy inference probability tree, a node corresponding to each knowledge data in the privacy inference probability tree, and a preset first calculation formula; and acquiring the sensitivity rating of the data to be evaluated according to the sensitivity rating corresponding to each knowledge data and a preset second calculation formula.
Fig. 9 is a schematic structural diagram of a data sensitivity grading apparatus according to an embodiment of the present application, and as shown in fig. 9, the apparatus further includes: a build module 203, wherein:
the obtaining module 201 is specifically configured to obtain each piece of privacy data in a preset knowledge network, other data related to each piece of privacy data, and a derivation relationship between each piece of other data and each piece of privacy data;
the building module 203 is configured to build a privacy inference probability tree according to each privacy data in the preset knowledge network, other data related to each privacy data, and a derivation relationship between each other data and each privacy data, where a root node of the privacy inference probability tree is a node corresponding to the privacy data.
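The construction performed by the building module 203 can be sketched as follows: each private datum becomes the root of one tree, and each derivation relationship becomes an edge carrying the probability of inferring the parent from the child. The node structure and example data are assumptions for illustration.

```python
class Node:
    def __init__(self, name, prob_to_parent=None):
        self.name = name
        # Probability of inferring this node's parent from this node;
        # None for the root (the private datum itself).
        self.prob_to_parent = prob_to_parent
        self.children = []

def build_tree(private_datum, derivations):
    """Build one privacy inference probability tree.

    derivations: list of (other_data, inferred_data, probability) edges
    pointing toward the private datum, which becomes the root node."""
    root = Node(private_datum)
    nodes = {private_datum: root}
    for child, parent, prob in derivations:
        parent_node = nodes.setdefault(parent, Node(parent))
        child_node = nodes.setdefault(child, Node(child))
        child_node.prob_to_parent = prob
        parent_node.children.append(child_node)
    return root

# Hypothetical knowledge-network fragment: hotel check-ins suggest a
# frequent location, which in turn suggests the home address.
tree = build_tree("home address", [
    ("hotel check-in", "frequent location", 0.6),
    ("frequent location", "home address", 0.8),
])
```

Multiplying the probabilities along a path from a node to the root yields the cumulative probability used by the first calculation formula.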
As shown in fig. 9, the apparatus further includes: a calculation module 204, configured to calculate a sensitivity score S_i corresponding to knowledge data i according to the formula

S_i = Σ_{j=1}^{K} P_{i,j}

where K is the number of all private data in the preset knowledge network, and P_{i,j} represents the cumulative probability corresponding to knowledge data i in the j-th privacy inference probability tree.
A determining module 202, specifically configured to determine the sensitivity rating corresponding to knowledge data i according to the sensitivity score S_i corresponding to knowledge data i and a preset rating rule.
Optionally, the calculating module 204 is specifically configured to calculate the sensitivity score S of the data to be evaluated according to the formula

S = Σ_{i=1}^{L} α_i · S_i

where L is the number of pieces of knowledge data contained in the data to be evaluated, α_i is the weight coefficient, and S_i is the sensitivity score corresponding to knowledge data i.
The determining module 202 is specifically configured to determine a sensitivity rating corresponding to the data to be evaluated according to the sensitivity rating of the data to be evaluated and a preset rating rule.
As shown in fig. 9, the apparatus further includes: the encryption module 205 is configured to encrypt the knowledge data meeting the preset condition according to the sensitivity rating corresponding to each knowledge data, the sensitivity rating of the data to be evaluated, and a preset encryption rule.
The above-mentioned apparatus is used for executing the method provided by the foregoing embodiment, and the implementation principle and technical effect are similar, which are not described herein again.
These modules may be one or more integrated circuits configured to implement the above methods, for example: one or more application specific integrated circuits (ASICs), one or more digital signal processors (DSPs), or one or more field programmable gate arrays (FPGAs), among others. For another example, when one of the above modules is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a central processing unit (CPU) or another processor capable of calling program code. For another example, these modules may be integrated together and implemented in the form of a system-on-a-chip (SoC).
Fig. 10 is a schematic structural diagram of a data sensitivity level grading device according to an embodiment of the present application, where the data sensitivity level grading device may be integrated in a terminal device or a chip of the terminal device.
The data sensitivity grading device comprises: a processor 501, a storage medium 502, and a bus 503.
The storage medium 502 stores a program, and the processor 501 calls the program stored in the storage medium 502 to execute the method embodiments corresponding to fig. 1 to 7. The specific implementation and technical effects are similar and are not described herein again.
Optionally, the present application further provides a program product, for example a storage medium, on which a computer program is stored; when the program is executed by a processor, it performs the embodiments corresponding to the above method.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) or a processor (processor) to perform some steps of the methods according to the embodiments of the present application. And the aforementioned storage medium includes: a U disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

Claims (10)

1. A method for grading data sensitivity, the method comprising:
performing knowledge extraction on the acquired data to be evaluated to acquire at least one knowledge data;
determining a node corresponding to each knowledge data in a privacy inference probability tree according to a preset relation algorithm, wherein the privacy inference probability tree comprises: nodes corresponding to different knowledge data and reasoning relation among the nodes;
acquiring a sensitivity rating corresponding to each knowledge data according to the privacy inference probability tree, a node corresponding to each knowledge data in the privacy inference probability tree and a preset first calculation formula;
and acquiring the sensitivity rating of the data to be evaluated according to the sensitivity rating corresponding to each knowledge data and a preset second calculation formula.
2. The method of claim 1, wherein before extracting knowledge from the acquired data to be evaluated and acquiring at least one knowledge data, the method further comprises:
acquiring each private data in a preset knowledge network, other data related to each private data and a deduction relation between each other data and each private data;
and constructing the privacy inference probability tree according to each privacy data in the preset knowledge network, other data related to each privacy data and a deduction relation between each other data and each privacy data, wherein a root node of the privacy inference probability tree is a node corresponding to the privacy data.
3. The method of claim 1 or 2, wherein the knowledge data comprises: entity data, relation data, and event data.
4. The method of claim 3, wherein the privacy inference probability tree further comprises: the probability of inferring the root node from each node;
the acquiring the sensitivity rating corresponding to each knowledge data according to the privacy inference probability tree, the node corresponding to each knowledge data in the privacy inference probability tree, and a preset first calculation formula includes:
calculating a sensitivity score S_i corresponding to knowledge data i according to the formula

S_i = Σ_{j=1}^{K} P_{i,j}

wherein: K is the number of all private data in the preset knowledge network, and P_{i,j} represents the cumulative probability corresponding to knowledge data i in the j-th privacy inference probability tree;
determining the sensitivity rating corresponding to knowledge data i according to the sensitivity score S_i corresponding to knowledge data i and a preset rating rule.
5. The method of claim 3, wherein the obtaining the sensitivity rating of the data to be evaluated according to the sensitivity rating corresponding to each piece of knowledge data and a preset second calculation formula comprises:
calculating the sensitivity score S of the data to be evaluated according to the formula

S = Σ_{i=1}^{L} α_i · S_i

wherein L is the number of pieces of knowledge data contained in the data to be evaluated, α_i is the weight coefficient, and S_i is the sensitivity score corresponding to knowledge data i;
and determining the sensitivity rating corresponding to the data to be evaluated according to the sensitivity score of the data to be evaluated and a preset rating rule.
6. The method of claim 1, wherein the method further comprises:
and encrypting the knowledge data meeting the preset conditions according to the sensitivity rating corresponding to each knowledge data, the sensitivity rating of the data to be evaluated and a preset encryption rule.
7. The method of claim 1, wherein the predetermined relationship algorithm comprises at least one of: similarity algorithms or relational inference algorithms.
8. A data sensitivity rating device, said device comprising: an acquisition module and a determination module, wherein:
the acquisition module is used for extracting knowledge of the acquired data to be evaluated to acquire at least one piece of knowledge data;
the determining module is configured to determine, according to a preset relationship algorithm, a node corresponding to each piece of knowledge data in a privacy inference probability tree, where the privacy inference probability tree includes: nodes corresponding to different knowledge data and reasoning relation among the nodes;
the obtaining module is specifically configured to obtain a sensitivity rating corresponding to each piece of knowledge data according to the privacy inference probability tree, a node corresponding to each piece of knowledge data in the privacy inference probability tree, and a preset first calculation formula; and acquiring the sensitivity rating of the data to be evaluated according to the sensitivity rating corresponding to each knowledge data and a preset second calculation formula.
9. A data sensitivity rating device, said device comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating via the bus when the data sensitivity grading device is operated, the processor executing the machine-readable instructions to perform the method of any of the above claims 1-7.
10. A storage medium, characterized in that the storage medium has stored thereon a computer program which, when being executed by a processor, performs the method of any of the preceding claims 1-7.
CN202010950325.1A 2020-09-10 2020-09-10 Data sensitivity grading method, device, equipment and storage medium Active CN112084531B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010950325.1A CN112084531B (en) 2020-09-10 2020-09-10 Data sensitivity grading method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112084531A true CN112084531A (en) 2020-12-15
CN112084531B CN112084531B (en) 2024-05-17

Family

ID=73737421

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010950325.1A Active CN112084531B (en) 2020-09-10 2020-09-10 Data sensitivity grading method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112084531B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113285960A (en) * 2021-07-21 2021-08-20 湖南轻悦健康管理有限公司 Data encryption method and system for service data sharing cloud platform
CN115065561A (en) * 2022-08-17 2022-09-16 深圳市乙辰科技股份有限公司 Information interaction method and system based on database data storage

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103617280A (en) * 2013-12-09 2014-03-05 苏州大学 Method and system for mining Chinese event information
JP2016091402A (en) * 2014-11-07 2016-05-23 株式会社日立製作所 Risk evaluation system and risk evaluation method
US20170287036A1 (en) * 2016-04-01 2017-10-05 OneTrust, LLC Data processing systems and methods for integrating privacy information management systems with data loss prevention tools or other tools for privacy design
US20180113996A1 (en) * 2016-10-20 2018-04-26 International Business Machines Corporation Determining privacy for a user and a product in a particular context
US20180191780A1 (en) * 2016-12-29 2018-07-05 Mcafee, Inc. Technologies for privacy-preserving security policy evaluation
CN109670342A (en) * 2018-12-30 2019-04-23 北京工业大学 The method and apparatus of information leakage risk measurement
WO2020001373A1 (en) * 2018-06-26 2020-01-02 杭州海康威视数字技术股份有限公司 Method and apparatus for ontology construction
CN110825879A (en) * 2019-09-18 2020-02-21 平安科技(深圳)有限公司 Case decision result determination method, device and equipment and computer readable storage medium
CN110941956A (en) * 2019-10-26 2020-03-31 华为技术有限公司 Data classification method, device and related equipment


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈靖;彭武;王冬海;: "基于信度评估的网络安全决策系统", 计算机工程与设计, no. 05 *


Also Published As

Publication number Publication date
CN112084531B (en) 2024-05-17


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant