CN112948823A - Data leakage risk assessment method - Google Patents


Info

Publication number
CN112948823A
Authority
CN
China
Prior art keywords
data leakage
risk
information
constructing
information system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110295438.7A
Other languages
Chinese (zh)
Inventor
李强
余祥
田楠
李腾飞
陈立哲
舒展翔
李孟霖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN202110295438.7A
Publication of CN112948823A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/556 Detecting local intrusion or implementing counter-measures involving covert channels, i.e. data leakage between processes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577 Assessing vulnerabilities and evaluating computer system security
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/02 Computing arrangements based on specific mathematical models using fuzzy logic

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Security & Cryptography (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Algebra (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Automation & Control Theory (AREA)
  • Biomedical Technology (AREA)
  • Fuzzy Systems (AREA)
  • Molecular Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a data leakage risk assessment method, which belongs to the technical field of information security and comprises the following steps: acquiring data leakage historical information of an information system, processing the data leakage historical information, and constructing a data leakage information feature word set; matching the characteristic words in the data leakage information characteristic word set with the sensitive information in the confidential sensitive information base item by item, and constructing a characteristic word matching set by using the characteristic words successfully matched with the sensitive information; and processing the feature words in the feature word matching set by using an analytic hierarchy process and a fuzzy mathematical process to obtain a data leakage risk value of the information system. The method is used for processing the acquired information system leakage data based on an analytic hierarchy process and a fuzzy mathematical method, and realizing accurate evaluation of data leakage risks.

Description

Data leakage risk assessment method
Technical Field
The invention relates to the technical field of information security, in particular to a data leakage risk assessment method.
Background
With the development of communication technology and computer technology and the popularization of electronic devices in social life, sensitive information is being transmitted between computers through vulnerable communication lines, and the scale and range of data leakage is rapidly expanding.
At present, methods for evaluating data leakage risk are mainly based on the analytic hierarchy process, fuzzy mathematics, entropy weight theory, and association rules. Because these methods depend on the subjective experience of experts, they are difficult to quantify, inconvenient to compute, and weak in objectivity and accuracy.
Disclosure of Invention
The invention aims to overcome the defects in the background technology and realize accurate evaluation of the data leakage risk of the information system.
In order to achieve the above purpose, a data leakage risk assessment method is adopted, which comprises the following steps:
acquiring data leakage historical information of an information system, processing the data leakage historical information, and constructing a data leakage information feature word set;
matching the characteristic words in the data leakage information characteristic word set with the sensitive information in the confidential sensitive information base item by item, and constructing a characteristic word matching set by using the characteristic words successfully matched with the sensitive information;
and processing the feature words in the feature word matching set by using an analytic hierarchy process and a fuzzy mathematical process to obtain a data leakage risk value of the information system.
Further, the acquiring data leakage history information of the information system, processing the data leakage history information, and constructing a data leakage information feature word set includes:
extracting text information of the data leakage historical information;
performing word segmentation processing on the text information by using a statistical word segmentation method to obtain a word segmentation list;
and performing stop word filtering on the words in the word list, and constructing the data leakage information characteristic word set by using the words after the stop words are deleted.
Matching the feature words in the data leakage information feature word set with the sensitive information in the confidential sensitive information base item by item, and constructing a feature word matching set from the feature words successfully matched with the sensitive information, uses the following matching process: each data leakage feature word is compared with the sensitive words in the sensitive information base one by one. If the feature word is consistent with a sensitive word, the match succeeds and the feature word is recorded in the feature word set base. If it is not consistent with that sensitive word, the match fails, the feature word is not recorded, and another sensitive word is selected from the sensitive information base and the comparison is repeated until all sensitive words in the sensitive information base have been compared.
Constructing the data leakage information feature word set base comprises the following steps: extracting the historical text information of the data leakage; performing word segmentation on the text information with a statistical word segmentation method to obtain a word segmentation list; filtering stop words from the word segmentation list; and comparing the words that remain after stop word deletion with the sensitive words in the sensitive information base one by one to obtain the feature word set, from which the data leakage information feature word set base is constructed.
Processing the feature words in the feature word matching set by using an analytic hierarchy process and a fuzzy mathematical process to obtain a data leakage risk value of an information system, wherein the data leakage risk value comprises the following steps:
constructing hierarchical structures of the importance degrees of the data leakage risk factors of different levels;
judging the relative importance of the data leakage risk factors of different levels by adopting a scaling method according to the hierarchical structure, and constructing a judgment matrix;
calculating a sorting weight vector of a judgment matrix by using the analytic hierarchy process;
constructing a risk element set according to the data leakage risk elements, and constructing a risk evaluation set for each data leakage risk element according to a fuzzy mathematical method;
calculating a membership matrix according to the risk element set and the risk evaluation set, and calculating a ranking weight vector of the membership matrix;
and synthesizing the sequencing weight vector of the judgment matrix and the sequencing weight vector of the membership degree matrix by using the fuzzy mathematical method to obtain a data leakage risk value of the information system.
Constructing hierarchical structure of importance of data leakage risk factors of different levels, comprising:
and setting the influence factors of the importance of the data leakage risk factors, including the data leakage occurrence probability, the data leakage influence degree and the information system equipment importance, and constructing a hierarchical structure for describing the importance of the data leakage risk factors of different levels of the information system.
According to the hierarchical structure, judging the relative importance of the data leakage risk factors of different levels by adopting a scaling method, and constructing a judgment matrix, wherein the judgment matrix comprises the following steps:
setting the relative importance of the data leakage occurrence probability as $I_P$, the relative importance of the data leakage influence degree as $I_F$, and the relative importance of the information system equipment as $I_D$, where:
Figure BDA0002984160250000031
Figure BDA0002984160250000032
where $F_h$ denotes a data leakage event at file level $h$, and $D_g$ denotes a data leakage event at leaking-device level $g$;
constructing a judgment matrix B according to the ratio of the data leakage occurrence probability, the data leakage influence degree and the relative importance of the information system equipment importance:
Figure BDA0002984160250000041
where the element $b_{ij}$ of the judgment matrix $B$ represents the ratio of the data leakage occurrence probability of the $i$-th element to that of the $j$-th element.
Calculating the sequencing weight vector of the judgment matrix by using the analytic hierarchy process, wherein the method comprises the following steps:
calculating the eigenvector $M$ from the judgment matrix:
$M = (m_1, m_2, \dots, m_i, \dots, m_n)$
where
Figure BDA0002984160250000042
$b_{i1}, b_{i2}, \dots, b_{in}$ are the elements of the judgment matrix and $n$ is its order;
normalizing the eigenvector $M$ to obtain the ranking weight vector of the judgment matrix, $W = (W_1, W_2, \dots, W_n)$, where
Figure BDA0002984160250000043
constructing a risk element set according to the data leakage risk elements, and constructing a risk evaluation set for each data leakage risk element according to a fuzzy mathematical method, wherein the risk evaluation set comprises the following steps:
constructing the risk element set $U = \{u_1, u_2, \dots, u_k, \dots, u_K\}$;
setting the risk evaluation set for the data leakage influence degree and the data leakage occurrence probability to the equipment set of the information system, $E = \{E_1, E_2, \dots, E_t, \dots, E_T\}$.
Calculating a membership matrix according to the risk element set and the risk evaluation set, and calculating a ranking weight vector of the membership matrix, wherein the ranking weight vector comprises the following steps:
establishing a fuzzy mapping function of the risk element set and the risk evaluation set:
$f: U \to F(E)$
where $F(E)$ is the family of all fuzzy sets on the risk evaluation set $E$, and the relation $u_k \mapsto f(u_k) = (p_{k1}, p_{k2}, \dots, p_{kK}) \in F(E)$ holds. The mapping $f$ gives the degree of membership of risk factor $u_k$ in each evaluation criterion of the risk evaluation set, and these degrees of membership form the membership vector $P_l = (p_{l1}, p_{l2}, \dots, p_{lm})$, $l = 1, 2, \dots, m$;
Constructing a membership matrix P according to the membership degree of the risk element set U to the risk evaluation set E:
Figure BDA0002984160250000051
where the element $p_{km}$ of the membership matrix represents the probability that the $k$-th risk element belongs to the $m$-th evaluation factor;
assigning weights to the evaluation factors in the risk evaluation set, with the weight distribution set $A = (a_1, a_2, \dots, a_k, \dots, a_K)$, where $a_k$ is the weight of the $k$-th evaluation factor relative to the other evaluation factors, and performing the fuzzy transformation:
Figure BDA0002984160250000052
where $V$ is the relative weight of the data-level risk factors under each criterion of the equipment layer; normalization yields the ranking weight vector:
$W_{vb} = (W_{vb1}, W_{vb2}, \dots, W_{vbk}, \dots, W_{vbK})$
where $W_{vb}$ is the ranking weight vector for the $b$-th system-level criterion, and $W_{vbk}$ is the weight of the $k$-th risk factor relative to the other risk factors under that criterion.
Further, the synthesizing the ranking weight vector of the judgment matrix and the ranking weight vector of the membership matrix by using the fuzzy mathematical method to obtain the data leakage risk value of the information system includes:
transposing the ranking weight vector $W$ of the judgment matrix to obtain:
$W' = W^{T}$
calculating the data leakage risk value of each device of the information system:
$r = \{r_1, r_2, \dots, r_z, \dots, r_Z\}$
$r_z = W_{vb} W'$;
where $W_{vb}$ is the ranking weight vector of the membership matrix;
and calculating the total risk value R of the information system by adopting a weighted average method as follows:
Figure BDA0002984160250000061
Compared with the prior art, the invention has the following technical effects: the feature words in the data leakage information feature word set are matched item by item with the sensitive information in the confidential sensitive information base, where the sensitive information base consists of information defined as confidential and the matching forms include regular expressions, dictionaries, scripts, and file types. The matched feature words are then processed with the analytic hierarchy process and the fuzzy mathematical method to calculate the data leakage risk value of the information system, which yields a quantitative and accurate evaluation of the data leakage risk of the information system.
Drawings
The following detailed description of embodiments of the invention refers to the accompanying drawings in which:
FIG. 1 is a flow chart of a method of data leak risk assessment;
FIG. 2 is a functional block diagram of a data leakage risk assessment method;
FIG. 3 is a hierarchical structure diagram.
Detailed Description
To further illustrate the features of the present invention, refer to the following detailed description of the invention and the accompanying drawings. The drawings are for reference and illustration purposes only and are not intended to limit the scope of the present disclosure.
As shown in fig. 1, the present embodiment discloses a data leakage risk assessment method, including the following steps S1 to S3:
S1, acquiring data leakage history information of the information system, processing the data leakage history information, and constructing a data leakage information feature word set;
S2, matching the feature words in the data leakage information feature word set with the sensitive information in the confidential sensitive information base item by item, and constructing a feature word matching set from the feature words successfully matched with the sensitive information;
S3, processing the feature words in the feature word matching set by using an analytic hierarchy process and a fuzzy mathematical method to obtain a data leakage risk value of the information system.
As a further preferred technical solution, step S1 constructs the data leakage information feature word set through data preprocessing (text extraction, word segmentation, and stop word filtering) and then stores the processed data in the corresponding location of a database. It comprises the following steps S11 to S13:
S11, extracting the text information of the data leakage history information, specifically:
extracting the text content of information system data leakage information in its various formats, and removing format marks such as hyperlinks, stop words, punctuation marks, spaces, and special characters.
S12, performing word segmentation on the text information according to the special noun dictionary dic to obtain a word segmentation list, specifically:
performing word segmentation on the extracted text information with a statistical word segmentation algorithm to obtain a number of independent entries. Because of the particularity and diversity of Chinese, special terms related to information system data leakage, namely the special noun dictionary dic, are pre-stored in the word segmentation library used by the statistical segmenter, so that the data leakage content is segmented in a targeted manner.
S13, performing stop word filtering on the words in the word segmentation list, and constructing the data leakage information feature word set from the words that remain after the stop words are deleted.
It should be noted that stop words in this embodiment are words that occur very frequently but carry no special meaning in a sentence, such as "yes", "you", "I", and "he". After word segmentation with the statistical word segmentation algorithm, these stop words are removed, and the remaining words form the data leakage information feature word set.
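Steps S11 to S13 can be sketched as follows. This is only an illustration under stated assumptions: the dictionary entries, the stop word list, and all function names are hypothetical, and a whitespace split with longest-match lookup stands in for the statistical word segmentation algorithm, which the patent does not specify in detail.

```python
import re

# Hypothetical data: the patent does not list concrete dictionary or stop word entries.
SPECIAL_TERMS = {"access log", "customer database"}  # stands in for the special noun dictionary dic
STOP_WORDS = {"the", "a", "of", "was", "to", "and", "is", "from"}

def extract_text(raw: str) -> str:
    """S11: strip hyperlinks, punctuation, and special characters."""
    text = re.sub(r"https?://\S+", " ", raw)   # remove hyperlinks
    text = re.sub(r"[^\w\s]", " ", text)       # remove punctuation and special characters
    return re.sub(r"\s+", " ", text).strip().lower()

def segment(text: str, special_terms: set[str]) -> list[str]:
    """S12 (stand-in): split on whitespace, keeping dictionary terms whole."""
    tokens = text.split()
    max_len = max((len(t.split()) for t in special_terms), default=1)
    out, i = [], 0
    while i < len(tokens):
        # Try the longest candidate first; a single token always matches.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            cand = " ".join(tokens[i:i + n])
            if n == 1 or cand in special_terms:
                out.append(cand)
                i += n
                break
    return out

def build_feature_words(raw: str) -> list[str]:
    """S13: drop stop words; the remaining words form the feature word set."""
    words = segment(extract_text(raw), SPECIAL_TERMS)
    return [w for w in words if w not in STOP_WORDS]
```

For Chinese text, the whitespace split would need to be replaced by a real segmenter; the longest-match lookup only shows how a pre-stored term dictionary can steer segmentation.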
As a further preferred technical solution, the sensitive information in the confidential sensitive information base in step S2 describes the characteristics of the sensitive confidential information in the information system data leakage information in a form that is convenient for numerical calculation, specifically:
1) Importing a confidential sensitive information base, which comprises dictionaries, regular expressions, scripts, and file types; its data items include the confidential content, times, names, behaviors, and the like found in information system data leakage information.
It should be noted that the sensitive information base is designed according to the professional term definition dictionary database, and the sensitive information base can be improved and increased according to a specific information system. The sensitive information base can better evaluate the data leakage of the information system, and the accuracy of the data leakage risk evaluation is improved.
2) Performing matching on the information system data leakage feature words obtained through data preprocessing: if a data leakage feature word is the same as an item in the sensitive information base, the match is counted, and the feature words successfully matched with the sensitive information are stored in the feature word matching set.
The matching process is as follows: each data leakage feature word is compared with the sensitive words in the sensitive information base one by one. If the feature word is consistent with a sensitive word, the match succeeds and the feature word is recorded in the feature word set base. If it is not consistent with that sensitive word, the match fails, the feature word is not recorded, and another sensitive word is selected from the sensitive information base and the comparison is repeated until all sensitive words in the sensitive information base have been compared.
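The matching loop described above can be sketched as follows, assuming a sensitive information base made of one dictionary and one regular-expression rule. The entries, the pattern, and the function names are hypothetical; script and file-type rules are omitted.

```python
import re

# Hypothetical sensitive information base (illustrative entries only).
SENSITIVE_DICTIONARY = {"password", "customer database", "salary"}
SENSITIVE_PATTERNS = [re.compile(r"\d{3}-\d{2}-\d{4}")]  # illustrative ID-number pattern

def is_sensitive(word: str) -> bool:
    """Return True if the word equals a dictionary entry or fully matches a pattern."""
    if word in SENSITIVE_DICTIONARY:
        return True
    return any(p.fullmatch(word) for p in SENSITIVE_PATTERNS)

def build_matching_set(feature_words):
    """Keep each successfully matched feature word once, in first-seen order."""
    matched = []
    for word in feature_words:
        if is_sensitive(word) and word not in matched:
            matched.append(word)
    return matched
```

A set lookup replaces the one-by-one comparison against every dictionary entry; the outcome is the same, since a feature word is recorded only if some sensitive entry is consistent with it.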
Constructing the data leakage information feature word set base comprises the following steps: extracting the historical text information of the data leakage; performing word segmentation on the text information with a statistical word segmentation method to obtain a word segmentation list; filtering stop words from the word segmentation list; and comparing the words that remain after stop word deletion with the sensitive words in the sensitive information base one by one to obtain the feature word set, from which the data leakage information feature word set base is constructed.
As a further preferred embodiment, as shown in FIG. 2, step S3 (processing the feature words in the feature word matching set by using the analytic hierarchy process and the fuzzy mathematical method to obtain the data leakage risk value of the information system) comprises the following steps S31 to S36:
S31, constructing a hierarchical structure of the importance of the data leakage risk factors at different levels. The hierarchy describes the importance of the data leakage risk factors at each level of the information system and relates to factors such as the "data leakage occurrence probability", the "data leakage influence degree", and the "information system equipment importance", as shown in FIG. 3;
S32, judging the relative importance of the data leakage risk factors at different levels by using a scaling method according to the hierarchical structure, and constructing a judgment matrix;
S33, calculating the ranking weight vector of the judgment matrix by using the analytic hierarchy process;
S34, constructing a risk element set from the data leakage risk elements, and constructing a risk evaluation set for each data leakage risk element according to the fuzzy mathematical method;
S35, calculating a membership matrix from the risk element set and the risk evaluation set, and calculating the ranking weight vector of the membership matrix;
S36, synthesizing the ranking weight vector of the judgment matrix and the ranking weight vector of the membership matrix by using the fuzzy mathematical method to obtain the data leakage risk value of the information system.
As a further preferred embodiment, step S32 (judging the relative importance of the data leakage risk factors at different levels by using a scaling method according to the hierarchical structure, and constructing a judgment matrix) specifically comprises S321 and S322:
S321, setting the relative importance of the data leakage occurrence probability as $I_P$, the relative importance of the data leakage influence degree as $I_F$, and the relative importance of the information system equipment as $I_D$, where:
Figure BDA0002984160250000101
Figure BDA0002984160250000102
where $F_h$ denotes a data leakage event at file level $h$; the file level is divided into 5 levels, namely public, internal, secret, confidential, and top secret, assigned the values 1, 3, 5, 7, and 9 respectively. $D_g$ denotes a data leakage event at leaking-device level $g$; the leaking-device level is likewise divided into 5 levels, namely public, internal, secret, confidential, and top secret, assigned the values 1, 3, 5, 7, and 9 respectively;
S322, judging the relative importance of the data leakage risk factors at different levels of the information system by using the nine-point scale or the five-point scale of the AHP, and constructing a judgment matrix $B$ from the ratios of the relative importance of the data leakage occurrence probability, the data leakage influence degree, and the information system equipment importance:
Figure BDA0002984160250000103
where the element $b_{ij}$ of the judgment matrix $B$ represents the ratio of the data leakage occurrence probability of the $i$-th element to that of the $j$-th element.
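A minimal sketch of a ratio-based judgment matrix, assuming the three importance values have already been scored on the 1 to 9 scale. The matrix figure is not reproduced in this text, so the construction below (each entry is the ratio of two importance values) is an assumption drawn from the surrounding description; the scores and the function name `judgment_matrix` are illustrative.

```python
def judgment_matrix(importance):
    """Build B with b_ij = importance[i] / importance[j] (reciprocal by construction)."""
    n = len(importance)
    return [[importance[i] / importance[j] for j in range(n)] for i in range(n)]

# Hypothetical scores for I_P, I_F, I_D.
B = judgment_matrix([3.0, 5.0, 1.0])
```

A matrix built this way has ones on the diagonal and satisfies b_ij * b_ji = 1, which is the reciprocal property expected of an AHP judgment matrix.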
As a further preferred embodiment, step S33 (calculating the ranking weight vector of the judgment matrix by using the analytic hierarchy process) specifically comprises:
calculating the eigenvector $M$ from the judgment matrix:
$M = (m_1, m_2, \dots, m_i, \dots, m_n)$
where
Figure BDA0002984160250000104
$b_{i1}, b_{i2}, \dots, b_{in}$ are the elements of the judgment matrix and $n$ is its order;
normalizing the eigenvector $M$ to obtain the ranking weight vector of the judgment matrix, $W = (W_1, W_2, \dots, W_n)$, where
Figure BDA0002984160250000111
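The two steps above can be sketched as follows. The formula figures are not reproduced in this text, so the sketch assumes the common geometric-mean (root) method of the AHP, in which m_i is the n-th root of the product of row i, followed by normalization so the weights sum to 1. The function name and the sample matrix are illustrative.

```python
import math

def ahp_weights(B):
    """Approximate the ranking weight vector W of judgment matrix B (assumed root method)."""
    n = len(B)
    m = [math.prod(row) ** (1.0 / n) for row in B]  # m_i = (b_i1 * b_i2 * ... * b_in)^(1/n)
    total = sum(m)
    return [m_i / total for m_i in m]               # W_i = m_i / sum of all m_j

# Illustrative 3x3 reciprocal judgment matrix.
B = [[1.0, 3.0, 5.0],
     [1 / 3, 1.0, 3.0],
     [1 / 5, 1 / 3, 1.0]]
W = ahp_weights(B)
```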
As a further preferred technical solution, this embodiment also performs a consistency check on the judgment matrix, as follows:
obtaining the maximum eigenvalue $\lambda_{\max}$ of the judgment matrix:
Figure BDA0002984160250000112
performing the consistency check:
Figure BDA0002984160250000113
When C.I. < 0.1, the judgment matrix is considered consistent, the weights contain no logical errors, and the judgment matrix can be used for subsequent calculation.
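A hedged sketch of the consistency check. Because the formula figures are missing from this text, it assumes the standard AHP estimates: lambda_max is the average of (BW)_i / W_i, and C.I. = (lambda_max - n) / (n - 1). The weights are computed with the geometric-mean method, also an assumption; the function names and sample matrix are illustrative.

```python
import math

def ahp_weights(B):
    """Geometric-mean weights, normalized to sum to 1 (assumed method)."""
    n = len(B)
    m = [math.prod(row) ** (1.0 / n) for row in B]
    total = sum(m)
    return [m_i / total for m_i in m]

def consistency_index(B, W):
    """Return (lambda_max, C.I.) for judgment matrix B and weight vector W."""
    n = len(B)
    BW = [sum(B[i][j] * W[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(BW[i] / W[i] for i in range(n)) / n
    return lambda_max, (lambda_max - n) / (n - 1)

B = [[1.0, 3.0, 5.0],
     [1 / 3, 1.0, 3.0],
     [1 / 5, 1 / 3, 1.0]]
lambda_max, ci = consistency_index(B, ahp_weights(B))
is_consistent = ci < 0.1  # threshold stated in the text
```

Many AHP references compare the consistency ratio C.R. = C.I. / R.I. against 0.1 instead; the sketch follows the C.I. threshold given here.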
As a further preferred embodiment, step S34 (constructing a risk element set from the data leakage risk elements, and constructing a risk evaluation set for each data leakage risk element according to the fuzzy mathematical method) specifically comprises:
constructing the risk element set $U = \{u_1, u_2, \dots, u_k, \dots, u_K\}$;
setting the risk evaluation set for the data leakage influence degree and the data leakage occurrence probability to the equipment set of the information system, $E = \{E_1, E_2, \dots, E_t, \dots, E_T\}$.
It should be noted that in conventional fuzzy evaluation, a risk evaluation set is constructed for each risk factor, and experts evaluate each risk factor against the criteria of the layer above to measure its importance. However, because the data leakage influence degree and the data leakage occurrence probability of an information system are tied to the equipment of that system, this embodiment adapts the general fuzzy evaluation method to the characteristics of information system data leakage: the equipment set of the information system, $E = \{E_1, E_2, \dots, E_t, \dots, E_T\}$, is used as the risk evaluation set for the data leakage influence degree and the data leakage occurrence probability.
As a further preferred embodiment, step S35 (calculating a membership matrix from the risk element set and the risk evaluation set, and calculating the ranking weight vector of the membership matrix) comprises the following steps S351 to S353:
S351, establishing a fuzzy mapping from the risk element set to the risk evaluation set:
$f: U \to F(E)$
where $F(E)$ is the family of all fuzzy sets on the risk evaluation set $E$, and the relation $u_k \mapsto f(u_k) = (p_{k1}, p_{k2}, \dots, p_{kK}) \in F(E)$ holds. The mapping $f$ gives the degree of membership of risk factor $u_k$ in each evaluation criterion of the risk evaluation set, and these degrees of membership form the membership vector $P_l = (p_{l1}, p_{l2}, \dots, p_{lm})$, $l = 1, 2, \dots, m$;
S352, constructing a membership matrix P according to the membership degree of the risk element set U to the risk evaluation set E:
Figure BDA0002984160250000121
where the element $p_{km}$ of the membership matrix represents the probability that the $k$-th risk element belongs to the $m$-th evaluation factor.
S353, because the magnitude of the evaluation factors in the evaluation set strongly influences the importance of the risk factors, weights are assigned to the evaluation factors in the risk evaluation set. Let the weight distribution set be $A = (a_1, a_2, \dots, a_k, \dots, a_K)$, where $a_k$ is the weight of the $k$-th evaluation factor relative to the other evaluation factors, and perform the fuzzy transformation:
Figure BDA0002984160250000131
where $v_b$ represents the relative weight of data leakage under the $b$-th criterion, and $V$ is the relative weight of the data-level risk factors under each criterion of the equipment level; normalizing $V$ yields the ranking weight vector of the membership matrix:
$W_{vb} = (W_{vb1}, W_{vb2}, \ldots, W_{vbk}, \ldots, W_{vbK})$
where $W_{vb}$ represents the ranking weight vector under the $b$-th system-level criterion, and $W_{vbk}$ represents the weight of the $k$-th risk factor relative to the other risk factors under that system-level criterion.
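Steps S352 and S353 can be illustrated with a short Python sketch. The image carrying the fuzzy transformation formula is not reproduced in this text, so the sketch assumes the common weighted-average fuzzy operator, $V = P A^{T}$, followed by normalization to unit sum. The function name `ranking_weights_from_membership`, the use of NumPy, and the sample numbers are illustrative assumptions, not part of the patent; the sketch also assumes the length of `A` equals the number of columns of `P`.

```python
import numpy as np


def ranking_weights_from_membership(P, A):
    """Weighted-average fuzzy transformation followed by normalization.

    P: (K, M) membership matrix; P[k, m] is the degree to which risk
       element k belongs to evaluation factor m.
    A: (M,) weights of the evaluation factors.
    Returns a length-K vector of non-negative weights that sums to 1.

    Assumption: the fuzzy operator is the weighted average (V = P @ A);
    the source text does not show the operator explicitly.
    """
    P = np.asarray(P, dtype=float)
    A = np.asarray(A, dtype=float)
    if P.ndim != 2 or A.shape != (P.shape[1],):
        raise ValueError("A must have one weight per column of P")
    V = P @ A                      # relative weight of each risk element
    total = V.sum()
    if total <= 0:
        raise ValueError("fuzzy transformation produced no positive weight")
    return V / total               # normalized ranking weight vector


# Hypothetical example: three risk elements, two evaluation factors.
P_example = [[0.2, 0.8],
             [0.6, 0.4],
             [0.5, 0.5]]
A_example = [0.25, 0.75]
W_v = ranking_weights_from_membership(P_example, A_example)
```

Other fuzzy operators, such as max-min composition, would give different values of $V$; the weighted average is used here only because it retains the contribution of every evaluation factor.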
As a further preferred embodiment, step S36, synthesizing the ranking weight vector of the judgment matrix and the ranking weight vector of the membership matrix by the fuzzy mathematical method to obtain the data leakage risk value of the information system, comprises:
transposing the ranking weight vector $W$ of the judgment matrix to obtain:
$W' = W^{T} = (W_1, W_2, \ldots, W_n)^{T}$
calculating a data leakage risk value of each device of the information system:
$r = \{r_1, r_2, \ldots, r_z, \ldots, r_Z\}$
$r_z = (W_{v1z}, W_{v2z}, \ldots, W_{vbz})\, W'$;
where $z$ denotes the $z$-th device and $W_{vb}$ is the ranking weight vector of the membership matrix.
Because all equipment of the information system is on the same network, every device is treated as equally important, and the total risk value $R$ of the information system is calculated by a weighted average as follows:
Figure BDA0002984160250000132
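The synthesis in step S36 can be sketched as follows. The formula image for $R$ is not reproduced in this text; because the description states that all devices have the same importance, the sketch assumes equal weights, so the weighted average reduces to the arithmetic mean of the per-device values $r_z$. The function names, the array layout (one row per criterion, one column per device) and the sample numbers are illustrative assumptions.

```python
import numpy as np


def device_risks(Wv, W):
    """Per-device risk values r_z = (W_v1z, W_v2z, ..., W_vnz) . W'.

    Wv: (n, Z) array; row b holds the ranking weight vector W_vb of the
        membership matrix under criterion b, one entry per device z.
    W:  (n,) ranking weight vector of the judgment matrix.
    """
    Wv = np.asarray(Wv, dtype=float)
    W = np.asarray(W, dtype=float)
    if Wv.ndim != 2 or W.shape != (Wv.shape[0],):
        raise ValueError("W must have one weight per row (criterion) of Wv")
    return Wv.T @ W


def total_risk(r):
    """Total risk R, assuming every device carries the same weight."""
    r = np.asarray(r, dtype=float)
    if r.size == 0:
        raise ValueError("at least one device is required")
    return float(r.mean())


# Hypothetical example: three criteria, two devices.
W_example = [0.5, 0.3, 0.2]
Wv_example = [[0.2, 0.8],
              [0.6, 0.4],
              [0.5, 0.5]]
r = device_risks(Wv_example, W_example)
R = total_risk(r)
```

If devices were not equally important, `total_risk` could take a weight vector and use `np.average(r, weights=...)` instead; that variant goes beyond what the text describes.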
As a further preferred technical solution, after the risk value $R$ of the information system is calculated, the data leakage risk level of the information system is classified as low risk, medium risk, or high risk according to the data leakage risk value; the corresponding risk values are shown in Table 1:
Table 1. Correspondence between information system data leakage risk levels and risk values
Figure BDA0002984160250000141
It should be noted that, in this embodiment, the data leakage information of the information system is first matched against the confidential sensitive information base. The analytic hierarchy process is then used to obtain the ranking weight vector of the judgment matrix, the ranking weight vector of the membership matrix relating the risk elements to the evaluation set is obtained, and finally a fuzzy mathematical method is used to synthesize the two ranking weight vectors into the data leakage risk value of the information system.
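The analytic hierarchy process step mentioned above can be sketched in Python. The formula images for the feature vector and its normalization are not reproduced in this text; the sketch assumes the widely used geometric-mean (root) method, $m_i = \left(\prod_{j=1}^{n} b_{ij}\right)^{1/n}$ and $W_i = m_i / \sum_{j} m_j$, which is consistent with the claims' reference to the row elements $b_{i1}, b_{i2}, \ldots, b_{in}$ and the order $n$ of the judgment matrix. The function name and the sample matrix are illustrative assumptions.

```python
import numpy as np


def ahp_ranking_weights(B):
    """Ranking weight vector of a judgment matrix (geometric-mean method).

    B: (n, n) positive pairwise-comparison matrix, where B[i, j] is the
       importance of element i relative to element j.

    Assumption: the source's elided formula is the n-th root of each row
    product, followed by normalization to unit sum.
    """
    B = np.asarray(B, dtype=float)
    if B.ndim != 2 or B.shape[0] != B.shape[1]:
        raise ValueError("judgment matrix must be square")
    if np.any(B <= 0):
        raise ValueError("judgment matrix entries must be positive")
    n = B.shape[0]
    M = np.prod(B, axis=1) ** (1.0 / n)   # feature vector M = (m_1, ..., m_n)
    return M / M.sum()                    # ranking weight vector W


# Hypothetical, perfectly consistent matrix with importance ratio 4 : 2 : 1.
B_example = [[1.0, 2.0, 4.0],
             [0.5, 1.0, 2.0],
             [0.25, 0.5, 1.0]]
W = ahp_ranking_weights(B_example)
```

For a perfectly consistent matrix such as this one, the geometric-mean method recovers the underlying ratios exactly. For inconsistent expert judgments, a consistency check is usually added; the text shown here does not describe one.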
The technical effects of the present invention are as follows:
(1) The analytic hierarchy process, the fuzzy mathematical method and the probability calculation method are applied together, which makes the data leakage risk assessment method easier to operate and to compute, effectively reduces the influence of subjective factors on the assessment, and makes the assessment more objective.
(2) Starting from the basic judgment that information system data leakage arises mainly from data in use, data in transmission and data in storage, the method applies the analytic hierarchy process to stratify terminal data, network data and file data, and determines the evaluation factors, weights and membership functions and constructs the evaluation matrices accordingly, so that the resulting assessment of information system data leakage risk is more targeted.
(3) Following the idea of the analytic hierarchy process, the elements involved in data leakage risk assessment are divided into three levels: the data level, the equipment level and the system level. On this basis, quantitative analysis and calculation are carried out with the fuzzy mathematical method and the probability calculation method, which converts the multi-objective comprehensive evaluation problem of a network information system into a problem of hierarchical weight decision and fuzzy membership.
This embodiment is mainly intended to capture the essence of the analytic hierarchy process, the fuzzy mathematical method and the probability calculation method in light of the characteristics and precise classification of information system data leakage, and to apply them flexibly to the assessment of information system data leakage risk.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (9)

1. A data leakage risk assessment method is characterized by comprising the following steps:
acquiring data leakage historical information of an information system, processing the data leakage historical information, and constructing a data leakage information feature word set;
matching the characteristic words in the data leakage information characteristic word set with the sensitive information in the confidential sensitive information base item by item, and constructing a characteristic word matching set by using the characteristic words successfully matched with the sensitive information;
and processing the feature words in the feature word matching set by using an analytic hierarchy process and a fuzzy mathematical process to obtain a data leakage risk value of the information system.
2. The data leakage risk assessment method according to claim 1, wherein the obtaining of data leakage history information of the information system, and the processing of the data leakage history information, and the constructing of the data leakage information feature word set include:
extracting text information of the data leakage historical information;
performing word segmentation processing on the text information by using a statistical word segmentation method to obtain a word segmentation list;
and performing stop word filtering on the words in the word segmentation list, and constructing the data leakage information feature word set from the words that remain after the stop words are deleted.
3. The data leakage risk assessment method of claim 1, wherein the processing the feature words in the feature word matching set by using an analytic hierarchy process and a fuzzy mathematical process to obtain the data leakage risk value of the information system comprises:
constructing hierarchical structures of the importance degrees of the data leakage risk factors of different levels;
judging the relative importance of the data leakage risk factors of different levels by adopting a scaling method according to the hierarchical structure, and constructing a judgment matrix;
calculating a ranking weight vector of the judgment matrix by using the analytic hierarchy process;
constructing a risk element set according to the data leakage risk elements, and constructing a risk evaluation set for each data leakage risk element according to a fuzzy mathematical method;
calculating a membership matrix according to the risk element set and the risk evaluation set, and calculating a ranking weight vector of the membership matrix;
and synthesizing the ranking weight vector of the judgment matrix and the ranking weight vector of the membership matrix by using the fuzzy mathematical method to obtain a data leakage risk value of the information system.
4. The data leakage risk assessment method of claim 3, wherein the constructing of the hierarchical structure of importance of the data leakage risk factors of different levels comprises:
and setting the influence factors of the importance of the data leakage risk factors, including the data leakage occurrence probability, the data leakage influence degree and the information system equipment importance, and constructing a hierarchical structure for describing the importance of the data leakage risk factors of different levels of the information system.
5. The method according to claim 4, wherein the evaluating the relative importance of the data leakage risk factors of different levels by using a scaling method according to the hierarchical structure to construct a judgment matrix comprises:
setting the relative importance of the data leakage occurrence probability as $I_p$, the relative importance of the data leakage influence degree as $I_F$, and the relative importance of the information system equipment as $I_D$, where:
Figure FDA0002984160240000021
Figure FDA0002984160240000022
where $F_h$ represents a data leakage event of file level $h$, and $D_g$ represents a data leakage event of leaking-device class $g$;
constructing a judgment matrix B according to the ratio of the data leakage occurrence probability, the data leakage influence degree and the relative importance of the information system equipment importance:
Figure FDA0002984160240000031
where the element $b_{ij}$ of the judgment matrix $B$ represents the ratio of the data leakage occurrence probability of the $i$-th element to the data leakage occurrence probability of the $j$-th element.
6. The method according to claim 4, wherein the calculating the ranking weight vector of the judgment matrix by the analytic hierarchy process comprises:
calculating a feature vector $M$ from the judgment matrix:
$M = (m_1, m_2, \ldots, m_i, \ldots, m_n)$
where
Figure FDA0002984160240000032
$b_{i1}, b_{i2}, \ldots, b_{in}$ are the elements of the judgment matrix, and $n$ is the order of the judgment matrix;
normalizing the feature vector $M$ to obtain the ranking weight vector $W = (W_1, W_2, \ldots, W_n)$ of the judgment matrix, where
Figure FDA0002984160240000033
7. The data leakage risk assessment method of claim 4, wherein said constructing a risk element set from the data leakage risk elements and constructing a risk evaluation set for each of the data leakage risk elements according to a fuzzy mathematical method comprises:
constructing the risk element set $U = \{u_1, u_2, \ldots, u_k, \ldots, u_K\}$;
setting the risk evaluation set of the data leakage influence degree and the data leakage occurrence probability as the equipment set $E = \{E_1, E_2, \ldots, E_t, \ldots, E_T\}$ of the information system.
8. The method of claim 7, wherein the calculating a membership matrix according to the risk element set and the risk evaluation set and calculating a ranking weight vector of the membership matrix comprises:
establishing a fuzzy mapping function from the risk element set to the risk evaluation set:
$f: U \to F(E)$
where $F(E)$ is the set of all fuzzy sets on the risk evaluation set $E$, and the mapping satisfies $u_k \mapsto f(u_k) = (p_{k1}, p_{k2}, \ldots, p_{kK}) \in F(E)$; the mapping $f$ gives the degree of membership of the risk factor $u_k$ in each evaluation criterion of the risk evaluation set, and the membership degrees of a risk factor in the risk evaluation set form a membership vector $P_l = (p_{l1}, p_{l2}, \ldots, p_{lm})$, $l = 1, 2, \ldots, m$;
Constructing a membership matrix P according to the membership degree of the risk element set U to the risk evaluation set E:
Figure FDA0002984160240000041
where the element $p_{km}$ of the membership matrix represents the probability that the $k$-th risk element belongs to the $m$-th evaluation factor;
and assigning weights to the evaluation factors in the risk evaluation set, setting the weight distribution set as $A = (a_1, a_2, \ldots, a_k, \ldots, a_K)$, where $a_k$ represents the weight of the $k$-th evaluation factor relative to the other evaluation factors, and performing the fuzzy transformation operation:
Figure FDA0002984160240000042
where $V$ is the relative weight of the data-level risk factors under each criterion of the equipment level, and the ranking weight vector is obtained after normalization:
$W_{vb} = (W_{vb1}, W_{vb2}, \ldots, W_{vbk}, \ldots, W_{vbK})$
where $W_{vb}$ represents the ranking weight vector under the $b$-th system-level criterion, and $W_{vbk}$ represents the weight of the $k$-th risk factor relative to the other risk factors under that system-level criterion.
9. The data leakage risk assessment method according to claim 4, wherein the synthesizing the ranking weight vector of the judgment matrix and the ranking weight vector of the membership matrix by using the fuzzy mathematical method to obtain the data leakage risk value of the information system comprises:
transposing the ranking weight vector $W$ of the judgment matrix to obtain:
$W' = W^{T}$
calculating a data leakage risk value of each device of the information system:
$r = \{r_1, r_2, \ldots, r_z, \ldots, r_Z\}$
$r_z = W_{vb} W'$;
where $W_{vb}$ is the ranking weight vector of the membership matrix;
and calculating the total risk value R of the information system by adopting a weighted average method as follows:
Figure FDA0002984160240000051
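The preprocessing recited in claims 1 and 2 (word segmentation, stop-word filtering, and item-by-item matching against the confidential sensitive information base) can be illustrated with a minimal Python sketch. Several simplifications are assumptions rather than the claimed method: a regular-expression tokenizer stands in for the statistical word segmentation method (Chinese text would need a trained segmenter), matching is exact and case-insensitive, and the function names, stop words and sample terms are hypothetical.

```python
import re


def build_feature_words(text, stop_words):
    """Build a feature word list from leakage-history text.

    Tokenizes on word characters (a stand-in for statistical word
    segmentation), drops stop words, and keeps the first occurrence of
    each remaining word in its original order.
    """
    stop = {w.lower() for w in stop_words}
    seen = set()
    features = []
    for token in re.findall(r"\w+", text.lower()):
        if token in stop or token in seen:
            continue
        seen.add(token)
        features.append(token)
    return features


def match_feature_words(features, sensitive_terms):
    """Return the feature words that appear in the sensitive-term base."""
    sensitive = {t.lower() for t in sensitive_terms}
    return [w for w in features if w in sensitive]


# Hypothetical example data.
history = "The budget file was sent to an external mailbox; budget totals leaked."
stops = {"the", "was", "to", "an"}
features = build_feature_words(history, stops)
matched = match_feature_words(features, {"Budget", "mailbox", "payroll"})
```

The matched list plays the role of the feature word matching set that the later analytic hierarchy process and fuzzy evaluation steps take as input.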
CN202110295438.7A 2021-03-19 2021-03-19 Data leakage risk assessment method Pending CN112948823A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110295438.7A CN112948823A (en) 2021-03-19 2021-03-19 Data leakage risk assessment method

Publications (1)

Publication Number Publication Date
CN112948823A true CN112948823A (en) 2021-06-11

Family

ID=76226815

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110295438.7A Pending CN112948823A (en) 2021-03-19 2021-03-19 Data leakage risk assessment method

Country Status (1)

Country Link
CN (1) CN112948823A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Yu Xiang

Inventor after: Li Qiang

Inventor after: Tian Nan

Inventor after: Li Tengfei

Inventor after: Chen Lizhe

Inventor after: Shu Zhanxiang

Inventor after: Li Menglin

Inventor before: Li Qiang

Inventor before: Yu Xiang

Inventor before: Tian Nan

Inventor before: Li Tengfei

Inventor before: Chen Lizhe

Inventor before: Shu Zhanxiang

Inventor before: Li Menglin
