CN114157628A - Dynamic divulgence risk assessment method and device based on FCE algorithm

Info

Publication number: CN114157628A
Application number: CN202111208786.2A
Authority: CN
Inventors: 张鹏; 朱东伟; 许洪波
Original assignee: Institute of Information Engineering of CAS
Current assignee: Institute of Information Engineering of CAS
Priority date: 2021-10-18
Filing date: 2021-10-18
Publication date: 2022-03-08

Abstract

The invention discloses a dynamic divulgence risk assessment method and device based on the FCE algorithm. The method comprises: obtaining risk item data information of the user data to be assessed; based on the risk item data information and the thresholds set for the risk grades of each risk item, performing simulated-expert normalized scoring in combination with the FCE algorithm to obtain a scoring array for each risk item; constructing a user comprehensive risk characteristic matrix from the scoring arrays; and performing matrix multiplication of the normalized score array of the risk items set by experts with the user comprehensive risk characteristic matrix, and obtaining the comprehensive risk grade of the user to be assessed from the matrix multiplication result. The invention dynamically judges and quantitatively evaluates divulgence risk from the user's real-time data, and the results obtained are weighted toward the needs of business personnel.

Description

Dynamic divulgence risk assessment method and device based on FCE algorithm

Technical Field

The invention relates to the field of information security risk assessment, in particular to a dynamic divulgence risk assessment method and device based on an FCE algorithm.

Background

As an early medium for transmitting information over the internet, the mailbox system has been popular from its appearance to the present day because it is convenient to use and allows information and files to be exchanged at any time and in any place. With the rapid development of internet technology, people rely increasingly on mailboxes to transmit important information. In February 2008 the number of internet users in China exceeded that of the United States, making China the country with the most internet users in the world. By June 2018 the number of Chinese internet users had reached 802 million, of whom 306 million were mail users, accounting for 38.1% of all internet users. More and more internet users transmit important information through mailboxes, and these users include "divulgers" inside units or organizations who leak secrets for illegal gain.

Because a mailbox system has a large number of users, complex interactions among users, complex sent content and many risk items involved, quantitative evaluation of mailbox divulgence risk has long been difficult. Business personnel find it hard to focus on the mailbox users with high divulgence risk, which poses challenges for detecting and preventing divulgence.

Disclosure of Invention

Quantitative evaluation of mailbox divulgence risk is difficult because a mailbox system has a large number of users, complex interactions among users, complex sent content and many risk items involved. To address this problem, the invention provides a dynamic divulgence risk assessment method and device based on the fuzzy comprehensive evaluation (FCE) method, which dynamically analyzes and scores the mail data sent by a mailbox user and gives the user's current risk level.

The invention provides a dynamic divulgence risk assessment method based on the FCE algorithm, comprising the following steps:

1) acquiring risk item data information of user data to be evaluated;

2) based on the risk item data information and the thresholds set for the risk grades of each risk item, performing simulated-expert normalized scoring in combination with the FCE algorithm to obtain a scoring array for each risk item;

3) constructing a user comprehensive risk characteristic matrix from the scoring arrays;

4) performing matrix multiplication of the normalized score array of the risk items set by experts with the user comprehensive risk characteristic matrix, and obtaining the comprehensive risk grade of the user to be evaluated from the matrix multiplication result.
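Steps 3) and 4) can be written compactly in the usual fuzzy comprehensive evaluation notation. The symbols below are illustrative assumptions and do not appear in the original text: $W$ is the expert-assigned normalized score (weight) array over $k$ risk items, $R$ is the user comprehensive risk characteristic matrix whose row $i$ is the scoring array of risk item $i$ over $m$ risk levels, and $B$ is the resulting user score array.

$$
W = (w_1, \dots, w_k), \quad \sum_{i=1}^{k} w_i = 1, \qquad R = (r_{ij})_{k \times m}
$$

$$
B = W R, \qquad b_j = \sum_{i=1}^{k} w_i \, r_{ij} \quad (j = 1, \dots, m), \qquad \text{risk level} = \arg\max_{j} \, b_j
$$

If each row of $R$ also sums to 1, then $\sum_j b_j = 1$, so $B$ can be read as the user's membership degree in each risk level.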

Further, the user data to be evaluated includes: mailbox data, Facebook data, or Twitter data.

Further, the risk item data information includes: the proportion of mail on confidential topics, the proportion of mail sent overseas, the proportion of mail received overseas, and the proportion of mail suspected of containing passwords.

Further, the number of simulated-expert groups is m = n + 1, where n is the number of thresholds set for the risk levels.

Further, the number of thresholds set for the risk levels is selected according to business requirements.

Further, the method for acquiring the comprehensive risk level comprises: determining the comprehensive risk level from the matrix multiplication result based on the maximum membership principle.

A storage medium having a computer program stored therein, wherein the computer program is arranged to perform the above method when executed.

An electronic device comprising a memory and a processor, wherein the memory stores a program that performs the above described method.

The invention has the beneficial effects that:

The invention provides a dynamic divulgence risk assessment method based on the FCE algorithm. First, quantitative evaluation of mailbox divulgence risk is currently difficult because a mailbox system has a large number of users, complex interactions among users, complex sent content and many risk items involved. To address this, the invention uses the idea of fuzzy comprehensive evaluation to design a fuzzy comprehensive scoring mechanism that simulates experts, so that divulgence risk can be evaluated quantitatively. Second, the invention provides a scoring mechanism based on expert knowledge of the specific business. It uses the business experts' understanding of each risk item in the evaluation domain and assigns normalized weights according to the importance of each risk item, so that important risk items are emphasized in line with the preferences of business personnel, the results are weighted toward their needs, and they can focus more easily on the users of interest.

In addition, the method is suitable for security risk assessment in the field of information security (including but not limited to mailbox risk assessment), can effectively capture the high-risk users of interest to business personnel, and has a degree of generality. Because it integrates an expert knowledge scoring mechanism, the invention also improves, to some extent, the probability that business personnel catch the users of interest, and is well suited to adoption for specific businesses.

Drawings

Fig. 1 is a schematic flow chart of a dynamic divulgence risk assessment method based on the FCE algorithm according to an embodiment of the present invention.

Fig. 2 is a policy diagram of a dynamic divulgence risk assessment method based on the FCE algorithm according to an embodiment of the present invention.

Fig. 3 is a detailed flow chart of a dynamic divulgence risk assessment method based on the FCE algorithm according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The embodiment of the invention provides a dynamic divulgence risk assessment method based on the FCE algorithm. As shown in Figs. 1-3, the method mainly includes:

S101: feeding the user data to be evaluated into the system, starting the system's scheduled labeling task, and labeling all data;

Specifically, after the data has been fed into the system, the scheduled labeling task in the system is started. For mailbox data, the labels are mainly a confidential topic label, a sent-overseas label, a received-overseas label and a suspected password-containing label. The purpose of labeling the data is to make subsequent dynamic real-time queries convenient;
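Once mails carry these labels, the per-user risk item proportions used later in S103 reduce to simple counting. The following is a minimal sketch under stated assumptions: the label strings and the `risk_item_ratios` helper are hypothetical names chosen for illustration, and each mail is represented simply as a set of labels rather than as a record in the system's actual store.

```python
# Hypothetical label names for the four labels named in the text.
RISK_LABELS = (
    "confidential_topic",
    "sent_overseas",
    "received_overseas",
    "suspected_password",
)


def risk_item_ratios(mail_labels):
    """Return, per risk label, the share of a user's mails carrying that label.

    mail_labels: one set of label strings per mail belonging to the user.
    """
    total = len(mail_labels)
    if total == 0:
        # No mail yet: report zero for every risk item instead of dividing by zero.
        return {label: 0.0 for label in RISK_LABELS}
    return {
        label: sum(1 for labels in mail_labels if label in labels) / total
        for label in RISK_LABELS
    }


# Illustrative data only: four mails, some carrying more than one label.
example_mails = [
    {"confidential_topic"},
    {"sent_overseas", "suspected_password"},
    set(),
    {"confidential_topic", "received_overseas"},
]
ratios = risk_item_ratios(example_mails)
```

Because the labels are assigned ahead of time by the scheduled task, a query of this kind only has to count labeled mails, which is what makes re-evaluating a user on demand cheap.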

S102: using expert knowledge, giving each risk item a normalized score and setting thresholds on the risk item proportions;

Specifically, based on the business experts' understanding of the data, the risk items are given normalized scores and thresholds are set on the risk item proportions. For example, in the example system, the normalized scores given to the four risk items of the mailbox data are [0.55, 0.1, 0.15, 0.2], and the thresholds are set as follows. For mail on confidential topics, a proportion of 3% or more is high risk, a proportion from 1% (inclusive) up to 3% is medium risk, and a proportion below 1% is low risk. For mail sent overseas and mail received overseas, the thresholds are 15% and 5%. For suspected password-containing mail, the thresholds are 5% and 2%. Note that the number of threshold parameters may be 2, as in this example, which divides the data into three classes (high, medium and low), or may be set to 4 as required, which divides the data into five classes. In addition, the number of simulated-expert scoring groups in S103 should be the number of threshold parameters plus 1 (if n thresholds are set, there are n + 1 groups). This ensures that a set of simulated expert scores corresponds to the user's risk item data no matter which threshold interval the data falls in.
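The example settings above can be captured as a small configuration plus an interval lookup. This is a sketch only: the dictionary keys and the `interval_index` helper are hypothetical, and it assumes that the score array [0.55, 0.1, 0.15, 0.2] is listed in the same order as the four risk items (confidential topic, sent overseas, received overseas, suspected password), which the text does not state explicitly.

```python
from bisect import bisect_right

# Expert weights from the example; the item order is an assumption.
WEIGHTS = {
    "confidential_topic": 0.55,
    "sent_overseas": 0.10,
    "received_overseas": 0.15,
    "suspected_password": 0.20,
}

# Thresholds from the example, as fractions in ascending order:
# (low/medium boundary, medium/high boundary).
THRESHOLDS = {
    "confidential_topic": [0.01, 0.03],
    "sent_overseas": [0.05, 0.15],
    "received_overseas": [0.05, 0.15],
    "suspected_password": [0.02, 0.05],
}

# n thresholds split the range into n + 1 intervals, one per scoring group.
LEVELS = ["low", "medium", "high"]


def interval_index(ratio, thresholds):
    """Return which of the n + 1 intervals the ratio falls in (0 = lowest).

    bisect_right places a ratio equal to a threshold in the upper interval,
    matching "3% or more is high" and "1% (inclusive) is medium".
    """
    return bisect_right(thresholds, ratio)


level = LEVELS[interval_index(0.02, THRESHOLDS["confidential_topic"])]  # "medium"
```

Switching to four thresholds per item would only require longer threshold lists and a five-entry `LEVELS` list; the lookup itself is unchanged.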

S103: acquiring the data information of each risk item of the user, simulating an expert scoring mechanism using expert knowledge, giving the risk items normalized scores one by one based on the risk item data, and constructing a user comprehensive risk matrix from the normalized scores of the risk items;

Specifically, the data information of each risk item of the user is acquired (in this example, the ratio of the number of the user's mails for each risk item to the user's total number of mails). Using the threshold information set in S102, the corresponding normalized score array is selected from the score arrays generated by the simulated expert scoring mechanism, and the user comprehensive risk matrix is constructed from the normalized score arrays of the risk items. This comprises the following steps:

S1031: acquiring the data information of each risk item of the user, including: the proportion of mail on confidential topics, the proportion of mail sent overseas, the proportion of mail received overseas, and the proportion of mail suspected of containing passwords.

S1032: based on the data information of each risk item of the user acquired in S1031 and the threshold information set using expert knowledge in S102, performing simulated-expert normalized scoring in combination with the FCE (fuzzy comprehensive evaluation) method to obtain the simulated-expert normalized score array corresponding to each risk item.

S1033: constructing the user comprehensive risk characteristic matrix from the score arrays of the user's risk items obtained in S1032.
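Steps S1031 to S1033 can be sketched as a lookup followed by row stacking. The text does not give the contents of the simulated expert score arrays, so the values in `SIMULATED_EXPERT_SCORES` below are invented purely for illustration, as are the function name and the sample proportions; only the thresholds come from the example in S102.

```python
from bisect import bisect_right

# ASSUMPTION: one normalized score array per threshold interval, read as
# membership degrees over (low, medium, high). These numbers are illustrative
# and are not taken from the source.
SIMULATED_EXPERT_SCORES = [
    [0.7, 0.2, 0.1],  # proportion falls in the lowest interval
    [0.2, 0.6, 0.2],  # proportion falls in the middle interval
    [0.1, 0.2, 0.7],  # proportion falls in the highest interval
]


def build_risk_matrix(ratios, thresholds):
    """Build the user matrix: one row (score array) per risk item.

    ratios: the user's proportion for each risk item.
    thresholds: for each risk item, its ascending list of n thresholds.
    """
    matrix = []
    for ratio, cuts in zip(ratios, thresholds):
        group = bisect_right(cuts, ratio)  # index of one of the n + 1 groups
        matrix.append(list(SIMULATED_EXPERT_SCORES[group]))
    return matrix


# Thresholds from the example in S102; the proportions are made up.
example_thresholds = [[0.01, 0.03], [0.05, 0.15], [0.05, 0.15], [0.02, 0.05]]
example_ratios = [0.04, 0.02, 0.10, 0.0]
risk_matrix = build_risk_matrix(example_ratios, example_thresholds)
```

With these inputs the four items land in the high, low, medium and low intervals respectively, so the matrix has four rows and three columns.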

S104: performing the matrix multiplication and applying the maximum membership principle to obtain the user's comprehensive risk level.

Specifically, the normalized score array of the risk items obtained in S102 is matrix-multiplied with the user comprehensive risk matrix constructed in S103, and the user's comprehensive risk level is obtained from the resulting user comprehensive score array according to the maximum membership principle.
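The calculation in S104 is a vector-matrix product followed by picking the largest entry. A minimal sketch follows, using the weights from the example in S102; the matrix rows are illustrative values that are not taken from the source, and the function names are hypothetical.

```python
def composite_scores(weights, risk_matrix):
    """Multiply the 1 x k weight array by the k x m risk matrix."""
    n_levels = len(risk_matrix[0])
    return [
        sum(w * row[j] for w, row in zip(weights, risk_matrix))
        for j in range(n_levels)
    ]


def max_membership_level(scores, levels):
    """Maximum membership principle: pick the level with the largest score.

    On a tie, max() keeps the first (lowest-index) level.
    """
    best = max(range(len(scores)), key=scores.__getitem__)
    return levels[best]


weights = [0.55, 0.10, 0.15, 0.20]  # from the example in S102
# ASSUMPTION: illustrative rows over (low, medium, high), one per risk item.
risk_matrix = [
    [0.1, 0.2, 0.7],
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.7, 0.2, 0.1],
]
scores = composite_scores(weights, risk_matrix)
level = max_membership_level(scores, ["low", "medium", "high"])  # "high"
```

Here the scores come out at roughly 0.295, 0.26 and 0.445 for low, medium and high. Although three of the four items point to low or medium, the heavily weighted confidential topic item dominates, which illustrates how the expert weights steer the result toward what business personnel consider important.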

The following experiments test the usability of the dynamic divulgence risk assessment method based on the FCE algorithm provided by the embodiment of the present invention:

To verify the effectiveness of the method provided by the invention, tests were run on 178,583 collected records, comprising organizational mail data and personal mail data. The results show that the risk levels of individuals and organizations can be evaluated rapidly, and that individuals and organizations can be screened by risk level so that business personnel can focus on those with higher risk levels.

The experimental environment is as follows: the example experiments have no special hardware requirements, and the software involved includes Neo4j (for storing user node and edge information) and ES (for storing and querying the underlying mail data).

The dynamic divulgence risk assessment method based on the FCE algorithm can analyze user information that is difficult to quantify, both qualitatively and quantitatively, according to the user's overall behavior and the preferences of business personnel, obtain a comprehensive risk level, and enable business personnel to focus on the users of interest more quickly. The method has a degree of generality for information security risk assessment work and can offer a new approach to information security risk assessment.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

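1. A dynamic divulgence risk assessment method based on an FCE algorithm, comprising the following steps:

1) acquiring risk item data information of user data to be evaluated;

2) based on the risk item data information and the thresholds set for the risk grades of each risk item, performing simulated-expert normalized scoring in combination with the FCE algorithm to obtain a scoring array for each risk item;

3) constructing a user comprehensive risk characteristic matrix from the scoring arrays;

4) performing matrix multiplication of the normalized score array of the risk items set by experts with the user comprehensive risk characteristic matrix, and obtaining the comprehensive risk grade of the user to be evaluated from the matrix multiplication result.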

2. The method of claim 1, wherein the user data to be evaluated comprises: mailbox data, Facebook data, or Twitter data.

3. The method of claim 2, wherein the risk item data information comprises: the proportion of mail on confidential topics, the proportion of mail sent overseas, the proportion of mail received overseas, and the proportion of mail suspected of containing passwords.

4. The method of claim 1, wherein the number of simulated-expert groups is m = n + 1, where n is the number of thresholds set for the risk levels.

5. The method of claim 4, wherein the number of thresholds set for the risk levels is selected according to business needs.

6. The method of claim 1, wherein the method of obtaining the comprehensive risk level comprises: determining the comprehensive risk level from the matrix multiplication result based on the maximum membership principle.

7. A storage medium having a computer program stored thereon, wherein the computer program is arranged to, when executed, perform the method of any of claims 1-6.

8. An electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the method according to any of claims 1-6.