CN109284317B - Time sequence directed graph-based stolen information clue extraction and segmented evaluation method - Google Patents

Info

Publication number
CN109284317B
Authority
CN
China
Prior art keywords
log
information
clue
evaluation
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811259183.3A
Other languages
Chinese (zh)
Other versions
CN109284317A (en)
Inventor
李兴国
苗功勋
郑传义
王蒙
崔新安
张庆亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongfu Safety Technology Co Ltd
Original Assignee
BEIJING ZHONGFU TAIHE TECHNOLOGY DEVELOPMENT CO LTD
Zhongfu Information Co Ltd
Zhongfu Safety Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING ZHONGFU TAIHE TECHNOLOGY DEVELOPMENT CO LTD, Zhongfu Information Co Ltd, Zhongfu Safety Technology Co Ltd filed Critical BEIJING ZHONGFU TAIHE TECHNOLOGY DEVELOPMENT CO LTD
Priority to CN201811259183.3A priority Critical patent/CN109284317B/en
Publication of CN109284317A publication Critical patent/CN109284317A/en
Application granted granted Critical
Publication of CN109284317B publication Critical patent/CN109284317B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method for extracting information-stealing clues and evaluating them in segments based on a time-sequence directed graph. Log information is acquired from the whole intranet, including the massive log data generated by the various protection and supervision devices in the intranet; during acquisition the log information is cleaned and labeled to form normalized (formatted) clue data. The normalized clue data includes at least four kinds of information: the clue body, the associated attributes, the clue stage, and the clue time. Each piece of clue data is then taken as a vertex and the associated attributes as edges, and all clue data within the selected time range are connected in directed series to form a directed graph composed of the intranet clues and their associated attributes. The directed graph is then traversed to extract information-stealing clue chains, and an information-stealing evaluation function is established to evaluate each clue chain according to the number of clue points and the completeness of the clue stages. Clue chains with higher evaluated risk values are extracted and an alarm is raised.

Description

Time sequence directed graph-based stolen information clue extraction and segmented evaluation method
Technical Field
The invention relates to the technical field of internet information security, in particular to a method for extracting stolen information clues and evaluating the stolen information clues in a segmented mode based on a time sequence directed graph.
Background
With the continuous improvement of domestic informatization, government bodies and organizations have gradually established internal office networks or industry private networks, and for various reasons many of these office networks cannot be connected to the internet. Networks isolated from the internet are often used to transmit and process sensitive information and thus become sensitive networks. Information security protection in such networks is a crucial issue.
At present, information security solutions for the intranet are limited in variety and are basically designed around traditional internet information security measures. These solutions mainly address the traditional protection needs of intranet devices, such as viruses, malicious software, and system vulnerabilities, and harden and protect the devices and systems in the intranet. They do not provide targeted hardening and protection against information stealing and leakage in a sensitive intranet. In operation, these traditional supervision measures generate a large volume of management logs, operation logs, and alarm information; because of this volume, they cannot provide accurate clues for judging whether information has been stolen or leaked.
Disclosure of Invention
In order to overcome the deficiencies in the prior art, the invention provides a stolen information clue extraction and segmentation evaluation method based on a time sequence directed graph, so as to solve the technical problems.
The technical scheme of the invention is as follows:
A method for extracting information-stealing clues and evaluating them in segments based on a time-sequence directed graph comprises the following steps:
acquiring log information and extracting clues from the log information during acquisition;
performing directed series connection and segmentation on the acquired log clues, so that all clues in the network within a determined time range form a finite directed graph;
extracting information-stealing clue chains from the directed graph;
and establishing an information-stealing evaluation function to evaluate each clue chain.
Preferably, the step of acquiring log information in the whole intranet and extracting clues from the log information during acquisition includes:
acquiring the massive log data generated by the various protection and supervision devices in the intranet;
cleaning and labeling the log data during acquisition to form normalized clue data;
dividing the acquired log data into stages according to the process characteristics of stealing and leakage events; the attack chain model is adapted to the behavior patterns of intranet attacks, and the log data are divided into: intention stage A, preparation stage B, action stage C, and cover-up stage D.
Preferably, in the step of cleaning and labeling the log data during acquisition to form normalized clue data, the normalized clue data includes at least four kinds of information: the clue body, the associated attributes, the clue stage, and the clue time.
Preferably, the step of cleaning and labeling the log data during acquisition to form normalized clue data specifically includes:
receiving the logs according to the SYSLOG log transmission standard;
parsing and normalizing the received logs by configuring a parsing template;
storing the normalized data to form the per-stage log clue sets, in which event alarms and logs serve as the clues:
[Formula image: BDA0001843496140000021]
where the event clue sets A, B, C, and D are the sets of log clues belonging to the four different stages, respectively, and E represents all log clues in the network and user environment.
Preferably, in the step of storing the normalized data to form the per-stage log clue sets in which event alarms and logs serve as the clues, each clue in a set should include at least two associated attributes and one time attribute.
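As an illustration only, the normalized clue data described above could be held in a record like the one below. This is a minimal sketch, not the patent's data format: the names `Clue`, `Stage`, `group_by_stage`, and the `risk` field are hypothetical, and the attribute strings such as `"user:alice"` are invented sample values. The sketch enforces the stated requirement of at least two associated attributes per clue.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class Stage(Enum):
    """The four stages named in the text."""
    INTENTION = "A"
    PREPARATION = "B"
    ACTION = "C"
    COVER_UP = "D"


@dataclass(frozen=True)
class Clue:
    body: str              # clue body: what the log or alarm reported
    attributes: frozenset  # associated attributes, e.g. {"user:alice", "host:pc-17"}
    stage: Stage           # clue stage
    time: datetime         # clue time
    risk: str = "low"      # hypothetical field: "low", "mid" or "high"

    def __post_init__(self):
        # The text requires at least two associated attributes per clue.
        if len(self.attributes) < 2:
            raise ValueError("a clue needs at least two associated attributes")


def group_by_stage(clues):
    """Return the per-stage clue sets A, B, C, D as a dict keyed by Stage."""
    groups = {stage: [] for stage in Stage}
    for clue in clues:
        groups[clue.stage].append(clue)
    return groups
```

Grouping by stage mirrors the sets A, B, C, and D; the collection of all clues passed to `group_by_stage` plays the role of E.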
Preferably, the step of performing directed series connection and segmentation on the acquired log clues, so that all clues in the network within a determined time range form a finite directed graph, is as follows:
taking each piece of clue data as a vertex and the associated attributes as edges, all clue data within the selected time range are connected in directed series to form a directed graph composed of the intranet clues and their associated attributes.
Preferably, when each piece of clue data is taken as a vertex and the associated attributes as edges to form the directed graph composed of the intranet clues and their associated attributes, the order in which two log clues occur determines the direction of the arrow on the edge connecting them.
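The graph construction can be sketched as follows. This is one possible reading, under stated assumptions: two clues are connected when they share at least one associated attribute, the edge points from the earlier clue to the later one, and clues with equal timestamps are left unconnected because the text does not cover ties. The function name `build_graph` and the tuple layout are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(clues):
    """Build the time-ordered directed graph.

    clues: list of (clue_id, attributes, time) tuples.
    Returns a dict mapping clue_id -> list of later clue_ids that share
    at least one associated attribute with it.
    """
    graph = defaultdict(list)
    for (id1, attrs1, t1), (id2, attrs2, t2) in combinations(clues, 2):
        if not set(attrs1) & set(attrs2):
            continue  # no shared associated attribute, so no edge
        if t1 < t2:
            graph[id1].append(id2)  # arrow points from earlier to later
        elif t2 < t1:
            graph[id2].append(id1)
        # Equal timestamps: no edge (assumption; the text does not cover ties).
    return dict(graph)
```

Because every edge points strictly forward in time, the resulting graph cannot contain a cycle. The pairwise comparison is quadratic in the number of clues; a production system would more likely index clues by attribute first.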
Because the edges of the directed graph formed in the network follow the time direction, the amount of data to be analyzed is reduced and substantial computing resources are saved. When analyzing very large data sets, this gives a clear efficiency advantage over traditional correlation analysis. The raw information-stealing clues captured by filtering along the time direction also conform better to the patterns of information-stealing activity, which improves how well they match real cases.
Preferably, in the step of extracting information-stealing clue chains from the directed graph, each path is an information-stealing clue chain composed of log clues in time order, and the paths together form the information-stealing clue set {L_i}, where each element L_i of the set {L_i} is a log clue chain;
[Formula image: BDA0001843496140000032]
T_i is a log clue chain in the set {L_i} that can be evaluated as an information-stealing clue.
Preferably, the step of establishing an information-stealing evaluation function to evaluate each clue chain includes:
establishing segmented evaluation functions FA(L_i), FB(L_i), FC(L_i), and FD(L_i) to evaluate the performance of the information-stealing clue T_i in each stage;
[Formula image: BDA0001843496140000033]
T_i is a log clue chain in the set {L_i} that can be evaluated as an information-stealing clue. FA(L_i) is the risk performance evaluation of the clue chain T_i in the intention stage A. The stage is scored on two aspects, the number of clue logs within the stage and the risk level of those logs: FA(L_i) = ValCnt(L_i) + ValLel(L_i), where ValCnt(L_i) is a piecewise function, as follows:
[Formula image: BDA0001843496140000031]
CntA(L_i) is the number of clues of the clue chain L_i in stage A.
ValLel(L_i) = 10*(ExLow(L_i) + ExMid(L_i) + ExHig(L_i)) + 15*(ExMid(L_i) + ExHig(L_i)) + 25*ExHig(L_i),
where ExLow(L_i), ExMid(L_i), and ExHig(L_i) indicate whether L_i contains a low-risk-level log, a medium-risk-level log, and a high-risk-level log, respectively; each takes a value in {0, 1}.
Similarly, FB(L_i), FC(L_i), and FD(L_i) are the risk performance evaluation values of the preparation stage B, the action stage C, and the cover-up stage D, respectively.
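The risk-level term ValLel can be computed directly from the formula as written in the text. The sketch below is illustrative: the function name `val_lel` and the labels `"low"`, `"mid"`, and `"high"` are assumptions, and the piecewise count term ValCnt is not sketched because its definition is given only as a formula image.

```python
def val_lel(risk_levels):
    """Risk-level score of one stage of a clue chain.

    risk_levels: iterable of "low" / "mid" / "high" labels, one per clue
    of the chain in that stage (the labels are an assumed encoding).
    """
    levels = set(risk_levels)
    ex_low = int("low" in levels)   # ExLow: 1 if a low-risk log exists
    ex_mid = int("mid" in levels)   # ExMid: 1 if a medium-risk log exists
    ex_hig = int("high" in levels)  # ExHig: 1 if a high-risk log exists
    return 10 * (ex_low + ex_mid + ex_hig) + 15 * (ex_mid + ex_hig) + 25 * ex_hig
```

Under this formula a stage with only a low-risk log scores 10, only a medium-risk log scores 25, and only a high-risk log scores 50, so a single high-risk log outweighs the low and medium levels combined.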
setting the weights WA, WB, WC, and WD that the segmented evaluation functions carry in an information-stealing event;
the weights of the segmented evaluation are obtained from analysis of historical data and may be adjusted according to the data in the application scenario;
establishing a comprehensive information-stealing evaluation function P(L_i) in terms of the evaluation functions FA(L_i), FB(L_i), FC(L_i), FD(L_i) and the weights WA, WB, WC, WD, evaluating the extracted information-stealing clues, and calculating an evaluation value;
wherein
[Formula image: BDA0001843496140000041]
and
[Formula image: BDA0001843496140000042]
is an adjustment function that evaluates how many stages the clue chain covers and applies a positive adjustment to clue chains that cover all stages;
[Formula image: BDA0001843496140000043]
CoverAll(L_i) indicates whether the clues in L_i cover all stages and takes a value in {0, 1}.
The adjustment function rewards log clue chains that show risk across the whole process, so that accuracy can be tuned to specific data and services.
Evaluating the stealing clues in segments greatly improves the accuracy of the captured clues and effectively reduces both the number and the proportion of false alarms. Establishing the model and introducing the segmented evaluation method makes it possible to quantify, through logs, the indicators used to analyze information-stealing behavior, and provides quantitative indicators and algorithmic support for subsequent big data mining and artificial intelligence analysis.
Preferably, the method further comprises:
performing service-related operations on the information-stealing clues, including sorting the clue chains by evaluation value, extracting the clue chains with higher evaluated risk values, and raising an alarm.
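The sorting and alarm step can be sketched as below. The text does not say how "higher" risk values are selected, so the fixed `threshold` parameter is an assumption, as are the function name `rank_and_alarm` and the sample chain identifiers.

```python
def rank_and_alarm(scores, threshold):
    """Sort clue chains by evaluation value and pick those to alarm on.

    scores: dict mapping chain_id -> evaluation value P(L_i).
    threshold: hypothetical cut-off; chains scoring at or above it are alarmed.
    Returns (ranking, alarms), both in descending order of evaluation value.
    """
    ranking = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    alarms = [chain_id for chain_id, value in ranking if value >= threshold]
    return ranking, alarms


ranking, alarms = rank_and_alarm({"L1": 42.0, "L2": 87.5, "L3": 63.0}, threshold=60)
```

A top-N cut would fit the wording of the text equally well; the threshold form is used here only because it is the simplest to state.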
The invention provides a method for extracting information-stealing clues and evaluating them in segments based on a time-sequence directed graph, which enables targeted clue mining and clue accuracy evaluation for information-stealing behavior in a sensitive intranet.
According to the above technical scheme, the invention has the following advantages. The ability to discover and sense information stealing and leakage in the intranet in advance is greatly enhanced. The method uses data mining to extract possible stealing and leakage key information from the massive log information generated by the various protection and supervision devices in the intranet, and assesses the stealing and leakage risk of this key information on the basis of research into the psychology of stealing and leakage, so as to find the key information that carries a high risk of stealing and leakage.
Because the edges of the directed graph follow the time direction, the amount of data to be analyzed is reduced and substantial computing resources are saved. When analyzing very large data sets, this gives a clear efficiency advantage over traditional correlation analysis. The raw information-stealing clues captured by filtering along the time direction also conform better to the patterns of information-stealing activity, which improves how well they match real cases. Evaluating the stealing clues in segments greatly improves the accuracy of the captured clues and effectively reduces both the number and the proportion of false alarms. Establishing the model and introducing the segmented evaluation method makes it possible to quantify, through logs, the indicators used to analyze information-stealing behavior, and provides quantitative indicators and algorithmic support for subsequent big data mining and artificial intelligence analysis.
In addition, the design principle of the invention is reliable, its structure is simple, and its application prospects are very broad.
Therefore, compared with the prior art, the invention has prominent substantive features and represents notable progress, and the beneficial effects of its implementation are evident.
Drawings
FIG. 1 is a flow chart of the method for extracting information-stealing clues and evaluating them in segments based on a time-sequence directed graph;
FIG. 2 is a diagram of the directed series connection and segmentation of log clues.
Detailed Description
The invention provides a method for extracting information-stealing clues and evaluating them in segments based on a time-sequence directed graph. Log information is acquired from the whole intranet, including the massive log data generated by the various protection and supervision devices in the intranet, and during acquisition the log information is cleaned and labeled to form normalized (formatted) clue data. The normalized clue data includes at least four kinds of information: the clue body, the associated attributes, the clue stage, and the clue time. Each piece of clue data is then taken as a vertex and the associated attributes as edges, and all clue data within the selected time range are connected in directed series to form a directed graph composed of the intranet clues and their associated attributes. The directed graph is then traversed to extract information-stealing clue chains, and an information-stealing evaluation function is established to evaluate each clue chain according to the number of clue points and the completeness of the clue stages. Clue chains with higher evaluated risk values are extracted and an alarm is raised. This enables targeted clue mining and clue accuracy evaluation for information-stealing behavior in a sensitive intranet, and greatly enhances the ability to discover and sense information stealing and leakage in the intranet in advance. The method uses data mining to extract possible stealing and leakage key information from the massive log information generated by the various protection and supervision devices in the intranet, and assesses the stealing and leakage risk of this key information on the basis of research into the psychology of stealing and leakage, so as to find the key information that carries a high risk of stealing and leakage.
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Example one
This embodiment provides a method for extracting information-stealing clues and evaluating them in segments based on a time-sequence directed graph, which comprises the following steps:
S1: acquiring log information and extracting clues from the log information during acquisition.
This step is implemented as follows:
S11: acquiring the massive log data generated by the various protection and supervision devices in the intranet;
S12: cleaning and labeling the log data during acquisition to form normalized clue data; the normalized clue data in this step includes at least four kinds of information: the clue body, the associated attributes, the clue stage, and the clue time;
S13: dividing the acquired log data into stages according to the process characteristics of stealing and leakage events; the attack chain model is adapted to the behavior patterns of intranet attacks, and the log data are divided in time order into: intention stage A, preparation stage B, action stage C, and cover-up stage D.
Each stage of a stealing and leakage event has different characteristics:
Intention stage: subjective awareness, change of attitude, change of temperament, and abnormal daily behavior;
Preparation stage: preparatory activities, social activities, information gathering, and break-in attempts;
Action stage: substantive actions, scanning and penetration, system intrusion, and tool deployment;
Cover-up stage: aftermath handling, trace erasing, tool removal, and data transfer.
It should be further explained that the clue extraction process receives logs according to the SYSLOG transmission standard, parses and normalizes the received logs by configuring a parsing template, and stores the normalized data to form the per-stage log clue sets, in which event alarms and logs serve as the clues:
[Formula image: BDA0001843496140000071]
where the event clue sets A, B, C, and D are the sets of log clues belonging to the four different stages, respectively; each clue in these sets should contain at least two associated attributes and one time attribute, and E represents all log clues in the network and user environment. The log clues mentioned here originate from the network devices, security devices, SOC platforms, application systems, and other operation and maintenance systems in the intranet.
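The parsing and normalization step might look like the following sketch. It is not the patent's template format, which the text does not specify: here a "parsing template" is assumed to be a regular expression with named groups, the device tag `usb-guard` and its message layout are invented, and each template is assumed to be tied to one stage. The header pattern follows the common BSD-style syslog line, in which the severity is the PRI value modulo 8.

```python
import re

# BSD-style syslog line: <PRI>Mmm dd hh:mm:ss host message
SYSLOG_LINE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<message>.*)$"
)

# Hypothetical parsing template for one invented device type.
USB_TEMPLATE = re.compile(
    r"usb-guard: user=(?P<user>\S+) action=(?P<action>\S+) file=(?P<file>\S+)"
)


def normalize(line, template, stage):
    """Turn one syslog line into a normalized clue dict, or None if it does not match."""
    header = SYSLOG_LINE.match(line)
    if header is None:
        return None
    fields = template.search(header["message"])
    if fields is None:
        return None
    attributes = {f"host:{header['host']}"}
    attributes.update(f"{key}:{value}" for key, value in fields.groupdict().items())
    return {
        "body": header["message"],          # clue body
        "attributes": attributes,           # associated attributes
        "stage": stage,                     # clue stage, taken from the template (assumption)
        "time": header["timestamp"],        # clue time, kept as received
        "severity": int(header["pri"]) % 8, # syslog severity from PRI
    }


clue = normalize(
    "<134>Oct 26 09:15:02 pc-17 usb-guard: user=alice action=copy file=plan.docx",
    USB_TEMPLATE,
    "C",
)
```

Lines that match no template are dropped here by returning `None`, which corresponds to the cleaning part of the step.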
S2: performing directed series connection and segmentation on the acquired log clues, so that all clues in the network within a determined time range form a finite directed graph. Specifically, each piece of clue data is taken as a vertex and the associated attributes as edges, and all clue data within the selected time range are connected in directed series to form a directed graph composed of the intranet clues and their associated attributes. The subsequent clue extraction and clue evaluation are both carried out on this directed graph.
The log clues are connected by edges through the associated attributes of the log clues in E, and the order in which two log clues occur determines the direction of the arrow on the edge connecting them.
S3: extracting information-stealing clue chains from the directed graph. All paths are traversed along the direction of the directed graph, and each path is queried and stored as a list, so that each path in the list is an information-stealing clue chain composed of log clues in time order; the paths together form the information-stealing clue set {L_i}, where each element L_i of the set {L_i} is a log clue chain;
[Formula image: BDA0001843496140000072]
T_i is a log clue chain in the set {L_i} that can be evaluated as an information-stealing clue.
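Path extraction can be sketched as a depth-first traversal. The text says that all paths are traversed; this sketch makes the narrower assumption that only maximal paths are kept, that is, paths running from a vertex with no incoming edge to a vertex with no outgoing edge. The function name `extract_chains` and the adjacency-dict layout are hypothetical.

```python
def extract_chains(graph):
    """Enumerate all source-to-sink paths of a directed acyclic graph.

    graph: dict mapping a clue id to the list of clue ids it points to.
    Returns a list of paths, each a list of clue ids in time order.
    """
    has_incoming = {node for successors in graph.values() for node in successors}
    nodes = set(graph) | has_incoming
    sources = sorted(nodes - has_incoming)  # clues that nothing points to
    chains = []

    def walk(node, path):
        successors = graph.get(node, [])
        if not successors:
            chains.append(path)  # reached a sink: the path is complete
            return
        for nxt in successors:
            walk(nxt, path + [nxt])

    for source in sources:
        walk(source, [source])
    return chains


chains = extract_chains({"A1": ["A2", "B1"], "A2": ["B1"], "B1": ["C1"]})
```

The number of paths can grow exponentially with graph size, and the recursion depth equals the longest chain, so a bounded or iterative traversal would be needed on large graphs.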
Because the edges of the directed graph formed in the network follow the time direction, the amount of data to be analyzed is reduced and substantial computing resources are saved. When analyzing very large data sets, this gives a clear efficiency advantage over traditional correlation analysis. The raw information-stealing clues captured by filtering along the time direction also conform better to the patterns of information-stealing activity, which improves how well they match real cases.
S4: establishing an information-stealing evaluation function to evaluate each clue chain.
This step is implemented as follows:
S41: establishing segmented evaluation functions FA(L_i), FB(L_i), FC(L_i), and FD(L_i) to evaluate the performance of the information-stealing clue T_i in each stage;
[Formula image: BDA0001843496140000081]
T_i is a log clue chain in the set {L_i} that can be evaluated as an information-stealing clue. FA(L_i) is the risk performance evaluation of the clue chain T_i in the intention stage A. The stage is scored on two aspects, the number of clue logs within the stage and the risk level of those logs: FA(L_i) = ValCnt(L_i) + ValLel(L_i), where ValCnt(L_i) is a piecewise function, as follows:
[Formula image: BDA0001843496140000082]
CntA(L_i) is the number of clues of the clue chain L_i in stage A.
ValLel(L_i) = 10*(ExLow(L_i) + ExMid(L_i) + ExHig(L_i)) + 15*(ExMid(L_i) + ExHig(L_i)) + 25*ExHig(L_i),
where ExLow(L_i), ExMid(L_i), and ExHig(L_i) indicate whether L_i contains a low-risk-level log, a medium-risk-level log, and a high-risk-level log, respectively; each takes a value in {0, 1}.
Similarly, FB(L_i), FC(L_i), and FD(L_i) are the risk performance evaluation values of the preparation stage B, the action stage C, and the cover-up stage D, respectively.
S42: setting the weights WA, WB, WC, and WD that the segmented evaluation functions carry in an information-stealing event; the weights of the segmented evaluation are obtained from analysis of historical data and may be adjusted according to the data in the application scenario;
S43: establishing a comprehensive information-stealing evaluation function P(L_i) in terms of the evaluation functions FA(L_i), FB(L_i), FC(L_i), FD(L_i) and the weights WA, WB, WC, WD, evaluating the extracted information-stealing clues, and calculating an evaluation value, wherein
[Formula image: BDA0001843496140000091]
and
[Formula image: BDA0001843496140000092]
is an adjustment function that evaluates how many stages the clue chain covers and applies a positive adjustment to clue chains that cover all stages;
[Formula image: BDA0001843496140000093]
CoverAll(L_i) indicates whether the clues in L_i cover all stages and takes a value in {0, 1}.
The adjustment function rewards log clue chains that show risk across the whole process, so that accuracy can be tuned to specific data and services.
The method further comprises the following step:
S5: performing service-related operations on the information-stealing clues, including sorting the clue chains by evaluation value, extracting the clue chains with higher evaluated risk values, and raising an alarm.
Evaluating the stealing clues in segments greatly improves the accuracy of the captured clues and effectively reduces both the number and the proportion of false alarms. Establishing the model and introducing the segmented evaluation method makes it possible to quantify, through logs, the indicators used to analyze information-stealing behavior, and provides quantitative indicators and algorithmic support for subsequent big data mining and artificial intelligence analysis.
Example two
As shown in FIG. 1, this embodiment provides a method for extracting information-stealing clues and evaluating them in segments based on a time-sequence directed graph, which comprises the following steps:
S1: acquiring log information and extracting clues from the log information during acquisition.
This step is implemented as follows:
S11: acquiring the massive log data generated by the various protection and supervision devices in the intranet;
S12: cleaning and labeling the log data during acquisition to form normalized clue data; the normalized clue data in this step includes at least four kinds of information: the clue body, the associated attributes, the clue stage, and the clue time;
S13: dividing the acquired log data into stages according to the process characteristics of stealing and leakage events; the attack chain model is adapted to the behavior patterns of intranet attacks, and the log data are divided in time order into: intention stage A, preparation stage B, action stage C, and cover-up stage D.
Each stage of a stealing and leakage event has different characteristics:
Intention stage: subjective awareness, change of attitude, change of temperament, and abnormal daily behavior;
Preparation stage: preparatory activities, social activities, information gathering, and break-in attempts;
Action stage: substantive actions, scanning and penetration, system intrusion, and tool deployment;
Cover-up stage: aftermath handling, trace erasing, tool removal, and data transfer.
It should be further explained that the clue extraction process receives logs according to the SYSLOG transmission standard, parses and normalizes the received logs by configuring a parsing template, and stores the normalized data to form the per-stage log clue sets, in which event alarms and logs serve as the clues:
[Formula image: BDA0001843496140000101]
where the event clue sets A, B, C, and D are the sets of log clues belonging to the four different stages, respectively; each clue in these sets should contain at least two associated attributes and one time attribute, and E represents all log clues in the network and user environment. The log clues mentioned here originate from the network devices, security devices, SOC platforms, application systems, and other operation and maintenance systems in the intranet.
S2: performing directed series connection and segmentation on the acquired log clues, so that all clues in the network within a determined time range form a finite directed graph. Specifically, each piece of clue data is taken as a vertex and the associated attributes as edges, and all clue data within the selected time range are connected in directed series to form a directed graph composed of the intranet clues and their associated attributes. The subsequent clue extraction and clue evaluation are both carried out on this directed graph.
The log clues are connected by edges through the associated attributes of the log clues in E, and the order in which two log clues occur determines the direction of the arrow on the edge connecting them.
As shown in FIG. 2, A_i, B_i, C_i, and D_i denote clue data in the four stages A, B, C, and D, respectively. A_1 is related to A_2 and B_1 through an associated attribute, and the arrow direction represents the time order of the time attributes in the clue data; A_3 is related to A_4 and B_1 through an associated attribute, and the arrow direction again represents the time order of the time attributes in the clue data. By analogy, the directed relationship graph shown in the figure is formed.
S3: extracting information-stealing clue chains from the directed graph. All paths are traversed along the direction of the directed graph, and each path is queried and stored as a list, so that each path in the list is an information-stealing clue chain composed of log clues in time order; the paths together form the information-stealing clue set {L_i}, where each element L_i of the set {L_i} is a log clue chain;
[Formula image: BDA0001843496140000113]
T_i is a log clue chain in the set {L_i} that can be evaluated as an information-stealing clue.
Because the edges of the directed graph formed in the network follow the time direction, the amount of data to be analyzed is reduced and substantial computing resources are saved. When analyzing very large data sets, this gives a clear efficiency advantage over traditional correlation analysis. The raw information-stealing clues captured by filtering along the time direction also conform better to the patterns of information-stealing activity, which improves how well they match real cases.
S4: establishing an information-stealing evaluation function to evaluate each clue chain.
This step is implemented as follows:
S41: establishing segmented evaluation functions FA(L_i), FB(L_i), FC(L_i), and FD(L_i) to evaluate the performance of the information-stealing clue T_i in each stage;
[Formula image: BDA0001843496140000111]
T_i is a log clue chain in the set {L_i} that can be evaluated as an information-stealing clue. FA(L_i) is the risk performance evaluation of the clue chain T_i in the intention stage A. The stage is scored on two aspects, the number of clue logs within the stage and the risk level of those logs: FA(L_i) = ValCnt(L_i) + ValLel(L_i), where ValCnt(L_i) is a piecewise function, as follows:
[Formula image: BDA0001843496140000112]
CntA(L_i) is the number of clues of the clue chain L_i in stage A.
ValLel(L_i) = 10*(ExLow(L_i) + ExMid(L_i) + ExHig(L_i)) + 15*(ExMid(L_i) + ExHig(L_i)) + 25*ExHig(L_i),
where ExLow(L_i), ExMid(L_i), and ExHig(L_i) indicate whether L_i contains a low-risk-level log, a medium-risk-level log, and a high-risk-level log, respectively; each takes a value in {0, 1}.
Similarly, FB(L_i), FC(L_i), and FD(L_i) are the risk performance evaluation values of the preparation stage B, the action stage C, and the cover-up stage D, respectively.
The evaluation value of each stage needs to consider the number of clues found in the stage, the hazard level of the clues found in the stage, and the evaluation value range is [ o,100 ];
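The risk-level term ValLel can be written directly from the formula above. The Python sketch below is illustrative only: the function names `val_lel` and `stage_score` are assumptions, and the count term ValCnt is passed in by the caller because its piecewise definition is given in a formula figure that is not reproduced in this text.

```python
def val_lel(ex_low, ex_mid, ex_hig):
    """Risk-level score: 10*(low+mid+hig) + 15*(mid+hig) + 25*hig."""
    for flag in (ex_low, ex_mid, ex_hig):
        if flag not in (0, 1):
            raise ValueError("each flag must be 0 or 1")
    return 10 * (ex_low + ex_mid + ex_hig) + 15 * (ex_mid + ex_hig) + 25 * ex_hig


def stage_score(val_cnt, ex_low, ex_mid, ex_hig):
    """Stage score F = ValCnt + ValLel.

    val_cnt is supplied by the caller (assumption of this sketch),
    since the piecewise ValCnt function is not reproduced here.
    """
    return val_cnt + val_lel(ex_low, ex_mid, ex_hig)


# Contribution of each combination of risk-level flags.
for flags in [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]:
    print(flags, val_lel(*flags))
```

Under this formula a stage containing only a low-risk log contributes 10, only a medium-risk log 25, only a high-risk log 50, and all three levels together 85.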
S42: setting the weights WA, WB, WC and WD that the segmented evaluation functions carry in an information-theft event; the weights are derived from analysis of historical data and may be adjusted according to the data of the application scenario;
In this embodiment, analysis of historical data and cases in the intranet shows that the influence weights of clues appearing in the four stages on the risk of an information-theft case are in the ratio 1:3:4:12; accordingly WA is set to 5%, WB to 15%, WC to 20% and WD to 60%. In actual implementation, these weights may be adjusted according to the data of the application scenario;
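The percentages follow from normalizing the ratio 1:3:4:12 by its sum, 20. A short Python check is given below as an illustration; the helper name `weights_from_ratio` is hypothetical.

```python
from fractions import Fraction


def weights_from_ratio(ratio):
    """Normalize integer ratio parts so that the weights sum to 1."""
    total = sum(ratio.values())
    return {stage: Fraction(part, total) for stage, part in ratio.items()}


weights = weights_from_ratio({"A": 1, "B": 3, "C": 4, "D": 12})
percent = {stage: float(w * 100) for stage, w in weights.items()}
print(percent)
```

Using exact fractions avoids rounding drift when a different ratio is substituted for another application scenario.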
S43: establishing a comprehensive information-theft evaluation function P(L_i) in terms of the evaluation functions FA(L_i), FB(L_i), FC(L_i), FD(L_i) and the weights WA, WB, WC and WD, evaluating the extracted information-theft clues and calculating their evaluation values, wherein
Figure BDA0001843496140000121
wherein
Figure BDA0001843496140000122
is an adjustment function that evaluates how many stages the clue chain covers and applies a positive adjustment to clue chains that cover all stages;
Figure BDA0001843496140000123
CoverAll(L_i) indicates whether the clues in L_i cover all stages, taking a value in {0, 1}.
The adjustment function rewards log clues that match the risk pattern of the whole process, so that accuracy can be tuned to the specific data and services.
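The exact forms of P(L_i) and of the adjustment function are given in formula figures that are not reproduced in this text. The Python sketch below therefore illustrates only one plausible reading, a weighted sum of the four stage scores, which is an assumption and not the patented formula; the adjustment term is deliberately left out. The names `weighted_score` and `cover_all` are hypothetical.

```python
STAGES = ("A", "B", "C", "D")

# Weights from the embodiment: WA = 5%, WB = 15%, WC = 20%, WD = 60%.
DEFAULT_WEIGHTS = {"A": 0.05, "B": 0.15, "C": 0.20, "D": 0.60}


def weighted_score(stage_scores, weights=DEFAULT_WEIGHTS):
    """ASSUMED form: sum of weight * stage score over the four stages.

    stage_scores maps a stage letter to a score in [0, 100];
    a missing stage counts as 0. The adjustment term is omitted.
    """
    return sum(weights[s] * stage_scores.get(s, 0.0) for s in STAGES)


def cover_all(stage_counts):
    """Return 1 if every stage has at least one clue, otherwise 0."""
    return int(all(stage_counts.get(s, 0) > 0 for s in STAGES))


# A chain with maximal scores in stages A, B and C but no stage-D clue.
no_cover_stage = weighted_score({"A": 100, "B": 100, "C": 100})
print(no_cover_stage, cover_all({"A": 2, "B": 1, "C": 3}))
```

Under this assumed weighted sum, a chain without any stage-D clue cannot exceed about 40 points, so high scores are reachable only when covering-stage clues are present.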
The method further comprises the following steps:
S5: performing service-related operations on the information-theft clues, including sorting by evaluation value and extracting and raising alarms for clue chains with higher evaluated risk;
For example, a risk evaluation value higher than 60 indicates that at least one clue of the covering stage D has been found. A clue at this stage indicates that audit logs or operation traces have been deleted, so the corresponding event and the personnel involved should be investigated further; clue chains with high evaluation values therefore generate an alarm to prompt service personnel to check and handle the event.
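The sorting and alarm step can be sketched as below. This is an illustrative Python fragment under stated assumptions: the function name `rank_and_alert`, the `(chain_id, score)` tuple format and the sample scores are hypothetical, and the threshold of 60 is taken from the example in the text.

```python
def rank_and_alert(scored_chains, threshold=60.0):
    """Sort clue chains by evaluation value and pick those to alarm on.

    scored_chains is a list of (chain_id, score) tuples.
    Returns (ranked, alerts), both in descending score order.
    """
    ranked = sorted(scored_chains, key=lambda item: item[1], reverse=True)
    alerts = [item for item in ranked if item[1] > threshold]
    return ranked, alerts


# Hypothetical evaluation values for three clue chains.
ranked, alerts = rank_and_alert([("L1", 35.0), ("L2", 82.5), ("L3", 61.0)])
for chain_id, score in alerts:
    print(f"ALERT {chain_id}: evaluation value {score}")
```

The threshold is kept as a parameter because the text allows the weights, and therefore the score scale, to be adjusted per application scenario.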
Segmented evaluation of the information-theft clues greatly improves the accuracy of the captured clues and effectively reduces both the number and the proportion of false alarms. By establishing the model and introducing the segmented evaluation method, information-theft behavior can be quantified from logs, which provides quantitative indicators and algorithmic support for subsequent big data mining and artificial intelligence analysis.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprising" and "having," as well as any variations thereof, are intended to cover non-exclusive inclusions.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (6)

1. A time-sequence directed graph-based information-theft clue extraction and segmented evaluation method, characterized by comprising the following steps:
acquiring log information and extracting clues from the log information during acquisition;
performing directed series segmentation on the acquired log clues, so that all clues in the network within a determined time range form a finite directed graph;
extracting information-theft clue chains from the directed graph;
establishing an information-theft evaluation function to evaluate each clue chain;
wherein the step of acquiring log information and extracting clues from the log information during acquisition specifically comprises: acquiring the mass log data generated by the various protection and supervision devices in an intranet; cleaning and labeling the log data during acquisition to form normalized clue data; and dividing the acquired log data into stages according to the process characteristics of theft and leakage events, wherein the attack chain model is optimized according to the behavior habits of intranet attacks and the log data is divided into an intention stage, a preparation stage, an action stage and a covering stage;
in the step of cleaning and labeling the log data during acquisition to form normalized clue data, the normalized clue data includes at least four items of information: clue subject, associated attributes, clue stage and clue time; the specific steps are: receiving logs according to the SYSLOG log transmission standard; parsing and normalizing the received logs by configuring a parsing template; and storing the normalized data to form stage-specific log clue sets in which event alarms and logs serve as clues
Figure FDA0002960916750000011
wherein the event clue sets A, B, C and D denote the sets of log clues belonging to the four different stages respectively, and E denotes all log clues in the network and user environment;
the step of extracting information-theft clue chains from the directed graph specifically comprises: each path is an information-theft clue composed of log clues in chronological order, and the paths are combined into a set of information-theft clues {L_i}, where each element L_i of the set is a log clue chain;
the step of establishing an information-theft evaluation function to evaluate each clue chain specifically comprises:
establishing a comprehensive information-theft evaluation function P(L_i) in terms of the evaluation functions FA(L_i), FB(L_i), FC(L_i), FD(L_i) and the weights WA, WB, WC and WD, evaluating the extracted information-theft clues and calculating an evaluation value.
2. The method as claimed in claim 1, wherein in the step of storing the normalized data to form log clue sets in which event alarms and logs serve as clues, each clue in the sets comprises at least two associated attributes and one time attribute.
3. The method as claimed in claim 2, wherein the step of performing directed series segmentation on the acquired log clues, so that all clues in the network within a determined time range form a finite directed graph, specifically comprises:
taking each item of clue data as a vertex and the associated attributes as edges, and connecting all clue data within the selected time in directed series to form a directed graph composed of intranet clues and associated attributes.
4. The method according to claim 3, wherein in the step of taking each item of clue data as a vertex and the associated attributes as edges and connecting all clue data within the selected time in directed series to form a directed graph composed of intranet clues and associated attributes, the chronological order in which the log clues occur determines the direction of the arrow on the edge connecting two log clues.
5. The time-sequence directed graph-based information-theft clue extraction and segmented evaluation method as claimed in claim 4, wherein the step of establishing an information-theft evaluation function to evaluate each clue chain comprises:
establishing segmented evaluation functions FA(L_i), FB(L_i), FC(L_i), FD(L_i) to evaluate the per-stage performance of the information-theft clue T_i; wherein
FA(L_i) = ValCnt(L_i) + ValLel(L_i), where ValCnt(L_i) is a piecewise function,
Figure FDA0002960916750000021
CntA(L_i) is the number of clues of the clue chain L_i in stage A;
ValLel(L_i) = 10 * (ExLow(L_i) + ExMid(L_i) + ExHig(L_i)) + 15 * (ExMid(L_i) + ExHig(L_i)) + 25 * ExHig(L_i),
where ExLow(L_i), ExMid(L_i) and ExHig(L_i) indicate whether L_i contains a low-risk-level log, a medium-risk-level log and a high-risk-level log respectively, each taking a value in {0, 1};
setting the weights WA, WB, WC and WD that the segmented evaluation functions carry in an information-theft event, wherein the weights are derived from analysis of historical data;
establishing a comprehensive information-theft evaluation function P(L_i) in terms of the evaluation functions FA(L_i), FB(L_i), FC(L_i), FD(L_i) and the weights WA, WB, WC and WD, evaluating the extracted information-theft clues and calculating evaluation values, wherein
Figure FDA0002960916750000031
Figure FDA0002960916750000032
CoverAll(L_i) indicates whether the clues in L_i cover all stages, taking a value in {0, 1}.
6. The time-sequence directed graph-based information-theft clue extraction and segmented evaluation method as claimed in claim 1, wherein the method further comprises:
performing service operations on the information-theft clues, including sorting by evaluation value and extracting and raising alarms for clue chains with higher evaluated risk.
CN201811259183.3A 2018-10-26 2018-10-26 Time sequence directed graph-based stolen information clue extraction and segmented evaluation method Active CN109284317B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811259183.3A CN109284317B (en) 2018-10-26 2018-10-26 Time sequence directed graph-based stolen information clue extraction and segmented evaluation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811259183.3A CN109284317B (en) 2018-10-26 2018-10-26 Time sequence directed graph-based stolen information clue extraction and segmented evaluation method

Publications (2)

Publication Number Publication Date
CN109284317A CN109284317A (en) 2019-01-29
CN109284317B true CN109284317B (en) 2021-07-06

Family

ID=65177434

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811259183.3A Active CN109284317B (en) 2018-10-26 2018-10-26 Time sequence directed graph-based stolen information clue extraction and segmented evaluation method

Country Status (1)

Country Link
CN (1) CN109284317B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111147300B (en) * 2019-12-26 2022-04-29 绿盟科技集团股份有限公司 Network security alarm confidence evaluation method and device
CN111538741B (en) * 2020-03-23 2021-04-02 重庆特斯联智慧科技股份有限公司 Deep learning analysis method and system for big data of alarm condition

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104125217A (en) * 2014-06-30 2014-10-29 复旦大学 Cloud data center real-time risk assessment method based on mainframe log analysis
CN107483425A (en) * 2017-08-08 2017-12-15 北京盛华安信息技术有限公司 Composite attack detection method based on attack chain
CN107888607A (en) * 2017-11-28 2018-04-06 新华三技术有限公司 A kind of Cyberthreat detection method, device and network management device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090113549A1 (en) * 2007-10-24 2009-04-30 International Business Machines Corporation System and method to analyze software systems against tampering
CN105959328B (en) * 2016-07-15 2019-03-12 北京工业大学 The network forensics method and system that evidence figure is combined with loophole reasoning
CN106685921B (en) * 2016-11-14 2019-06-21 中国人民解放军信息工程大学 Network equipment methods of risk assessment
CN107370755B (en) * 2017-08-23 2020-03-03 杭州安恒信息技术股份有限公司 Method for multi-dimensional deep detection of APT (active Power test) attack

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于扩展有向图的复合攻击模型及检测方法研究;张爱芳;《中国博士学位论文全文数据库信息科技辑》;20091215;参见论文第三、四、六章 *

Also Published As

Publication number Publication date
CN109284317A (en) 2019-01-29

Similar Documents

Publication Publication Date Title
Ektefa et al. Intrusion detection using data mining techniques
CN103368979B (en) Network security verifying device based on improved K-means algorithm
CN106209817B (en) Information network security based on big data and trust computing is from system of defense
CN106375339B (en) Attack mode detection method based on event sliding window
CN112114995B (en) Terminal abnormality analysis method, device, equipment and storage medium based on process
CN111355697B (en) Detection method, device, equipment and storage medium for botnet domain name family
CN105009132A (en) Event correlation based on confidence factor
EP2936772B1 (en) Network security management
KR101692982B1 (en) Automatic access control system of detecting threat using log analysis and automatic feature learning
CN112039862A (en) Multi-dimensional stereo network-oriented security event early warning method
CN109284317B (en) Time sequence directed graph-based stolen information clue extraction and segmented evaluation method
CN106100885A (en) Network security alarm system and design scheme
CN105959162A (en) Distributed electric power enterprise information network safety management system
CN110519231A (en) A kind of cross-domain data exchange supervisory systems and method
Milan et al. Reducing false alarms in intrusion detection systems–a survey
Ebrahimi et al. Automatic attack scenario discovering based on a new alert correlation method
Aung et al. Association rule pattern mining approaches network anomaly detection
Mohamed et al. Alert correlation using a novel clustering approach
CN117033501A (en) Big data acquisition and analysis system
Othman et al. Improving signature detection classification model using features selection based on customized features
Lu et al. One intrusion detection method based on uniformed conditional dynamic mutual information
Kosamkar et al. Data Mining Algorithms for Intrusion Detection System: An Overview
CN116232695A (en) Network security operation and maintenance association analysis system
CN107623677B (en) Method and device for determining data security
Junjing Research on the application of cluster analysis in criminal community detection

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
CB02 Change of applicant information

Address after: Room 2530, building 2, Aosheng building, 1166 Xinluo street, high tech Zone, Jinan, Shandong 250100

Applicant after: Zhongfu Safety Technology Co.,Ltd.

Applicant after: ZHONGFU INFORMATION Co.,Ltd.

Applicant after: BEIJING ZHONGFU TAIHE TECHNOLOGY DEVELOPMENT Co.,Ltd.

Address before: Room 2530, building 2, Aosheng building, 1166 Xinluo street, high tech Zone, Jinan, Shandong 250100

Applicant before: SHANDONG ZHONGFU SAFETY TECHNOLOGY Co.,Ltd.

Applicant before: ZHONGFU INFORMATION Co.,Ltd.

Applicant before: BEIJING ZHONGFU TAIHE TECHNOLOGY DEVELOPMENT Co.,Ltd.

GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20211015

Address after: Room 2530, building 2, Aosheng building, 1166 Xinluo street, high tech Zone, Jinan, Shandong 250100

Patentee after: Zhongfu Safety Technology Co.,Ltd.

Address before: Room 2530, building 2, Aosheng building, 1166 Xinluo street, high tech Zone, Jinan, Shandong 250100

Patentee before: Zhongfu Safety Technology Co.,Ltd.

Patentee before: ZHONGFU INFORMATION Co.,Ltd.

Patentee before: BEIJING ZHONGFU TAIHE TECHNOLOGY DEVELOPMENT Co.,Ltd.