CN107172022B - APT threat detection method and system based on intrusion path - Google Patents

Publication number
CN107172022B
Authority
CN
China
Prior art keywords
data
text
evidence
behavior
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710303758.6A
Other languages
Chinese (zh)
Other versions
CN107172022A (en
Inventor
彭光辉
屈立笳
陶磊
苏礼刚
林伟
黄丽洪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Goldtel Industry Group Co ltd
Original Assignee
Chengdu Goldtel Industry Group Co ltd
Application filed by Chengdu Goldtel Industry Group Co ltd
Priority to CN201710303758.6A
Publication of CN107172022A
Publication of CN107172022B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2463/00 Additional details relating to network architectures or network communication protocols for network security covered by H04L63/00
    • H04L2463/146 Tracing the source of attacks

Abstract

The invention relates to an APT threat detection method and system based on an intrusion path. The method comprises the following steps: s1: modeling a knowledge base for the intrusion path domain; s2: collecting behavior data, including host behavior data and network behavior data; s3: performing correlation analysis on the collected behavior data; s4: preserving evidence, in which evidence of attack-risk behavior is restored and preserved; s5: presenting the evidence. The system consists of an evidence presentation module, a behavior evidence correlation analysis module, a knowledge base module, an evidence preservation module and an evidence collection module. The beneficial effects of the invention are that intrusion by the initiator of an APT attack is cut off at the source, a low-cost, high-efficiency deployment targeting the intrusion path is achieved, the collection process is concealed and fully transparent and places no burden on the network, and evidence presentation is easy to use and simple to operate.

Description

APT threat detection method and system based on intrusion path
Technical Field
The invention relates to the technical field of APT threat detection, in particular to an APT threat detection method and system based on an intrusion path.
Background
Finance and government are the main target industries of APT attacks, at up to 84% and 77% respectively, followed by telecommunications at 66%, the military at 64%, industrial enterprises at 54%, and others at 14%. Email and social networking sites are the main channels through which hackers launch APT attacks: email is used in up to 68% of attacks and social networking sites in up to 65%. Email and social networking sites have even overtaken traditional hacking methods such as viruses, malicious links and phishing sites.
This trend shows that in recent years, as social networks have become popular, the traditional security measures of enterprises have been unable to control them effectively, and email has always been a hard-hit area of enterprise security. Beyond the lack of effective security management policies, employee security awareness is particularly important here. Email and social networking sites are operated by people, and attackers see an opportunity in this: they use the email and social networking accounts of employees with weak security awareness as the entry point, and then intrude step by step into the enterprise's servers and network.
The main reason APT attacks are hard to defend against is that their distinctive attack patterns and techniques are difficult to detect. APT attacks are a major information security concern, and defending against this threat must be incorporated into a broader monitoring and prevention strategy and integrated with existing network defenses. Users should therefore pay more attention to how to strengthen their defenses against APT attacks and advanced threats, so that attacks do not damage the network or leak sensitive information, and so that the security products and technologies they have already invested in are put to fuller use.
According to a survey of the mainstream techniques currently used at home and abroad to defend against APT attacks, 77% of users consider anomaly detection the most effective scheme. In addition, 69% of users chose the sandbox scheme, 66% the full-traffic audit scheme, and 55% detection based on unknown malicious code.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide an APT threat detection method and system based on an intrusion path, so that the intrusion of an APT attack initiator is cut off from the source, and the construction target with low cost and high efficiency is achieved.
The purpose of the invention is realized by the following technical scheme: an APT threat detection method based on an intrusion path comprises the following steps:
s1: modeling a knowledge base of the intrusion path field;
s2: collecting behavior data, including host behavior data and network behavior data, wherein collecting host behavior data includes collecting incoming/outgoing line information records, port information records, disk data operation records, system registry change records, terminal system basic information update records, peripheral device connection and data transmission records, and third-party application information records; and network behavior data is collected by first classifying and marking network behaviors, then restoring the network behavior data, recording and tracking the operation and maintenance service information and external connection information of the network system, and finally storing the data locally;
s3: performing correlation analysis on the collected behavior data, by first reconstructing the attack process and then correlating the behavior data originating from the host and from the network according to set logic conditions;
s4: preserving evidence, in which evidence of the attack-risk behavior is restored and preserved to form complete evidence for the incident;
s5: presenting the evidence.
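Step s3 can be illustrated with a minimal Python sketch that pairs network events with host events. This is only a sketch under stated assumptions: the `Event` fields, the 60-second window and the example condition `mail_then_registry` are hypothetical and not taken from the patent, which states only that host and network behavior data are correlated according to set logic conditions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Event:
    """One collected behavior record (field names are illustrative assumptions)."""
    source: str   # "host" or "network"
    host: str     # address of the terminal the record belongs to
    ts: float     # timestamp in seconds
    kind: str     # e.g. "mail_attachment", "registry_change"


def correlate(
    host_events: List[Event],
    net_events: List[Event],
    condition: Callable[[Event, Event], bool],
    window_s: float = 60.0,
) -> List[Tuple[Event, Event]]:
    """Pair each network event with later host events on the same host."""
    pairs = []
    for n in net_events:
        for h in host_events:
            same_host = h.host == n.host
            in_window = 0 <= h.ts - n.ts <= window_s
            if same_host and in_window and condition(n, h):
                pairs.append((n, h))
    # Sort by time so the pairs read as a reconstructed sequence of events.
    return sorted(pairs, key=lambda p: (p[0].ts, p[1].ts))


# Hypothetical logic condition: a mail attachment followed by a registry change.
def mail_then_registry(n: Event, h: Event) -> bool:
    return n.kind == "mail_attachment" and h.kind == "registry_change"


net = [Event("network", "10.0.0.5", 100.0, "mail_attachment")]
host = [
    Event("host", "10.0.0.5", 130.0, "registry_change"),
    Event("host", "10.0.0.9", 131.0, "registry_change"),
]
chain = correlate(host, net, mail_then_registry)
```

Here only the registry change on 10.0.0.5 is paired with the mail event, because the other host event belongs to a different terminal. A real deployment would likely index events by host and time instead of using the nested loop.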
The APT intrusion paths are email and social networking sites.
The knowledge base modeling adopts a multi-dimensional heterogeneous data source integration model, which integrates multiple heterogeneous data sources and can dynamically invoke a suitable data mining algorithm, improving analysis efficiency. The main design ideas are:
a. a unified knowledge representation: data on the Internet takes the form of structured, semi-structured and unstructured data. Structured data accounts for only 10%; the remaining 90% is semi-structured and unstructured data. XML is used as the basic storage form of the data, covering the expression of data formats, knowledge models and semantic metadata, so that multiple heterogeneous data sources can be integrated across the Internet and the Intranet on a collaborative platform;
b. protocol conversion: data collected by the system's own collection devices is converted in real time, its protocol type is identified against the protocol type library, and it is stored in the database. Data collected by other monitoring devices (such as intrusion detection, firewall and content audit devices) is converted and aggregated in real time or non-real time;
c. dynamic loading algorithm: each rule can be dynamically associated with multiple specific analysis objects. The dynamic loading algorithm periodically fetches data from the data source according to the rule's data extraction time and stores it in the case database; data that already exists is not extracted again.
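The dynamic loading idea in item c can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Rule` fields, the `fetch` callback and the use of a dictionary as the case database are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass
class Rule:
    """A rule bound to several analysis objects (names are hypothetical)."""
    name: str
    interval_s: float               # data extraction period of the rule
    targets: List[str]              # analysis objects associated with the rule
    last_run: float = float("-inf")


def load_due_rules(
    rules: List[Rule],
    fetch: Callable[[str], Iterable[Tuple[str, str]]],
    case_db: Dict[str, str],
    now: float,
) -> int:
    """Run every rule whose extraction time has come; return records added."""
    added = 0
    for rule in rules:
        if now - rule.last_run < rule.interval_s:
            continue  # not yet time to extract for this rule
        for target in rule.targets:
            for record_id, payload in fetch(target):
                if record_id in case_db:
                    continue  # already present: do not extract again
                case_db[record_id] = payload
                added += 1
        rule.last_run = now
    return added


# Usage with an in-memory stand-in for the data source.
source = {"mail": [("m1", "subject: invoice"), ("m2", "subject: hello")]}
case_db: Dict[str, str] = {}
rules = [Rule("mail-keywords", interval_s=300.0, targets=["mail"])]
first = load_due_rules(rules, lambda t: source[t], case_db, now=0.0)
```

Calling `load_due_rules` again before 300 seconds have passed does nothing, and a later call stores only records whose identifiers are not yet in `case_db`.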
The knowledge base comprises a user information record table, a mail feature library, a social platform feature library, an intranet data flow feature library and a user behavior feature library.
Network behavior collection uses a data distribution technology based on CIP and SIP, which supports fast interception, distribution and restoration of big data; keyword matching uses an improved AC-BM algorithm to improve search efficiency; an efficient load balancing algorithm balances the load of a large-scale network center; and a node detection mode is used for data exchange and communication between different hosts, improving the overall throughput of the system.
The improved AC-BM algorithm is the BMH2 algorithm. Let the character set be Σ and take the pattern "string search" as an example. The set of characters in Σ that occur exactly once in the pattern is A1 = {'t', 'i', 'n', 'g', ' ', 'e', 'a', 'c', 'h'} (including the space character), the set of characters that occur two or more times is A2 = {'s', 'r'}, and the set of characters that occur 0 times is A0 = Σ - A1 - A2. With text[k] as the heuristic character, the BMH algorithm in effect aligns the last occurrence of text[k] in the pattern with text[k] in the text and then restarts matching. Thus, when text[k] ∈ A0, the pointer scanning the text can move forward by the maximum distance m (the pattern length). The basic idea here is to let the text pointer move forward by the maximum distance m with higher probability. If it can be determined that no new round of matching needs to start after the last occurrence of text[k] in the pattern is aligned with text[k] in the text, then the text pointer can move forward by the maximum distance m when text[k] ∈ (A0 ∪ A1), and the moving distance of the text pointer can be increased when text[k] ∈ A2. For example, the two occurrences of 's' and of 'r' in the pattern string are 7 and 8 positions apart respectively, so the moving distance of the text pointer increases by 7 and 8 respectively.
To this end, a newSkip array is added. If the character ch occurs 0 times or once in the pattern string pattern, newSkip[ch] = m; if ch occurs two or more times in pattern and f denotes the position of its second-to-last occurrence in pattern (subscripts start from 0), then newSkip[ch] = m - f - 1. In addition, a preChar array is defined: if the last occurrence of ch in pattern is pattern[e], then preChar[ch] = pattern[e-1]; if ch does not occur in pattern, then preChar[ch] = 1. When pattern[0] occurs only once in the pattern string, newSkip[pattern[0]] is separately assigned m - 1, because pattern[0] has no preceding character. The newSkip and preChar arrays have the same length as the skip array, namely the number of elements in the character set; for ASCII, the length is 256.
When a round of matching compares text[k-m+1 … k] with pattern[0 … m-1], the characters text[k] … text[k-m+1] are checked in order from right to left. If no match is found, text[k-1] is compared with preChar[text[k]]. When text[k-1] != preChar[text[k]], the text pointer is reassigned to k + newSkip[text[k]]; otherwise it is reassigned to k + skip[text[k]]. In fact, preChar[text[k]] can be initialized to any value when text[k] does not occur in the pattern string: since skip[text[k]] and newSkip[text[k]] are both m in that case, the text pointer is reassigned to k + m regardless of whether text[k-1] and preChar[text[k]] are equal.
The BMH2 algorithm achieves higher matching efficiency by increasing the average moving distance of the pattern string. When the pattern string has no same characters or the distance between the same characters is larger, the BMH2 algorithm can achieve better matching efficiency.
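The table construction and shift rule described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the patent's own code: dictionaries stand in for the fixed-size arrays, occurrences are counted over pattern[0 … m-2] as in the standard Horspool skip table (which gives the same shifts as the worked "string search" example), the same shift rule is applied after a full match so that all occurrences are reported, and the names `build_tables` and `bmh2_search` are introduced here.

```python
from typing import Dict, List, Optional, Tuple


def build_tables(pattern: str) -> Tuple[Dict[str, int], Dict[str, int], Dict[str, Optional[str]]]:
    """Build skip, newSkip and preChar for characters in pattern[0 .. m-2]."""
    m = len(pattern)
    skip: Dict[str, int] = {}
    new_skip: Dict[str, int] = {}
    pre_char: Dict[str, Optional[str]] = {}
    last: Dict[str, int] = {}
    for i, ch in enumerate(pattern[:-1]):
        # The previous last occurrence becomes the second-to-last one (f).
        new_skip[ch] = m - last[ch] - 1 if ch in last else m
        last[ch] = i
        skip[ch] = m - 1 - i
        pre_char[ch] = pattern[i - 1] if i > 0 else None
    # pattern[0] has no preceding character, so its shift stays at m - 1.
    if m > 1 and last[pattern[0]] == 0:
        new_skip[pattern[0]] = m - 1
    return skip, new_skip, pre_char


def bmh2_search(text: str, pattern: str) -> List[int]:
    """Return the start index of every occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    skip, new_skip, pre_char = build_tables(pattern)
    hits = []
    k = m - 1  # text index aligned with the last pattern character
    while k < n:
        j = m - 1
        while j >= 0 and text[k - m + 1 + j] == pattern[j]:
            j -= 1  # compare from right to left
        if j < 0:
            hits.append(k - m + 1)
        ch = text[k]
        if ch not in skip:
            k += m  # character absent from pattern[0 .. m-2]
        elif text[k - 1] != pre_char[ch]:
            k += new_skip[ch]  # aligning the last occurrence cannot match
        else:
            k += skip[ch]  # ordinary BMH shift
    return hits


positions = bmh2_search("a string search for string searching", "string search")
```

For the pattern "string search", `build_tables` yields skip['s'] = 5 and newSkip['s'] = 12, and skip['r'] = 2 and newSkip['r'] = 10, which matches the increases of 7 and 8 stated in the text.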
The objects for collecting the network behaviors comprise mail data collection, social platform application data collection, intranet transport layer data stream collection, database protocol data collection and remote control protocol data collection.
The threat detection method adopts a protocol analysis and transmission flow analysis technology to analyze and research the transmission flow and an application layer protocol in an important service system, analyzes the conversation process and the conversation characteristics of the protocols, grasps the user behavior, and realizes evidence preservation of external connection behavior of an information system and detection of abnormal data.
Evidence is presented in two forms, chart and list. Chart queries use a drill-down mode, going layer by layer from the whole to the details. The list supports combined queries: all behavior logs and manual judgment logs can be queried by combining multiple conditions, including behavior pattern, behavior object and specific IP. This is easy to use and simple to operate, and helps users grasp the overall behavior situation and trace APT attack events.
The association analysis adopts a case-based data management and knowledge discovery model CDMKDM. The model can realize sorting, merging and filing of a large amount of original data collected from a network, extract interesting knowledge and information in the original data, establish implicit association relation on related information according to the service requirement of actual work, provide an intuitive knowledge representation mode and assist users to make decisions by fully utilizing network data.
The detection method further comprises topic-oriented information classification, used in the data collection part and the background data mining part. Event classification classifies data under given conditions; using classification techniques, alarm events can be classified automatically, so that abnormal events are confirmed.
The detection method further comprises event clustering: various kinds of evidence are analyzed and clustered to achieve dynamic awareness of security events. Event clustering divides data into different classes according to its characteristics, without supervision. The aim is to keep the distance between individuals of the same class as small as possible and the distance between individuals of different classes as large as possible. In a forensics system based on active defense, clustering can be used to analyze alarm events that occur in clusters and discover abnormal patterns, thereby generating early warning information.
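The clustering objective described here (small distances within a class, large distances between classes) can be illustrated with a minimal k-means sketch. The patent does not name a clustering algorithm, so k-means is an assumption chosen for illustration, as are the two-dimensional feature vectors and the choice of the first k points as initial centroids.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def nearest(p: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid with the smallest squared distance to p."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
    )


def kmeans(points: List[Point], k: int, iters: int = 20) -> Tuple[List[Point], List[int]]:
    """Plain k-means; the first k points serve as initial centroids."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                # New centroid is the per-dimension mean of the members.
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    return centroids, labels


# Hypothetical alarm features: (hour of day, alarms per minute).
alarms = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids, labels = kmeans(alarms, k=2)
```

In this toy input the two low-valued alarms end up in one class and the two high-valued alarms in the other.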
An APT threat detection system based on an intrusion path comprises an evidence presentation module, a behavior evidence correlation analysis module, a knowledge base module, an evidence preservation module and an evidence collection module, wherein the behavior evidence correlation analysis module, the knowledge base module and the evidence preservation module are each connected to the evidence presentation module; the behavior evidence correlation analysis module is connected to the knowledge base module, and the knowledge base module is connected to the evidence preservation module; and the knowledge base module and the evidence preservation module are each connected to the evidence collection module.
The evidence presentation module comprises a behavior evidence presentation module, an evidence auxiliary analysis module and a behavior main body responsibility confirmation module, the evidence presentation module is connected with the evidence auxiliary analysis module, and the evidence auxiliary analysis module is connected with the behavior main body responsibility confirmation module;
the behavior evidence correlation analysis module comprises a host operation behavior module, a network communication behavior module, a business behavior module, a remote service behavior module and a correlation module, wherein the host operation behavior module, the network communication behavior module, the business behavior module and the remote service behavior module are respectively connected with the correlation module;
the knowledge base module comprises a user information record base, a user safety requirement base, a threat model base, a behavior risk assessment standard base, a regulation and regulation base and an evidence collection strategy base, wherein the first end of the threat model base and the first end of the behavior risk assessment standard base are respectively connected with the user information record base and the user safety requirement base, the second end of the threat model base is connected with the first end of the regulation and regulation base, the second end of the behavior risk assessment standard base is connected with the first end of the regulation and regulation base, and the second end of the regulation and regulation base is connected with the evidence collection strategy base;
the evidence preservation module comprises an evidence module, a regulation processing module, an original data module, a technical processing module and a data storage module, wherein the data storage module is connected with a first end of the original data module, the first end of the technical processing module and the first end of the regulation processing module are respectively connected with a second end of the original data module, and the second end of the regulation processing module is connected with the evidence module;
the evidence collection module comprises a host behavior collection module, a network behavior collection module, various servers, hosts and equipment, wherein the host behavior collection module and the network behavior collection module are respectively connected with the various servers, the hosts and the equipment.
The evidence presentation module, behavior evidence correlation analysis module, knowledge base module, evidence preservation module and evidence collection module of the system are connected using Intranet technology.
Data transmission among an evidence presentation module, a behavior evidence correlation analysis module, a knowledge base module, an evidence preservation module and an evidence collection module of the system is carried out in an encryption mode and comprises user authentication and authority management. The evidence collection module collects data of each collection area, and the collection areas can be network management centers of a basic information network and an important information system.
The behavior evidence correlation analysis module performs correlation analysis on the data from the front-end data collection module and classifies the evidence data according to its content.
The evidence preservation module generates a network attack and breach event data record.
The evidence presentation module consists mainly of various query/management terminals. It generates reports and analysis reports according to the needs of the user, queries the contents of the data warehouse through a friendly interface, supports session replay, and manages and maintains the platforms, for example backup and deletion.
When the system runs, the evidence presentation module, the behavior evidence correlation analysis module, the knowledge base module, the evidence preservation module and the evidence collection module are connected dynamically and at high speed. On the one hand, devices in the evidence area obtain rules from the platform's rule base through a collector, dynamically store the collected data on the platform and raise alarms; on the other hand, the user authentication mechanism receives query/management requests from the devices of the user analysis platform and provides data analysis or rule modification services.
The invention has the beneficial effects that:
1) analyzing and mining personal internet access behavior data in an attacked organization mechanism, identifying possible spearphishing attacks and disguised attacks, and intercepting the invasion of an APT attack initiator from the source;
2) before the APT attack is initiated, the invasion is prevented in advance, and the construction target with low cost and high efficiency is achieved;
3) the acquisition process is concealed and completely transparent, so that the network is not burdened, and the operation of other network equipment is not influenced;
4) the evidence presentation is easy to use, the operation is simple, and a user can conveniently master the overall behavior situation and trace the APT attack event.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a block diagram of the system of the present invention;
FIG. 3 is a business flow diagram of the present invention;
FIG. 4 is a system architecture diagram of the present invention.
Detailed Description
The technical solution of the present invention is further described in detail with reference to the following specific examples, but the scope of the present invention is not limited to the following.
Example 1
As shown in FIG. 1, an APT threat detection method based on an intrusion path includes:
s1: modeling a knowledge base of the intrusion path field;
s2: collecting behavior data, including host behavior data and network behavior data, wherein collecting host behavior data includes collecting incoming/outgoing line information records, port information records, disk data operation records, system registry change records, terminal system basic information update records, peripheral device connection and data transmission records, and third-party application information records; and network behavior data is collected by first classifying and marking network behaviors, then restoring the network behavior data, recording and tracking the operation and maintenance service information and external connection information of the network system, and finally storing the data locally;
s3: performing correlation analysis on the collected behavior data, by first reconstructing the attack process and then correlating the behavior data originating from the host and from the network according to set logic conditions;
s4: preserving evidence, in which evidence of the attack-risk behavior is restored and preserved to form complete evidence for the incident;
s5: presenting the evidence.
The APT intrusion paths are email and social networking sites.
The knowledge base modeling adopts a multi-dimensional heterogeneous data source integration model. It integrates multiple heterogeneous data sources and can dynamically invoke a suitable data mining algorithm, improving analysis efficiency. The main design ideas are:
a. a unified knowledge representation: data on the Internet takes the form of structured, semi-structured and unstructured data. Structured data accounts for only 10%; the remaining 90% is semi-structured and unstructured data. XML is used as the basic storage form of the data, covering the expression of data formats, knowledge models and semantic metadata, so that multiple heterogeneous data sources can be integrated across the Internet and the Intranet on a collaborative platform;
b. protocol conversion: data collected by the system's own collection devices is converted in real time, its protocol type is identified against the protocol type library, and it is stored in the database. Data collected by other monitoring devices (such as intrusion detection, firewall and content audit devices) is converted and aggregated in real time or non-real time;
c. dynamic loading algorithm: each rule can be dynamically associated with multiple specific analysis objects. The dynamic loading algorithm periodically fetches data from the data source according to the rule's data extraction time and stores it in the case database; data that already exists is not extracted again.
The knowledge base comprises a user information record table, a mail feature library, a social platform feature library, an intranet data flow feature library and a user behavior feature library.
Network behavior collection uses a data distribution technology based on CIP and SIP, which supports fast interception, distribution and restoration of big data; keyword matching uses an improved AC-BM algorithm to improve search efficiency; an efficient load balancing algorithm balances the load of a large-scale network center; and a node detection mode is used for data exchange and communication between different hosts, improving the overall throughput of the system.
The objects for collecting the network behaviors comprise mail data collection, social platform application data collection, intranet transport layer data stream collection, database protocol data collection and remote control protocol data collection.
The threat detection method adopts a protocol analysis and transmission flow analysis technology to analyze and research the transmission flow and an application layer protocol in an important service system, analyzes the conversation process and the conversation characteristics of the protocols, grasps the user behavior, and realizes evidence preservation of external connection behavior of an information system and detection of abnormal data.
Evidence is presented in two forms, chart and list. Chart queries use a drill-down mode, going layer by layer from the whole to the details. The list supports combined queries: all behavior logs and manual judgment logs can be queried by combining multiple conditions, including behavior pattern, behavior object and specific IP. This is easy to use and simple to operate, and helps users grasp the overall behavior situation and trace APT attack events.
The association analysis adopts a case-based data management and knowledge discovery model CDMKDM. The model can realize sorting, merging and filing of a large amount of original data collected from a network, extract interesting knowledge and information in the original data, establish implicit association relation on related information according to the service requirement of actual work, provide an intuitive knowledge representation mode and assist users to make decisions by fully utilizing network data.
As shown in FIG. 3, in the business flow of the APT threat detection method based on an intrusion path, a knowledge base is modeled for the email and social networking site domains, a white list of the user's intranet data flows is modeled, behavior data originating from the host and from the network is correlated according to set logic conditions, evidence of attack-risk behavior is restored and preserved to form complete evidence for the incident, and the evidence is presented. The detection method further comprises topic-oriented information classification, used in the data collection part and the background data mining part. Event classification classifies data under given conditions; using classification techniques, alarm events can be classified automatically, so that abnormal events are confirmed. For example, in application-layer evidence analysis, application-layer protocols can be divided by port number into HTTP upstream, HTTP downstream, mail sending, mail receiving and so on. Some protocols can be further subdivided by the feature code of each application; for example, HTTP upstream can be divided into login information, BBS, Web chat room, WebMail and so on. The key techniques for implementing the classifier are text representation, word segmentation, feature extraction and the classification algorithm. This project uses the classical vector space model and the cosine of the angle between vectors to compute the similarity between each document and the user's requirements.
The detection method further comprises event clustering: various kinds of evidence are analyzed and clustered to achieve dynamic awareness of security events. Event clustering divides data into different classes according to its characteristics, without supervision. The aim is to keep the distance between individuals of the same class as small as possible and the distance between individuals of different classes as large as possible. In a forensics system based on active defense, clustering can be used to analyze alarm events that occur in clusters and discover abnormal patterns, thereby generating early warning information.
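The vector space model with cosine similarity mentioned above can be sketched as follows. This is a minimal illustration under assumptions: whitespace splitting stands in for real word segmentation, raw term counts stand in for whatever feature weighting the project used, and the sample documents and the names `tf_vector` and `cosine` are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text: str) -> Counter:
    """Term-frequency vector; whitespace splitting is a stand-in for segmentation."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


# Rank hypothetical documents against a user requirement.
requirement = tf_vector("webmail login")
documents = {
    "d1": "user login to webmail portal",
    "d2": "bbs post about weather",
}
scores = {name: cosine(requirement, tf_vector(body)) for name, body in documents.items()}
best = max(scores, key=scores.get)
```

A document that shares no terms with the requirement scores 0, so `d1` ranks above `d2` in this example.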
As shown in fig. 2, an APT threat detection system based on an intrusion path includes:
an evidence presentation module, a behavior evidence correlation analysis module, a knowledge base module, an evidence preservation module and an evidence collection module, wherein the behavior evidence correlation analysis module, the knowledge base module and the evidence preservation module are each connected with the evidence presentation module; the behavior evidence correlation analysis module is connected with the knowledge base module, and the knowledge base module is connected with the evidence preservation module; the knowledge base module and the evidence preservation module are each connected with the evidence collection module.
The evidence presentation module comprises a behavior evidence presentation module, an evidence auxiliary analysis module and a behavior main body responsibility confirmation module, the evidence presentation module is connected with the evidence auxiliary analysis module, and the evidence auxiliary analysis module is connected with the behavior main body responsibility confirmation module;
the behavior evidence correlation analysis module comprises a host operation behavior module, a network communication behavior module, a business behavior module, a remote service behavior module and a correlation module, wherein the host operation behavior module, the network communication behavior module, the business behavior module and the remote service behavior module are respectively connected with the correlation module;
the knowledge base module comprises a user information record base, a user security requirement base, a threat model base, a behavior risk assessment standard base, a rules and regulations base and an evidence collection strategy base, wherein the first end of the threat model base and the first end of the behavior risk assessment standard base are each connected with the user information record base and the user security requirement base, the second end of the threat model base is connected with the first end of the rules and regulations base, the second end of the behavior risk assessment standard base is connected with the first end of the rules and regulations base, and the second end of the rules and regulations base is connected with the evidence collection strategy base;
the evidence preservation module comprises an evidence module, a regulation processing module, an original data module, a technical processing module and a data storage module, wherein the data storage module is connected with a first end of the original data module, the first end of the technical processing module and the first end of the regulation processing module are respectively connected with a second end of the original data module, and the second end of the regulation processing module is connected with the evidence module;
the evidence collection module comprises a host behavior collection module, a network behavior collection module, various servers, hosts and equipment, wherein the host behavior collection module and the network behavior collection module are respectively connected with the various servers, the hosts and the equipment.
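The top-level connections among the five modules listed above can be written down as an adjacency map, which makes it easy to check which modules are directly linked. This is only an illustrative sketch: the short module names, the treatment of every connection as undirected, and the helper functions are assumptions that do not appear in the patent.

```python
from collections import deque

# Undirected top-level connections as described for fig. 2 (names are illustrative).
EDGES = [
    ("correlation_analysis", "presentation"),
    ("knowledge_base", "presentation"),
    ("preservation", "presentation"),
    ("correlation_analysis", "knowledge_base"),
    ("knowledge_base", "preservation"),
    ("knowledge_base", "collection"),
    ("preservation", "collection"),
]


def build_adjacency(edges):
    """Turn an edge list into a dict mapping each module to its neighbours."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns a list of modules, or None if unreachable."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(adjacency.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

Under this reading, collected data reaches the presentation module through either the knowledge base module or the evidence preservation module, never directly.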
The system's evidence presentation module, behavior evidence correlation analysis module, knowledge base module, evidence preservation module and evidence collection module are interconnected using Intranet technology.
Data transmission among the evidence presentation module, the behavior evidence correlation analysis module, the knowledge base module, the evidence preservation module and the evidence collection module is encrypted and includes user authentication and access-rights management. The evidence collection module collects data from each collection area; the collection areas can be the network management centers of basic information networks and important information systems.
The behavior evidence correlation analysis module performs correlation analysis on the data from the front-end data collection module and classifies the evidence data according to its content.
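As a rough illustration of correlating host-side and network-side behavior data under a set logic condition, the sketch below pairs events that occur on the same host within a time window. The patent does not disclose the actual logic conditions, so the same-host and time-window rule, the `Event` fields and the `correlate` function are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    source: str       # "host" or "network"
    host: str         # hypothetical host identifier
    timestamp: float  # seconds since an arbitrary epoch
    action: str       # free-form description of the behavior


def correlate(host_events, network_events, window=60.0):
    """Pair host and network events seen on the same host within `window` seconds."""
    pairs = []
    for h in host_events:
        for n in network_events:
            if h.host == n.host and abs(h.timestamp - n.timestamp) <= window:
                pairs.append((h, n))
    return pairs
```

For instance, a registry change on one host followed within a minute by an outbound connection from the same host would be returned as one correlated pair for further analysis.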
The evidence preservation module generates data records of network attack and breach events.
The evidence presentation module consists mainly of various query/management terminals. It generates statements and analysis reports according to the user's requirements, queries the contents of the data warehouse through a friendly interface, supports session replay, and manages and maintains the platforms, for example through backup and deletion.
As shown in fig. 4, the architecture of an APT threat detection system based on an intrusion path comprises a knowledge base, an evidence collection layer, an evidence preservation layer, an evidence analysis layer, an evidence presentation layer and a standard time source.
When the system runs, the evidence presentation module, the behavior evidence correlation analysis module, the knowledge base module, the evidence preservation module and the evidence collection module are connected dynamically and at high speed. On the one hand, the devices in the evidence area obtain rules from the platform's rule base through a collector, dynamically store the collected data in the platform and raise alarms; on the other hand, the user authentication mechanism receives the query/management requests from each device of the user analysis platform and provides data analysis or rule modification services. The foregoing describes preferred embodiments of the invention. It is to be understood that the invention is not limited to the forms disclosed herein, that these forms are not to be regarded as excluding other embodiments, and that the invention may be used in various other combinations, modifications and environments and may be changed within the scope of the concept described herein, through the above teachings or through the skill or knowledge of the relevant art. Modifications and variations made by those skilled in the art that do not depart from the spirit and scope of the invention fall within the protection of the appended claims.

Claims (4)

1. An APT threat detection method based on an intrusion path, characterized by comprising the following steps:
S1: modeling a knowledge base for the intrusion path domain; the knowledge base modeling uses a multi-dimensional heterogeneous data source synthesis and integration model, which synthesizes and integrates multiple heterogeneous data sources and dynamically invokes a suitable data mining algorithm to improve analysis efficiency; the main design ideas are as follows:
a. using a unified knowledge representation: Internet data takes the form of structured data, semi-structured data and unstructured data; structured data accounts for only 10%, and the remaining 90% is semi-structured and unstructured data; XML is used as the basic storage form of the data, covering data formats, knowledge models and the expression of semantic metadata, and multiple heterogeneous data sources are integrated across the Internet and the Intranet on a collaborative platform;
b. protocol conversion: data collected by the system's own collection devices is converted in real time, its protocol type is identified against a protocol type library, and it is stored in a database; data collected by other monitoring devices is converted and aggregated in real time or non-real time;
c. dynamic loading algorithm: each rule is dynamically associated with several specific analysis objects; according to the rule's data extraction time, the dynamic loading algorithm periodically fetches data from the data source and stores it in a case database, and data that already exists is not extracted again;
S2: collecting behavior data, including collecting host behavior data and collecting network behavior data, wherein collecting host behavior data includes collecting incoming/outgoing line information records, port information records, operation records of disk data, system registry information change records, terminal system basic information update records, peripheral device connection and data transmission records, and third-party application information records; collecting network behavior data uses a data distribution technology based on CIP and SIP to support rapid interception, distribution and restoration of big data; the keyword matching technology uses an improved AC-BM algorithm to improve search efficiency; an efficient load-balancing algorithm balances the load of a large-scale network center; data exchange and communication between different hosts use node detection to raise the overall throughput of the system; network behaviors are first classified and labeled, network behavior data is then restored, operation-and-maintenance service information and external connection information of the network system are recorded and tracked, and finally the data is stored locally;
the improved AC-BM algorithm is the BMH2 algorithm, with the following specific steps:
let the character set be Σ, and count how many times each character of Σ appears in pattern; let A1 be the set of characters that appear once, A2 the set of characters that appear twice or more, and A0 the set of characters that appear 0 times;
the last occurrence in pattern of the character text[k] is aligned with text[k] in the text, and a new round of matching starts; when text[k] belongs to A0 ∪ A1, the pointer scanning the text moves forward by the maximum distance m, where m is the length of pattern; when text[k] belongs to A2, the pointer scanning the text moves forward by the distance between the two occurrences in pattern of the character corresponding to text[k];
a newSkip array is added: if the character ch appears 0 times or 1 time in the pattern string pattern, then newSkip[ch] = m; if ch appears 2 or more times in pattern, and f denotes the position of the second-to-last occurrence of ch in pattern, then newSkip[ch] = m-f-1; in addition, a preChar array is defined: if the last occurrence of ch in the pattern string is at pattern[e], then preChar[ch] = pattern[e-1]; if ch does not appear in pattern, then preChar[ch] = -1; when pattern[0] appears only once in the pattern string, newSkip[pattern[0]] is separately assigned the value m-1, since pattern[0] has no preceding character; the newSkip array and the preChar array have the same length as the skip array, namely the number of elements in the character set; for ASCII code, the length is 256;
when matching starts, text[k-m+1 … k] is compared with pattern[0 … m-1], checking text[k] … text[k-m+1] from right to left; if no match is found, text[k-1] and preChar[text[k]] are compared; when text[k-1] != preChar[text[k]], the text pointer is reassigned to k + newSkip[text[k]]; otherwise, the text pointer is reassigned to k + skip[text[k]]; when text[k] does not appear in the pattern string, preChar[text[k]] is initialized to an arbitrary value; because skip[text[k]] and newSkip[text[k]] are then both m, the text pointer is reassigned to k + m whether or not text[k-1] and preChar[text[k]] are equal;
S3: performing correlation analysis on the collected behavior data: the attack process is first reconstructed, and the behavior data originating from the host and from the network is then correlated according to set logic conditions;
S4: preserving the evidence: evidence of risky attack behavior is restored and preserved to form complete evidence for the incident;
S5: presenting the evidence;
the APT threat detection method models a knowledge base for the mail and social-networking-site domains, builds a whitelist model of the user's intranet data flows, performs correlation analysis on behavior data originating from the host and from the network according to set logic conditions, restores and preserves evidence of risky attack behavior, forms complete evidence for the incident and presents the evidence; the detection method also includes subject-oriented information classification, used in the data collection part and the back-end data mining part, wherein event classification sorts data into classes under given conditions and classification techniques are used to classify alarm events automatically so as to confirm abnormal events; in application-layer evidence analysis, application-layer protocols are divided by port number into HTTP upstream, HTTP downstream, mail sending and mail receiving; HTTP upstream is divided by the feature code of each application into login information, BBS, Web chat room and WebMail; the key techniques for implementing the classifier are text representation, word segmentation, feature extraction and the classification algorithm; the similarity between each document and the user's requirements is calculated using the classical vector space model and the cosine of the angle between vectors;
the detection method also includes event clustering, in which the various pieces of evidence are analyzed and clustered so that security events of all kinds are perceived dynamically; event clustering divides data into different classes according to the characteristics of the data, without supervision; the distance between individuals in the same class is made as small as possible, and the distance between individuals in different classes is made as large as possible; in a forensics system based on active defense, alarm events that occur in clusters are analyzed through clustering to discover abnormal patterns and thereby generate early-warning information.
2. The APT threat detection method based on an intrusion path according to claim 1, characterized in that the collected network behavior data comprises mail data, social platform application data, intranet transport-layer data streams, database protocol data and remote control protocol data.
3. The APT threat detection method based on an intrusion path according to claim 1, characterized in that in step S4, queries are made using tables and lists; table queries drill down from the general to the detailed, list queries go deeper layer by layer, and multi-condition combined queries are run over all behavior logs and manual research-and-judgment logs.
4. The APT threat detection method based on an intrusion path according to claim 1, characterized in that the model used for the correlation analysis is the case-based data management and knowledge discovery model CDMKDM.
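The BMH2 matching steps recited in claim 1 can be sketched as follows. This is an illustrative reading rather than the patented implementation. Several points are assumptions: occurrences are counted over pattern[0 … m-2] (the usual Horspool convention, since the claim does not say whether the final position is counted), dictionaries stand in for the 256-element arrays, `None` stands in for the "no preceding character" case, and the shift rule is applied after a full match as well as after a mismatch so that all occurrences are reported.

```python
def build_tables(pattern):
    """Build the skip, newSkip and preChar tables for characters in pattern[:-1]."""
    m = len(pattern)
    positions = {}
    for i, ch in enumerate(pattern[:-1]):  # final position excluded (assumption)
        positions.setdefault(ch, []).append(i)
    skip, new_skip, pre_char = {}, {}, {}
    for ch, occ in positions.items():
        e = occ[-1]                 # last occurrence of ch
        skip[ch] = m - 1 - e        # ordinary Horspool shift
        if e == 0:
            # pattern[0] has no preceding character, so the shift stays m - 1.
            pre_char[ch] = None
            new_skip[ch] = m - 1
        else:
            pre_char[ch] = pattern[e - 1]
            # Jump to the second-to-last occurrence, or past the window if none.
            new_skip[ch] = m - 1 - occ[-2] if len(occ) >= 2 else m
    return skip, new_skip, pre_char


def bmh2_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    skip, new_skip, pre_char = build_tables(pattern)
    matches = []
    k = m - 1  # index in text aligned with the last pattern character
    while k < n:
        # Compare the window with the pattern from right to left.
        j = m - 1
        while j >= 0 and text[k - m + 1 + j] == pattern[j]:
            j -= 1
        if j < 0:
            matches.append(k - m + 1)
        ch = text[k]
        if ch not in skip:
            k += m                  # ch cannot be aligned anywhere in the window
        elif pre_char[ch] is not None and text[k - 1] != pre_char[ch]:
            k += new_skip[ch]       # last occurrence ruled out by text[k-1]
        else:
            k += skip[ch]
    return matches
```

The larger shift is safe because aligning the last occurrence of text[k] in the pattern would also require the preceding pattern character to equal text[k-1]; when it does not, the next possible alignment is the second-to-last occurrence.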
CN201710303758.6A 2017-05-03 2017-05-03 APT threat detection method and system based on intrusion path Active CN107172022B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710303758.6A CN107172022B (en) 2017-05-03 2017-05-03 APT threat detection method and system based on intrusion path

Publications (2)

Publication Number Publication Date
CN107172022A CN107172022A (en) 2017-09-15
CN107172022B CN107172022B (en) 2021-01-01

Family

ID=59812726

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710303758.6A Active CN107172022B (en) 2017-05-03 2017-05-03 APT threat detection method and system based on intrusion path

Country Status (1)

Country Link
CN (1) CN107172022B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109951419A (en) * 2017-12-20 2019-06-28 广东电网有限责任公司电力调度控制中心 A kind of APT intrusion detection method based on attack chain attack rule digging
CN108229175B (en) * 2017-12-28 2020-04-10 中国科学院信息工程研究所 Correlation analysis system and method for multidimensional heterogeneous evidence obtaining information
CN110545251A (en) * 2018-05-29 2019-12-06 国际关系学院 evidence chain construction method for Trojan attack scene
CN109981596B (en) * 2019-03-05 2020-09-04 腾讯科技(深圳)有限公司 Host external connection detection method and device
CN110837640B (en) * 2019-11-08 2022-02-22 深信服科技股份有限公司 Malicious file searching and killing method, device, storage medium and device
CN111177772B (en) * 2019-12-04 2023-10-20 国网浙江省电力有限公司 Data security method for palm power business of power system
CN110958257B (en) * 2019-12-06 2022-06-07 北京中睿天下信息技术有限公司 Intranet permeation process reduction method and system
CN111245796B (en) * 2019-12-31 2022-06-14 南京联成科技发展股份有限公司 Big data analysis method for industrial network intrusion detection
CN111914408B (en) * 2020-07-15 2024-03-08 中国民航信息网络股份有限公司 Threat modeling-oriented information processing method and system and electronic equipment
CN112291260A (en) * 2020-11-12 2021-01-29 福建奇点时空数字科技有限公司 APT attack-oriented network security threat concealed target identification method
CN112202818B (en) * 2020-12-01 2021-03-09 南京中孚信息技术有限公司 Network traffic intrusion detection method and system fusing threat information
CN112671800B (en) * 2021-01-12 2023-09-26 江苏天翼安全技术有限公司 Method for quantifying enterprise risk value by threat
CN115412320A (en) * 2022-08-19 2022-11-29 奇安信网神信息技术(北京)股份有限公司 Attack behavior tracing method, device and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102594625A (en) * 2012-03-07 2012-07-18 北京启明星辰信息技术股份有限公司 White data filter method and system in APT (Advanced Persistent Threat) intelligent detection and analysis platform
CN102638458A (en) * 2012-03-23 2012-08-15 中国科学院软件研究所 Method for identifying vulnerability utilization safety threat and determining associated attack path
CN104283889A (en) * 2014-10-20 2015-01-14 国网重庆市电力公司电力科学研究院 Electric power system interior APT attack detection and pre-warning system based on network architecture
CN104753946A (en) * 2015-04-01 2015-07-01 浪潮电子信息产业股份有限公司 Security analysis framework based on network traffic meta data
CN105141598A (en) * 2015-08-14 2015-12-09 中国传媒大学 APT (Advanced Persistent Threat) attack detection method and APT attack detection device based on malicious domain name detection

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9736173B2 (en) * 2014-10-10 2017-08-15 Nec Corporation Differential dependency tracking for attack forensics
CN105871883B (en) * 2016-05-10 2019-10-08 上海交通大学 Advanced duration threat detection method based on attack analysis

Also Published As

Publication number Publication date
CN107172022A (en) 2017-09-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant