CN116319065A - Threat situation analysis method and system applied to business operation and maintenance - Google Patents

Publication number
CN116319065A
Authority
CN
China
Prior art keywords
threat
flow
sequence
target
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202310430598.7A
Other languages
Chinese (zh)
Inventor
喻永豪
张立新
聂子恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN202310430598.7A
Publication of CN116319065A
Legal status: Withdrawn

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/02 Capturing of monitoring data
    • H04L43/026 Capturing of monitoring data using flow identification
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/04 Processing captured monitoring data, e.g. for logfile generation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to the technical field of data mining and discloses a threat situation analysis method applied to business operation and maintenance, comprising the following steps: splitting the cleaned business operation and maintenance weblog information into a weblog information sequence; extracting flow object features, system object features and defending object features from the weblog information sequence one by one in time order; performing association analysis on the flow object features, the system object features and the defending object features to obtain associated situation features, and generating a network threat feature sequence from all the associated situation features; iteratively updating a preset time sequence threat model using the network threat feature sequence to obtain a threat situation analysis model; and generating, with the threat situation analysis model, real-time analysis threat features corresponding to real-time operation and maintenance weblog information, and performing an operation and maintenance network security upgrade according to the real-time analysis threat features. The invention also provides a threat situation analysis system applied to business operation and maintenance. The invention can improve the accuracy of threat situation analysis.

Description

Threat situation analysis method and system applied to business operation and maintenance
Technical Field
The invention relates to the technical field of data mining, in particular to a threat situation analysis method and system applied to business operation and maintenance.
Background
With the advent of the digital age, cyber threats have become a major challenge for enterprises and organizations. Cyber attacks can cause many kinds of loss, including financial loss, data leakage, network paralysis and reputational damage. To cope with increasingly complex cyber attacks and threats, cyber threat situation analysis is needed so that attacks are discovered and responded to in time, thereby improving the network security defenses of enterprises and organizations.
Most existing threat situation analysis methods are based on big data: they analyze the threat situation by calculating the proportion of threat traffic in real-time network traffic. In practice, however, operation and maintenance threat traffic often has time-domain characteristics, and a purely big-data-based method cannot identify time-domain correlations between flows, so the accuracy of threat situation analysis is low.
Disclosure of Invention
The invention provides a threat situation analysis method and a threat situation analysis system applied to business operation and maintenance, which mainly aim to solve the problem of lower accuracy in threat situation analysis.
In order to achieve the above object, the present invention provides a threat situation analysis method applied to business operation and maintenance, including:
acquiring business operation and maintenance weblog information, performing data cleaning on the business operation and maintenance weblog information to obtain standard weblog information, and splitting the standard weblog information into a weblog information sequence according to the time domain;
selecting the weblog information in the weblog information sequence one by one in time order as target domain log information, extracting a network traffic log, a system program log and an active defense log from the target domain log information, extracting flow object features from the network traffic log, extracting system object features from the system program log, and extracting defending object features from the active defense log;
performing association analysis on the flow object features, the system object features and the defending object features to obtain associated situation features, and generating a network threat feature sequence according to all the associated situation features and the weblog information sequence;
extracting, with a preset time sequence threat model, the long-short time sequence threat features and the attention threat features corresponding to the weblog information sequence, generating an analysis threat feature sequence according to the long-short time sequence threat features and the attention threat features, and iteratively updating the time sequence threat model according to the network threat feature sequence and the analysis threat feature sequence to obtain a threat situation analysis model, wherein the iterative updating comprises: calculating a threat loss value between the network threat feature sequence and the analysis threat feature sequence using the following threat loss value formula:
Figure BDA0004190193750000021
where loss is the threat loss value; n is the sequence length of the network threat feature sequence, which is equal to the sequence length of the analysis threat feature sequence; j is the sequence number; X_j is the network threat feature with sequence number j in the network threat feature sequence; and the symbol shown in Figure BDA0004190193750000022 is the analysis threat feature with sequence number j in the analysis threat feature sequence;
judging whether the threat loss value is smaller than a preset loss value threshold; if not, updating the model parameters of the time sequence threat model according to the threat loss value, and returning to the step of extracting the long-short time sequence threat features and the attention threat features corresponding to the weblog information sequence with the preset time sequence threat model; if yes, taking the updated time sequence threat model as the threat situation analysis model;
and acquiring real-time operation and maintenance weblog information, generating real-time analysis threat features corresponding to the real-time operation and maintenance weblog information by using the threat situation analysis model, and carrying out operation and maintenance weblog security upgrading according to the real-time analysis threat features.
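The threshold-controlled iterative update described above can be sketched as follows. This is only an illustration under stated assumptions, not the patented model: the loss is assumed to be a mean squared error because the source formula survives only as an image reference, the "model" is a hypothetical one-parameter scaling of the input, and the names `threat_loss`, `train_until_threshold`, `learning_rate` and `max_rounds` are invented for the example.

```python
from typing import List, Tuple


def threat_loss(target: List[float], predicted: List[float]) -> float:
    """Assumed loss: mean squared error over two equal-length sequences."""
    if len(target) != len(predicted):
        raise ValueError("both sequences must have the same length n")
    n = len(target)
    return sum((x - p) ** 2 for x, p in zip(target, predicted)) / n


def train_until_threshold(
    inputs: List[float],
    target: List[float],
    loss_threshold: float,
    learning_rate: float = 0.05,
    max_rounds: int = 10_000,
) -> Tuple[float, float, int]:
    """Update a toy model (prediction = w * input) until loss < threshold."""
    w = 0.0  # single hypothetical model parameter
    for round_no in range(1, max_rounds + 1):
        predicted = [w * v for v in inputs]
        loss = threat_loss(target, predicted)
        if loss < loss_threshold:
            # "if yes": keep the updated model
            return w, loss, round_no
        # "if not": update the parameter from the loss (MSE gradient step)
        grad = sum(2 * (w * v - x) * v for v, x in zip(inputs, target)) / len(inputs)
        w -= learning_rate * grad
    raise RuntimeError("loss did not fall below the threshold")


if __name__ == "__main__":
    w, loss, rounds = train_until_threshold(
        [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], loss_threshold=1e-6
    )
    print(f"w={w:.4f} loss={loss:.2e} rounds={rounds}")
```

The loop mirrors the claim's control flow: compute the loss, compare it with the preset threshold, and either stop or update the parameters and repeat.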
Optionally, the performing data cleaning on the business operation and maintenance weblog information to obtain standard weblog information includes:
splitting the business operation and maintenance weblog information into a plurality of single-class operation and maintenance log data sets according to data type;
selecting the single-class operation and maintenance log data sets one by one as a target-class operation and maintenance log data set, taking the data type of the target-class operation and maintenance log data set as the target data type, and taking the value range of the target data type as the target value range;
removing duplicate log data from the target-class operation and maintenance log data set to obtain a target primary log data set;
removing garbled log data from the target primary log data set according to the target data type to obtain a target secondary log data set;
and removing out-of-range log data from the target secondary log data set according to the target value range to obtain a target standard single-class log data set, and collecting all the target standard single-class log data sets into the standard weblog information.
Optionally, the extracting flow object features from the network traffic log includes:
dividing the network traffic log into a plurality of data flow logs, selecting the data flow logs one by one as a target data flow log, taking the flow size of the target data flow log as the target flow size, and extracting the paired data proportion from the target data flow log;
calculating the flow growth rate of the target data flow log according to the target flow size and the paired data proportion;
extracting the number of flow packets, the number of flow bits and the flow life cycle from the target data flow log;
performing data tracing on the target data flow log to obtain a communication address set, resolving the geographic address set corresponding to the communication address set using a preset geographic location library, and calculating the geographic flow proportion corresponding to the geographic address set;
performing address verification on the addresses in the communication address set to obtain the number of fake addresses, and calculating the fake address growth rate according to the number of fake addresses;
and collecting the geographic flow proportion, the flow growth rate, the number of flow packets, the number of flow bits, the flow life cycle and the fake address growth rate into target flow features, and collecting all the target flow features into the flow object features.
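Part of this extraction can be sketched in code. The example below is a minimal illustration covering only the packet count, bit count, life cycle and geographic flow proportion; the two growth rates are left out because the text does not give their formulas. The `Packet` record, the `geo_db` lookup table and the function name `flow_features` are assumptions made for the sketch, not structures defined by the source.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Packet:
    """Hypothetical record for one packet of a data flow log."""
    timestamp: float  # seconds since some reference point
    size_bytes: int
    src_ip: str


def flow_features(packets: List[Packet], geo_db: Dict[str, str]) -> dict:
    """Compute a few per-flow features from the packets of one data flow."""
    if not packets:
        raise ValueError("a data flow needs at least one packet")
    total_bytes = sum(p.size_bytes for p in packets)
    times = [p.timestamp for p in packets]

    # Aggregate bytes per region; geo_db stands in for the geographic library.
    bytes_by_region: Dict[str, int] = defaultdict(int)
    for p in packets:
        bytes_by_region[geo_db.get(p.src_ip, "unknown")] += p.size_bytes

    geo_ratio = (
        {region: b / total_bytes for region, b in bytes_by_region.items()}
        if total_bytes
        else {}
    )
    return {
        "packet_count": len(packets),
        "bit_count": total_bytes * 8,
        "life_cycle": max(times) - min(times),
        "geo_ratio": geo_ratio,
    }


if __name__ == "__main__":
    demo = [
        Packet(0.0, 100, "10.0.0.1"),
        Packet(1.5, 300, "10.0.0.2"),
        Packet(4.0, 100, "10.0.0.1"),
    ]
    print(flow_features(demo, {"10.0.0.1": "CN", "10.0.0.2": "US"}))
```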
Optionally, the extracting system object features from the system program log includes:
selecting the process events in the system program log one by one in time order as a target process event, taking the event name of the target process event as the target event name, and extracting the start time and the end time corresponding to the target event name;
calculating the behavior duration according to the start time and the end time, extracting the start time domain feature of the start time, and extracting the end time domain feature of the end time;
and collecting the target event name, the start time domain feature, the end time domain feature and the behavior duration into a target object group, vectorizing the target object group into a target system feature, and collecting all the target system features into the system object features.
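As a hedged illustration of these steps, the sketch below turns one process event into a numeric vector. The text does not say how a time domain feature is encoded or how the object group is vectorized, so the hour-of-day and day-of-week encoding and the integer vocabulary for event names are assumptions, as are the function names.

```python
from datetime import datetime
from typing import Dict, List


def time_domain_feature(moment: datetime) -> List[float]:
    """Assumed encoding: hour of day and day of week, each scaled to [0, 1)."""
    return [moment.hour / 24.0, moment.weekday() / 7.0]


def system_feature(
    name: str, start: datetime, end: datetime, vocabulary: Dict[str, int]
) -> List[float]:
    """Vectorize (event name, start feature, end feature, duration)."""
    if end < start:
        raise ValueError("end time precedes start time")
    duration = (end - start).total_seconds()  # behavior duration in seconds
    # Map each distinct event name to a stable integer id (hypothetical scheme).
    name_id = vocabulary.setdefault(name, len(vocabulary))
    return [
        float(name_id),
        *time_domain_feature(start),
        *time_domain_feature(end),
        duration,
    ]


if __name__ == "__main__":
    vocab: Dict[str, int] = {}
    vec = system_feature(
        "backup",
        datetime(2023, 4, 17, 6, 0, 0),
        datetime(2023, 4, 17, 6, 0, 30),
        vocab,
    )
    print(vec)
```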
Optionally, the extracting the defending object feature from the active defending log includes:
selecting the defending processes in the active defending log as target defending processes one by one according to the time sequence, and taking the process names of the target defending processes as target process names;
extracting a defending process duration, an intruding virus name, an attack data type and an attack frequency from the target defending process;
generating target defending features according to the target process name, the defending process duration, the intruding virus name, the attack data type and the attack frequency, and collecting all the target defending features into the defending object features.
Optionally, the performing association analysis on the flow object feature, the system object feature and the defending object feature to obtain an association situation feature includes:
Clustering each flow characteristic in the flow object characteristics to obtain a flow characteristic class set, and updating the flow object characteristics into a standard flow characteristic sequence according to the flow characteristic class set;
clustering all the system features in the system object features to obtain a system feature class set, and updating the system object features into a standard system feature sequence according to the system feature class set;
clustering each defending feature in the defending object features to obtain a defending feature class set, and updating the defending object features to obtain a standard defending feature sequence according to the defending feature class set;
splitting the standard flow characteristic sequence into a plurality of standard flow characteristic sections by utilizing a preset time window, selecting the standard flow characteristic sections one by one as target section standard flow characteristics according to a time sequence, screening standard system characteristics corresponding to the target section standard flow characteristics from the standard system characteristic sequence as target section standard system characteristics, and screening standard defense characteristics corresponding to the target section standard flow characteristics from the standard defense characteristic sequence as target section standard defense characteristics;
calculating the situation association degree among the standard flow characteristics of the target segment, the standard system characteristics of the target segment and the standard defense characteristics of the target segment by using the following situation association degree algorithm:
Figure BDA0004190193750000041
where C is the situation association degree; m is the window length of the preset time window; θ is the preset association degree countermeasure coefficient; i denotes the i-th moment in the preset time window; x_i is the value of the flow characteristic in the standard flow characteristics of the target segment at the i-th moment in the preset time window, and the symbol shown in Figure BDA0004190193750000051 is the mean value of the flow characteristics in the standard flow characteristics of the target segment over the preset time window; y_i is the value of the system characteristic in the standard system characteristics of the target segment at the i-th moment in the preset time window, and the symbol shown in Figure BDA0004190193750000052 is the mean value of the system characteristics in the standard system characteristics of the target segment over the preset time window; z_i is the value of the defending characteristic in the standard defense characteristics of the target segment at the i-th moment in the preset time window, and the symbol shown in Figure BDA0004190193750000054 is the mean value of the defending characteristics in the standard defense characteristics of the target segment over the preset time window;
and taking the standard flow characteristics of the target segment when the situation association degree is larger than a preset association threshold as the standard flow characteristics of the associated segment, taking the time domain segment corresponding to the standard flow characteristics of the associated segment as the associated time domain segment, and extracting the association situation characteristics from all the associated time domain segments.
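The windowing and thresholding logic can be sketched as below. Because the situation association degree formula survives only as an image reference, the sketch substitutes a stand-in measure, the mean absolute pairwise Pearson correlation of the three series inside a window; this is an assumption for illustration and not the formula of the source. The function names and the use of non-overlapping windows are likewise assumptions.

```python
from math import sqrt
from typing import List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either series is constant."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def association(x: Sequence[float], y: Sequence[float], z: Sequence[float]) -> float:
    """Stand-in association degree: mean absolute pairwise correlation."""
    return (abs(pearson(x, y)) + abs(pearson(x, z)) + abs(pearson(y, z))) / 3


def associated_windows(
    x: Sequence[float],
    y: Sequence[float],
    z: Sequence[float],
    m: int,
    threshold: float,
) -> List[int]:
    """Start indices of non-overlapping length-m windows above the threshold."""
    if not (len(x) == len(y) == len(z)):
        raise ValueError("the three feature sequences must be aligned")
    starts = []
    for s in range(0, len(x) - m + 1, m):
        if association(x[s:s + m], y[s:s + m], z[s:s + m]) > threshold:
            starts.append(s)
    return starts


if __name__ == "__main__":
    flow = [1, 2, 3, 4, 1, 5, 2, 4]
    system = [2, 4, 6, 8, 3, 3, 3, 3]
    defense = [3, 5, 7, 9, 9, 1, 4, 2]
    print(associated_windows(flow, system, defense, m=4, threshold=0.8))
```

In the demo only the first window, where the three series move together, passes the threshold; the second window, where the system series is flat, does not.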
Optionally, the clustering the flow characteristics in the flow object characteristics to obtain a flow characteristic class set includes:
splitting the flow object features into a plurality of flow feature groups, and randomly selecting primary flow center features for each flow feature group;
calculating covariance matrix distances between each flow characteristic and each primary flow center characteristic in the flow object characteristics by using the following covariance distance formula:
Figure BDA0004190193750000053
where S refers to the covariance matrix distance, p refers to the flow feature, q refers to the primary flow center feature, T is a transpose symbol, cov () is a covariance symbol, cov (p, p) refers to the covariance of the flow feature, cov (p, q) refers to the covariance between the flow feature and the primary flow center feature, cov (q, p) refers to the covariance between the primary flow center feature and the flow feature, cov (q, q) refers to the covariance of the primary flow center feature;
updating the flow characteristic groups into standard flow characteristic groups one by one according to the covariance matrix distance;
calculating standard flow center features of each standard flow feature group, and calculating center covariance matrix distances between the standard flow center features and the corresponding primary flow center features one by one;
And iteratively updating each standard flow characteristic group into a corresponding flow characteristic class according to all the center covariance matrix distances, and collecting all the flow characteristic classes into a flow characteristic class set.
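The assign-and-recenter iteration above has the same shape as k-means clustering, which the following sketch illustrates. Two things are assumptions rather than the source's method: Euclidean distance is used as a stand-in because the covariance matrix distance formula is only available as an image reference, and the initial centers are drawn as random samples of the features rather than one per pre-split group. The function name and the `tol`, `max_iter` and `seed` parameters are hypothetical.

```python
import random
from math import dist
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def cluster_features(
    features: Sequence[Vector],
    k: int,
    tol: float = 1e-9,
    max_iter: int = 100,
    seed: int = 0,
) -> List[List[Vector]]:
    """Group feature vectors by repeated assignment and center updates."""
    rng = random.Random(seed)
    centers = rng.sample(list(features), k)  # randomly chosen primary centers
    groups: List[List[Vector]] = []
    for _ in range(max_iter):
        # Assign every feature to its nearest current center.
        groups = [[] for _ in range(k)]
        for f in features:
            nearest = min(range(k), key=lambda c: dist(f, centers[c]))
            groups[nearest].append(f)
        # Recompute each center as the mean of its group (keep it if empty).
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        # Stop once no center has moved by more than tol.
        shift = max(dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return groups


if __name__ == "__main__":
    data = [(0.0,), (1.0,), (2.0,), (100.0,), (101.0,), (102.0,)]
    for group in cluster_features(data, k=2):
        print(sorted(group))
```

Swapping `dist` for another distance function is the only change needed to try a different metric.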
Optionally, the extracting the associated situation features from all associated time domain segments includes:
selecting the associated time domain segments one by one as target associated time domain segments, taking a standard flow characteristic segment corresponding to the target associated time domain segments as target segment standard flow characteristics, taking a standard system characteristic segment corresponding to the target associated time domain segments as target segment standard system characteristics, and taking a standard defense characteristic segment corresponding to the target associated time domain segments as target segment standard defense characteristics;
normalizing the standard flow characteristics of the target segment into a target flow code, normalizing the system characteristics of the target segment into a target system code, and normalizing the defending characteristics of the target segment into a target defending code;
multiplying the target flow code by a preset flow threat coefficient to obtain a flow threat situation, multiplying the target system code by a preset system threat coefficient to obtain a system threat situation, and multiplying the target defense code by a preset defense threat coefficient to obtain a defense threat situation;
And taking the sum of the traffic threat situation, the system threat situation and the defending threat situation as an associated threat situation, and taking the sum of all the associated threat situations as associated situation characteristics.
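The weighted combination in the last two steps can be sketched as follows. The sketch assumes each normalized code is a single number per segment, which the text does not state, and the function names and coefficient values are hypothetical.

```python
from typing import List, Tuple


def associated_threat(
    flow_code: float,
    system_code: float,
    defense_code: float,
    flow_coeff: float,
    system_coeff: float,
    defense_coeff: float,
) -> float:
    """Weighted sum of the three normalized codes for one time-domain segment."""
    return (
        flow_coeff * flow_code
        + system_coeff * system_code
        + defense_coeff * defense_code
    )


def associated_situation_feature(
    segments: List[Tuple[float, float, float]],
    coeffs: Tuple[float, float, float],
) -> float:
    """Sum the associated threat situations over all associated segments."""
    return sum(associated_threat(f, s, d, *coeffs) for f, s, d in segments)


if __name__ == "__main__":
    # (flow code, system code, defense code) per associated segment
    demo_segments = [(0.5, 0.25, 1.0), (1.0, 0.0, 0.5)]
    print(associated_situation_feature(demo_segments, (2.0, 4.0, 1.0)))
```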
Optionally, the extracting the long and short time sequence threat features and the attention threat features corresponding to the weblog information sequence by using a preset time sequence threat model includes:
extracting features of the weblog information sequence by using a preset time sequence threat model to obtain an object feature sequence;
performing recursive feature extraction on the object feature sequence by using the time sequence threat model to obtain short-term time sequence threat features;
performing jump feature extraction on the object feature sequence by using the time sequence threat model to obtain long-term time sequence threat features;
fusing the short-term timing threat features and the long-term timing threat features into long-short timing threat features;
and extracting attention threat features corresponding to the object feature sequence by using a self-attention mechanism of the time sequence threat model.
In order to solve the above problem, the present invention also provides a threat situation analysis system applied to business operation and maintenance, the system comprising:
The data cleaning module is used for acquiring business operation and maintenance weblog information, cleaning the business operation and maintenance weblog information to obtain standard weblog information, and splitting the standard weblog information into weblog information sequences according to a time domain;
the feature extraction module is used for selecting the weblog information in the weblog information sequence one by one in time order as target domain log information, extracting a network traffic log, a system program log and an active defense log from the target domain log information, extracting flow object features from the network traffic log, extracting system object features from the system program log, and extracting defending object features from the active defense log;
the association analysis module is used for performing association analysis on the flow object features, the system object features and the defending object features to obtain associated situation features, and generating a network threat feature sequence according to all the associated situation features and the weblog information sequence;
the model training module is used for extracting, with a preset time sequence threat model, the long-short time sequence threat features and the attention threat features corresponding to the weblog information sequence, generating an analysis threat feature sequence according to the long-short time sequence threat features and the attention threat features, and iteratively updating the time sequence threat model according to the network threat feature sequence and the analysis threat feature sequence to obtain a threat situation analysis model, wherein the iterative updating comprises: calculating a threat loss value between the network threat feature sequence and the analysis threat feature sequence using the following threat loss value formula:
Figure BDA0004190193750000071
where loss is the threat loss value; n is the sequence length of the network threat feature sequence, which is equal to the sequence length of the analysis threat feature sequence; j is the sequence number; X_j is the network threat feature with sequence number j in the network threat feature sequence; and the symbol shown in Figure BDA0004190193750000072 is the analysis threat feature with sequence number j in the analysis threat feature sequence;
judging whether the threat loss value is smaller than a preset loss value threshold; if not, updating the model parameters of the time sequence threat model according to the threat loss value, and returning to the step of extracting the long-short time sequence threat features and the attention threat features corresponding to the weblog information sequence with the preset time sequence threat model; if yes, taking the updated time sequence threat model as the threat situation analysis model;
the threat analysis module is used for acquiring real-time operation and maintenance weblog information, generating real-time analysis threat characteristics corresponding to the real-time operation and maintenance weblog information by utilizing the threat situation analysis model, and carrying out operation and maintenance weblog security upgrading according to the real-time analysis threat characteristics.
In the embodiment of the invention, business operation and maintenance weblog information is acquired and cleaned to obtain standard weblog information, which improves the accuracy of the data set and hence of the subsequent threat situation analysis. Splitting the standard weblog information into a weblog information sequence according to the time domain facilitates the subsequent extraction of time sequence features. Extracting flow object features from the network traffic log, system object features from the system program log and defending object features from the active defense log captures the traffic change features, system operation features and defense features of the weblogs in the same time domain, which facilitates their subsequent association and improves the accuracy of threat situation prediction. Performing association analysis on the flow object features, the system object features and the defending object features to obtain associated situation features links traffic behavior, system behavior and defense behavior, which improves the accuracy of threat situation detection. Generating the network threat feature sequence from all the associated situation features and the weblog information sequence facilitates the subsequent extraction of the time-domain features of the threat situation.
An analysis threat feature sequence is generated with a preset time sequence threat model, and the time sequence threat model is iteratively updated according to the network threat feature sequence and the analysis threat feature sequence to obtain a threat situation analysis model. The trained threat situation analysis model can analyze the time sequence relationship between weblog information and the threat situation, which improves the accuracy of threat situation analysis. Generating real-time analysis threat features corresponding to the real-time operation and maintenance weblog information with the threat situation analysis model, and performing an operation and maintenance network security upgrade according to those features, makes it possible to analyze the threat situation in a future time period and so improve network security. Therefore, the threat situation analysis method and system applied to business operation and maintenance can solve the problem of low accuracy in threat situation analysis.
Drawings
FIG. 1 is a flow chart of a threat situation analysis method applied to business operation and maintenance according to an embodiment of the invention;
FIG. 2 is a flow chart of extracting flow object features according to an embodiment of the invention;
FIG. 3 is a flow chart of extracting associated situation features according to an embodiment of the invention;
FIG. 4 is a functional block diagram of a threat situation analysis system applied to business operation and maintenance according to an embodiment of the invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The embodiment of the application provides a threat situation analysis method applied to business operation and maintenance. The execution subject of the method includes, but is not limited to, at least one of a server, a terminal or another electronic device that can be configured to execute the method provided by the embodiment of the application. In other words, the method may be performed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server includes, but is not limited to, a single server, a server cluster, a cloud server or a cloud server cluster. The server may be an independent server, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and big data and artificial intelligence platforms.
Referring to fig. 1, a flow chart of a threat situation analysis method applied to business operation and maintenance according to an embodiment of the invention is shown. In this embodiment, the threat situation analysis method applied to business operation and maintenance includes:
S1, acquiring business operation and maintenance weblog information, performing data cleaning on the business operation and maintenance weblog information to obtain standard weblog information, and splitting the standard weblog information into a weblog information sequence according to a time domain.
In the embodiment of the invention, the business operation and maintenance weblog information refers to a data set formed by the various log information generated while the business operation and maintenance website is working. It comprises a network traffic log, a system program log and an active defense log. The network traffic log records data such as the network protocol, IP address and traffic size of each traffic flow visiting the business operation and maintenance website; the system program log refers to record data on the system running state and system operations of the business operation and maintenance website; and the active defense log refers to record data on virus detection and removal, intrusion detection and vulnerability scanning of the business operation and maintenance website.
In the implementation of the present invention, the data cleaning of the business operation and maintenance weblog information to obtain standard weblog information includes:
splitting the business operation and maintenance weblog information into a plurality of single-class operation and maintenance log data sets according to the data types;
selecting the single-class operation and maintenance log data set one by one as a target-class operation and maintenance log data set, taking the data type of the target-class operation and maintenance log data set as a target data type, and taking the value range of the target data type as a target value range;
screening out repeated log data from the target class operation and maintenance log data set to obtain a target primary log data set;
screening out messy code log data from the target primary log data set according to the target data type to obtain a target secondary log data set;
and screening out offside log data from the target secondary log data set according to the target value range to obtain a target standard single-class log data set, and collecting all the target standard single-class log data sets into standard weblog information.
In detail, splitting the business operation and maintenance weblog information into a plurality of single-class operation and maintenance log data sets according to the data types refers to collecting the data with the same data types in the business operation and maintenance weblog information into the single-class operation and maintenance log data sets, wherein the single-class operation and maintenance log data sets refer to a set corresponding to all data with one type in the business operation and maintenance weblog information, such as a data set composed of all IP addresses.
Specifically, the repeated log data refers to identical log data in the target class operation and maintenance log data set, where identical means repeated in the time domain (the same record at the same time) rather than merely having the same data value. The messy-code (garbled) log data refers to data that does not belong to the data type corresponding to the target primary log data set, for example English character data appearing where the data type is a floating point number. The offside (out-of-range) log data refers to data that falls outside the target value range, for example a traffic size of -30G.
In the embodiment of the present invention, splitting the standard weblog information into the weblog information sequence according to the time domain includes: sorting each piece of data in the standard weblog information in chronological order to obtain time sequence weblog information; and selecting, one by one in chronological order, the weblog information of each fixed time domain in the time sequence weblog information as target weblog information, and collecting all the target weblog information into a weblog information sequence, wherein the time domain may be one day or one hour.
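The three screening passes described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `LogRecord` type, the `clean_single_class` function, the integer timestamps and the 0 to 1000 value range are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    timestamp: int   # hypothetical: seconds since some epoch
    value: object    # raw value as read from the log


def clean_single_class(records, expected_type, value_range):
    """Apply the three screening passes to one single-class log data set."""
    low, high = value_range
    # Pass 1: drop repeated data, i.e. same timestamp AND same value.
    seen = set()
    primary = []
    for rec in records:
        key = (rec.timestamp, repr(rec.value))
        if key not in seen:
            seen.add(key)
            primary.append(rec)
    # Pass 2: drop garbled data whose value is not of the expected type.
    secondary = [r for r in primary if isinstance(r.value, expected_type)]
    # Pass 3: drop data outside the target value range.
    return [r for r in secondary if low <= r.value <= high]


traffic = [
    LogRecord(100, 1.5),
    LogRecord(100, 1.5),     # repeated in the time domain: removed
    LogRecord(160, 1.5),     # same value at a different time: kept
    LogRecord(220, "abc"),   # garbled for a float field: removed
    LogRecord(280, -30.0),   # negative traffic size: removed
]
cleaned = clean_single_class(traffic, float, (0.0, 1000.0))
print([(r.timestamp, r.value) for r in cleaned])
```

Keying the duplicate check on both timestamp and value reflects the statement that repetition is judged in the time domain rather than by value alone.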
In detail, the step of sorting each data in the standard weblog information according to the time sequence order, and the step of obtaining the time sequence weblog information refers to sorting according to the time sequence order of the time stamps of each data in the standard weblog information, so as to obtain the time sequence weblog information.
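The time-domain split can be sketched as a sort followed by bucketing on a fixed window. The function name `split_by_time_domain`, the `(timestamp, payload)` tuple layout and the choice to omit empty windows are assumptions made for this illustration.

```python
from collections import defaultdict


def split_by_time_domain(records, window_seconds):
    """Sort (timestamp_seconds, payload) records and group them by fixed window."""
    ordered = sorted(records, key=lambda r: r[0])
    buckets = defaultdict(list)
    for ts, payload in ordered:
        buckets[ts // window_seconds].append((ts, payload))
    # Windows are returned in chronological order; empty windows are omitted.
    return [buckets[k] for k in sorted(buckets)]


records = [(7300, "c"), (10, "a"), (3700, "b"), (3599, "a2")]
sequence = split_by_time_domain(records, 3600)  # one-hour time domain
print(sequence)
```

With a one-hour window the four records fall into three consecutive hourly groups.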
In the embodiment of the invention, the business operation and maintenance weblog information is obtained and subjected to data cleaning to obtain the standard weblog information, so that the accuracy of a data set can be improved, the accuracy of the subsequent threat situation analysis is improved, and the standard weblog information is split into the weblog information sequence according to a time domain, so that the extraction of the subsequent time sequence characteristics can be facilitated.
S2, selecting the weblog information in the weblog information sequence one by one in chronological order as target domain log information, extracting a network traffic log, a system program log and an active defense log from the target domain log information, extracting flow object features from the network traffic log, extracting system object features from the system program log, and extracting defending object features from the active defense log.
In the embodiment of the invention, the network flow log records the data such as network protocol, IP address and flow size of each flow accessing the commercial operation and maintenance website, the system program log refers to the system running state of the commercial operation and maintenance website and the operation related record data of the system, and the active defense log refers to the virus searching and killing, intrusion detection and vulnerability scanning related record data of the commercial operation and maintenance website.
In an embodiment of the present invention, referring to fig. 2, the extracting a traffic object feature from the network traffic log includes:
S21, dividing the network flow log into a plurality of data flow logs, selecting the data flow logs one by one as target data flow logs, taking the flow size of the target data flow logs as target flow size, and extracting paired data proportions from the target data flow logs;
S22, calculating the flow speed increase of the target data flow log according to the target flow size and the paired data proportion;
S23, extracting the number of flow packets, the number of flow bits and the flow life cycle from the target data flow log;
S24, carrying out data tracing on the target data flow log to obtain a communication address set, analyzing a geographic address set corresponding to the communication address set by using a preset geographic position library, and calculating a geographic flow ratio corresponding to the geographic address set;
S25, carrying out address verification on the addresses in the communication address set to obtain the fake address quantity, and calculating the fake address speed increase according to the fake address quantity;
S26, collecting the geographic flow ratio, the flow speed increase, the number of flow packets, the number of flow bits, the flow life cycle and the fake address speed increase into target flow features, and collecting all the target flow features into flow object features.
Specifically, the data flow log refers to a data record of an information data flow in the network traffic log in a traffic survival period.
In detail, the geographic location library refers to a database of the correspondence between IP addresses and geographic locations, and the geographic flow ratio refers to the ratio of the flow of each geographic location to the total flow. Calculating the flow speed increase of the target data flow log according to the target flow size and the paired data proportion refers to subtracting twice the product of the paired data proportion and the target flow size from the target flow size, and dividing the resulting difference by the transmission time.
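The two calculations described above can be written out directly. The sketch below assumes a dictionary stands in for the geographic location library; the function names, the "unknown" fallback label and the sample addresses (taken from documentation-only IP ranges) are hypothetical.

```python
def flow_speed_increase(flow_size, paired_ratio, transmission_time):
    """(size - 2 * ratio * size) / time, as described in the text."""
    if transmission_time <= 0:
        raise ValueError("transmission_time must be positive")
    return (flow_size - 2 * paired_ratio * flow_size) / transmission_time


def geographic_flow_ratio(flows, geo_library):
    """flows: list of (ip, size); geo_library: dict mapping ip -> location."""
    totals = {}
    for ip, size in flows:
        location = geo_library.get(ip, "unknown")  # fallback label is an assumption
        totals[location] = totals.get(location, 0) + size
    grand_total = sum(totals.values())
    if not grand_total:
        return {}
    return {loc: size / grand_total for loc, size in totals.items()}


speed = flow_speed_increase(100.0, 0.25, 10.0)   # (100 - 50) / 10
geo = {"203.0.113.5": "A", "203.0.113.9": "A", "198.51.100.7": "B"}
ratios = geographic_flow_ratio(
    [("203.0.113.5", 30), ("198.51.100.7", 10), ("203.0.113.9", 60)], geo
)
print(speed, ratios)
```

In this example location A carries 90 of 100 units of flow and location B carries the remaining 10.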
In an embodiment of the present invention, the extracting the system object feature from the system program log includes:
selecting process events in the system program log one by one in chronological order as target process events, taking the event names of the target process events as target event names, and extracting the start time and the end time corresponding to the target event names;
calculating behavior duration according to the starting time and the ending time, extracting a starting time domain feature of the starting time, and extracting an ending time domain feature of the ending time;
And collecting the target event name, the starting time domain feature, the ending time domain feature and the behavior duration into a target object group, vectorizing the target object group into target system features, and collecting all the target system features into system object features.
In detail, the event name is, for example, "opening xx office software" or "opening xx browser software". Calculating the behavior duration according to the start time and the end time refers to subtracting the start time from the end time to obtain the behavior duration, where the unit of the behavior duration is minutes.
In detail, extracting the starting time domain feature of the starting time and the ending time domain feature of the ending time refers to extracting the time-of-day features of the starting time and the ending time. For example, for a starting time of 14:32 on November 8, 2022, the starting time domain feature is 14:32, and for an ending time of 15:32 on November 8, 2022, the ending time domain feature is 15:32; the target object group is then, for example, (xx office software, 14.32, 15.32, 60).
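The example group (xx office software, 14.32, 15.32, 60) can be reproduced with a short sketch. Encoding the time of day as hours plus minutes/100 is an assumption inferred from that example tuple; the function name `system_object_group` is hypothetical, and the later vectorization step is not covered here.

```python
from datetime import datetime


def system_object_group(event_name, start, end):
    """Build (event name, start feature, end feature, duration in minutes)."""
    duration_minutes = (end - start).total_seconds() / 60
    # Time-of-day encoded as HH.MM, an assumption based on the example tuple.
    start_feature = round(start.hour + start.minute / 100, 2)
    end_feature = round(end.hour + end.minute / 100, 2)
    return (event_name, start_feature, end_feature, duration_minutes)


group = system_object_group(
    "xx office software",
    datetime(2022, 11, 8, 14, 32),
    datetime(2022, 11, 8, 15, 32),
)
print(group)
```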
In detail, the extracting the defending object feature from the active defending log includes:
Selecting the defending processes in the active defending log as target defending processes one by one according to the time sequence, and taking the process names of the target defending processes as target process names;
respectively extracting a defending process duration, an invading virus name, an attack data type and an attack frequency from the target defending process;
generating target defending features according to the target process name, the defending process duration, the invasive virus name, the attack data type and the attack frequency, and collecting all the target defending features into defending object features.
In detail, the process name is, for example, "msmpeng.exe" or "mcshield.exe"; the defending process duration refers to the running duration of the target defending process; the invading virus name refers to the name of the virus defended against by the target defending process; and the attack data type refers to the part of the operation and maintenance network's data attacked by the virus corresponding to the invading virus name, such as user information or transaction amounts.
In the embodiment of the invention, the flow object features are extracted from the network flow log, the system object features are extracted from the system program log, and the defending object features are extracted from the active defending log, so that the flow change features, the system operation features and the defending object features in the network log in the same time domain can be respectively extracted, the follow-up association of the three is facilitated, and the accuracy of threat situation prediction is improved.
And S3, carrying out relevance analysis on the flow object features, the system object features and the defending object features to obtain associated situation features, and generating a network threat feature sequence according to all the associated situation features and the network log information sequence.
In the embodiment of the present invention, performing relevance analysis on the flow object feature, the system object feature and the defending object feature to obtain a relevant situation feature includes:
clustering each flow characteristic in the flow object characteristics to obtain a flow characteristic class set, and updating the flow object characteristics into a standard flow characteristic sequence according to the flow characteristic class set;
clustering all the system features in the system object features to obtain a system feature class set, and updating the system object features into a standard system feature sequence according to the system feature class set;
clustering each defending feature in the defending object features to obtain a defending feature class set, and updating the defending object features to obtain a standard defending feature sequence according to the defending feature class set;
splitting the standard flow characteristic sequence into a plurality of standard flow characteristic sections by utilizing a preset time window, selecting the standard flow characteristic sections one by one as target section standard flow characteristics according to a time sequence, screening standard system characteristics corresponding to the target section standard flow characteristics from the standard system characteristic sequence as target section standard system characteristics, and screening standard defense characteristics corresponding to the target section standard flow characteristics from the standard defense characteristic sequence as target section standard defense characteristics;
Calculating the situation association degree among the standard flow characteristics of the target segment, the standard system characteristics of the target segment and the standard defense characteristics of the target segment by using the following situation association degree algorithm:
Figure BDA0004190193750000131
wherein C is the situation association degree; m is the window length of the preset time window; θ is the preset association degree countermeasure coefficient; i denotes the i-th moment in the preset time window; $x_i$ is the value of the flow feature in the target segment standard flow features at the i-th moment in the preset time window, and $\bar{x}$ is the mean value of the flow features in the target segment standard flow features over the preset time window; $y_i$ is the value of the system feature in the target segment standard system features at the i-th moment in the preset time window, and $\bar{y}$ is the mean value of the system features in the target segment standard system features over the preset time window; $z_i$ is the value of the defending feature in the target segment standard defense features at the i-th moment in the preset time window, and $\bar{z}$ is the mean value of the defending features in the target segment standard defense features over the preset time window;
and taking the standard flow characteristics of the target segment when the situation association degree is larger than a preset association threshold as the standard flow characteristics of the associated segment, taking the time domain segment corresponding to the standard flow characteristics of the associated segment as the associated time domain segment, and extracting the association situation characteristics from all the associated time domain segments.
In detail, by calculating the situation association degree among the target segment standard flow features, the target segment standard system features and the target segment standard defense features with the situation association degree algorithm, the association degree can be characterized by how far each of the three features deviates from its own mean at each moment of the time window, which improves the accuracy of the situation analysis.
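The windowing and threshold selection around the association degree can be sketched as below. The patent's own formula is available only as an image and is not reproduced here; `association_stand_in` is a hypothetical placeholder (the mean product of absolute deviations, scaled by θ) used purely so the selection logic can run. All function names and sample values are assumptions.

```python
def association_stand_in(xs, ys, zs, theta=1.0):
    """Placeholder score, NOT the patent's formula: mean product of deviations."""
    m = len(xs)
    mean_x, mean_y, mean_z = sum(xs) / m, sum(ys) / m, sum(zs) / m
    total = sum(
        abs(x - mean_x) * abs(y - mean_y) * abs(z - mean_z)
        for x, y, z in zip(xs, ys, zs)
    )
    return theta * total / m


def associated_windows(flow, system, defense, m, threshold, theta=1.0):
    """Return (start, end) index pairs of windows whose score exceeds the threshold."""
    hits = []
    for start in range(0, len(flow) - m + 1, m):
        window = slice(start, start + m)
        score = association_stand_in(flow[window], system[window], defense[window], theta)
        if score > threshold:
            hits.append((start, start + m))
    return hits


flow = [1, 1, 1, 1, 0, 4, 0, 4]
system = [2, 2, 2, 2, 1, 3, 1, 3]
defense = [5, 5, 5, 5, 0, 2, 0, 2]
print(associated_windows(flow, system, defense, m=4, threshold=0.5))
```

In the sample data the first window is flat in all three features, so only the second window, where all three fluctuate together, is selected as an associated time domain segment.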
In the embodiment of the present invention, the clustering of each flow characteristic in the flow object characteristics to obtain a flow characteristic class set includes:
splitting the flow object features into a plurality of flow feature groups, and randomly selecting primary flow center features for each flow feature group;
calculating covariance matrix distances between each flow characteristic and each primary flow center characteristic in the flow object characteristics by using the following covariance distance formula:
Figure BDA0004190193750000144
where S refers to the covariance matrix distance, p refers to the flow feature, q refers to the primary flow center feature, T is a transpose symbol, cov () is a covariance symbol, cov (p, p) refers to the covariance of the flow feature, cov (p, q) refers to the covariance between the flow feature and the primary flow center feature, cov (q, p) refers to the covariance between the primary flow center feature and the flow feature, cov (q, q) refers to the covariance of the primary flow center feature;
Updating the flow characteristic groups into standard flow characteristic groups one by one according to the covariance matrix distance;
calculating standard flow center features of each standard flow feature group, and calculating center covariance matrix distances between the standard flow center features and the corresponding primary flow center features one by one;
and iteratively updating each standard flow characteristic group into a corresponding flow characteristic class according to all the center covariance matrix distances, and collecting all the flow characteristic classes into a flow characteristic class set.
In the embodiment of the invention, the covariance matrix distance between each flow characteristic and each primary flow center characteristic in the flow object characteristics is calculated by utilizing the covariance distance formula, so that the distance between the unbiased estimated quantity of each flow characteristic and the unbiased estimated quantity of each primary flow center characteristic can be used as the covariance matrix distance, thereby improving the clustering efficiency.
Specifically, updating the flow characteristic groups into standard flow characteristic groups one by one according to the covariance matrix distance refers to redistributing each flow characteristic into the flow characteristic groups corresponding to the primary flow center characteristics closest to the covariance matrix distance.
Specifically, calculating the standard flow center feature of each standard flow feature group refers to taking, as the standard flow center feature, the flow feature whose covariance matrix distance to every flow feature in the standard flow feature group is the same, and the center covariance matrix distance refers to the covariance matrix distance between the primary flow center feature and the corresponding standard flow center feature.
In detail, iteratively updating each standard flow characteristic group into a corresponding flow characteristic class according to all the center covariance matrix distances refers to calculating an average value of distance sums of all the center covariance matrix distances, when the average value is greater than a preset distance threshold, returning the standard flow center characteristic as a primary flow center characteristic to the step of calculating covariance matrix distances between each flow characteristic and each primary flow center characteristic in the flow object characteristic by using the following covariance distance formula, and when the average value is smaller than or equal to the distance threshold, taking the standard flow characteristic group at the moment as the flow characteristic class.
Specifically, updating the flow object feature into a standard flow feature sequence according to the flow feature class set refers to selecting flow features in the flow object feature one by one as target flow features, taking a flow feature class containing the target flow features in the flow feature class set as target flow feature class, and replacing the target flow features in the flow feature class set by using a standard flow center feature of the target flow feature class to obtain a standard flow feature sequence.
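The iterative clustering and the replacement of each feature by its class center can be sketched as follows. Two substitutions are made and should be read as assumptions: the covariance matrix distance (given only as a formula image) is replaced by the Euclidean distance `math.dist`, and the standard flow center feature is approximated by the group mean. Initial centers are passed in explicitly rather than chosen at random so the example is reproducible.

```python
import math


def distance(p, q):
    # Stand-in for the covariance matrix distance (assumption): Euclidean distance.
    return math.dist(p, q)


def mean_point(points):
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def cluster(features, centers, distance_threshold, max_iter=100):
    """Reassign features to the nearest center until the mean center shift is small."""
    groups = [[] for _ in centers]
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for f in features:
            nearest = min(range(len(centers)), key=lambda k: distance(f, centers[k]))
            groups[nearest].append(f)
        new_centers = [mean_point(g) if g else c for g, c in zip(groups, centers)]
        shift = sum(distance(a, b) for a, b in zip(new_centers, centers)) / len(centers)
        centers = new_centers
        if shift <= distance_threshold:
            break
    return groups, centers


def standardize(features, groups, centers):
    """Replace each feature by the center of the class that contains it."""
    lookup = {f: c for g, c in zip(groups, centers) for f in g}
    return [lookup[f] for f in features]


features = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
groups, centers = cluster(features, [(0.0, 0.0), (10.0, 10.0)], distance_threshold=0.01)
print(centers, standardize(features, groups, centers))
```

The stopping rule mirrors the text: iteration continues while the average center-to-center distance exceeds the preset distance threshold.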
In detail, the method for clustering each system feature in the system object features to obtain a system feature class set and updating the system object features into a standard system feature sequence according to the system feature class set, and the method for clustering each defending feature in the defending object features to obtain a defending feature class set and updating the defending object features into a standard defending feature sequence according to the defending feature class set, are both consistent with the method described above for clustering each flow feature in the flow object features and updating the flow object features into a standard flow feature sequence, and are not repeated here.
In detail, the taking the time domain segment corresponding to the related segment standard flow characteristic as the related time domain segment refers to taking the time domain segment where the preset time window corresponding to the related segment standard flow characteristic is located as the related time domain segment.
In the embodiment of the present invention, referring to fig. 3, the extracting the associated situation features from all the associated time domain segments includes:
S31, selecting the associated time domain segments one by one as target associated time domain segments, taking the standard flow feature segment corresponding to the target associated time domain segment as the target segment standard flow features, taking the standard system feature segment corresponding to the target associated time domain segment as the target segment standard system features, and taking the standard defense feature segment corresponding to the target associated time domain segment as the target segment standard defense features;
S32, normalizing the target segment standard flow features into a target flow code, normalizing the target segment standard system features into a target system code, and normalizing the target segment standard defense features into a target defense code;
S33, multiplying the target flow code by a preset flow threat coefficient to obtain a flow threat situation, multiplying the target system code by a preset system threat coefficient to obtain a system threat situation, and multiplying the target defense code by a preset defense threat coefficient to obtain a defense threat situation;
S34, taking the sum of the flow threat situation, the system threat situation and the defense threat situation as an associated threat situation, and taking the sum of all the associated threat situations as the associated situation feature.
In detail, a normalization function such as a Gaussian distribution or softmax may be used to normalize the target segment standard flow features into the target flow code, the target segment standard system features into the target system code, and the target segment standard defense features into the target defense code.
Specifically, the flow threat coefficient, the system threat coefficient and the defense threat coefficient are set manually according to experience and are parameters that represent how strongly each factor affects the network threat. Generating a network threat feature sequence according to all the associated situation features and the weblog information sequence refers to mapping the associated situation features to each piece of weblog information in the weblog information sequence, so as to obtain the network threat feature sequence.
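Steps S32 to S34 for a single associated time domain segment can be sketched with softmax as the normalization function. The coefficient values 0.5, 0.3 and 0.2, the function names and the element-wise reading of "sum" are assumptions made for illustration; the text leaves the coefficients to manual, experience-based setting.

```python
import math


def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def associated_threat_situation(flow_seg, system_seg, defense_seg,
                                k_flow, k_system, k_defense):
    """Normalize each segment, weight it by its coefficient, and add element-wise."""
    flow_code = softmax(flow_seg)
    system_code = softmax(system_seg)
    defense_code = softmax(defense_seg)
    return [
        k_flow * f + k_system * s + k_defense * d
        for f, s, d in zip(flow_code, system_code, defense_code)
    ]


# Hypothetical coefficients; equal-length segments are assumed.
situation = associated_threat_situation(
    [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], k_flow=0.5, k_system=0.3, k_defense=0.2
)
print(situation)
```

Because each softmax code sums to 1, the entries of the result always sum to the total of the three coefficients, which gives a quick sanity check on the weighting.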
In the embodiment of the invention, the correlation analysis is carried out on the flow object feature, the system object feature and the defending object feature to obtain the correlation situation feature, and the flow behavior, the system behavior and the defending behavior can be correlated, so that the accuracy of threat situation detection is improved, and the network threat feature sequence is generated according to all the correlation situation features and the network log information sequence, so that the threat severity degree in the network log in the past time period can be represented in a materialized manner, and the subsequent time domain feature and the self-attention feature can be conveniently extracted.
S4, respectively extracting long and short time sequence threat features and attention threat features corresponding to the weblog information sequence by using a preset time sequence threat model, generating an analysis threat feature sequence according to the long and short time sequence threat features and the attention threat features, and carrying out iterative updating on the time sequence threat model according to the network threat feature sequence and the analysis threat feature sequence to obtain a threat situation analysis model.
In the embodiment of the invention, the time sequence threat model may be a preset long-short time sequence neural network introduced with a self-attention neural network, the long-short time sequence threat characteristic refers to a time sequence characteristic corresponding to the weblog information sequence, and the attention threat characteristic refers to a self-attention characteristic corresponding to the weblog information sequence.
In the embodiment of the present invention, the extracting the long and short time sequence threat features and the attention threat features corresponding to the weblog information sequence by using a preset time sequence threat model includes:
extracting features of the weblog information sequence by using a preset time sequence threat model to obtain an object feature sequence;
performing recursive feature extraction on the object feature sequence by using the time sequence threat model to obtain short-term time sequence threat features;
performing jump feature extraction on the object feature sequence by using the time sequence threat model to obtain long-term time sequence threat features;
fusing the short-term timing threat features and the long-term timing threat features into long-short timing threat features;
and extracting attention threat features corresponding to the object feature sequence by using a self-attention mechanism of the time sequence threat model.
In detail, the object feature sequence is composed of a traffic object feature sequence, a system object feature sequence and a defending object feature sequence.
Specifically, the recursive feature extraction refers to calculating the short-term reset features and the short-term update features corresponding to the object feature sequence by using the gated recurrent layer of the time sequence threat model, and calculating the short-term time sequence threat features from the short-term reset features and the short-term update features. The jump feature extraction refers to recursive feature extraction performed at skipped intervals.
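The reset and update quantities described above match the gates of a gated recurrent unit (GRU); reading the model's gating layer as a GRU is an assumption. The sketch below uses scalar inputs and a scalar hidden state, with hypothetical weight names and values, and models jump extraction by taking the previous state from `skip` steps back instead of one step back.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x, h_prev, w):
    """One scalar GRU step: reset gate, update gate, candidate state, blend."""
    reset = sigmoid(w["wr"] * x + w["ur"] * h_prev + w["br"])
    update = sigmoid(w["wz"] * x + w["uz"] * h_prev + w["bz"])
    candidate = math.tanh(w["wh"] * x + w["uh"] * (reset * h_prev) + w["bh"])
    # One common convention; some references swap the roles of update and (1 - update).
    return (1.0 - update) * h_prev + update * candidate


def recurrent_features(xs, w, skip=1):
    """skip=1 gives ordinary recursion; skip=p uses the state from p steps back."""
    states = []
    for t, x in enumerate(xs):
        h_prev = states[t - skip] if t >= skip else 0.0
        states.append(gru_step(x, h_prev, w))
    return states


# Hypothetical weights, chosen only so the example runs.
weights = {"wr": 0.5, "ur": 0.1, "br": 0.0,
           "wz": 0.4, "uz": 0.2, "bz": 0.0,
           "wh": 0.9, "uh": 0.3, "bh": 0.0}
xs = [0.2, 0.7, 0.1, 0.9, 0.4]
short_term = recurrent_features(xs, weights, skip=1)
long_term = recurrent_features(xs, weights, skip=2)
print(short_term, long_term)
```

A full model would use vector states and learned weights; the scalar form is kept only to make the gate arithmetic visible.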
Specifically, the step of generating an analysis threat feature sequence according to the long and short time sequence threat features and the attention threat features refers to feature fusion of the long and short time sequence threat features and the attention threat features according to time sequence, so as to obtain analysis threat features corresponding to each time domain, and integrating all the analysis threat features into an analysis threat feature sequence.
In detail, the step of iteratively updating the time sequence threat model according to the network threat feature sequence and the analysis threat feature sequence to obtain a threat situation analysis model includes:
calculating a threat loss value between the network threat feature sequence and the analysis threat feature sequence by using the following threat loss value formula:
Figure BDA0004190193750000181
wherein loss refers to the threat loss value; n is the sequence length of the network threat feature sequence, which is equal to the sequence length of the analysis threat feature sequence; j refers to the sequence number; $X_j$ is the network threat feature with sequence number j in the network threat feature sequence; and
Figure BDA0004190193750000182
is the analysis threat feature with sequence number j in the analysis threat feature sequence;
Judging whether the threat loss value is smaller than a preset loss value threshold value or not;
if not, updating the model parameters of the time sequence threat model according to the threat loss value, and returning to the step of extracting the long and short time sequence threat characteristics and the attention threat characteristics corresponding to the weblog information sequence by using the preset time sequence threat model;
if yes, the updated time sequence threat model is used as a threat situation analysis model.
In the embodiment of the invention, the threat loss value between the network threat characteristic sequence and the analysis threat characteristic sequence is calculated by utilizing the threat loss value formula, and the threat loss value can be determined according to the difference distance between each sequence element between the network threat characteristic sequence and the analysis threat characteristic sequence, so that the characterization of the threat loss value is improved.
In detail, model parameters of the time-series threat model may be updated according to the threat loss value using a gradient descent algorithm.
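The loop of computing a loss, comparing it with a threshold and updating parameters by gradient descent can be sketched on a toy model. Everything specific here is an assumption: the mean squared error stands in for the threat loss value formula (given only as an image), a two-parameter linear model stands in for the time sequence threat model, and the learning rate, threshold and data are illustrative.

```python
def threat_loss(targets, predictions):
    # Stand-in loss (assumption): mean squared error between the two sequences.
    n = len(targets)
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / n


def train(inputs, targets, loss_threshold, lr=0.05, max_steps=10000):
    """Gradient descent on a toy model y = a * x + b until the loss is below threshold."""
    a, b = 0.0, 0.0
    n = len(inputs)
    loss = float("inf")
    for _ in range(max_steps):
        predictions = [a * x + b for x in inputs]
        loss = threat_loss(targets, predictions)
        if loss < loss_threshold:
            break  # the current parameters are taken as the trained model
        grad_a = sum(2 * (p - t) * x for p, t, x in zip(predictions, targets, inputs)) / n
        grad_b = sum(2 * (p - t) for p, t in zip(predictions, targets)) / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b, loss


a, b, final_loss = train([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0], loss_threshold=1e-4)
print(a, b, final_loss)
```

The `max_steps` cap is an added safeguard so the loop terminates even if the threshold is never reached.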
In the embodiment of the invention, the analysis threat feature sequence is generated by utilizing the preset time sequence threat model, the time sequence threat model is iteratively updated according to the network threat feature sequence and the analysis threat feature sequence to obtain the threat situation analysis model, and the time sequence relation between the network log information and the threat situation can be analyzed by utilizing the trained threat situation analysis model, so that the analysis accuracy of the threat situation is improved.
S5, acquiring real-time operation and maintenance weblog information, generating real-time analysis threat features corresponding to the real-time operation and maintenance weblog information by using the threat situation analysis model, and carrying out operation and maintenance network security upgrading according to the real-time analysis threat features.
In the embodiment of the invention, the real-time operation and maintenance network log information refers to a network real-time flow log, a real-time system program log and a real-time active defense log corresponding to an operation and maintenance network.
Specifically, the method for generating the real-time analysis threat feature corresponding to the real-time operation and maintenance weblog information by using the threat situation analysis model is consistent with the method for generating the analysis threat feature sequence by using the preset time sequence threat model to extract the long and short time sequence threat features and the attention threat features corresponding to the weblog information sequence in the step S4, and the method is not repeated here.
In detail, the operation and maintenance network security upgrading according to the real-time analysis threat features refers to determining a corresponding network security upgrading means according to the numerical values of the threat features in the real-time analysis threat features.
In the embodiment of the invention, the threat situation of a future time period can be analyzed by acquiring the real-time operation and maintenance weblog information, generating the real-time analysis threat characteristic corresponding to the real-time operation and maintenance weblog information by utilizing the threat situation analysis model, and carrying out operation and maintenance network security upgrading according to the real-time analysis threat characteristic, thereby improving network security.
According to the embodiment of the invention, the business operation and maintenance weblog information is acquired and cleaned to obtain the standard weblog information, which improves the accuracy of the data set and therefore of the subsequent threat situation analysis. Splitting the standard weblog information into the weblog information sequence according to the time domain facilitates the subsequent extraction of time sequence features. Extracting the flow object features from the network flow log, the system object features from the system program log and the defending object features from the active defending log captures the flow change features, the system operation features and the defending object features of the same time domain separately, which facilitates their subsequent correlation and improves the accuracy of threat situation prediction. Performing correlation analysis on the flow object features, the system object features and the defending object features to obtain the associated situation features links the flow behavior, the system behavior and the defending behavior, which improves the accuracy of threat situation detection. Generating the network threat feature sequence according to all the associated situation features and the weblog information sequence makes it convenient to extract the time sequence of the threat situation features from the weblogs.
The analysis threat feature sequence is generated by using the preset time sequence threat model, and the time sequence threat model is iteratively updated according to the network threat feature sequence and the analysis threat feature sequence to obtain the threat situation analysis model. The trained threat situation analysis model can analyze the time sequence relationship between the weblog information and the threat situation, which improves the analysis accuracy of the threat situation. Generating the real-time analysis threat features corresponding to the real-time operation and maintenance weblog information by using the threat situation analysis model, and carrying out operation and maintenance network security upgrading according to the real-time analysis threat features, allows the threat situation of a future time period to be analyzed so as to improve network security. Therefore, the threat situation analysis method applied to business operation and maintenance can solve the problem of lower accuracy in threat situation analysis.
FIG. 4 is a functional block diagram of a threat situation analysis system for use in business operations and maintenance according to an embodiment of the invention.
The threat situation analysis system 100 of the present invention for use in business operations may be installed in an electronic device. Depending on the functionality implemented, the threat situation analysis system 100 applied to the business operations may include a data cleansing module 101, a feature extraction module 102, an association analysis module 103, a model training module 104, and a threat analysis module 105. The module of the invention, which may also be referred to as a unit, refers to a series of computer program segments, which are stored in the memory of the electronic device, capable of being executed by the processor of the electronic device and of performing a fixed function.
In the present embodiment, the functions concerning the respective modules/units are as follows:
the data cleaning module 101 is configured to obtain business operation and maintenance weblog information, perform data cleaning on the business operation and maintenance weblog information to obtain standard weblog information, and split the standard weblog information into a weblog information sequence according to a time domain;
the feature extraction module 102 is configured to select, one by one, the weblog information in the weblog information sequence as target domain log information according to a time sequence, extract a network traffic log, a system program log and an active defense log from the target domain log information, extract traffic object features from the network traffic log, extract system object features from the system program log, and extract defense object features from the active defense log;
the association analysis module 103 is configured to perform association analysis on the traffic object feature, the system object feature and the defending object feature to obtain an association situation feature, and generate a network threat feature sequence according to all association situation features and the weblog information sequence;
the model training module 104 is configured to extract long-short time sequence threat features and attention threat features corresponding to the weblog information sequence respectively by using a preset time sequence threat model, generate an analysis threat feature sequence according to the long-short time sequence threat features and the attention threat features, and iteratively update the time sequence threat model according to the network threat feature sequence and the analysis threat feature sequence to obtain a threat situation analysis model, where the iteratively updating the time sequence threat model according to the network threat feature sequence and the analysis threat feature sequence to obtain a threat situation analysis model includes: calculating a threat loss value between the network threat feature sequence and the analysis threat feature sequence using the following threat loss value formula:
[Formula image not reproduced in the source text: threat loss value formula]
wherein loss refers to the threat loss value, n is the sequence length of the network threat feature sequence, which is equal to the sequence length of the analysis threat feature sequence, j refers to the sequence number, X_j is the network threat feature with sequence number j in the network threat feature sequence, and the remaining symbol of the formula (shown only as an image in the source) is the analysis threat feature with sequence number j in the analysis threat feature sequence; judging whether the threat loss value is smaller than a preset loss value threshold; if not, updating the model parameters of the time sequence threat model according to the threat loss value, and returning to the step of extracting the long-short time sequence threat features and the attention threat features corresponding to the weblog information sequence by using the preset time sequence threat model; if yes, taking the updated time sequence threat model as the threat situation analysis model;
the threat analysis module 105 is configured to obtain real-time operation and maintenance weblog information, generate real-time analysis threat features corresponding to the real-time operation and maintenance weblog information by using the threat situation analysis model, and perform operation and maintenance network security upgrading according to the real-time analysis threat features.
In detail, the modules in the threat situation analysis system 100 for business operation and maintenance according to the embodiment of the present invention use the same technical means as the threat situation analysis method for business operation and maintenance described in fig. 1 to 3, and can produce the same technical effects, which are not described herein.
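The iterative update performed by the model training module 104 (compute a loss, compare it with a preset threshold, update the parameters and repeat) could be sketched as follows. The threat loss value formula survives only as an image in this text, so the mean squared error used here is an assumed stand-in, and the two-parameter linear "model", the learning rate and the iteration cap are hypothetical simplifications rather than the time sequence threat model itself.

```python
from typing import Sequence, Tuple


def threat_loss(targets: Sequence[float], predicted: Sequence[float]) -> float:
    """Assumed stand-in loss: mean squared error over the n sequence positions."""
    n = len(targets)
    return sum((x - x_hat) ** 2 for x, x_hat in zip(targets, predicted)) / n


def train_until_threshold(
    inputs: Sequence[float],
    targets: Sequence[float],
    loss_threshold: float = 1e-4,
    learning_rate: float = 0.1,
    max_iters: int = 10_000,
) -> Tuple[float, float, float]:
    """Repeat predict -> loss -> threshold check -> parameter update."""
    weight, bias = 0.0, 0.0  # parameters of a toy linear model (hypothetical)
    n = len(inputs)
    loss = float("inf")
    for _ in range(max_iters):
        predicted = [weight * v + bias for v in inputs]
        loss = threat_loss(targets, predicted)
        if loss < loss_threshold:
            break  # loss below the preset threshold: keep the current model
        # Otherwise update the parameters by gradient descent on the loss.
        errors = [p - t for p, t in zip(predicted, targets)]
        weight -= learning_rate * 2.0 * sum(e * v for e, v in zip(errors, inputs)) / n
        bias -= learning_rate * 2.0 * sum(errors) / n
    return weight, bias, loss
```

The iteration cap is an addition for safety; the description itself only states the threshold test as the stopping condition.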
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus, system and method may be implemented in other manners. For example, the system embodiments described above are merely illustrative, e.g., the division of the modules is merely a logical function division, and other manners of division may be implemented in practice.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units can be realized in the form of hardware, or in the form of hardware plus software functional modules.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
The embodiment of the application can acquire and process the related data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technique and application system that uses a digital computer or a digital computer-controlled machine to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use knowledge to obtain optimal results.
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. Multiple units or systems set forth in the system embodiments may also be implemented by one unit or system in software or hardware. The terms first, second, etc. are used to denote a name, but not any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims (10)

1. A threat situation analysis method applied to business operation and maintenance, the method comprising:
S1: acquiring business operation and maintenance weblog information, performing data cleaning on the business operation and maintenance weblog information to obtain standard weblog information, and splitting the standard weblog information into weblog information sequences according to a time domain;
S2: selecting weblog information in the weblog information sequence one by one according to a time sequence as target domain log information, extracting a network traffic log, a system program log and an active defense log from the target domain log information, extracting flow object features from the network traffic log, extracting system object features from the system program log and extracting defense object features from the active defense log;
S3: carrying out relevance analysis on the flow object features, the system object features and the defending object features to obtain associated situation features, and generating a network threat feature sequence according to all the associated situation features and the network log information sequence;
S4: respectively extracting long and short time sequence threat features and attention threat features corresponding to the weblog information sequence by using a preset time sequence threat model, generating an analysis threat feature sequence according to the long and short time sequence threat features and the attention threat features, and iteratively updating the time sequence threat model according to the network threat feature sequence and the analysis threat feature sequence to obtain a threat situation analysis model, wherein the iterative updating of the time sequence threat model according to the network threat feature sequence and the analysis threat feature sequence to obtain the threat situation analysis model comprises the following steps:
S41: calculating a threat loss value between the network threat feature sequence and the analysis threat feature sequence using the following threat loss value formula:
[Formula image not reproduced in the source text: threat loss value formula]
wherein loss refers to the threat loss value, n is the sequence length of the network threat feature sequence, which is equal to the sequence length of the analysis threat feature sequence, j refers to the sequence number, X_j is the network threat feature with sequence number j in the network threat feature sequence, and the remaining symbol of the formula (shown only as an image in the source) is the analysis threat feature with sequence number j in the analysis threat feature sequence;
S42: judging whether the threat loss value is smaller than a preset loss value threshold;
S43: if not, updating the model parameters of the time sequence threat model according to the threat loss value, and returning to the step of extracting the long and short time sequence threat features and the attention threat features corresponding to the weblog information sequence by using the preset time sequence threat model;
S44: if yes, taking the updated time sequence threat model as the threat situation analysis model;
S5: acquiring real-time operation and maintenance weblog information, generating real-time analysis threat features corresponding to the real-time operation and maintenance weblog information by using the threat situation analysis model, and carrying out operation and maintenance network security upgrading according to the real-time analysis threat features.
2. The threat situation analysis method for business operations of claim 1, wherein the performing data cleansing on the business operations web log information to obtain standard web log information comprises:
splitting the business operation and maintenance weblog information into a plurality of single-class operation and maintenance log data sets according to the data types;
Selecting the single-class operation and maintenance log data set one by one as a target-class operation and maintenance log data set, taking the data type of the target-class operation and maintenance log data set as a target data type, and taking the value range of the target data type as a target value range;
screening out repeated log data from the target class operation and maintenance log data set to obtain a target primary log data set;
screening out garbled log data from the target primary log data set according to the target data type to obtain a target secondary log data set;
and screening out out-of-range log data from the target secondary log data set according to the target value range to obtain a target standard single-class log data set, and collecting all the target standard single-class log data sets into standard weblog information.
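The three screening passes of this claim (repeated data, garbled data, out-of-range data) could be sketched for one single-class data set as follows. The record representation (one raw value per record), the `parse` callback used to detect garbled values and the function name are assumptions made for illustration.

```python
from typing import Any, Callable, Hashable, List, Sequence, Tuple


def clean_single_class(
    records: Sequence[Hashable],
    parse: Callable[[Any], float],
    value_range: Tuple[float, float],
) -> List[float]:
    """Clean one single-class log data set in three passes."""
    low, high = value_range

    # Pass 1: remove repeated records, keeping the first occurrence in order.
    primary = list(dict.fromkeys(records))

    # Pass 2: remove garbled records, i.e. values that do not parse as the
    # target data type (here: anything `parse` rejects).
    secondary: List[float] = []
    for raw in primary:
        try:
            secondary.append(parse(raw))
        except (TypeError, ValueError):
            continue

    # Pass 3: remove values outside the target value range.
    return [value for value in secondary if low <= value <= high]
```

Running this once per data type and collecting the results corresponds to gathering all target standard single-class log data sets into the standard weblog information.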
3. The threat situation analysis method for use in a business operation of claim 1, wherein said extracting traffic object features from said web traffic log comprises:
dividing the network flow log into a plurality of data flow logs, selecting the data flow logs one by one as target data flow logs, taking the flow size of the target data flow logs as target flow size, and extracting paired data proportions from the target data flow logs;
Calculating the flow rate acceleration of the target data flow log according to the target flow size and the paired data proportion;
extracting the number of flow packets, the number of flow bits and the flow life cycle from the target data flow log;
performing data tracing on the target data stream log to obtain a communication address set, analyzing a geographic address set corresponding to the communication address set by using a preset geographic position library, and calculating a geographic flow ratio corresponding to the geographic address set;
address verification is carried out on the addresses in the communication address set to obtain the number of fake addresses, and the speed increase of the fake addresses is calculated according to the number of fake addresses;
and collecting the geographic flow proportion, the flow speed increasing rate, the flow packet number, the flow bit number, the flow life cycle and the fake address speed increasing rate into target flow characteristics, and collecting all the target flow characteristics into flow object characteristics.
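Two of the quantities in this claim could be computed as sketched below. The claim does not give formulas, so both are assumptions: the geographic flow ratio is taken as the share of communication addresses per region, and the fake address speed increase is taken as the change in the fake address count per second. The geographic position library is modelled as a plain dictionary, and the addresses are documentation-range examples.

```python
from collections import Counter
from typing import Dict, Sequence


def geographic_traffic_ratio(
    addresses: Sequence[str], geo_library: Dict[str, str]
) -> Dict[str, float]:
    """Share of communication addresses per region (assumed definition)."""
    if not addresses:
        return {}
    regions = [geo_library.get(address, "unknown") for address in addresses]
    counts = Counter(regions)
    return {region: count / len(regions) for region, count in counts.items()}


def growth_rate(previous_count: int, current_count: int, interval_seconds: float) -> float:
    """Change in a count per second between two observations (assumed definition)."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (current_count - previous_count) / interval_seconds


# Hypothetical lookup table standing in for the preset geographic position library.
GEO_LIBRARY = {"192.0.2.1": "region-A", "198.51.100.7": "region-B"}
```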
4. The threat situation analysis method for a business operation of claim 1, wherein extracting system object features from the system program log comprises:
selecting process events in the system program log one by one according to a time sequence as target process events, taking event names of the target process events as target event names, and extracting start time and end time corresponding to the target event names;
Calculating behavior duration according to the starting time and the ending time, extracting a starting time domain feature of the starting time, and extracting an ending time domain feature of the ending time;
and collecting the target event name, the starting time domain feature, the ending time domain feature and the behavior duration into a target object group, vectorizing the target object group into target system features, and collecting all the target system features into system object features.
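The construction of one target system feature could be sketched as follows. The claim does not say how the time domain features or the vectorization are defined, so this sketch assumes the hour of day as the time domain feature and an integer index from a growing vocabulary as the encoding of the event name; both choices, and the function name, are hypothetical.

```python
from datetime import datetime
from typing import Dict, List


def system_feature(
    event_name: str, start: datetime, end: datetime, vocabulary: Dict[str, int]
) -> List[float]:
    """Vectorize (event name, start feature, end feature, behavior duration)."""
    if end < start:
        raise ValueError("end time precedes start time")
    duration = (end - start).total_seconds()  # behavior duration in seconds
    # Assign each new event name the next free integer index.
    name_id = vocabulary.setdefault(event_name, len(vocabulary))
    start_hour = start.hour + start.minute / 60.0  # assumed time domain feature
    end_hour = end.hour + end.minute / 60.0
    return [float(name_id), start_hour, end_hour, duration]
```

Collecting the vectors returned for every process event in time order gives the system object features in the sense of this claim.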
5. The threat situation analysis method for use in a business operation and maintenance of claim 1, wherein extracting defensive object features from the active defensive log comprises:
selecting the defending processes in the active defending log as target defending processes one by one according to the time sequence, and taking the process names of the target defending processes as target process names;
respectively extracting a defending process duration, an invading virus name, an attack data type and an attack frequency from the target defending process;
generating target defending features according to the target process name, the defending process duration, the invasive virus name, the attack data type and the attack frequency, and collecting all the target defending features into defending object features.
6. The threat situation analysis method for use in a business operation and maintenance of claim 1, wherein said performing a correlation analysis on said traffic object features, said system object features, and said defending object features to obtain correlated situation features comprises:
clustering each flow characteristic in the flow object characteristics to obtain a flow characteristic class set, and updating the flow object characteristics into a standard flow characteristic sequence according to the flow characteristic class set;
clustering all the system features in the system object features to obtain a system feature class set, and updating the system object features into a standard system feature sequence according to the system feature class set;
clustering each defending feature in the defending object features to obtain a defending feature class set, and updating the defending object features to obtain a standard defending feature sequence according to the defending feature class set;
splitting the standard flow characteristic sequence into a plurality of standard flow characteristic sections by utilizing a preset time window, selecting the standard flow characteristic sections one by one as target section standard flow characteristics according to a time sequence, screening standard system characteristics corresponding to the target section standard flow characteristics from the standard system characteristic sequence as target section standard system characteristics, and screening standard defense characteristics corresponding to the target section standard flow characteristics from the standard defense characteristic sequence as target section standard defense characteristics;
Calculating the situation association degree among the standard flow characteristics of the target segment, the standard system characteristics of the target segment and the standard defense characteristics of the target segment by using the following situation association degree algorithm:
[Formula image not reproduced in the source text: situation association degree formula]
wherein C is the situation association degree, m is the window length of the preset time window, θ is the preset association degree countermeasure coefficient, i refers to the i-th moment in the preset time window, x_i is the value of the flow characteristic in the standard flow characteristics of the target segment at the i-th moment in the preset time window, y_i is the value of the system characteristic in the standard system characteristics of the target segment at the i-th moment in the preset time window, z_i is the value of the defending characteristic in the standard defending characteristics of the target segment at the i-th moment in the preset time window, and the three remaining symbols (each shown only as an image in the source) are, respectively, the mean value of the flow characteristics in the standard flow characteristics of the target segment over the preset time window, the mean value of the system characteristics in the standard system characteristics of the target segment over the preset time window, and the mean value of the defending characteristics in the standard defending characteristics of the target segment over the preset time window;
and taking the standard flow characteristics of the target segment when the situation association degree is larger than a preset association threshold as the standard flow characteristics of the associated segment, taking the time domain segment corresponding to the standard flow characteristics of the associated segment as the associated time domain segment, and extracting the association situation characteristics from all the associated time domain segments.
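The window splitting and threshold test of this claim could be sketched as follows. The situation association degree formula is available only as an image in this text, so `standin_association` is an assumed substitute (the mean of the three pairwise Pearson correlations, with `theta` used merely as a small stabilizing constant); it should not be read as the claimed formula. Scalar features per moment and non-overlapping windows are further simplifying assumptions.

```python
import math
from typing import Callable, List, Sequence


def pearson(a: Sequence[float], b: Sequence[float], theta: float = 1e-9) -> float:
    """Pearson correlation with a small constant added to the denominator."""
    m = len(a)
    mean_a, mean_b = sum(a) / m, sum(b) / m
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (math.sqrt(var_a * var_b) + theta)


def standin_association(xs, ys, zs, theta: float = 1e-9) -> float:
    """Assumed stand-in for C: mean of the three pairwise correlations."""
    return (pearson(xs, ys, theta) + pearson(xs, zs, theta) + pearson(ys, zs, theta)) / 3.0


def associated_windows(
    flow: Sequence[float],
    system: Sequence[float],
    defense: Sequence[float],
    m: int,
    threshold: float,
    association: Callable[..., float] = standin_association,
) -> List[int]:
    """Return the indices of the length-m windows whose association exceeds the threshold."""
    selected: List[int] = []
    for start in range(0, len(flow) - m + 1, m):
        window = slice(start, start + m)
        if association(flow[window], system[window], defense[window]) > threshold:
            selected.append(start // m)
    return selected
```

Because the association function is passed in as a parameter, the claimed formula can replace the stand-in without changing the windowing logic.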
7. The threat situation analysis method for business operation and maintenance according to claim 6, wherein clustering each flow feature in the flow object features to obtain a flow feature class set comprises:
splitting the flow object features into a plurality of flow feature groups, and randomly selecting primary flow center features for each flow feature group;
calculating covariance matrix distances between each flow characteristic and each primary flow center characteristic in the flow object characteristics by using the following covariance distance formula:
[Formula image not reproduced in the source text: covariance matrix distance formula]
where S refers to the covariance matrix distance, p refers to the flow feature, q refers to the primary flow center feature, T is a transpose symbol, cov () is a covariance symbol, cov (p, p) refers to the covariance of the flow feature, cov (p, q) refers to the covariance between the flow feature and the primary flow center feature, cov (q, p) refers to the covariance between the primary flow center feature and the flow feature, cov (q, q) refers to the covariance of the primary flow center feature;
updating the flow characteristic groups into standard flow characteristic groups one by one according to the covariance matrix distance;
Calculating standard flow center features of each standard flow feature group, and calculating center covariance matrix distances between the standard flow center features and the corresponding primary flow center features one by one;
and iteratively updating each standard flow characteristic group into a corresponding flow characteristic class according to all the center covariance matrix distances, and collecting all the flow characteristic classes into a flow characteristic class set.
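The iterative grouping described in this claim has the shape of a centroid-based clustering loop, which could be sketched as follows. The covariance matrix distance formula is not recoverable from this text, so plain Euclidean distance is used as an explicitly assumed placeholder, passed in as a parameter so that another distance can be substituted. The tolerance, iteration cap and random seed are hypothetical.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]


def euclidean(p: Point, q: Point) -> float:
    """Placeholder distance; the claimed covariance matrix distance would go here."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def cluster_features(
    features: List[Point],
    k: int,
    distance: Callable[[Point, Point], float] = euclidean,
    tol: float = 1e-6,
    max_iters: int = 100,
    seed: int = 0,
) -> Tuple[List[List[Point]], List[Point]]:
    """Assign features to the nearest center, recompute centers, repeat until stable."""
    rng = random.Random(seed)
    centers: List[Point] = rng.sample(features, k)  # randomly chosen primary centers
    groups: List[List[Point]] = [[] for _ in range(k)]
    for _ in range(max_iters):
        groups = [[] for _ in range(k)]
        for feature in features:
            nearest = min(range(k), key=lambda c: distance(feature, centers[c]))
            groups[nearest].append(feature)
        # New center = per-dimension mean of the group; empty groups keep their center.
        new_centers: List[Point] = [
            [sum(col) / len(group) for col in zip(*group)] if group else list(centers[c])
            for c, group in enumerate(groups)
        ]
        shift = max(distance(a, b) for a, b in zip(new_centers, centers))
        centers = new_centers
        if shift < tol:  # centers no longer move: the groups are the final classes
            break
    return groups, centers
```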
8. The threat situation analysis method for use in a business operation and maintenance of claim 1, wherein extracting associated situation features from all associated time-domain segments comprises:
selecting the associated time domain segments one by one as target associated time domain segments, taking a standard flow characteristic segment corresponding to the target associated time domain segments as target segment standard flow characteristics, taking a standard system characteristic segment corresponding to the target associated time domain segments as target segment standard system characteristics, and taking a standard defense characteristic segment corresponding to the target associated time domain segments as target segment standard defense characteristics;
normalizing the standard flow characteristics of the target segment into a target flow code, normalizing the system characteristics of the target segment into a target system code, and normalizing the defending characteristics of the target segment into a target defending code;
Multiplying the target flow code by a preset flow threat coefficient to obtain a flow threat situation, multiplying the target system code by a preset system threat coefficient to obtain a system threat situation, and multiplying the target defense code by a preset defense threat coefficient to obtain a defense threat situation;
and taking the sum of the traffic threat situation, the system threat situation and the defending threat situation as an associated threat situation, and taking the sum of all the associated threat situations as associated situation characteristics.
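The normalization and weighted sum of this claim could be sketched as follows. The claim does not define the normalization or the coefficient values, so the clipped min-max scaling averaged to a single code, the bounds passed to it and the coefficients (0.5, 0.3, 0.2) are all assumptions for illustration.

```python
from typing import Sequence, Tuple


def normalize_to_code(values: Sequence[float], lower: float, upper: float) -> float:
    """Assumed normalization: clipped min-max scaling, averaged into one code in [0, 1]."""
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    scaled = [min(max((v - lower) / (upper - lower), 0.0), 1.0) for v in values]
    return sum(scaled) / len(scaled)


def associated_threat(
    flow_code: float,
    system_code: float,
    defense_code: float,
    coefficients: Tuple[float, float, float] = (0.5, 0.3, 0.2),  # hypothetical values
) -> float:
    """Sum of the three codes, each multiplied by its preset threat coefficient."""
    flow_coeff, system_coeff, defense_coeff = coefficients
    return flow_coeff * flow_code + system_coeff * system_code + defense_coeff * defense_code
```

For example, a flow code of 0.5, a system code of 1.0 and a defense code of 0.0 give 0.5 × 0.5 + 0.3 × 1.0 + 0.2 × 0.0 = 0.55 under these assumed coefficients.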
9. The threat situation analysis method for business operation and maintenance according to claim 1, wherein the extracting the long and short time sequence threat features and the attention threat features corresponding to the weblog information sequence by using a preset time sequence threat model respectively comprises:
extracting features of the weblog information sequence by using a preset time sequence threat model to obtain an object feature sequence;
performing recursive feature extraction on the object feature sequence by using the time sequence threat model to obtain short-term time sequence threat features;
performing jump feature extraction on the object feature sequence by using the time sequence threat model to obtain long-term time sequence threat features;
Fusing the short-term timing threat features and the long-term timing threat features into long-short timing threat features;
and extracting attention threat features corresponding to the object feature sequence by using a self-attention mechanism of the time sequence threat model.
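The three extraction paths of this claim (recursive, jump and self-attention) could be sketched with untrained toy components as follows. This is an assumption-heavy illustration: the recurrence `h_t = tanh(w * x_t + u * h_{t-skip})`, the scalar weights `w` and `u`, the skip length of 3, fusion by concatenation and the projection-free dot-product attention are all hypothetical choices, not the structure of the time sequence threat model.

```python
import math
from typing import List, Tuple

Vector = List[float]


def recurrent_features(seq: List[Vector], skip: int = 1, w: float = 0.5, u: float = 0.5) -> List[Vector]:
    """h_t = tanh(w * x_t + u * h_{t-skip}); skip=1 gives an ordinary recurrence."""
    hidden: List[Vector] = []
    for t, x in enumerate(seq):
        previous = hidden[t - skip] if t >= skip else [0.0] * len(x)
        hidden.append([math.tanh(w * xi + u * hi) for xi, hi in zip(x, previous)])
    return hidden


def self_attention(seq: List[Vector]) -> List[Vector]:
    """Scaled dot-product self-attention without learned projections."""
    dim = len(seq[0])
    output: List[Vector] = []
    for query in seq:
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(dim) for key in seq]
        peak = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        output.append([sum(wt * v[i] for wt, v in zip(weights, seq)) for i in range(dim)])
    return output


def extract_threat_features(seq: List[Vector], skip: int = 3) -> Tuple[List[Vector], List[Vector]]:
    """Return (long-short time sequence features, attention features)."""
    short_term = recurrent_features(seq, skip=1)    # recursive extraction
    long_term = recurrent_features(seq, skip=skip)  # jump (skip) extraction
    fused = [s + l for s, l in zip(short_term, long_term)]  # fuse by concatenation
    return fused, self_attention(seq)
```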
10. A threat situation analysis system for use in a business operation, the system comprising:
the data cleaning module is used for acquiring business operation and maintenance weblog information, cleaning the business operation and maintenance weblog information to obtain standard weblog information, and splitting the standard weblog information into weblog information sequences according to a time domain;
the feature extraction module is used for selecting the weblog information in the weblog information sequence one by one as target domain log information according to a time sequence, extracting a webflow log, a system program log and an active defense log from the target domain log information, extracting flow object features from the webflow log, extracting system object features from the system program log and extracting defense object features from the active defense log;
the association analysis module is used for carrying out association analysis on the flow object features, the system object features and the defending object features to obtain association situation features, and generating a network threat feature sequence according to all the association situation features and the network log information sequence;
the model training module is configured to extract long-short time sequence threat features and attention threat features corresponding to the weblog information sequence respectively by using a preset time sequence threat model, generate an analysis threat feature sequence according to the long-short time sequence threat features and the attention threat features, and iteratively update the time sequence threat model according to the network threat feature sequence and the analysis threat feature sequence to obtain a threat situation analysis model, where the iteratively updating the time sequence threat model according to the network threat feature sequence and the analysis threat feature sequence to obtain the threat situation analysis model includes: calculating a threat loss value between the network threat feature sequence and the analysis threat feature sequence using the following threat loss value formula:
[Formula image not reproduced in the source text: threat loss value formula]
wherein loss refers to the threat loss value, n is the sequence length of the network threat feature sequence, which is equal to the sequence length of the analysis threat feature sequence, j refers to the sequence number, X_j is the network threat feature with sequence number j in the network threat feature sequence, and the remaining symbol of the formula (shown only as an image in the source) is the analysis threat feature with sequence number j in the analysis threat feature sequence; judging whether the threat loss value is smaller than a preset loss value threshold; if not, updating the model parameters of the time sequence threat model according to the threat loss value, and returning to the step of extracting the long-short time sequence threat features and the attention threat features corresponding to the weblog information sequence by using the preset time sequence threat model; if yes, taking the updated time sequence threat model as the threat situation analysis model;
the threat analysis module is used for acquiring real-time operation and maintenance weblog information, generating real-time analysis threat features corresponding to the real-time operation and maintenance weblog information by utilizing the threat situation analysis model, and carrying out operation and maintenance network security upgrading according to the real-time analysis threat features.
CN202310430598.7A 2023-04-20 2023-04-20 Threat situation analysis method and system applied to business operation and maintenance Withdrawn CN116319065A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310430598.7A CN116319065A (en) 2023-04-20 2023-04-20 Threat situation analysis method and system applied to business operation and maintenance

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310430598.7A CN116319065A (en) 2023-04-20 2023-04-20 Threat situation analysis method and system applied to business operation and maintenance

Publications (1)

Publication Number Publication Date
CN116319065A true CN116319065A (en) 2023-06-23

Family

ID=86783545

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310430598.7A Withdrawn CN116319065A (en) 2023-04-20 2023-04-20 Threat situation analysis method and system applied to business operation and maintenance

Country Status (1)

Country Link
CN (1) CN116319065A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117240594A (en) * 2023-10-31 2023-12-15 深圳市常行科技有限公司 Multi-dimensional network security operation and maintenance protection management system and method
CN117240594B (en) * 2023-10-31 2024-06-18 深圳市常行科技有限公司 Multi-dimensional network security operation and maintenance protection management system and method

Similar Documents

Publication Publication Date Title
CN111428231B (en) Safety processing method, device and equipment based on user behaviors
Peng et al. Modeling and predicting extreme cyber attack rates via marked point processes
Hu et al. A simple and efficient hidden Markov model scheme for host-based anomaly intrusion detection
CN103559235B (en) A kind of online social networks malicious web pages detection recognition methods
US20210021616A1 (en) Method and system for classifying data objects based on their network footprint
CN110602137A (en) Malicious IP and malicious URL intercepting method, device, equipment and medium
CN108491714A (en) The man-machine recognition methods of identifying code
CN106022349B (en) Method and system for device type determination
Krishnaveni et al. Ensemble approach for network threat detection and classification on cloud computing
CN110708339B (en) Correlation analysis method based on WEB log
CN112733045B (en) User behavior analysis method and device and electronic equipment
Quinkert et al. Raptor: Ransomware attack predictor
CN116319065A (en) Threat situation analysis method and system applied to business operation and maintenance
Stevanovic et al. Next generation application-layer DDoS defences: applying the concepts of outlier detection in data streams with concept drift
Elekar Combination of data mining techniques for intrusion detection system
CN116248362A (en) User abnormal network access behavior identification method based on double-layer hidden Markov chain
Harbola et al. Improved intrusion detection in DDoS applying feature selection using rank & score of attributes in KDD-99 data set
CN109948339A (en) A kind of malicious script detection method based on machine learning
CN112667875A (en) Data acquisition method, data analysis method, data acquisition device, data analysis device, equipment and storage medium
EP4024252A1 (en) A system and method for identifying exploited cves using honeypots
CN115225359A (en) Honeypot data tracing method and device, computer equipment and storage medium
CN112073362B (en) APT (advanced persistent threat) organization flow identification method based on flow characteristics
CN109995605A (en) A kind of method for recognizing flux and device and computer readable storage medium
Zolotukhin et al. Detection of anomalous http requests based on advanced n-gram model and clustering techniques
Jia et al. MAGIC: Detecting Advanced Persistent Threats via Masked Graph Representation Learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication (application publication date: 20230623)