CN115544519A - Method for carrying out security association analysis on threat information of metering automation system - Google Patents

Info

Publication number
CN115544519A
Authority
CN
China
Prior art keywords
frequent
item
transaction
threat
items
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211284766.8A
Other languages
Chinese (zh)
Inventor
孙文龙
何智帆
刘涛
姜和芳
马越
梁洪浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Power Supply Bureau Co Ltd
Original Assignee
Shenzhen Power Supply Bureau Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Power Supply Bureau Co Ltd filed Critical Shenzhen Power Supply Bureau Co Ltd
Priority to CN202211284766.8A priority Critical patent/CN115544519A/en
Publication of CN115544519A publication Critical patent/CN115544519A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577 Assessing vulnerabilities and evaluating computer system security
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24578 Query processing with adaptation to user needs using ranking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Security & Cryptography (AREA)
  • Computational Linguistics (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for carrying out security association analysis on threat intelligence of a metering automation system, which comprises the following steps: acquiring multi-source heterogeneous threat intelligence; preprocessing the data and constructing a threat intelligence transaction data set; performing association analysis by fusing multiple algorithms; and finding potential risks from the association analysis results. By implementing the method, the degree of aggregation of threat intelligence is improved, the efficiency of association analysis is improved, and security analysis can be formed quickly to resist potential risks, so that the security of the metering automation system is improved.

Description

Method for carrying out security association analysis on threat information of metering automation system
Technical Field
The invention relates to the technical field of computer security, in particular to a method for carrying out security association analysis on threat information of a metering automation system based on a fusion algorithm.
Background
With the arrival of the big data era, network services account for an ever larger share of daily life, network scale keeps expanding, and the number and impact of network attacks are increasing rapidly, which seriously harms network and information security. Guaranteeing the efficient and stable operation of networks and information systems is the basis for the normal conduct of all market activities. An existing metering automation system needs an overall security analysis of the system, so that reliable first-hand and third-party data are provided for the security evaluation of electric energy metering and the effectiveness and reliability of that evaluation are improved. To achieve this, network threat intelligence needs to be analyzed and studied, the security protection of the network and service systems needs to be strengthened, and a security defense system needs to be established to guarantee the normal operation of the network and service systems.
Threat intelligence is evidence-based knowledge, including context, mechanisms, indicators, implications and actionable advice. Threat intelligence describes an existing or imminent threat or danger to an asset and may be used to inform a subject's decision on how to respond to that threat or danger.
Because the obtained threat intelligence data come from complex sources with different data formats and analysis methods, data silos form easily, truly valuable threat intelligence cannot be obtained, and association analysis cannot be performed across all of the threat intelligence. In the prior art, security analysts mostly search the threat intelligence manually and correlate it with various threats, which is inefficient; the threat intelligence cannot be aggregated effectively, and security analysis cannot be formed quickly to resist potential risks.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method for carrying out security association analysis on threat intelligence of a metering automation system, so that the degree of aggregation of the various kinds of threat intelligence is improved and the analysis efficiency is improved.
As an aspect of the present invention, a method for security association analysis of threat intelligence of a metering automation system is provided, which at least comprises the following steps:
step 101, collecting multi-source heterogeneous threat intelligence data, wherein the threat intelligence comprises internally sourced threat intelligence and/or externally sourced threat intelligence; the internally sourced threat intelligence is key infrastructure data; the externally sourced threat intelligence comprises security events from different OSINT sources;
step 102, preprocessing the collected threat intelligence, unifying its format, and constructing a threat intelligence transaction database;
step 103, selecting a transaction data set to be analyzed from the transaction database, associating the selected transaction data set by using the FiDoop algorithm, and pruning to generate frequent item sets; obtaining the strong association rules corresponding to the frequent item sets through the Apriori algorithm to form an association analysis result;
and step 104, finding potential risks according to the association analysis result, and locating and handling them to ensure the safe operation of the system.
Preferably, the step 102 further comprises:
normalizing the collected threat intelligence data, and unifying data or items with the same meaning into the same description language;
extracting, from the threat description language, keywords that fully express each item as the transaction data for association analysis;
and removing data or items that are repeated or meaningless for association analysis to form a source transaction data set, thereby constructing the transaction database.
Preferably, the step 103 further comprises:
step 201, selecting a transaction data set to be analyzed from the transaction database, and organizing the transaction data set according to the intelligence data;
step 202, specifying a minimum support and a minimum confidence;
step 203, finding all frequent item sets in the transaction data set by using the FiDoop algorithm, wherein the frequent item sets are the non-empty subsets whose support is greater than the minimum support after each iteration;
and step 204, after all frequent item sets are found, performing association analysis by using the Apriori algorithm: for every frequent item set of length greater than 1, obtaining the confidence of the association rules derived from its subsets by Apriori, comparing the confidence with the minimum confidence threshold, and obtaining the strong association rules that satisfy the condition to form an association analysis result.
Preferably, the step 203 further comprises:
step 301, finding all frequent 1-item sets in the transaction data set by using a first MapReduce job;
step 302, finding the frequent k-item sets by using a second MapReduce job;
and step 303, mining the frequent item sets by using a third MapReduce job to obtain all frequent item sets.
Preferably, the step 301 further comprises:
the first MapReduce job is responsible for finding all frequent 1-item sets; the input of the Map task at this stage is the original transaction data set, and the output of the Reduce task is all frequent 1-item sets; the transaction data set is divided into a plurality of segments stored on the data nodes, and each Mapper reads its local transaction segment as key-value pairs <offset, itemset>, wherein offset is the offset value of the transaction and itemset is the transaction itself;
then, each Mapper counts the frequency of each local item and generates the local 1-item sets; the 1-item sets with the same key are sent to a designated Reducer and merged to generate the global 1-item sets; the global frequent 1-item sets are then obtained by comparison with the minimum support, the infrequent items are pruned, and the global frequent 1-item sets are output by the first MapReduce job as key-value pairs <item, count>.
Preferably, the step 302 further comprises:
scanning the database again in the second MapReduce job, removing the infrequent items from the transactions, and generating the frequent k-item sets;
taking the output of the first MapReduce job as the input of the second job, scanning the transaction data set a second time, and pruning the infrequent items from each transaction; if a pruned transaction contains k frequent items, its item set is a k-item set; the process is otherwise the same as in the first job; the frequent k-item sets are generated by referring to all the frequent 1-item sets generated previously;
and after the Mapper finishes, outputting intermediate key-value pairs of the form <k-itemsets, 1>, wherein k-itemsets gives the number of frequent items remaining in the pruned transaction and the content of the item set; merging in the Reducer and summing the counts, and outputting key-value pairs of the form <k, (k-itemsets, count)>, wherein the key is the length of the item set and the value is the content and count of the item sets of that length; the k-itemsets generated by this job are arranged in ascending dictionary order.
Preferably, the step 303 further comprises:
in the third MapReduce job, decomposing the k-itemsets generated by the second job into shorter item sets and merging them, in local memory, with the item sets of the same length to construct a k-FIU-tree; using the distributed processing capability of MapReduce, the Mapper generates in this process a group of new key-value pairs <k, k-FIU-tree>, that is, a group of local FIU-trees with path length k; item sets of the same length are distributed to the same Reducer, which aggregates the local FIU-trees of that length from the respective Mappers and constructs the global k-FIU-tree; the leaf nodes of an FIU-tree have two attributes, item name and count, and all frequent item sets are obtained by comparing the count values of the leaf nodes of the global k-FIU-trees with the minimum support.
Preferably, the step 104 further comprises:
performing association analysis on the obtained threat intelligence through a system security association analysis model based on the FiDoop-Apriori fusion algorithm, to find the vulnerability information existing in the system and to find potential risks, including software and hardware security risks, external attack risks, network vulnerabilities and the like, thereby helping analysts identify, evaluate and classify multi-source heterogeneous threat intelligence and locate and handle system vulnerabilities.
The embodiment of the invention has the following beneficial effects:
the embodiment of the invention provides a method for carrying out security association analysis on threat intelligence of a metering automation system. First, multi-source heterogeneous threat intelligence is collected, including security events provided by different OSINT sources and internally sourced key infrastructure data. Next, the collected threat intelligence is preprocessed, its format is unified, and a source transaction data set is generated. Then, association analysis is performed on the preprocessed intelligence data using big data and machine learning techniques, with the FiDoop algorithm and the Apriori algorithm fused, to find the vulnerability information existing in the current system. Finally, potential risks are found from the association analysis result, and related operations such as locating and handling them are performed to ensure the safe operation of the system.
By implementing the method, threat intelligence data of different sources and formats can be analyzed together; based on understanding and feature induction of the intelligence, the features of related threat indicators are studied and analyzed, potential risks are predicted through the model, and suggested measures for responding to threat activities are provided according to the results. The degree of aggregation of threat intelligence and the analysis efficiency are thereby improved, security analysis can be formed quickly to resist potential risks, and the security of the metering automation system is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic main flow chart of an embodiment of a method for security association analysis of threat intelligence of a metering automation system according to the present invention;
FIG. 2 is a more detailed flow chart of step 103 of FIG. 1;
FIG. 3 is a more detailed flow chart of step 203 of FIG. 2;
FIG. 4 is a diagram of a process of MapReduce operation corresponding to the FiDoop algorithm in FIG. 3 according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating an example of the FIU-tree construction process according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that, in order to avoid obscuring the present invention with unnecessary details, only the structures and/or processing steps closely related to the scheme according to the present invention are shown in the drawings, and other details not so relevant to the present invention are omitted.
Fig. 1 is a main flow diagram illustrating an embodiment of a method for security association analysis of threat intelligence of a metering automation system according to the present invention; referring to fig. 2 to 5 together, in the present embodiment, the method at least includes:
step 101, multi-source heterogeneous threat intelligence is obtained.
Multi-source heterogeneous threat intelligence data is collected, wherein the threat intelligence comprises internally sourced threat intelligence and/or externally sourced threat intelligence; the internally sourced threat intelligence is key infrastructure data; the externally sourced threat intelligence includes security events from different open-source intelligence (OSINT) tools;
more specifically, for the metering automation system, threat intelligence can be divided into internal sources and external sources. The internal sources mainly comprise asset logs, network traffic, operation state data and the like. Among the external sources, commercial threat intelligence is mainly provided by domestic and foreign security vendors and governmental information security organizations, and the providers of open-source intelligence are system vendors and network operators. Open-source intelligence can be collected through OSINT, and internal intelligence is obtained from the state data of the system equipment information platform.
Step 102, preprocessing the data and constructing a threat intelligence transaction data set.
The collected threat intelligence is preprocessed: the intelligence data is divided into structured data and unstructured data; false-data removal, deduplication, consistency analysis and data fusion are applied to the whole data set to eliminate redundant and repeated information; the multi-source intelligence is managed effectively and fused with the corresponding system's own data to form intelligence data suitable for algorithmic association analysis; the intelligence data is stored in a database and a transaction data set is constructed, which improves the efficiency of the algorithmic analysis;
more specifically, in one example, the step 102 further comprises:
normalizing the collected threat intelligence data, and unifying data or items with the same meaning into the same description language;
extracting, from the threat description language, keywords that fully express each item as the transaction data for association analysis;
and removing data or items that are repeated or meaningless for association analysis to form a source transaction data set, thereby constructing the transaction database.
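The three preprocessing sub-steps above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the synonym table `CANONICAL_TERMS`, the function names and the sample reports are hypothetical and do not come from the patent, and simple substring matching stands in for whatever keyword extraction a real deployment would use.

```python
import re

# Hypothetical synonym table: maps source-specific wording to one canonical term.
CANONICAL_TERMS = {
    "brute force": "brute_force",
    "password guessing": "brute_force",
    "port scan": "port_scan",
    "port scanning": "port_scan",
    "sql injection": "sql_injection",
    "sqli": "sql_injection",
}

def to_transaction(description):
    """Normalize one threat description and extract its canonical keywords."""
    text = re.sub(r"\s+", " ", description.lower()).strip()
    return frozenset(
        canonical for phrase, canonical in CANONICAL_TERMS.items() if phrase in text
    )

def build_transaction_dataset(descriptions):
    """Drop empty (meaningless) and repeated transactions, keeping first-seen order."""
    seen, dataset = set(), []
    for description in descriptions:
        transaction = to_transaction(description)
        if transaction and transaction not in seen:
            seen.add(transaction)
            dataset.append(transaction)
    return dataset

raw_reports = [
    "Port scan followed by SQL injection against the web front end",
    "SQLi attempt after port scanning",   # same meaning, different wording
    "Password guessing on the VPN gateway",
    "Routine heartbeat message",          # no threat keyword, so it is dropped
]
transactions = build_transaction_dataset(raw_reports)
```

In this sketch the first two reports collapse to the same transaction, so only one copy is kept, and the heartbeat message yields no keywords and is discarded.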
Step 103, performing association analysis by fusing multiple algorithms. A transaction data set to be analyzed is selected from the transaction database, the selected transaction data set is associated by using the FiDoop algorithm, and frequent item sets are generated by pruning; the strong association rules corresponding to the frequent item sets are obtained through the Apriori algorithm to form an association analysis result;
in one specific example, the transaction data set constructed from threat intelligence is obtained from the database, and association rules for system security analysis are extracted. Association rules can reveal regularities among the values of two or more variables, that is, the associations or correlations that exist between item sets in a large amount of data. The mining mainly comprises two stages:
the first stage: all frequent item sets are found in the transaction data set.
The second stage: the final association rules are generated from the frequent item sets.
The fusion-algorithm-based method for security association analysis of threat intelligence of the metering automation system fuses the FiDoop algorithm and the Apriori algorithm. First, the transaction data set in the database is associated by the FiDoop algorithm to generate frequent item sets, and the item sets whose support is not less than the minimum support are retained during pruning. Second, the frequent item sets are processed with the Apriori algorithm to generate strong association rules for subsequent association analysis. The MapReduce distributed parallel computing framework is used to process the massive data, the FiDoop algorithm and the Apriori algorithm are fused, and frequent pattern mining and association analysis are performed on the frequent item sets. As shown in fig. 2, in an example, the step 103 further includes:
step 201, organizing the transaction data set according to the intelligence data;
the transaction data set to be analyzed is selected from the transaction database; the object mined by association rules is generally a transaction data set.
More specifically, the setting for association rules may be defined as follows: let $T = \{I_1, I_2, I_3, \ldots, I_k, \ldots\}$ be a transaction database, where $t_k$ denotes the k-th transaction of $T$. Each group of data items forms a transaction data set $I_k = \{i_1, i_2, i_3, \ldots, i_k, \ldots\}$, where $i_k$ is a transaction item and each $I_k$ contains a plurality of $i_k$. All transaction data sets together form the set $T = \{I_1, I_2, I_3, \ldots, I_k, \ldots\}$. Association analysis is then performed on $T$ to obtain the relations between the individual $i_k$ in $T$, or between item sets $N$ and $M$ (each a set of some of the $i_k$), together with the support and confidence of those relations.
Step 202, specifying a minimum support and a minimum confidence;
the support represents the probability that the combination of X and Y occurs in the total transaction records, that is, the probability of occurrence of a certain item set: the proportion of the pieces of threat intelligence containing the item set to the total number of pieces of threat intelligence. X and Y are subsets of I, and the association relation between X and Y found in the transaction data set is recorded as
$X \Rightarrow Y$
wherein $P(X)$ represents the proportion of transactions containing the item set X and $P(X \cup Y)$ represents the proportion of transactions in which the item set X and the item set Y occur simultaneously, and the support can be expressed as:
$\mathrm{Support}(X \Rightarrow Y) = P(X \cup Y)$
The confidence represents the probability that the combination of X and Y occurs among the transactions in which the item set X occurs, that is, the probability that the item set Y appears when the item set X appears, recorded as
$P(Y \mid X)$
The confidence is the ratio of the number of pieces of threat intelligence containing both the item set X and the item set Y to the number containing the item set X, and can be expressed as:
$\mathrm{Confidence}(X \Rightarrow Y) = P(Y \mid X) = \frac{P(X \cup Y)}{P(X)}$
in the present embodiment, the minimum support $S_{min}$ and the minimum confidence $C_{min}$ are set as the thresholds for support and confidence, and only the association rules that reach both thresholds are called strong association rules. In the subsequent process, the non-empty subsets that reach the minimum support are kept as frequent item sets after each iteration, and the finally generated association rules are also composed from the frequent item sets.
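As a small worked illustration of support, confidence and the strong-rule test, the following Python sketch evaluates one rule on a toy data set. The four transactions, the item names and the helper names are hypothetical; only the thresholds of 0.5 mirror the values used in the example of fig. 5.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Support of (antecedent and consequent together) divided by support of antecedent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    return support(set(antecedent) | set(consequent), transactions) / base

def is_strong(antecedent, consequent, transactions, s_min, c_min):
    """A rule is strong when it reaches both thresholds."""
    both = set(antecedent) | set(consequent)
    return (support(both, transactions) >= s_min
            and confidence(antecedent, consequent, transactions) >= c_min)

# Hypothetical transaction data set (one frozenset per piece of threat intelligence).
toy_transactions = [
    frozenset({"port_scan", "brute_force", "malware"}),
    frozenset({"port_scan", "brute_force"}),
    frozenset({"port_scan", "phishing"}),
    frozenset({"brute_force", "malware"}),
]

rule_support = support({"port_scan", "brute_force"}, toy_transactions)          # 2 of 4
rule_confidence = confidence({"port_scan"}, {"brute_force"}, toy_transactions)  # 2 of 3
strong = is_strong({"port_scan"}, {"brute_force"}, toy_transactions, 0.5, 0.5)
```

Here the rule port_scan ⇒ brute_force has support 0.5 and confidence 2/3, so it counts as strong under both thresholds of 0.5.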
Step 203, finding all frequent item sets from the transaction data set by adopting a FiDoop algorithm;
this stage is the first stage of association rule mining: all frequent item sets are found in the transaction data set. The FiDoop algorithm is adopted to discover the frequent item sets efficiently.
The FiDoop algorithm is based on the FIUT algorithm and optimizes it through three MapReduce jobs. The FIUT algorithm mainly comprises two stages: the first stage scans the database twice, obtaining the frequent 1-item sets and the k-item sets produced by pruning infrequent items, respectively; the second stage obtains all frequent k-item sets. MapReduce is a mature programming framework (programming model) for distributed computing, used for parallel processing of large data sets.
As shown in fig. 3, which is a flowchart of the three MapReduce jobs in step 203, the FiDoop algorithm mines frequent item sets efficiently through the three MapReduce jobs. Specifically, the step 203 further includes:
step 301, finding all frequent 1-item sets in the transaction data set by using a first MapReduce job;
the first MapReduce job is responsible for discovering all frequent 1-item sets; the input of the Map task at this stage is the original transaction data set, and the output of the Reduce task is all frequent 1-item sets. The transaction data set is divided into a plurality of segments stored on the data nodes, and each Mapper reads its local transaction segment as key-value pairs <offset, itemset>, wherein offset is the offset value of the transaction and itemset is the transaction itself. Then, each Mapper counts the frequency of each local item and generates the local 1-item sets; the 1-item sets with the same key are sent to a designated Reducer and merged to generate the global 1-item sets; the global frequent 1-item sets are then obtained by comparison with the minimum support, the infrequent items are pruned, and the global frequent 1-item sets are output by the first MapReduce job as key-value pairs <item, count>.
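The first job can be imitated in a single process to make the data flow concrete. The sketch below is not Hadoop code: plain Python functions stand in for the Mapper and Reducer, the two segments and the item names are hypothetical, and converting the minimum support into an absolute count with a "greater than or equal" comparison is an assumption.

```python
import math
from collections import Counter

def map_count_items(segment):
    """Mapper stand-in: count how many local transactions contain each item."""
    local_counts = Counter()
    for transaction in segment:
        local_counts.update(set(transaction))
    return local_counts

def reduce_frequent_items(all_local_counts, min_count):
    """Reducer stand-in: merge local counts, then prune items below min_count."""
    global_counts = Counter()
    for local_counts in all_local_counts:
        global_counts.update(local_counts)
    return {item: n for item, n in global_counts.items() if n >= min_count}

# Hypothetical data set split into two segments, as if stored on two data nodes.
segments = [
    [{"port_scan", "brute_force", "malware"}, {"port_scan", "brute_force"},
     {"port_scan", "phishing"}],
    [{"brute_force", "malware"}, {"port_scan", "brute_force", "malware"},
     {"dos"}],
]
min_support = 0.5
total_transactions = sum(len(segment) for segment in segments)
min_count = math.ceil(min_support * total_transactions)

# Output of the first job: <item, count> pairs for the frequent 1-item sets.
frequent_1 = reduce_frequent_items(
    [map_count_items(segment) for segment in segments], min_count
)
```

With six transactions and a minimum support of 0.5, an item must appear at least three times, so `phishing` and `dos` are pruned.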
Step 302, finding the frequent k-item sets by using a second MapReduce job;
the database is scanned again in the second MapReduce job, and the infrequent items are removed from the transactions to produce the frequent k-item sets. The output of the first MapReduce job is taken as the input of the second job, the transaction data set is scanned a second time, and the infrequent items are pruned from each transaction; if a pruned transaction contains k frequent items, its item set is a k-item set. The process is otherwise the same as in the first job. The frequent k-item sets are generated by referring to all the frequent 1-item sets generated previously. After the Mapper finishes, it outputs intermediate key-value pairs of the form <k-itemsets, 1>, wherein k-itemsets gives the number of frequent items remaining in the pruned transaction and the content of the item set. The Reducer merges the pairs and sums the counts, and outputs key-value pairs of the form <k, (k-itemsets, count)>, wherein the key is the length of the item set and the value is the content and count of the item sets of that length. The k-itemsets generated by this job are the important data source from which the third MapReduce job constructs the k-FIU-trees, and are therefore arranged in ascending dictionary order.
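A single-process Python sketch of the second job is given below. It again replaces Hadoop with plain functions; the transactions and the set of frequent items are hypothetical inputs, and sorted tuples are used to realize the ascending dictionary order.

```python
from collections import Counter, defaultdict

def map_prune(segment, frequent_items):
    """Mapper stand-in: drop infrequent items and emit <k-itemset, 1> pairs."""
    for transaction in segment:
        kept = tuple(sorted(item for item in transaction if item in frequent_items))
        if kept:  # a transaction with no frequent item contributes nothing
            yield kept, 1

def reduce_by_length(pairs):
    """Reducer stand-in: sum the counts and group item sets by their length k."""
    counts = Counter()
    for itemset, one in pairs:
        counts[itemset] += one
    grouped = defaultdict(dict)
    for itemset, count in counts.items():
        grouped[len(itemset)][itemset] = count
    return dict(grouped)

# Hypothetical inputs: the raw transactions and the frequent items of the first job.
segment = [
    {"port_scan", "brute_force", "malware"}, {"port_scan", "brute_force"},
    {"port_scan", "phishing"}, {"brute_force", "malware"},
    {"port_scan", "brute_force", "malware"}, {"dos"},
]
frequent_items = {"port_scan", "brute_force", "malware"}

# Output of the second job: <k, (k-itemset, count)> grouped as {k: {itemset: count}}.
k_itemsets = reduce_by_length(map_prune(segment, frequent_items))
```

Note that the item sets produced here are only pruned transactions; whether each of them (and each of their subsets) is actually frequent is decided in the third job.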
Step 303, mining the frequent item sets by using a third MapReduce job to obtain all frequent item sets;
in the third MapReduce job, the k-itemsets generated by the second job are decomposed into shorter item sets and merged, in local memory, with the item sets of the same length to construct a k-FIU-tree. Using the distributed processing capability of MapReduce, the Mapper generates in this process a group of new key-value pairs <k, k-FIU-tree>, that is, a group of local FIU-trees with path length k. Item sets of the same length are distributed to the same Reducer, which aggregates the local FIU-trees of that length from the respective Mappers and constructs the global k-FIU-tree. The leaf nodes of an FIU-tree have two attributes, item name and count; no recursive traversal of the whole tree is needed, and all frequent item sets are obtained by comparing the count values of the leaf nodes of the global k-FIU-trees with the minimum support.
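The counting performed by the third job can be sketched as follows. This is a deliberate simplification and not the FIU-tree structure itself: a flat `Counter` keyed by item set plays the role of the leaf counts of the k-FIU-trees, and everything runs in one process. The input dictionary and the minimum count are hypothetical values.

```python
from collections import Counter
from itertools import combinations

def mine_frequent_itemsets(k_itemsets, min_count):
    """Decompose every k-itemset into its shorter item sets, accumulate the
    counts per item set (the leaf counts of an FIU-tree, stored flat here),
    and keep the item sets whose count reaches min_count."""
    leaf_counts = Counter()
    for k, itemsets in k_itemsets.items():
        for itemset, count in itemsets.items():
            for size in range(1, k + 1):
                for subset in combinations(itemset, size):
                    leaf_counts[subset] += count
    return {itemset: n for itemset, n in leaf_counts.items() if n >= min_count}

# Hypothetical output of the second job: {k: {sorted item tuple: count}}.
k_itemsets = {
    3: {("brute_force", "malware", "port_scan"): 2},
    2: {("brute_force", "port_scan"): 1, ("brute_force", "malware"): 1},
    1: {("port_scan",): 1},
}
frequent_itemsets = mine_frequent_itemsets(k_itemsets, min_count=3)
```

Because the input tuples are sorted, `itertools.combinations` emits every subset in the same order, so equal item sets always map to the same key. A tree shares common prefixes and is more memory-efficient than this flat table, which is why the sketch should be read only as an explanation of what the leaf counts mean.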
For ease of understanding, the steps described above may be read together with fig. 4 and fig. 5. FIG. 4 shows the MapReduce job process of the FiDoop algorithm in one embodiment of the invention. FIG. 5 is an exemplary diagram of the FIU-tree construction process, in which the initial transaction data set used for testing contains 8 pieces of intelligence data, and the minimum support and the minimum confidence are both set to 0.5.
Step 204, performing association analysis with the Apriori algorithm;
after all frequent item sets are found, association analysis is performed by using the Apriori algorithm: for every frequent item set of length greater than 1, the confidence of the association rules derived from its subsets by Apriori is obtained and compared with the minimum confidence threshold, and the strong association rules that satisfy the condition are obtained to form an association analysis result. Specifically, the association rules are generated as follows:
(1) For each frequent item set I, generate all non-empty subsets; for each non-empty subset X of I, if
Confidence(X) ≥ minconfidence
then the rule
$X \Rightarrow (I - X)$
holds. This gives the theorem: for an item set X with X1 a subset of X, if the rule:
$X \Rightarrow (I - X)$
is not a strong rule, then:
$X_1 \Rightarrow (I - X_1)$
must not be a strong rule; if the rule:
Figure BDA0003899288970000094
is a strong rule, then
Figure BDA0003899288970000095
Must be a strong rule.
(2) Verify that each generated strong association rule meets the minimum support and the minimum confidence.
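The rule generation of step 204 can be sketched as below, taking the confidence of a rule X ⇒ (I − X) to be support(I) / support(X), the usual Apriori definition. This is only a sketch under stated assumptions: the function name `generate_rules`, the item names, and the support values are hypothetical, and X ranges over the proper non-empty subsets of I so that the consequent is never empty. A minimum confidence of 0.7 is assumed here purely for illustration; with both thresholds at 0.5, every rule drawn from a frequent item set would pass, because support(I) ≥ 0.5 and support(X) ≤ 1.

```python
from itertools import combinations


def generate_rules(support, min_confidence):
    """Return the strong rules as (antecedent, consequent, confidence).

    support maps frozenset(itemset) -> support for every frequent itemset;
    every subset of a frequent itemset is itself frequent, so all the
    lookups below are expected to succeed.
    """
    rules = []
    for itemset, support_i in support.items():
        if len(itemset) < 2:
            continue  # rules need a non-empty antecedent and consequent
        items = sorted(itemset)
        for size in range(1, len(items)):
            for antecedent in combinations(items, size):
                x = frozenset(antecedent)
                confidence = support_i / support[x]
                if confidence >= min_confidence:
                    consequent = tuple(sorted(itemset - x))
                    rules.append((antecedent, consequent, confidence))
    return rules


# Hypothetical supports measured over 8 transactions.
support = {
    frozenset({"malware"}): 6 / 8,
    frozenset({"port_scan"}): 5 / 8,
    frozenset({"malware", "port_scan"}): 4 / 8,
}
rules = generate_rules(support, min_confidence=0.7)
for antecedent, consequent, confidence in rules:
    print(antecedent, "=>", consequent, round(confidence, 3))
```

The theorem in item (1) would allow the inner loop to skip every subset X1 of an antecedent X whose rule has already failed; the sketch checks all antecedents for simplicity.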
Step 104, finding potential risks according to the association analysis result;
The obtained threat intelligence is subjected to association analysis by a system security association analysis model based on the FiDoop-Apriori fusion algorithm, so as to find the vulnerability information existing in the system and identify potential risks, including software and hardware security risks, external attack risks, network vulnerabilities and the like. This helps analysts identify, evaluate and classify multi-source heterogeneous threat intelligence, and locate and handle system vulnerabilities.
The embodiment of the invention has the following beneficial effects:
The embodiment of the invention provides a method for carrying out security association analysis on threat intelligence of a metering automation system. First, multi-source heterogeneous threat intelligence is collected, covering security events provided by different OSINT sources and key infrastructure data from internal sources. The collected threat intelligence is then preprocessed to unify its format and generate a source transaction data set. Next, association analysis is performed on the preprocessed intelligence data using big data and machine learning techniques, fusing the FiDoop algorithm and the Apriori algorithm to find the vulnerability information existing in the current system. Finally, potential risks are found from the association analysis result, and related operations such as locating and handling are performed to ensure the safe operation of the system.
By implementing the method, association analysis can be performed on threat intelligence data from different sources and in different formats. Based on an understanding of the intelligence and induction of its features, associated threat indicator features are studied and analyzed, potential risks are predicted by the model, and suggested measures for responding to threat activities are provided according to the results. The method raises the degree of aggregation of threat intelligence and the efficiency of analysis, so that security analysis can be formed quickly to counter potential risks, improving the security of the metering automation system.
The above description is only a preferred embodiment of the present invention and should not be taken as limiting the scope of the claims; equivalent changes and modifications made without departing from the spirit of the present invention therefore still fall within the scope of the invention.

Claims (8)

1. A method for security association analysis of threat intelligence of a metering automation system is characterized by at least comprising the following steps:
step 101, collecting multisource heterogeneous threat information data, wherein the threat information comprises internal source threat information or/and external source threat information; the internally sourced threat intelligence is key infrastructure data; the externally sourced threat intelligence comprises security events from different OSINT offerings;
step 102, preprocessing the collected threat intelligence and constructing a threat intelligence transaction database;
step 103, selecting a transaction data set to be analyzed from the transaction database, associating the selected transaction data set by using a FiDoop algorithm, and pruning to generate frequent item sets; obtaining the strong association rules corresponding to the frequent item sets through an Apriori algorithm to form an association analysis result;
and step 104, finding potential risks according to the association analysis result, and performing locating and handling to ensure the safe operation of the system.
2. The method of claim 1, wherein the step 102 further comprises:
normalizing the collected threat information data, and unifying the data or items with the same meaning into the same description language;
extracting keywords capable of completely expressing items from the threat description language as transaction data to perform correlation analysis;
and removing repeated or meaningless data or items for association analysis to form a source transaction data set, thereby constructing the transaction database.
3. The method of claim 1, wherein the step 103 further comprises:
step 201, selecting a transaction data set to be analyzed from a transaction database, and sorting the transaction data set according to intelligence data;
step 202, appointing a minimum support degree and a minimum confidence degree;
step 203, finding all frequent item sets from the transaction data set by adopting a FiDoop algorithm, wherein the frequent item sets are non-empty subsets with the support degree larger than the minimum support degree after each iteration process;
and step 204, after finding out all frequent item sets, performing association analysis by using an Apriori algorithm, acquiring association rule confidence by using a subset association rule of Apriori in all frequent items with the length of more than 1, comparing the association rule confidence with a minimum confidence threshold value, acquiring a strong association rule meeting conditions, and forming an association analysis result.
4. The method of claim 3, wherein said step 203 further comprises:
step 301, in a transaction data set, adopting a first MapReduce operation to find all frequent 1-item sets;
step 302, finding out a frequent k-item set by adopting a second MapReduce operation;
and step 303, mining the frequent item set by adopting a third MapReduce operation to obtain all frequent item sets.
5. The method of claim 4, further comprising, in the step 301:
the first MapReduce operation is responsible for finding all frequent 1-item sets; at this stage the input of the Map task is the original transaction data set, and the output of the Reduce task is all frequent 1-item sets; the transaction data set is divided into a plurality of segments stored in the data nodes, and each Mapper takes its local transaction set segment as input in the form of key value pairs < offset, itemset >, wherein the offset is the offset value of the transaction, and the itemset represents the transaction itself;
then, each Mapper calculates the frequency of each local item and generates a local frequent 1-item set; the 1-item sets with the same key value are sent to a designated Reducer and merged to generate a global 1-item set; the global frequent 1-item set is then obtained by comparing with the minimum support degree and pruning the non-frequent items, and is output by the first MapReduce operation in the form of key value pairs < item, count >.
6. The method of claim 5, further comprising, in the step 302:
scanning the database again in the second MapReduce operation, removing the non-frequent item set in the transaction, and generating a frequent k-item set;
taking the output of the first MapReduce operation as the input of the second operation, carrying out secondary scanning on a transaction data set, cutting off the infrequent items in the transaction, and if the transaction contains k frequent items, then the item set is changed into a k-item set, and the process is consistent with the operation of the first operation; generating a frequent k-item set by referring to all the frequent 1-item sets generated previously;
after the Mapper finishes working, outputting intermediate key value pairs in the form of < k-items, 1>, wherein the k-items indicates the number of frequent 1-item sets remaining in the pruned transaction and the content of the item set; a merging operation is carried out in the Reducer, the count values are accumulated, key value pairs in the form of < k (k-itemsets, count) > are output, and the k-itemsets generated by the operation are arranged in ascending dictionary order.
7. The method of claim 6, further comprising, in step 303:
in the third MapReduce operation, decomposing the k-itemsets generated by the second operation into a shorter item set, merging the item set with the k-itemsets with the same k value in the local memory according to the item set with the same length to construct a k-FIU-tree, and generating a group of new key value pairs < k, k-FIU-tree > by using the distributed processing capacity of the MapReduce in the process, wherein the meaning of the key value pairs is that a group of local FIU-trees with the path length of k is generated; items with the same length are distributed to the same Reducer, local FIU-trees with the unique length in respective maps are gathered, k-FIU-trees in a global scope are constructed, FIU-tree leaf nodes have two attributes of item names and counts, and all frequent item sets are obtained by counting the count values of the leaf nodes of the global k-FIU-trees and comparing the count values with Smin.
8. The method of any of claims 1 to 7, wherein the step 104 further comprises:
performing association analysis on the obtained threat intelligence through a system security association analysis model based on a FiDoop-Apriori fusion algorithm to find the vulnerability information existing in the system and identify potential risks, the potential risks including software and hardware security risks, external attack risks, network vulnerabilities and the like, thereby helping analysts identify, evaluate and classify multi-source heterogeneous threat intelligence, and locate and handle system vulnerabilities.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211284766.8A CN115544519A (en) 2022-10-20 2022-10-20 Method for carrying out security association analysis on threat information of metering automation system

Publications (1)

Publication Number Publication Date
CN115544519A true CN115544519A (en) 2022-12-30

Family

ID=84736171

Country Status (1)

Country Link
CN (1) CN115544519A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination