WO2022174379A1 - Risk assessment of firewall rules in a datacenter - Google Patents
Risk assessment of firewall rules in a datacenter
- Publication number
- WO2022174379A1 (application PCT/CN2021/076792)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- firewall
- firewall rule
- rules
- rule
- firewall rules
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/02—Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
- H04L63/0227—Filtering policies
- H04L63/0263—Rule management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/08—Configuration management of networks or network elements
- H04L41/085—Retrieval of network configuration; Tracking network configuration history
- H04L41/0853—Retrieval of network configuration; Tracking network configuration history by actively collecting configuration information or by backing up configuration information
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/08—Configuration management of networks or network elements
- H04L41/0893—Assignment of logical groups to network elements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/22—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks comprising specially adapted graphical user interfaces [GUI]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/02—Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
- H04L63/0227—Filtering policies
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/10—Network architectures or network communication protocols for network security for controlling access to devices or network resources
- H04L63/104—Grouping of entities
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1425—Traffic logging, e.g. anomaly detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/08—Configuration management of networks or network elements
- H04L41/0803—Configuration setting
- H04L41/0806—Configuration setting for initial configuration or provisioning, e.g. plug-and-play
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/16—Threshold monitoring
Definitions
- Firewall rules are widely deployed in datacenters for the purposes of, e.g., ensuring safety of network and data accessing, defending against network attacks, etc.
- A datacenter may broadly refer to a set of devices, or a platform consisting of a number of devices, which operates for various purposes or scenarios.
- Devices in datacenters may comprise various network or computing devices, which are usually referred to as, e.g., hosts, etc.
- the devices in datacenters may comprise databases, file servers, application servers, cloud processing units, gateways, etc.
- Embodiments of the present disclosure propose methods and apparatuses for risk assessment of firewall rules in a datacenter.
- the datacenter may have a plurality of devices and be configured with a plurality of firewall rules.
- Firewall rule configuration information of the plurality of devices may be obtained.
- a plurality of device groups formed by the plurality of devices may be identified.
- Count distributions of the plurality of firewall rules over the plurality of device groups may be determined.
- the plurality of firewall rules may be clustered into a plurality of firewall rule groups based on the count distributions, firewall rules in each firewall rule group having the same risk attribute.
- FIG. 1 illustrates an exemplary process for performing risk assessment of firewall rules in a datacenter according to an embodiment.
- FIG. 2 illustrates an exemplary process for clustering firewall rules into firewall rule groups according to an embodiment.
- FIG. 3 illustrates an example of obtaining firewall rule groups according to an embodiment.
- FIG. 4 illustrates an exemplary process for automatically labeling firewall rules in a firewall rule group according to an embodiment.
- FIG. 5 illustrates an exemplary process of historical risk attribute label inheriting mechanism according to an embodiment.
- FIG. 6 illustrates a flowchart of an exemplary method for risk assessment of firewall rules in a datacenter according to an embodiment.
- FIG. 7 illustrates an exemplary apparatus for risk assessment of firewall rules in a datacenter according to an embodiment.
- FIG. 8 illustrates an exemplary apparatus for risk assessment of firewall rules in a datacenter according to an embodiment.
- Usually, a large number of firewall rules will be deployed in a datacenter. Besides normal or reasonably configured firewall rules, the deployed firewall rules may also include some risky firewall rules.
- The risky firewall rules broadly refer to those firewall rules that may hinder normal operations of the datacenter, provide very limited or no protection for the datacenter, pose threats to network connections or data in the datacenter, etc.
- The risky firewall rules may comprise invalid or misconfigured firewall rules, casually-applied or locally-applied firewall rules that are deployed at a limited number of devices, malicious firewall rules that are deployed in the datacenter by illegitimate entities, etc. Therefore, there is a need to identify risky firewall rules among all firewall rules configured in the datacenter.
- Existing approaches to identifying risky firewall rules rely on manual checking, e.g., manually checking one by one whether the deployed firewall rules are risky.
- The above manual approaches of identifying risky firewall rules are very inefficient, and can hardly check all the firewall rules configured in the datacenter in a timely manner.
- Some devices or services may require adding, removing or updating relevant firewall rules frequently, which further increases the difficulty of checking firewall rules manually.
- Embodiments of the present disclosure propose schemes that facilitate risk assessment of firewall rules in a datacenter.
- Risk assessment of a firewall rule may refer to determining a risk attribute of the firewall rule, e.g., determining whether the firewall rule is risky or riskless, determining a risk level of the firewall rule, etc.
- For example, the firewall rules “Remote Desktop - Shadow (TCP-In)”, “Remote Desktop - User Mode (TCP-In)”, “Remote Desktop - User Mode (UDP-In)”, etc. serve the same purpose of “Remote Desktop”.
- Firewall rules serving the same purpose would have equivalent count distributions in the datacenter, and thus each of the firewall rule groups obtained by using count distributions would comprise a plurality of firewall rules that serve the same purpose and have the same risk attribute.
- If a firewall rule is determined to have a certain risk attribute, other firewall rules having equivalent count distributions, e.g., other firewall rules in the same firewall rule group as the firewall rule, would also have that risk attribute. Therefore, with the firewall rule groups obtained through the embodiments of the present disclosure, risk assessment may be performed efficiently, e.g., the same risk attribute label may be attached to all firewall rules in one firewall rule group.
- The embodiments of the present disclosure may automatically attach risk attribute labels to firewall rules in the firewall rule groups by, e.g., utilizing historical risk attribute labels.
- The embodiments of the present disclosure may facilitate efficient, precise and automatic risk assessment of firewall rules. All firewall rules in a datacenter may be assessed in an efficient and time-saving manner. Moreover, benefiting from the high efficiency of risk assessment according to the embodiments, it would be easier to achieve centralized control of firewall rules in a datacenter. Risky firewall rules can be identified in a timely manner.
- The processes proposed by the embodiments of the present disclosure may be performed automatically, either periodically or in response to any type of triggering event, and thus firewall rules in a datacenter can be monitored continuously and in a timely manner.
- FIG. 1 illustrates an exemplary process 100 for performing risk assessment of firewall rules in a datacenter according to an embodiment. It is assumed that a datacenter 102 is configured with a plurality of firewall rules, and risk attributes of these firewall rules are to be assessed.
- the datacenter 102 may comprise a plurality of devices, e.g., network or computing devices.
- Firewall rule configuration information 112 of the devices in the datacenter 102 may be obtained.
- the firewall rule configuration information 112 may comprise firewall rules configured at each device. For example, assuming that there are 25 firewall rules configured at a certain database, firewall rule configuration information of this database may comprise or indicate names of the 25 firewall rules.
- a plurality of device groups 114 formed by the devices in the datacenter 102 may be identified.
- Devices in a datacenter may be divided into a plurality of device groups based on functions or purposes, and devices in each device group may have the same function or purpose. For example, there may be 10 devices for providing a certain file access service in a datacenter, and thus these 10 devices may be divided into one device group. Since devices in one device group have the same function or purpose, these devices may also have the same or similar requirements for firewall rules. Accordingly, it is very likely that firewall rules serving the same purpose are configured at the devices in the same device group. It should be understood that the identifying of device groups may refer to determining the device groups from all devices in the datacenter 102 based on various predetermined functions or purposes, or receiving indications of device groups that have been previously identified by any approach.
- At 120, count distributions of the plurality of firewall rules configured in the datacenter 102 over the plurality of device groups 114 may be determined.
- A count distribution of each firewall rule may comprise the number of devices configured with the firewall rule in each of the plurality of device groups. Assume that a firewall rule $R_i$ is configured at 10 of the 25 devices in device group 1, at 8 of the 12 devices in device group 2, at 15 of the 30 devices in device group 3, etc. Then the count distribution of the firewall rule $R_i$ may indicate that this firewall rule is configured at 10 devices in device group 1, 8 devices in device group 2, 15 devices in device group 3, etc.
- the firewall rule configuration information 112 and the identified device groups 114 may be used together for determining a count distribution of each firewall rule.
- the firewall rules in the datacenter 102 may be represented as a list of tuples, each tuple corresponding to a firewall rule.
- An exemplary format of a tuple may be [RuleName, Location, Count], wherein the item of RuleName is the name of the firewall rule represented by this tuple, and the items of Location and Count represent information about the count distribution of this firewall rule.
- The item of Location may be a vector listing all the device groups 114 in the datacenter 102.
- The item of Count may be a vector listing the number of devices configured with this firewall rule in each of the plurality of device groups 114. It should be understood that the above representation of the firewall rules is exemplary, and the firewall rules are not limited to this representation.
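The tuple representation above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `count_distributions`, the dictionary-based inputs, and the toy device and rule names are assumptions made for the example and do not come from the disclosure.

```python
from collections import defaultdict


def count_distributions(rule_config, device_groups):
    """Build [RuleName, Location, Count] entries.

    rule_config:   dict mapping device name -> rule names configured there
                   (a stand-in for the firewall rule configuration information).
    device_groups: dict mapping group name -> device names in that group.
    """
    location = list(device_groups)  # fixed ordering of the device groups
    counts = defaultdict(lambda: [0] * len(location))
    for g_idx, group in enumerate(location):
        for device in device_groups[group]:
            # set() so a rule listed twice on one device is counted once
            for rule in set(rule_config.get(device, ())):
                counts[rule][g_idx] += 1
    return [[rule, location, counts[rule]] for rule in sorted(counts)]


# Hypothetical toy data, for illustration only.
rule_config = {
    "db-1": ["Remote Desktop - User Mode (TCP-In)", "SQL-In"],
    "db-2": ["SQL-In"],
    "fs-1": ["Remote Desktop - User Mode (TCP-In)"],
}
device_groups = {"databases": ["db-1", "db-2"], "file-servers": ["fs-1"]}
tuples = count_distributions(rule_config, device_groups)
```

Each entry pairs a rule name with the shared group ordering and the per-group device counts, so the Count vector can be used directly as the rule's count distribution.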
- At 130, the plurality of firewall rules in the datacenter 102 may be clustered into a plurality of firewall rule groups 140 based on the count distributions determined at 120.
- The clustering operation at 130 is intended to cluster firewall rules that have equivalent count distributions into the same firewall rule group.
- The equivalent count distributions may refer to the same or similar count distributions.
- Firewall rules in each of the firewall rule groups 140 would have the same risk attribute.
- An unsupervised learning algorithm may be adopted for performing the clustering operation at 130, which will be discussed in detail in connection with FIG. 2 and FIG. 3 later.
- The firewall rule groups obtained through the process 100 would greatly improve the efficiency of the risk assessment of firewall rules. For example, since firewall rules in each firewall rule group have the same risk attribute, risk attribute labels of these firewall rules may be unified to the same risk attribute label. Accordingly, there is no need to check every firewall rule in a firewall rule group.
- Risk attribute labeling of the firewall rules may be further performed based at least on the firewall rule groups 140.
- manual labeling may be performed at 150.
- For a firewall rule group, a risk attribute label of one firewall rule in this firewall rule group may be manually determined, and the determined label may be further attached to all firewall rules in this firewall rule group.
- Several risk attribute labels of a few firewall rules in a firewall rule group may first be manually determined, and then a representative or combined risk attribute label may be selected from the several risk attribute labels and attached to all firewall rules in this firewall rule group.
- Automatic labeling may be performed at 160. Assuming that at least a part or all of the firewall rules in the datacenter have been attached with risk attribute labels before the process 100 is performed, these existing risk attribute labels may be used as historical risk attribute labels and applied at 160 for automatically determining and attaching risk attribute labels to firewall rules in the firewall rule groups 140. The operation of automatic labeling at 160 will be discussed in detail in connection with FIG. 4 and FIG. 5 later.
- the process 100 may be performed periodically according to a predefined period.
- The predefined period may be any type of period, e.g., per day, per week, etc.
- The process 100 may be performed in response to any type of triggering event.
- The triggering events may be any type of predefined event, e.g., device failure, system failure, network attacks, new service or device deployment, service updating, predetermined time points, etc.
- risk attributes of firewall rules may be classified in different approaches.
- risk attributes of firewall rules may be classified as risky or riskless.
- a risk attribute label attached to a firewall rule may indicate whether the firewall rule is risky or riskless.
- Risk attributes of firewall rules may be classified into various risk levels, e.g., high risk, low risk, riskless, etc. Accordingly, a risk attribute label attached to a firewall rule may indicate a certain risk level.
- The embodiments of the present disclosure are not restricted to any specific classification approach for risk attributes.
- FIG. 2 illustrates an exemplary process 200 for clustering firewall rules into firewall rule groups according to an embodiment.
- the process 200 is an exemplary implementation of the clustering operation at 130 in FIG. 1.
- An unsupervised learning algorithm for clustering firewall rules is discussed in connection with operations in the process 200.
- the unsupervised learning algorithm intends to cluster firewall rules that have equivalent count distributions into the same firewall rule group.
- count distributions 202 of firewall rules in a datacenter have been obtained through, e.g., the operation at 120 in FIG. 1.
- Distribution vector representation may be performed for the firewall rules based on the count distributions 202.
- A count distribution of each firewall rule may be represented as a distribution vector of the firewall rule.
- As an example, $D_3 = [200, 150, \ldots, 300]$ denotes the distribution vector of the third firewall rule $R_3$, which indicates that 200 devices are configured with $R_3$ in the first device group, 150 devices are configured with $R_3$ in the second device group, ..., and 300 devices are configured with $R_3$ in the n-th device group.
- similarity calculation may be performed among the firewall rules based on the distribution vectors 212. For example, similarity of every two firewall rules among the firewall rules may be calculated with distribution vectors of the two firewall rules, and a similarity matrix 222 may be formed with the calculated similarities. Various approaches may be adopted for calculating similarity of two firewall rules.
- Cosine similarity may be adopted for calculating similarity of two firewall rules. For example, assuming that $D_1$ is a distribution vector of the firewall rule $R_1$ and $D_2$ is a distribution vector of the firewall rule $R_2$, similarity of $R_1$ and $R_2$ may be calculated as: $\text{cos\_sim}(D_1, D_2) = \dfrac{D_1 \cdot D_2}{\lVert D_1 \rVert \, \lVert D_2 \rVert}$
- wherein cos_sim(·) denotes a cosine similarity function
- Maximum relative distance similarity may be adopted for calculating similarity of two firewall rules. For example, assuming that $D_1$ is a distribution vector of the firewall rule $R_1$ and $D_2$ is a distribution vector of the firewall rule $R_2$, similarity of $R_1$ and $R_2$ may be calculated as $\text{max\_sim}(D_1, D_2)$, wherein:
- max_sim(·) denotes a maximum relative distance similarity function
- min(·) is a minimum value extraction function
- truncate(·) is a truncating function
- abs(·) denotes taking an absolute value
- The maximum relative distance similarity is proposed by the embodiments of the present disclosure to provide a sharper function shape than that of the cosine similarity. Compared with the cosine similarity, the maximum relative distance similarity may achieve higher performance in clustering firewall rules, because it pays more attention to the distribution characteristics of the count distributions indicated in the distribution vectors.
- the similarity calculation at 220 may adopt either or both of the cosine similarity and the maximum relative distance similarity, or any other approaches capable of calculating similarity of two firewall rules.
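The two similarity measures can be compared with a short Python sketch. The cosine similarity below is the standard definition. The closed form of the maximum relative distance similarity is not reproduced in the text here, so `max_sim` is only one plausible reading built from the named min, truncate and abs functions: it takes one minus the relative difference in each device group, clips it to [0, 1], and returns the smallest value. That form, the `truncate` bounds, and the use of the larger count as the denominator are all assumptions.

```python
import math


def cos_sim(a, b):
    """Standard cosine similarity of two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def truncate(x, lo=0.0, hi=1.0):
    """Clip x to the interval [lo, hi] (assumed meaning of 'truncate')."""
    return max(lo, min(hi, x))


def max_sim(a, b):
    """ASSUMED form: 1 minus the largest per-group relative difference."""
    sims = []
    for x, y in zip(a, b):
        denom = max(x, y)
        rel = abs(x - y) / denom if denom else 0.0  # both counts zero -> no difference
        sims.append(truncate(1.0 - rel))
    return min(sims)


# Two hypothetical rules that agree in one device group and differ in the other.
d1, d2 = [100, 100], [100, 50]
cosine_value = cos_sim(d1, d2)
max_rel_value = max_sim(d1, d2)
```

Under this assumed form, a single device group with a large relative difference pulls the similarity down sharply, while the cosine similarity of the same two vectors stays close to 1. This is consistent with the "sharper function shape" described above, but it is not a confirmed reconstruction of the disclosed formula.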
- the similarity matrix 222 may be formed with the calculated similarities of every two firewall rules among all the firewall rules.
- The similarity matrix 222 may be an $m \times m$ matrix, wherein m is the number of firewall rules in the datacenter. Items in the similarity matrix 222 may be denoted as $l_{p,q}$, which is the calculated similarity of the p-th firewall rule and the q-th firewall rule.
- Matrix conversion may be performed on the similarity matrix 222.
- the similarity matrix 222 may be converted to an adjacency matrix 232 by applying a similarity threshold. Those items with values equal to or above the similarity threshold in the similarity matrix 222 may be converted to items with value “1” in the adjacency matrix 232, while those items with values below the similarity threshold in the similarity matrix 222 may be converted to items with value “0” in the adjacency matrix 232. Alternatively, diagonal items in the similarity matrix 222 may also be converted to items with value “0” in the adjacency matrix 232.
- the similarity threshold may be predetermined empirically or experimentally.
- a higher similarity threshold would ensure that firewall rules in one group can have high similarity with each other, but may cause fewer firewall rules to be included in one group.
- a lower similarity threshold would cause a group to include more firewall rules, but may cluster risky firewall rules and riskless firewall rules into one group.
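The thresholding step can be illustrated with a minimal Python sketch. The function name `to_adjacency` and the similarity values are hypothetical. The sketch follows the variant described above in which diagonal items are set to 0.

```python
def to_adjacency(similarity, threshold):
    """Convert an m x m similarity matrix into a 0/1 adjacency matrix.

    Items at or above the threshold become 1, items below it become 0,
    and diagonal items are set to 0 so that no node links to itself.
    """
    m = len(similarity)
    return [
        [1 if p != q and similarity[p][q] >= threshold else 0 for q in range(m)]
        for p in range(m)
    ]


# Hypothetical similarity values for three firewall rules.
similarity = [
    [1.0, 0.9, 0.3],
    [0.9, 1.0, 0.8],
    [0.3, 0.8, 1.0],
]
adjacency = to_adjacency(similarity, threshold=0.8)
```

Raising the threshold to 0.85 in this toy matrix would remove the link between the second and third rules, which shows the trade-off described above: tighter groups, but fewer rules per group.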
- graph building may be performed with the adjacency matrix 232.
- a graph representation 242 of the adjacency matrix 232 may be built at 240.
- Value “1” in the adjacency matrix 232 indicates that there is an edge between two nodes in the graph representation 242, wherein the two nodes correspond to two firewall rules.
- Subgraph extraction may be performed on the graph representation 242.
- a plurality of connected subgraphs 252 may be extracted from the graph representation 242.
- Each connected subgraph may comprise a plurality of nodes that have high similarity with each other. Therefore, the plurality of connected subgraphs 252 may correspond to a plurality of firewall rule groups 260 respectively. It should be understood that it is possible that a connected subgraph contains only one node, which indicates that similarities between a firewall rule corresponding to this node and any other firewall rules are below the similarity threshold, and accordingly this firewall rule itself will form a firewall rule group.
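Extracting connected subgraphs from the adjacency matrix amounts to finding connected components. The sketch below uses a plain depth-first search. The function name `rule_groups` and the example matrix are assumptions for illustration, and any standard connected-components routine could be used instead.

```python
def rule_groups(adjacency):
    """Return connected components of a 0/1 adjacency matrix.

    Each component is a sorted list of rule indices and corresponds to one
    firewall rule group; an isolated node forms a group of its own.
    """
    m = len(adjacency)
    seen = set()
    groups = []
    for start in range(m):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:  # iterative depth-first search
            node = stack.pop()
            component.append(node)
            for nxt in range(m):
                if adjacency[node][nxt] == 1 and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(component))
    return groups


# Hypothetical adjacency matrix: rules 0-1-2 are linked, rule 3 is isolated.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
groups = rule_groups(adjacency)
```

Note that rules 0 and 2 land in the same group although they share no direct edge. A connected subgraph only guarantees a chain of above-threshold similarities, not that every pair of rules in it is similar.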
- Firewall rules in a datacenter may be clustered into a plurality of firewall rule groups based on the count distributions of the firewall rules. It should be understood that all the operations in the process 200 are exemplary, and the embodiments of the present disclosure may cover any other approaches or processes that are capable of clustering firewall rules based on count distributions. Moreover, in a complete graph every pair of nodes is directly connected, i.e., every pair of firewall rules meets the similarity threshold, which leads to a higher precision of clustering than other types of connected graph. Accordingly, the similarity threshold may also be predetermined to cause the plurality of connected subgraphs to approximate complete graphs. For example, the similarity threshold may be selected so that the extracted connected subgraphs 252 are complete graphs or as close to complete graphs as possible.
- When a firewall rule is clustered into a certain firewall rule group through the process 200, the firewall rule may be represented as an updated tuple of [RuleName, Location, Count, GroupID], wherein the item of GroupID is an identification of the firewall rule group into which the firewall rule is clustered.
- FIG. 3 illustrates an example of obtaining firewall rule groups according to an embodiment. The example in FIG. 3 is proposed based on the process 200 in FIG. 2.
- A similarity matrix may have an exemplary format 310, wherein an item $l_{p,q}$ denotes the calculated similarity of the p-th firewall rule and the q-th firewall rule.
- A similarity matrix 312 is shown in FIG. 3, in which each item holds a calculated similarity value.
- The similarity matrix 312 may be converted to an adjacency matrix 322 by applying an exemplary similarity threshold “0.8”.
- Items with values equal to or above the similarity threshold “0.8” in the similarity matrix 312 are converted to items with value “1” in the adjacency matrix 322, items with values below the similarity threshold “0.8” are converted to items with value “0”, and diagonal items in the similarity matrix 312 are also converted to items with value “0” in the adjacency matrix 322.
- A graph representation 332 is built for the adjacency matrix 322, in which edges among nodes are set based on those items with value “1” in the adjacency matrix 322.
- The connected subgraph 342 contains three nodes corresponding to the firewall rules $R_1$, $R_2$ and $R_3$ respectively, and the connected subgraph 344 contains three nodes corresponding to the firewall rules $R_4$, $R_5$ and $R_6$ respectively.
- Two firewall rule groups are obtained. For example, Group 1 containing the firewall rules $R_1$, $R_2$ and $R_3$ may be determined based on the connected subgraph 342, and Group 2 containing the firewall rules $R_4$, $R_5$ and $R_6$ may be determined based on the connected subgraph 344.
- FIG. 4 illustrates an exemplary process 400 for automatically labeling firewall rules in a firewall rule group according to an embodiment.
- the process 400 is an exemplary implementation of the operation at 160 in FIG. 1.
- the process 400 utilizes historical or existing risk attribute labels of firewall rules for automatically labeling firewall rules in each firewall rule group.
- The historical risk attribute labels may have been attached manually, or attached automatically through a previous performance of the process 400.
- those firewall rules having been attached with historical risk attribute labels may also be referred to as historical firewall rules.
- the process 100 in FIG. 1 may be performed iteratively or repeatedly, and thus those firewall rules processed in the last iteration of the process 100 may be deemed as historical firewall rules.
- firewall rules in a target firewall rule group 402 are to be automatically labeled, wherein the target firewall rule group 402 may come from the firewall rule groups 140 in FIG. 1 or the firewall rule groups 260 in FIG. 2.
- the process 400 intends to attach the same risk attribute label to all firewall rules in the target firewall rule group 402 automatically.
- an intermediate risk attribute label may be assigned to each firewall rule in the target firewall rule group 402.
- the intermediate risk attribute label may be a historical risk attribute label.
- The assigning of the intermediate risk attribute label may be performed through a historical risk attribute label inheriting mechanism.
- FIG. 5 illustrates an exemplary process 500 of historical risk attribute label inheriting mechanism according to an embodiment.
- For a current firewall rule 502, a corresponding historical firewall rule may be identified at 510.
- For example, a historical firewall rule having the same name as the current firewall rule 502 may be identified at 510.
- Similarity of the current firewall rule 502 and the identified historical firewall rule may be calculated according to the operation 220 in FIG. 2, and the calculated similarity may be compared with a predetermined inheriting threshold.
- If the calculated similarity meets the inheriting threshold, an operation of label inheriting may be performed at 530, e.g., assigning a historical risk attribute label of the identified historical firewall rule to the current firewall rule 502 as an intermediate risk attribute label of the current firewall rule 502.
- Otherwise, the current firewall rule 502 may be labeled as unknown, wherein the “unknown” risk attribute label is an intermediate risk attribute label of the current firewall rule 502.
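The inheriting mechanism can be sketched as a lookup by rule name followed by a similarity check. In the Python sketch below, the function name `inherit_label`, the dictionary layout of the historical rules, the threshold value 0.9, the use of cosine similarity, and the "greater than or equal" comparison are all assumptions. The text only states that a calculated similarity is compared with a predetermined inheriting threshold.

```python
import math


def cos_sim(a, b):
    """Standard cosine similarity of two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def inherit_label(current, historical, inherit_threshold=0.9):
    """Return the intermediate risk attribute label of a current rule.

    current:    (rule_name, count_vector) of the current firewall rule.
    historical: dict mapping rule_name -> (count_vector, label).
    """
    name, counts = current
    match = historical.get(name)  # same-name historical rule, if any
    if match is None:
        return "unknown"
    hist_counts, hist_label = match
    if cos_sim(counts, hist_counts) >= inherit_threshold:
        return hist_label  # inherit the historical label
    return "unknown"


# Hypothetical historical rules and labels.
historical = {"SQL-In": ([20, 0, 5], "riskless")}
same = inherit_label(("SQL-In", [20, 0, 5]), historical)
drifted = inherit_label(("SQL-In", [0, 30, 0]), historical)
new_rule = inherit_label(("Brand-New-Rule", [1, 1, 1]), historical)
```

The similarity check keeps a rule from inheriting a stale label when its deployment pattern has changed since the label was attached, even though its name has not.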
- the intermediate risk attribute labels of the firewall rules in the target firewall rule group 402 may be used for determining whether to attach a unified same risk attribute label to all firewall rules in the target firewall rule group 402.
- The type of intermediate risk attribute label may be risky or riskless. Taking a ratio threshold “65%” as an example, it may be determined at 420 whether the ratio of firewall rules assigned a risky label in the target firewall rule group 402 is above 65%, or whether the ratio of firewall rules assigned a riskless label in the target firewall rule group 402 is above 65%.
- If the ratio of one type of intermediate risk attribute label is above the ratio threshold, that type of intermediate risk attribute label may be broadcast in the target firewall rule group 402 at 430, e.g., by attaching it to all firewall rules in the target firewall rule group 402. For example, assuming that the ratio of firewall rules assigned a risky label in the target firewall rule group 402 is above the ratio threshold 65%, all the firewall rules in the target firewall rule group 402 may be attached with the risky label. Then the process 400 would end at 440.
- The ratio threshold expresses the following rule: when a majority or a predetermined portion of firewall rules in a firewall rule group have the same risk attribute label, i.e., have the same risk attribute, it can be derived that all firewall rules in the firewall rule group should have that risk attribute and thus can be attached with the same risk attribute label.
- a firewall rule in the target firewall rule group 402 is originally represented as a tuple of [RuleName, Location, Count, GroupID], wherein the item of GroupID corresponds to the target firewall rule group 402.
- when the firewall rule is attached with a risk attribute label through the process 400, the firewall rule may be represented as an updated tuple of [RuleName, Location, Count, GroupID, Label], wherein the item of Label is the risk attribute label attached through the process 400.
- the current firewall rule would become a historical firewall rule, and the current Label in the tuple would become a historical risk attribute label.
- FIG. 6 illustrates a flowchart of an exemplary method 600 for risk assessment of firewall rules in a datacenter according to an embodiment.
- the datacenter may have a plurality of devices and be configured with a plurality of firewall rules.
- firewall rule configuration information of the plurality of devices may be obtained.
- a plurality of device groups formed by the plurality of devices may be identified.
- count distributions of the plurality of firewall rules over the plurality of device groups may be determined.
- the plurality of firewall rules may be clustered into a plurality of firewall rule groups based on the count distributions, firewall rules in each firewall rule group having the same risk attribute.
- the firewall rule configuration information may comprise firewall rules configured at each device.
- devices in each device group may have the same function or purpose.
- a count distribution of each firewall rule may comprise the number of devices being configured with the firewall rule in each of the plurality of device groups.
- the clustering may comprise: clustering firewall rules that have equivalent count distributions into the same firewall rule group.
- the clustering may comprise: representing a count distribution of each firewall rule as a distribution vector of the firewall rule; calculating similarity of every two firewall rules among the plurality of firewall rules with distribution vectors of the two firewall rules, to form a similarity matrix; converting the similarity matrix to an adjacency matrix by applying a similarity threshold; building a graph representation of the adjacency matrix; and extracting, from the graph representation, a plurality of connected subgraphs corresponding to the plurality of firewall rule groups respectively.
- the similarity threshold may be predetermined to cause the plurality of connected subgraphs to approximate complete graphs.
- the similarity of every two firewall rules may be calculated based on cosine similarity or maximum relative distance similarity.
- the method 600 may further comprise: attaching the same risk attribute label to all firewall rules in each firewall rule group automatically.
- a risk attribute label attached to a firewall rule may indicate whether the firewall rule is risky or riskless.
- the attaching the same risk attribute label may comprise: assigning an intermediate risk attribute label to each firewall rule in the firewall rule group through historical risk attribute label inheriting mechanism; determining whether the ratio of firewall rules being assigned with a type of intermediate risk attribute label in the firewall rule group is above a ratio threshold; and in response to determining that the ratio is above the ratio threshold, attaching the type of intermediate risk attribute label to all firewall rules in the firewall rule group.
- the historical risk attribute label inheriting mechanism may comprise: identifying a historical firewall rule having the same name as the current firewall rule; determining whether the current firewall rule and the historical firewall rule have equivalent count distributions; and in response to determining that the current firewall rule and the historical firewall rule have equivalent count distributions, assigning a historical risk attribute label of the historical firewall rule to the current firewall rule as an intermediate risk attribute label of the current firewall rule.
- the method 600 may further comprise any steps/processes for risk assessment of firewall rules in a datacenter according to the embodiments of the present disclosure as mentioned above.
- FIG. 7 illustrates an exemplary apparatus 700 for risk assessment of firewall rules in a datacenter according to an embodiment.
- the datacenter may have a plurality of devices and be configured with a plurality of firewall rules.
- the apparatus 700 may comprise: a configuration information obtaining module 710, for obtaining firewall rule configuration information of the plurality of devices; a device group identifying module 720, for identifying a plurality of device groups formed by the plurality of devices; a count distribution determining module 730, for determining count distributions of the plurality of firewall rules over the plurality of device groups; and a clustering module 740, for clustering the plurality of firewall rules into a plurality of firewall rule groups based on the count distributions, firewall rules in each firewall rule group having the same risk attribute.
- the apparatus 700 may also comprise any other modules configured for risk assessment of firewall rules in a datacenter according to the embodiments of the present disclosure as mentioned above.
- FIG. 8 illustrates an exemplary apparatus 800 for risk assessment of firewall rules in a datacenter according to an embodiment.
- the datacenter may have a plurality of devices and be configured with a plurality of firewall rules.
- the apparatus 800 may comprise: at least one processor 810; and a memory 820 storing computer-executable instructions.
- the at least one processor 810 may: obtain firewall rule configuration information of the plurality of devices; identify a plurality of device groups formed by the plurality of devices; determine count distributions of the plurality of firewall rules over the plurality of device groups; and cluster the plurality of firewall rules into a plurality of firewall rule groups based on the count distributions, firewall rules in each firewall rule group having the same risk attribute.
- a count distribution of each firewall rule may comprise the number of devices being configured with the firewall rule in each of the plurality of device groups.
- the clustering may comprise: clustering firewall rules that have equivalent count distributions into the same firewall rule group.
- the clustering may comprise: representing a count distribution of each firewall rule as a distribution vector of the firewall rule; calculating similarity of every two firewall rules among the plurality of firewall rules with distribution vectors of the two firewall rules, to form a similarity matrix; converting the similarity matrix to an adjacency matrix by applying a similarity threshold; building a graph representation of the adjacency matrix; and extracting, from the graph representation, a plurality of connected subgraphs corresponding to the plurality of firewall rule groups respectively.
- the computer-executable instructions stored in the memory 820 may be further executed to cause the at least one processor 810 to: attach the same risk attribute label to all firewall rules in each firewall rule group automatically.
- the attaching the same risk attribute label may comprise: assigning an intermediate risk attribute label to each firewall rule in the firewall rule group through historical risk attribute label inheriting mechanism; determining whether the ratio of firewall rules being assigned with a type of intermediate risk attribute label in the firewall rule group is above a ratio threshold; and in response to determining that the ratio is above the ratio threshold, attaching the type of intermediate risk attribute label to all firewall rules in the firewall rule group.
- the historical risk attribute label inheriting mechanism may comprise: identifying a historical firewall rule having the same name as the current firewall rule; determining whether the current firewall rule and the historical firewall rule have equivalent count distributions; and in response to determining that the current firewall rule and the historical firewall rule have equivalent count distributions, assigning a historical risk attribute label of the historical firewall rule to the current firewall rule as an intermediate risk attribute label of the current firewall rule.
- the at least one processor 810 may perform any other operations of the methods for risk assessment of firewall rules in a datacenter according to the embodiments of the present disclosure as mentioned above.
- the embodiments of the present disclosure propose a computer program product for risk assessment of firewall rules in a datacenter.
- the datacenter may have a plurality of devices and be configured with a plurality of firewall rules.
- the computer program product may comprise a computer program that is executed by at least one processor for: obtaining firewall rule configuration information of the plurality of devices; identifying a plurality of device groups formed by the plurality of devices; determining count distributions of the plurality of firewall rules over the plurality of device groups; and clustering the plurality of firewall rules into a plurality of firewall rule groups based on the count distributions, firewall rules in each firewall rule group having the same risk attribute.
- the computer program in the computer program product may be further executed by the at least one processor to perform any other operations of the methods for risk assessment of firewall rules in a datacenter according to the embodiments of the present disclosure as mentioned above.
- the embodiments of the present disclosure may be embodied in a non-transitory computer-readable medium.
- the non-transitory computer-readable medium may comprise instructions that, when executed, cause one or more processors to perform any operations of the methods for risk assessment of firewall rules in a datacenter according to the embodiments of the present disclosure as mentioned above.
- modules in the apparatuses described above may be implemented in various approaches. These modules may be implemented as hardware, software, or a combination thereof. Moreover, any of these modules may be further functionally divided into sub-modules or combined together.
- processors have been described in connection with various apparatuses and methods. These processors may be implemented using electronic hardware, computer software, or any combination thereof. Whether such processors are implemented as hardware or software will depend upon the particular application and overall design constraints imposed on the system.
- a processor, any portion of a processor, or any combination of processors presented in the present disclosure may be implemented with a microprocessor, microcontroller, digital signal processor (DSP) , a field-programmable gate array (FPGA) , a programmable logic device (PLD) , a state machine, gated logic, discrete hardware circuits, and other suitable processing components configured to perform the various functions described throughout the present disclosure.
- the functionality of a processor, any portion of a processor, or any combination of processors presented in the present disclosure may be
- a computer-readable medium may include, by way of example, memory such as a magnetic storage device (e.g., hard disk, floppy disk, magnetic strip) , an optical disk, a smart card, a flash memory device, random access memory (RAM) , read only memory (ROM) , programmable ROM (PROM) , erasable PROM (EPROM) , electrically erasable PROM (EEPROM) , a register, or a removable disk.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Computer Security & Cryptography (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Computer Hardware Design (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Human Computer Interaction (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present disclosure provides methods and apparatuses for risk assessment of firewall rules in a datacenter. The datacenter may have a plurality of devices and be configured with a plurality of firewall rules. Firewall rule configuration information of the plurality of devices may be obtained. A plurality of device groups formed by the plurality of devices may be identified. Count distributions of the plurality of firewall rules over the plurality of device groups may be determined. The plurality of firewall rules may be clustered into a plurality of firewall rule groups based on the count distributions, firewall rules in each firewall rule group having the same risk attribute.
Description
Firewall rules are widely deployed in datacenters for purposes such as ensuring the safety of network and data access, defending against network attacks, etc. Herein, a datacenter may broadly refer to a set of devices, or a platform consisting of a number of devices, which operates for various purposes or scenarios. Devices in datacenters may comprise various network or computing devices, which are usually referred to as, e.g., hosts. For example, the devices in datacenters may comprise databases, file servers, application servers, cloud processing units, gateways, etc.
SUMMARY
This Summary is provided to introduce a selection of concepts that are further described below in the Detailed Description. It is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Embodiments of the present disclosure propose methods and apparatuses for risk assessment of firewall rules in a datacenter. The datacenter may have a plurality of devices and be configured with a plurality of firewall rules. Firewall rule configuration information of the plurality of devices may be obtained. A plurality of device groups formed by the plurality of devices may be identified. Count distributions of the plurality of firewall rules over the plurality of device groups may be determined. The plurality of firewall rules may be clustered into a plurality of firewall rule groups based on the count distributions, firewall rules in each firewall rule group having the same risk attribute.
It should be noted that the above one or more aspects comprise the features hereinafter fully described and particularly pointed out in the claims. The following description and the drawings set forth in detail certain illustrative features of the one or more aspects. These features are only indicative of the various ways in which the principles of various aspects may be employed, and this disclosure is intended to include all such aspects and their equivalents.
The disclosed aspects will hereinafter be described in connection with the appended drawings that are provided to illustrate and not to limit the disclosed aspects.
FIG. 1 illustrates an exemplary process for performing risk assessment of firewall rules in a datacenter according to an embodiment.
FIG. 2 illustrates an exemplary process for clustering firewall rules into firewall rule groups according to an embodiment.
FIG. 3 illustrates an example of obtaining firewall rule groups according to an embodiment.
FIG. 4 illustrates an exemplary process for automatically labeling firewall rules in a firewall rule group according to an embodiment.
FIG. 5 illustrates an exemplary process of historical risk attribute label inheriting mechanism according to an embodiment.
FIG. 6 illustrates a flowchart of an exemplary method for risk assessment of firewall rules in a datacenter according to an embodiment.
FIG. 7 illustrates an exemplary apparatus for risk assessment of firewall rules in a datacenter according to an embodiment.
FIG. 8 illustrates an exemplary apparatus for risk assessment of firewall rules in a datacenter according to an embodiment.
The present disclosure will now be discussed with reference to several example implementations. It is to be understood that these implementations are discussed only for enabling those skilled in the art to better understand and thus implement the embodiments of the present disclosure, rather than suggesting any limitations on the scope of the present disclosure.
Usually, a large number of firewall rules will be deployed in a datacenter. Besides normal or reasonably-configured firewall rules, the deployed firewall rules may also include some risky firewall rules. Herein, the risky firewall rules broadly refer to those firewall rules that may hinder normal operations of the datacenter, provide very limited or no protection for the datacenter, pose threats to network connections or data in the datacenter, etc. For example, the risky firewall rules may comprise invalid or misconfigured firewall rules, casually-applied or locally-applied firewall rules that are deployed at a limited number of devices, malicious firewall rules that are deployed in the datacenter by illegitimate entities, etc. Therefore, there is a need to identify risky firewall rules among all firewall rules configured in the datacenter. Existing approaches of identifying risky firewall rules rely on manual checking of firewall rules, e.g., manually checking one by one whether the deployed firewall rules are risky. However, since the datacenter may be configured with a very large number of firewall rules, these manual approaches are very inefficient, and can hardly check all the firewall rules configured in the datacenter in a timely manner. Moreover, some devices or services may require adding, removing or updating relevant firewall rules frequently, which would further increase the difficulty of checking firewall rules manually.
Embodiments of the present disclosure propose schemes that facilitate risk assessment of firewall rules in a datacenter. Herein, risk assessment of a firewall rule may refer to determining a risk attribute of the firewall rule, e.g., determining whether the firewall rule is risky or riskless, determining a risky level of the firewall rule, etc.
The embodiments of the present disclosure propose to cluster firewall rules into firewall rule groups based on count distributions of the firewall rules over device groups in the datacenter. It is assumed that most firewall rules in the datacenter are reasonably configured, and that a group of firewall rules should serve the same purpose. For example, firewall rules “Remote Desktop - Shadow (TCP-In)”, “Remote Desktop - User Mode (TCP-In)”, “Remote Desktop - User Mode (UDP-In)”, etc. serve the same purpose of “Remote Desktop”. Those firewall rules serving the same purpose would have equivalent count distributions in the datacenter, and thus each of the firewall rule groups obtained by using count distributions would comprise a plurality of firewall rules that serve the same purpose and have the same risk attribute. In other words, if a firewall rule has a certain risk attribute, other firewall rules having equivalent count distributions, e.g., other firewall rules in the same firewall rule group as the firewall rule, would also have such risk attribute. Therefore, with the firewall rule groups obtained through the embodiments of the present disclosure, risk assessment may be performed efficiently, e.g., the same risk attribute label may be attached to all firewall rules in one firewall rule group. In an aspect, if manual labeling is adopted, it is only necessary to manually determine a risk attribute label of one or a few firewall rules in one firewall rule group, and the determined label can be attached to all firewall rules in the firewall rule group, without the need to check all the firewall rules in the firewall rule group. In an aspect, the embodiments of the present disclosure may automatically attach risk attribute labels to firewall rules in the firewall rule groups by, e.g., utilizing historical risk attribute labels.
Through clustering firewall rules into firewall rule groups, the embodiments of the present disclosure may facilitate efficient, precise and automatic risk assessment of firewall rules. All firewall rules in a datacenter may be assessed in an efficient and time-saving approach. Moreover, benefiting from the high efficiency of risk assessment according to the embodiments, it would be easier to achieve centralized control of firewall rules in a datacenter. Risky firewall rules can be identified in a timely manner. The processes proposed by the embodiments of the present disclosure may be performed automatically, either periodically or in response to any type of triggering event, and thus firewall rules in a datacenter can be monitored continuously.
FIG. 1 illustrates an exemplary process 100 for performing risk assessment of firewall rules in a datacenter according to an embodiment. It is assumed that a datacenter 102 is configured with a plurality of firewall rules, and risk attributes of these firewall rules are to be assessed. The datacenter 102 may comprise a plurality of devices, e.g., network or computing devices.
Firewall rule configuration information 112 of the devices in the datacenter 102 may be obtained. The firewall rule configuration information 112 may comprise firewall rules configured at each device. For example, assuming that there are 25 firewall rules configured at a certain database, firewall rule configuration information of this database may comprise or indicate names of the 25 firewall rules.
A plurality of device groups 114 formed by the devices in the datacenter 102 may be identified. Usually, devices in a datacenter may be divided into a plurality of device groups based on functions or purposes, and devices in each device group may have the same function or purpose. For example, there may be 10 devices for providing a certain file access service in a datacenter, and thus these 10 devices may be placed into one device group. Since devices in one device group have the same function or purpose, these devices may also have the same or similar requirements of firewall rules. Accordingly, it is very likely that firewall rules serving the same purpose are also configured at the devices in the same device group. It should be understood that the identifying of device groups may refer to determining the device groups from all devices in the datacenter 102 based on various predetermined functions or purposes, or receiving indications of device groups that have been previously identified by any approach.
At 120, count distributions of the plurality of firewall rules configured in the datacenter 102 over the plurality of device groups 114 may be determined. A count distribution of each firewall rule may comprise the number of devices being configured with the firewall rule in each of the plurality of device groups. It is assumed that, for a firewall rule $R_i$, this firewall rule is configured at 10 devices out of the total 25 devices in device group 1, configured at 8 devices out of the total 12 devices in device group 2, configured at 15 devices out of the total 30 devices in device group 3, etc. Then, a count distribution of the firewall rule $R_i$ may indicate that this firewall rule is configured at 10 devices in device group 1, configured at 8 devices in device group 2, configured at 15 devices in device group 3, etc. The firewall rule configuration information 112 and the identified device groups 114 may be used together for determining a count distribution of each firewall rule.
In an implementation, the firewall rules in the datacenter 102 may be represented as a list of tuples, each tuple corresponding to a firewall rule. An exemplary format of a tuple may be [RuleName, Location, Count] , wherein the item of RuleName is the name of a firewall rule represented by this tuple, and the items of Location and Count are used for representing information about a count distribution of this firewall rule. For example, the item of Location may be a vector listing all the device groups 114 in the datacenter, and the item of Count may be a vector listing the number of devices being configured with this firewall rule in each of the plurality of device groups 114. It should be understood that the above representations of the firewall rules are exemplary, and the firewall rules are not limited to be represented in this approach.
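The tuple representation above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the input dictionaries `rules_by_device` and `group_of_device`, the device names, the group names and the rule "SQL Inbound" are hypothetical and do not come from the disclosure, and a device is assumed to count at most once per rule.

```python
from collections import Counter

# Hypothetical inputs: rule names configured at each device, and the
# device group each device belongs to.
rules_by_device = {
    "db-01": ["Remote Desktop - User Mode (TCP-In)", "SQL Inbound"],
    "db-02": ["Remote Desktop - User Mode (TCP-In)", "SQL Inbound"],
    "fs-01": ["Remote Desktop - User Mode (TCP-In)"],
}
group_of_device = {"db-01": "database", "db-02": "database", "fs-01": "file-server"}


def count_distributions(rules_by_device, group_of_device):
    """Return one (RuleName, Location, Count) tuple per firewall rule."""
    # Location: all device groups, in a fixed order shared by every rule.
    locations = sorted(set(group_of_device.values()))
    counts = Counter()
    for device, rule_names in rules_by_device.items():
        group = group_of_device[device]
        for name in set(rule_names):  # assumption: one count per device per rule
            counts[(name, group)] += 1
    all_rules = sorted({name for names in rules_by_device.values() for name in names})
    # Count: number of devices configured with the rule in each device group.
    return [
        (name, locations, [counts[(name, group)] for group in locations])
        for name in all_rules
    ]


for rule_tuple in count_distributions(rules_by_device, group_of_device):
    print(rule_tuple)
```

Keeping Location identical for every rule means the Count vectors of any two rules can be compared position by position.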
At 130, the plurality of firewall rules in the datacenter 102 may be clustered into a plurality of firewall rule groups 140 based on the count distributions determined at 120. In an implementation, the clustering operation at 130 may intend to cluster firewall rules that have equivalent count distributions into the same firewall rule group. Herein, the equivalent count distributions may refer to the same or similar count distributions. Firewall rules in each of the firewall rule groups 140 would have the same risk attribute. In an implementation, an unsupervised learning algorithm may be adopted for performing the clustering operation at 130, which will be discussed in detail in connection with FIG. 2 and FIG. 3 later.
The firewall rule groups obtained through the process 100 would greatly improve efficiency in the risk assessment of firewall rules. For example, since firewall rules in each firewall rule group have the same risk attribute, risk attribute labels of these firewall rules may be unified to the same risk attribute label. Accordingly, there is no need to check all firewall rules in one firewall rule group.
Optionally, additional operations of attaching risk attribute labels to firewall rules may be further performed based at least on the firewall rule groups 140.
In an implementation, manual labeling may be performed at 150. For each firewall rule group among the firewall rule groups 140, a risk attribute label of one firewall rule in this firewall rule group may be manually determined, and the determined label may be further attached to all firewall rules in this firewall rule group. Alternatively, risk attribute labels of a few firewall rules in this firewall rule group may first be manually determined, and then a representative or combined risk attribute label may be selected from these labels and attached to all firewall rules in this firewall rule group.
In another implementation, automatic labeling may be performed at 160. Assuming that at least a part or all of the firewall rules in the datacenter have been attached with risk attribute labels before the process 100 is performed, these existing risk attribute labels may be used as historical risk attribute labels and applied at 160 for automatically determining and attaching risk attribute labels to firewall rules in the firewall rule groups 140. The operation of automatic labeling at 160 will be discussed in detail in connection with FIG. 4 and FIG. 5 later.
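As a rough illustration of how historical labels might drive automatic labeling, the sketch below follows the steps listed for processes 400 and 500: a rule inherits the label of a same-named historical rule when their similarity reaches an inheriting threshold, otherwise it is labeled "unknown", and a label is broadcast to the whole group when its ratio exceeds a ratio threshold (65% in the example given for process 400). The function names, the `history` mapping, and the handling of a rule with no historical counterpart or a group with no dominant label are assumptions made for this sketch.

```python
def intermediate_label(name, vector, history, similarity, inherit_threshold):
    """Inherit a historical label, or fall back to "unknown".

    history maps a rule name to (historical distribution vector, historical label).
    similarity is any function scoring two distribution vectors.
    """
    if name not in history:
        return "unknown"  # assumption: no same-named historical rule
    old_vector, old_label = history[name]
    if similarity(vector, old_vector) >= inherit_threshold:
        return old_label
    return "unknown"


def unified_label(intermediate_labels, ratio_threshold=0.65):
    """Return the label to broadcast to the group, or None if none dominates."""
    total = len(intermediate_labels)
    for label in ("risky", "riskless"):
        if total and intermediate_labels.count(label) / total > ratio_threshold:
            return label
    return None  # assumption: the group is left for other handling


# Hypothetical usage with an exact-match similarity function.
history = {"rule-a": ([10, 8, 15], "risky")}
exact = lambda u, v: 1.0 if u == v else 0.0
labels = [
    intermediate_label("rule-a", [10, 8, 15], history, exact, 0.9),
    intermediate_label("rule-b", [10, 8, 15], history, exact, 0.9),
]
print(labels, unified_label(labels))
```

In this small usage example only half of the group carries the risky label, so no unified label is returned.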
It should be understood that the process 100 may be performed periodically according to a predefined period. The predefined period may be any type of period, e.g., per day, per week, etc. Alternatively, the process 100 may be performed in response to any type of triggering event. The triggering events may be any type of predefined events, e.g., device failure, system failure, network attacks, new service or device deployment, service updating, predetermined time points, etc. By performing the process 100 periodically or in response to triggering events, the firewall rules in the datacenter may be monitored continuously, and thus risky firewall rules can be identified in a timely manner through performing risk assessment on the firewall rules.
It should be understood that risk attributes of firewall rules may be classified in different approaches. In one case, risk attributes of firewall rules may be classified as risky or riskless. Accordingly, a risk attribute label attached to a firewall rule may indicate whether the firewall rule is risky or riskless. In one case, risk attributes of firewall rules may be classified into various risky levels, e.g., high risk, low risk, riskless, etc. Accordingly, a risk attribute label attached to a firewall rule may indicate a certain risky level. The embodiments of the present disclosure are not restricted by any specific classification approaches of risk attributes.
FIG. 2 illustrates an exemplary process 200 for clustering firewall rules into firewall rule groups according to an embodiment. The process 200 is an exemplary implementation of the clustering operation at 130 in FIG. 1. An unsupervised learning algorithm for clustering firewall rules is discussed in connection with operations in the process 200. The unsupervised learning algorithm intends to cluster firewall rules that have equivalent count distributions into the same firewall rule group.
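The overall shape of such a clustering step can be sketched as follows, based on the steps listed for the method: a similarity matrix is converted to an adjacency matrix by applying a similarity threshold, and connected subgraphs of the resulting graph are taken as firewall rule groups. This is a minimal sketch rather than the disclosed implementation; the function name `cluster_by_threshold`, the use of `>=` for the threshold comparison, and the example threshold value are assumptions.

```python
def cluster_by_threshold(sim, threshold):
    """Group rule indices by connected components of the thresholded graph.

    sim is an m x m similarity matrix given as a list of lists.
    """
    m = len(sim)
    # Adjacency matrix: an edge wherever the similarity reaches the threshold.
    adj = [[i != j and sim[i][j] >= threshold for j in range(m)] for i in range(m)]
    seen, groups = set(), []
    for start in range(m):
        if start in seen:
            continue
        # Depth-first search collects one connected subgraph.
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in range(m):
                if adj[node][neighbor] and neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        groups.append(sorted(component))
    return groups


# Hypothetical similarity matrix for three firewall rules.
sim = [
    [1.0, 0.99, 0.10],
    [0.99, 1.0, 0.20],
    [0.10, 0.20, 1.0],
]
print(cluster_by_threshold(sim, 0.95))
```

A higher threshold tends to yield smaller, more tightly connected groups, while a lower one can chain loosely similar rules into a single group.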
It is assumed that count distributions 202 of firewall rules in a datacenter have been obtained through, e.g., the operation at 120 in FIG. 1.
At 210, distribution vector representing may be performed for the firewall rules based on the count distributions 202. For example, a count distribution of each firewall rule may be represented as a distribution vector of the firewall rule. The distribution vector may be in a format of, e.g., $\vec{R}_i = [C_{i,1}, C_{i,2}, \ldots, C_{i,n}]$, in which $\vec{R}_i$ denotes the distribution vector of the i-th firewall rule $R_i$, n corresponds to the number of device groups, and $C_{i,j}$ (j=1, …, n) denotes the count of $R_i$ in the j-th device group, i.e., the number of devices being configured with $R_i$ in the j-th device group. Taking a distribution vector of $\vec{R}_3 = [200, 150, \ldots, 300]$ as an example, $\vec{R}_3$ denotes the distribution vector of the third firewall rule $R_3$, which indicates that there are 200 devices being configured with $R_3$ in the first device group, there are 150 devices being configured with $R_3$ in the second device group, …, there are 300 devices being configured with $R_3$ in the n-th device group. Through the operation at 210, distribution vectors 212 of the firewall rules in the datacenter can be obtained.
At 220, similarity calculation may be performed among the firewall rules based on the distribution vectors 212. For example, similarity of every two firewall rules among the firewall rules may be calculated with distribution vectors of the two firewall rules, and a similarity matrix 222 may be formed with the calculated similarities. Various approaches may be adopted for calculating similarity of two firewall rules.
In an implementation, cosine similarity may be adopted for calculating similarity of two firewall rules. For example, assuming that $\vec{R}_1$ is a distribution vector of the firewall rule $R_1$ and $\vec{R}_2$ is a distribution vector of the firewall rule $R_2$, similarity of $R_1$ and $R_2$ may be calculated as:
$$\mathrm{cos\_sim}(\vec{R}_1, \vec{R}_2) = \frac{\vec{R}_1 \cdot \vec{R}_2}{\|\vec{R}_1\| \, \|\vec{R}_2\|}$$
wherein cos_sim (·) denotes a cosine similarity function.
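As an illustration only, the cosine similarity of two distribution vectors may be computed as in the following Python sketch. The handling of an all-zero vector (returning 0.0) is an assumption made here to avoid division by zero; the disclosure does not address that case.

```python
import math

def cos_sim(u, v):
    """Cosine similarity of two equal-length distribution vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption: a rule configured on no device is treated as dissimilar.
        return 0.0
    return dot / (norm_u * norm_v)

# Proportional count distributions point in the same direction.
print(cos_sim([200, 150, 300], [400, 300, 600]))
# Rules deployed on disjoint device groups are orthogonal.
print(cos_sim([100, 0], [0, 100]))
```

Cosine similarity depends only on the direction of the vectors, so two rules whose counts differ by a constant factor across all device groups are treated as highly similar.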
In an implementation, maximum relative distance similarity may be adopted for calculating similarity of two firewall rules. For example, assuming that $\vec{R}_1$ is a distribution vector of the firewall rule $R_1$ and $\vec{R}_2$ is a distribution vector of the firewall rule $R_2$, similarity of $R_1$ and $R_2$ may be calculated as $\mathrm{max\_sim}(\vec{R}_1, \vec{R}_2)$, wherein max_sim (·) denotes a maximum relative distance similarity function that is defined in terms of min (·), a minimum value extraction function, truncate (·), a truncating function, and abs (·), which denotes taking an absolute value.
The maximum relative distance similarity is proposed by the embodiments of the present disclosure to provide sharper function shape than that of the cosine similarity. As compared with the cosine similarity, the maximum relative distance similarity may achieve higher performance in the process of clustering firewall rules, because it can pay more attention to distribution characteristics of count distributions indicated in distribution vectors.
It should be understood that the similarity calculation at 220 may adopt either or both of the cosine similarity and the maximum relative distance similarity, or any other approaches capable of calculating similarity of two firewall rules.
The similarity matrix 222 may be formed with the calculated similarities of every two firewall rules among all the firewall rules. The similarity matrix 222 may be an m × m matrix, wherein m is the number of firewall rules in the datacenter. Items in the similarity matrix 222 may be denoted as $l_{p,q}$, which is the calculated similarity of the p-th firewall rule and the q-th firewall rule.
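A minimal sketch of forming such an m × m similarity matrix is given below. It assumes a symmetric similarity function, so only the upper triangle is computed and mirrored; the names `similarity_matrix` and `cosine` are hypothetical, and any other similarity function could be passed in instead.

```python
import math

def cosine(u, v):
    """Cosine similarity, used here only as an example similarity function."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms if norms else 0.0

def similarity_matrix(vectors, sim):
    """Return the m x m matrix whose item [p][q] is sim(vectors[p], vectors[q])."""
    m = len(vectors)
    matrix = [[0.0] * m for _ in range(m)]
    for p in range(m):
        for q in range(p, m):
            value = sim(vectors[p], vectors[q])
            matrix[p][q] = value
            matrix[q][p] = value  # assumption: sim is symmetric
    return matrix

matrix = similarity_matrix([[1, 0], [0, 1], [2, 0]], cosine)
for row in matrix:
    print(row)
```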
At 230, matrix conversion may be performed to the similarity matrix 222. For example, the similarity matrix 222 may be converted to an adjacency matrix 232 by applying a similarity threshold. Those items with values equal to or above the similarity threshold in the similarity matrix 222 may be converted to items with value “1” in the adjacency matrix 232, while those items with values below the similarity threshold in the similarity matrix 222 may be converted to items with value “0” in the adjacency matrix 232. Alternatively, diagonal items in the similarity matrix 222 may also be converted to items with value “0” in the adjacency matrix 232. The similarity threshold may be predetermined empirically or experimentally. A higher similarity threshold would ensure that firewall rules in one group can have high similarity with each other, but may cause fewer firewall rules to be included in one group. A lower similarity threshold would cause a group to include more firewall rules, but may cluster risky firewall rules and riskless firewall rules into one group.
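The conversion at 230 may be sketched as follows. The function name `to_adjacency` and the `zero_diagonal` flag are hypothetical; the flag models the option of also converting diagonal items to "0".

```python
def to_adjacency(similarity, threshold, zero_diagonal=True):
    """Convert a similarity matrix to a 0/1 adjacency matrix.

    Items equal to or above the threshold become 1, all others become 0.
    With zero_diagonal=True, diagonal items are forced to 0 so that the
    resulting graph has no self-loops.
    """
    m = len(similarity)
    return [
        [
            0 if (zero_diagonal and p == q) else int(similarity[p][q] >= threshold)
            for q in range(m)
        ]
        for p in range(m)
    ]

# Hypothetical similarity values for three firewall rules.
sim = [
    [1.0, 0.85, 0.3],
    [0.85, 1.0, 0.8],
    [0.3, 0.8, 1.0],
]
adjacency = to_adjacency(sim, threshold=0.8)
print(adjacency)
```

Raising the threshold in this sketch removes edges, which tends to split rules into smaller groups; lowering it adds edges and merges groups.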
At 240, graph building may be performed with the adjacency matrix 232. For example, a graph representation 242 of the adjacency matrix 232 may be built at 240. Value “1” in the adjacency matrix 232 indicates that there is an edge between two nodes in the graph representation 242, wherein the two nodes correspond to two firewall rules.
At 250, subgraph extraction may be performed to the graph representation 242. For example, a plurality of connected subgraphs 252 may be extracted from the graph representation 242. Each connected subgraph may comprise a plurality of nodes that have high similarity with each other. Therefore, the plurality of connected subgraphs 252 may correspond to a plurality of firewall rule groups 260 respectively. It should be understood that it is possible that a connected subgraph contains only one node, which indicates that similarities between a firewall rule corresponding to this node and any other firewall rules are below the similarity threshold, and accordingly this firewall rule itself will form a firewall rule group.
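One possible way to extract connected subgraphs from the adjacency matrix is a breadth-first search, sketched below under the assumption that the matrix is symmetric. The disclosure does not prescribe a particular graph traversal; the function name `connected_subgraphs` is hypothetical.

```python
from collections import deque

def connected_subgraphs(adjacency):
    """Return the connected components of an undirected graph as lists of node indices."""
    m = len(adjacency)
    seen = [False] * m
    groups = []
    for start in range(m):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in range(m):
                if adjacency[node][neighbor] == 1 and not seen[neighbor]:
                    seen[neighbor] = True
                    queue.append(neighbor)
        groups.append(sorted(component))
    return groups

# Nodes 0 and 1 share an edge; nodes 2 and 3 are isolated.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
print(connected_subgraphs(adjacency))
```

An isolated node yields a component of size one, which corresponds to a firewall rule forming a firewall rule group by itself.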
Through the process 200, firewall rules in a datacenter may be clustered into a plurality of firewall rule groups based on the count distributions of the firewall rules. It should be understood that all the operations in the process 200 are exemplary, and the embodiments of the present disclosure may cover any other approaches or processes that are capable of clustering firewall rules based on count distributions. Moreover, it should be understood that, in a complete graph, every two nodes are directly connected, i.e., every two firewall rules have a similarity equal to or above the similarity threshold, which is not guaranteed in other types of connected graph. Since complete graphs accordingly lead to a higher precision of clustering, the similarity threshold may also be predetermined to cause the plurality of connected subgraphs to approximate complete graphs. For example, the similarity threshold may be selected for causing the extracted connected subgraphs 252 to be complete graphs or as close to complete graphs as possible.
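How closely a connected subgraph approximates a complete graph can be quantified, for example, by its edge density. The following sketch is one hypothetical way to measure this when tuning the similarity threshold; the density measure and the function name `edge_density` are assumptions and are not part of the disclosure.

```python
def edge_density(component, adjacency):
    """Fraction of node pairs in the component that are directly connected.

    A value of 1.0 means the component is a complete graph. Components with
    fewer than two nodes are treated as complete by convention.
    """
    k = len(component)
    if k < 2:
        return 1.0
    edges = sum(
        adjacency[p][q]
        for i, p in enumerate(component)
        for q in component[i + 1:]
    )
    return edges / (k * (k - 1) / 2)

triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]  # complete graph on three nodes
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]      # connected, but nodes 0 and 2 are not adjacent
print(edge_density([0, 1, 2], triangle))
print(edge_density([0, 1, 2], path))
```

In the path example, the first and third rules end up in the same group only through the middle rule, which is the situation a threshold chosen to approximate complete graphs is meant to avoid.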
Assuming that a firewall rule is originally represented as a tuple of [RuleName, Location, Count] , when the firewall rule is clustered into a certain firewall rule group through the process 200, the firewall rule may be represented as an updated tuple of [RuleName, Location, Count, GroupID] , wherein the item of GroupID is an identification of the firewall rule group into which the firewall rule is clustered.
FIG. 3 illustrates an example of obtaining firewall rule groups according to an embodiment. The example in FIG. 3 is proposed based on the process 200 in FIG. 2.
For the sake of simplicity, it is assumed that there are six firewall rules in total in a datacenter on which a clustering process is to be performed, including $R_1$, $R_2$, $R_3$, $R_4$, $R_5$ and $R_6$. A similarity matrix may have an exemplary format 310, wherein an item $l_{p,q}$ denotes the calculated similarity of the p-th firewall rule and the q-th firewall rule. As an example, a similarity matrix 312 is shown in FIG. 3, in which each item is filled with a value of calculated similarity.
According to the operation 230 in FIG. 2, the similarity matrix 312 may be converted to an adjacency matrix 322 through applying an exemplary similarity threshold “0.8” . Those items with values equal to or above the similarity threshold “0.8” in the similarity matrix 312 are converted to items with value “1” in the adjacency matrix 322, those items with values below the similarity threshold “0.8” in the similarity matrix 312 are converted to items with value “0” in the adjacency matrix 322, and diagonal items in the similarity matrix 312 are converted to items with value “0” in the adjacency matrix 322.
According to the operation 240 in FIG. 2, a graph representation 332 is built for the adjacency matrix 322, in which edges among nodes are set based on those items with value “1” in the adjacency matrix 322.
According to the operation 250 in FIG. 2, two connected subgraphs 342 and 344 are extracted from the graph representation 332. The connected subgraph 342 contains three nodes corresponding to the firewall rules $R_1$, $R_2$ and $R_3$ respectively, and the connected subgraph 344 contains three nodes corresponding to the firewall rules $R_4$, $R_5$ and $R_6$ respectively.
Based on the extracted connected subgraphs 342 and 344, two firewall rule groups are obtained. For example, Group 1 containing the firewall rules $R_1$, $R_2$ and $R_3$ may be determined based on the connected subgraph 342, and Group 2 containing the firewall rules $R_4$, $R_5$ and $R_6$ may be determined based on the connected subgraph 344.
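The grouping in this example can be reproduced with a short Python sketch. The similarity values below are hypothetical, since the actual values of the similarity matrix 312 appear only in FIG. 3; they are chosen so that a threshold of 0.8 yields the same two groups. The sketch uses a union-find structure, which is an alternative to explicit graph building and is not prescribed by the disclosure.

```python
RULES = ["R1", "R2", "R3", "R4", "R5", "R6"]

# Hypothetical similarity values, not the values shown in FIG. 3.
SIM = [
    [1.00, 0.90, 0.85, 0.10, 0.20, 0.10],
    [0.90, 1.00, 0.95, 0.20, 0.10, 0.30],
    [0.85, 0.95, 1.00, 0.30, 0.20, 0.10],
    [0.10, 0.20, 0.30, 1.00, 0.90, 0.70],
    [0.20, 0.10, 0.20, 0.90, 1.00, 0.88],
    [0.10, 0.30, 0.10, 0.70, 0.88, 1.00],
]

def cluster(sim, threshold):
    """Group node indices whose pairwise similarity chains reach the threshold."""
    parent = list(range(len(sim)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for p in range(len(sim)):
        for q in range(p + 1, len(sim)):
            if sim[p][q] >= threshold:
                parent[find(q)] = find(p)

    groups = {}
    for node in range(len(sim)):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())

groups = [[RULES[i] for i in group] for group in cluster(SIM, 0.8)]
print(groups)
```

With these hypothetical values, the similarity of R4 and R6 (0.70) is below the threshold, yet both rules land in the same group because each is similar to R5; the second group is therefore connected but not a complete graph.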
FIG. 4 illustrates an exemplary process 400 for automatically labeling firewall rules in a firewall rule group according to an embodiment. The process 400 is an exemplary implementation of the operation at 160 in FIG. 1. The process 400 utilizes historical or existing risk attribute labels of firewall rules for automatically labeling firewall rules in each firewall rule group. The historical risk attribute labels may be manually labelled previously or automatically labeled through performing the process 400 previously. In FIG. 4, those firewall rules having been attached with historical risk attribute labels may also be referred to as historical firewall rules. For example, the process 100 in FIG. 1 may be performed iteratively or repeatedly, and thus those firewall rules processed in the last iteration of the process 100 may be deemed as historical firewall rules.
It is assumed that firewall rules in a target firewall rule group 402 are to be automatically labeled, wherein the target firewall rule group 402 may come from the firewall rule groups 140 in FIG. 1 or the firewall rule groups 260 in FIG. 2. The process 400 intends to attach the same risk attribute label to all firewall rules in the target firewall rule group 402 automatically.
At 410, an intermediate risk attribute label may be assigned to each firewall rule in the target firewall rule group 402. The intermediate risk attribute label may be a historical risk attribute label. In an implementation, the assigning of intermediate risk attribute label may be performed through historical risk attribute label inheriting mechanism. FIG. 5 illustrates an exemplary process 500 of historical risk attribute label inheriting mechanism according to an embodiment.
For the current firewall rule 502 in the target firewall rule group 402, a corresponding historical firewall rule may be identified at 510. In an implementation, a historical firewall rule having the same name as the current firewall rule 502 may be identified at 510.
At 520, it is determined whether the current firewall rule 502 and the identified historical firewall rule have equivalent count distributions. In an implementation, similarity of the current firewall rule 502 and the identified historical firewall rule may be calculated according to the operation 220 in FIG. 2, and the calculated similarity may be compared with a predetermined inheriting threshold.
In response to determining at 520 that the current firewall rule 502 and the identified historical firewall rule have equivalent count distributions, e.g., the calculated similarity is equal to or above the inheriting threshold, an operation of label inheriting may be performed at 530, e.g., assigning a historical risk attribute label of the identified historical firewall rule to the current firewall rule 502 as an intermediate risk attribute label of the current firewall rule 502.
In response to determining at 520 that the current firewall rule 502 and the identified historical firewall rule do not have equivalent count distributions, e.g., the calculated similarity is below the inheriting threshold, the current firewall rule 502 may be labeled as unknown, wherein the “unknown” risk attribute label is an intermediate risk attribute label of the current firewall rule 502.
Through performing the process 500 for each firewall rule in the target firewall rule group 402, all the firewall rules in the target firewall rule group 402 would be assigned with respective intermediate risk attribute labels.
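The inheriting mechanism of the process 500 may be sketched as follows. The sketch assumes that historical firewall rules are stored in a dictionary keyed by rule name, that cosine similarity is used at 520, and that a rule without any same-named historical rule is labeled as unknown; the last point, the threshold value 0.9, and all names are assumptions, as the disclosure does not specify them.

```python
import math

def cos_sim(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms if norms else 0.0

def inherit_label(rule_name, vector, history, inheriting_threshold=0.9):
    """Return the intermediate risk attribute label of the current firewall rule.

    history maps a rule name to (historical_vector, historical_label).
    """
    if rule_name not in history:
        return "unknown"  # assumption: no same-named historical rule was found
    historical_vector, historical_label = history[rule_name]
    if cos_sim(vector, historical_vector) >= inheriting_threshold:
        return historical_label  # equivalent count distributions: inherit
    return "unknown"

history = {"R1": ([200, 150, 300], "risky")}
print(inherit_label("R1", [210, 140, 310], history))  # similar distribution
print(inherit_label("R1", [0, 0, 500], history))      # distribution has changed
print(inherit_label("R9", [1, 2, 3], history))        # no historical rule
```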
Returning to FIG. 4, the intermediate risk attribute labels of the firewall rules in the target firewall rule group 402 may be used for determining whether to attach a unified risk attribute label to all firewall rules in the target firewall rule group 402. In an implementation, at 420, it is determined whether the ratio of firewall rules being assigned with a type of intermediate risk attribute label in the target firewall rule group 402 is above a ratio threshold. Assuming that the type of intermediate risk attribute label may be risky or riskless, and taking a ratio threshold “65%” as an example, it may be determined at 420 whether the ratio of firewall rules being assigned with a risky label in the target firewall rule group 402 is above 65%, or whether the ratio of firewall rules being assigned with a riskless label in the target firewall rule group 402 is above 65%.
In response to determining at 420 that the ratio of firewall rules being assigned with a type of intermediate risk attribute label in the firewall rule group 402 is above the ratio threshold, the type of intermediate risk attribute label may be broadcasted in the target firewall rule group 402 at 430, e.g., attaching the type of intermediate risk attribute label to all firewall rules in the target firewall rule group 402. For example, assuming that the ratio of firewall rules being assigned with a risky label in the target firewall rule group 402 is above the ratio threshold 65%, all the firewall rules in the target firewall rule group 402 may be attached with the risky label. Then the process 400 would end at 440.
In response to determining at 420 that the ratio of firewall rules being assigned with any type of intermediate risk attribute label in the firewall rule group 402 is not above the ratio threshold, no intermediate risk attribute label would be broadcasted in the target firewall rule group 402, and the process 400 would end at 440 directly.
It should be understood that, in FIG. 4, the ratio threshold may be used for controlling that: when a majority or a predetermined portion of firewall rules in a firewall rule group have the same risk attribute label, i.e., have the same risk attribute, it can be derived that all firewall rules in the firewall rule group should have such risk attribute and thus can be attached with the same risk attribute label.
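The decision at 420 and the broadcasting at 430 may be sketched as below. The sketch assumes that the "unknown" label is never broadcast and that, when no label exceeds the ratio threshold, each rule keeps its intermediate label; both points and the function name `broadcast_label` are assumptions made for illustration.

```python
from collections import Counter

def broadcast_label(intermediate_labels, ratio_threshold=0.65):
    """Attach one label to the whole group if its ratio is above the threshold."""
    total = len(intermediate_labels)
    # Assumption: only known labels (e.g., risky or riskless) can be broadcast.
    counts = Counter(label for label in intermediate_labels if label != "unknown")
    for label, count in counts.most_common():
        if count / total > ratio_threshold:
            return [label] * total
    return list(intermediate_labels)

# 3 of 4 rules (75%) are risky, which is above 65%, so the label is broadcast.
print(broadcast_label(["risky", "risky", "risky", "unknown"]))
# No label exceeds 65%, so nothing is broadcast.
print(broadcast_label(["risky", "riskless", "unknown"]))
```

With a ratio threshold above 50%, at most one label type can satisfy the condition, so the outcome does not depend on the order in which label types are checked.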
Assuming that a firewall rule in the target firewall rule group 402 is originally represented as a tuple of [RuleName, Location, Count, GroupID] , wherein the item of GroupID corresponds to the target firewall rule group 402, when the firewall rule is attached with a risk attribute label through the process 400, the firewall rule may be represented as an updated tuple of [RuleName, Location, Count, GroupID, Label] , wherein the item of Label is the risk attribute label attached through the process 400. It should be understood that, for a next iteration of the process 100 in FIG. 1 together with the process 400 in FIG. 4, the current firewall rule would become a historical firewall rule, and the current Label in the tuple would become a historical risk attribute label.
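The progressive extension of the tuple may be modeled, for instance, with an immutable record whose later fields are optional. The class name `RuleRecord` and the field types are hypothetical; in particular, the disclosure does not state the types of Location, GroupID, or Label.

```python
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class RuleRecord:
    rule_name: str
    location: str
    count: int
    group_id: Optional[int] = None  # filled in after clustering
    label: Optional[str] = None     # filled in after automatic labeling

record = RuleRecord("R1", "web", 200)        # [RuleName, Location, Count]
clustered = replace(record, group_id=1)      # [..., GroupID]
labeled = replace(clustered, label="risky")  # [..., GroupID, Label]
print(labeled)
```

In a next iteration, a record such as `labeled` would serve as the historical firewall rule whose label may be inherited.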
FIG. 6 illustrates a flowchart of an exemplary method 600 for risk assessment of firewall rules in a datacenter according to an embodiment. The datacenter may have a plurality of devices and be configured with a plurality of firewall rules.
At 610, firewall rule configuration information of the plurality of devices may be obtained.
At 620, a plurality of device groups formed by the plurality of devices may be identified.
At 630, count distributions of the plurality of firewall rules over the plurality of device groups may be determined.
At 640, the plurality of firewall rules may be clustered into a plurality of firewall rule groups based on the count distributions, firewall rules in each firewall rule group having the same risk attribute.
In an implementation, the firewall rule configuration information may comprise firewall rules configured at each device.
In an implementation, devices in each device group may have the same function or purpose.
In an implementation, a count distribution of each firewall rule may comprise the number of devices being configured with the firewall rule in each of the plurality of device groups.
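As a minimal sketch of how such count distributions could be derived, the following Python code assumes that the firewall rule configuration information is available as a mapping from device to configured rules and that each device belongs to exactly one device group. The function name, the mappings, and the device and group names are hypothetical.

```python
from collections import defaultdict

def count_distributions(device_rules, device_group):
    """Count, per firewall rule, the devices configured with it in each device group.

    device_rules maps a device to the rule names configured at it;
    device_group maps a device to its device group.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for device, rules in device_rules.items():
        group = device_group[device]
        for rule in set(rules):  # count each device at most once per rule
            counts[rule][group] += 1
    return {rule: dict(per_group) for rule, per_group in counts.items()}

device_rules = {"d1": {"R1", "R2"}, "d2": {"R1"}, "d3": {"R2"}}
device_group = {"d1": "web", "d2": "web", "d3": "db"}
distributions = count_distributions(device_rules, device_group)
print(distributions["R1"])
print(distributions["R2"])
```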
In an implementation, the clustering may comprise: clustering firewall rules that have equivalent count distributions into the same firewall rule group.
In an implementation, the clustering may comprise: representing a count distribution of each firewall rule as a distribution vector of the firewall rule; calculating similarity of every two firewall rules among the plurality of firewall rules with distribution vectors of the two firewall rules, to form a similarity matrix; converting the similarity matrix to an adjacency matrix by applying a similarity threshold; building a graph representation of the adjacency matrix; and extracting, from the graph representation, a plurality of connected subgraphs corresponding to the plurality of firewall rule groups respectively.
The similarity threshold may be predetermined to cause the plurality of connected subgraphs to approximate complete graphs.
The similarity of every two firewall rules may be calculated based on cosine similarity or maximum relative distance similarity.
In an implementation, the method 600 may further comprise: attaching the same risk attribute label to all firewall rules in each firewall rule group automatically.
A risk attribute label attached to a firewall rule may indicate whether the firewall rule is risky or riskless.
The attaching the same risk attribute label may comprise: assigning an intermediate risk attribute label to each firewall rule in the firewall rule group through historical risk attribute label inheriting mechanism; determining whether the ratio of firewall rules being assigned with a type of intermediate risk attribute label in the firewall rule group is above a ratio threshold; and in response to determining that the ratio is above the ratio threshold, attaching the type of intermediate risk attribute label to all firewall rules in the firewall rule group.
The historical risk attribute label inheriting mechanism may comprise: identifying a historical firewall rule having the same name as the current firewall rule; determining whether the current firewall rule and the historical firewall rule have equivalent count distributions; and in response to determining that the current firewall rule and the historical firewall rule have equivalent count distributions, assigning a historical risk attribute label of the historical firewall rule to the current firewall rule as an intermediate risk attribute label of the current firewall rule.
It should be understood that the method 600 may further comprise any steps/processes for risk assessment of firewall rules in a datacenter according to the embodiments of the present disclosure as mentioned above.
FIG. 7 illustrates an exemplary apparatus 700 for risk assessment of firewall rules in a datacenter according to an embodiment. The datacenter may have a plurality of devices and be configured with a plurality of firewall rules.
The apparatus 700 may comprise: a configuration information obtaining module 710, for obtaining firewall rule configuration information of the plurality of devices; a device group identifying module 720, for identifying a plurality of device groups formed by the plurality of devices; a count distribution determining module 730, for determining count distributions of the plurality of firewall rules over the plurality of device groups; and a clustering module 740, for clustering the plurality of firewall rules into a plurality of firewall rule groups based on the count distributions, firewall rules in each firewall rule group having the same risk attribute.
Moreover, the apparatus 700 may also comprise any other modules configured for risk assessment of firewall rules in a datacenter according to the embodiments of the present disclosure as mentioned above.
FIG. 8 illustrates an exemplary apparatus 800 for risk assessment of firewall rules in a datacenter according to an embodiment. The datacenter may have a plurality of devices and be configured with a plurality of firewall rules.
The apparatus 800 may comprise: at least one processor 810; and a memory 820 storing computer-executable instructions. When executing the computer-executable instructions, the at least one processor 810 may: obtain firewall rule configuration information of the plurality of devices; identify a plurality of device groups formed by the plurality of devices; determine count distributions of the plurality of firewall rules over the plurality of device groups; and cluster the plurality of firewall rules into a plurality of firewall rule groups based on the count distributions, firewall rules in each firewall rule group having the same risk attribute.
In an implementation, a count distribution of each firewall rule may comprise the number of devices being configured with the firewall rule in each of the plurality of device groups.
In an implementation, the clustering may comprise: clustering firewall rules that have equivalent count distributions into the same firewall rule group.
In an implementation, the clustering may comprise: representing a count distribution of each firewall rule as a distribution vector of the firewall rule; calculating similarity of every two firewall rules among the plurality of firewall rules with distribution vectors of the two firewall rules, to form a similarity matrix; converting the similarity matrix to an adjacency matrix by applying a similarity threshold; building a graph representation of the adjacency matrix; and extracting, from the graph representation, a plurality of connected subgraphs corresponding to the plurality of firewall rule groups respectively.
In an implementation, the computer-executable instructions stored in the memory 820 may be further executed to cause the at least one processor 810 to: attach the same risk attribute label to all firewall rules in each firewall rule group automatically.
The attaching the same risk attribute label may comprise: assigning an intermediate risk attribute label to each firewall rule in the firewall rule group through historical risk attribute label inheriting mechanism; determining whether the ratio of firewall rules being assigned with a type of intermediate risk attribute label in the firewall rule group is above a ratio threshold; and in response to determining that the ratio is above the ratio threshold, attaching the type of intermediate risk attribute label to all firewall rules in the firewall rule group.
The historical risk attribute label inheriting mechanism may comprise: identifying a historical firewall rule having the same name as the current firewall rule; determining whether the current firewall rule and the historical firewall rule have equivalent count distributions; and in response to determining that the current firewall rule and the historical firewall rule have equivalent count distributions, assigning a historical risk attribute label of the historical firewall rule to the current firewall rule as an intermediate risk attribute label of the current firewall rule.
Moreover, the at least one processor 810 may perform any other operations of the methods for risk assessment of firewall rules in a datacenter according to the embodiments of the present disclosure as mentioned above.
The embodiments of the present disclosure propose a computer program product for risk assessment of firewall rules in a datacenter. The datacenter may have a plurality of devices and be configured with a plurality of firewall rules. The computer program product may comprise a computer program that is executed by at least one processor for: obtaining firewall rule configuration information of the plurality of devices; identifying a plurality of device groups formed by the plurality of devices; determining count distributions of the plurality of firewall rules over the plurality of device groups; and clustering the plurality of firewall rules into a plurality of firewall rule groups based on the count distributions, firewall rules in each firewall rule group having the same risk attribute. Moreover, the computer program in the computer program product may be further executed by the at least one processor to perform any other operations of the methods for risk assessment of firewall rules in a datacenter according to the embodiments of the present disclosure as mentioned above.
The embodiments of the present disclosure may be embodied in a non-transitory computer-readable medium. The non-transitory computer-readable medium may comprise instructions that, when executed, cause one or more processors to perform any operations of the methods for risk assessment of firewall rules in a datacenter according to the embodiments of the present disclosure as mentioned above.
It should be appreciated that all the operations in the methods described above are merely exemplary, and the present disclosure is not limited to any operations in the methods or sequence orders of these operations, and should cover all other equivalents under the same or similar concepts.
It should also be appreciated that all the modules in the apparatuses described above may be implemented in various approaches. These modules may be implemented as hardware, software, or a combination thereof. Moreover, any of these modules may be further functionally divided into sub-modules or combined together.
Processors have been described in connection with various apparatuses and methods. These processors may be implemented using electronic hardware, computer software, or any combination thereof. Whether such processors are implemented as hardware or software will depend upon the particular application and overall design constraints imposed on the system. By way of example, a processor, any portion of a processor, or any combination of processors presented in the present disclosure may be implemented with a microprocessor, microcontroller, digital signal processor (DSP) , a field-programmable gate array (FPGA) , a programmable logic device (PLD) , a state machine, gated logic, discrete hardware circuits, and other suitable processing components configured to perform the various functions described throughout the present disclosure. The functionality of a processor, any portion of a processor, or any combination of processors presented in the present disclosure may be implemented with software being executed by a microprocessor, microcontroller, DSP, or other suitable platform.
Software shall be construed broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, threads of execution, procedures, functions, etc. The software may reside on a computer-readable medium. A computer-readable medium may include, by way of example, memory such as a magnetic storage device (e.g., hard disk, floppy disk, magnetic strip) , an optical disk, a smart card, a flash memory device, random access memory (RAM) , read only memory (ROM) , programmable ROM (PROM) , erasable PROM (EPROM) , electrically erasable PROM (EEPROM) , a register, or a removable disk. Although memory is shown separate from the processors in the various aspects presented throughout the present disclosure, the memory may be internal to the processors, e.g., cache or register.
The previous description is provided to enable any person skilled in the art to practice the various aspects described herein. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects. Thus, the claims are not intended to be limited to the aspects shown herein. All structural and functional equivalents to the elements of the various aspects described throughout the present disclosure that are known or later come to be known to those of ordinary skill in the art are intended to be encompassed by the claims.
Claims (20)
- A method for risk assessment of firewall rules in a datacenter, the datacenter having a plurality of devices and being configured with a plurality of firewall rules, the method comprising: obtaining firewall rule configuration information of the plurality of devices; identifying a plurality of device groups formed by the plurality of devices; determining count distributions of the plurality of firewall rules over the plurality of device groups; and clustering the plurality of firewall rules into a plurality of firewall rule groups based on the count distributions, firewall rules in each firewall rule group having the same risk attribute.
- The method of claim 1, wherein the firewall rule configuration information comprises firewall rules configured at each device.
- The method of claim 1, wherein devices in each device group have the same function or purpose.
- The method of claim 1, wherein a count distribution of each firewall rule comprises the number of devices being configured with the firewall rule in each of the plurality of device groups.
- The method of claim 1, wherein the clustering comprises: clustering firewall rules that have equivalent count distributions into the same firewall rule group.
- The method of claim 1, wherein the clustering comprises: representing a count distribution of each firewall rule as a distribution vector of the firewall rule; calculating similarity of every two firewall rules among the plurality of firewall rules with distribution vectors of the two firewall rules, to form a similarity matrix; converting the similarity matrix to an adjacency matrix by applying a similarity threshold; building a graph representation of the adjacency matrix; and extracting, from the graph representation, a plurality of connected subgraphs corresponding to the plurality of firewall rule groups respectively.
- The method of claim 6, wherein the similarity threshold is predetermined to cause the plurality of connected subgraphs to approximate complete graphs.
- The method of claim 6, wherein the similarity of every two firewall rules is calculated based on cosine similarity or maximum relative distance similarity.
- The method of claim 1, further comprising: attaching the same risk attribute label to all firewall rules in each firewall rule group automatically.
- The method of claim 9, wherein a risk attribute label attached to a firewall rule indicates whether the firewall rule is risky or riskless.
- The method of claim 9, wherein the attaching the same risk attribute label comprises: assigning an intermediate risk attribute label to each firewall rule in the firewall rule group through historical risk attribute label inheriting mechanism; determining whether the ratio of firewall rules being assigned with a type of intermediate risk attribute label in the firewall rule group is above a ratio threshold; and in response to determining that the ratio is above the ratio threshold, attaching the type of intermediate risk attribute label to all firewall rules in the firewall rule group.
- The method of claim 11, wherein the historical risk attribute label inheriting mechanism comprises: identifying a historical firewall rule having the same name as the current firewall rule; determining whether the current firewall rule and the historical firewall rule have equivalent count distributions; and in response to determining that the current firewall rule and the historical firewall rule have equivalent count distributions, assigning a historical risk attribute label of the historical firewall rule to the current firewall rule as an intermediate risk attribute label of the current firewall rule.
- An apparatus for risk assessment of firewall rules in a datacenter, the datacenter having a plurality of devices and being configured with a plurality of firewall rules, the apparatus comprising: at least one processor; and a memory storing computer-executable instructions that, when executed, cause the at least one processor to: obtain firewall rule configuration information of the plurality of devices, identify a plurality of device groups formed by the plurality of devices, determine count distributions of the plurality of firewall rules over the plurality of device groups, and cluster the plurality of firewall rules into a plurality of firewall rule groups based on the count distributions, firewall rules in each firewall rule group having the same risk attribute.
- The apparatus of claim 13, wherein a count distribution of each firewall rule comprises the number of devices being configured with the firewall rule in each of the plurality of device groups.
- The apparatus of claim 13, wherein the clustering comprises: clustering firewall rules that have equivalent count distributions into the same firewall rule group.
- The apparatus of claim 13, wherein the clustering comprises: representing a count distribution of each firewall rule as a distribution vector of the firewall rule; calculating similarity of every two firewall rules among the plurality of firewall rules with distribution vectors of the two firewall rules, to form a similarity matrix; converting the similarity matrix to an adjacency matrix by applying a similarity threshold; building a graph representation of the adjacency matrix; and extracting, from the graph representation, a plurality of connected subgraphs corresponding to the plurality of firewall rule groups respectively.
- The apparatus of claim 13, wherein the computer-executable instructions stored in the memory are further executed to cause the at least one processor to: attach the same risk attribute label to all firewall rules in each firewall rule group automatically.
- The apparatus of claim 17, wherein the attaching the same risk attribute label comprises: assigning an intermediate risk attribute label to each firewall rule in the firewall rule group through historical risk attribute label inheriting mechanism; determining whether the ratio of firewall rules being assigned with a type of intermediate risk attribute label in the firewall rule group is above a ratio threshold; and in response to determining that the ratio is above the ratio threshold, attaching the type of intermediate risk attribute label to all firewall rules in the firewall rule group.
- The apparatus of claim 18, wherein the historical risk attribute label inheriting mechanism comprises: identifying a historical firewall rule having the same name as the current firewall rule; determining whether the current firewall rule and the historical firewall rule have equivalent count distributions; and in response to determining that the current firewall rule and the historical firewall rule have equivalent count distributions, assigning a historical risk attribute label of the historical firewall rule to the current firewall rule as an intermediate risk attribute label of the current firewall rule.
- A computer program product for risk assessment of firewall rules in a datacenter, the datacenter having a plurality of devices and being configured with a plurality of firewall rules, the computer program product comprising a computer program that is executed by at least one processor for: obtaining firewall rule configuration information of the plurality of devices; identifying a plurality of device groups formed by the plurality of devices; determining count distributions of the plurality of firewall rules over the plurality of device groups; and clustering the plurality of firewall rules into a plurality of firewall rule groups based on the count distributions, firewall rules in each firewall rule group having the same risk attribute.
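The steps recited in the claims above (count distributions over device groups, a thresholded similarity matrix, connected subgraphs, and ratio-based label attachment) can be sketched roughly as follows. This is a minimal illustration, not the applicant's implementation: the claims do not name a similarity measure, so cosine similarity is an assumption, as are the threshold values, the sample device and rule names, and every function name used here.

```python
from collections import Counter, defaultdict
from math import sqrt


def count_distributions(device_rules, device_group, groups):
    """Map each rule to the number of devices configured with it per device group."""
    index = {g: i for i, g in enumerate(groups)}
    dist = defaultdict(lambda: [0] * len(groups))
    for device, rules in device_rules.items():
        for rule in set(rules):  # count each device at most once per rule
            dist[rule][index[device_group[device]]] += 1
    return dict(dist)


def cosine(u, v):
    """Cosine similarity of two distribution vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_rules(dist, similarity_threshold=0.85):
    """Threshold pairwise similarity into an adjacency map, then extract connected subgraphs."""
    rules = sorted(dist)
    adjacency = {r: set() for r in rules}
    for i, r in enumerate(rules):
        for s in rules[i + 1:]:
            if cosine(dist[r], dist[s]) >= similarity_threshold:
                adjacency[r].add(s)
                adjacency[s].add(r)
    seen, rule_groups = set(), []
    for start in rules:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:  # depth-first traversal of one connected subgraph
            current = stack.pop()
            component.append(current)
            for neighbor in adjacency[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        rule_groups.append(sorted(component))
    return rule_groups


def label_group(rule_group, intermediate_labels, ratio_threshold=0.6):
    """Return the group label if one intermediate label's ratio is above the threshold."""
    counts = Counter(intermediate_labels[r] for r in rule_group if r in intermediate_labels)
    if not counts:
        return None
    label, n = counts.most_common(1)[0]
    return label if n / len(rule_group) > ratio_threshold else None


# Hypothetical sample data, for illustration only.
device_rules = {
    "web1": ["allow-http", "allow-https"],
    "web2": ["allow-http", "allow-https"],
    "db1": ["allow-sql", "allow-http"],
    "db2": ["allow-sql"],
}
device_group = {"web1": "web", "web2": "web", "db1": "db", "db2": "db"}
dist = count_distributions(device_rules, device_group, ["web", "db"])
rule_groups = cluster_rules(dist)
intermediate = {"allow-http": "riskless", "allow-https": "riskless"}
for group in rule_groups:
    print(group, label_group(group, intermediate))
```

In this sketch the intermediate labels are supplied directly as a dictionary. Under the claimed inheriting mechanism they would instead come from historical rules that share a name and an equivalent count distribution with the current rule.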
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202180048860.0A CN115885273A (en) | 2021-02-19 | 2021-02-19 | Risk assessment of firewall rules in a data center |
PCT/CN2021/076792 WO2022174379A1 (en) | 2021-02-19 | 2021-02-19 | Risk assessment of firewall rules in a datacenter |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2021/076792 WO2022174379A1 (en) | 2021-02-19 | 2021-02-19 | Risk assessment of firewall rules in a datacenter |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022174379A1 (en) | 2022-08-25 |
Family
ID=74870552
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2021/076792 WO2022174379A1 (en) | 2021-02-19 | 2021-02-19 | Risk assessment of firewall rules in a datacenter |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115885273A (en) |
WO (1) | WO2022174379A1 (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180176185A1 (en) * | 2016-12-19 | 2018-06-21 | Nicira, Inc. | Firewall rule management for hierarchical entities |
US20180316707A1 (en) * | 2017-04-26 | 2018-11-01 | Elasticsearch B.V. | Clustering and Outlier Detection in Anomaly and Causation Detection for Computing Environments |
2021
- 2021-02-19 WO PCT/CN2021/076792 patent/WO2022174379A1/en active Application Filing
- 2021-02-19 CN CN202180048860.0A patent/CN115885273A/en active Pending
Non-Patent Citations (1)
Title |
---|
GUANHUA YAN ET AL: "Dynamic Balancing of Packet Filtering Workloads on Distributed Firewalls", QUALITY OF SERVICE, 2008. IWQOS 2008. 16TH INTERNATIONAL WORKSHOP ON, IEEE, PISCATAWAY, NJ, USA, 2 June 2008 (2008-06-02), pages 209 - 218, XP031270241, ISBN: 978-1-4244-2084-1 * |
Also Published As
Publication number | Publication date |
---|---|
CN115885273A (en) | 2023-03-31 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21711174; Country of ref document: EP; Kind code of ref document: A1 |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 21711174; Country of ref document: EP; Kind code of ref document: A1 |