CN116192456A - Method, device, equipment and medium for detecting partner attack based on network traffic - Google Patents

Info

Publication number
CN116192456A
Authority
CN
China
Prior art keywords
users
time window
synchronicity
user
combinations
Prior art date
Legal status
Pending
Application number
CN202211691764.0A
Other languages
Chinese (zh)
Inventor
解建华
高霞
刘玉权
Current Assignee
Zhongtong Uniform Chuangfa Science And Technology Co ltd
Original Assignee
Zhongtong Uniform Chuangfa Science And Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Zhongtong Uniform Chuangfa Science And Technology Co ltd
Priority to CN202211691764.0A
Publication of CN116192456A
Legal status: Pending

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/14: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416: Event detection, e.g. attack signature detection
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/14: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441: Countermeasures against malicious traffic
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2463/00: Additional details relating to network architectures or network communication protocols for network security covered by H04L63/00
    • H04L2463/121: Timestamp
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/50: Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

Embodiments of the present disclosure provide a method, device, equipment and medium for detecting group (partner) attacks based on network traffic. The method comprises: obtaining traffic data to be detected, the traffic data comprising user information, entity information and corresponding timestamps; dividing the traffic to be detected into a plurality of combinations, each comprising user information, corresponding entity information and a timestamp; segmenting the plurality of combinations into time windows according to a preset time window size; calculating the synchronicity between every two users in a target time window, and extracting each pair of users whose synchronicity is greater than or equal to a preset threshold; processing the extracted user pairs with a connected component algorithm to obtain a plurality of risk clusters; and determining, based on the obtained risk clusters, whether a group attack exists. In this way, group behavior can be effectively distinguished from normal user behavior, large numbers of group users can be mined, and the attack behavior of bot groups can be identified quickly and accurately.

Description

Method, device, equipment and medium for detecting partner attack based on network traffic
Technical Field
The present disclosure relates to the field of network security, and in particular to a method, apparatus, device and medium for detecting group attacks based on network traffic.
Background
With the development of internet technology and the continuous improvement of attack detection strategies, attacks by individuals have become more difficult. At the same time, because individual accounts are increasingly cheap to acquire, attacks are increasingly carried out by organized groups, and such group attacks cause heavy losses to enterprises in finance, e-commerce and other sectors. Rules for screening group behavior depend on expert knowledge summarized from a large number of historical cases; because attack behavior keeps changing and the rules lag behind it, group attacks are difficult to detect efficiently and quickly. If large-scale attacks are not discovered in time, the security and stability of the system cannot be guaranteed.
Disclosure of Invention
The disclosure provides a method, a device, equipment and a medium for detecting a partner attack based on network traffic.
According to a first aspect of the present disclosure, a method for detecting a partner attack based on network traffic is provided. The method comprises the following steps:
acquiring flow data to be detected; the flow data to be detected comprises: user information, entity information, and corresponding timestamps;
dividing the flow to be detected into a plurality of combinations, wherein each combination comprises user information, corresponding entity information and a time stamp;
for each of the plurality of combinations, performing time window segmentation according to a preset time window size;
calculating the synchronicity between every two users in a target time window, and extracting every two users with synchronicity greater than or equal to a preset threshold value;
carrying out synchronous processing on the extracted two-by-two users based on a connected component algorithm to obtain a plurality of risk clusters;
and judging whether a partner attack exists or not based on the obtained multiple risk clusters.
Further, the user information includes a user ID, an IP address and user agent information, and a user is represented either by the user ID or by the IP address plus the user agent information; the entity information includes an access path and device information.
Further, for each of the plurality of combinations, performing time window segmentation according to a preset time window size includes:
sorting the divided combinations in time order based on the time stamps to generate a time series combination;
and dividing the time sequence combination according to the preset time window size, wherein adjacent preset time windows overlap (their intersection is not empty).
Further, the target time window is a single time window, and the calculating the synchronicity between every two users in the target time window includes:
counting, according to a preset time interval, the combinations of the two users under a first entity in a single time window, to obtain the independent operand of the two users for the first entity;
counting, according to the preset time interval, the synchronous combinations of the two users, to obtain the synchronous operand of the two users for the first entity;
calculating the ratio of the synchronous operand to the independent operand to obtain the initial synchronicity of the two users for the first entity;
and calculating the initial synchronicity in the same way for every entity, and taking the maximum initial synchronicity as the single-entity synchronization value of the two users.
Further, the target time window is all time windows, and the step of calculating the synchronicity between every two users in the target time window further includes:
if the single-entity synchronization value of the two users is 0 or 1, counting the combinations of all entities of the two users in all time windows to obtain the overall operand of the two users;
obtaining the overall synchronous number of the two users, which is the sum of the synchronous operands of the two users over all entities;
and calculating the ratio of the overall synchronous number to the overall operand to obtain the overall synchronicity of the two users.
Further, the determining whether a partner attack exists based on the obtained multiple risk clusters includes:
judging the sizes of the obtained multiple risk clusters one by one based on the preset number;
if the number of users of the risk clusters is greater than or equal to the preset number, indicating that the corresponding risk clusters have a group attack;
if the number of users of the risk clusters is smaller than the preset number, calculating an abnormal score value of the corresponding risk cluster; and if the abnormal score value is greater than or equal to a preset abnormal value, indicating that the corresponding risk cluster has a partner attack.
Further, the calculation formula of the anomaly score value is as follows:
[Formula image BDA0004021319880000031 is not reproduced in this text]
wherein score((u, v, t)) represents the anomaly score value of the target user; u and v respectively represent nodes; t represents the current time; the first symbol (image BDA0004021319880000032) represents the number of combinations occurring between u and v in the current time window; and the second symbol (image BDA0004021319880000033) represents the number of combinations occurring between u and v in all time windows preceding the current time window.
According to a second aspect of the present disclosure, a partner attack detection device based on network traffic is provided. The device comprises:
the flow acquisition module is used for acquiring flow data to be detected; the flow data to be detected comprises: user information, entity information, and corresponding timestamps;
the dividing and combining module is used for dividing the flow to be detected into a plurality of combinations, wherein each combination comprises user information, corresponding entity information and a time stamp;
the time window segmentation module is used for carrying out time window segmentation according to the preset time window size for each of the plurality of combinations;
the synchronicity calculating module is used for calculating synchronicity between every two users in the target time window and extracting every two users with synchronicity larger than or equal to a preset threshold value;
the risk cluster module is used for carrying out synchronous processing on the two extracted users based on a connected component algorithm to obtain a plurality of risk clusters;
and the attack judging module is used for judging whether a partner attack exists or not based on the obtained multiple risk clusters.
According to a third aspect of the present disclosure, an electronic device is provided. The electronic device includes: a memory and a processor, the memory having stored thereon a computer program, the processor implementing the method as described above when executing the program.
According to a fourth aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon a computer program which when executed by a processor implements a method according to the first aspect of the present disclosure.
The method comprises the steps of obtaining flow data to be detected, dividing the flow to be detected into a plurality of combinations, dividing the combinations into time windows according to the size of a preset time window, calculating the synchronicity between every two users in a target time window, and extracting every two users with synchronicity greater than or equal to a preset threshold value; carrying out synchronous processing on the extracted two-by-two users based on a connected component algorithm to obtain a plurality of risk clusters; and judging whether a partner attack exists or not based on the obtained multiple risk clusters. In this way, the partner behavior and the normal user can be effectively distinguished, a large number of partner users can be mined, and the attack behavior of the bot partner can be rapidly and accurately identified.
It should be understood that what is described in this summary is not intended to limit the critical or essential features of the embodiments of the disclosure nor to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The above and other features, advantages and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. For a better understanding of the present disclosure, and without limiting the disclosure thereto, the same or similar reference numerals denote the same or similar elements, wherein:
FIG. 1 illustrates a flow chart of a network traffic based group attack detection method according to an embodiment of the present disclosure;
FIG. 2 illustrates a flow chart of a network traffic based group attack detection method according to yet another embodiment of the present disclosure;
FIG. 3 illustrates a flow chart of a network traffic based group attack detection method according to yet another embodiment of the present disclosure;
FIG. 4 illustrates a flow chart of a network traffic based group attack detection method according to yet another embodiment of the present disclosure;
FIG. 5 illustrates a block diagram of a network traffic based group attack detection device according to an embodiment of the present disclosure;
FIG. 6 illustrates a block diagram of an exemplary electronic device capable of implementing embodiments of the present disclosure.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are some embodiments of the present disclosure, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments in this disclosure without inventive faculty, are intended to be within the scope of this disclosure.
In addition, the term "and/or" herein is merely an association relationship describing an association object, and means that three relationships may exist, for example, a and/or B may mean: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the front and rear associated objects are an "or" relationship.
Fig. 1 illustrates a flow chart of a network traffic based partner attack detection method 100 according to an embodiment of the present disclosure. The method 100 comprises the following steps:
step 110, obtaining flow data to be detected; the flow data to be detected comprises: user information, entity information, and corresponding timestamps.
In some embodiments, the traffic data to be detected is obtained and parsed, and the user information, entity information and corresponding timestamps are extracted. The user information includes a user ID, an IP address and user agent (UA) information; the entity information includes access paths, device information and the like.
In some embodiments, the extracted user information and entity information are preprocessed. The user ID is used to identify a user; for data without an explicit user ID, the combination of IP address and user agent (UA) information can be used as the user. Access address information needs to be normalized: if the access path contains parameters or similar content, the parameter content is normalized so that similar paths are treated as one specific behavior, which allows the subsequent synchronicity calculation to reflect behavior characteristics accurately.
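The preprocessing described above can be sketched in Python as follows. This is only an illustration under assumptions: the record field names (`user_id`, `ip`, `user_agent`), the function names, and the specific normalization rules (numeric path segments collapsed to `{id}`, parameter values dropped while sorted parameter names are kept) are hypothetical and are not specified in the text.

```python
import re
from urllib.parse import urlsplit, parse_qsl


def user_key(record):
    """Return the user identifier: the user ID if present, else IP + user agent."""
    uid = record.get("user_id")
    if uid:
        return f"uid:{uid}"
    # Fallback for logs without an explicit user ID (assumed key format).
    return f"ipua:{record.get('ip', '')}|{record.get('user_agent', '')}"


def normalize_path(url):
    """Map similar access paths to one behavior key (assumed rules)."""
    parts = urlsplit(url)
    # Collapse purely numeric path segments such as /order/123 into /order/{id}.
    path = re.sub(r"/\d+(?=/|$)", "/{id}", parts.path)
    # Keep only the sorted parameter names; parameter values are discarded.
    names = sorted({name for name, _ in parse_qsl(parts.query, keep_blank_values=True)})
    return path + ("?" + "&".join(names) if names else "")
```

With these assumed rules, `/order/123/detail?b=2&a=1` and `/order/456/detail?a=9&b=7` map to the same behavior key, so they count as the same entity in later steps.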
Step 120, dividing the traffic to be detected into a plurality of combinations, wherein each combination includes user information, corresponding entity information and a timestamp.
In some embodiments, each user, each entity and the corresponding timestamp form one specific combination. A single log may generate multiple combinations, because the user in one log may correspond to several different entities.
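As a minimal sketch of this step, the following Python snippet expands one log record into one combination per entity. The log field names (`user`, `path`, `device`, `timestamp`), the `Combination` type and the `path:`/`device:` prefixes are assumptions made for illustration only.

```python
from typing import NamedTuple


class Combination(NamedTuple):
    user: str
    entity: str
    timestamp: float


def combinations_from_log(log):
    """Emit one (user, entity, timestamp) combination per entity named in the log."""
    entities = [f"path:{log['path']}"]
    if log.get("device"):
        # A log that also carries device information yields a second combination.
        entities.append(f"device:{log['device']}")
    return [Combination(log["user"], entity, log["timestamp"]) for entity in entities]
```

A log with both an access path and device information therefore produces two combinations that share the same user and timestamp.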
Step 130, for each of the plurality of combinations, performing time window segmentation according to a preset time window size.
In some embodiments, synchronicity is computed with respect to a time interval: if two combinations of different users under the same entity fall within the same time window, the two combinations form a synchronous pair; if they fall outside that window range, they are treated as unrelated. Therefore, according to a specific time window size, the combinations are placed into their corresponding windows in ascending order of timestamp, which divides the combinations. The time window size can be set according to service requirements, for example five minutes or one hour. If the time window is set larger, each window contains more combinations, which greatly increases the cost of computing user synchronicity; in addition, more coincidental behavior of ordinary users is counted as synchronous operation, so the resulting synchronicity is looser. If the time window is set too small, operations by group accounts that are spaced further apart fall outside it, and many synchronous operations cannot be detected at all. The time window therefore needs to be tuned against actual data.
Step 140, calculating the synchronicity between every two users in the target time window, and extracting every two users with synchronicity greater than or equal to a preset threshold value.
In some embodiments, synchronicity mainly refers to the consistency of operation content and operation time between users, for example two users doing the same thing at the same point in time. If two users do the same thing at multiple points in time, the two users have high synchronicity, which indicates an internal association between them. Two ordinary users may occasionally do the same thing at the same time, but if such synchronicity persists over a long period, it indicates that some association may exist between the two users. In the algorithm, this synchronicity appears as users sharing many identical combinations across different time windows. Because exactly simultaneous operations would be too conspicuous in practice, a group generally disguises its timing, so a certain time interval is set within which operations count as synchronous. Both single-entity synchronicity and overall synchronicity are calculated, and either can be adopted as the final synchronicity of a user pair depending on the scenario: for the single-entity case, the largest synchronicity among all entities shared by the two users is used; otherwise the overall synchronicity is used directly.
In some embodiments, the synchronicity of two users under a specific entity is calculated as a Jaccard similarity, that is, the number of operations the two users perform synchronously under the same entity divided by the number of operations of the two users taken together. The synchronicity between two users is thus expressed as the ratio of their synchronous operand to their independent operand, and it indicates the degree of association between the two users. According to the Jaccard similarity formula, the value lies between 0 and 1, and a larger value means stronger synchronicity. A specific threshold, for example 0.95, is set for user synchronicity: user pairs with lower synchronicity are discarded, and user pairs with synchronicity greater than or equal to 0.95 are extracted.
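The Jaccard-style ratio and the threshold filter described above can be sketched in Python as follows. This is a minimal illustration under assumptions the text does not state: operations are matched greedily one-to-one when their timestamps differ by at most `interval`, and the denominator is the size of the union (both users' operations minus the matched ones). All function names are hypothetical.

```python
def synchronized_count(times_a, times_b, interval):
    """Greedily pair operations whose timestamps differ by at most `interval`."""
    a, b = sorted(times_a), sorted(times_b)
    i = j = matched = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= interval:
            matched += 1  # each operation is used in at most one pair
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return matched


def jaccard_synchronicity(times_a, times_b, interval):
    """Synchronous operations divided by the union of both users' operations."""
    matched = synchronized_count(times_a, times_b, interval)
    union = len(times_a) + len(times_b) - matched
    return matched / union if union else 0.0


def filter_pairs(scores, threshold=0.95):
    """Keep the user pairs whose synchronicity reaches the threshold."""
    return [pair for pair, value in scores.items() if value >= threshold]
```

Under these assumptions, two users with three operations each, all matched within the interval, have a synchronicity of 1 and pass the 0.95 threshold.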
Step 150, processing the extracted user pairs based on a connected component algorithm to obtain a plurality of risk clusters.
In some embodiments, the user pairs with synchronicity greater than or equal to 0.95 obtained in step 140 are merged using a graph-based connected component algorithm: each user is a node, and the synchronicity between two users is an edge. Users that can be linked together form a cluster with high synchronicity. Users in the same cluster generally show highly synchronized behavior, and synchronicity at this scale objectively indicates that some association exists among them.
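One common way to compute connected components over such user pairs is a union-find (disjoint-set) structure. The sketch below is an illustration only; the text does not say which connected component implementation is used, and the function name is hypothetical.

```python
def connected_components(pairs):
    """Group users into clusters given (user, user) edges, using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for u, v in pairs:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v  # merge the two components

    clusters = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    return list(clusters.values())
```

For example, the pairs (A, B), (B, C) and (D, E) yield two clusters, {A, B, C} and {D, E}.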
Step 160, determining whether a group attack exists based on the obtained multiple risk clusters.
In some embodiments, step 150 yields a number of different clusters, which must be distinguished from one another to determine their risk levels and, further, which clusters are real bot group accounts. One important criterion is the size of the cluster, i.e. the number of users in it: a cluster with a large number of users carries particularly high risk. The anomaly score reflects from another angle whether a user is attacking. Group attacks are typically bursty: they are fast, generate heavy access, and complete a specific attack within a certain time range, so the behavior differs greatly from the user's earlier behavior. By setting an anomaly score threshold, the proportion of abnormal users among the cluster's users is calculated. Finally, both clusters with a large number of users and small clusters with a high proportion of abnormal users are regarded as bot clusters with the characteristics of an attacking group.
In some embodiments, the sizes of the obtained multiple risk clusters are judged one by one based on a preset number; if the number of users of the risk clusters is greater than or equal to the preset number, indicating that the corresponding risk clusters have a group attack; if the number of users of the risk clusters is smaller than the preset number, calculating an abnormal score value of the corresponding risk cluster; and if the abnormal score value is greater than or equal to a preset abnormal value, indicating that the corresponding risk cluster has a partner attack. Wherein, the calculation formula of the anomaly score value is as follows:
[Formula image BDA0004021319880000091 is not reproduced in this text]
wherein score((u, v, t)) represents the anomaly score value of the target user; u and v respectively represent nodes; t represents the current time; the first symbol (image BDA0004021319880000092) represents the number of combinations occurring between u and v in the current time window; and the second symbol (image BDA0004021319880000093) represents the number of combinations occurring between u and v in all time windows preceding the current time window.
The number of occurrences of each combination is counted in every time window. For a normal user, behavior fluctuates within a relatively normal range across time windows. If a user relies on a tool or needs to complete certain tasks within a specific time range, that user may show behavior in some time window that is abnormal relative to earlier behavior. For a normal user, the number of user-entity combinations is relatively fixed and does not vary excessively; an excessive number of combinations also indicates that the user frequently uses different entities, which itself carries a certain risk. The anomaly score value is calculated for each user and associated entity, and the maximum anomaly score over all entities associated with the user is taken as the final anomaly score of that user.
In some embodiments, as shown in FIG. 2, the time window is divided at 5-second intervals; a circle represents a combination of user A at the n-th second, a diamond represents a combination of user B at the n-th second, and the number inside each circle or diamond is n. In the second time window, for user B, t is 2, s is 2 and a is 1, so the anomaly score calculation gives 0: the number of combinations is 1 in both the first and the second time window, which indicates no change in behavior, i.e., no anomaly.
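The formula image itself is not available in this text, so the following Python sketch is an assumption rather than the patent's confirmed formula. It uses a chi-squared style score of the form (a - s/t)^2 * t^2 / (s * (t - 1)), as used by streaming edge anomaly detectors such as MIDAS, because that form reproduces the worked example above (t = 2, s = 2, a = 1 gives 0). Here `a` is taken as the count in the current window and `s` as the cumulative count up to and including the current window, which is what the example's s = 2 implies; the function name is hypothetical.

```python
def anomaly_score(a, s, t):
    """Assumed chi-squared style score: (a - s/t)^2 * t^2 / (s * (t - 1)).

    a: combinations between u and v in the current time window
    s: cumulative combinations between u and v up to the current window (assumed)
    t: index of the current time window, starting at 1
    """
    if t <= 1 or s == 0:
        return 0.0  # no history to compare against
    return (a - s / t) ** 2 * t ** 2 / (s * (t - 1))
```

Under this assumed form, a steady user (one combination per window) scores 0, while a sudden burst in the current window raises the score sharply.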
It should be noted that the anomaly score value can be used not only to judge whether a small cluster is a bot group attack, but also to raise early warnings about the abnormal behavior of individual users while the system is running, so that measures can be taken in time to keep the system operating safely and stably.
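The cluster-level decision described in this embodiment can be sketched in Python as below. The sketch assumes that a small cluster's anomaly value is the proportion of its users whose individual anomaly score reaches a threshold, which is one reading of the description; the function name and all parameter names are hypothetical.

```python
def cluster_has_attack(cluster, user_scores, min_users, score_threshold, ratio_threshold):
    """Return True if the risk cluster is judged to be a group attack."""
    if not cluster:
        return False
    # Large clusters are flagged on size alone.
    if len(cluster) >= min_users:
        return True
    # Small clusters are flagged when enough of their users look abnormal (assumed rule).
    abnormal = sum(1 for user in cluster if user_scores.get(user, 0.0) >= score_threshold)
    return abnormal / len(cluster) >= ratio_threshold
```

Both thresholds would need to be tuned against real traffic; the values used in any test are illustrative only.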
Based on the above embodiments, in a further embodiment provided by the present disclosure, performing time window segmentation for each of the plurality of combinations according to a preset time window size includes the steps of:
sorting the divided combinations in time order based on the time stamps to generate a time series combination;
and dividing the time sequence combination according to the preset time window size, wherein adjacent preset time windows overlap (their intersection is not empty).
In some embodiments, a combination in the later part of one time window and a combination in the earlier part of the next window may be closer together in time than the specified time interval. For this reason, when the time windows are set, twice the minimum time interval is taken as one time window, and consecutive time windows overlap by one minimum time interval, so that data spanning a window boundary is not missed when each window is calculated on its own. Because the overlap causes part of the calculation to be repeated, after the data in a whole window has been counted, the data in the overlapping part is counted once and subtracted from the whole-window result, which ensures that no data is counted twice. As shown in FIG. 2, a circle represents a combination of user A on the time axis and the number in the circle represents the time; a square represents a combination of user B on the time axis and the number in the square represents the time; all combinations of both users are under the same entity. The time interval is 5 seconds: two combinations of the two users under the same entity whose time difference is within 5 seconds count as synchronized. The time window is therefore 10 seconds: combinations from second 1 to second 10 are placed in the first time window, and the second time window covers seconds 5 to 15.
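The overlapping segmentation can be sketched in Python as follows, assuming half-open windows that start at time 0, so that window k covers [k * interval, k * interval + 2 * interval). The function names and the tuple layout (user, entity, timestamp) are hypothetical, and the de-duplication of the overlapping half described above is not shown.

```python
def window_indices(timestamp, interval):
    """Return the indices of the overlapping windows that contain `timestamp`."""
    k = int(timestamp // interval)
    # A timestamp lies in the window starting at its own slot and in the previous one.
    return [index for index in (k - 1, k) if index >= 0]


def split_into_windows(combinations, interval):
    """Sort (user, entity, timestamp) tuples by time and group them by window index."""
    windows = {}
    for combo in sorted(combinations, key=lambda c: c[2]):
        for index in window_indices(combo[2], interval):
            windows.setdefault(index, []).append(combo)
    return windows
```

With an interval of 5 seconds, a combination at second 7 lands in both window 0 (seconds 0 to 10) and window 1 (seconds 5 to 15), which is why the overlap must later be subtracted once.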
Based on the foregoing embodiment, in another embodiment of the disclosure, the target time window is a single time window, and calculating the synchronicity between every two users in the target time window, as shown in FIG. 3, includes the following steps:
Step 310, counting, according to a preset time interval, the combinations of the two users under a first entity in a single time window, to obtain the independent operand of the two users for the first entity.
Step 320, counting, according to the preset time interval, the synchronous combinations of the two users, to obtain the synchronous operand of the two users for the first entity.
Step 330, calculating the ratio of the synchronous operand to the independent operand to obtain the initial synchronicity of the two users for the first entity.
Step 340, repeating the calculation until the initial synchronicity has been obtained for every entity, and taking the maximum initial synchronicity as the single-entity synchronization value of the two users.
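Steps 310 to 340 can be sketched in Python as follows. The sketch assumes that the per-entity synchronous operands and independent operands of a user pair have already been counted and are supplied as dictionaries keyed by entity; the function and parameter names are hypothetical.

```python
def single_entity_synchronicity(sync_by_entity, ops_by_entity):
    """Return the largest per-entity ratio of synchronous to independent operands."""
    ratios = [
        sync_by_entity.get(entity, 0) / ops
        for entity, ops in ops_by_entity.items()
        if ops > 0  # skip entities without any operations
    ]
    return max(ratios, default=0.0)
```

For a pair with ratios 1/4 under one entity and 3/3 under another, the single-entity synchronization value is the maximum, 1.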
In some embodiments, the number of identical combinations in the time window, i.e. the number of operations of a user under a certain entity, is calculated first. This count (symbol image BDA0004021319880000111, not reproduced in this text) represents the number of logs of user i under a specific entity k within a time window; each user has a separate operand for each corresponding entity. The calculation formula is as follows:
[Formula image BDA0004021319880000112 is not reproduced in this text]
wherein Sim(U_i, U_j, C_k) represents the synchronicity of users i and j under entity k in window c; i and j respectively represent users, k represents the entity, u represents a node, and c represents the window; the symbol in image BDA0004021319880000113 represents the number of logs of user i under a specific entity k within a time window; and the symbol in image BDA0004021319880000114 represents the number of logs of user j under a specific entity k within a time window.
In some embodiments, take the first time window shown in FIG. 2 as an example: the number of combinations of user A is 2 and the number of combinations of user B is 2. The combinations in the second half of the time window must then be subtracted, because they are counted in the next time window, so overall user A has 2 combinations and user B has 1. As for the number of synchronizations, the two combinations of user A and user B in the first time window have time differences within 5 seconds, so the number of synchronizations of users A and B is 2. To obtain the synchronicity of user A and user B, all time windows, here two, must be combined: in the first time window A and B have 2 and 1 entity combinations respectively, and in the second time window they have 1 and 2 respectively, so the overall numbers of combinations are 3 and 3; the numbers of synchronized actions of the two are also 3 and 3, so the Jaccard similarity of user A and user B is 1. Calculating the data of different time windows separately in this way allows the synchronicity of users over a long time range to be computed effectively and accurately, makes the calculation faster and more efficient, and is easy to scale.
Based on the foregoing embodiments, in still another embodiment provided in the present disclosure, the target time window is all time windows, and the step of calculating the synchronicity between every two users in the target time window is shown in Fig. 4 and includes the following steps:
Step 410, if the synchronization value of a single entity of the two users is 0 or 1, calculating the number of combinations of all entities corresponding to the two users in all time windows to obtain the overall operand corresponding to the two users.
Step 420, obtaining the overall synchronous number corresponding to the two users; the overall synchronous number is the sum of the synchronous operands of all entities of the two users.
Step 430, calculating the ratio of the overall synchronous number to the overall operand to obtain the overall synchronicity of the two users.
In some embodiments, a given entity may for various reasons occur only rarely, so that the synchronicity calculated for that specific entity can only take the value 0 or 1. If such entities exist, the synchronicity must be calculated over all entities. The calculation method is unchanged, except that the combinations within a window cover all entities rather than a single entity. Counts for different entities are simply accumulated: the synchronous operand is the number of synchronizations of entity a plus the number of synchronizations of entity b, and the overall operand likewise accumulates the operands of all entities. This prevents chance deviations of individual entities from skewing the overall calculation. The calculation formula is as follows:
Figure BDA0004021319880000121
where Sim(U_i, U_j) denotes the overall synchronicity of users i and j across the entities k; i and j denote users, k denotes the entity, and u denotes a node; the symbol shown in Figure BDA0004021319880000122 denotes the number of logs of user i at a specific entity k within a time window; the symbol shown in Figure BDA0004021319880000123 denotes the number of logs of user j at a specific entity k within a time window; A_i denotes the sum of the log numbers of the k entities of user i in all windows; and A_j denotes the sum of the log numbers of the k entities of user j in all windows.
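As a minimal sketch of the accumulation over entities described above, the following Python function sums the per-entity log counts of each user and the per-entity synchronized counts of the pair, then takes their ratio. The function name, the dictionary layout, the entity names, and the Jaccard-style denominator (total of user i plus total of user j minus the synchronized total) are assumptions made for illustration; the formula in the figure remains authoritative.

```python
from typing import Mapping


def overall_synchronicity(
    logs_i: Mapping[str, int],
    logs_j: Mapping[str, int],
    sync: Mapping[str, int],
) -> float:
    """Overall synchronicity of one user pair, accumulated over all entities.

    logs_i / logs_j: per-entity log counts of user i / user j over all windows.
    sync: per-entity number of synchronized operations of the pair.
    Assumption: denominator is A_i + A_j - S (Jaccard-style).
    """
    a_i = sum(logs_i.values())  # total for user i over all entities
    a_j = sum(logs_j.values())  # total for user j over all entities
    s = sum(sync.values())      # total synchronized operations
    denom = a_i + a_j - s
    return s / denom if denom > 0 else 0.0


# Hypothetical counts: "/login" is a rare entity whose own ratio would be 0 or 1.
example = overall_synchronicity(
    logs_i={"/login": 1, "/pay": 4},
    logs_j={"/login": 1, "/pay": 5},
    sync={"/login": 1, "/pay": 3},
)
print(example)
```

Summing before dividing means a rarely seen entity contributes only its raw counts, so it cannot dominate the result the way a per-entity value of 0 or 1 would.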
In some embodiments, the synchronicity calculated in a single time window is strongly affected by chance, and the behavior of partner accounts is spread over a wide time range, so detecting user synchronicity in only one window yields a particularly high false alarm rate. In addition, the anomaly score describes abnormal user behavior over a long period. When detecting a bot cluster, the calculation therefore needs to combine the behavior over a long time window, which makes the detection result more accurate and reduces the false alarm rate; likewise, an anomaly score calculated over multiple time windows is more accurate than one that reflects only the anomaly of the current window. The numbers of synchronizations between users and the numbers of independent combinations of each user calculated in different time windows can be merged directly across all windows. Although adjacent windows overlap in time, the data in the overlapping part is removed when the data of each single window is calculated, so the counts can simply be added when merging multi-window data, without considering duplication. For incremental data, the final result for the existing data can be retained, and new data can be handled by directly accumulating the incremental counts.
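The incremental accumulation described here can be sketched as a small running-totals structure. The class name, method names, and the Jaccard-style ratio are hypothetical choices for illustration; the sketch assumes that each window's counts have already had the overlapping part removed, as the text requires.

```python
from collections import defaultdict


class PairTotals:
    """Running totals per user pair; new windows are added without recomputation."""

    def __init__(self) -> None:
        # (user_a, user_b) -> [count of user_a, count of user_b, synchronized count]
        self._totals = defaultdict(lambda: [0, 0, 0])

    def add_window(self, user_i, user_j, count_i, count_j, sync_count) -> None:
        # Store pairs in a canonical order so (A, B) and (B, A) share one entry.
        if user_i > user_j:
            user_i, user_j = user_j, user_i
            count_i, count_j = count_j, count_i
        totals = self._totals[(user_i, user_j)]
        totals[0] += count_i
        totals[1] += count_j
        totals[2] += sync_count

    def synchronicity(self, user_i, user_j) -> float:
        key = (user_i, user_j) if user_i <= user_j else (user_j, user_i)
        count_i, count_j, sync = self._totals.get(key, (0, 0, 0))
        denom = count_i + count_j - sync  # assumed Jaccard-style denominator
        return sync / denom if denom > 0 else 0.0


acc = PairTotals()
acc.add_window("A", "B", 2, 1, 2)  # existing (stock) data
acc.add_window("A", "B", 1, 2, 1)  # incremental data from a later window
print(acc.synchronicity("A", "B"))  # 1.0
```

Only three integers are kept per pair, so a newly arrived window costs one update per pair regardless of how much history has already been processed.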
According to the method for detecting partner attacks based on network traffic, traffic data to be detected is acquired and divided into a plurality of combinations; the combinations are divided into time windows according to a preset time window size; the synchronicity between every two users within a target time window is calculated, and the user pairs whose synchronicity is greater than or equal to a preset threshold are extracted; synchronization processing is performed on the extracted user pairs based on a connected-component algorithm to obtain a plurality of risk clusters; and whether a partner attack exists is determined based on the obtained risk clusters. In this way, partner behavior can be effectively distinguished from normal users, a large number of partner users can be mined, and the attack behavior of a bot partner can be identified quickly and accurately.
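The thresholding and connected-component steps can be illustrated with a short union-find sketch in Python. The function name, the input layout (a dictionary from user pairs to synchronicity values), and the example scores and threshold are assumptions made for illustration; the disclosure does not specify which connected-component implementation is used.

```python
def risk_clusters(pair_scores, threshold):
    """Group users into clusters via pairs whose synchronicity meets the threshold."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Keep only pairs at or above the preset threshold and link them.
    for (user_u, user_v), score in pair_scores.items():
        if score >= threshold:
            union(user_u, user_v)

    # Collect users by the root of their component.
    clusters = {}
    for user in list(parent):
        clusters.setdefault(find(user), set()).add(user)
    # Largest clusters first, members sorted for a stable result.
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: (-len(c), c))


# Hypothetical pairwise synchronicity values.
scores = {("A", "B"): 1.0, ("B", "C"): 0.9, ("D", "E"): 0.85, ("E", "F"): 0.2}
print(risk_clusters(scores, threshold=0.8))  # [['A', 'B', 'C'], ['D', 'E']]
```

Users A and C end up in one cluster even though no score links them directly, which is the effect of taking connected components rather than inspecting pairs in isolation.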
It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present disclosure is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present disclosure. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all alternative embodiments, and that the acts and modules referred to are not necessarily required by the present disclosure.
The foregoing is a description of embodiments of the method, and the following further describes embodiments of the present disclosure through examples of apparatus.
Fig. 5 illustrates a block diagram of a network traffic based partner attack detection device 500 according to an embodiment of the present disclosure. As shown in fig. 5, the apparatus 500 includes:
A traffic acquisition module 510, configured to acquire traffic data to be detected; the traffic data to be detected includes: user information, entity information, and corresponding timestamps.
A dividing and combining module 520, configured to divide the traffic data to be detected into a plurality of combinations, where each combination includes user information, corresponding entity information, and a timestamp.
A time window dividing module 530, configured to perform time window division on each of the plurality of combinations according to a preset time window size.
A synchronicity calculating module 540, configured to calculate the synchronicity between every two users within the target time window, and to extract the user pairs whose synchronicity is greater than or equal to a preset threshold.
A risk cluster module 550, configured to perform synchronization processing on the extracted user pairs based on a connected-component algorithm to obtain a plurality of risk clusters.
An attack determination module 560, configured to determine whether a partner attack exists based on the obtained risk clusters.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the described modules may refer to corresponding procedures in the foregoing method embodiments, which are not described herein again.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
Fig. 6 shows a schematic block diagram of an electronic device 600 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
The device 600 includes a computing unit 601 that can perform various suitable actions and processes according to computer programs stored in a Read Only Memory (ROM) 602 or loaded from a storage unit 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the device 600 may also be stored. The computing unit 601, ROM 602, and RAM 603 are connected to each other by a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
Various components in the device 600 are connected to the I/O interface 605, including: an input unit 606 such as a keyboard, mouse, etc.; an output unit 607 such as various types of displays, speakers, and the like; a storage unit 608, such as a magnetic disk, optical disk, or the like; and a communication unit 609 such as a network card, modem, wireless communication transceiver, etc. The communication unit 609 allows the device 600 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The computing unit 601 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 601 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 601 performs the various methods and processes described above, such as method 100. For example, in some embodiments, the method 100 may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 608. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 600 via the ROM 602 and/or the communication unit 609. One or more of the steps of the method 100 described above may be performed when a computer program is loaded into the RAM 603 and executed by the computing unit 601. Alternatively, in other embodiments, the computing unit 601 may be configured to perform the method 100 by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of the flows shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved, and no limitation is imposed herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (10)

1. A method for detecting partner attacks based on network traffic, characterized by comprising the following steps:
acquiring traffic data to be detected; the traffic data to be detected comprises: user information, entity information, and corresponding timestamps;
dividing the traffic data to be detected into a plurality of combinations, wherein each combination comprises user information, corresponding entity information and a timestamp;
for each of the plurality of combinations, performing time window segmentation according to a preset time window size;
calculating the synchronicity between every two users in a target time window, and extracting the user pairs whose synchronicity is greater than or equal to a preset threshold;
performing synchronization processing on the extracted user pairs based on a connected-component algorithm to obtain a plurality of risk clusters;
and determining whether a partner attack exists based on the obtained plurality of risk clusters.
2. The method of claim 1, wherein the user information comprises: a user ID, an IP address, and user agent information, a user being represented by the user ID or by the IP address plus the user agent information; and the entity information comprises: an access path and device information.
3. The method of claim 1, wherein for each of the plurality of combinations, performing time window segmentation according to a preset time window size comprises:
sorting the divided combinations in time order based on the time stamps to generate a time series combination;
and dividing the time sequence combination according to the preset time window size, wherein the intersection of the adjacent preset time windows is not null.
4. The method of claim 1, wherein the target time window is a single time window, and the calculating the synchronicity between two users in the target time window comprises:
calculating the combination quantity of the first entities of the two users in a single time window according to a preset time interval to obtain independent operands of the first entities of the two users;
counting the synchronous combination quantity of every two users according to the preset time interval to obtain a synchronous operand of a first entity of every two users;
calculating the ratio of the synchronous operand to the independent operand to obtain the initial synchronism of the first entity corresponding to the two users;
and calculating the initial synchronicity for each remaining entity in the same way until all entities have been processed, and taking the maximum initial synchronicity as the synchronization value of a single entity corresponding to the two users.
5. The method of claim 4, wherein the target time window is all time windows, and the calculating the synchronicity between two users in the target time window further comprises:
if the synchronization value of the single entity of the two users is 0 or 1, calculating the combined quantity of all the entities corresponding to the two users in all the time windows to obtain the integral operand corresponding to the two users;
acquiring the integral synchronous number corresponding to two users; the integral synchronous number is the sum of synchronous operands of all entities of every two users;
and calculating the ratio of the integral synchronous number to the integral operand to obtain the integral synchronism of the two corresponding users.
6. The method of claim 1, wherein the determining whether a partner attack exists based on the resulting plurality of risk clusters comprises:
judging the sizes of the obtained multiple risk clusters one by one based on the preset number;
if the number of users of a risk cluster is greater than or equal to the preset number, indicating that the corresponding risk cluster has a partner attack;
if the number of users of the risk clusters is smaller than the preset number, calculating an abnormal score value of the corresponding risk cluster; and if the abnormal score value is greater than or equal to a preset abnormal value, indicating that the corresponding risk cluster has a partner attack.
7. The method of claim 6, wherein the anomaly score value is calculated as follows:
Figure FDA0004021319870000021
wherein score((u, v, t)) represents the anomaly score value of the target user; u and v respectively represent nodes; t represents the current time; the symbol shown in Figure FDA0004021319870000031 represents the number of combinations occurring between u and v in the current time window; and the symbol shown in Figure FDA0004021319870000032 represents the number of combinations occurring between u and v in all time windows preceding the current time window.
8. A network traffic based partner attack detection device, comprising:
a traffic acquisition module, configured to acquire traffic data to be detected; the traffic data to be detected comprises: user information, entity information, and corresponding timestamps;
a dividing and combining module, configured to divide the traffic data to be detected into a plurality of combinations, wherein each combination comprises user information, corresponding entity information and a timestamp;
a time window segmentation module, configured to perform time window segmentation on each of the plurality of combinations according to a preset time window size;
a synchronicity calculating module, configured to calculate the synchronicity between every two users in a target time window and to extract the user pairs whose synchronicity is greater than or equal to a preset threshold;
a risk cluster module, configured to perform synchronization processing on the extracted user pairs based on a connected-component algorithm to obtain a plurality of risk clusters;
and an attack determination module, configured to determine whether a partner attack exists based on the obtained plurality of risk clusters.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
10. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-7.
CN202211691764.0A 2022-12-27 2022-12-27 Method, device, equipment and medium for detecting partner attack based on network traffic Pending CN116192456A (en)

Publications (1)

Publication Number Publication Date
CN116192456A true CN116192456A (en) 2023-05-30

Family

ID=86445393

Country Status (1)

Country Link
CN (1) CN116192456A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination