Detailed Description
A number scanning attack is an attack mode that uses automated machine behavior to simulate logins, trying user names and passwords to test whether they are correct. For example, as shown in fig. 1, the terminal 11 is a device controlled by a hacker and serves as the initiator of a number scanning attack. The IP address of the terminal 11 is IP1. The terminal 11 may launch highly concurrent attacks at a high frequency, such as attack 1, attack 2, attack 3, attack 4, and so on in fig. 1. Each attack is one simulated login launched by the terminal 11, and the user name and password used in each attack may differ. If a login succeeds, the attacker can obtain the private information of the user.
The login information targeted by the number scanning attack may be information of registered users of a certain application. For example, it may be login information of a shopping website: many users register with and shop on the website, and the website stores private information of each user under that user's account. To prevent number scanning attacks and protect the information security of users, the application may use a statistics server to identify the number scanning attack. As shown in fig. 1, the statistics server 12 may receive many service requests. In this example, a service request may be a login request: when a normally registered user logs in to the website on a computer, the user inputs a user name and password to request login, and the computer sends the login request. The service requests received by the statistics server 12 therefore include both requests from normal users and requests from attackers, that is, the login requests corresponding to attack 1, attack 2, and the other attacks sent by the terminal 11 in fig. 1.
The statistics server 12 may determine whether a number scanning attack is occurring by using a counting statistics method. This method records the IP address corresponding to each service request and the number of requests initiated by that IP address, and stores the statistics as key-value pairs, where the key is the IP address and the value is the number of requests. For example, the attacker, i.e. the terminal 11 in fig. 1, attacks at a high frequency, so the IP address IP1 makes frequent login requests. The statistics server 12 may record "key = IP1, value = 1" when it receives the first request, and must update the key-value pair to "key = IP1, value = 2" when it receives the second request. Thus, each time a request is received, a write operation is performed to update the value corresponding to the source IP address of the request.
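The counting statistics method described above can be sketched as follows. This is only a minimal illustration: the in-memory `Counter` stands in for whatever key-value store the statistics server actually uses, and the names `request_counts` and `record_request` are hypothetical.

```python
from collections import Counter

# Hypothetical in-memory stand-in for the statistics server's key-value store:
# key = source IP address, value = number of requests seen from that address.
request_counts = Counter()

def record_request(source_ip):
    """Perform the per-request write: increment and return the count for source_ip."""
    request_counts[source_ip] += 1
    return request_counts[source_ip]

record_request("IP1")   # first request:  key = IP1, value = 1
record_request("IP1")   # second request: key = IP1, value = 2
record_request("IP2")   # a different request end gets its own key
```

Every request triggers one write on the value of its source IP address, which is why a highly concurrent sender concentrates writes on a single key.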
However, because the number scanning attack is highly concurrent, two requests initiated by the same IP address may arrive within a short time, so that two threads need to write and update the value corresponding to that IP address at the same time. A "write conflict" may then occur; write conflict detection is a mechanism that a database adopts to ensure data accuracy. Not every attack request of a number scanning attack results in a write conflict, but the highly concurrent nature of the attack makes write conflicts quite likely to occur and to keep occurring. In this case, if the statistics server retries, i.e. rewrites, every time a conflict occurs, system resources are seriously wasted, and wasting resources on processing illegal requests such as those of a number scanning attack should be avoided as much as possible.
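One way a write conflict of this kind can arise is sketched below. The description does not specify how the database detects conflicts, so the version check used here (an optimistic compare-before-write) is an assumption made purely for illustration, and `VersionedStore` and `WriteConflict` are hypothetical names.

```python
class WriteConflict(Exception):
    """Raised when a write is based on a stale read (assumed detection scheme)."""

class VersionedStore:
    def __init__(self):
        self._data = {}  # key -> (value, version)

    def read(self, key):
        # Unknown keys read as count 0 at version 0.
        return self._data.get(key, (0, 0))

    def write(self, key, value, expected_version):
        _, current_version = self._data.get(key, (0, 0))
        if current_version != expected_version:
            # Another writer updated the key after this writer read it.
            raise WriteConflict(key)
        self._data[key] = (value, current_version + 1)

store = VersionedStore()
# Two requests from IP1 arrive close together; both handlers read before either writes.
value_a, version_a = store.read("IP1")
value_b, version_b = store.read("IP1")
store.write("IP1", value_a + 1, version_a)      # first writer succeeds
try:
    store.write("IP1", value_b + 1, version_b)  # second writer is stale
    conflict = False
except WriteConflict:
    conflict = True
```

In this sketch the second writer would have to re-read and retry to get its update applied, which is the resource cost the method aims to avoid for attack traffic.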
Based on the above, the embodiment of the present application provides an attack identification method that mainly aims to reduce the server resources wasted on write conflicts during a highly concurrent attack. The method identifies whether a service request conflict is caused by a number scanning attack or is a normal conflict; once a conflict is identified as caused by a number scanning attack, the statistics server 12 side no longer retries the write and wastes resources on it. As shown in fig. 2, an attack recognition device 13 is provided in this example, and may include a first cache 131 and at least one second cache 132. For example, each cache may be an LRU (least recently used) Cache, which is a mechanism that removes some of the objects in the cache according to a least-recently-used principle. The first cache 131 and the second cache 132 need not be two physically separate LRU Caches; they may instead be distinguished by what they store. For example, the first cache 131 may store the identification feature of a request end when a conflict occurs, and the second cache 132 may store conflict information corresponding to that identification feature, such as the conflict time. The attack recognition device 13 may obtain the identification feature (for example, the IP address of the terminal 11) of the request end (for example, the terminal 11) carried in a service request, and may determine from the identification feature whether the request end is an attacker. When the attack recognition device 13 determines that the request end is an attacker, it may also intercept the service requests that the attacker sends to the statistics server 12, so as to reduce the load on the statistics server side.
It should be noted that the information stored in the LRU Cache is information related to request conflicts, and the attack recognition device 13 determines whether an IP address where a request conflict occurs is an IP address of an attacker according to the information, that is, determines whether the conflict situation is caused by an attack behavior.
To make the description of the attack recognition method of this example clearer, the following first describes how information is stored in the LRU Caches described above, and then how the attack recognition device 13 performs attack recognition based on that information.
Information storage in the LRU Cache:
As shown in fig. 2, although the attack recognition device 13 can recognize an attacker according to the information in the LRU Cache, it usually determines that a certain IP address is an attacker only when that information shows that the frequency of conflicts has reached a certain threshold, which is set according to the characteristics of the number scanning attack. Therefore, when little information is stored in the LRU Cache and the threshold has not been reached, the attack recognition device 13 cannot determine that the request end is an attacker even if it queries the LRU Cache. The attack recognition device 13 may then release the service request, and the request is received by the statistics server 12 side.
A released service request may still be an attack request that the attack recognition device 13 has simply not yet identified. If the frequency of attack requests is particularly high, write conflicts may still occur on the statistics server 12 side. In this example, the statistics server 12 may feed back to the attack recognition device 13 that the processing result of the current service request is a write conflict. The attack recognition device 13 thereby learns that a service request it released has conflicted on the statistics server 12 side, and may then record the current service conflict by storing information related to it in the LRU Cache.
As shown in fig. 3, the information in the LRU Cache is stored in the form of a list. Assuming that the current conflict involves a new IP address, the attack recognition device 13 may store the IP address as key4 in the conflict list of the first cache (in the following examples, both the first cache and the second cache may be LRU Caches). The conflict list also stores key1, key2, and key3, which are IP addresses that previously had conflicts and were stored in the list. The conflict list thus stores the keys for which service conflicts (such as the write conflicts described above) have occurred. These keys may be referred to as identification features: the first cache is used to store the identification feature of each request end for which a service conflict has occurred. For example, the identification feature may be the IP address of the request end device, represented in this example by a key.
Moreover, the attack recognition device 13 sets a corresponding second cache for each key. The second cache stores the conflict information corresponding to the key, which may include the time at which each conflict occurred, the value to be written at the time of the conflict, and the like. The conflict information may also be used to determine the number of conflicts corresponding to the identification feature within a preset time period, as described in the following examples. Fig. 3 illustrates the time in the conflict information: for key4, newly added to the conflict list, the time of the current conflict, time4-1, is added to the second cache corresponding to key4. If a conflict prompt corresponding to key4 is received a second time, the attack recognition device 13 may record a further piece of conflict information among those corresponding to key4.
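The two-level storage of fig. 3 might be modelled as below. This is a sketch only: an `OrderedDict` plays the role of the conflict list in the first cache, a plain list per key plays the role of the second cache, capacity limits are left out, and the function name `record_conflict` and the numeric times are hypothetical.

```python
from collections import OrderedDict

# First cache: conflict list of identification features (keys).
# Each key maps to its own second cache holding conflict information.
conflict_list = OrderedDict()

def record_conflict(key, conflict_time, pending_value=None):
    """Store one conflict reported by the statistics server for this key."""
    second_cache = conflict_list.setdefault(key, [])
    second_cache.append({"time": conflict_time, "value": pending_value})
    # Mark the key as the most recently conflicting entry.
    conflict_list.move_to_end(key)

for key in ("key1", "key2", "key3"):
    record_conflict(key, 100.0)                   # keys that conflicted earlier
record_conflict("key4", 160.0, pending_value=2)   # new IP address: time4-1
record_conflict("key4", 161.5, pending_value=3)   # second conflict prompt for key4
```

Here the end of the `OrderedDict` corresponds to the most recently conflicting key; whether that is drawn at the top or the bottom of the list in fig. 3 is a presentation choice.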
Two points should be noted about the way the LRU Cache illustrated in fig. 3 stores information:
First, useless information can be eliminated automatically by using the LRU Cache:
Taking the conflict list as an example, if a key in the list conflicts again, the key is moved up in the list. For example, if the service request corresponding to key1 in fig. 3 conflicts again, key1 is moved above key4 in the list. Under this principle, a key at the bottom of the list has usually had no conflict for a long time, and is removed when the LRU Cache is full and some data needs to be cleared. That the key at the bottom of the conflict list has not conflicted for a long time shows that its conflicts occur at a low frequency, which does not match the characteristics of an attacker, so the key can be moved out of the list. Of course, if the key conflicts again later, it can be added to the list again and monitoring can restart.
The maintenance mechanism of the second cache, which stores the conflict information, follows the same principle as that of the first cache: the time at the bottom of the conflict information list is the conflict time that occurred longest ago, and the time furthest from the current time is eliminated first.
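The least-recently-used elimination described above can be illustrated with a small cache built on Python's `OrderedDict`. The class name `LRUCache`, its `touch` method, and the capacity of 3 are assumptions chosen for the illustration, not details taken from the embodiment.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # last entry = most recently used

    def touch(self, key, value=None):
        """Insert key or refresh it, evicting the least recently used entry when full."""
        if key in self._items:
            self._items.move_to_end(key)
            if value is not None:
                self._items[key] = value
        else:
            if len(self._items) >= self.capacity:
                self._items.popitem(last=False)  # drop the least recently used key
            self._items[key] = value

    def keys(self):
        """Return keys ordered from least to most recently used."""
        return list(self._items)

cache = LRUCache(capacity=3)
for key in ("key1", "key2", "key3"):
    cache.touch(key)
cache.touch("key1")   # key1 conflicts again and becomes the most recent entry
cache.touch("key4")   # cache is full, so key2 (least recent) is evicted
```

After these calls the cache holds key3, key1, and key4. If key2 conflicts again later, another `touch` simply re-adds it and monitoring restarts.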
Secondly, capacity design of the LRU Cache:
In this example, the operating mechanism of the LRU Cache may be used to eliminate useless data automatically. The LRU Cache must clean out useless data, so that the amount of data to query stays small and queries are fast, but it must also retain enough data to identify an attacker according to a predetermined attack identification condition. For example, if the attack identification condition is "a key that conflicts 10 times within 1 minute is determined to be an attacker", the second cache must be able to store the conflict information of at least 10 conflicts. That is, its capacity must hold at least a preset number of conflict times, where the preset number equals the number of conflicts corresponding to the preset threshold (10 in the example above). The capacity of the first cache depends on the number of keys to be monitored simultaneously: if 1000 keys are to be monitored, the first cache must be able to store at least that preset number (e.g., 1000) of identification features.
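The capacity requirement on the second cache can be checked with a short sketch. It assumes a bounded `deque` as the second cache and one conflict per second; both are choices made for the illustration only, using the "10 conflicts within 1 minute" condition from the example.

```python
from collections import deque

THRESHOLD = 10        # conflicts needed to flag an attacker (from the example)
WINDOW_SECONDS = 60   # "within 1 minute"

# Second cache sized to exactly the threshold: the oldest times fall out automatically.
times = deque(maxlen=THRESHOLD)
for t in range(25):   # 25 conflicts, one per second (hypothetical traffic)
    times.append(float(t))

now = 24.0
recent = [t for t in times if now - t <= WINDOW_SECONDS]
flagged = len(recent) >= THRESHOLD
```

Only the 10 most recent conflict times survive, and they are sufficient to evaluate the condition; a smaller capacity could never reach the threshold, while a larger one would only retain times that the decision does not need.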
Information utilization in LRU Cache:
The attack recognition device 13 may recognize a number scanning attack according to the information stored in the LRU Cache. Fig. 4 illustrates a flowchart of an attack identification method according to an embodiment of the present application, and the attack recognition device 13 can identify a number scanning attack according to the flowchart. As shown in fig. 4, the method includes:
In step 401, it is determined that the identification feature of the request end carried in the service request is stored.
For example, the attack recognition device 13 obtains an identification feature of the request end carried in one service request, where the identification feature may be an IP address, IP1, of the terminal 11.
The attack recognition device 13 queries the first cache and, in this example, may determine that the first cache stores the identification feature. For example, the attack recognition device 13 may query the conflict list of the first cache. If IP1 (assume that IP1 is key3) is found in the conflict list, the identification feature is determined to be stored and the method proceeds to step 402. Otherwise, if IP1 is not in the list, the service request may be released so that it is sent to the statistics server side. On the statistics server side, if the request from IP1 does not cause a write conflict, the statistics server may update the request count value corresponding to IP1 as normal; if a conflict occurs, the statistics server can feed this back to the attack recognition device for recording.
In step 402, the number of service conflicts corresponding to the identification feature is obtained.
For example, still taking key3 as an example and assuming that the preset time period is 1 minute, the conflict times that fall within the preset time period may be obtained from the current time and the preset time period, such as the two times time3-1 and time3-2 corresponding to key3 in fig. 3. Assuming that time3-3 is more than 1 minute before the current time, the number of conflicts within the preset time period ending at the current time is determined to be two.
In step 403, if the number of conflicts corresponding to the identification feature reaches a threshold within the preset time period, it is determined that the request end corresponding to the identification feature is an attacker and that the service request is an attack request.
For example, if the threshold is 2, the number of conflicts corresponding to key3 obtained in step 402 has reached the threshold. The attack recognition device may determine that IP1 is the IP address of an attacker and that the service request is an attack request, and may intercept the request so that it is no longer sent to the statistics server side. If the threshold is 10, the number of conflicts corresponding to key3 obtained in step 402 has not reached the threshold, the attack recognition device cannot determine that IP1 is an attacker, and the service request may be released.
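The count in step 402 can be sketched as a filter over the stored conflict times. The concrete timestamps below are invented for illustration; only their relationship to the 1-minute period mirrors the key3 example.

```python
def conflicts_in_window(conflict_times, now, window_seconds):
    """Count the conflict times that fall within the preset period ending at now."""
    return sum(1 for t in conflict_times if 0 <= now - t <= window_seconds)

now = 1000.0
# Hypothetical values for time3-1, time3-2 and time3-3 of key3, in seconds.
key3_times = [995.0, 970.0, 900.0]
count = conflicts_in_window(key3_times, now, 60.0)
```

With these values, time3-1 and time3-2 lie within the last minute and time3-3 does not, so `count` is 2.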
In addition, in the above example, if the number of conflicts corresponding to the identification feature has not reached the threshold within the preset time period, or the identification feature of the request end is not stored in the first cache, the service request is sent to the server for processing. When a prompt fed back by the server indicates that a service conflict has occurred, the corresponding conflict information is stored in the second cache corresponding to the identification feature of the current service request.
According to the attack identification method provided by this example, conflict information is stored in a cache, and whether a request end that has had service conflicts is an attacker can be determined from that information. An attack can thus be intercepted once it is identified, which reduces the server resources wasted on write conflicts during a highly concurrent attack. In addition, using the LRU Cache to store the information facilitates attack identification: its operating mechanism automatically eliminates useless information and keeps the amount of information small, so queries execute quickly and attacks can be identified quickly.
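Steps 401 to 403 and the feedback path can be combined into one end-to-end sketch. Everything named here (`AttackRecognizer`, `handle`, the `process` callback standing in for the statistics server, and the string results) is hypothetical; the embodiment does not prescribe an interface.

```python
from collections import OrderedDict, deque

class AttackRecognizer:
    def __init__(self, threshold, window_seconds, max_keys):
        self.threshold = threshold
        self.window = window_seconds
        self.max_keys = max_keys
        self.first_cache = OrderedDict()  # key -> second cache (deque of conflict times)

    def is_attack(self, key, now):
        times = self.first_cache.get(key)             # step 401: is the key stored?
        if times is None:
            return False
        recent = sum(1 for t in times if now - t <= self.window)  # step 402
        return recent >= self.threshold               # step 403

    def record_conflict(self, key, now):
        if key not in self.first_cache and len(self.first_cache) >= self.max_keys:
            self.first_cache.popitem(last=False)      # evict least recently conflicting key
        times = self.first_cache.setdefault(key, deque(maxlen=self.threshold))
        times.append(now)
        self.first_cache.move_to_end(key)

    def handle(self, key, now, process):
        """Intercept known attackers; otherwise release and record any reported conflict."""
        if self.is_attack(key, now):
            return "intercepted"
        if process(key) == "conflict":                # feedback from the server side
            self.record_conflict(key, now)
        return "released"

recognizer = AttackRecognizer(threshold=2, window_seconds=60.0, max_keys=1000)
always_conflict = lambda key: "conflict"              # simulated statistics server
results = [recognizer.handle("IP1", float(t), always_conflict) for t in range(3)]
```

With a threshold of 2, the first two requests from IP1 are released and recorded as conflicts, and the third is intercepted before it reaches the simulated server.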
In order to implement the attack identification method, an embodiment of the present application further provides an attack identification device, as shown in fig. 5, the device may include: an information acquisition module 51 and an identification processing module 52.
The information obtaining module 51 is configured to, when it is determined that the identification feature of the request end carried in a service request is stored, obtain the number of service conflicts corresponding to the identification feature within a preset time period;
and the identification processing module 52 is configured to determine that the request end corresponding to the identification feature is an attacker and that the service request is an attack request if the number of conflicts reaches a threshold within the preset time period.
In one example, as shown in fig. 6, the apparatus may further include: an information storage module 53;
the identification processing module 52 is further configured to send the service request to a server for processing when the identification feature of the request end is not stored, or when the number of conflicts corresponding to the identification feature within a preset time period does not reach a threshold;
and the information storage module 53 is configured to store the identification feature and record the service conflict when receiving a prompt, fed back by the server, that a service conflict has occurred.
In one example, the information storage module 53, when recording the service conflict, is configured to: store the identification feature of the request end of the current service conflict in a first LRU Cache, where the first LRU Cache is used to store the identification feature of each request end for which a service conflict has occurred; and store conflict information of the service conflict in a second LRU Cache corresponding to the identification feature, where the conflict information is used to determine the number of conflicts corresponding to the identification feature within a preset time period.
In one example, the information obtaining module 51 is configured to, when the conflict information includes the conflict time of each conflict corresponding to the identification feature, obtain the conflict times that fall within the preset time period according to the current time and the preset time period, where the number of those conflict times is the number of conflicts corresponding to the identification feature.
In one example, the capacity of the first LRU Cache is sufficient to store at least a preset number of identification features;
and when the conflict information includes conflict times, the capacity of the second LRU Cache is sufficient to store at least a preset number of conflict times, where the preset number is equal to the number of conflicts corresponding to the threshold.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.