CN107872434B

CN107872434B - Method and device for screening access points

Info

Publication number: CN107872434B
Application number: CN201610854659.2A
Authority: CN
Inventors: 陈文杰
Original assignee: Alibaba Group Holding Ltd
Current assignee: Alibaba Group Holding Ltd
Priority date: 2016-09-27
Filing date: 2016-09-27
Publication date: 2020-12-01
Anticipated expiration: 2036-09-27
Also published as: CN107872434A

Abstract

The scheme provides a method for screening access points, relates to the field of data processing, and can improve, to a certain extent, the accuracy of identifying attacked sites. The method comprises: acquiring the number of times each of at least two access points is accessed; and determining fake access points according to the number of times each access point is accessed and the total number of access points. The embodiment of the scheme is applicable to the processing of forged data packets.

Description

Method and device for screening access points

Technical Field

The present disclosure relates to the field of data processing technologies, and in particular, to a method and an apparatus for screening access points.

Background

The Internet transmits data packets based on the TCP/IP (Transmission Control Protocol/Internet Protocol) protocol suite, but the authenticity of transmitted data packets is not strictly controlled; that is, the network cannot tell whether a data packet is genuine. As a result, the Internet is currently flooded with a large number of forged data packets.

Forged data packets arise mainly because, in order to find the exact location of a target site, large-scale network probing is performed to scan the entire Internet. During such a scan, data packets addressed to every IP address on the Internet are forged and sent in parallel, which floods the Internet with forged data packets. These packets significantly affect systems that perform calculation and control based on real network access data, and make the results of data analysis inaccurate. In the prior art, there are three methods for identifying forged data:

The first method: DNS (Domain Name System) resolution is performed for the target site of the data packet; if the resolved IP address is not in the network of the target site, the data packet received by the access point is judged to be a forged data packet, that is, the access point is attacked by forged data packets.

The second method: based on the premise that network probing does not need to access a target continuously (that is, the number of consecutive accesses is relatively low), the number of accesses to the target HOST per unit time is counted. If the number of accesses is lower than a uniform threshold, the data packets accessing the target HOST in that unit time are judged to be forged data packets.

The third method: based on the premise that the target HOST does not respond to forged accesses, network requests that have no matching response are treated as forged accesses.

In the process of implementing the scheme, the inventor found that the prior art has at least the following problems:

In the first method, the IP address obtained by DNS resolution varies with the link and the region of the DNS server, so the judgment of whether an access point is attacked by forged data packets is inaccurate.

In the second method, different network probes use different strategies and probing rules, so detection with only a single threshold has low accuracy.

In the third method, most WEB servers are not configured to respond only to the hosts they serve and instead respond to almost any access, so identifying forged-packet attacks by the presence or absence of a response has a significant false-detection rate.

Disclosure of Invention

The screening method and the screening device for the access point can improve the identification accuracy of the attacked site to a certain extent.

In a first aspect, an embodiment of the present disclosure provides a method for screening access points, including:

acquiring the respective corresponding accessed times of at least two access points;

and determining fake access points according to the accessed times of the access points and the total number of the access points.

The above aspect and any possible implementation manner further provide an implementation manner, where the determining a fake access point according to the number of times of access by each of the access points and the total number of the access points includes:

determining, according to the number of times of access of each of the access points and the total number of the access points, a flatness between the numbers of times of access of the access points, and a flatness between the total number of the access points and the total number of times of access;

and determining the fake access point according to the two flatness values.

The above aspect and any possible implementation manner further provide an implementation manner, wherein the determining, according to the number of times each access point is accessed and the total number of the access points, a flatness between the number of times each access point is accessed comprises:

calculating according to the number of times of access of each access point and the total number of the access points to obtain a standard deviation and an average value of the number of times of access of each access point;

dividing the standard deviation by the average value to obtain the flatness.

The aspect described above and any possible implementation further provide an implementation, wherein the determining, according to the number of times of access and the total number of access points for each of the access points, a flatness between the total number of access points and the total number of times of access includes:

calculating according to the number of times of access of each access point and the total number of the access points to obtain an average value of the number of times of access of each access point;

dividing the total number by the average value to obtain the flatness.

The above aspect and any possible implementation further provide an implementation, where the determining the fake access point according to the flatness and the flatness includes:

determining the at least two access points as fake access points when the flatness between the numbers of times of access satisfies a first threshold range and the flatness between the total number and the number of times of access satisfies a second threshold range.

The above-mentioned aspect and any possible implementation manner further provide an implementation manner, after the determining that the at least two access points are fake access points, further including:

acquiring the corresponding access times of at least three access points including the determined fake access point;

and determining a new fake access point according to the respective accessed times of the at least three access points and the total number of the at least three access points.

The above-described aspects and any possible implementation further provide an implementation, when a new fake access point cannot be determined according to the respective number of times of access of the at least three access points and the total number of the at least three access points, the method further includes:

acquiring the accessed times of at least two other access points which are completely different from the determined fake access points; and determining a fake access point according to the accessed times of the at least two other access points and the total number of the at least two other access points.

The above-described aspects and any possible implementation further provide an implementation, further including:

when the flatness between the numbers of times of access does not satisfy the first threshold range, or the flatness between the total number and the number of times of access does not satisfy the second threshold range, comparing the total number of the access points with a rated number, wherein the rated number is not less than 2;

when the total number of the access points is equal to the rated number, acquiring the accessed times of at least two other access points which are not identical with the at least two access points; and determining a fake access point according to the accessed times of the at least two other access points and the total number of the at least two other access points.

The above-described aspect and any possible implementation manner further provide an implementation manner, before the acquiring the number of times of accesses corresponding to each of the at least two access points, further including:

collecting a data packet;

acquiring an access point corresponding to the acquired data packet;

determining a top-level domain name corresponding to each access point according to the access point corresponding to the acquired data packet;

selecting no fewer than the rated number of access points from among the access points corresponding to the same top-level domain name.

The above-mentioned aspect and any possible implementation manner further provide an implementation manner, after the determining a fake access point according to the number of times of access by each of the access points and the total number of the access points, further including:

acquiring a target data packet from the collected data packets, wherein the destination address of the target data packet is one of the fake access points;

collecting current data flow;

and performing a specified operation on a data packet with the same source address as the target data packet in the data traffic.

In a second aspect, an embodiment of the present disclosure further provides an apparatus for screening access points, including:

the acquisition unit is used for acquiring the respective corresponding accessed times of at least two access points;

and the determining unit is used for determining fake access points according to the accessed times of the access points and the total number of the access points.

The above-described aspect and any possible implementation further provide an implementation, where the determining unit includes:

a first determining subunit, configured to determine, according to the number of times of access to each of the access points and the total number of the access points, a flatness between the number of times of access to each of the access points, and a flatness between the total number of the access points and the total number of times of access;

and the second determining subunit is used for determining the forged access point according to the flatness and the flatness.

The foregoing aspect and any possible implementation manner further provide an implementation manner, where the first determining subunit is specifically configured to perform calculation according to the number of times of access to each access point and the total number of the access points, to obtain a standard deviation and an average value of the number of times of access to each access point; dividing the standard deviation by the average value to obtain the flatness.

The foregoing aspect and any possible implementation manner further provide an implementation manner, where the first determining subunit is specifically configured to perform calculation according to the number of times of access to each access point and the total number of the access points, to obtain an average value of the number of times of access to each access point; dividing the total number by the average value to obtain the flatness.

The above-described aspect and any possible implementation further provide an implementation, where the second determining subunit is specifically configured to determine that the at least two access points are fake access points when the flatness between the numbers of times of access satisfies a first threshold range and the flatness between the total number and the number of times of access satisfies a second threshold range.

The above aspects, and any possible implementations, further provide an implementation,

the acquisition unit is used for acquiring the respective corresponding access times of at least three access points including the determined fake access point;

and the determining unit is used for determining a new forged access point according to the respective access times of the at least three access points and the total number of the at least three access points.

the acquisition unit is used for acquiring the accessed times of at least two other access points which are completely different from the determined forged access point;

and the determining unit is used for determining fake access points according to the accessed times of the at least two other access points and the total number of the at least two other access points.

The above-described aspects and any possible implementations further provide an implementation, where the apparatus further includes:

a comparison unit, configured to compare a magnitude relationship between the total number of access points and a rated number when the flatness does not satisfy a first threshold range or the flatness does not satisfy a second threshold range, where the rated number is not less than 2;

the acquisition unit is used for acquiring the accessed times of other at least two access points which are not identical with the at least two access points when the total number of the access points is equal to a rated number;

the acquisition unit is used for acquiring data packets;

the first acquisition unit is used for acquiring the access point corresponding to the acquired data packet;

the determining unit is used for determining the top-level domain name corresponding to each access point according to the access point corresponding to the acquired data packet;

and the selection unit is used for selecting access points which are not less than the rated number from the access points corresponding to the same top-level domain name.

a second obtaining unit, configured to obtain a target data packet from the collected data packets, where a destination address of the target data packet is one of the fake access points;

the acquisition unit is used for acquiring the current data flow;

and the operation unit is used for carrying out specified operation on the data packet with the same source address as the target data packet in the data flow.

Compared with the prior art, the identification method provided by the embodiment of the scheme is closer to the actual situation and can avoid interference factors caused by DNS analysis, network detection strategies and service response, so that the identification of the forged access points is more accurately realized.

Drawings

In order to more clearly illustrate the embodiment of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiment or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without inventive labor.

FIG. 1 is a flowchart of a method for screening access points in an embodiment of the present disclosure;

FIG. 2 is a flowchart of another method for screening access points in this embodiment;

FIG. 3 is a flowchart of another method for screening access points in this embodiment;

FIG. 4 is a flowchart of another method for screening access points in this embodiment;

FIG. 5 is a flowchart of another method for screening access points according to the embodiment of the present invention;

FIG. 6 is a flowchart of another method for screening access points in this embodiment;

FIG. 7 is a flowchart of another method for screening access points in this embodiment;

FIG. 8 is a flowchart of another method for screening access points in this embodiment;

FIG. 9 is a flowchart of another method for screening access points in this embodiment;

FIG. 10 is a block diagram showing the components of an access point screening apparatus according to the embodiment of the present invention;

FIG. 11 is a block diagram showing the components of another access point screening apparatus according to the embodiment of the present invention;

FIG. 12 is a block diagram showing the components of another access point screening apparatus according to the embodiment of the present invention;

FIG. 13 is a block diagram showing the components of another access point screening apparatus according to the embodiment of the present invention;

FIG. 14 is a block diagram showing the components of another access point screening apparatus according to the embodiment of the present invention;

FIG. 15 is a block diagram showing the components of another access point screening apparatus according to the embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure. All other embodiments obtained by a person skilled in the art based on the embodiments in the present disclosure without any creative effort belong to the protection scope of the present disclosure.

Current access behavior using forged data packets generally includes large-scale network probing with a forged target HOST. A benign example is ZMap, which can scan the entire Internet in 44 minutes; a malicious example is scanning all IP addresses of a country in order to attack the origin-site nodes of a CDN.

It also includes concentrated access to a target HOST from forged source IPs, such as a SYN flood DDoS network attack, or volume-inflating ("brushing") behavior carried out for some business purpose.

Forged data packets of this kind need to reach a large number of access points in a short time, so the number of accesses to any single access point within a given period is usually low, perhaps only a few or a dozen. Furthermore, because the accesses cover network addresses on the scale of a country, a region, or the whole network, the number of access points is large, and the number of accesses to a single access point is small compared with the total number of access points. In this scheme, an access point that is accessed by such large-scale network probing is called a fake access point; the data packets and traffic received by a fake access point can accordingly be determined to be forged data packets and forged traffic.

In order to solve the problem in the prior art, based on the foregoing situation, the embodiment of the present invention analyzes the statistical characteristics of the forged data packet, and finds the following two rules:

1) When forged data packets are used to probe a network, it is only necessary to ensure that every access point in the network is accessed; there is no need to deliberately increase or decrease the number of accesses for any particular access point. The probing accesses are therefore indiscriminate with respect to the probed targets, and the numbers of accesses to the fake access points within the probing range are substantially equal, which can be described by a flatness.

2) When forged data packets are used to probe a network, the aim is traversal, so few accesses are made to each access point. The probing depth for each access point (i.e., the number of times each access point is accessed) is therefore limited relative to the breadth of the whole probing range (i.e., the number of access points in the probed network). In other words, the number of times each access point is accessed is much smaller than the total number of access points in the probing range, which can also be described by a flatness.

The embodiment of the scheme takes the two rules as decision bases to realize the identification of each access point.

An access point screening method provided in an embodiment of the present invention is, as shown in fig. 1, including:

101. Acquiring the accessed times corresponding to each of at least two access points.

An access point refers to a site device identified by a designated host name (HOST), a designated IP address, and a designated port number; it may be any of various network devices, such as a gateway or an operator server. The access point may be configured to receive a request sent by a user, complete processing in response to the request, and return a processing result.

To make better use of the two statistical rules above, as many access points as possible are collected in the implementation of this embodiment; the number collected can be set according to the judgment precision actually required.
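As a minimal sketch of step 101, assuming each collected packet has already been parsed into a record carrying its destination HOST, destination IP and destination port, the accessed times of each access point could be tallied as follows. The `Packet` and `AccessPoint` types and the `count_accesses` helper are hypothetical names introduced only for illustration; they are not part of the scheme.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class AccessPoint(NamedTuple):
    """An access point identified by HOST, IP address and port number."""
    host: str
    ip: str
    port: int


class Packet(NamedTuple):
    """Hypothetical parsed packet record; the field names are assumptions."""
    src_ip: str
    src_port: int
    dst_host: str
    dst_ip: str
    dst_port: int


def count_accesses(packets: Iterable[Packet]) -> Counter:
    """Return the number of times each access point was accessed."""
    counts: Counter = Counter()
    for p in packets:
        # The destination triple is what identifies the access point.
        counts[AccessPoint(p.dst_host, p.dst_ip, p.dst_port)] += 1
    return counts
```

The resulting mapping gives both inputs used in step 102: its values are the accessed times, and its length is the total number of access points.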

102. And determining fake access points according to the accessed times of the access points and the total number of the access points.

Compared with the prior art, the identification method provided by the embodiment of the scheme is closer to the actual situation, namely, the identification can be carried out based on the relations between the access times and the total number of the access points in the access process of the forged data, so that the interference factors caused by DNS analysis, network detection strategies and service response can be effectively avoided, and the identification of the forged access points can be more accurately realized.

Specifically, compared with DNS resolution, the embodiment of the present invention is based on the number of accesses and the total number of access points, and does not need to perform identification of the home of the access IP, which can eliminate interference in the DNS resolution process.

Compared with setting a uniform threshold on the number of accesses, the embodiment of the scheme does not judge by the number of accesses alone; it identifies fake access points using the flatness between the accessed times together with the flatness between the number of access points and the accessed times. This removes the dependence on the scale of the number of accesses under different access strategies and eliminates the influence of the threshold setting.

Compared with the case of collecting whether the processing response exists, the embodiment of the scheme does not need to judge based on the result, so that the influence caused by the processing response can be eliminated.

For the implementation of step 102, the embodiment of the present invention provides the following process, as shown in fig. 2, including:

1021. Determining the flatness between the accessed times of the access points according to the accessed times of each access point and the total number of the access points.

Wherein the flatness is used to reflect whether the number of accesses to a fake access point within the detection range is substantially average.

Regarding the method for acquiring the flatness, the embodiment of the present disclosure provides a feasible implementation manner, specifically:

calculating according to the number of times of access of each access point and the total number of the access points to obtain a standard deviation and an average value of the number of times of access of each access point; dividing the standard deviation by the average value to obtain the flatness.

The standard deviation and the average value refer to the standard deviation and the average value of a plurality of access times corresponding to a plurality of acquired access points, and are evaluation parameters capable of reflecting the averageness, and the similar parameters such as variance, weighted average value and the like can be applied to the technical scheme provided by the embodiment of the scheme.

In addition, the flatness may be expressed as the ratio between the standard deviation and the average value; the embodiment of the present disclosure does not limit which of the two is the numerator and which is the denominator.
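The calculation in step 1021 can be sketched as below, taking the standard deviation as the numerator and the average value as the denominator. The function name `flatness` and the use of the population standard deviation are assumptions; the text does not say whether the population or sample form is intended.

```python
from statistics import mean, pstdev
from typing import Sequence


def flatness(access_counts: Sequence[int]) -> float:
    """Standard deviation of the accessed times divided by their average."""
    if len(access_counts) < 2:
        raise ValueError("at least two access points are required")
    avg = mean(access_counts)
    if avg == 0:
        raise ValueError("average accessed times is zero")
    # Population standard deviation is an assumption made for this sketch.
    return pstdev(access_counts) / avg
```

A value close to zero means the access points were accessed almost equally often, which is the first statistical rule.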

1022. And determining the flatness between the total number of the access points and the total number of the accessed times according to the accessed times of the access points and the total number of the access points.

The flatness is used to reflect whether the number of accesses per access point is much smaller than the total number of access points in the probe range.

Regarding the method for obtaining the flatness, the embodiment of the present disclosure provides a feasible implementation manner, specifically:

calculating according to the number of times of access of each access point and the total number of the access points to obtain an average value of the number of times of access of each access point; dividing the total number by the average value to obtain the flatness.

The average value is an average value of a plurality of access times corresponding to the acquired access points, and is an evaluation parameter capable of reflecting the averageness, and parameters such as a weighted average value similar to the evaluation parameter can be applied to the technical scheme provided by the embodiment of the scheme.

In addition, this flatness may be expressed as the ratio between the total number and the average value; the embodiment of the present disclosure does not limit which of the two is the numerator and which is the denominator.
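The calculation in step 1022 can be sketched in the same way, with the total number of access points as the numerator and the average accessed times as the denominator. The source text uses the word "flatness" for this quantity as well; the name `total_to_average_ratio` is only an illustrative label chosen to keep the two quantities apart.

```python
from statistics import mean
from typing import Sequence


def total_to_average_ratio(access_counts: Sequence[int]) -> float:
    """Total number of access points divided by the average accessed times."""
    total_number = len(access_counts)
    if total_number < 2:
        raise ValueError("at least two access points are required")
    avg = mean(access_counts)
    if avg == 0:
        raise ValueError("average accessed times is zero")
    return total_number / avg
```

A large value means each access point was accessed far fewer times than there are access points, which is the second statistical rule.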

1023. Determining the fake access point according to the two flatness values.

Specifically, the at least two access points are determined to be fake access points when the flatness between the accessed times satisfies a first threshold range and the flatness between the total number and the accessed times satisfies a second threshold range.

The first threshold range may be (0, A), where A takes a value from 5% to 10%, and the second threshold range may be (B, +∞), where B takes a larger value, e.g., 100 or 200. Of course, these values are merely examples; the first threshold range and the second threshold range may each be set according to how the corresponding flatness is expressed and other practical requirements, and the embodiment of the present invention is not limited in this respect.
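Putting steps 1021 to 1023 together, a minimal sketch of the decision might look as follows. The constants `A = 0.10` and `B = 100.0` are taken from the example ranges above; the function name `all_fake` and the choice of the population standard deviation are assumptions.

```python
from statistics import mean, pstdev
from typing import Sequence

A = 0.10   # upper bound of the first threshold range (0, A); example value
B = 100.0  # lower bound of the second threshold range (B, +inf); example value


def all_fake(access_counts: Sequence[int], a: float = A, b: float = B) -> bool:
    """Return True when the sampled access points are judged to be fake."""
    avg = mean(access_counts)
    if avg <= 0:
        return False
    spread = pstdev(access_counts) / avg   # step 1021: std / average
    ratio = len(access_counts) / avg       # step 1022: total number / average
    # Step 1023: both values must fall inside their (open) threshold ranges.
    return 0 < spread < a and ratio > b
```

For instance, 10000 access points that were each accessed about 10 times give a ratio of roughly 1000 and a small spread, so the sample passes both tests, whereas a small sample or a sample with very uneven accessed times does not.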

It should be noted that the embodiment of the present disclosure does not specifically limit the order in which the two flatness values are calculated. The order of steps 1021 and 1022 is just one possible implementation.

In addition, it should be noted that the two statistical rules are meaningful only when the sample size is sufficiently large, so the embodiment of the present disclosure places a limit on the sample size of the collected access points. When the flatness between the accessed times does not satisfy the first threshold range, or the flatness between the total number and the accessed times does not satisfy the second threshold range, it cannot yet be determined whether the collected access points are attacked by forged data packets, and a certain number of access points need to be collected again for identification. The related flow is shown in fig. 3 and, after step 1023, includes:

103. Comparing the total number of access points with a rated number.

The rated number is the minimum sample size that ensures the two statistical rules can manifest in the calculation, and is defined as not less than 2. The value 2 is just an example; it can be increased to 50, 100, 200 or even more according to actual needs.

104. When the total number of access points is equal to the rated number, acquiring the accessed times of at least two other access points which are not identical to the at least two access points.

To avoid repeating identification work, access points identical to those already acquired in step 101 should be avoided as far as possible.

When the total number of access points is equal to the rated number, the minimum sample has failed to identify a fake access point, so the current selection needs to be rebuilt, i.e., step 104 is performed. When the total number of access points is larger than the rated number, access points that cannot be identified were introduced during sample expansion, and the identification process for the acquired access points can be ended. For the sample expansion process, refer to subsequent steps 106 and 107.

In addition, since reselection is required whenever the total number of access points is equal to the rated number, the total number of access points is never less than the rated number.

105. Determining a fake access point according to the accessed times of the at least two other access points and the total number of the at least two other access points.

For the method of identifying the reselected access points, refer to steps 101 and 102 and the related descriptions, which are not repeated here.
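The branching described for steps 103 to 105 can be summarized in a small sketch. The `NextStep` enumeration and the `next_step` function are hypothetical names; the branch taken when the sample is judged fake follows the sample expansion of steps 106 and 107.

```python
from enum import Enum


class NextStep(Enum):
    EXPAND_SAMPLE = "expand"   # sample judged fake: continue with steps 106 and 107
    RESELECT = "reselect"      # minimum sample failed: steps 104 and 105
    STOP = "stop"              # expanded sample failed: end identification


def next_step(judged_fake: bool, total_number: int, rated_number: int) -> NextStep:
    """Decide what to do after the two threshold tests have been applied."""
    if rated_number < 2:
        raise ValueError("the rated number is not less than 2")
    if total_number < rated_number:
        raise ValueError("the total number is never less than the rated number")
    if judged_fake:
        return NextStep.EXPAND_SAMPLE
    if total_number == rated_number:
        # Step 103 found the minimum sample size, so a new sample is selected.
        return NextStep.RESELECT
    # More than the rated number: unidentifiable access points were introduced
    # during expansion, so identification of this sample ends.
    return NextStep.STOP
```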

The above-described method flow is an identification flow for the acquired access points, and can identify whether the acquired access points are all fake access points at one time. If the acquired access points are all fake access points, the fake access points can be used as identification reference samples of other non-acquired access points.

Based on the condition that the collected access points are all fake access points, after the step 102 is executed and the at least two access points are determined to be fake access points, as shown in fig. 4, the method further includes:

106. Acquiring the accessed times corresponding to each of at least three access points including the determined fake access points.

107. And determining a new fake access point according to the respective accessed times of the at least three access points and the total number of the at least three access points.

The identification process of the at least three collected access points is consistent with the identification process of the at least two collected access points, and details are not repeated in this embodiment.
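One way to picture the sample expansion of steps 106 and 107 is to add candidate access points to the already determined fake access points and repeat the same judgment on the enlarged set. The sketch below adds candidates one at a time and keeps only those for which the enlarged set still passes; this one-at-a-time policy, the helper names, and the threshold values 0.10 and 100 are all assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import Dict, Sequence


def looks_fake(counts: Sequence[int], a: float = 0.10, b: float = 100.0) -> bool:
    """Apply the two threshold tests to a list of accessed times."""
    avg = mean(counts)
    if avg <= 0:
        return False
    return 0 < pstdev(counts) / avg < a and len(counts) / avg > b


def expand_fake_set(confirmed: Dict[str, int],
                    candidates: Dict[str, int]) -> Dict[str, int]:
    """Grow the set of fake access points (name -> accessed times)."""
    result = dict(confirmed)
    for name, count in candidates.items():
        trial = {**result, name: count}
        # Keep the candidate only if the enlarged sample is still judged fake.
        if looks_fake(list(trial.values())):
            result = trial
    return result
```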

Further, if the two flatness values cannot identify a fake access point after the number of access points is increased to at least three, new access points need to be selected for collection, and the calculation is repeated based on the collection result. The specific process is shown in fig. 5 and includes:

201. Acquiring the accessed times of at least two other access points which are completely different from the determined fake access points.

At this time, the determined fake access point may not participate in the subsequent identification process.

202. Determining a fake access point according to the accessed times of the at least two other access points and the total number of the at least two other access points.

The method for identifying the reselected access point may refer to

steps

To make the identification better targeted, the access points referred to above generally belong to the same top-level domain name (e.g., a top-level domain name of the form abc.com). It is therefore necessary to collect access traffic at a network entry, record the core data of all network access packets, and determine the access points to be identified from the core data obtained. The related implementation flow, as shown in fig. 6, includes:

301. Collecting data packets.

302. And acquiring the access point corresponding to the acquired data packet.

303. And determining the top-level domain name corresponding to each access point according to the access point corresponding to the acquired data packet.

Since each packet carries information such as a source IP, a source port number, a destination HOST, a destination IP, a destination port number, etc., the domain name can be obtained by suffix identification of the destination HOST.

304. Selecting no less than the nominal number of access points among the access points corresponding to the same top-level domain name.
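Steps 301 to 304 can be sketched as follows. This is a minimal illustration only: the packet records, their keys (`dst_host`, `dst_ip`, `dst_port`), the helper names `domain_of` and `select_access_points`, and the rule of keeping the last two labels of the destination HOST as the domain suffix are all assumptions made for this example. In particular, the two-label rule would mis-handle hosts under multi-label public suffixes such as `co.uk`.

```python
from collections import defaultdict


def domain_of(host):
    """Naive suffix identification: keep the last two labels of the HOST (assumption)."""
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def select_access_points(packets, rated_number):
    """Group access points by domain name and keep the domains that offer
    at least `rated_number` distinct access points."""
    by_domain = defaultdict(set)
    for pkt in packets:  # steps 301 and 302: one access point per collected packet
        access_point = (pkt["dst_host"], pkt["dst_ip"], pkt["dst_port"])
        by_domain[domain_of(pkt["dst_host"])].add(access_point)  # step 303
    # Step 304: keep only domains with no fewer than the rated number of access points.
    return {
        domain: sorted(points)
        for domain, points in by_domain.items()
        if len(points) >= rated_number
    }


# Hypothetical sample packets, reduced to the fields used above.
packets = [
    {"dst_host": "a.abc.com", "dst_ip": "10.0.0.1", "dst_port": 80},
    {"dst_host": "b.abc.com", "dst_ip": "10.0.0.2", "dst_port": 80},
    {"dst_host": "a.abc.com", "dst_ip": "10.0.0.1", "dst_port": 80},
    {"dst_host": "www.xyz.org", "dst_ip": "10.0.1.1", "dst_port": 443},
]
selected = select_access_points(packets, rated_number=2)
```

A production system would more likely consult a maintained public-suffix list than rely on label counting.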

Further, after determining the fake access point, the fake data packet needs to be processed in a targeted manner, and the related implementation flow, as shown in fig. 7, includes:

401. Acquiring target data packets from the collected data packets, wherein the destination address of each target data packet is one of the fake access points.

402. Collecting the current data traffic.

403. Performing a specified operation on the data packets in the data traffic that have the same source address as a target data packet.

The specified operation may include identifying, marking, intercepting, and the like.
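Steps 401 to 403 might look like the sketch below. The packet dictionaries, the use of a source IP as the "source address", and the function names `sources_of_fake_targets`, `apply_operation` and `mark` are hypothetical choices for illustration; marking is used here as one possible specified operation.

```python
def sources_of_fake_targets(packets, fake_access_points):
    """Step 401: source addresses of packets whose destination is a fake access point."""
    return {
        pkt["src_ip"]
        for pkt in packets
        if (pkt["dst_host"], pkt["dst_ip"], pkt["dst_port"]) in fake_access_points
    }


def mark(packet):
    """One possible specified operation: return a marked copy of the packet."""
    return {**packet, "marked": True}


def apply_operation(traffic, suspicious_sources, operation):
    """Steps 402 and 403: apply the operation to traffic from the suspicious sources."""
    return [
        operation(pkt) if pkt["src_ip"] in suspicious_sources else pkt
        for pkt in traffic
    ]


# Hypothetical data: one fake access point and two packet sources.
fake_access_points = {("x.abc.com", "10.0.0.9", 8080)}
collected = [
    {"src_ip": "198.51.100.7", "dst_host": "x.abc.com", "dst_ip": "10.0.0.9", "dst_port": 8080},
    {"src_ip": "203.0.113.5", "dst_host": "www.abc.com", "dst_ip": "10.0.0.1", "dst_port": 80},
]
current_traffic = [
    {"src_ip": "198.51.100.7", "dst_host": "www.abc.com", "dst_ip": "10.0.0.1", "dst_port": 80},
    {"src_ip": "203.0.113.5", "dst_host": "www.abc.com", "dst_ip": "10.0.0.1", "dst_port": 80},
]
suspicious = sources_of_fake_targets(collected, fake_access_points)
processed = apply_operation(current_traffic, suspicious, mark)
```

Interception could be modelled in the same way by passing an operation that drops the packet instead of marking it.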

In combination with the foregoing description, the embodiment of the present invention provides an overall flow example to embody the overall execution concept of the technical solution. The scheme can be applied to fake access point identification under any top-level domain name; during identification, the data traffic (including at least the accessed times) received by the access point designated by each HOST, IP address and port number under the top-level domain name needs to be collected.
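As a rough illustration of collecting the accessed times per access point, the sketch below keys a counter by the combination of HOST, IP address and port number. The packet layout and the name `count_accesses` are assumptions made for this example.

```python
from collections import Counter


def count_accesses(packets):
    """Accessed times per access point, keyed by (HOST, IP address, port number)."""
    return Counter(
        (pkt["dst_host"], pkt["dst_ip"], pkt["dst_port"]) for pkt in packets
    )


# Hypothetical packets addressed to access points under one domain name.
packets = [
    {"dst_host": "a.abc.com", "dst_ip": "10.0.0.1", "dst_port": 80},
    {"dst_host": "a.abc.com", "dst_ip": "10.0.0.1", "dst_port": 80},
    {"dst_host": "a.abc.com", "dst_ip": "10.0.0.1", "dst_port": 8080},
    {"dst_host": "b.abc.com", "dst_ip": "10.0.0.2", "dst_port": 80},
]
accessed_times = count_accesses(packets)
# most_common() lists the access points from most accessed to least accessed.
ranking = accessed_times.most_common()
```

Note that the same HOST and IP address with a different port number counts as a separate access point in this sketch.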

A possible overall flow of the scheme may specifically refer to what is shown in fig. 8, including:

501. Collecting the accessed times corresponding to each of 1000 access points selected from the 10000 access points belonging to the same top-level domain name (e.g., www.cnn.com). After step 501 is completed, step 502 is executed to calculate the first flatness (standard deviation divided by average) and step 503 is executed to calculate the second flatness (total number divided by average).

The hostname HOST, the designated IP address, and the designated Port to which each of the 1000 access points belongs are not identical.

The number of access times corresponding to each of the 1000 access points varies from 1 to 10, and for example, the number of access times corresponding to each access point sequentially corresponds to 1, 5, 4, 6, 9, 7, 5, 3, 9, 8, 4, 1, 6, 1, 9, 7, 6 … ….

502. Calculating, from the accessed times of each access point and the total number of access points, the standard deviation and the average of the accessed times, and dividing the standard deviation by the average to obtain the first flatness.

The value of the first flatness must be calculated from the actually collected accessed times and the total number of access points. For convenience of the subsequent description, however, the first flatness is assumed to be 6%.

503. Calculating, from the accessed times of each access point and the total number of access points, the average of the accessed times, and dividing the total number by the average to obtain the second flatness.

The value of the second flatness must likewise be calculated from the actually collected accessed times and the total number of access points. For convenience of the following description, however, the second flatness is assumed to be 200.

504. Combining the calculation results of step 502 and step 503, judging whether the first flatness (from step 502) satisfies a first threshold range and the second flatness (from step 503) satisfies a second threshold range.

In this step, purely by way of example and for convenience of the later description, the first threshold range is set to 5% to 10% and the second threshold range is set to greater than 100. Since the first flatness of 6% falls within the first threshold range of 5% to 10%, and the second flatness of 200 also satisfies the second threshold range of greater than 100, step 505 may be performed.

505. Determining that the 1000 access points are all fake access points.

506. Collecting again for 1001 access points, which include the 1000 fake access points confirmed in step 505.

Assume that the newly added access point has been accessed 10000 times.

507. Recalculating the first flatness (standard deviation divided by average) and the second flatness (total number divided by average) according to the accessed times and the total number of the 1001 access points.

At this time, the first flatness (standard deviation divided by average) of the 1001 access points may increase to 40%, and the second flatness (total number divided by average) may decrease to 60.

508. Combining the calculation result of step 507, judging whether the first flatness (standard deviation divided by average) satisfies the first threshold range and the second flatness (total number divided by average) satisfies the second threshold range.

At this time, the new first flatness and second flatness clearly do not satisfy the first threshold range and the second threshold range. Because of the 1 newly introduced access point, the 1001 access points cannot be identified as fake access points.

509. Intercepting the source addresses of the data packets received by the 1000 access points identified in step 505.

510. Reselecting another 1000 access points from the remaining 9000 access points and re-executing steps 502 and 503.

The values of the above parameters are for illustration only and are not limiting. In actual operation, they must be set and calculated according to the actual conditions and the collected data.
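The two measures of steps 502 and 503 (standard deviation divided by average, and total number divided by average) and the threshold check of step 504 can be sketched as below. The function names, the use of the population standard deviation (`statistics.pstdev`), and the sample data are assumptions made for this example; the sample data are invented and are not meant to reproduce the illustrative values of 6%, 200, 40% and 60 used in the flow above.

```python
from statistics import mean, pstdev


def first_flatness(counts):
    """Step 502: standard deviation of the accessed times divided by their average.
    The population standard deviation is an assumption of this sketch."""
    return pstdev(counts) / mean(counts)


def second_flatness(counts):
    """Step 503: total number of access points divided by the average accessed times."""
    return len(counts) / mean(counts)


def all_fake(counts, first_range, second_min):
    """Step 504: both measures must satisfy their threshold ranges."""
    low, high = first_range
    return (low <= first_flatness(counts) <= high
            and second_flatness(counts) > second_min)


# 1000 hypothetical access points with very even accessed times.
counts = [5, 6] * 500
# Steps 506 to 508: add one access point that was accessed 10000 times.
with_outlier = counts + [10000]

thresholds = {"first_range": (0.05, 0.10), "second_min": 100}
flat_before = all_fake(counts, **thresholds)
flat_after = all_fake(with_outlier, **thresholds)
```

With these made-up numbers the 1000 evenly accessed points satisfy both ranges, while adding the single heavily accessed point raises the first measure and lowers the second, so the enlarged set no longer qualifies.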

In addition, it should be noted that, because the actually collected accessed times may differ, the determination result of step 504 may be that the first flatness (standard deviation divided by average) does not satisfy the first threshold range or the second flatness (total number divided by average) does not satisfy the second threshold range. In that case, 1000 access points that are not identical to those already selected in step 501 need to be reselected from the 10000 access points of step 501, and the process from step 502 onward is carried out again.

It should be noted that, if the determination result of step 508 is that the first flatness (standard deviation divided by average) of the 1001 access points satisfies the first threshold range and the second flatness (total number divided by average) also satisfies the second threshold range, 1002 access points including the 1001 access points may be selected and the fake access point identification procedure performed again from step 502. By analogy, as long as fake access points continue to be determined, new access points can keep being added from the 10000 access points to the determined set of fake access points, so as to complete a new round of fake access point identification.

In conjunction with the foregoing description, this embodiment provides a possible implementation of this solution, as shown in fig. 9, including:

601. Counting the accessed times of each access point ("destination HOST + destination IP + destination port number") under a single top-level domain name, and sorting the access points in descending order of accessed times, that is, access points with more accesses come first and access points with fewer accesses come later.

602. Taking the rated number Xmin of access points from the front of the sorted array (those with the highest accessed times) to form an X set, so that the size Xn of the X set equals Xmin.

603. Calculating the first flatness (standard deviation divided by average) and the second flatness (total number divided by average) of the X set. If both satisfy their thresholds, go to step 604; otherwise, go to step 606.

604. Judging whether there are still access points to be checked. If so, go to step 605; otherwise, go to step 610.

605. Expanding the X set backwards by 1 access point, and re-executing step 603 to perform the calculation.

606. Judging whether the current Xn is equal to Xmin. When Xn is equal to Xmin, step 607 is executed.

607. Judging whether there are still access points to be checked. If so, go to step 608; otherwise, go to step 609.

608. Translating the X set window backwards and re-executing step 603 to start the calculation.

609. Removing the access point most recently added to the current X set.

610. Determining the access points in the current X set to be fake access points.

611. Judging whether the number of remaining unidentified access points is greater than Xmin. If so, step 602 is executed; otherwise, the identification process ends.
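One possible reading of the flow of fig. 9 is sketched below as a window that slides and grows over the access points sorted by accessed times. Several points are assumptions of this sketch rather than statements of the text: the window is translated by exactly one access point; when an expanded window fails the check, the last added access point is dropped and the rest are reported as fake; the threshold values in `is_flat` are arbitrary; the population standard deviation is used; and the names `is_flat` and `find_fake_groups` are hypothetical.

```python
from statistics import mean, pstdev


def is_flat(counts, first_range=(0.0, 0.2), second_min=0.5):
    """Threshold check on both measures, using illustrative threshold values."""
    avg = mean(counts)
    first = pstdev(counts) / avg   # standard deviation divided by average
    second = len(counts) / avg     # total number divided by average
    return first_range[0] <= first <= first_range[1] and second > second_min


def find_fake_groups(access_counts, x_min, flat=is_flat):
    """access_counts maps an access point key to its accessed times."""
    # Sort in descending order of accessed times.
    ordered = sorted(access_counts.items(), key=lambda kv: kv[1], reverse=True)
    n, start, groups = len(ordered), 0, []
    while start + x_min <= n:
        end = start + x_min  # window of the rated size, Xn == Xmin
        if not flat([c for _, c in ordered[start:end]]):
            start += 1       # translate the window by one access point (assumed step)
            continue
        # Expand by one access point while more remain and the check still passes.
        while end < n and flat([c for _, c in ordered[start:end + 1]]):
            end += 1
        # The failing access point is never kept; the window is reported as fake.
        groups.append([ap for ap, _ in ordered[start:end]])
        if n - end <= x_min:  # too few unidentified access points remain
            break
        start = end
    return groups


# Hypothetical accessed times: three busy real sites, six evenly hit points, one stray.
sample = dict(zip(
    [f"ap{i}" for i in range(10)],
    [900, 400, 150, 6, 5, 5, 5, 5, 4, 1],
))
fake_groups = find_fake_groups(sample, x_min=4)
```

In this made-up data set the heavily accessed points at the front never form a flat window, so the window slides past them and then grows over the evenly accessed points until the stray point breaks the check.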

In combination with the foregoing methods and flows, an embodiment of the present invention further provides an access point screening apparatus, which is shown in fig. 10 and includes:

the acquiring unit 71 is configured to acquire the number of times of access corresponding to each of the at least two access points.

A determining unit 72, configured to determine a fake access point according to the number of times of accessing each of the access points and the total number of the access points.

Optionally, as shown in fig. 10, the determining unit 72 includes:

a first determining subunit 711, configured to determine, according to the accessed times of each access point and the total number of the access points, a first flatness among the accessed times of the access points, and a second flatness between the total number of the access points and the total accessed times.

A second determining subunit 712, configured to determine the fake access points according to the first flatness and the second flatness.

Optionally, the first determining subunit 711 is specifically configured to calculate, according to the accessed times of each access point and the total number of the access points, a standard deviation and an average value of the accessed times of the access points, and to divide the standard deviation by the average value to obtain the first flatness.

Optionally, the first determining subunit 711 is specifically configured to calculate, according to the accessed times of each access point and the total number of the access points, an average value of the accessed times of the access points, and to divide the total number by the average value to obtain the second flatness.

Optionally, the second determining subunit 712 is specifically configured to determine that the at least two access points are fake access points when the first flatness (standard deviation divided by average) satisfies a first threshold range and the second flatness (total number divided by average) satisfies a second threshold range.

Optionally, the acquiring unit 71 is configured to acquire the number of times of access corresponding to each of at least three access points including the determined fake access point.

The determining unit 72 is configured to determine a new forged access point according to the respective access times of the at least three access points and the total number of the at least three access points.

Optionally, the acquiring unit 71 is configured to acquire the number of times of accessing at least two other access points completely different from the determined fake access point.

The determining unit 72 is configured to determine fake access points according to the accessed times of the at least two other access points and the total number of the at least two other access points.

Optionally, as shown in fig. 12, the apparatus further includes:

a comparing unit 73, configured to compare the total number of access points with a rated number when the first flatness (standard deviation divided by average) does not satisfy a first threshold range or the second flatness (total number divided by average) does not satisfy a second threshold range, where the rated number is not less than 2.

The acquiring unit 71 is configured to acquire the number of times of access to the at least two other access points that are not identical to the at least two access points when the total number of the access points is equal to the rated number.

Optionally, as shown in fig. 13, the apparatus further includes:

the acquiring unit 71 is configured to collect data packets.

And a first obtaining unit 74, configured to obtain an access point corresponding to the acquired data packet.

The determining unit 72 is configured to determine, according to the access point corresponding to the acquired data packet, a top-level domain name corresponding to each access point.

A selecting unit 75 for selecting not less than the nominal number of access points among the access points corresponding to the same top-level domain name.

Optionally, as shown in fig. 14, the apparatus further includes:

a second obtaining unit 77, configured to obtain a target data packet from the collected data packets, where the destination address of the target data packet is one of the fake access points;

the acquiring unit 71 is configured to collect the current data traffic;

an operation unit 78, configured to perform a specified operation on a data packet in the data traffic, which has the same source address as the target data packet.

According to the screening device for the access points, the statistical characteristics of the access points are calculated through the number of times of access to the access points and the total number of the access points, and the fake access points attacked by the fake data packets are identified according to the statistical characteristics.

In addition, for the implementation of the access point screening apparatus, the embodiment of the present invention also provides a possible implementation manner, as shown in fig. 15, which is a simplified block diagram of the access point screening apparatus 80. The access point screening apparatus 80 may include a processor 81 connected to one or more data storage facilities, which may include a storage medium 82 and a memory unit 83. The access point screening apparatus 80 may also include an input interface 84 and an output interface 85 for communicating with another apparatus or system. The program code executed by the CPU of the processor 81 may be stored in the memory unit 83 or the storage medium 82.

The processor 81 in the access point screening apparatus 80 calls the program code to execute the following steps:

acquiring the respective corresponding accessed times of at least two access points; and determining fake access points according to the accessed times of the access points and the total number of the access points.

The processor 81 is further configured to determine, according to the accessed times of each access point and the total number of the access points, a first flatness among the accessed times of the access points, and a second flatness between the total number of the access points and the total accessed times; and to determine the fake access points according to the first flatness and the second flatness.

The processor 81 is further configured to calculate, according to the accessed times of each access point and the total number of the access points, a standard deviation and an average value of the accessed times of the access points, and to divide the standard deviation by the average value to obtain the first flatness.

The processor 81 is further configured to calculate, according to the accessed times of each access point and the total number of the access points, an average value of the accessed times of the access points, and to divide the total number by the average value to obtain the second flatness.

The processor 81 is further configured to determine the at least two access points as fake access points when the first flatness (standard deviation divided by average) satisfies a first threshold range and the second flatness (total number divided by average) satisfies a second threshold range.

The processor 81 is further configured to collect respective access times corresponding to at least three access points including the determined fake access point; and determining a new fake access point according to the respective accessed times of the at least three access points and the total number of the at least three access points.

The processor 81 is further configured to collect the accessed times of at least two other access points completely different from the determined fake access points; and to determine fake access points according to the accessed times of the at least two other access points and the total number of the at least two other access points.

The processor 81 is further configured to compare the total number of access points with a rated number when the first flatness (standard deviation divided by average) does not satisfy a first threshold range or the second flatness (total number divided by average) does not satisfy a second threshold range, the rated number being not less than 2; when the total number of the access points is equal to the rated number, to acquire the accessed times of at least two other access points that are not identical to the at least two access points; and to determine fake access points according to the accessed times of the at least two other access points and the total number of the at least two other access points.

The processor 81 is also used for collecting data packets; acquiring an access point corresponding to the acquired data packet; determining a top-level domain name corresponding to each access point according to the access point corresponding to the acquired data packet; and selecting no less than the nominal number of access points among the access points corresponding to the same top-level domain name.

The processor 81 is further configured to obtain a target data packet from the collected data packets, where the destination address of the target data packet is one of the fake access points; collect the current data traffic; and perform a specified operation on the data packets in the data traffic that have the same source address as the target data packet.

In the above embodiments, the storage medium may be a Read-Only Memory (ROM), or may be a Read-write medium, such as a hard disk or a flash Memory. The Memory unit may be a Random Access Memory (RAM). The memory unit may be physically integrated with the processor or integrated in the memory or implemented as a separate unit.

The processor is a control center of the above-mentioned device (the above-mentioned device is the above-mentioned server or the above-mentioned client), and provides a processing device for executing instructions, performing interrupt operation, providing a timing function and various other functions. Optionally, the processor includes one or more Central Processing Units (CPUs), such as CPU 0 and CPU 1 shown in fig. 12. The apparatus may include one or more processors. The processor may be a single core (single CPU) processor or a multi-core (multi-CPU) processor. Unless otherwise stated, a component such as a processor or a memory described as performing a task may be implemented as a general component, which is temporarily used to perform the task at a given time, or as a specific component specially manufactured to perform the task. The term "processor" as used herein refers to one or more devices, circuits and/or processing cores that process data, such as computer program instructions.

The program code executed by the CPU of the processor may be stored in a memory unit or a storage medium. Alternatively, the program code stored in the storage medium may be copied into the memory unit for execution by the CPU of the processor. The processor may execute at least one kernel (e.g., LINUX™, UNIX™, WINDOWS™, ANDROID™, IOS™). As is well known, such kernels control the operation of such devices by controlling the execution of other programs or processes, controlling communication with peripheral devices, and controlling the use of computer device resources.

The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on at least two network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

Finally, it should be noted that: the above embodiments are only used for illustrating the technical solution of the present invention, and not for limiting the same; while the present solution has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A method for screening access points, comprising:

determining fake access points according to the number of times of access of each access point and the total number of the access points;

the determining fake access points according to the number of times of access to each access point and the total number of the access points comprises:

2. The method of claim 1, wherein determining a flatness between the number of times each access point is visited based on the number of times each access point was visited and the total number of access points comprises:

dividing the standard deviation by the average value to obtain the flatness.

3. The method of claim 1, wherein determining the flatness between the total number of access points and the total number of visited times as a function of the number of times each of the access points was visited and the total number of access points comprises:

dividing the total number by the average value to obtain the flatness.

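4. The method of claim 2 or 3, wherein said determining the fake access point based on the flatness between the number of times each access point is visited and the flatness between the total number of access points and the total number of visited times comprises: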

5. The method of claim 4, further comprising, after said determining that the at least two access points are fake access points:

6. The method of claim 5, wherein when a new fake access point cannot be determined according to the respective number of times that the at least three access points are accessed and the total number of the at least three access points, the method further comprises:

7. The method of claim 4, further comprising:

8. The method of claim 4, further comprising, prior to the collecting the respective number of visits by each of the at least two access points:

collecting a data packet;

acquiring an access point corresponding to the acquired data packet;

selecting no less than a nominal number of access points among the access points corresponding to the same top-level domain name.

9. The method of claim 8, further comprising, after said determining fake access points based on the number of times each of the access points was accessed and the total number of access points:

collecting current data flow;

10. An apparatus for screening access points, comprising:

a determining unit, configured to determine fake access points according to the number of times of access of each access point and the total number of the access points;

the determination unit includes:

11. The apparatus according to claim 10, wherein the first determining subunit is specifically configured to perform calculation according to the number of times of access to each of the access points and the total number of the access points, so as to obtain a standard deviation and an average value of the number of times of access to each of the access points; dividing the standard deviation by the average value to obtain the flatness.

12. The apparatus according to claim 10, wherein the first determining subunit is specifically configured to perform calculation according to the number of times of access to each of the access points and the total number of the access points, so as to obtain an average value of the number of times of access to each of the access points; dividing the total number by the average value to obtain the flatness.

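13. The apparatus according to claim 11 or 12, wherein the second determining subunit is specifically configured to determine the at least two access points as fake access points when the flatness between the number of times of access to each of the access points satisfies a first threshold range and the flatness between the total number of the access points and the total number of times of access satisfies a second threshold range.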

14. The apparatus of claim 13,

15. The apparatus of claim 14,

16. The apparatus of claim 13, further comprising:

17. The apparatus of claim 13, further comprising:

the acquisition unit is used for acquiring data packets;

and a selection unit for selecting access points not less than a rated number among the access points corresponding to the same top-level domain name.

18. The apparatus of claim 17, further comprising:

the acquisition unit is used for acquiring the current data flow;