CN108234435A - An automatic detection method based on IP classification - Google Patents
- Publication number
- CN108234435A CN201611201889.5A
- Authority
- CN
- China
- Prior art keywords
- information
- cluster
- behavior
- collection
- distribution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/10—Network architectures or network communication protocols for network security for controlling access to devices or network resources
Abstract
The present invention provides an automatic detection method based on IP classification, comprising: converting IP address information obtained by a server into geographic information on where the IP is used, and classifying the IP address information based on the geographic information to form clusters; performing window behavior statistics on selected elements of an IP parameter set W and on the clusters over predetermined time dimensions to output behavior statistics and obtain a current behavior time distribution set C; assigning, per predetermined time unit, a weight to the historical behavior statistics of each time unit and summing the statistics of identical elements or clusters to update a historical behavior distribution set D; classifying the behavior data in the historical behavior distribution set D with a clustering algorithm to generate cluster centers; calculating the distance from the current behavior time distribution set C to each cluster center and outputting the minimum distance value; and comparing the minimum distance value with a risk threshold to output a detection result. The invention can accurately detect malicious domain names.
Description
Technical field
The present invention relates to the computer field, and in particular to an automatic detection method based on IP classification.
Background
With the rapid development of network technology and the arrival of the network era, the broad and abundant resources of the network have brought many conveniences to human society. However, as people's lives depend more and more on the network, profit-driven network security incidents keep emerging. In recent years especially, botnets, DNS amplification distributed denial-of-service attacks, web page trojan implants and many other security incidents have seriously affected normal use of the network and caused great harm across society, so detecting these events is particularly important. In addition, malicious registrations and malicious applications carried out against websites and applications by means of certain domain names and IP addresses also pose great security risks to Internet service providers.
The domain name system is one of the key infrastructures of today's Internet, and a large number of network services depend on it. The domain name resolution service (hereinafter: DNS service) maps abstract IP addresses to easy-to-remember domain names so that Internet users can reach network resources more easily, and it is one of the important basic services in the Internet architecture. Because the domain name system does not inspect the behavior of the services that rely on it, the DNS service lacks the ability to detect malicious behavior and is therefore often exploited by malicious programs. Detecting these malicious events requires detecting malicious domain names.
Some existing techniques for detecting malicious domain names rely on blacklists and whitelists, restricting user access through explicit "allow" and "deny" entries to achieve "security". However, such methods usually produce many false positives and false negatives, and they adapt poorly to different user environments and business scenarios.
Summary of the invention
The technical problem solved by the technical solution of the present invention is how to accurately detect malicious domain names.
To solve the above technical problem, the technical solution of the present invention provides an automatic detection method based on IP classification, comprising:
obtaining a predefined data packet from the server side to obtain the elements of an IP parameter set W, the IP parameter set W including at least IP address information;
converting the IP address information into geographic information on where the IP is used, and classifying the IP address information based on the geographic information to form clusters;
performing window behavior statistics on selected elements of the IP parameter set W and on the clusters over predetermined time dimensions to output behavior statistics and obtain a current behavior time distribution set C;
assigning, per predetermined time unit, a weight to the historical behavior statistics of each time unit, and summing the statistics of identical elements or clusters to update a historical behavior distribution set D;
classifying the behavior data in the historical behavior distribution set D with a clustering algorithm to generate cluster centers;
calculating the distance from the current behavior time distribution set C to each cluster center and outputting the minimum distance value;
comparing the minimum distance value with a risk threshold to output a detection result.
Optionally, the data packet includes: device information, network information and account information of the device performing the IP behavior.
Optionally, the elements of the IP parameter set W further include: the IP value, the IP network segment, IP truncation information and TCP protocol stack information;
the IP truncation information is obtained as follows:
the IP address is expressed as a 32-bit binary number;
the first n bits are taken as the IP truncation information, where n is a natural number from 24 to 32;
the TCP protocol stack information includes: tcpts, wscale and tcp source port.
Optionally, converting the IP address information into geographic information on where the IP is used includes:
converting the IP address information into place-of-use information using an external IP geolocation database;
recognizing, by natural language processing, at least some of the country, province, city and street information in the place-of-use information to form the fields of the geographic information;
classifying the IP address information based on the geographic information to form clusters includes:
setting categories based on the fields of the geographic information to classify the IP address information.
Optionally, the selected elements include: the IP network segment and the IP truncation information in the IP parameter set W.
Optionally, performing window behavior statistics on the selected elements of the IP parameter set W and on the clusters over predetermined time dimensions to output behavior statistics includes:
monitoring window behavior;
counting, during the window behavior, the number of occurrences of the selected elements and clusters within the predetermined time dimension.
Optionally, the window behavior is based on a sliding window or a fixed window.
Optionally, the predetermined time dimension is set by: setting a time period, or setting a start time and an end time.
Optionally, the time unit is a day, and the behavior statistics are daily distributions.
Optionally, the weight k_j assigned to the historical behavior statistics in the j-th time unit is determined by the following formula:
k_j = a^j * (a / (1 - a)), where a is a predetermined constant greater than 0 and less than 1.
Optionally, classifying the behavior data in the historical behavior distribution set D with a clustering algorithm to generate cluster centers includes:
defining distribution vectors;
setting the distance between each pair of distribution vectors in the historical behavior distribution set D, the distance being the ordinary Euclidean distance;
clustering with the K-means algorithm on the pairwise distances between distribution vectors, and using the elbow method to determine the optimal number of clusters m and the m cluster centers, denoted {k1, k2, ..., km}.
Optionally, classifying the behavior data in the historical behavior distribution set D with a clustering algorithm to generate cluster centers further includes: updating an IP behavior cluster information base with the generated cluster centers;
each cluster center is a cluster center recorded in the IP behavior cluster information base.
Optionally, calculating the distance from the current behavior time distribution set C to each cluster center includes: calculating the distance from each distribution vector in the current behavior time distribution set C to the corresponding cluster center;
the risk threshold is obtained as follows:
a probability distribution is built from the distances between the distribution vectors in the current behavior time distribution set C and their corresponding cluster centers;
the median of the probability distribution is taken as the risk threshold.
Optionally, the distribution vector is the number of occurrences of the selected element or cluster in the predetermined time dimensions.
Optionally, the method further includes:
obtaining a risk level based on the detection result;
determining an acceptable risk level range according to an external risk request and outputting the corresponding clustering result.
The beneficial effects of the technical solution of the present invention include at least the following:
The technical solution can effectively detect IP address information with abnormal behavior. Based on IP address information, it examines the data packets generated during website or application operations and performs window behavior statistics and clustering on each IP parameter, thereby detecting abnormal IP address information and improving the accuracy of monitoring malicious IP addresses.
The technical solution can also evaluate the parameter set of an IP address by clustering it against historical parameter data: the historical parameter sets are accumulated with weights per time unit, and the risk threshold and risk level are calculated from the probability distribution of the clustering results, so that threatening IP address information is assessed quantitatively, further improving the accuracy of monitoring malicious IP addresses.
The technical solution can also divide IP address information into risk levels based on the above clustering results, so that third-party users can effectively determine the risk range applicable to them. The evaluation of malicious IP address information can thus follow each third-party user's situation, which widens the scope of application of the technical solution and makes it compatible with multiple evaluation systems.
Brief description of the drawings
Other features, objects and advantages of the present invention will become more apparent upon reading the detailed description of non-limiting embodiments with reference to the following drawings:
Fig. 1 is a flow diagram of an automatic detection method based on IP classification according to the technical solution of the present invention;
Fig. 2 is a flow diagram of another automatic detection method based on IP classification according to the technical solution of the present invention.
Detailed description
To present the technical solution of the present invention more clearly, the invention is further described below with reference to the accompanying drawings.
Detecting malicious domain names is extremely important now that networks are ubiquitous. Many network application scenarios, such as bank loans, service registration and e-commerce marketing, are based on IP addresses: the users and merchants in these scenarios carry out network operations through IP addresses, and these operations all rely on network windows opened from an IP address. In network communication, the IP address is therefore very basic information that every user or merchant has. In network application scenarios, users and merchants carry out network operations based on IP addresses, but bad users may use certain IP addresses to submit malicious applications to merchant services, and bad merchants may use IP addresses to conduct deceptive promotion on a platform. Such behavior can cause heavy losses to the network environment and resource services and waste money.
In the above behavior, because the bad user or merchant acts through a network IP address, during network application or promotion the network can, upon specific user operations such as a login window or other window events, periodically send predetermined data packets to the server. These data packets can carry various information about the IP address. The technical solution of the present invention classifies IP address information by examining the data packets received at the server, thereby achieving automatic detection of IP addresses.
As shown in Fig. 1, an automatic detection method based on IP classification includes the following steps:
Step S100: obtain a predefined data packet from the server side to obtain an IP parameter set W, the elements of which include at least IP address information.
In this step, when a network user (in this application, "user" covers both ordinary users of a network service and merchants) performs operations such as opening a page of the network service, logging in, registering or submitting an application, the network service application sends the predefined data packet to the server. The data packet is defined in advance; in other embodiments, predefining the data packet can be performed as an additional step. Predefining the data packet includes: defining the device information of the device performing the IP behavior (i.e. the above network operation); defining the network information; and defining the account information. The content of the account information can optionally be included in the content structure of the data packet.
In step S100, the elements of the IP parameter set W specifically include the following information: the IP value, the IP network segment, IP truncation information and TCP protocol stack information. Specifically, the elements of the IP parameter set W are derived from the IP address information: the IP address is expressed as a binary number (32 bits in total), and the first n bits of that number are denoted ipSeg_n (n ranging from 24 to 32) and serve as the IP truncation information. The TCP protocol stack information includes tcpts, wscale and tcp source port. More specifically, tcpts is the timestamp information of the TCP protocol stack, an option in the TCP protocol that represents the timestamp generated at the TCP handshake; wscale is the window scale factor of the TCP window scale option, used to enlarge the TCP advertised window; tcp source port is the port of the source side of the TCP communication.
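The truncation described above can be sketched as follows. This is a minimal illustration, assuming IPv4 input in dotted-decimal form; the function name `ip_truncations` and the dictionary layout are hypothetical and not part of the described method, while the `ipSeg_n` naming follows the text.

```python
import ipaddress

def ip_truncations(ip: str, n_min: int = 24, n_max: int = 32) -> dict:
    """Return {'ipSeg_24': <first 24 bits>, ..., 'ipSeg_32': <all 32 bits>}."""
    # Express the IPv4 address as a zero-padded 32-bit binary string.
    bits = format(int(ipaddress.IPv4Address(ip)), "032b")
    # Keep the first n bits for every n in the range given in the text.
    return {f"ipSeg_{n}": bits[:n] for n in range(n_min, n_max + 1)}

segs = ip_truncations("106.18.236.97")
print(segs["ipSeg_24"])
```

Two addresses in the same /24 block share the same `ipSeg_24` value, which is what lets occurrences be counted per truncated prefix rather than per individual address.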
In the technical solution of the present invention, the element information in the IP parameter set W is obtained from the content of the data packet. The content of the data packet may therefore take forms other than the predefined content exemplified above; its purpose is to supply the basic information needed for the elements of the IP parameter set W, and the data packet format of the technical solution is not limited by the above example.
With continued reference to Fig. 1, the automatic detection method based on IP classification further includes:
Step S101: convert the IP address information into geographic information on where the IP is used, and classify the IP address information based on the geographic information to form clusters.
Specifically, in step S101, converting the IP address information into geographic information on where the IP is used includes: converting the IP address information into place-of-use information using an external IP geolocation database; and recognizing, by natural language processing, at least some of the country, province, city and street information in the place-of-use information to form the fields of the geographic information. Classifying the IP address information based on the geographic information to form clusters includes: setting categories based on the fields of the geographic information to classify the IP address information.
As an example of step S101, suppose the obtained IP address information is 106.18.236.97. According to an external IP geolocation database, this IP address information can be converted into the place-of-use information "China, Hunan, Telecom", and the fields of the geographic information formed from it are "China", "Hunan Province" and "Telecom". Given this conversion result, the fields of the geographic information can be combined to determine a category; for example, IP address information with the fields "China" and "Telecom" is classified as one class. In other embodiments, the combination of fields that determines a category can be configured: IP address information with the fields "China" and "Hunan Province" can be classified as one class, or IP address information with the fields "China", "Hunan Province" and "Telecom" can be classified as one class. The above example shows how the fields of the geographic information are combined to determine the IP address classification categories.
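The field-combination classification in this example can be sketched as below. The lookup table stands in for an external IP geolocation database and is entirely hypothetical apart from the 106.18.236.97 entry taken from the text; the field names `country`, `province` and `carrier` and the function `classify` are likewise assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical lookup results; a real system would query an external
# IP geolocation database instead of this hard-coded table.
GEO_FIELDS = {
    "106.18.236.97": {"country": "China", "province": "Hunan Province", "carrier": "Telecom"},
    "198.51.100.7": {"country": "China", "province": "Hunan Province", "carrier": "Unicom"},
    "203.0.113.9": {"country": "China", "province": "Guangdong Province", "carrier": "Telecom"},
}

def classify(geo_by_ip, selected_fields):
    """Group IP addresses by the combination of the selected geographic fields."""
    clusters = defaultdict(list)
    for ip, fields in geo_by_ip.items():
        key = tuple(fields.get(name, "") for name in selected_fields)
        clusters[key].append(ip)
    return dict(clusters)

by_country_carrier = classify(GEO_FIELDS, ("country", "carrier"))
by_country_province = classify(GEO_FIELDS, ("country", "province"))
```

Changing `selected_fields` is the configurable part: the same addresses fall into different clusters depending on which fields are combined.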
It should be noted that external IP geolocation databases differ in how accurately they map IP address information to geographic information, so the accuracy of the converted geographic information differs as well. With a more complete external IP geolocation database, the place-of-use information obtained for a known IP address may contain a more detailed address, the geographic information then has more fields, and more precise categories can be used when setting the IP address classification. Similarly, the place-of-use information may be specific down to country, province, city and street. In that case the fields of the geographic information include a country field, a province field, a city field and a street field, and the IP address classification categories can be determined by the combination of the country, province and city fields, or by the combination of the country, province, city and street fields; the technical solution of the present invention places no limit on this.
With continued reference to Fig. 1, the automatic detection method based on IP classification further includes:
Step S102: perform window behavior statistics on selected elements of the IP parameter set W and on the clusters over predetermined time dimensions to output behavior statistics and obtain the current behavior time distribution set C.
Specifically, the selected elements include the IP network segment and the IP truncation information in the IP parameter set W, the IP truncation information being the ipSeg_n described above. A window behavior is considered to occur each time the server receives a data packet carrying the above IP address information. The statistics are counted in real time. The clusters correspond to this IP parameter set W. More specifically, performing window behavior statistics on the selected elements of the IP parameter set W and on the clusters over predetermined time dimensions to output behavior statistics includes: monitoring window behavior; and counting, during the window behavior, the number of occurrences of the selected elements and clusters within the predetermined time dimension.
In step S102, the window behavior is based on a sliding window or a fixed window. The width of the predetermined time dimension is set by: setting a time period, or setting a start time and an end time. Depending on the type of window behavior, the algorithm for counting window behavior includes: counting the above objects (i.e. the IP network segment, ipSeg_n and the cluster corresponding to the IP address) over a sliding window or a fixed window for each of one or more set time periods; or counting them over a sliding window or a fixed window within a past period delimited by a set start time and end time.
A first example of the window behavior counting algorithm in the technical solution of the present invention is as follows:
For a selected element such as the IP network segment, the statistics are taken as follows: take 15 minutes, 30 minutes, 60 minutes and 120 minutes as the time periods; count the occurrences of the IP network segment in the 15-, 30-, 60- and 120-minute sliding windows; and count the occurrences of the IP network segment in the 15-, 30-, 60- and 120-minute fixed windows.
In another example, the window behavior counting algorithm in the technical solution of the present invention can also be:
Set the past period to 15 minutes, with a start time of 1:00 and an end time of 1:15, or alternatively a start time of 1:15 and an end time of 1:30. The sliding window count is the number of times the selected element (such as the IP network segment) occurred within the past 15 minutes; the fixed window count is the number of times the given parameter (such as the IP network segment) occurred within the 15-minute fixed window.
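The two counting modes in these examples can be sketched as follows. This is an illustration under stated assumptions: the event timestamps are invented, the date is arbitrary, and the choice of a half-open interval for each window (which boundary is inclusive) is not specified in the text.

```python
from datetime import datetime, timedelta

def sliding_count(timestamps, now, minutes):
    """Occurrences within the past `minutes` minutes, measured back from `now`."""
    start = now - timedelta(minutes=minutes)
    return sum(1 for t in timestamps if start < t <= now)

def fixed_count(timestamps, start, end):
    """Occurrences within the fixed window [start, end)."""
    return sum(1 for t in timestamps if start <= t < end)

def at(hour, minute):
    # Arbitrary date; only the time of day matters for this example.
    return datetime(2016, 12, 22, hour, minute)

# Hypothetical arrival times of packets carrying one IP network segment.
events = [at(1, 2), at(1, 10), at(1, 14), at(1, 20), at(1, 40)]

now = at(1, 45)
# One count per time period, as in the 15/30/60/120-minute example.
sliding_vector = [sliding_count(events, now, m) for m in (15, 30, 60, 120)]
first_window = fixed_count(events, at(1, 0), at(1, 15))
second_window = fixed_count(events, at(1, 15), at(1, 30))
```

The sliding window moves with `now`, so its count changes continuously, whereas each fixed window has set boundaries such as 1:00 to 1:15.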
Based on the above, the current behavior time distribution set C of step S102 is in effect the statistical distribution, over multiple time dimensions, of the number of occurrences of the selected IP address elements in sliding or fixed windows.
With continued reference to Fig. 1, the automatic detection method based on IP classification further includes:
Step S103: assign, per predetermined time unit, a weight to the historical behavior statistics of each time unit, and sum the statistics of identical elements or clusters to update the historical behavior distribution set D.
Specifically, in step S103 the time unit can be a day or any chosen number of hours, and is preferably a day. The behavior statistics are then daily distributions, i.e. the historical behavior statistics are collected day by day, for example the counts for today, the counts for yesterday, and so on. The historical behavior statistics in this step are behavior statistics over time dimensions with the day as the time unit.
More specifically, updating the historical behavior distribution set D requires assigning a weight, along the time dimension, to each element or cluster and its statistics. The technical solution of the present invention preferably assigns different weights according to the time sequence of the time dimension and sums the count statistics of identical elements or clusters, thereby updating the historical behavior distribution set D. The idea behind the weights is that the older the data, the lower its weight. The weight k_j assigned to an element or cluster in the j-th time unit is determined by the following formula: k_j = a^j * (a / (1 - a)), where a is a predetermined constant greater than 0 and less than 1, and j is the index of the time unit, i.e. the time sequence value in the time dimension, with j = 1 to N, where 1 is the time sequence of the most recent update and N is that of the initial update. In step S103, summing the statistics of identical elements or clusters to update the historical behavior distribution set D includes: computing a weighted sum, over the historical time dimension under the time unit, of the values assigned to the same element or cluster. The result of the weighted sum is used to update the record in the historical behavior distribution set D; for each element or cluster, the set D holds the weighted sum of its values over the historical time dimension.
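The weighted accumulation can be sketched as below. The weight follows the formula as it reads in this text, with j taken as an exponent so that older time units receive lower weights; that reading, the function names and the sample counts are assumptions, not the patent's confirmed implementation.

```python
def weight(j: int, a: float) -> float:
    """Weight of the j-th time unit (j = 1 is the most recent), for 0 < a < 1."""
    return a ** j * (a / (1 - a))

def update_history(daily_vectors, a):
    """Weighted sum of per-day count vectors for one element or cluster.

    daily_vectors[0] is the most recent day (j = 1), daily_vectors[1] the day
    before (j = 2), and so on.
    """
    acc = [0.0] * len(daily_vectors[0])
    for j, vec in enumerate(daily_vectors, start=1):
        k = weight(j, a)
        for i, count in enumerate(vec):
            acc[i] += k * count
    return acc

# Hypothetical two-dimensional count vectors for today and yesterday.
history = update_history([[4, 0], [2, 2]], a=0.5)
```

With a = 0.5 the factor a / (1 - a) equals 1, so the weights are simply 0.5 for the most recent day and 0.25 for the day before.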
With continued reference to Fig. 1, the automatic detection method based on IP classification further includes:
Step S104: classify the behavior data in the historical behavior distribution set D with a clustering algorithm to generate cluster centers.
The historical behavior distribution set D produced by step S103 is the set of historical IP behavior time distributions. Each element or cluster in it is a j-dimensional vector. For example, for the statistics of one day with a time dimension of 1 hour, the distribution vector of an element or cluster is a 24-dimensional vector in which each dimension is the number of occurrences of that element or cluster in the corresponding one-hour window. Classifying the behavior data in the historical behavior distribution set D with a clustering algorithm to generate cluster centers therefore includes:
defining distribution vectors;
setting the distance between each pair of distribution vectors in the historical behavior distribution set D, the distance being the ordinary Euclidean distance; and
clustering with the K-means algorithm on the pairwise distances between distribution vectors, and using the elbow method to determine the optimal number of clusters m and the m cluster centers, denoted {k1, k2, ..., km}.
The distribution vector is the number of occurrences counted for the selected element or cluster in the predetermined time dimensions. The ordinary Euclidean distance can be computed in any way known in the prior art for the Euclidean distance between two vectors. The distribution vectors are the behavior data in the historical behavior distribution set D. In this step, the K-means algorithm takes as input the chosen number of clusters and a database containing a number of data objects, and outputs the clusters that satisfy the minimum-variance criterion (i.e. the cluster centers above). It specifically includes:
(1) arbitrarily select as many objects as the chosen number of clusters from the data objects as initial cluster centers; (2) based on the mean (center object) of each cluster, calculate the distance from each object to these center objects, and reassign each object to the nearest center; (3) recalculate the mean (center object) of each cluster that has changed; (4) evaluate the criterion function; when a given condition is met, for example when the function converges, the algorithm terminates and outputs the clusters that satisfy the minimum-variance criterion; if the condition is not met, return to step (2).
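Steps (1) to (4) can be sketched in plain Python as follows. This is a standard K-means loop applied directly to the distribution vectors, with "the centers stop changing" used as the convergence condition; the function names, the fixed random seed and the sample vectors are assumptions for illustration.

```python
import math
import random

def euclidean(p, q):
    """Ordinary Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def kmeans(vectors, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # (1) arbitrarily pick k objects as the initial cluster centers
    centers = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(max_iter):
        # (2) assign every object to its nearest center
        groups = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda c: euclidean(v, centers[c]))
            groups[nearest].append(v)
        # (3) recompute each center as the mean of its group
        new_centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[i]
            for i, g in enumerate(groups)
        ]
        # (4) stop once the centers no longer move, otherwise repeat from (2)
        if new_centers == centers:
            break
        centers = new_centers
    return centers

# Hypothetical two-dimensional distribution vectors forming two obvious groups.
data = [[0, 0], [0, 1], [10, 10], [10, 11]]
centers = kmeans(data, k=2)
```

For the elbow method, one would run `kmeans` for several values of k, record the total within-cluster squared distance for each, and pick the k after which the improvement flattens out.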
In the K-means algorithm above, the database of data objects is the set of computed pairwise distances between the distribution vectors. The cluster number can be determined in one of two ways. The first is the Elbow method, which uses the functional relationship between the clustering result and the cluster number to judge which cluster number gives the best result. The second is to fix the value of m according to a specific requirement; for example, clustering shirt sizes would naturally use the three classes L, M and S. In the technical solution of the present invention, the cluster number m is preferably determined by the Elbow method.
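One simple way to read the Elbow method is sketched below. It is a heuristic under stated assumptions rather than the procedure the patent prescribes: the input `sse_by_k` (a hypothetical mapping from each candidate cluster number to the within-cluster sum of squared distances obtained with it) is assumed to have been computed beforehand, and the "elbow" is taken to be the point where the decrease in that sum slows down the most.

```python
def elbow_cluster_number(sse_by_k):
    """Pick the cluster number at the sharpest bend of the SSE curve.

    `sse_by_k` maps a candidate cluster number k to the within-cluster sum
    of squared distances measured for that k (at least three candidates).
    """
    ks = sorted(sse_by_k)
    best_k, best_bend = None, float("-inf")
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        # Second difference: how much the drop in SSE shrinks after k.
        bend = (sse_by_k[prev_k] - sse_by_k[k]) - (sse_by_k[k] - sse_by_k[next_k])
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k
```

Other elbow criteria exist (for example, visual inspection of the curve); the second-difference rule is used here only because it is short and deterministic.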
More specifically, classifying the behavioral data in the historical behavior distribution set D with a clustering algorithm to generate cluster centers further includes: updating the IP behavior clustering information base with the generated cluster centers. The IP behavior clustering information base records each of the cluster centers.
With continued reference to Fig. 1, the automatic testing method based on IP classification of the technical solution of the present invention further includes:
Step S105: calculate the distance from the current behavior time distribution set C to each cluster center and output the minimum distance value.
Specifically, calculating the distance from the current behavior time distribution set C to each cluster center includes: calculating the distance from each distribution vector in the current behavior time distribution set C to its corresponding cluster center. More specifically, the cluster centers are stored in the IP behavior clustering information base; the distance from each element of the current behavior time distribution set C to every cluster center in the IP behavior clustering information base is calculated, and the minimum is taken as the output value. It should be noted that the output value may instead be the mean, or may be computed with some other function of the inputs; in the technical solution of the present invention, the output value is the distance from the elements of the current behavior time distribution set C to the cluster centers.
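Step S105 can be illustrated with a short sketch. It assumes the cluster centers have already been read from the IP behavior clustering information base into a list, and that the Euclidean distance is used; the function name `min_distance_to_centers` is hypothetical.

```python
import math


def min_distance_to_centers(vector, centers):
    """Return (index, distance) of the cluster center nearest to `vector`."""
    distances = [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(vector, center)))
        for center in centers
    ]
    # The minimum is the output value; the mean could be used instead.
    nearest = min(range(len(centers)), key=distances.__getitem__)
    return nearest, distances[nearest]
```

Returning the index alongside the distance keeps track of which cluster the element is closest to, which the later risk assessment needs.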
With continued reference to Fig. 1, the automatic testing method based on IP classification of the technical solution of the present invention further includes:
Step S106: compare the minimum distance value with a risk threshold to output a testing result.
In step S106, the risk threshold is obtained as follows: a probability distribution is built from the distances of the distribution vectors in the current behavior time distribution set C to their corresponding cluster centers, and the median of that probability distribution is taken as the risk threshold.
More specifically, in the above process, the technical solution of the present invention takes the distance (which may be the minimum distance) from each data point in the current behavior time distribution set C to its corresponding cluster center, sorts all of these distances in ascending order, and forms a probability distribution (which may be a histogram). The 50th percentile of this probability distribution is taken as the risk threshold. If the minimum distance value is below the risk threshold, the quantile corresponding to the minimum distance value is the corresponding risk level. Based on this division into risk levels, the likelihood that the window behavior or network behavior of the current IP address information belongs to the corresponding cluster can be judged; from that likelihood, the reliability of the clustering result can be output as a confidence level, and the application type to which the IP address information belongs can thereby be determined. The clusters may correspond to different application scenarios, so the IP classification and identification scheme of the technical solution of the present invention can be used to examine IP address information and obtain a risk assessment of the application type to which the IP address information belongs.
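The threshold and risk-level logic of step S106 might look like the sketch below. It assumes the empirical distribution of the distances stands in for the probability distribution, and that the quantile is the fraction of distances not exceeding the given value; the names `risk_threshold`, `quantile_of` and `detect` are hypothetical.

```python
import statistics
from bisect import bisect_right


def risk_threshold(distances):
    """Median (50th percentile) of the distances to the cluster centers."""
    return statistics.median(distances)


def quantile_of(value, distances):
    """Fraction of the sorted distances that are less than or equal to `value`."""
    ordered = sorted(distances)
    return bisect_right(ordered, value) / len(ordered)


def detect(min_distance, distances):
    """Return (below_threshold, risk_level); risk_level is None when not below."""
    below = min_distance < risk_threshold(distances)
    level = quantile_of(min_distance, distances) if below else None
    return below, level
```

For distances 1 to 10 the threshold is 5.5, so a minimum distance of 2 falls below it and is assigned the 0.2 quantile as its risk level.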
With reference to Fig. 2, the automatic testing method based on IP classification of the technical solution of the present invention can also be implemented with the flow of Fig. 2, that is, on the basis of steps S100 to S106 above it further includes:
Step S107: obtain a risk level based on the testing result;
Step S108: determine the acceptable range of risk levels according to an external risk request and output the corresponding clustering result.
For the technical solution of the present invention, steps S107 and S108 may fall within the flow of the technical solution itself, or may serve as an application flow of an external device.
It should be noted that a common clustering algorithm can only provide a clustering result. The technical solution of the present invention combines such a clustering algorithm with the classification of IP addresses, so that IP addresses can be detected in real time and the clustering result of an IP address can be judged and output. Considering that a clustering result is not necessarily 100% certain but carries a likelihood, the technical solution of the present invention places the clustering result under a confidence level, providing not only the clustering result but also its reliability.
By clustering IP address information, the technical solution of the present invention can determine the application type to which an IP address belongs, for example whether it is an office network or a mobile network, together with the corresponding probability.
In actual use, for example in an e-commerce marketing scenario, tolerance for errors in the IP clustering result is relatively high, so a somewhat higher risk level can be selected and the corresponding clustering result obtained automatically. In a bank loan scenario, by contrast, error tolerance is relatively low, so a lower risk level can be selected, which automatically yields a clustering result different from the one above.
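The scenario-dependent choice of risk level can be sketched as a simple lookup. Everything in this example is assumed for illustration: the scenario names, the numeric limits in `ACCEPTABLE_RISK` and the function `select_cluster_result` are hypothetical and do not come from the patent, which only states that an e-commerce scenario tolerates a higher risk level than a bank loan scenario.

```python
# Hypothetical upper bounds on the acceptable risk level per scenario.
ACCEPTABLE_RISK = {
    "ecommerce_marketing": 0.5,  # higher error tolerance
    "bank_loan": 0.2,            # lower error tolerance
}


def select_cluster_result(scenario, risk_level, cluster_label):
    """Return the cluster label only if its risk level is acceptable."""
    if risk_level <= ACCEPTABLE_RISK[scenario]:
        return cluster_label
    return None
```

With these assumed limits, a result at risk level 0.3 would be accepted for e-commerce marketing but rejected for a bank loan.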
It should further be noted that the IP address information of the technical solution of the present invention comprises several types of information obtained by processing the binary characters of the IP address; clustering this information can further increase the accuracy and reliability of IP address information clustering.
Specific embodiments of the present invention have been described above. It should be understood that the invention is not limited to the specific embodiments described above; those skilled in the art may make various variations or modifications within the scope of the claims without affecting the substance of the invention.
Claims (15)
1. An automatic testing method based on IP classification, characterized by comprising:
obtaining predefined data packets from a server to obtain the elements of an IP parameter set W, the IP parameter set W including at least IP address information;
converting the IP address information into geographic information on where the IP is used, and classifying the IP address information based on the geographic information to form clusters;
performing window behavior statistics on selected elements of the IP parameter set W and on the clusters along a predetermined time dimension to output behavior statistics, thereby obtaining a current behavior time distribution set C;
assigning, per predetermined time unit, a weight to the historical behavior statistics of each time unit, and summing the statistics of identical elements or clusters to update a historical behavior distribution set D;
classifying the behavioral data in the historical behavior distribution set D using a clustering algorithm to generate cluster centers;
calculating the distance from the current behavior time distribution set C to each cluster center and outputting the minimum distance value;
comparing the minimum distance value with a risk threshold to output a testing result.
2. The method of claim 1, characterized in that the data packets include: device information, network information and account information of the party performing the IP behavior.
3. The method of claim 1, characterized in that the elements of the IP parameter set W further include: the IP numerical value, the IP network segment, IP cutoff information and TCP protocol stack information;
the IP cutoff information is obtained as follows:
expressing the IP address as a 32-bit binary number;
taking the first n bits as the IP cutoff information, where n is a natural number from 24 to 32;
the TCP protocol stack information includes: Tcpts, Wscale and Tcp Source Port.
4. The method of claim 1, characterized in that converting the IP address information into geographic information on where the IP is used comprises:
converting the IP address information into place-of-use information using an external IP geographic database;
identifying, by natural language recognition, at least some of the country, province, city and street information in the place-of-use information to form the fields of the geographic information;
and classifying the IP address information based on the geographic information to form clusters comprises:
setting categories based on the fields of the address information to classify the IP address information.
5. The method of claim 1, characterized in that the selected elements include: the IP network segment and the IP cutoff information in the IP parameter set W.
6. The method of claim 1, characterized in that performing window behavior statistics on the selected elements of the IP parameter set W and on the clusters along the predetermined time dimension to output behavior statistics comprises:
monitoring window behavior;
counting, during the window behavior, the number of occurrences of the selected elements and clusters within the predetermined time dimension.
7. The method of claim 6, characterized in that the window behavior is based on a sliding window or a fixed window.
8. The method of claim 6, characterized in that the predetermined time dimension is set by: setting a time period, or setting a start time and an end time.
9. The method of claim 1, characterized in that the time unit is a day, and the behavior statistics are a daily distribution.
10. The method of claim 1, characterized in that the weight $k_j$ assigned to the historical behavior statistics of the j-th time unit is determined by the following formula:
$k_j = a^{j}\,(a/(1-a))$, where a is a predetermined constant greater than 0 and less than 1.
11. The method of claim 1, characterized in that classifying the behavioral data in the historical behavior distribution set D using a clustering algorithm to generate cluster centers comprises:
defining distribution vectors;
setting the distance between every two distribution vectors in the historical behavior distribution set D, the distance being the ordinary Euclidean distance;
clustering on the pairwise distances between distribution vectors using the K-means algorithm, and determining the optimal cluster number m and the m cluster centers using the Elbow method, the cluster centers being denoted {k1, k2, ..., km}.
12. The method of claim 1 or 11, characterized in that classifying the behavioral data in the historical behavior distribution set D using a clustering algorithm to generate cluster centers comprises: updating an IP behavior clustering information base with the generated cluster centers;
each of the cluster centers being a cluster center recorded in the IP behavior clustering information base.
13. The method of claim 1, characterized in that calculating the distance from the current behavior time distribution set C to each cluster center comprises: calculating the distance from each distribution vector in the current behavior time distribution set C to its corresponding cluster center;
the risk threshold is obtained as follows:
building a probability distribution from the distances of the distribution vectors in the current behavior time distribution set C to their corresponding cluster centers;
taking the median of the probability distribution as the risk threshold.
14. The method of claim 11 or 13, characterized in that the distribution vector is the number of occurrences of the selected element or cluster along the predetermined time dimension.
15. The method of claim 1, characterized by further comprising:
obtaining a risk level based on the testing result;
determining the acceptable range of risk levels according to an external risk request and outputting the corresponding clustering result.
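As an illustration of the IP cutoff information described in claim 3 (a 32-bit binary representation truncated to its first n bits, with n from 24 to 32), a minimal sketch follows. It assumes IPv4 addresses in dotted-decimal form and returns the bits as a string; the function name `ip_cutoff` and the default n = 24 are hypothetical choices.

```python
import ipaddress


def ip_cutoff(ip, n=24):
    """Return the first n bits of the 32-bit binary form of an IPv4 address."""
    if not 24 <= n <= 32:
        raise ValueError("n must be between 24 and 32")
    # Render the address as a zero-padded 32-character binary string.
    bits = format(int(ipaddress.IPv4Address(ip)), "032b")
    return bits[:n]
```

With n = 24, all addresses in the same /24 network segment share one cutoff value, so they are counted together in the window statistics.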
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611201889.5A CN108234435A (en) | 2016-12-22 | 2016-12-22 | A kind of automatic testing method based on IP classification |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108234435A (en) | 2018-06-29 |
Family
ID=62657192
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611201889.5A Pending CN108234435A (en) | 2016-12-22 | 2016-12-22 | A kind of automatic testing method based on IP classification |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108234435A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20180629 |