CN101827092B - Detection method for periodic subsequence in network data stream - Google Patents

Publication number
CN101827092B
CN101827092B CN201010134835A CN201010134835A CN101827092B CN 101827092 B CN101827092 B CN 101827092B CN 201010134835 A CN201010134835 A CN 201010134835A CN 201010134835 A CN201010134835 A CN 201010134835A CN 101827092 B CN101827092 B CN 101827092B
Authority
CN
China
Prior art keywords
cluster
packet
data
network data
subsequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201010134835A
Other languages
Chinese (zh)
Other versions
CN101827092A (en
Inventor
胡昌振
王崑声
蒋臻甄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT
Priority to CN201010134835A
Publication of CN101827092A
Application granted
Publication of CN101827092B
Expired - Fee Related (current legal status)
Anticipated expiration

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a detection method for a periodic subsequence in a network data stream, belonging to the technical field of information security. The method comprises the following steps: (1) obtaining the network data stream, and defining minimum time interval among adjacent clusters and the number of network packets required for forming a minimum cluster; (2) clustering the network data stream to obtain a cluster set; and (3) constructing a periodic cluster set and judging whether the periodic cluster set forms the periodic subsequence or not. The invention has the advantages that whether the periodic subsequence exists in the network data stream or not can be automatically judged by using programs and the analysis efficiency is high; and the network data stream can be obtained in real time, the analysis can be done in real time and the accuracy is high.

Description

A method for detecting periodic subsequences in a network data stream
Technical field
The present invention relates to a method for detecting periodic subsequences in a network data stream and belongs to the field of information security technology.
Background
A periodic subsequence is a type of sequence that recurs periodically in the data stream of a network session: its occurrences have equal sequence feature values, and the packets it contains are closely related in time. Periodic subsequences are ubiquitous in network applications, where they serve as presence detection mechanisms, status reporting, network synchronization mechanisms, or other timed tasks. Whether a network session contains a periodic subsequence, and the characteristics of that subsequence, can often serve as an important feature for distinguishing one session from others. Moreover, because the data carried by periodic subsequences is mostly unrelated to the actual business data, its presence interferes with the analysis of the network data stream. Identifying and filtering out periodic subsequence data therefore reduces the workload of studying the stream and improves the efficiency of determining its other features; that is, it reduces the interference of non-application data with the business data under study.
At present, to determine whether a network session of an unknown protocol contains a periodic subsequence, packet capture tools such as pcap, wireshark, or sniffer are generally used to capture network packets, and their behavioral characteristics are then analyzed manually. That is, one first judges whether a periodic subsequence exists in the captured packets; if so, one further extracts its characteristics, such as the sequence period and sequence composition, so that it can be filtered out before the data is analyzed. This requires considerable manual effort. The prior art has the following shortcomings:
1. Whether a periodic subsequence exists must be judged manually, so analysis efficiency is low;
2. There is a long delay between collecting the network data stream and analyzing it, so real-time analysis is not possible;
3. Misjudgments are likely: data that is not a periodic subsequence is judged to be one, so actual business data is filtered out, which affects the analysis of that data.
Summary of the invention
The object of the present invention is to propose a method for detecting periodic subsequences in a network data stream. The invention identifies periodic subsequences by analyzing the features of network packets and introduces the concept of a cluster, making the extraction of periodic subsequences more effective and more accurate.
The object of the invention is achieved through the following technical solution.
First, the relevant terms are defined:
Definition 1. Sequence: a set of packets ordered by time;
Definition 2. Subsequence: if sequences X and Y satisfy the condition that every packet in X is also present in Y, and the packets of X appear in the same order in X as they do in Y, then X is a subsequence of Y;
Definition 3. Feature value of a sequence: the attributes of the sequence, composed of a specified feature vector;
Definition 4. Cluster: a set of packets that are closely related in time, whereas packets in different clusters are relatively distant in time. The concrete criterion is: if the time interval between two adjacent packets is not greater than a manually set time value T, the two packets are judged to be in the same cluster; otherwise they are judged to belong to two different clusters.
A method for detecting periodic subsequences in a network data stream, with the following concrete steps:
Step 1: obtain the network data stream, and define the minimum time interval between clusters as T and the number of network packets required to form a minimum cluster as S_min, where S_min is a positive integer;
Step 2: cluster the network data stream to obtain a cluster set. The concrete steps are as follows:
The 1st step: construct a cluster list Qc and a temporary packet list Lp. The cluster list Qc stores cluster objects C; the temporary packet list Lp stores packets obtained from the network data stream. The cluster list Qc includes, but is not limited to, the following three attributes: packet sequence, packet count, and cluster object feature value. The temporary packet list Lp includes, but is not limited to, the following four attributes: packet sequence, packet count, cluster object feature value, and the time of the last packet, T_last;
The 2nd step: read one packet from the network data stream obtained in step 1; denote the time of this packet as Tp, and set the initial value of the packet count to 0;
The 3rd step: add the packet obtained in the 2nd or 4th step to the temporary packet list Lp, increase its packet count attribute by 1, and record the time of the last packet in Lp as T_last = Tp;
The 4th step: judge whether the network data stream has ended. If it has, construct a cluster object C from the packets in the temporary packet list Lp, obtain the feature value of cluster object C, add it to the cluster list Qc, end the clustering, and go to step 3; otherwise, read the next packet of the network data stream and record its time Tp;
The 5th step: judge whether Tp - T_last > T holds. If it does not hold, add the packet obtained in the 4th step to the temporary packet list Lp, increase its packet count attribute by 1, update T_last = Tp, and return to the 4th step. If Tp - T_last > T holds and the number of packets in Lp is greater than or equal to S_min, construct a cluster object C from the packets in Lp, add it to the cluster list Qc, and return to the 4th step. If Tp - T_last > T holds and the number of packets in Lp is less than S_min, discard the contents of Lp and return to the 4th step.
The cluster set consists of the cluster objects stored in the cluster list Qc; let n denote the number of cluster objects it contains, where n is a positive integer.
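The clustering procedure of step 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function name `cluster_packets`, the representation of a packet as a `(time, size)` tuple, and the sample values are all hypothetical. When a gap larger than T is found, the sketch starts the next temporary list with the packet that was just read.

```python
from typing import List, Tuple

Packet = Tuple[float, int]  # (timestamp in seconds, signed size); hypothetical layout


def cluster_packets(packets: List[Packet], T: float, s_min: int) -> List[List[Packet]]:
    """Split a time-ordered packet list into clusters.

    Adjacent packets whose gap is not greater than T share a cluster;
    runs with fewer than s_min packets are discarded.
    """
    qc: List[List[Packet]] = []  # plays the role of the cluster list Qc
    lp: List[Packet] = []        # plays the role of the temporary packet list Lp
    for pkt in packets:
        if lp and pkt[0] - lp[-1][0] > T:
            # Gap exceeds T: close the current run and keep it only if large enough.
            if len(lp) >= s_min:
                qc.append(lp)
            lp = []
        lp.append(pkt)
    # End of stream: flush the last run under the same size condition.
    if len(lp) >= s_min:
        qc.append(lp)
    return qc


# Made-up sample stream: one burst, one isolated packet, another burst.
sample = [(0.00, -6), (0.10, 3), (5.00, 1), (10.00, -6), (10.02, 3), (10.50, 4)]
clusters = cluster_packets(sample, T=1.0, s_min=2)
print(clusters)
```

With T = 1.0 and s_min = 2, the isolated packet at 5.00 is dropped and two clusters remain.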
The method of obtaining the feature value of cluster object C in the 4th step is as follows: if the number of packets contained in cluster object C is less than a manually set positive integer m, where m > S_min, the direction-and-length sequence of the packets in the temporary packet list Lp is concatenated into a string that serves as the feature value; otherwise, attributes of cluster object C such as the mean, packet purity, and time span are concatenated to generate the feature value.
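The string form of the feature value, used for clusters with fewer than m packets, might be computed as in the sketch below. It assumes, purely for illustration, that each packet is a `(time, size)` tuple whose sign encodes the direction; the function name `feature_value` is hypothetical, and the attribute-based variant for larger clusters is not covered.

```python
from typing import List, Tuple

Packet = Tuple[float, int]  # (timestamp, signed size); the sign is assumed to encode direction


def feature_value(cluster: List[Packet]) -> str:
    """Concatenate the signed packet sizes of a cluster into one string."""
    # The format spec "+d" always writes the sign, so the direction is preserved.
    return "".join(f"{size:+d}" for _, size in cluster)


fv = feature_value([(450.46, -6), (450.47, 3)])
print(fv)  # -6+3
```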
Step 3: construct the periodic cluster set and judge whether it constitutes a periodic subsequence;
From the cluster set obtained in step 2, select the cluster objects that have the same feature value and whose occurrence time intervals differ by less than the periodicity maximum tolerance variance Ma (Ma is a manually set value), and record their number as Mum. If Mum >= Mum_1 (Mum_1 is a manually set positive integer), form a periodic cluster set from the selected cluster objects and then judge whether it constitutes a periodic subsequence. The concrete steps are:
The 1st step: classify the cluster set obtained in step 2 by feature value, placing cluster objects with the same feature value in one set, and record the number of cluster objects Mum in each set;
The 2nd step: judge in turn whether the number of cluster objects in each set satisfies Mum >= Mum_1; if not, delete the set;
The 3rd step: operate in turn on each set whose number of cluster objects Mum is not less than Mum_1, and judge whether it has periodic features. The concrete operations are:
1. compute the difference between the start times of every two adjacent cluster objects in the set to obtain the time intervals between adjacent cluster objects;
2. compute the average variance of the time intervals between adjacent cluster objects;
3. if this average variance is less than the manually set periodicity maximum tolerance variance Ma, the cluster objects in the set are considered to constitute a periodic subsequence; otherwise, perform operation 4;
4. for each time interval obtained in operation 1, do the following in turn: in the original network data stream, taking the start time of the earlier of the two adjacent cluster objects that produced this time interval as the starting point and this time interval as the step size, search forward and backward for packet sequences whose feature value is identical to that of this set. If such packet sequences exist, store them in chronological order as cluster objects in a new set, and return to operation 1 to process this new set; otherwise, end the operation for this time interval.
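Operations 1 to 3 amount to a variance test on the intervals between cluster start times. The following sketch illustrates that test under one assumption that the text does not settle: the "average variance" is taken to be the population variance of the intervals. The function names and the sample start times are hypothetical.

```python
from statistics import pvariance
from typing import List


def start_intervals(start_times: List[float]) -> List[float]:
    """Differences between the start times of adjacent clusters (operation 1)."""
    return [b - a for a, b in zip(start_times, start_times[1:])]


def is_periodic(start_times: List[float], ma: float) -> bool:
    """True when the variance of the adjacent-start intervals is below ma."""
    gaps = start_intervals(start_times)
    if not gaps:
        return False  # fewer than two clusters: there is no interval to test
    # Assumption: "average variance" is read as the population variance.
    return pvariance(gaps) < ma


regular = is_periodic([0.0, 10.0, 20.01, 30.0], ma=0.2)
irregular = is_periodic([0.0, 10.0, 60.0, 70.0], ma=0.2)
print(regular, irregular)
```

A set that fails this test is not rejected outright; it proceeds to the search described in operation 4.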
Beneficial effects
Compared with the prior art, the method of the invention has the following advantages:
1. A program can automatically judge whether a periodic subsequence exists in the network data stream, so analysis efficiency is high;
2. The network data stream can be obtained and analyzed in real time;
3. Accuracy is high.
Description of the drawings
Fig. 1 is the timing diagram of the network data stream in an embodiment of the method of the invention for detecting periodic subsequences in a network data stream;
Fig. 2 is the timing diagram of the periodic subsequence in an embodiment of the method of the invention for detecting periodic subsequences in a network data stream.
Embodiment
The technical solution of the invention is described in detail below with reference to the drawings and a specific embodiment.
The steps of this embodiment are:
Step 1: define the minimum time interval between clusters as 1 second and the number of network packets needed to form a minimum cluster as 2. The experiment uses host A and host B; the IP address of host A is 172.16.2.8 and the IP address of host B is 172.16.2.7. An application that communicates with host B is run on host A, and the timing diagram of the network data stream is drawn, as shown in Fig. 1. In Fig. 1 the X axis is the time axis and the Y axis represents network packet traffic; each bar represents the packet traffic at the corresponding point in time. Bars to the right of the X axis represent packets sent from host B to host A, and bars to the left of the X axis represent packets sent from host A to host B. The number at the top of each bar is the packet size. The lines parallel to the Y axis are whole-second separator lines; that is, the packets between two adjacent whole-second separator lines fall within the same 1-second interval.
The captured network packets are stored in a list Data of (packet time, packet size) pairs, i.e. Data=[(440.45, -6), (440.59, -308), (450.46, -6), (450.47, 3), (451.26, 4), (451.37, 28), (451.37, -6), (456.97, 4), (457.17, 28), (457.18, -6), (460.47, -6), (460.47, 3), (467.53, 1), (470.47, -6), (470.47, 3), (473.0, 4), (473.2, 32), (475.4, 4), (475.6, 40), (480.22, 4), (480.41, 32), (480.51, -6), (480.51, 3), (480.55, -6), (480.74, -20), (490.51, -6), (490.51, 3), (490.8, 1), (490.99, 1), (500.52, -6), (500.52, 3), (501.35, 1), (501.56, 1), (510.52, -6), (510.52, 3), (511.53, 1), (511.72, 1), (520.52, -6), (520.53, 3), (522.04, 1), (522.25, 1), (530.53, -6), (530.53, 3), (532.62, 1), (532.84, 1), (540.53, -6)].
Step 2: cluster the network packets obtained in step 1 to obtain a cluster set, compute the cluster feature values, and classify the clusters in the cluster set by feature value. The concrete steps are as follows:
The 1st step: construct a cluster list Qc and a temporary packet list Lp. The cluster list Qc stores cluster objects C, each of which represents one class of clusters with the same feature value; the temporary packet list Lp stores packets obtained from the network data stream.
The 2nd step: read packets one at a time from the network data stream obtained in step 1, i.e. from the list Data, into the temporary packet list Lp.
The 3rd step: during the reading in the 2nd step, if the item read is the last item of Data, i.e. the current network data stream has ended, go to the 5th step. Otherwise, judge whether the difference between the time value of the item just read and the time value of the last item in Lp is greater than the minimum time interval between clusters (1 second): if it is greater, go to the 4th step; if it is not, append the item just read to the end of Lp and continue with the 2nd step.
The 4th step: at this point the data in the temporary packet list Lp are the items most closely related in time. Judge whether the total number of packets in Lp is greater than the number required to form a minimum cluster. If it is, the packets in Lp constitute a new cluster: compute the cluster feature value and add the cluster to the cluster list Qc with the feature value as the key. If it is less, the number of packets in Lp does not meet the clustering condition; empty Lp and put the item read in the 3rd step into Lp.
The 5th step: judge whether the temporary packet list Lp meets the condition for forming a cluster; if it does, compute the cluster feature value and add the cluster to Qc; if it does not, empty Lp.
Through the above steps, the cluster set and cluster feature values obtained are as follows:
Cluster 1: (440.45, -6), (440.59, -308); cluster feature value: -6-308;
Cluster 2: (450.46, -6), (450.47, 3), (451.26, 4), (451.37, 28), (451.37, -6); cluster feature value: -6+3+4+28-6;
Cluster 3: (456.97, 4), (457.17, 28), (457.18, -6); cluster feature value: +4+28-6;
Cluster 4: (460.47, -6), (460.47, 3); cluster feature value: -6+3;
Cluster 5: (470.47, -6), (470.47, 3); cluster feature value: -6+3;
Cluster 6: (473.0, 4), (473.2, 32); cluster feature value: +4+32;
Cluster 7: (475.4, 4), (475.6, 40); cluster feature value: +4+40;
Cluster 8: (480.22, 4), (480.41, 32), (480.51, -6), (480.51, 3), (480.55, -6), (480.74, -20); cluster feature value: +4+32-6+3-6-20;
Cluster 9: (490.51, -6), (490.51, 3), (490.8, 1), (490.99, 1); cluster feature value: -6+3+1+1;
Cluster 10: (500.52, -6), (500.52, 3), (501.35, 1), (501.56, 1); cluster feature value: -6+3+1+1;
Cluster 11: (510.52, -6), (510.52, 3), (511.53, 1), (511.72, 1); cluster feature value: -6+3+1+1;
Cluster 12: (520.52, -6), (520.53, 3); cluster feature value: -6+3;
Cluster 13: (522.04, 1), (522.25, 1); cluster feature value: +1+1;
Cluster 14: (530.53, -6), (530.53, 3); cluster feature value: -6+3;
Cluster 15: (532.62, 1), (532.84, 1); cluster feature value: +1+1;
Step 3: construct the periodic cluster set and judge whether it constitutes a periodic subsequence;
The 1st step: classify the cluster set obtained in step 2 by feature value, placing cluster objects with the same feature value in one set, and record the number of cluster objects Mum in each set. At this point the cluster list Qc is: Qc={“-6-308”:[[(440.45,-6),(440.59,-308)]],
“-6+3+4+28-6”:[[(450.46,-6),(450.47,3),(451.26,4),(451.37,28),(451.37,-6)]],
“+4+28-6”:[[(456.97,4),(457.17,28),(457.18,-6)]],
“-6+3”:[[(460.47,-6),(460.47,3)],[(470.47,-6),(470.47,3)],[(520.52,-6),(520.53,3)],[(530.53,-6),(530.53,3)]],
“+4+32”:[[(473.0,4),(473.2,32)]],
“+4+40”:[[(475.4,4),(475.6,40)]],
“+4+32-6+3-6-20”:[[(480.22,4),(480.41,32),(480.51,-6),(480.51,3),(480.55,-6),(480.74,-20)]],
“-6+3+1+1”:[[(490.51,-6),(490.51,3),(490.8,1),(490.99,1)],[(500.52,-6),(500.52,3),(501.35,1),(501.56,1)],[(510.52,-6),(510.52,3),(511.53,1),(511.72,1)]],
“+1+1”:[[(522.04,1),(522.25,1)],[(532.62,1),(532.84,1)]]}
The 2nd step: judge in turn whether the number of cluster objects in each set satisfies Mum >= 2; if not, delete the set. At this point the cluster list Qc is:
Qc={“-6+3”:[[(460.47,-6),(460.47,3)],[(470.47,-6),(470.47,3)],[(520.52,-6),(520.53,3)],[(530.53,-6),(530.53,3)]],
“-6+3+1+1”:[[(490.51,-6),(490.51,3),(490.8,1),(490.99,1)],[(500.52,-6),(500.52,3),(501.35,1),(501.56,1)],[(510.52,-6),(510.52,3),(511.53,1),(511.72,1)]]}
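The grouping and filtering performed in the 1st and 2nd steps can be sketched with a dictionary keyed by feature value. The sketch below is illustrative only: the function names are hypothetical, the feature string is assumed to be the concatenation of signed packet sizes, and the input is a small excerpt of three clusters rather than the full cluster set.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Packet = Tuple[float, int]
Cluster = List[Packet]


def group_by_feature(clusters: List[Cluster]) -> Dict[str, List[Cluster]]:
    """Place clusters with the same feature value in one set (1st step)."""
    groups: Dict[str, List[Cluster]] = defaultdict(list)
    for c in clusters:
        key = "".join(f"{size:+d}" for _, size in c)  # assumed feature string
        groups[key].append(c)
    return dict(groups)


def drop_small_groups(groups: Dict[str, List[Cluster]], mum_1: int) -> Dict[str, List[Cluster]]:
    """Delete every set whose cluster count Mum is below mum_1 (2nd step)."""
    return {k: v for k, v in groups.items() if len(v) >= mum_1}


# Excerpt of three clusters taken from the embodiment's data.
excerpt = [
    [(440.45, -6), (440.59, -308)],
    [(460.47, -6), (460.47, 3)],
    [(470.47, -6), (470.47, 3)],
]
groups = group_by_feature(excerpt)
kept = drop_small_groups(groups, mum_1=2)
print({k: len(v) for k, v in groups.items()})
print(list(kept))
```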
The 3rd step: operate in turn on each set whose number of cluster objects Mum is not less than 2, and judge whether it has periodic features. The concrete operations are:
1. compute the difference between the start times of every two adjacent cluster objects in the set to obtain the time intervals between adjacent cluster objects;
2. compute the average variance of the time intervals between adjacent cluster objects;
3. if this average variance is less than the manually set periodicity maximum tolerance variance Ma=0.2, the cluster objects in the set are considered to constitute a periodic subsequence; otherwise, perform operation 4;
4. for each time interval obtained in operation 1, do the following in turn: in the original network data stream, taking the start time of the earlier of the two adjacent cluster objects that produced this time interval as the starting point and this time interval as the step size, search forward and backward for packet sequences whose feature value is identical to that of this set. If such packet sequences exist, store them in chronological order as cluster objects in a new set, and return to operation 1 to process this new set; otherwise, end the operation for this time interval.
The processing of the clusters in Qc with feature value “-6+3+1+1” is as follows:
Compute the differences between the start times of adjacent clusters, giving the time intervals [10.01, 10]. The average variance of these intervals is less than the maximum tolerance variance Ma, but this cluster set does not contain all the time points at which a periodic subsequence with feature value “-6+3+1+1” would recur in the original network data stream. Therefore this cluster set cannot constitute a periodic subsequence (the number of time points at which such a periodic subsequence should occur is (540.53-440.45)/10-1=9, whereas this cluster set contains only 3).
The processing of the clusters in Qc with feature value “-6+3” is as follows:
Compute the differences between the start times of adjacent clusters, giving the time intervals [10, 50.1, 10.01]. Because the average variance is greater than the maximum tolerance variance Ma, for each of the three time intervals 10, 50.1, and 10.01 in turn, take the start time of the earlier of the two adjacent cluster objects that produced the interval as the starting point and the interval as the step size, and search forward and backward for packet sequences whose feature value is identical to that of this set. If such packet sequences exist, store them in chronological order as cluster objects in a new set, obtaining Qc={“-6+3”:[[(450.46,-6),(450.47,3)],[(460.47,-6),(460.47,3)],[(470.47,-6),(470.47,3)],[(480.51,-6),(480.51,3)],[(490.51,-6),(490.51,3)],[(500.52,-6),(500.52,3)],[(510.52,-6),(510.52,3)],[(520.52,-6),(520.53,3)],[(530.53,-6),(530.53,3)]]}.
Then compute the differences between the start times of adjacent clusters in the updated cluster list, giving the time intervals between adjacent clusters with feature value “-6+3”: [10.01, 10, 10.04, 10, 10.01, 10, 10, 10.01]. The average variance of these intervals is less than the maximum tolerance variance Ma. Moreover, the number of time points at which a periodic subsequence with feature value “-6+3” should occur in the original network data stream is (540.53-440.45)/10-1=9, and this cluster set contains all of these time points; therefore the cluster objects in this set are considered to constitute a periodic subsequence. The timing diagram drawn from the data in this cluster set is shown in Fig. 2. The X axis is the time axis and the Y axis is network packet traffic; each bar represents the packet traffic at the corresponding point in time. Bars to the right of the X axis represent packets sent from host B to host A, and bars to the left of the X axis represent packets sent from host A to host B. The number at the top of each bar is the packet size. The lines parallel to the Y axis are whole-second separator lines. The subsequence in Fig. 2 has periodic features.

Claims (2)

1. A method for detecting periodic subsequences in a network data stream, characterized in that the relevant terms are first defined:
Definition 1. Sequence: a set of packets ordered by time;
Definition 2. Subsequence: if sequences X and Y satisfy the condition that every packet in X is also present in Y, and the packets of X appear in the same order in X as they do in Y, then X is a subsequence of Y;
Definition 3. Feature value of a sequence: the attributes of the sequence, composed of a specified feature vector;
Definition 4. Cluster: a set of packets that are closely related in time, whereas packets in different clusters are relatively distant in time; the concrete criterion is: if the time interval between two adjacent packets is not greater than a manually set time value T, the two packets are judged to be in the same cluster; otherwise they are judged to belong to two different clusters;
Based on the above definitions, the concrete steps of the method of the invention for detecting periodic subsequences in a network data stream are as follows:
Step 1: obtain the network data stream, and define the minimum time interval between clusters as T and the number of network packets required to form a minimum cluster as S_min, where S_min is a positive integer;
Step 2: cluster the network data stream to obtain a cluster set; the concrete steps are as follows:
The 1st step: construct a cluster list Qc and a temporary packet list Lp; the cluster list Qc stores cluster objects C; the temporary packet list Lp stores packets obtained from the network data stream; the cluster list Qc includes, but is not limited to, the following three attributes: packet sequence, packet count, and cluster object feature value; the temporary packet list Lp includes, but is not limited to, the following four attributes: packet sequence, packet count, cluster object feature value, and the time of the last packet, T_last;
The 2nd step: read one packet from the network data stream obtained in step 1; denote the time of this packet as Tp, and set the initial value of the packet count to 0;
The 3rd step: add the packet obtained in the 2nd or 4th step to the temporary packet list Lp, increase its packet count attribute by 1, and record the time of the last packet in Lp as T_last = Tp;
The 4th step: judge whether the network data stream has ended; if it has, judge whether the total number of packets in the temporary packet list Lp is greater than or equal to the number S_min required to form a minimum cluster: if it is, the packets in Lp constitute a new cluster; compute the cluster feature value, add the cluster to the cluster list Qc with the feature value as the key, end the clustering, and go to step 3; if it is less, empty the temporary packet list Lp, end the clustering, and go to step 3; if the network data stream has not ended, read the next packet of the network data stream and record its time Tp;
The 5th step: judge whether Tp - T_last > T holds; if it does not hold, add the packet obtained in the 4th step to the temporary packet list Lp, increase its packet count attribute by 1, update T_last = Tp, and return to the 4th step; if Tp - T_last > T holds and the number of packets in Lp is greater than or equal to S_min, construct a cluster object C from the packets in Lp, add it to the cluster list Qc, and return to the 4th step; if Tp - T_last > T holds and the number of packets in Lp is less than S_min, discard the contents of Lp and return to the 4th step;
The cluster set consists of the cluster objects stored in the cluster list Qc; let n denote the number of cluster objects it contains, where n is a positive integer;
Step 3: construct the periodic cluster set and judge whether it constitutes a periodic subsequence;
From the cluster set obtained in step 2, select the cluster objects that have the same feature value and whose occurrence time intervals differ by less than the periodicity maximum tolerance variance Ma, and record their number as Mum, where Ma is a manually set value; if Mum >= Mum_1, where Mum_1 is a manually set positive integer, form a periodic cluster set from the selected cluster objects and then judge whether it constitutes a periodic subsequence; the concrete steps are:
The 1st step: classify the cluster set obtained in step 2 by feature value, placing cluster objects with the same feature value in one set, and record the number of cluster objects Mum in each set;
The 2nd step: judge in turn whether the number of cluster objects in each set satisfies Mum >= Mum_1; if not, delete the set;
The 3rd step: operate in turn on each set whose number of cluster objects Mum is not less than Mum_1, and judge whether it has periodic features; the concrete operations are:
1. compute the difference between the start times of every two adjacent cluster objects in the set to obtain the time intervals between adjacent cluster objects;
2. compute the average variance of the time intervals between adjacent cluster objects;
3. if this average variance is less than the manually set periodicity maximum tolerance variance Ma, the cluster objects in the set are considered to constitute a periodic subsequence; otherwise, perform operation 4;
4. for each time interval obtained in operation 1, do the following in turn: in the original network data stream, taking the start time of the earlier of the two adjacent cluster objects that produced this time interval as the starting point and this time interval as the step size, search forward and backward for packet sequences whose feature value is identical to that of this set; if such packet sequences exist, store them in chronological order as cluster objects in a new set, and return to operation 1 to process this new set; otherwise, end the operation for this time interval.
2. The method for detecting periodic subsequences in a network data stream according to claim 1, characterized in that the method of computing the feature value of cluster object C in the 4th step of step 2 is: the direction-and-length sequence of the packets in the temporary packet list Lp is concatenated into a string that serves as the feature value.
CN201010134835A 2010-03-30 2010-03-30 Detection method for periodic subsequence in network data stream Expired - Fee Related CN101827092B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010134835A CN101827092B (en) 2010-03-30 2010-03-30 Detection method for periodic subsequence in network data stream

Publications (2)

Publication Number Publication Date
CN101827092A CN101827092A (en) 2010-09-08
CN101827092B true CN101827092B (en) 2012-10-03

Family

ID=42690795

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010134835A Expired - Fee Related CN101827092B (en) 2010-03-30 2010-03-30 Detection method for periodic subsequence in network data stream

Country Status (1)

Country Link
CN (1) CN101827092B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112994969B (en) * 2019-12-17 2024-05-03 中兴通讯股份有限公司 Service detection method, device, equipment and storage medium
CN115296930B (en) * 2022-09-29 2023-02-17 中孚安全技术有限公司 Periodic behavior detection method, system and terminal

Citations (1)

Publication number Priority date Publication date Assignee Title
CN101572711A (en) * 2009-06-08 2009-11-04 北京理工大学 Network-based detection method of rebound ports Trojan horse

Patent Citations (1)

Publication number Priority date Publication date Assignee Title
CN101572711A (en) * 2009-06-08 2009-11-04 北京理工大学 Network-based detection method of rebound ports Trojan horse

Non-Patent Citations (4)

Title
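Fahad Maqbool et al. E-MAP: efficiently mining asynchronous periodic patterns. IJCSNS International Journal of Computer Science and Network Security. 2006, Vol. 6, No. 8A, full text. *
Kuo-Yu Huang et al. SMCA: a general model for mining asynchronous periodic patterns in temporal databases. IEEE Transactions on Knowledge and Data Engineering. 2005, Vol. 17, No. 6, full text. *
Chen Weiman et al. Fast subsequence matching on data streams. Computer Engineering and Applications. 2008, Vol. 44, No. 36, full text. *
Chen Dangyang et al. Research on trend sequence analysis of temporal data and its subsequence matching algorithm. Journal of Computer Research and Development. 2007, Vol. 44, No. 3, full text. *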

Also Published As

Publication number Publication date
CN101827092A (en) 2010-09-08

Similar Documents

Publication Publication Date Title
CN101697545B (en) Security incident correlation method and device as well as network server
CN102025563B (en) Network flow identification method based on Hash collision compensation
CN101976313B (en) Frequent subgraph mining based abnormal intrusion detection method
CN105306475A (en) Network intrusion detection method based on association rule classification
CN106709035A (en) Preprocessing system for electric power multi-dimensional panoramic data
CN102521356B (en) Regular expression matching equipment and method on basis of deterministic finite automaton
CN113839835B (en) Top-k flow accurate monitoring system based on small flow filtration
CN101827092B (en) Detection method for periodic subsequence in network data stream
CN101442535A (en) Method for recognizing and tracking application based on keyword sequence
CN103309966A (en) Data flow point connection query method based on time slide windows
CN106095850A (en) A kind of data processing method and equipment
CN103336771A (en) Data similarity detection method based on sliding window
CN115037543A (en) Abnormal network flow detection method based on bidirectional time convolution neural network
CN106682225A (en) Big data collecting and storing method and system
CN114971710A (en) Event log-based multi-dimensional process variant difference analysis method and system
CN114238360A (en) User behavior analysis system
CN110995770B (en) Fuzzy test application effect comparison method
CN104301682A (en) Monitoring video fragment restoration method and device
CN112765313A (en) False information detection method based on original text and comment information analysis algorithm
CN102436535B (en) Identification method and system for creative inflection point in computer aided design process
CN106021574A (en) Data storage replication method and system
CN114004052B (en) Network management system-oriented fault detection method and device
CN103067300B (en) Network traffics automation feature mining method
CN104731851A (en) Big data analysis method based on topological network
CN112861123A (en) Bit currency malicious address identification method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20121003

Termination date: 20180330
