CN101778115B - Method for extracting occurrence law of P2P (Peer 2 Peer)-TV channel feature code, P2P-TV channel recognition method and recognition system based on same - Google Patents


Publication number
CN101778115B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2010101054373A
Other languages
Chinese (zh)
Other versions
CN101778115A (en)
Inventor
王晖
姜志宏
张鑫
李进
樊鹏翼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Application filed by National University of Defense Technology
Priority to CN2010101054373A
Publication of CN101778115A
Application granted
Publication of CN101778115B
Legal status: Expired - Fee Related

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to the field of P2P (peer-to-peer) traffic identification, and in particular discloses a method for extracting the occurrence law of P2P-TV channel feature codes, together with a P2P-TV channel recognition method and a recognition system based on that extraction method. The recognition method comprises the following steps: analyzing captured data to obtain the occurrence frequency of byte segments in UDP (User Datagram Protocol) data, and deriving the occurrence law of the channel feature code from those frequencies; extracting the channel feature codes of each platform according to that occurrence law, and matching them against a feature code library to obtain the channel information of each platform; and thereby accurately identifying the channels of each platform. The channel recognition method runs fast and gives precise results, the recognition system has a simple structure, and the network under supervision can be monitored effectively.

Description

Method for extracting the occurrence law of P2P-TV channel feature codes, and P2P-TV channel recognition method and recognition system based on it
Technical field
The present invention relates to the field of P2P traffic identification and, more particularly, to a technique for identifying the P2P-TV channel to which network traffic corresponds.
Background art
P2P-TV, an emerging network service, has rapidly become popular on the Internet. People can use P2P-TV platforms to watch television, films and so on with ease; the platforms currently popular online include PPLive, PPStream, QQLive, UUSee and SOPCast. While these platforms bring convenience, they also bring a series of problems: large-scale use of P2P-based applications consumes a great deal of network bandwidth; because each platform uses its own proprietary protocol, the platforms easily escape supervision; and, given the fragility of the network itself, these platforms can readily become targets of attack and be turned into channels for spreading illegal information, harming the interests of the country and society. Strengthening the supervision of P2P-TV platforms is therefore urgent.
Many research institutions at home and abroad have carried out extensive research on P2P-TV platforms, concentrating mainly on measurement and identification. Identification is the basis of supervision. Past work focused mainly on identifying the P2P-TV platform, and the technology in this respect is gradually maturing. Platform identification methods fall mainly into the following two categories:
The first is identification based on feature codes: the distinctive byte segments in the application-layer data of each platform are found and collected into a feature library, and during identification feature codes are extracted from the traffic under test and matched against that library.
The second is identification based on behavioural features: the P2P-TV platform is identified from behavioural features such as packet size, transport-layer protocol, port number and session duration.
Supervision of P2P streaming platforms now requires finer-grained control. Besides identifying the platform with the methods above, the specific channel on which an anomaly occurs must also be identified, so further identification of the individual channels within a P2P streaming platform has become necessary.
Summary of the invention
In view of the above prior art, the object of the present invention is to provide a method for extracting the occurrence law of P2P-TV channel feature codes, and a P2P-TV channel recognition method and recognition system based on it, so that at the egress of a campus network it can be determined which nodes in the network are watching which specific channel of a P2P-TV platform. The data of the campus network can thus be monitored more accurately; the recognition method is fast and easy to implement, and its results are accurate.
To achieve this object, the present invention adopts the following technical scheme:
A method for extracting the occurrence law of P2P-TV channel feature codes, comprising the following steps:
Step 1, data acquisition: data is collected in two different time periods. Each time, network traffic of duration t is captured for each of two channels of the same P2P-TV platform on a PC connected to the campus network; the same two channels are used in both collections;
Step 2, data processing: protocol filtering and port filtering are applied to the four captured data streams, keeping only the UDP data on the main UDP port, where the main UDP port is the UDP port carrying the largest data volume within the sampling period of duration t;
Step 3, counting the occurrence frequency of byte segments in the UDP data: for the first n bytes of the payload of every packet in each data stream, the occurrence frequency of contiguous byte segments of length m bytes (m ∈ [4, l], 4 < l < n) is counted for each m, and the n-3 statistics are sorted in descending order of occurrence frequency;
Step 4, analyzing the byte segments and provisionally deriving the channel feature codes: for the two channels of the same platform, the byte segments that match exactly and have a high occurrence frequency in both time periods are found in the two byte-segment frequency statistics of each channel. If the two byte segments found for the two channels satisfy the following conditions: 1. the two byte segments have the same length; 2. the two byte segments have different contents; 3. the positions at which the two byte segments occur follow the same rule; then these two byte segments are provisionally taken as the feature codes of the corresponding channels;
Step 5, verifying the provisional channel feature codes: data from different channels of the same platform in different time periods is used to check the uniqueness and constancy of the two byte segments. Uniqueness means that the byte segments at the corresponding position for the other channels of the same platform all differ from one another; constancy means that the byte segment does not change over a period longer than 6 months. If the check fails, return to step 4; if it passes, the byte segment is taken to be the channel feature code of the channel;
Step 6, analyzing and summarizing the occurrence law of the channel feature code, which comprises a packet-size rule and a position rule. The packet-size rule states in packets of which sizes the feature code mainly appears; the position rule states at which position in the packet payload the channel feature code mainly appears.
In one embodiment, the two different time periods in step 1 are generally chosen at least one day apart. In step 4, a byte segment with a high occurrence frequency in both periods is one that ranks among the top 5 of the occurrence-frequency statistics in both.
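The port filtering of step 2 can be illustrated with a minimal Python sketch. It assumes the capture has already been decoded into `(src_ip, src_port, dst_ip, dst_port, payload)` tuples for UDP packets; that tuple layout and the function names `main_udp_port` and `keep_main_port` are hypothetical and not part of the patent. The sketch counts payload bytes in both directions, which is also an assumption.

```python
from collections import Counter

def main_udp_port(packets, local_ip):
    """Return the local UDP port that carries the most payload bytes.

    packets: iterable of (src_ip, src_port, dst_ip, dst_port, payload) tuples
             (a hypothetical, already-decoded representation of UDP packets).
    local_ip: address of the capturing PC inside the campus network.
    """
    volume = Counter()
    for src_ip, src_port, dst_ip, dst_port, payload in packets:
        if src_ip == local_ip:        # uplink packet: local port is the source port
            volume[src_port] += len(payload)
        elif dst_ip == local_ip:      # downlink packet: local port is the destination port
            volume[dst_port] += len(payload)
    if not volume:
        return None
    return volume.most_common(1)[0][0]

def keep_main_port(packets, local_ip):
    """Keep only the packets sent from or to the main UDP port (step 2)."""
    packets = list(packets)
    port = main_udp_port(packets, local_ip)
    return [p for p in packets
            if (p[0] == local_ip and p[1] == port)
            or (p[2] == local_ip and p[3] == port)]
```

If only the downlink volume should decide the main port, the uplink branch of `main_udp_port` can simply be dropped.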
Based on the above occurrence law of P2P-TV channel feature codes, the present invention also provides a P2P-TV channel recognition method, comprising the following steps:
(1) Building the channel feature library
Step S101: suppose K P2P-TV platforms require channel identification; the channel feature code occurrence law of each of the K platforms is found with the P2P-TV channel feature code extraction method;
Step S102: according to the channel feature code occurrence law of each platform, the channel feature codes of all channels of the K platforms are extracted;
Step S103: a channel feature library is built from the channel feature codes obtained. Each entry in the library contains at least the following information: platform number, channel number and channel feature code. Platforms and channels are numbered as the situation requires;
(2) Channel identification process
Step S104: network data is captured from the switch at the campus network egress;
Step S105: the captured network data is filtered by protocol so that only UDP data remains;
Step S106: the filtered network data stream is split according to each (InnerIP, Port) pair that occurs, giving sub-streams at port granularity, where InnerIP is an internal IP address of the campus network;
Step S107: the packets in the sub-streams are aggregated by (InnerIP, Port) into new data flows;
Step S108: the packets within a short time window Δt are read from the data flow;
Step S109: the K P2P-TV platforms correspond to K feature code occurrence laws in total; one of these occurrence laws is used to extract a channel feature code from the packets in the short time window Δt;
Step S110: the channel feature code extracted in step S109 is matched against the channel feature codes in the library built in step S103; if the match fails, go to step S111; if it succeeds, go to step S117;
Step S111: determine whether the feature code extraction rules of all K platforms have been tried on the packets in this short time window; if all have been used, go to step S112; if not, return to step S109;
Step S112: add Δt to T, where T is the accumulated duration of consecutive match failures, then determine whether T exceeds the threshold M, where M is the time threshold for declaring unknown traffic; if T is greater than M, go to step S113; otherwise go to step S114;
Step S113: the accumulated duration T of consecutive match failures exceeds the threshold M; the data flow is judged to be unknown traffic and the identification process exits;
Step S114: the data in this Δt period was not matched and T is below the threshold M; the data flow is judged to need further identification; go to step S115;
Step S115: determine whether the data flow has been read to the end; if not, return to step S108; if so, go to step S116;
Step S116: determine whether the unknown-flow decision switch U equals zero. The purpose of U is to judge data flows whose duration is less than M and in which no channel feature has been detected. If U equals zero, the data flow is judged to be unknown traffic and the identification process exits; otherwise the identification process exits directly;
Step S117: the feature code is matched; the recognition result is returned, the accumulated duration T of consecutive match failures is reset to zero, the unknown-flow decision switch U is incremented by 1, and the process goes to step S115.
In a preferred embodiment, the matching of the channel feature code against the feature code library in step S110 can be implemented with a string matching method.
In one embodiment, the data splitting in step S106 proceeds as follows: a storage area is allocated for each (InnerIP, Port) flow, and each packet is stored in the corresponding area, which accomplishes the splitting.
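The splitting of step S106 can be sketched in Python as a dictionary keyed by (InnerIP, Port), each value being the "storage area" for one flow. The packet tuple layout, the `campus_net` CIDR argument and the function name `split_flows` are assumptions made for illustration only.

```python
from collections import defaultdict
import ipaddress

def split_flows(packets, campus_net):
    """Split UDP packets into sub-streams keyed by (InnerIP, Port).

    packets: iterable of (src_ip, src_port, dst_ip, dst_port, payload) tuples
             (hypothetical decoded form of the filtered UDP data).
    campus_net: CIDR string describing the campus network, e.g. "192.168.1.0/24".
    """
    net = ipaddress.ip_network(campus_net)
    flows = defaultdict(list)          # one list ("storage area") per flow
    for src_ip, src_port, dst_ip, dst_port, payload in packets:
        if ipaddress.ip_address(src_ip) in net:      # uplink: inner host is the sender
            flows[(src_ip, src_port)].append(payload)
        elif ipaddress.ip_address(dst_ip) in net:    # downlink: inner host is the receiver
            flows[(dst_ip, dst_port)].append(payload)
        # packets with no endpoint inside the campus network are ignored
    return dict(flows)
```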
The invention also provides a P2P-TV channel recognition system, comprising the following modules:
a per-platform channel feature code extraction module, which extracts the channel feature codes of all channels of each platform according to the channel feature code occurrence law of that platform;
a channel feature library module, which receives the channel feature codes extracted by the per-platform channel feature code extraction module and builds a channel feature library;
a campus network egress data acquisition module, which captures at the campus network egress the data exchanged between the campus network and the Internet;
a network data filtering module, which applies UDP filtering to the data captured by the campus network egress data acquisition module and keeps a data stream containing only UDP packets;
a network data splitting module, which splits the data stream output by the network data filtering module according to (InnerIP, Port); each (InnerIP, Port) pair occurring in the data stream produces one sub-stream;
a channel feature code extraction and matching module, which receives the sub-streams output by the network data splitting module, extracts channel feature codes according to each channel feature code occurrence law in turn, matches them against the channel feature codes in the channel feature library, and produces a matching result;
a recognition result output module, which outputs the matching result.
The output of the campus network egress data acquisition module is connected through the network data filtering module to the network data splitting module. The per-platform channel feature code extraction module is connected through the channel feature library module and, together with the network data splitting module, feeds the input of the channel feature code extraction and matching module, whose output is finally connected to the recognition result output module.
In one embodiment, the campus network egress data acquisition module and the network data filtering module are implemented by a data acquisition server running Wireshark, the server being connected to a switch in the campus network.
The channel feature library module is implemented by a database server connected to the P2P-TV channel identification terminal through the campus network.
The working principle of the method for extracting the occurrence law of P2P-TV channel feature codes, and of the P2P-TV channel recognition method and system based on it, is described below with reference to the accompanying drawings:
All users playing the same channel of the same P2P-TV platform form an overlay network on the Internet. To distinguish this channel from the other channels of the same platform, the data exchanged among users playing the same channel must carry an identifier unique to that channel. This identifier should be unique, that is, different for every channel of the platform, and it should be constant, that is, remain unchanged over a long period. Because the current mainstream domestic P2P-TV platforms (for example PPLive, PPStream, QQLive and UUSee) mainly use UDP for data exchange, and that data is mainly carried over a single UDP port (the UDP port with the largest downlink data volume in the sampling period), the invention targets P2P-TV platforms that mainly use UDP for data exchange. The technical scheme comprises the following three parts:
One, the method for extracting the occurrence law of P2P-TV channel feature codes comprises the following steps:
Step 1, data acquisition: data is collected in two different time periods (at least one day apart). Each time, network traffic (both uplink and downlink) of duration t is captured for each of two channels of the same P2P-TV platform on a PC connected to the campus network (the same two channels are used in both collections);
Step 2, data processing: protocol filtering and port filtering are applied to the four captured data segments, keeping only the UDP data on the main UDP port, where the main UDP port is the UDP port carrying the largest data volume within the sampling period of duration t;
Step 3, counting the occurrence frequency of byte segments in the UDP data: for the first n bytes of the payload of every packet in each data stream, the occurrence frequency of contiguous byte segments of length m bytes (m ∈ [4, l], 4 < l < n) is counted for each m, and the n-3 statistics are sorted in descending order of occurrence frequency;
Step 4, analyzing the byte segments and provisionally deriving the channel feature codes: for the two channels of the same platform, the byte segments that are identical (match byte for byte) and have a high occurrence frequency in both time periods (rank among the top 5 of the statistics in both) are found in the two byte-segment frequency statistics of each channel. If the two byte segments found for the two channels satisfy the following conditions: 1. the two byte segments have the same length; 2. the two byte segments have different contents; 3. the positions at which the two byte segments occur follow the same rule; then these two byte segments are provisionally taken as the feature codes of the corresponding channels;
Step 5, verifying the provisional channel feature codes: data from different channels of the same platform in different time periods is used to check the uniqueness and constancy of the two byte segments. Uniqueness means that the byte segments at the corresponding position for the other channels of the same platform all differ from one another; constancy means that the byte segment does not change over a period longer than 6 months. If the check fails, return to step 4; if it passes, the byte segment is taken to be the channel feature code of the channel;
Step 6, analyzing and summarizing the occurrence law of the channel feature code, which comprises a packet-size rule and a position rule. The packet-size rule states in packets of which sizes the feature code mainly appears; the position rule states at which position in the packet payload the channel feature code mainly appears.
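Steps 3 and 4 above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the implementation used in the patent: segments are counted per (offset, bytes) pair so that the position condition of step 4 can be checked later, the top 5 is taken separately for each segment length, and the defaults n = 36 and lengths 4 to 16 are borrowed from embodiment 1. All function names are hypothetical.

```python
from collections import Counter

def segment_frequencies(payloads, n=36, min_len=4, max_len=16):
    """Step 3: count contiguous byte segments in the first n payload bytes.

    Keys are (offset, segment) pairs; keeping the offset is an assumption
    made here so that the position rule can be examined afterwards.
    """
    counts = Counter()
    for payload in payloads:
        head = payload[:n]
        for m in range(min_len, max_len + 1):
            for off in range(len(head) - m + 1):
                counts[(off, head[off:off + m])] += 1
    return counts

def top_segments(counts, length, k=5):
    """Return the k most frequent (offset, segment) keys of the given length."""
    ranked = [key for key, _ in counts.most_common() if len(key[1]) == length]
    return ranked[:k]

def candidate_codes(period1, period2, length, k=5, n=36):
    """Step 4 (per channel): segments ranked in the top k in both time periods."""
    first = top_segments(segment_frequencies(period1, n=n), length, k)
    second = set(top_segments(segment_frequencies(period2, n=n), length, k))
    return [key for key in first if key in second]
```

Running `candidate_codes` once per channel gives two candidate lists; a pair of candidates with equal length, equal offset and different bytes then meets the three conditions of step 4 and still has to pass the uniqueness and constancy check of step 5.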
Two, as shown in Fig. 1, the P2P-TV channel recognition method based on application-layer signatures comprises the following steps:
Suppose that K P2P-TV platforms in total require channel identification.
(1) Building the channel feature library
Step S101: the channel feature code occurrence law of each of the K platforms is found with the P2P-TV channel feature code extraction method described above;
Step S102: according to the channel feature code occurrence law of each platform found in step S101, the channel feature codes of all channels of the K platforms are extracted;
Step S103: a channel feature library is built from the channel feature codes obtained. Each entry in the library should contain at least the following information: platform number, channel number and channel feature code. Platforms and channels are numbered as the situation requires;
Because platform software is updated, the channels of each platform may change. For a channel added later, its channel feature code can be extracted with the extraction method of its platform and the channel feature library updated accordingly.
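One possible in-memory layout for such a feature library, including the update for a newly added channel, is sketched below in Python. The dictionary keyed by (platform number, feature code) and the helper names `add_channel` and `lookup` are assumptions; the four sample entries are rows taken from Table 4 of embodiment 1.

```python
# Hypothetical in-memory feature library: (platform number, feature code) -> channel number.
FEATURE_LIBRARY = {
    (1, bytes.fromhex("d1 4b 86 96 ff ff e5 44 ac c7 ad b3 e1 fa 5a 98")): 7,
    (2, bytes.fromhex("19 00 43 00 00 9b 3f 21 4c 00")): 4,
    (3, bytes.fromhex("9f a2 2f a5 00 00 00 00 00")): 7,
    (4, bytes.fromhex("42 ab 6a 1a b7 ea d5 40 8e d5 14 d5 ae fe de 64")): 7,
}

def add_channel(library, platform_no, channel_no, code):
    """Register the feature code of a newly added channel (library update)."""
    library[(platform_no, bytes(code))] = channel_no

def lookup(library, platform_no, code):
    """Return the channel number for an extracted code, or None if it is unknown."""
    return library.get((platform_no, bytes(code)))
```

Keying on the platform number as well as the code reflects that uniqueness is only required among the channels of one platform.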
(2) Channel identification process
Step S104: network data is captured from the switch at the campus network egress through port mirroring or other technical means;
Step S105: the captured network data is filtered by protocol so that only UDP data remains;
Step S106: the filtered network data stream is split according to each (InnerIP, Port) pair that occurs, giving sub-streams at port granularity, where InnerIP is an internal IP address of the campus network;
Step S107: the packets in the sub-streams are aggregated by (InnerIP, Port) into new data flows;
Step S108: the packets within a short time window Δt are read from the data flow;
Step S109: the K P2P-TV platforms correspond to K feature code occurrence laws in total; one of these occurrence laws is used to extract a channel feature code from the packets in the short time window Δt;
Step S110: the channel feature code extracted in step S109 is matched against the channel feature codes in the library built in step S103; if the match fails, go to step S111; if it succeeds, go to step S117;
Step S111: determine whether the feature code extraction rules of all K platforms have been tried on the packets in this short time window; if all have been used, go to step S112; if not, return to step S109;
Step S112: add Δt to T (T is the accumulated duration of consecutive match failures), then determine whether T exceeds the threshold M (M is the time threshold for declaring unknown traffic); if T is greater than M, go to step S113; otherwise go to step S114;
Step S113: the accumulated duration T of consecutive match failures exceeds the threshold M; the data flow is judged to be unknown traffic and the identification process exits;
Step S114: the data in this Δt period was not matched and T is below the threshold M; the data flow is judged to need further identification; go to step S115;
Step S115: determine whether the data flow has been read to the end; if not, return to step S108; if so, go to step S116;
Step S116: determine whether U equals zero (U is the unknown-flow decision switch; its purpose is to judge data flows whose duration is less than M and in which no channel feature has been detected). If U equals zero, the data flow is judged to be unknown traffic and the identification process exits; otherwise the identification process exits directly;
Step S117: the feature code is matched; the recognition result is returned, the accumulated duration T of consecutive match failures is reset to zero, the unknown-flow decision switch U is incremented by 1, and the process goes to step S115;
If it is known in advance that one or more of the K platforms are being used to play P2P-TV in the campus network, and the channels played on N of those platforms are to be identified, steps S109 and S111 above are modified as follows while the other steps remain unchanged:
Step S109: the N P2P-TV platforms correspond to N feature code occurrence laws in total; one of these occurrence laws is used to extract a channel feature code from the packets in the short time window Δt;
Step S111: determine whether the feature code extraction rules of all N platforms have been tried on the packets in this short time window; if all have been used, add Δt to T (T is the accumulated duration of consecutive match failures) and go to step S112; if not, return to step S109;
Steps S107 to S117 are carried out for the data flow of every (InnerIP, Port) pair.
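The control flow of steps S108 to S117 for a single (InnerIP, Port) flow can be sketched in Python as below. The representation of an occurrence law as a dictionary with `sizes`, `offset` and `length` fields is hypothetical (the actual laws are given in Table 2), as are the function names and the return value; the variables `t_fail` and `u` correspond to T and U in the text.

```python
def extract_code(payload, law):
    """S109: cut a candidate feature code out of one payload.

    law is a hypothetical dict: {"sizes": set of payload sizes or None,
    "offset": int, "length": int}.
    """
    sizes = law.get("sizes")
    if sizes is not None and len(payload) not in sizes:
        return None
    start, end = law["offset"], law["offset"] + law["length"]
    if len(payload) < end:
        return None
    return payload[start:end]

def identify_flow(windows, laws, library, delta_t, threshold_m):
    """Run S108-S117 on one flow.

    windows: list of packet-payload lists, one list per time window of length delta_t.
    laws: {platform_no: law}; library: {(platform_no, code): channel_no}.
    Returns (list of (platform_no, channel_no) hits, is_unknown_traffic).
    """
    t_fail = 0.0   # T: accumulated duration of consecutive match failures
    u = 0          # U: number of successful matches so far
    results = []
    for window in windows:                          # S108, looped by S115
        hit = None
        for platform_no, law in laws.items():       # S109 / S111: try every platform law
            for payload in window:
                code = extract_code(payload, law)
                if code is not None and (platform_no, code) in library:   # S110
                    hit = (platform_no, library[(platform_no, code)])
                    break
            if hit is not None:
                break
        if hit is not None:                         # S117: match succeeded
            results.append(hit)
            t_fail = 0.0
            u += 1
        else:                                       # S112: accumulate failure time
            t_fail += delta_t
            if t_fail > threshold_m:                # S113: unknown traffic, exit
                return results, True
            # S114: keep reading the flow
    return results, u == 0                          # S116: short flow with no match
```

For the variant with N known platforms, `laws` would simply hold the N relevant entries.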
Three, as shown in Fig. 2, the P2P-TV channel recognition system comprises the following modules:
S201 is the channel feature code extraction module; it extracts the channel feature codes of all channels of each platform according to the channel feature code occurrence law of that platform;
S202 is the channel feature library module; it receives the channel feature codes extracted by module S201 and builds a channel feature library;
S203 is the campus network egress data acquisition module; it captures at the campus network egress the data exchanged between the campus network and the Internet;
S204 is the network data filtering module; it receives the network data captured by module S203, applies UDP filtering and keeps only the UDP packets;
S205 is the network data splitting module; it receives the network data filtered by module S204 and splits it according to (InnerIP, Port); each (InnerIP, Port) pair occurring in the data stream produces one sub-stream;
S206 is the channel feature code extraction and matching module; it receives the sub-streams produced by module S205, extracts channel feature codes according to each of the K channel feature code occurrence laws in turn, matches them against the channel feature library, and produces a matching result;
S207 is the output module; it outputs the matching result produced by S206.
With the method for extracting the occurrence law of P2P-TV channel feature codes according to the invention, and the P2P-TV channel recognition method and recognition system based on it, the data of the campus network can be monitored more accurately; the recognition method is fast and easy to implement, and its results are accurate.
Description of drawings
Fig. 1 is a flow chart of the P2P-TV channel recognition method of the present invention;
Fig. 2 is a module connection diagram of the P2P-TV channel recognition system of the present invention;
Fig. 3 is a deployment diagram of the P2P-TV channel recognition system described in the embodiment.
Embodiments
Embodiment 1
This embodiment uses the channel feature code occurrence law extraction method of the technical scheme of the invention to extract the channel feature code occurrence laws of four platforms: PPLive, PPStream, QQLive and UUSee.
UUSee is the most special of the four platforms. The installation directory of the UUSee client software contains a file named UUTV_UUPlayer.xml, which holds the feature code information of all UUSee channels, for example:
<Channel id="{004A3C5E-A8EF-424A-993F-FD61DA08AFF8}" name="HNTV">
The byte order of the first three groups of this identifier is reversed, that is, 004A3C5E-A8EF-424A becomes 5E3C4A00-EFA8-4A42, and the whole byte sequence after conversion is the feature code of the channel. To analyze the occurrence law of the channel feature code, Wireshark was used on a PC in the campus network to capture 1800 seconds of data for each of two different UUSee channels, HNTV and First Finance and Economics. The offline data was then read with Wireshark, and the occurrence law of the channel feature code was found by searching the data of each channel for its feature code.
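The byte-order conversion described above is the same as the mixed-endian GUID layout exposed by the `bytes_le` attribute of Python's standard `uuid.UUID` class, so it can be reproduced with a short sketch. The function name `uusee_feature_code` is hypothetical; the channel id is the HNTV example from the text.

```python
import uuid

def uusee_feature_code(channel_id: str) -> bytes:
    """Convert a UUSee channel id (GUID string) into the 16-byte feature code.

    bytes_le stores the first three GUID groups in little-endian byte order
    and leaves the remaining eight bytes unchanged.
    """
    return uuid.UUID(channel_id).bytes_le

code = uusee_feature_code("{004A3C5E-A8EF-424A-993F-FD61DA08AFF8}")
print(code.hex(" "))  # 5e 3c 4a 00 ef a8 4a 42 99 3f fd 61 da 08 af f8
```

The printed bytes agree with the UUSee HNTV row of Table 1.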
For the three platforms PPLive, PPStream and QQLive, the occurrence law of each platform's channel feature code was found with the method for extracting the occurrence law of P2P-TV channel feature codes. In the statistical analysis of these three platforms, the object of analysis was the first 36 bytes of the payload of each packet, and the occurrence frequency of contiguous byte segments of 4 to 16 bytes was counted.
Wireshark was used to capture, in two different time periods, 1800 seconds of data for each of two channels of each platform, HNTV and First Finance and Economics, six data segments in total. Wireshark was given a filter rule based on the main UDP port used by each platform, so that the captured data was the data of that platform's main UDP port. Code written in C++ on top of libpcap counted the occurrence frequency of contiguous byte segments of 4 to 16 bytes. After the statistics were obtained, analysis gave the channel feature codes of the two channels on the three platforms, as shown in Table 1:
Table 1: Feature codes of the HNTV and First Finance and Economics channels on the four platforms
Platform    Channel                        Feature code
PPLive      HNTV                           6f 81 a0 a1 0f ed 97 4c af 29 29 e5 f4 e4 e6 57
PPLive      First Finance and Economics    d1 4b 86 96 ff ff e5 44 ac c7 ad b3 e1 fa 5a 98
PPStream    HNTV                           19 00 43 00 00 9b 76 d5 95 00
PPStream    First Finance and Economics    19 00 43 00 00 9b 6b 3a 98 00
QQLive      HNTV                           de 9b c4 f5 00 00 00 00 00
QQLive      First Finance and Economics    9f a2 2f a5 00 00 00 00 00
UUSee       HNTV                           5e 3c 4a 00 ef a8 4a 42 99 3f fd 61 da 08 af f8
UUSee       First Finance and Economics    42 ab 6a 1a b7 ea d5 40 8e d5 14 d5 ae fe de 64
From the feature codes found, the occurrence law of each platform's feature code can be determined. The feature code occurrence laws of the four platforms are shown in Table 2:
Table 2: Feature code occurrence laws of the four platforms
[Table 2 is provided as an image in the original publication: Figure GSB00000791662000091]
Based on the channel feature code occurrence laws obtained by the above extraction method, this embodiment describes in detail how the channel recognition method is applied to channels of the four platforms PPLive, PPStream, QQLive and UUSee in a campus network:
Step 1: establish the channel feature library of the four platforms.
Ten channels of each of the four platforms are run on a PC in the campus network; the channel list of each platform is shown in Table 3:
Table 3: Channel list
Figure GSB00000791662000092
Using the occurrence law of each platform's channel feature code, the feature codes of the 40 channels in Table 3 are extracted, and a channel feature library covering the 40 channels of the four platforms is established. An example of the channel feature library is shown in Table 4:
Table 4: Example of the channel feature library
Platform number | Channel number | Channel feature code
1 | 1 | 84 c2 8a ca c5 c2 5a 47 ba 68 50 ed 0a 4d 25 55
1 | 2 | eb dc 61 3c 29 a1 c9 40 82 32 54 34 46 e3 49 18
1 | 3 | 8c 07 5b 6c 54 55 5d 4e 89 43 53 17 da a4 4b c5
1 | 7 | d1 4b 86 96 ff ff e5 44 ac c7 ad b3 e1 fa 5a 98
2 | 4 | 19 00 43 00 00 9b 3f 21 4c 00
2 | 3 | 19 00 43 00 00 9b 80 7a 1f 00
2 | 5 | 19 00 43 00 00 9b 05 ca 19 00
2 | 6 | 19 00 43 00 00 9b 72 a3 35 00
3 | 7 | 9f a2 2f a5 00 00 00 00 00
3 | 1 | 47 3b 65 ee 00 00 00 00 00
3 | 8 | 3b a4 86 0c 00 00 00 00 00
3 | 9 | f6 70 7d a9 00 00 00 00 00
4 | 10 | 35 cf b6 6f 89 4b dd 48 92 f9 1a 48 f2 8f 4b 98
4 | 7 | 42 ab 6a 1a b7 ea d5 40 8e d5 14 d5 ae fe de 64
4 | 11 | 12 f9 22 5f 2a 89 5d 4e a5 49 fd 42 5c 1c 9b 5e
4 | 12 | d2 70 d3 55 4f 45 56 a5 e4 c0 c4 13 3f 99 b6 e0
The platform numbers and channel numbers are listed in Table 5 and Table 6, respectively:
Table 5: Platform numbers
Platform | Number
PPLive | 1
PPStream | 2
QQLive | 3
UUSee | 4
Table 6: Channel numbers
Channel | Number | Channel | Number
CCTV6 | 1 | First finance and economics | 7
East finance and economics | 2 | Southeast satellite TV | 8
Dragon TV | 3 | Hubei satellite TV | 9
BTV | 4 | Beijing science and education | 10
Changsha politics and law channel | 5 | Shenzhen satellite TV | 11
Jiangsu satellite TV | 6 | ZTV | 12
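One possible in-memory form of the channel feature library is sketched below, using a few entries copied from Tables 4 to 6. The dictionary layout and the `lookup` helper are assumptions for illustration; the patent only states that each library entry holds at least a platform number, a channel number and a channel feature code, and that the library is kept on a database server.

```python
# Platform numbers from Table 5.
PLATFORMS = {1: "PPLive", 2: "PPStream", 3: "QQLive", 4: "UUSee"}

# A subset of the channel numbers from Table 6.
CHANNELS = {1: "CCTV6", 3: "Dragon TV", 7: "First finance and economics"}

# A subset of Table 4: feature code -> (platform number, channel number).
FEATURE_LIBRARY = {
    bytes.fromhex("84 c2 8a ca c5 c2 5a 47 ba 68 50 ed 0a 4d 25 55"): (1, 1),
    bytes.fromhex("d1 4b 86 96 ff ff e5 44 ac c7 ad b3 e1 fa 5a 98"): (1, 7),
    bytes.fromhex("19 00 43 00 00 9b 80 7a 1f 00"): (2, 3),
    bytes.fromhex("9f a2 2f a5 00 00 00 00 00"): (3, 7),
    bytes.fromhex("47 3b 65 ee 00 00 00 00 00"): (3, 1),
    bytes.fromhex("42 ab 6a 1a b7 ea d5 40 8e d5 14 d5 ae fe de 64"): (4, 7),
}

def lookup(code):
    """Return (platform name, channel name) for an exact match, else None."""
    entry = FEATURE_LIBRARY.get(bytes(code))
    if entry is None:
        return None
    platform_no, channel_no = entry
    return PLATFORMS[platform_no], CHANNELS[channel_no]

print(lookup(bytes.fromhex("9f a2 2f a5 00 00 00 00 00")))
```

Keying the dictionary on the raw feature code makes an exact match a single hash lookup, which is one simple way to realize the matching step; other string matching techniques would work equally well.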
The channel recognition system corresponding to the above channel recognition method is implemented in C++ using the libpcap library. The system is deployed as shown in Figure 3. Egress data of the campus network are delivered to the data acquisition server by network data mirroring at the switch. The data acquisition server captures the data with Wireshark, which is configured with a UDP filter rule so that the data are filtered as they are captured. The database server stores the channel feature library and records each recognition result. The P2P-TV channel identification terminal reads the filtered data from the data acquisition server.
Flow splitting is implemented as follows: the recognition system creates a vector container for each (InnerIP, Port) data flow, and every packet is stored in its corresponding vector, which accomplishes the splitting. Identification is then carried out on the data in each vector.
A PC in the campus network (IP 172.20.12.90) runs ten channels of each of the four platforms while the P2P-TV recognition system performs identification. Here Δt is provisionally set to 5 seconds and M to 300 seconds. The recognition results are shown in Tables 7 to 10. Tables 7 and 10 show that individual PPLive channels and most UUSee channels have a lower recall. This is not because the system fails to identify them, but because the channel feature code of UUSee occurs less frequently than those of the other three platforms. This does not affect accurate identification of the channels of these platforms: for channels of these two platforms, T does not exceed M during identification.
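The flow splitting described above can be sketched as follows. The embodiment uses one C++ vector per (InnerIP, Port) flow; this Python illustration uses a dictionary of lists instead, and the packet tuple layout `(inner_ip, port, timestamp, payload)` and the port numbers are assumptions made for the example.

```python
from collections import defaultdict

def split_flows(packets):
    """Group packets into one list per (InnerIP, Port) flow.

    packets: iterable of (inner_ip, port, timestamp, payload) tuples.
    Returns {(inner_ip, port): [(timestamp, payload), ...]} in arrival order.
    """
    flows = defaultdict(list)
    for inner_ip, port, timestamp, payload in packets:
        flows[(inner_ip, port)].append((timestamp, payload))
    return flows

# Hypothetical packets from the campus PC at 172.20.12.90; the port numbers
# and payload bytes are made up for illustration.
packets = [
    ("172.20.12.90", 5041, 0.00, b"\x19\x00\x43"),
    ("172.20.12.90", 8000, 0.01, b"\x9f\xa2\x2f"),
    ("172.20.12.90", 5041, 0.02, b"\x19\x00\x43"),
]
for key, flow in split_flows(packets).items():
    print(key, len(flow))
```

Identification then runs independently on each list, so traffic from different applications on the same host does not mix.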
Table 7: PPLive channel recognition results
Data set | Number of identifications | Precision | Recall
First finance and economics | 1182 | 100% (1182/1182) | 100% (1182/1182)
Dragon TV | 244 | 100% (237/237) | 97.1% (237/244)
Southeast satellite TV | 260 | 100% (239/239) | 91.9% (239/260)
Hebei satellite TV | 242 | 100% (235/235) | 97.1% (235/242)
Hubei TV station | 241 | 100% (94/94) | 39.1% (94/241)
HNTV | 236 | 100% (236/236) | 100% (236/236)
Jiangsu satellite TV | 382 | 100% (367/367) | 96.1% (367/382)
Shandong satellite TV | 498 | 100% (477/477) | 95.8% (477/498)
Yunnan satellite TV | 249 | 100% (206/206) | 82.7% (206/249)
ZTV | 482 | 100% (482/482) | 100% (482/482)
Table 8: PPStream channel recognition results
Data set | Number of identifications | Precision | Recall
HNTV | 185 | 100% (185/185) | 100% (185/185)
BTV | 670 | 100% (670/670) | 100% (670/670)
Anhui satellite TV | 131 | 100% (126/126) | 96.2% (126/131)
First finance and economics | 184 | 100% (184/184) | 100% (184/184)
ZTV | 168 | 100% (168/168) | 100% (168/168)
Shanghai TV play channel | 461 | 100% (461/461) | 100% (461/461)
Changsha politics and law channel | 113 | 100% (113/113) | 100% (113/113)
Guangdong satellite TV | 204 | 100% (204/204) | 100% (204/204)
Guizhou satellite TV | 172 | 100% (172/172) | 100% (172/172)
Yunnan satellite TV | 273 | 100% (273/273) | 100% (273/273)
Table 9: QQLive channel recognition results
Data set | Number of identifications | Precision | Recall
Chongqing satellite TV | 195 | 100% (195/195) | 100% (195/195)
CCTV6 | 789 | 100% (789/789) | 100% (789/789)
First finance and economics | 708 | 100% (708/708) | 100% (708/708)
Dragon TV | 680 | 100% (680/680) | 100% (680/680)
HNTV | 827 | 100% (827/827) | 100% (827/827)
Jiangsu satellite TV | 279 | 100% (279/279) | 100% (279/279)
ZTV | 636 | 100% (636/636) | 100% (636/636)
Southeast satellite TV | 182 | 100% (182/182) | 100% (182/182)
Hubei satellite TV | 174 | 100% (174/174) | 100% (174/174)
Sichuan satellite TV | 181 | 100% (181/181) | 100% (181/181)
Table 10: UUSee channel recognition results
Data set | Number of identifications | Precision | Recall
Hubei satellite TV | 293 | 100% (57/57) | 19.45% (57/293)
Dragon TV | 850 | 100% (620/620) | 72.9% (620/850)
ZTV | 2115 | 100% (1084/1084) | 51.3% (1084/2115)
HNTV | 3921 | 100% (2263/2263) | 57.7% (2263/3921)
Central authorities' security | 1045 | 100% (585/585) | 56.0% (585/1045)
First finance and economics | 977 | 100% (220/220) | 22.5% (220/977)
Southeast satellite TV | 2467 | 100% (1426/1426) | 57.8% (1426/2467)
Jiangsu satellite TV | 243 | 100% (115/115) | 47.3% (115/243)
Shenzhen satellite TV | 380 | 100% (194/194) | 51.1% (194/380)
BTV | 367 | 100% (213/213) | 58.0% (213/367)
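The precision and recall figures in Tables 7 to 10 can be reproduced from the fractions given in each row. The sketch below assumes one reading of the tables: the first numeric column is the total number of identification attempts, the denominator of the precision fraction is the number of attempts that returned a channel, and the numerator is the number of those that were correct. The function name and parameter names are hypothetical.

```python
def precision_recall(total, identified, correct):
    """Return (precision, recall) as fractions between 0 and 1.

    total:      number of identification attempts for the data set
    identified: attempts for which the system returned a channel
    correct:    returned channels that were the right one
    """
    precision = correct / identified if identified else 0.0
    recall = correct / total if total else 0.0
    return precision, recall

# UUSee, Hubei satellite TV row of Table 10: 293 attempts, 57 identified,
# all 57 correct.
precision, recall = precision_recall(total=293, identified=57, correct=57)
print(f"precision={precision:.2%} recall={recall:.2%}")
```

Under this reading a low recall means that many time windows contained no feature code at all, which matches the explanation given for UUSee.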

Claims (8)

1. A method for extracting the occurrence law of P2P-TV channel feature codes, characterized by comprising the following steps:
Step 1, data acquisition: data acquisition is performed in each of two different time periods; each time, two segments of network traffic of duration t, one for each of two channels of the same P2P-TV platform, are collected from a PC connected to the campus network, the two channels being the same in both acquisitions;
Step 2, data processing: protocol filtering and port filtering are applied to the four collected data streams, retaining the UDP data on the main UDP port, where the main UDP port is the UDP port carrying the largest volume of data within the sampling period of duration t;
Step 3, counting the frequency of occurrence of byte segments in the UDP data: for the first n bytes of the payload data of all packets in each data stream, the frequency of occurrence of consecutive byte segments of length m bytes is counted separately for each m, where m ∈ [4, l] and 4 < l < n, and the n-3 statistical results are sorted in descending order of frequency of occurrence;
Step 4, analyzing the byte segments to obtain preliminary channel feature codes: for the two channels of the same platform, from the two byte-segment frequency statistics of each channel, find the byte segment whose characters match exactly in both statistics and whose frequency of occurrence is high in both; if the two byte segments corresponding to the two channels satisfy the following conditions: 1) the two byte segments have the same length; 2) the two byte segments differ in content; 3) the positions at which the two byte segments appear follow the same rule; then the two byte segments are preliminarily taken as the feature codes of the corresponding channels;
Step 5, verifying the preliminary channel feature codes: the uniqueness and constancy of the two byte segments are verified using data from different channels of the same platform in different time periods, where uniqueness means that the byte segments at the corresponding position for other channels of the same platform all differ from one another, and constancy means that the byte segment does not change over a period longer than 6 months; if verification fails, return to step 4; if verification passes, the byte segment is regarded as the channel feature code of the channel;
Step 6, analyzing and summarizing the channel feature code occurrence law, comprising the packet size rule and the position rule of the channel feature code, where the packet size rule specifies the packet sizes in which the feature code mainly appears, and the position rule specifies the position in the payload data of a packet at which the channel feature code mainly appears.
2. The method for extracting the occurrence law of P2P-TV channel feature codes according to claim 1, characterized in that in step 1 the two different time periods are more than one day apart.
3. The method for extracting the occurrence law of P2P-TV channel feature codes according to claim 1, characterized in that in step 4 a byte segment whose frequency of occurrence is high in both statistics means that the frequencies of occurrence of the two byte segments both rank in the top 5 of the frequency statistics.
4. A P2P-TV channel recognition method based on the method for extracting the occurrence law of P2P-TV channel feature codes according to claim 1, characterized by comprising the following steps:
(1) Establishing the channel feature library
Step S101: given K P2P-TV platforms for which channel identification is required, find the channel feature code occurrence law of each of the K platforms using the method for extracting the occurrence law of P2P-TV channel feature codes according to claim 1;
Step S102: according to the channel feature code occurrence law of each platform, extract the channel feature codes of all channels of the K platforms;
Step S103: establish the channel feature library from the channel feature codes obtained, each entry in the channel feature library comprising at least the following information: platform number, channel number and channel feature code; platforms and channels are numbered as the situation requires;
(2) Channel identification process
Step S104: collect network data from the switch at the campus network egress;
Step S105: apply UDP protocol filtering to the collected network data, keeping only UDP data;
Step S106: split the filtered network data stream according to each (InnerIP, Port) pair that appears, obtaining sub-streams refined to port level, where InnerIP is an internal IP address of the campus network;
Step S107: the packets in each sub-stream are assembled by (InnerIP, Port) into a new data stream;
Step S108: read the packets within a short time window Δt from the data stream;
Step S109: the K P2P-TV platforms correspond to K feature code occurrence laws in total; use one of these occurrence laws to extract a channel feature code from the packets in the short time window Δt;
Step S110: match the channel feature code extracted in step S109 against the channel feature codes in the channel feature library obtained in step S103; if matching fails, go to step S111; if matching succeeds, go to step S117;
Step S111: determine whether the feature code extraction rules of all K platforms have been tried on the packets in this short time window; if all have been used, go to step S112; if not all have been used, return to step S109;
Step S112: add Δt to T, where T denotes the accumulated duration of consecutive matching failures, then determine whether T is greater than the threshold M, where M is the time threshold for judging traffic to be unknown; if T is greater than M, go to step S113; otherwise go to step S114;
Step S113: the accumulated duration T of consecutive matching failures is greater than the threshold M; the data stream is judged to be unknown traffic, and the identification process exits;
Step S114: the data in this Δt period were not matched successfully and T has not exceeded the threshold M; the data stream is judged to require further identification, and the process goes to step S115;
Step S115: determine whether the data stream has been read completely; if not, return to step S108; if so, go to step S116;
Step S116: determine whether the unknown-stream decision switch U equals zero, where U is used to judge data streams whose duration is less than M and in which no channel feature code has been detected; if U equals zero, the data stream is judged to be unknown traffic and the identification process exits; otherwise the identification process exits directly;
Step S117: the feature code is matched successfully; the recognition result is returned, the accumulated duration T of consecutive matching failures is reset to zero, the unknown-stream decision switch U is incremented by 1, and the process goes to step S115.
5. The P2P-TV channel recognition method according to claim 4, characterized in that in step S110 the matching of the channel feature code against the channel feature codes in the channel feature library is implemented by a string matching method.
6. The P2P-TV channel recognition method according to claim 4, characterized in that the network data splitting in step S106 proceeds as follows: a block of storage space is allocated for each (InnerIP, Port) data stream, and each packet is stored in the corresponding storage space, thereby splitting the data.
7. A P2P-TV channel recognition system, characterized by comprising the following modules:
a per-platform channel feature code extraction module, which extracts the channel feature codes of all channels of each platform according to the channel feature code occurrence law of that platform;
a channel feature library module, which receives the channel feature codes extracted by the per-platform channel feature code extraction module and establishes a channel feature library;
a campus network egress data acquisition module, which collects, at the campus network egress, the data exchanged between the campus network and the Internet;
a network data filtering module, which applies UDP filtering to the data collected by the campus network egress data acquisition module, keeping a data stream that contains only UDP packets;
a network data splitting module, which splits the data stream output by the network data filtering module according to (InnerIP, Port), producing one sub-stream for each (InnerIP, Port) pair that appears in the data stream;
a channel feature code extraction and matching module, which receives the sub-streams output by the network data splitting module, extracts channel feature codes according to the channel feature code occurrence laws one by one, matches them against the channel feature codes in the channel feature library, and produces a matching result;
a recognition result output module, which outputs the matching and recognition result;
wherein the output of the campus network egress data acquisition module is connected to the network data splitting module through the network data filtering module; the per-platform channel feature code extraction module, through the channel feature library module, is connected together with the network data splitting module to the input of the channel feature code extraction and matching module; and the output of the channel feature code extraction and matching module is finally connected to the recognition result output module.
8. The P2P-TV channel recognition system according to claim 7, characterized in that the campus network egress data acquisition module and the network data filtering module are implemented by a data acquisition server running Wireshark software, the data acquisition server being connected to a switch in the campus network; and the channel feature library module is implemented by a database server connected to the P2P-TV channel identification terminal through the campus network.
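The identification loop of steps S108 to S117 in claim 4 can be sketched for a single (InnerIP, Port) flow as follows. This is a simplified illustration, not the patented implementation: the per-platform extraction rules of Table 2 are only available as an image, so `make_rule` is a hypothetical stand-in that cuts a fixed-length segment at a fixed payload offset, and `identify_flow`, its parameters and the sample payloads are likewise assumptions.

```python
def make_rule(offset, length):
    """Hypothetical extraction rule: take `length` bytes at `offset`."""
    def rule(payload):
        segment = payload[offset:offset + length]
        return segment if len(segment) == length else None
    return rule

def identify_flow(windows, rules, library, delta_t=5, m_threshold=300):
    """Run the S108-S117 loop over one flow.

    windows: list of packet-payload lists, one list per time window Δt.
    rules:   one extraction rule per platform (K rules in total).
    library: {feature code: (platform number, channel number)}.
    Returns (recognition results, True if the flow is judged unknown).
    """
    results = []
    t_fail = 0   # T: accumulated duration of consecutive failures
    matched = 0  # U: number of successful matches so far
    for window in windows:                       # S108, S115
        hit = None
        for rule in rules:                       # S109, S111
            for payload in window:
                code = rule(payload)
                if code is not None and code in library:   # S110
                    hit = library[code]
                    break
            if hit is not None:
                break
        if hit is not None:                      # S117
            results.append(hit)
            t_fail = 0
            matched += 1
        else:                                    # S112
            t_fail += delta_t
            if t_fail > m_threshold:             # S113
                return results, True
    return results, matched == 0                 # S116

# QQLive, First finance and economics entry from Table 4.
code = bytes.fromhex("9f a2 2f a5 00 00 00 00 00")
library = {code: (3, 7)}
rules = [make_rule(0, len(code))]
windows = [[code + b"\x01\x02"], [b"\xff" * 20]]
print(identify_flow(windows, rules, library))
```

With Δt = 5 seconds and M = 300 seconds, as in the embodiment, a flow is given up as unknown only after more than 60 consecutive windows without a match, which is why a low per-window hit rate does not by itself prevent a channel from being identified.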
CN2010101054373A 2010-02-03 2010-02-03 Method for extracting occurrence law of P2P (Peer 2 Peer)-TV channel feature code, P2P-TV channel recognition method and recognition system based on same Expired - Fee Related CN101778115B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010101054373A CN101778115B (en) 2010-02-03 2010-02-03 Method for extracting occurrence law of P2P (Peer 2 Peer)-TV channel feature code, P2P-TV channel recognition method and recognition system based on same

Publications (2)

Publication Number Publication Date
CN101778115A CN101778115A (en) 2010-07-14
CN101778115B true CN101778115B (en) 2012-08-29

Family

ID=42514443

Country Status (1)

Country Link
CN (1) CN101778115B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20120829

Termination date: 20210203
