CN101719907B - Passive load information monitoring method based on BitTorrent - Google Patents

Passive load information monitoring method based on BitTorrent

Info

Publication number
CN101719907B
Authority
CN
China
Prior art keywords
information
file
load information
hash
load
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN200910219162A
Other languages
Chinese (zh)
Other versions
CN101719907A (en)
Inventor
蔡皖东
丁军平
胡润东
马富达
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nantong Changrong Mechanical and Electrical Co., Ltd.
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University
Priority to CN200910219162A
Publication of CN101719907A
Application granted
Publication of CN101719907B
Legal status: Active
Anticipated expiration

Landscapes

  • Storage Device Security (AREA)

Abstract

The invention discloses a passive load information monitoring method based on BitTorrent, used to monitor and analyze the propagation of specific information, and its audience, in BitTorrent-based P2P networks. The method comprises the steps of: downloading the required files to a local hard disk using BitTorrent software; generating file sample information from the existing files; applying Hash processing to the file sample information and to the intercepted load information; comparing the hashed load information with the hashed file samples using a string matching method to judge whether they match; and extracting and processing the audience information obtained. Because the captured load data is hashed and pattern matching is performed on the hashed data, there is no need to recover the original content of the files or to consider the format of the transmitted file, which saves a large amount of computer processing time.

Description

Passive load information monitoring method based on BitTorrent
Technical field
The invention belongs to the field of network security and relates to an information monitoring method, in particular to a passive load information monitoring method based on BitTorrent, which is used to monitor and analyze the propagation of specific information, and its audience, in BitTorrent-based P2P networks.
Background technology
The document "Ho Gyun Lee, Taek Yong Nam, Jong Soo Jang. The Method of P2P Traffic Detecting for P2P Harmful Contents Prevention [C]. ICACT 2005, Feb 21-23, 2005, Phoenix Park, Korea" discloses a content-recovery-based method for monitoring specific content in P2P file sharing systems. The method first identifies P2P data flows and divides them into text, image, and video data according to the type of content transmitted. For text data it uses dictionary comparison: keywords carried in the text are compared with a pre-built dictionary of harmful terms in order to monitor harmful content. For image content, the paper only addresses the monitoring of pornographic content: image processing is used to measure the proportion of the whole image occupied by "skin area", and if that proportion exceeds a certain threshold the image is considered to carry pornographic content. For video files, two monitoring methods are used: one extracts key frames from the video file and judges their content; the other recovers a fragment of the video file and judges from the content of that fragment whether the video file contains illegal content. The shortcomings of this scheme are that the system architecture is too complex and lacks a unified detection mechanism; recovering the content of data messages is technically difficult, and encrypted data content cannot be recovered; and image and video detection can only be done after the fact, requires complex image processing techniques, takes a long computer processing time, has poor real-time performance, and has low detection accuracy. Because a monitoring model based on traffic content must recover and identify the traffic content transmitted by the P2P software, it cannot recover or monitor encrypted P2P content.
Summary of the invention
To overcome the deficiencies of the prior art when monitoring and analyzing the propagation of specific information over BitTorrent and its audience, namely an overly complex system architecture, the lack of a unified detection mechanism, and long computer processing time, the present invention provides a passive load information monitoring method based on BitTorrent. "File sample" information is first generated from a file that already exists; Hash calculation and pattern matching are then applied to the intercepted "load information" and the "file sample" to judge whether the "load information" currently being transmitted belongs to the monitored specific information. The method can effectively monitor every file transmitted over the network via BitTorrent for which a "file sample" exists, and can reduce computer processing time.
The technical solution adopted by the present invention to solve its technical problem is a passive load information monitoring method based on BitTorrent, characterized by comprising the following steps:
(a) Use BitTorrent software to download the required file to the local hard disk, and generate "file sample" information from the file that now exists: the "file sample" information is Hash-processed, the Hash results must all have the same length, and the Hash results are concatenated to form the data of the "file sample";
(b) Apply Hash encryption to the "load information", so that the encrypted "load information" can be compared with the Hash-processed "file sample";
(c) Use a string matching method to compare the "load information" with the "file sample": the Hash result of the "load information" is compared with positions n*20+1 to (n+1)*20 of the "file sample", where n is 0, 1, 2, ..., Maxn.
[Equation image defining Maxn; not reproduced in this text]
When a piece of "load information" is judged to match a "file sample", record the information of the TCP link carrying the current "load information", the "file sample" number, and the number of matching pieces of "load information", then continue comparing subsequent "load information". When the number of pieces of "load information" on a TCP link that match one "file sample" reaches the threshold set in the system, the "load information" transmitted on that TCP link is judged to belong to the monitored specific file;
(d) From the source IP address, source port, destination IP address, and destination port of the TCP link, obtain the audience's access address information and the direction of file transfer; from the "file sample" to which the "load information" belongs, obtain the information of the file being transmitted on the current P2P link; take the current system time as the file transfer time; and save the audience information and file transfer information obtained to the database. The saved data comprise: source IP address, source port, destination IP address, destination port, name of the transmitted file, file Hash value, transfer time, number of "load information" matches, and record write time. For duplicated data only one record is kept, and it is stored in the audience database in a standard data format.
Compared with the prior art, the advantage of the present invention is that, because pattern matching is performed on the data obtained by Hash calculation over the intercepted "load information", there is no need to recover the original content of the file or to care about the format of the transmitted file; Hash calculation and pattern matching on the "load information" actually intercepted are sufficient. File recovery is a complex technique involving several fields of computing, so omitting this step saves a large amount of computer processing time and allows BitTorrent information transmitted over the network to be monitored in real time. Experimental verification and practical tests show that, for BitTorrent information transmitted over a network with 1G bandwidth, the method can perform interception, Hash calculation, and pattern matching in real time, and can accurately obtain information on the audience taking part in the propagation of specific information, forming a target audience database for that information. This provides network security supervision departments with a means of audience monitoring and evidence collection, and also provides basic data and a decision basis for macroscopic analysis, early warning, and prediction of the network security situation.
The present invention is further described below with reference to the accompanying drawing and an embodiment.
Description of drawings
The accompanying drawing is a schematic flow chart of the main program of the present invention.
Embodiment
The present invention adopts a modular architecture, which keeps the implementation of each functional module independent; modules communicate with one another through interfaces. The first level is the interface part, consisting of user interface management: it is the main interface through which the system interacts with the user, and it implements user interaction with the software and the calling of the other modules. The second level is the implementation part, consisting of the file sample generation module, the load information interception and processing module, the load information and file sample comparison module, audience information extraction and processing, the system parameter setting and reading module, and operation help. Each implementation module is described below:
File sample generation module: generates the "file sample" information of a known file;
Load information interception and processing module: intercepts the "load information" transmitted over the network and processes the intercepted "load information";
Load information and file sample comparison module: compares the "load information" obtained with the "file samples" and judges whether the current "load information" belongs to a monitored specific file;
Audience information extraction and processing: according to the comparison result, extracts the audience information from the TCP link and saves it to the database;
System parameter setting and reading module: sets the operating parameters of the system and reads specified parameters as required;
Operation help: describes the operations that the system can perform.
The specific implementation steps of the method of the invention are as follows:
1) Generation of the "file sample".
BitTorrent transmission on the network is monitored by "passively" intercepting BitTorrent traffic on the network. "Load information" refers to the concrete data carried on the TCP link when, after a connection has been established between BitTorrent nodes, actual data transfer takes place. This data is generally the actual content of the file being transferred; it is not encrypted and has no other information attached.
Passive monitoring based on "load information" requires that the monitored file already exists; this file is the basis for all subsequent operations. The file can be obtained in several ways, for example by using BitTorrent software to download it to the local hard disk. "File sample" information is then generated from the existing file. A "file sample" is the data formed by dividing the specific file into parts of a specified size, Hash-processing each part so that every Hash result has the same length, and concatenating the Hash results. By comparing the "file sample" with intercepted load information, it can be judged whether the load information is part of the specific file. The BitTorrent transfer protocol specifies that the smallest unit exchanged between nodes is the "piece". The size of a "piece" is specified in the "seed file" and is typically 64K, 128K, 256K, 512K, or another size, but must be a multiple of 16K, because a "piece" must be divided into fixed-size "slices" for transmission, and the size of a "slice" is fixed at 16K. When making a "file sample", the size of a "piece" may differ from one "seed file" to another, whereas the size of a "slice" is fixed. If the "piece" size were used to make "file samples", the same file would need multiple "file samples", and "load information" would have to be intercepted in units of "pieces", which would increase the system's computation and comparison workload. We therefore use the "slice" size to generate the "file sample". This reduces the computational complexity of the system and unifies the interception and comparison process, with no need to consider the particulars of the "seed file".
In the user interface, click the file selection button, select the file for which a "sample" is to be made in the file selection dialog that pops up, and then click the "file sample" generation button. The system automatically calls the "file sample" generation module, which reads the content of the specified file and generates the "file sample" information. The storage path of the generated "file sample" file is set in the system parameters; its file name is the same as the original file name, with the suffix ".lar". In the experiment, 19 files were selected and 19 "file samples" were generated, all successfully.
When generating a "file sample" from an existing file, the first 16K of the specified file is read into memory and a Hash is calculated over it. So that the results all have the same length and cannot be cracked by other systems, we use the Secure Hash Algorithm (hereinafter SHA1) for the Hash calculation. This algorithm has the following properties: the original information cannot be recovered from the encrypted information, and different original information produces different encrypted information, but the length of the encrypted information is fixed. Applying the algorithm to 16K of file content produces a result 20 bytes long. After the result for the first 16K of file content has been obtained, it is held temporarily in memory, in what we call the "file sample staging area". The next 16K of the file is then read and hashed, and its Hash result is appended to the end of the "file sample staging area"; these steps are repeated until the whole file has been processed. Because file sizes are not always a multiple of 16K, the last part processed may be smaller than 16K, but SHA1 still produces a 20-byte result for content shorter than 16K, so generation of the "file sample" is not affected.
After the entire content of the file has been processed, the content of the "file sample staging area" is the actual content of the "file sample"; it is saved as a binary file for later use.
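The generation procedure described above can be illustrated with a minimal sketch. It assumes Python and its standard `hashlib` module, neither of which is named in the patent; the function names `build_file_sample` and `save_file_sample` and the constant names are hypothetical and chosen only for illustration.

```python
import hashlib

SLICE_SIZE = 16 * 1024   # fixed 16K "slice" size described in the text
DIGEST_LEN = 20          # length in bytes of one SHA1 result


def build_file_sample(path):
    """Return the concatenated SHA1 digests of each 16K slice of the file."""
    staging = bytearray()  # plays the role of the "file sample staging area"
    with open(path, "rb") as f:
        while True:
            chunk = f.read(SLICE_SIZE)
            if not chunk:
                break
            # The last chunk may be shorter than 16K; its digest is still 20 bytes.
            staging += hashlib.sha1(chunk).digest()
    return bytes(staging)


def save_file_sample(path, out_path):
    """Write the file sample as a binary file and return the number of slices."""
    sample = build_file_sample(path)
    with open(out_path, "wb") as out:
        out.write(sample)
    return len(sample) // DIGEST_LEN
```

For example, a file of 40192 bytes is split into three slices (two of 16384 bytes and one of 7424 bytes), so its sample is 60 bytes long. Following the naming rule in the text, the output path would be the original file name with the suffix ".lar".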
2) Interception and processing of "load information".
The "load information" comes mainly from BitTorrent network traffic obtained by passive monitoring. Because a great deal of content is transmitted over the network, accurately intercepting "load information" within this mass of data relies on the transmission characteristics of BitTorrent "load information". First, "load information" must be transmitted over a TCP link. Second, the BitTorrent protocol adds a marker in front of the "load information"; the content of the marker is "0000400907", in which 0040 indicates that the size of the data transmitted is 16K and 07 indicates that the data transmitted is "load information". The "slice" transmitted over TCP is 16K in size, whereas a TCP packet is 1406 bytes, so TCP splits the data to be transmitted into multiple TCP packets. When a TCP packet is intercepted whose leading bytes are "0000400907", the subsequent TCP packets must be combined with it to reconstruct the actual "load information".
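The reassembly step can be sketched as follows, again in Python as an illustrative assumption. The sketch treats "0000400907" as a hexadecimal byte sequence and scans the in-order TCP payloads of one connection for it, which is slightly more permissive than checking only the front of each packet. It also assumes, based on the public BitTorrent wire protocol rather than on the patent text, that 8 bytes (piece index and offset) sit between the marker and the 16K of data. The name `extract_slices` and the constants are hypothetical.

```python
PREFIX = bytes.fromhex("0000400907")  # marker described in the text
SLICE_SIZE = 16 * 1024                # size of one transmitted "slice"
EXTRA_HEADER = 8                      # assumption: piece index + offset fields


def extract_slices(segments):
    """Yield each 16K slice found in the in-order TCP payloads of one link."""
    buf = bytearray()
    for seg in segments:
        buf += seg
        while True:
            start = buf.find(PREFIX)
            if start < 0:
                # Keep only a tail that could be the beginning of a marker.
                del buf[:max(0, len(buf) - (len(PREFIX) - 1))]
                break
            end = start + len(PREFIX) + EXTRA_HEADER + SLICE_SIZE
            if len(buf) < end:
                # Marker seen but slice incomplete: wait for more packets.
                del buf[:start]
                break
            yield bytes(buf[start + len(PREFIX) + EXTRA_HEADER:end])
            del buf[:end]
```

Each value yielded is one candidate piece of "load information", ready to be hashed. A production capture path would also have to handle out-of-order and retransmitted packets, which this sketch does not attempt.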
Once the "load information" has been obtained, comparing it directly with the original file would increase the amount of comparison computation. We therefore calculate a Hash over the "load information" and compare the Hash result with the "file sample", which reduces the system's comparison workload. The Hash method applied to the "load information" is the same as that used to generate the "file sample": both use the SHA1 algorithm.
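As a small illustration of why hashing reduces the comparison workload, the following sketch (Python's standard `hashlib`, an assumption; `hash_load` is a hypothetical name) reduces a 16K piece of "load information" to a 20-byte value, so each comparison touches 20 bytes instead of 16384.

```python
import hashlib


def hash_load(payload: bytes) -> bytes:
    """Return the 20-byte SHA1 digest of one piece of load information."""
    return hashlib.sha1(payload).digest()


# A full 16K slice and a shorter final slice both hash to 20 bytes.
full_slice = bytes(16 * 1024)
short_slice = b"tail of a file"
digest_lengths = (len(hash_load(full_slice)), len(hash_load(short_slice)))
```

Because the same function is applied when the "file sample" is built, equal slices always produce equal digests, which is what makes the later string comparison meaningful.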
In the user interface, click the "load information" interception and processing button. The system automatically calls the "load information" interception and processing module, which intercepts the "load information" transmitted over the network and calculates its Hash; once the calculation is complete, it directly calls the "load information" and "file sample" comparison module to perform the comparison. In 1 hour, 38953 pieces of "load information" were successfully intercepted and processed.
3) Comparison of "load information" with the "file sample".
When the system starts running, it first calls the initialization method of the "load information" and "file sample" comparison module, which reads the "file samples" of all files to be monitored into memory to speed up comparison.
After a piece of "load information" has been intercepted and hashed, the judgment method of this module is called. It quickly compares the incoming "load information" with the "file samples" in memory to judge whether the "load information" belongs to a monitored file. If it does not, the method returns immediately; if it does, the audience information extraction and processing module is called directly to obtain the audience information for the current "load information".
Of the 38953 pieces of "load information" intercepted and judged, 7795 belonged to monitored files.
The comparison of "load information" with the "file sample" uses a string matching method. Because the SHA1 result is 20 characters long, the Hash result of the "load information" is compared with positions n*20+1 to (n+1)*20 of the "file sample", where n is 0, 1, 2, ..., Maxn.
[Equation image defining Maxn; not reproduced in this text]
To speed up comparison, the monitored "file samples" are read into memory when the system starts, and comparisons are made directly in memory.
When a piece of "load information" is judged to match a "file sample", it cannot be concluded immediately that this "load information" is part of the monitored file, because a single piece of "load information" may coincide by chance. Instead, the information of the TCP link carrying the current "load information", the "file sample" number, and the number of matching pieces of "load information" are recorded, and comparison continues with subsequent "load information". When the number of pieces of "load information" on a TCP link that match one "file sample" reaches the threshold set in the system, the "load information" transmitted on that TCP link is judged to belong to the monitored specific file.
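The comparison and threshold logic can be sketched as below. This is an illustrative Python sketch, not the patented implementation. Since the equation defining Maxn is not reproduced in the text, the sketch assumes Maxn is the number of 20-byte entries in the sample minus one. The names `matches_sample` and `MatchTracker` are hypothetical, and a TCP link is represented by any hashable value such as a 4-tuple.

```python
from collections import defaultdict

DIGEST_LEN = 20  # length of one SHA1 result


def matches_sample(load_digest, sample):
    """Compare the digest with positions n*20+1 .. (n+1)*20 of the sample."""
    max_n = len(sample) // DIGEST_LEN - 1  # assumption: last full entry
    for n in range(max_n + 1):
        # 1-based positions n*20+1..(n+1)*20 are the 0-based slice below.
        if sample[n * DIGEST_LEN:(n + 1) * DIGEST_LEN] == load_digest:
            return True
    return False


class MatchTracker:
    """Count matches per (TCP link, file sample) and apply the threshold."""

    def __init__(self, samples, threshold):
        self.samples = samples          # sample id -> in-memory file sample
        self.threshold = threshold      # value set in the system parameters
        self.counts = defaultdict(int)  # (link, sample id) -> match count

    def observe(self, link, load_digest):
        """Return the sample id once the link reaches the threshold, else None."""
        for sample_id, sample in self.samples.items():
            if matches_sample(load_digest, sample):
                self.counts[(link, sample_id)] += 1
                if self.counts[(link, sample_id)] >= self.threshold:
                    return sample_id
        return None
```

With a threshold of 2, the first matching digest on a link only increments the counter; the second one causes `observe` to report the sample, reflecting the rule that a single match is not treated as conclusive. A set of digests per sample would make lookups faster, but the linear scan is kept here because it mirrors the position-by-position comparison in the text.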
4) extraction of audient's information and processing.
When judging " load information " that TCP link transmitted and be exactly the specific file that belongs to monitored; Need extract audient's information that current TCP link is comprised: the reference address information and the paper conveyance direction that at first can obtain audient's information according to source IP address, source port and purpose IP address, destination interface in the TCP link; Can obtain the fileinfo that current P2P link is transmitted according to " file sample " under " load information " again, extract current system time as the file transfer time.
For the audient's information and the file transmission information that obtain; We need be saved in these information in the database; So that use when carrying out other operation; The data of preserving comprise: source IP address, source port, purpose IP address, destination interface, the file name of being transmitted, file hash value, transmission time, " load information " number of matches, record, writing time.
Audient's information for collecting is put in order; Can form complete audient's monitor message; But owing in the process of acquisition of information, do not filter, so in collected data, comprise redundancy and duplicate message, need filter the data of having collected: the data for repeating only keep one; Data format with standard deposits the audience data storehouse in, guarantees uniqueness, consistency and the integrality of data in the audience data storehouse.
When " load information " is judged as the monitored file of data; System can call the extraction and the processing module of audient's information automatically; Obtain audient's information of the P2P link at " load information " place; Audient's information of obtaining comprises: source IP address, source port, purpose IP address, destination interface, the file name of being transmitted, file hash value, transmission time, " load information " number of matches, record, writing time; Extract audient's information to judging " load information " that belong to monitored file, the audient's information data recording quantity that obtains is 7795, and these audient's information are filtered, put in order, and obtaining different audient's information contents is 3.
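Storing the audience records with duplicate filtering can be sketched as follows. The patent does not name a database or define exactly when two records count as duplicates, so this sketch makes two assumptions: it uses Python's standard `sqlite3` module, and it treats records with the same source, destination, and file Hash value as duplicates. The table name, column names, and function names are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS audience (
    src_ip TEXT, src_port INTEGER, dst_ip TEXT, dst_port INTEGER,
    file_name TEXT, file_hash TEXT, transfer_time TEXT,
    match_count INTEGER, record_time TEXT,
    UNIQUE (src_ip, src_port, dst_ip, dst_port, file_hash)
)
"""


def open_db(path=":memory:"):
    """Open the audience database and create the table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_record(conn, record):
    """Insert one 9-field audience record; duplicates are silently skipped."""
    conn.execute(
        "INSERT OR IGNORE INTO audience VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
        record,
    )
    conn.commit()


def count_records(conn):
    """Return the number of distinct audience records stored."""
    return conn.execute("SELECT COUNT(*) FROM audience").fetchone()[0]
```

Under these assumptions, saving many records for the same link and file leaves a single row, which corresponds to the way the 7795 raw records in the experiment were reduced to 3 distinct entries.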

Claims (1)

1. A passive load information monitoring method based on BitTorrent, characterized by comprising the following steps:
(a) Use BitTorrent software to download the required file to the local hard disk, and generate "file sample" information from the file that now exists: the "file sample" information is Hash-processed, the Hash results must all have the same length, and the Hash results are concatenated to form the data of the "file sample";
(b) Apply Hash encryption to the "load information", so that the encrypted "load information" can be compared with the Hash-processed "file sample";
(c) Use a string matching method to compare the "load information" with the "file sample": the Hash result of the "load information" is compared with positions n*20+1 to (n+1)*20 of the "file sample", where n is 0, 1, 2, ..., Maxn.
[Equation image defining Maxn; not reproduced in this text]
When a piece of "load information" is judged to match a "file sample", record the information of the TCP link carrying the current "load information", the "file sample" number, and the number of matching pieces of "load information", then continue comparing subsequent "load information". When the number of pieces of "load information" on a TCP link that match one "file sample" reaches the threshold set in the system, the "load information" transmitted on that TCP link is judged to belong to the monitored specific file;
(d) From the source IP address, source port, destination IP address, and destination port of the TCP link, obtain the audience's access address information and the direction of file transfer; from the "file sample" to which the "load information" belongs, obtain the information of the file being transmitted on the current P2P link; take the current system time as the file transfer time; and save the audience information and file transfer information obtained to the database. The saved data comprise: source IP address, source port, destination IP address, destination port, name of the transmitted file, file Hash value, transfer time, number of "load information" matches, and record write time. For duplicated data only one record is kept, and it is stored in the audience database in a standard data format.
CN200910219162A 2009-11-26 2009-11-26 Passive load information monitoring method based on BitTorrent Active CN101719907B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200910219162A CN101719907B (en) 2009-11-26 2009-11-26 Passive load information monitoring method based on BitTorrent

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN200910219162A CN101719907B (en) 2009-11-26 2009-11-26 Passive load information monitoring method based on BitTorrent

Publications (2)

Publication Number Publication Date
CN101719907A CN101719907A (en) 2010-06-02
CN101719907B true CN101719907B (en) 2012-08-29

Family

ID=42434422

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200910219162A Active CN101719907B (en) 2009-11-26 2009-11-26 Passive load information monitoring method based on BitTorrent

Country Status (1)

Country Link
CN (1) CN101719907B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105763317B (en) * 2016-04-25 2018-10-23 江苏科技大学 Secret information transmission method based on the BitTorrent protocol Have message
US10911337B1 (en) * 2018-10-10 2021-02-02 Benjamin Thaddeus De Kosnik Network activity monitoring service
CN111683036B (en) * 2020-02-29 2022-05-27 新华三信息安全技术有限公司 Data storage method and device and message identification method and device
CN112799853B (en) * 2021-04-13 2021-06-22 广州征安电子科技有限公司 Load message overload protection method based on digital signal transmission

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1633111A (en) * 2005-01-14 2005-06-29 中国科学院计算技术研究所 High-speed network traffic flow classification method
CN101447985A (en) * 2008-12-26 2009-06-03 刘学明 Digital credentials method based on notarization information

Also Published As

Publication number Publication date
CN101719907A (en) 2010-06-02

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: NANTONG CHANGRONG MECHANICAL +ELECTRICAL CO., LTD.

Free format text: FORMER OWNER: NORTHWESTERN POLYTECHNICAL UNIVERSITY

Effective date: 20140813

Owner name: NORTHWESTERN POLYTECHNICAL UNIVERSITY

Effective date: 20140813

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 710072 XI AN, SHAANXI PROVINCE TO: 226600 NANTONG, JIANGSU PROVINCE

TR01 Transfer of patent right

Effective date of registration: 20140813

Address after: 226600 Hai'an County Development Zone, Nantong City, Jiangsu Province

Patentee after: Nantong Changrong Mechanical and Electrical Co., Ltd.

Patentee after: Northwestern Polytechnical University

Address before: 710072 No. 127 Friendship West Road, Xi'an, Shaanxi Province

Patentee before: Northwestern Polytechnical University