CN100593928C - Stream media content downloading method based on data characteristic - Google Patents

Info

Publication number
CN100593928C
Authority
CN
China
Prior art keywords
file
packet
data
media content
session
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN200610113575A
Other languages
Chinese (zh)
Other versions
CN101155122A (en)
Inventor
张冬明
张勇东
郭俊波
李锦涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS
Priority to CN200610113575A
Publication of CN101155122A
Application granted
Publication of CN100593928C
Expired - Fee Related
Anticipated expiration

Abstract

The invention relates to a method for downloading streaming media content based on data characteristics, comprising: starting a packet-capture thread; opening a link address that contains video content and caching the captured packets, as a binary data stream, in a cache file; reading the cache file and separating it into a number of session files according to session port; determining, from the size of each session file and the media content start codes found in it, whether the session file contains media content, and deleting session files that do not; and reading the packets in each session file in turn according to the session index file, deleting retransmitted and erroneous packets on the basis of the sequence number and acknowledgment number in each packet, and then sorting the valid media data packets and storing them in a new media content file. Only the address of a web page containing the media content is needed to download the required media content accurately, and the download does not degrade the performance of the streaming media website.

Description

A streaming media content downloading method based on data characteristics
Technical field
This method belongs to the field of Internet resource downloading, and in particular relates to a streaming media content downloading method based on data characteristics.
Background art
With the build-out of high-bandwidth networks, network streaming media services are developing rapidly. Network media search has become a necessary means of finding required resources quickly, and Internet video search is the most important of these technologies. Internet video search first obtains the video content, then extracts features such as color and texture from its key frames and stores them in a database for user queries. At present, obtaining the video resources is the bottleneck in the development of Internet video search. General download software needs the URL of a resource in order to download it, and usually uses multi-threaded parallel downloading to improve speed, which can greatly reduce the performance of the streaming media website. In addition, because network bandwidth is limited, playback is often not smooth and frequently pauses when users browse network streaming media content; users would like to make full use of the data traffic already generated during browsing to improve the quality of viewing streaming media content over low-bandwidth networks.
Current streaming media services and video blog websites show the following characteristics:
(1) First, network streaming media generally delivers content to the client according to a specific protocol. For example, currently popular video blog websites deliver video content over TCP.
(2) Second, in network transmission, streaming media content is bursty over short periods and generates a large volume of data traffic.
(3) Finally, streaming media content has its own characteristics. For example, streaming media content in a specific format has a corresponding characteristic header.
Using the characteristics of streaming media transmission and of the streaming media content itself, a general streaming media content downloading method can be designed.
Summary of the invention
The purpose of the invention is to provide a general downloading method for streaming media content that, on the one hand, correctly downloads the required media content and, on the other hand, does not degrade the performance of the streaming media website.
To achieve this purpose, the invention proposes a streaming media content downloading method based on data characteristics, comprising the following steps:
(1) Start the packet-capture thread.
(2) Open the link address that contains video content, cache the captured packets in a cache file as a binary data stream, and at the same time create a cache index file.
The cache index file records the lengths of all packets in order, so that the content of each packet can be read in turn from the binary data stream of the cache file according to this index file. After all packets have been transmitted, close the opened link and the packet-capture thread.
(3) Read the cache file according to the cache index file and store all packets that have the same session port in one session file, obtaining multiple session files; at the same time, create a session index file for each session file.
(4) Determine whether each session file contains media content from the size of the session file and the media content start codes in the file, and delete session files that contain no media content.
(5) Read the packets in the session file in turn according to the session index file and obtain the session port of each packet to determine whether the packet is a media data packet or an acknowledgment packet: a packet whose destination address is the MAC address of the machine running the program is a media data packet; otherwise it is an acknowledgment packet. Determine whether each packet is valid from the sequence number and acknowledgment number in the packet. Then sort the media data packets that have been confirmed valid and store them in a newly created media content file.
In the above technical solution, the cache index file in step (2) records the lengths of all packets in order.
In the above technical solution, the session index file in step (3) records the lengths of all packets in order.
In the above technical solution, media data packets are identified in step (5) as follows: the session packets are read in turn according to the session index file, the session port of each packet is obtained, and a packet whose destination address is the MAC address of the machine running the program is identified as a media data packet.
In the above technical solution, the media data packets confirmed valid in step (5) are stored in the newly created media content file as follows: once a packet has been identified as a media data packet and confirmed valid, it is sorted by its sequence number; in sorted order, the data length of each valid packet and its offset in the session file are written to a media content index file; finally, all media data packets are read from the session file in turn according to the media content index file and written to the newly created media content file.
The above technical solution further comprises the following step: after the media content file is obtained, the media type is detected from the media content start code, and data in the media content file that cannot be decoded is then deleted.
Based on the above analysis, the streaming media content downloading method based on data characteristics provided by the invention has the following advantages:
(1) It thoroughly solves the problem that media content cannot be downloaded because its link address is difficult to obtain.
(2) It thoroughly solves the problem of downloading media content that has no corresponding URL.
(3) Only the address of a web page containing the media content is needed to download the media content.
(4) Data is cached during browsing, so the download does not degrade the performance of the streaming media website.
Description of drawings
Fig. 1 is a flow chart of downloading streaming media content with this method.
Embodiment
The invention is described further below with reference to the drawing and a specific embodiment.
Embodiment 1
With reference to Fig. 1, the streaming media content downloading method based on data characteristics in this embodiment comprises the following steps:
1. Start the packet-capture thread
With the network card set to promiscuous mode, all packets passing through the network card can be captured. This embodiment uses the open-source WinPcap project for the packet-capture thread.
2. Open the link address containing video content and save the captured packets
General download software in the prior art can download only when given a link address that points directly at the resource to be downloaded. The link address needed in this embodiment is a link that contains video content; it is located on the web page the user is browsing and does not need to point directly at the resource to be downloaded.
Compared with packet capture, writing to a file is a slow operation. To avoid dropping packets because of file writes, no packet separation is done during capture; instead, all packets are cached in one file as a binary data stream and a corresponding cache index file is created. The cache index file records the lengths of all packets in order, so the content of each packet can be read in turn from the binary data stream of the cache file according to this index file. This makes the subsequent packet separation and feature analysis easier. After all packets have been transmitted, the opened link and the packet-capture thread are closed.
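The cache file and its index can be illustrated with a minimal Python sketch. The patent only says that the index records packet lengths in order; the 4-byte little-endian length field, the file names, and the helper names `append_packet`, `read_packets` and `round_trip_demo` are assumptions made for illustration.

```python
import os
import struct
import tempfile


def append_packet(cache, index, packet: bytes) -> None:
    """Append one raw packet to the cache and record its length in the index."""
    cache.write(packet)
    # Assumed index layout: one 4-byte little-endian length per packet.
    index.write(struct.pack("<I", len(packet)))


def read_packets(cache_path: str, index_path: str):
    """Yield packets in capture order by walking the index file."""
    with open(cache_path, "rb") as cache, open(index_path, "rb") as index:
        while True:
            raw = index.read(4)
            if len(raw) < 4:
                break  # end of index
            (length,) = struct.unpack("<I", raw)
            yield cache.read(length)


def round_trip_demo():
    """Write three stand-in packets, then read them back through the index."""
    packets = [b"\x01\x02\x03", b"", b"hello world"]
    with tempfile.TemporaryDirectory() as tmp:
        cache_path = os.path.join(tmp, "capture.bin")  # hypothetical file names
        index_path = os.path.join(tmp, "capture.idx")
        with open(cache_path, "wb") as cache, open(index_path, "wb") as index:
            for packet in packets:
                append_packet(cache, index, packet)
        return list(read_packets(cache_path, index_path))
```

Because the capture path only appends to two files, it does no per-packet parsing, which is the point of deferring separation until after capture.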
3. Packet separation
Packets are read in order from the binary data in the cache file according to the cache index file and written to different files according to their session ports. A packet has a source port A and a destination port B, and its session port is denoted <A, B>. In packet separation, <A, B> and <B, A> are treated as the two directions of the same session, so packets with source port A and destination port B and packets with source port B and destination port A are stored in the same session file. A corresponding session index file is created during packet separation; it records, in order, the lengths of all packets that belong to the session. Each session file has one session index file.
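Treating <A, B> and <B, A> as one session amounts to grouping packets by an unordered port pair. The sketch below shows one way to do this in Python; the `Packet` type and the function names are hypothetical, and packets are kept in memory rather than written to session files to keep the example short.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical parsed packet; only the fields needed for separation."""
    src_port: int
    dst_port: int
    payload: bytes


def session_key(pkt: Packet) -> tuple:
    """Order-independent key, so both directions map to the same session."""
    return (min(pkt.src_port, pkt.dst_port), max(pkt.src_port, pkt.dst_port))


def split_sessions(packets) -> dict:
    """Group packets by session while preserving capture order in each group."""
    sessions = defaultdict(list)
    for pkt in packets:
        sessions[session_key(pkt)].append(pkt)
    return dict(sessions)


def session_index(session_packets) -> list:
    """Session index: the length of every packet in the session, in order."""
    return [len(pkt.payload) for pkt in session_packets]
```

A real implementation would probably include the IP addresses in the key as well, since two different hosts can use the same port pair; the patent text mentions only ports.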
4. Determine the packets that carry media content
The session file that corresponds to media content is generally large, so only files larger than a certain number of bytes are examined; the threshold is 512 KB in this embodiment. Session files longer than this threshold are checked for media information. The characteristic information checked covers common media content; for example, the start code of WMV media content is 0x3026B275, that of MPEG-1 media content is 0x000001BA, that of FLV media content is 0x464C5601, and that of RM media content is 0x2E524D46. Once one of these signatures is detected, the session file is considered to contain media content. One link may contain multiple media contents, in which case each session is processed in turn.
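The size filter and start-code check can be sketched as follows. The four start codes are the ones listed in the text; reading "512kb" as 512 × 1024 bytes, searching the whole session for the code, and the names `START_CODES` and `detect_media` are assumptions.

```python
# Start codes listed in the embodiment, keyed by their raw bytes.
START_CODES = {
    bytes.fromhex("3026B275"): "wmv",
    bytes.fromhex("000001BA"): "mpeg1",
    bytes.fromhex("464C5601"): "flv",
    bytes.fromhex("2E524D46"): "rm",
}

# Assumed interpretation of the 512 KB threshold.
SIZE_THRESHOLD = 512 * 1024


def detect_media(session_bytes: bytes, threshold: int = SIZE_THRESHOLD):
    """Return the media type whose start code appears first, or None.

    Sessions at or below the size threshold are rejected without scanning.
    """
    if len(session_bytes) <= threshold:
        return None
    best = None  # (position, media type) of the earliest match so far
    for code, name in START_CODES.items():
        pos = session_bytes.find(code)
        if pos != -1 and (best is None or pos < best[0]):
            best = (pos, name)
    return best[1] if best else None
```

A four-byte signature can also occur by chance inside unrelated binary data, so a match here is only a candidate; the later steps still have to validate and decode the content.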
5. Packet checksum verification
The checksum of each packet in the session files determined to contain media content is verified, and packets that fail the check are discarded. This example uses the checksum verification method for TCP packets.
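For reference, the standard TCP checksum over IPv4 is a 16-bit ones' complement sum over a pseudo-header (source address, destination address, a zero byte, protocol number 6, TCP length) followed by the TCP segment. The patent does not give code for this step; the sketch below is one possible Python rendering, and the function names are hypothetical.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit ones' complement sum with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def _pseudo_header(src_ip: bytes, dst_ip: bytes, tcp_length: int) -> bytes:
    """IPv4 pseudo-header: addresses, zero byte, protocol 6 (TCP), length."""
    return src_ip + dst_ip + struct.pack("!BBH", 0, 6, tcp_length)


def tcp_checksum(src_ip: bytes, dst_ip: bytes, segment: bytes) -> int:
    """Checksum for a segment whose checksum field (bytes 16-17) is zero."""
    data = _pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return ~ones_complement_sum(data) & 0xFFFF


def tcp_checksum_ok(src_ip: bytes, dst_ip: bytes, segment: bytes) -> bool:
    """A received segment is intact when the sum over everything is 0xFFFF."""
    data = _pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return ones_complement_sum(data) == 0xFFFF
```

Note that packets captured on the sending host may carry an unfilled checksum when the network card offloads the calculation; that does not affect received media data packets, which are the ones this step keeps.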
6. Invalid packet removal and packet sorting
The session file processed in step 5 is processed further to remove invalid packets and sort the packets, as follows. The session packets are read in turn according to the session index file and the session port of each packet is obtained to determine whether the packet is a media data packet or an acknowledgment packet (a packet whose destination address is the MAC address of the machine running the program is a media data packet; otherwise it is an acknowledgment packet). Whether a packet is valid is determined from its sequence number and acknowledgment number, both of which are carried in the packet.
Once a packet has been identified as a media data packet and confirmed valid, it is sorted by its sequence number. In sorted order, the data length of each valid packet and its offset in the session file are written to the media content index file.
Finally, all media data packets are read from the session file in turn according to the media content index file and written to the newly created media content file.
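The sorting and reassembly described above can be sketched in Python as follows. The text does not spell out the validity rule, so this sketch assumes the simplest one: a packet with no payload, or with a sequence number that has already been seen, is treated as invalid (a retransmission). Sequence-number wraparound and partially overlapping segments are ignored, and the `Record` type and function names are hypothetical.

```python
from typing import NamedTuple


class Record(NamedTuple):
    """Hypothetical per-packet record gathered while scanning a session file."""
    seq: int     # TCP sequence number of the first payload byte
    offset: int  # where the payload starts inside the session file
    length: int  # payload length in bytes


def build_media_index(records) -> list:
    """Drop empty and repeated packets, sort by sequence number, and return
    the media content index as (length, offset) pairs."""
    seen = set()
    unique = []
    for rec in records:
        if rec.length == 0 or rec.seq in seen:
            continue  # assumed rule: the first copy of a sequence number wins
        seen.add(rec.seq)
        unique.append(rec)
    unique.sort(key=lambda rec: rec.seq)
    return [(rec.length, rec.offset) for rec in unique]


def assemble(session_bytes: bytes, media_index) -> bytes:
    """Read payloads in index order and concatenate them into media content."""
    return b"".join(session_bytes[off:off + length] for length, off in media_index)
```

Storing only lengths and offsets in the index means the payloads are copied once, when the final media content file is written.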
7. Further removal of invalid packets
The media content file obtained in the previous step may still contain invalid content. According to the detected media type, such as wmv, flv, rm or swf, data that cannot be decoded is deleted. This yields the required media content.

Claims (4)

1. A streaming media content downloading method based on data characteristics, characterized by comprising the following steps:
(1) starting a packet-capture thread;
(2) opening a link address that contains video content, caching the captured packets in a cache file as a binary data stream, and at the same time creating a cache index file;
the cache index file recording the lengths of all packets in order, so that the content of each packet can be read in turn from the binary data stream of the cache file according to this index file;
after all packets have been transmitted, closing the opened link and the packet-capture thread;
(3) reading the cache file according to the cache index file and storing all packets that have the same session port in one session file, thereby obtaining multiple session files, and at the same time creating a session index file for each session file;
(4) determining whether each session file contains media content from the size of the session file and the media content start codes in the file, and deleting session files that contain no media content;
(5) reading the session packets in turn according to the session index file and obtaining the session port of each packet to determine whether the packet is a media data packet or an acknowledgment packet, wherein a packet whose destination address is the MAC address of the machine running the program is a media data packet and any other packet is an acknowledgment packet; determining whether each packet is valid from the sequence number and acknowledgment number in the packet; and then sorting the media data packets that have been confirmed valid and storing them in a newly created media content file.
2. The streaming media content downloading method based on data characteristics according to claim 1, characterized in that the session index file in step (3) records the lengths of all packets in order.
3. The streaming media content downloading method based on data characteristics according to claim 1, characterized in that, in step (5), the media data packets confirmed valid are stored in the newly created media content file as follows: once a packet has been identified as a media data packet and confirmed valid, it is sorted by its sequence number; in sorted order, the data length of each valid packet and its offset in the session file are written to a media content index file; finally, all media data packets are read from the session file in turn according to the media content index file and written to the newly created media content file.
4. The streaming media content downloading method based on data characteristics according to claim 1, characterized by further comprising the following step: after the media content file is obtained, detecting the media type from the media content start code, and then deleting data in the media content file that cannot be decoded.
CN200610113575A 2006-09-30 2006-09-30 Stream media content downloading method based on data characteristic Expired - Fee Related CN100593928C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200610113575A CN100593928C (en) 2006-09-30 2006-09-30 Stream media content downloading method based on data characteristic

Publications (2)

Publication Number Publication Date
CN101155122A CN101155122A (en) 2008-04-02
CN100593928C true CN100593928C (en) 2010-03-10

Family

ID=39256569

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200610113575A Expired - Fee Related CN100593928C (en) 2006-09-30 2006-09-30 Stream media content downloading method based on data characteristic

Country Status (1)

Country Link
CN (1) CN100593928C (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4743239B2 (en) * 2008-08-22 2011-08-10 ソニー株式会社 Wireless communication apparatus, communication system, communication control method, and program
CN101739433B (en) * 2008-11-14 2012-12-19 鸿富锦精密工业(深圳)有限公司 System and method for correcting webpage download error
CN101739288B (en) * 2008-11-18 2013-01-30 康佳集团股份有限公司 Non-interpretation type dynamic downloading-running method
CN101483653B (en) * 2009-02-17 2012-04-25 杭州华三通信技术有限公司 Method, device and system for providing application layer data to the application layer from network appliances
CN102045294B (en) * 2009-10-23 2014-02-26 宏碁股份有限公司 Data transmission method and system
CN101841557B (en) * 2010-03-02 2013-01-02 中国科学院计算技术研究所 P2P streaming media downloading method and system based on orthogonal list
CN102096712A (en) * 2011-01-28 2011-06-15 深圳市五巨科技有限公司 Method and device for cache-control of mobile terminal
CN103873956B (en) * 2012-12-12 2018-02-13 中国电信股份有限公司 Media file playing method, system, player, terminal and media storage platform
CN103678527B (en) * 2013-12-02 2017-10-24 Tcl集团股份有限公司 A kind of video filtering method and system based on video title and content
AU2015358292B2 (en) * 2014-12-02 2021-09-23 Bankvault Pty Ltd Computing systems and methods

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004075077A1 (en) * 2003-02-19 2004-09-02 Maui X-Stream Inc. Methods, data structures, and systems for processing media data streams
CN1798097A (en) * 2004-12-24 2006-07-05 腾讯科技(深圳)有限公司 Method for buffering data in stream media
CN1816053A (en) * 2006-03-10 2006-08-09 清华大学 Flow-media direct-broadcasting P2P network method based on conversation initialization protocol

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20100310

Termination date: 20190930