CN102857575A - Download method and system for Internet resources - Google Patents


Info

Publication number
CN102857575A
CN102857575A CN201210353411XA CN201210353411A CN102857575A CN 102857575 A CN102857575 A CN 102857575A CN 201210353411X A CN201210353411X A CN 201210353411XA CN 201210353411 A CN201210353411 A CN 201210353411A CN 102857575 A CN102857575 A CN 102857575A
Authority
CN
China
Prior art keywords
module
resource
browser
download
artificial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201210353411XA
Other languages
Chinese (zh)
Other versions
CN102857575B (en)
Inventor
张云飞
刘军
陈伟
庞景良
李锦根
黄兴红
周青辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen easou world Polytron Technologies Inc
Original Assignee
Shenzhen Yisou Science & Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Yisou Science & Technology Development Co Ltd
Priority to CN201210353411.XA
Publication of CN102857575A
Application granted
Publication of CN102857575B
Legal status: Active (current)
Anticipated expiration

Landscapes

  • Information Transfer Between Computers (AREA)

Abstract

The invention relates to the Internet field and provides a download method for Internet resources. The method comprises the following steps: a parsing module requests a download task for a resource, together with the description information of the resource, from a download module; the parsing module schedules a simulated manual browser module to simulate a person browsing the web page, loads a packet interception module to obtain resource request information, and sends the resource request information obtained by the packet interception module to the download module; and the download module downloads the resource according to the resource request information provided by the parsing module. The invention further provides a download system for Internet resources. With this technical scheme, audio and video files on the Internet are processed automatically in batches by a distributed system, which greatly improves the download efficiency of audio and video files. Compared with conventional download technology for Internet resources, the method and system also greatly improve the rate at which the real download addresses of audio and video files are obtained from the Internet and the files are successfully downloaded.

Description

A download method and system for Internet resources
Technical field
The present invention relates to the Internet field, and in particular to a download method and system for Internet resources.
Background art
With the rapid development of the Internet, information has entered an era of explosive growth, and search engines have become a tool people cannot do without. All the information a search engine offers comes from the Internet, so audio and video files have to be collected from that mass of information. Many audio and video files, however, cannot be obtained through a simple hyperlink, because most resource websites have adopted anti-hotlinking measures (for example, requiring cookies or specially configured HTTP headers), and fewer and fewer resource files can be downloaded from a hyperlink alone.
A new technology is therefore urgently needed to overcome this difficulty and obtain Internet resources, particularly audio and video files, more effectively.
Summary of the invention
The main purpose of the present invention is to provide a download method and system for Internet resources that solve two problems in the prior art: many audio and video files cannot be obtained through a simple hyperlink, and download efficiency is low.
To solve the above problems, the invention provides a download method for Internet resources, comprising:
The parsing module requests a resource download task and the description information of that resource from the download module;
The parsing module schedules the simulated manual browser module to simulate a person browsing the web page, at the same time loads the packet interception module to obtain the resource request information, and sends the resource request information obtained by the packet interception module to the download module;
The download module downloads the resource according to the resource request information provided by the parsing module.
Further, before the parsing module requests a resource download task and the description information of that resource from the download module, the method also comprises:
The download module requests a batch of resource download tasks and the description information of those resources from the database.
Further, after the download module downloads the resource according to the resource request information provided by the parsing module, the method also comprises:
The resource downloaded by the download module is stored in the database.
In the above method, the resources comprise video, audio and lyrics. The description information of a resource comprises the playback page URL of the audio, the playback page URL of the video, the browse page URL of the lyrics, the resource type, whether a mouse click is required, and the browser type. The resource type is audio, video or lyrics. The browser type is the IE browser or the Chrome browser. The resource request information comprises the HTTP request header, the URL and the resource task id of the resource.
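The two records described above can be pictured as plain data structures. The following is only an illustrative sketch: the class and field names (`ResourceDescription`, `ResourceRequestInfo`, `needs_click` and so on) are hypothetical and do not appear in the patent, which lists the fields but not their representation.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class ResourceType(Enum):
    AUDIO = "audio"
    VIDEO = "video"
    LYRICS = "lyrics"


class BrowserType(Enum):
    IE = "ie"
    CHROME = "chrome"


@dataclass
class ResourceDescription:
    """Description information that accompanies one download task (names are assumptions)."""
    page_url: str                # playback page (audio/video) or browse page (lyrics)
    resource_type: ResourceType
    needs_click: bool            # whether a mouse click has to be simulated on the page
    browser_type: BrowserType


@dataclass
class ResourceRequestInfo:
    """Resource request information returned by the packet interception module."""
    task_id: int                 # resource task id
    url: str                     # URL of the intercepted request
    request_headers: Dict[str, str] = field(default_factory=dict)
```

A task for an audio file that only plays after a click would then be written as `ResourceDescription("http://example.com/play/1", ResourceType.AUDIO, True, BrowserType.IE)`, where the URL is a placeholder.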
In the above method, the step in which the parsing module schedules the simulated manual browser module to simulate a person browsing the web page and at the same time loads the packet interception module to obtain the resource request information specifically comprises:
Initializing the simulated manual browser module;
Setting up a User Datagram Protocol (UDP) service to communicate with the packet interception module;
The parsing module determines, according to the browser type in the description information of the resource, the simulation mode of the simulated manual browser module and the loading mode of the packet interception module;
The simulated manual browser module simulates a person browsing the playback page URL of the audio, the playback page URL of the video, or the browse page URL of the lyrics;
At the same time, the parsing module loads the packet interception module to obtain the resource request information.
In the above method, the step in which the parsing module determines, according to the browser type in the description information of the resource, the simulation mode of the simulated manual browser module and the loading mode of the packet interception module specifically comprises:
If the browser type in the description information of the resource is the IE browser, the parsing module determines that the simulation mode of the simulated manual browser module is the IE simulation mode, that is, it simulates a person browsing with the IE browser, and at the same time uses the Microsoft Detours component to inject the packet interception module into the simulated manual browser module;
If the browser type in the description information of the resource is the Chrome browser, the parsing module determines that the simulation mode of the simulated manual browser module is the Chrome simulation mode.
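The choice made from the browser type is a simple dispatch. The sketch below is a hedged illustration only: the function names and the dictionary keys are hypothetical, and the value `"unspecified"` reflects the fact that the text does not state how the packet interception module is loaded in the Chrome case.

```python
from typing import Callable, Dict


def plan_for_ie() -> Dict[str, str]:
    # IE case from the text: IE simulation mode, interception module injected via Detours.
    return {"simulation": "ie", "interceptor_loading": "detours_injection"}


def plan_for_chrome() -> Dict[str, str]:
    # Chrome case: the text names only the simulation mode, not the loading mode.
    return {"simulation": "chrome", "interceptor_loading": "unspecified"}


# Hypothetical lookup table keyed by the browser type in the description information.
PLANS: Dict[str, Callable[[], Dict[str, str]]] = {
    "ie": plan_for_ie,
    "chrome": plan_for_chrome,
}


def choose_plan(browser_type: str) -> Dict[str, str]:
    """Return the simulation mode and interception loading mode for a browser type."""
    try:
        return PLANS[browser_type.lower()]()
    except KeyError:
        raise ValueError(f"unsupported browser type: {browser_type!r}") from None
```

Keeping the mapping in a table means a further browser type could be supported by adding one entry, without touching the calling code.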
In the above method, the step in which the packet interception module obtains the resource request information specifically comprises:
Intercepting each HTTP request header and recording the HTTP request header, URL and socketID; if the URL contains .wma/.MP3, this HTTP request header and URL are taken as the default resource request information;
Intercepting each HTTP response header; if the content following Content-Type in the response header contains an audio identifier or a video identifier, the HTTP request header and URL corresponding to this socketID are taken as the required resource request information; otherwise, if default resource request information exists, the default resource request information is taken as the required resource request information; otherwise, the packet interception module has failed to obtain the resource request information;
The packet interception module sends the obtained resource request information to the parsing module.
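The selection rule above can be sketched as a small state machine keyed by socketID. This is a minimal illustration under stated assumptions, not the patented implementation: the class `RequestSelector` and its method names are hypothetical, the "audio identifier or video identifier" is assumed to mean a Content-Type value starting with `audio/` or `video/`, the extension check is assumed to be case-insensitive, and the first matching response is assumed to win.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CapturedRequest:
    url: str
    headers: Dict[str, str]


class RequestSelector:
    """Tracks intercepted requests per socket and picks the resource request."""

    def __init__(self) -> None:
        self._by_socket: Dict[int, CapturedRequest] = {}
        self._default: Optional[CapturedRequest] = None
        self._chosen: Optional[CapturedRequest] = None

    def on_request(self, socket_id: int, url: str, headers: Dict[str, str]) -> None:
        # Record every request; a URL containing .wma or .mp3 becomes the default.
        request = CapturedRequest(url, dict(headers))
        self._by_socket[socket_id] = request
        if any(ext in url.lower() for ext in (".wma", ".mp3")):
            self._default = request

    def on_response(self, socket_id: int, headers: Dict[str, str]) -> None:
        # An audio or video Content-Type marks the request on this socket as the one wanted.
        content_type = next(
            (value for name, value in headers.items() if name.lower() == "content-type"),
            "",
        )
        is_media = content_type.strip().lower().startswith(("audio/", "video/"))
        if is_media and self._chosen is None and socket_id in self._by_socket:
            self._chosen = self._by_socket[socket_id]

    def result(self) -> Optional[CapturedRequest]:
        # Matched request first, then the default; None means interception failed.
        return self._chosen if self._chosen is not None else self._default
```

The fallback to the default covers servers that return a generic Content-Type for a media file whose URL still reveals its extension.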
The present invention also provides a download system for Internet resources, comprising:
A download module, used to request a batch of resource download tasks and the description information of those resources from the database, and to download a resource according to the resource request information provided by the parsing module;
A parsing module, used to request a resource download task and the description information of that resource from the download module, to schedule the simulated manual browser module to simulate a person browsing the web page while loading the packet interception module to obtain the resource request information, and to send the resource request information obtained by the packet interception module to the download module;
A simulated manual browser module, used to simulate a person browsing the web page;
A packet interception module, used to obtain the resource request information and send the obtained resource request information to the parsing module.
Further, the above download system also comprises:
A database, used to store the resources downloaded by the download module.
With the technical scheme of the present invention, audio and video files on the Internet are processed automatically in batches by a distributed system, which greatly improves the download efficiency of audio and video files. Compared with existing download technology for Internet resources, it greatly improves the rate at which the real download addresses of audio and video files are obtained directly from the Internet and the files are successfully downloaded.
Description of drawings
The accompanying drawings described here provide a further understanding of the invention and form a part of it. The illustrative embodiments of the invention and their description are used to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is the flow chart of the first embodiment of the invention;
Fig. 2 is the system structure diagram of the second embodiment of the invention.
Embodiments
To make the technical problem to be solved, the technical scheme and the beneficial effects of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and are not intended to limit it.
Fig. 1 is the flow chart of the first embodiment of the invention, which provides a download method for Internet resources, specifically comprising:
Step S101: the parsing module requests a resource download task and the description information of that resource from the download module;
Specifically, the resources comprise video, audio and lyrics. The description information of a resource comprises the playback page URL of the audio, the playback page URL of the video, the browse page URL of the lyrics, the resource type, whether a mouse click is required, and the browser type. The resource type is audio, video or lyrics. The browser type is the IE browser or the Chrome browser.
All the resources of a search engine come from the Internet, so audio and video files must be collected from the mass of information on the Internet.
Specifically, before the parsing module requests a resource download task and the description information of the resource from the download module, the download module requests a batch of resource download tasks and the description information of those resources from the database.
Step S102: the parsing module schedules the simulated manual browser module to simulate a person browsing the web page, at the same time loads the packet interception module to obtain the resource request information, and sends the resource request information obtained by the packet interception module to the download module. The resource request information comprises the HTTP request header, the URL and the resource task id of the resource.
As an embodiment, step S102 specifically comprises:
A. Initialize the simulated manual browser module;
B. Set up a User Datagram Protocol (UDP) service to communicate with the packet interception module;
C. The parsing module determines, according to the browser type in the description information of the resource, the simulation mode of the simulated manual browser module and the loading mode of the packet interception module;
If the browser type in the description information of the resource is the IE browser, the parsing module determines that the simulation mode of the simulated manual browser module is the IE simulation mode, that is, it simulates a person browsing with the IE browser, and at the same time uses the Microsoft Detours component to inject the packet interception module into the simulated manual browser module;
If the browser type in the description information of the resource is the Chrome browser, the parsing module determines that the simulation mode of the simulated manual browser module is the Chrome simulation mode.
D. The simulated manual browser module simulates a person browsing the playback page URL of the audio, the playback page URL of the video, or the browse page URL of the lyrics;
E. At the same time, the parsing module loads the packet interception module to obtain the resource request information:
1) Intercept each HTTP request header and record the HTTP request header, URL and socketID; if the URL contains .wma or .MP3, take this HTTP request header and URL as the default resource request information;
2) Intercept each HTTP response header; if the content following Content-Type in the response header contains an audio identifier or a video identifier, take the HTTP request header and URL corresponding to this socketID as the required resource request information; otherwise, if default resource request information exists, take the default resource request information as the required resource request information; otherwise, the packet interception module has failed to obtain the resource request information;
3) The packet interception module sends the obtained resource request information to the parsing module.
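Steps B and 3) describe a UDP channel over which the packet interception module reports back to the parsing module. The sketch below shows one way such a channel could look on a single machine. Everything specific in it is an assumption: the patent does not give the message format, so JSON is used here purely for illustration, and the function names, the loopback address and the two-second timeout are hypothetical.

```python
import json
import socket
from typing import Dict, Tuple


def open_udp_service(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Parsing-module side: bind a UDP socket (port 0 lets the OS pick a free port)."""
    service = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    service.bind((host, port))
    service.settimeout(2.0)  # do not wait forever if interception fails
    return service


def send_request_info(address: Tuple[str, int], info: Dict[str, object]) -> None:
    """Interception-module side: send one datagram carrying the request information."""
    payload = json.dumps(info).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        sender.sendto(payload, address)


def receive_request_info(service: socket.socket) -> Dict[str, object]:
    """Parsing-module side: read one datagram and decode it."""
    data, _sender = service.recvfrom(65535)
    return json.loads(data.decode("utf-8"))
```

In use, the parsing module would call `open_udp_service()`, pass `service.getsockname()` to the interception side, and then call `receive_request_info(service)`; a `socket.timeout` raised there corresponds to the failure case in step 2).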
Step S103: the download module downloads the resource according to the resource request information provided by the parsing module.
Specifically, after the download module downloads the resource according to the resource request information provided by the parsing module, the resource downloaded by the download module is stored in the database.
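Downloading "according to the resource request information" amounts to replaying the intercepted request with its original headers, so that cookies and the referer set by the playback page are preserved. The following sketch uses Python's standard `urllib.request` and is an assumption about how this could be done, not the patent's implementation; the helper name `build_download_request`, the choice to drop connection-level headers, and the use of GET are all illustrative.

```python
import urllib.request
from typing import Dict

# Headers tied to the original connection rather than to the resource itself (assumed list).
_CONNECTION_HEADERS = {"host", "content-length", "connection"}


def build_download_request(url: str, captured_headers: Dict[str, str]) -> urllib.request.Request:
    """Rebuild the intercepted request so the server sees the same cookies and referer."""
    headers = {
        name: value
        for name, value in captured_headers.items()
        if name.lower() not in _CONNECTION_HEADERS
    }
    return urllib.request.Request(url, headers=headers, method="GET")
```

The resulting object can be passed to `urllib.request.urlopen` to fetch the bytes, which would then be written to the database.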
As shown in Fig. 2, the present invention also provides a download system for Internet resources, comprising:
A download module 201, used to request a batch of resource download tasks and the description information of those resources from the database, and to download a resource according to the resource request information provided by the parsing module;
A parsing module 202, used to request a resource download task and the description information of that resource from the download module, to schedule the simulated manual browser module to simulate a person browsing the web page while loading the packet interception module to obtain the resource request information, and to send the resource request information obtained by the packet interception module to the download module;
A simulated manual browser module 203, used to simulate a person browsing the web page;
A packet interception module 204, used to obtain the resource request information and send the obtained resource request information to the parsing module.
Further, the above download system also comprises:
A database 205, used to store the resources downloaded by the download module.
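How the five components hand work to one another can be sketched in a few classes. This is a simplified, single-process illustration under assumptions: the class and method names are hypothetical, the database is an in-memory stand-in, and the browser simulation plus packet interception are collapsed into one `resolve` callable supplied by the caller. The distributed deployment mentioned in the text is not modeled.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Optional

Task = Dict[str, object]
RequestInfo = Dict[str, object]


class InMemoryDatabase:
    """Stand-in for database 205: hands out tasks and stores downloaded resources."""

    def __init__(self, tasks: List[Task]) -> None:
        self.tasks: Deque[Task] = deque(tasks)
        self.resources: Dict[object, bytes] = {}

    def take_tasks(self, size: int) -> List[Task]:
        batch: List[Task] = []
        while self.tasks and len(batch) < size:
            batch.append(self.tasks.popleft())
        return batch

    def store(self, task_id: object, data: bytes) -> None:
        self.resources[task_id] = data


class DownloadModule:
    """Stand-in for download module 201."""

    def __init__(self, database: InMemoryDatabase, fetch: Callable[[RequestInfo], bytes]) -> None:
        self.database = database
        self.fetch = fetch
        self.queue: Deque[Task] = deque()

    def load_batch(self, size: int) -> int:
        batch = self.database.take_tasks(size)   # request a batch of tasks from the database
        self.queue.extend(batch)
        return len(batch)

    def next_task(self) -> Optional[Task]:
        return self.queue.popleft() if self.queue else None

    def download(self, info: RequestInfo) -> None:
        self.database.store(info["task_id"], self.fetch(info))


class ParsingModule:
    """Stand-in for parsing module 202; `resolve` plays the role of modules 203 and 204."""

    def __init__(self, download_module: DownloadModule,
                 resolve: Callable[[Task], Optional[RequestInfo]]) -> None:
        self.download_module = download_module
        self.resolve = resolve

    def run_once(self) -> bool:
        task = self.download_module.next_task()  # step S101
        if task is None:
            return False
        info = self.resolve(task)                # step S102
        if info is not None:                     # None means interception failed
            self.download_module.download(info)  # step S103
        return True
```

Passing `fetch` and `resolve` in as callables keeps the sketch free of network access, so the task flow can be exercised with stubs.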
With the above technical scheme, audio and video files on the Internet are processed automatically in batches by a distributed system, which greatly improves the download efficiency of audio and video files. Compared with existing download technology for Internet resources, it greatly improves the rate at which the real download addresses of audio and video files are obtained directly from the network and the files are successfully downloaded.
The above description shows and describes a preferred embodiment of the invention. As stated earlier, however, it should be understood that the invention is not limited to the form disclosed here, and this should not be regarded as excluding other embodiments. The invention can be used in various other combinations, modifications and environments, and can be changed, within the scope of the inventive concept described here, through the above teaching or the technology or knowledge of the related field. Changes and variations made by those skilled in the art that do not depart from the spirit and scope of the invention shall all fall within the protection scope of the appended claims.

Claims (9)

1. A download method for Internet resources, characterized by comprising:
A parsing module requests a resource download task and the description information of that resource from a download module;
The parsing module schedules a simulated manual browser module to simulate a person browsing the web page, at the same time loads a packet interception module to obtain resource request information, and sends the resource request information obtained by the packet interception module to the download module;
The download module downloads the resource according to the resource request information provided by the parsing module.
2. The download method according to claim 1, characterized in that, before the parsing module requests a resource download task and the description information of that resource from the download module, the method also comprises:
The download module requests a batch of resource download tasks and the description information of those resources from a database.
3. The download method according to claim 1, characterized in that, after the download module downloads the resource according to the resource request information provided by the parsing module, the method also comprises:
The resource downloaded by the download module is stored in the database.
4. The download method according to any one of claims 1 to 3, characterized in that the resources comprise video, audio and lyrics; the description information of a resource comprises the playback page URL of the audio, the playback page URL of the video, the browse page URL of the lyrics, the resource type, whether a mouse click is required, and the browser type; the resource type is audio, video or lyrics; the browser type is the IE browser or the Chrome browser; the resource request information comprises the HTTP request header, the URL and the resource task id of the resource.
5. The download method according to any one of claims 1 to 3, characterized in that the step in which the parsing module schedules the simulated manual browser module to simulate a person browsing the web page and at the same time loads the packet interception module to obtain the resource request information specifically comprises:
Initializing the simulated manual browser module;
Setting up a User Datagram Protocol (UDP) service to communicate with the packet interception module;
The parsing module determines, according to the browser type in the description information of the resource, the simulation mode of the simulated manual browser module and the loading mode of the packet interception module;
The simulated manual browser module simulates a person browsing the playback page URL of the audio, the playback page URL of the video, or the browse page URL of the lyrics, and at the same time the parsing module loads the packet interception module to obtain the resource request information.
6. The download method according to claim 5, characterized in that the step in which the parsing module determines, according to the browser type in the description information of the resource, the simulation mode of the simulated manual browser module and the loading mode of the packet interception module specifically comprises:
If the browser type in the description information of the resource is the IE browser, the parsing module determines that the simulation mode of the simulated manual browser module is the IE simulation mode, that is, it simulates a person browsing with the IE browser, and at the same time uses the Microsoft Detours component to inject the packet interception module into the simulated manual browser module;
If the browser type in the description information of the resource is the Chrome browser, the parsing module determines that the simulation mode of the simulated manual browser module is the Chrome simulation mode.
7. The download method according to claim 5, characterized in that the step in which the packet interception module obtains the resource request information specifically comprises:
Intercepting each HTTP request header and recording the HTTP request header, URL and socketID; if the URL contains .wma or .MP3, this HTTP request header and URL are taken as the default resource request information;
Intercepting each HTTP response header; if the content following Content-Type in the response header contains an audio identifier or a video identifier, the HTTP request header and URL corresponding to this socketID are taken as the required resource request information; otherwise, if default resource request information exists, the default resource request information is taken as the required resource request information; otherwise, the packet interception module has failed to obtain the resource request information.
8. A download system for Internet resources, characterized by comprising:
A download module, used to request a batch of resource download tasks and the description information of those resources from a database, and to download a resource according to the resource request information provided by a parsing module;
The parsing module, used to request a resource download task and the description information of that resource from the download module, to schedule a simulated manual browser module to simulate a person browsing the web page while loading a packet interception module to obtain the resource request information, and to send the resource request information obtained by the packet interception module to the download module;
The simulated manual browser module, used to simulate a person browsing the web page;
The packet interception module, used to obtain the resource request information and send the obtained resource request information to the parsing module.
9. The download system according to claim 8, characterized by also comprising:
A database, used to store the resources downloaded by the download module.
CN201210353411.XA 2012-09-21 2012-09-21 Download method and system for Internet resources Active CN102857575B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210353411.XA CN102857575B (en) 2012-09-21 2012-09-21 Download method and system for Internet resources

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210353411.XA CN102857575B (en) 2012-09-21 2012-09-21 Download method and system for Internet resources

Publications (2)

Publication Number Publication Date
CN102857575A (en) 2013-01-02
CN102857575B (en) 2016-12-21

Family

ID=47403763

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210353411.XA Active CN102857575B (en) 2012-09-21 2012-09-21 Download method and system for Internet resources

Country Status (1)

Country Link
CN (1) CN102857575B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102360349A (en) * 2011-07-21 2012-02-22 深圳市万兴软件有限公司 Method and device for acquiring audio/video link address in webpage
CN102510536A (en) * 2011-12-21 2012-06-20 中国传媒大学 Method for downloading videos and audios of internet

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105141638A (en) * 2015-09-29 2015-12-09 北京奇艺世纪科技有限公司 Method and device for downloading video resources
CN105141638B (en) * 2015-09-29 2018-08-03 北京奇艺世纪科技有限公司 Method and device for downloading video resources
CN107888940A (en) * 2016-09-30 2018-04-06 法乐第(北京)网络科技有限公司 Method and system for downloading video and its related resources
CN110392022A (en) * 2018-04-19 2019-10-29 阿里巴巴集团控股有限公司 Network resource access method, computer equipment and storage medium
CN110392022B (en) * 2018-04-19 2022-04-05 阿里巴巴集团控股有限公司 Network resource access method, computer equipment and storage medium
CN111404898A (en) * 2020-03-06 2020-07-10 北京创世云科技有限公司 Anti-stealing-link method and device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN102857575B (en) 2016-12-21

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CP03 Change of name, title or address

Address after: 518057 C Building 5, Nanshan District software industry base, Shenzhen, Guangdong 403-409, China

Patentee after: Shenzhen easou world Polytron Technologies Inc

Address before: 518026 Guangdong city of Shenzhen province Futian District Binhe Road and CaiTian Road Interchange Union Square Tower A, A5501-A

Patentee before: Shenzhen Yisou Science & Technology Development Co., Ltd.

CP03 Change of name, title or address