CN103391312A - Resource offline downloading method and device - Google Patents



Publication number
CN103391312A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310259489XA
Other languages
Chinese (zh)
Other versions
CN103391312B (en
Inventor
陈夺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing 360 Zhiling Technology Co ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Beijing Qihoo Technology Co Ltd, Qizhi Software Beijing Co Ltd
Priority to CN201310259489.XA
Publication of CN103391312A
Application granted
Publication of CN103391312B
Legal status: Active

Landscapes

  • Information Transfer Between Computers (AREA)

Abstract

The invention relates to a resource offline downloading method and device. The method comprises: determining the network operator to which a resource belongs according to the network attribute of the resource to be downloaded offline; selecting an offline download server from the offline download server cluster of that network operator according to a set task control strategy, the offline download server being used to download the resource offline; and distributing the offline downloading task of the resource to the selected offline download server for offline downloading. By analyzing the network attribute of the resource to be downloaded offline and determining the network operator to which it belongs, and by distributing tasks, according to the task control strategy, to offline download servers in the clusters of the different operators, the method and device can improve offline download speed and server processing capacity, reduce the pressure on the offline download servers, and maximize resource utilization.

Description

Resource offline downloading method and device
Technical field
The present invention relates to the field of network resource management, and in particular to a resource offline downloading method and device for managing the offline downloading of network resources.
Background art
Offline downloading means that the server of a download tool downloads a resource in advance on the user's behalf. After the user sends an offline download command, the offline download server downloads the resource the user needs to the shared storage space provided by the offline download server, even while the user is not online. The user can then download the resource at high speed from this shared storage space to the local machine. Offline downloading saves the user the time of leaving a machine running while waiting and, most importantly, frees up the user's network bandwidth for other things.
Offline downloading has three main advantages:
1. High speed:
Using the large bandwidth of the offline download server, a resource or file can be downloaded to the user's cloud storage faster than the user could download it directly over the Internet (each cloud storage user can upload files to the cloud storage, or use offline downloading to save resources to it, and so on). After the download completes, the user can download the resource or file from the cloud storage to the local machine quickly and stably (that is, synchronize the resource or file to the local machine whenever the user needs it). Because the network bandwidth of the offline download server is much larger than that of the user side, the offline download server can provide high download speeds.
2. Stability:
The offline download server can provide a stable download speed for the user. When a user downloads directly, the download is affected by the network environment: for example, when a user of operator A downloads a resource from operator B, the download speed may fluctuate unpredictably. Offline downloading, by contrast, can provide a stable speed.
3. Time saving:
The offline download server saves the time the user's machine would otherwise be left running and provides high-speed downloads, so download time is used more efficiently.
Fig. 1 shows the working principle of the offline download servers in an existing offline downloading scheme. The user sends an offline download instruction to a Linux Virtual Server (LVS) 11. The load scheduler in the Linux Virtual Server 11 sends the user's download request to a group of offline download servers 12-1, 12-2 and 12-3, from which one offline download server is selected to execute the offline downloading task. LVS uses IP load balancing and content-based request distribution, so that multiple offline download servers each download the resources that users need. The load scheduler has good throughput and can distribute requests evenly across the offline download servers; it can also automatically mask failed offline download servers, so that the group of offline download servers forms a high-performance, highly available virtual server cluster. The structure of the server cluster is transparent to clients, and neither the client programs nor the server programs need to be modified.
The network environment (network operator) in which a resource resides is uncertain: the resource may be in a Netcom network environment or in a Telecom network environment. Using only dual-line offline download servers is too costly, while using fixed offline download servers may lead to cross-operator downloads. For example, when an offline download server on the Netcom network downloads a resource located in the Telecom network environment, the speed is very slow, which degrades the quality of the offline download service. A common LVS scheduling strategy keeps sending tasks to an offline download server as long as that server has not stopped working, and cannot schedule in real time according to the load of each offline download server. This scheme is therefore suitable only when the offline download servers are under little load.
Moreover, most of the resources corresponding to the same uniform resource locator (URL) are duplicates, and existing offline download servers cannot determine whether the resource corresponding to a URL has changed (that is, the network resources behind a URL are not deduplicated). This causes repeated downloads and, in turn, considerable extra overhead.
Summary of the invention
In view of the above defects of the prior art, the technical problem mainly solved by the present invention is to provide a resource offline downloading method and device that allocate offline downloading tasks according to the network attribute of the resource, so as to solve the problems of the prior art: cross-network downloads, repeated downloads, and the resulting inefficiency of the offline download servers.
According to one aspect of the present invention, a resource offline downloading method is provided, comprising: determining, according to the network attribute of a resource to be downloaded offline, the network operator to which the resource belongs; selecting, according to a set task control strategy, an offline download server from the offline download server cluster of that network operator, the offline download server being used to download the resource offline; and distributing the offline downloading task of the resource to the selected offline download server for offline downloading.
Wherein, determining the network operator to which the resource belongs according to the network attribute of the resource to be downloaded offline comprises: obtaining the domain name information corresponding to the uniform resource locator (URL) of the resource, and resolving the IP address corresponding to the domain name information; and querying a database according to that IP address to obtain the network operator corresponding to the IP address and determining it as the network operator to which the resource belongs, the database storing network operators and their IP addresses.
Wherein, the task control strategy is to distribute the offline downloading task of the resource to the offline download server with the smallest current load weight.
Wherein, the load weight is: k1 * CPU usage + k2 * disk surplus + k3 * memory surplus + k4 * bandwidth resources, where k1 is the weight for CPU usage, k2 is the weight for disk surplus, k3 is the weight for memory surplus, and k4 is the weight for bandwidth resources.
Wherein, before determining the network operator to which the resource belongs according to the network attribute of the resource to be downloaded offline, the method further comprises: obtaining the deduplication feature of the resource to be downloaded offline, the deduplication feature being an identifier of the resource generated from the URL, size and content fragments of the resource; judging whether the deduplication feature of the resource to be downloaded offline is identical to the deduplication feature of an already offline-downloaded resource stored in a global deduplication table, and whether the time interval between the resource to be downloaded offline and the already offline-downloaded resource is less than a set time value, the global deduplication table storing the deduplication features of resources that have been downloaded offline; if the deduplication features are identical and the time interval is less than the set time value, not downloading the resource to be downloaded offline; otherwise, creating an offline downloading task for the resource to be downloaded offline.
Wherein, the deduplication feature is generated through the following steps: extracting the first 100k of content of the resource, 100k of content at a random position in the middle, and the last 100k of content as the content fragments of the resource; splicing the URL of the resource, the resource size and the content fragments into a character string; and performing an MD5 calculation on the character string to obtain the deduplication feature.
According to another aspect of the present invention, a resource offline downloading device is provided, comprising: a network operator determination module, adapted to determine, according to the network attribute of a resource to be downloaded offline, the network operator to which the resource belongs; an offline download server selection module, adapted to select, according to a set task control strategy, an offline download server from the offline download server cluster of that network operator, the offline download server being used to download the resource offline; and a task execution module, adapted to distribute the offline downloading task of the resource to the selected offline download server for offline downloading.
Wherein, the network operator determination module comprises: a first acquisition module, adapted to obtain the domain name information corresponding to the uniform resource locator (URL) of the resource and to resolve the IP address corresponding to the domain name information; and a second acquisition module, adapted to query a database according to that IP address to obtain the network operator corresponding to the IP address and to determine it as the network operator to which the resource belongs, the database storing network operators and their IP addresses.
Wherein, the task control strategy is to distribute the offline downloading task of the resource to the offline download server with the smallest current load weight.
Wherein, the load weight is: k1 * CPU usage + k2 * disk surplus + k3 * memory surplus + k4 * bandwidth resources, where k1 is the weight for CPU usage, k2 is the weight for disk surplus, k3 is the weight for memory surplus, and k4 is the weight for bandwidth resources.
Wherein, the device further comprises a deduplication processing module, adapted to: obtain the deduplication feature of the resource to be downloaded offline, the deduplication feature being an identifier of the resource generated from the URL, size and content fragments of the resource; judge whether the deduplication feature of the resource to be downloaded offline is identical to the deduplication feature of an already offline-downloaded resource stored in a global deduplication table, and whether the time interval between the resource to be downloaded offline and the already offline-downloaded resource is less than a set time value, the global deduplication table storing the deduplication features of resources that have been downloaded offline; if the deduplication features are identical and the time interval is less than the set time value, not download the resource to be downloaded offline; otherwise, create an offline downloading task for the resource to be downloaded offline.
Wherein, the device further comprises a deduplication feature generation module, which comprises: an extraction unit, adapted to extract the first 100k of content of the resource, 100k of content at a random position in the middle, and the last 100k of content as the content fragments of the resource; a splicing unit, adapted to splice the URL of the resource, the resource size and the content fragments into a character string; and a computing unit, adapted to perform an MD5 calculation on the character string.
The solution of the present invention has the following beneficial effects:
The present invention analyzes the network attribute of the resource to be downloaded offline and determines the network operator to which the resource belongs, and then, according to the task control strategy, assigns the task to an offline download server in the cluster with the matching network attribute. This avoids the slow speed of cross-operator downloads, significantly improves offline download speed, reduces the pressure on the offline download servers and improves server processing capacity. In addition, the task control strategy uses load control to send each task to the machine with the lowest load, thereby maximizing resource utilization.
The present invention also applies a deduplication strategy based on resource features to deduplicate resources at the same network address (URL). After a URL has been downloaded and saved to the server, other requests for the same URL succeed directly without the resource being downloaded and saved again. This avoids repeated downloads of the same resource, reduces server pressure and improves the server's effective response capability.
Brief description of the drawings
In order to explain the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings used in the description of the embodiments are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort:
Fig. 1 is a diagram of the working principle of the offline download servers in the prior art;
Fig. 2 is a flow chart of a resource offline downloading method according to one embodiment of the present invention;
Fig. 3 is a flow chart of a resource offline downloading method according to one embodiment of the present invention;
Fig. 4 is a structural diagram of a resource offline downloading device according to one embodiment of the present invention;
Fig. 5 is a structural diagram of a resource offline downloading device according to one embodiment of the present invention;
Fig. 6 is a structural diagram of a resource offline downloading device according to one embodiment of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present invention are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present invention, it should be understood that the present invention can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present invention will be understood more thoroughly and so that its scope can be fully conveyed to those skilled in the art.
Fig. 2 is a flow chart of a resource offline downloading method according to one embodiment of the present invention, which comprises: Step 21: determining, according to the network attribute of a resource to be downloaded offline, the network operator to which the resource belongs; Step 22: selecting, according to a set task control strategy, an offline download server from the offline download server cluster of that network operator, the offline download server being used to download the resource offline; Step 23: distributing the offline downloading task of the resource to the selected offline download server for offline downloading.
Thus, when a resource is downloaded offline according to the technical solution of this embodiment, the network operator to which the resource belongs is first determined from the network attribute of the resource. Once the network operator has been determined, an offline download server is selected from that operator's offline download server cluster according to the set task control strategy. The offline downloading task for the resource can then be distributed to the selected offline download server, which downloads the resource offline. Because the network operator determined in this solution is exactly the one to which the resource belongs, cross-operator downloads are avoided. Downloading a resource from within the network of the operator to which it belongs can significantly improve offline download speed and reduce the pressure on the offline download servers.
According to one embodiment of the present invention, determining the network operator to which the resource belongs according to the network attribute of the resource to be downloaded offline may further comprise:
obtaining the domain name information corresponding to the uniform resource locator (URL) of the resource, and resolving the IP address corresponding to the domain name information;
querying a database according to the IP address corresponding to the domain name information to obtain the network operator corresponding to the IP address and determining it as the network operator to which the resource belongs, the database storing network operators and their IP addresses.
In this embodiment, the domain name information corresponding to the URL is obtained from the URL of the resource, and the IP address corresponding to the domain name information is resolved from the domain name information. Because the database stores network operators and their IP addresses, the network operator corresponding to the IP address can be found in the database, which gives the network operator of the resource. Of course, the network operator can also be determined in any other manner that is known in the art now or becomes known in the future.
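The lookup chain described above (URL, then domain name, then IP address, then operator) can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the names `OPERATOR_DB`, `extract_domain`, `lookup_operator` and `operator_for_url` are hypothetical, the database is assumed to be a mapping from CIDR ranges to operator names, and the ranges used are reserved documentation networks, not real operator allocations.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# Hypothetical operator database: CIDR ranges mapped to operator names.
# The ranges are reserved documentation networks, not real allocations.
OPERATOR_DB = {
    "192.0.2.0/24": "operator_a",
    "198.51.100.0/24": "operator_b",
}


def extract_domain(url):
    """Return the host part of a URL, tolerating a missing scheme."""
    if "://" not in url:
        url = "http://" + url
    host = urlsplit(url).hostname
    if not host:
        raise ValueError(f"no domain name in URL: {url!r}")
    return host


def lookup_operator(ip, operator_db=OPERATOR_DB):
    """Return the operator whose address range contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for cidr, operator in operator_db.items():
        if addr in ipaddress.ip_network(cidr):
            return operator
    return None


def operator_for_url(url, resolve=socket.gethostbyname, operator_db=OPERATOR_DB):
    """Follow the chain URL -> domain name -> IP address -> operator."""
    domain = extract_domain(url)
    ip = resolve(domain)  # DNS lookup; pass a stub to avoid network access
    return lookup_operator(ip, operator_db)
```

The resolver is passed in as a parameter so that the sketch can be exercised without network access, for example `operator_for_url("www.t.com/test.doc", resolve=lambda d: "198.51.100.7")`.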
The task control strategy determines how offline downloading tasks are distributed, that is, which offline download server each task is assigned to. According to one embodiment of the present invention, the task control strategy is to distribute the offline downloading task of the resource to the offline download server with the smallest current load weight.
The load weight can be calculated with the following formula:
load weight = k1 * CPU usage + k2 * disk surplus + k3 * memory surplus + k4 * bandwidth resources, where k1 is the weight for CPU usage, k2 is the weight for disk surplus, k3 is the weight for memory surplus, and k4 is the weight for bandwidth resources.
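The selection rule can be illustrated with a short sketch. The source gives neither the values nor the signs of k1 to k4, nor the units of the four measurements, so the `ServerStatus` fields, the function names and any coefficients passed in are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ServerStatus:
    """Hypothetical snapshot of one offline download server."""
    name: str
    cpu_usage: float
    disk_surplus: float
    memory_surplus: float
    bandwidth: float


def load_weight(s, k1, k2, k3, k4):
    """Load weight as given in the text: a weighted sum of four measurements."""
    return (k1 * s.cpu_usage + k2 * s.disk_surplus
            + k3 * s.memory_surplus + k4 * s.bandwidth)


def pick_server(cluster, k1, k2, k3, k4):
    """Return the server in the cluster with the smallest current load weight."""
    if not cluster:
        raise ValueError("cluster is empty")
    return min(cluster, key=lambda s: load_weight(s, k1, k2, k3, k4))
```

One possible choice, again an assumption, is a positive k1 and negative k2, k3 and k4, so that more spare disk, memory and bandwidth lower the load weight and make a server more likely to be picked.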
According to one embodiment of the present invention, before the network operator to which the resource belongs is determined according to the network attribute of the resource to be downloaded offline, this embodiment further comprises:
obtaining the deduplication feature of the resource to be downloaded offline, the deduplication feature being an identifier of the resource generated from the URL, size and content fragments of the resource;
judging whether the deduplication feature of the resource to be downloaded offline is identical to the deduplication feature of an already offline-downloaded resource stored in a global deduplication table, and whether the time interval between the resource to be downloaded offline and the already offline-downloaded resource is less than a set time value, the global deduplication table storing the deduplication features of resources that have been downloaded offline;
if the deduplication features are identical and the time interval is less than the set time value, not downloading the resource to be downloaded offline; otherwise, creating an offline downloading task for the resource to be downloaded offline.
In brief, this embodiment deduplicates the resources to be downloaded offline in order to avoid repeated downloads.
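The decision rule above can be sketched as a single function. Storing the completion time of the earlier download as the table value is an assumption made here so that the time interval can be computed; the function and parameter names are hypothetical.

```python
import time


def should_download(feature, dedup_table, max_age_seconds, now=None):
    """Decide whether a new offline downloading task must be created.

    dedup_table is assumed to map a deduplication feature to the time
    (seconds since the epoch) at which that resource was downloaded offline.
    """
    now = time.time() if now is None else now
    downloaded_at = dedup_table.get(feature)
    # Skip the download only if the feature matches AND the earlier
    # download is more recent than the set time value.
    if downloaded_at is not None and now - downloaded_at < max_age_seconds:
        return False
    return True
```

A matching feature whose earlier download is older than the set time value still leads to a new task, which is one way to read the "otherwise" branch of the text.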
Deduplication involves two aspects: global deduplication and local deduplication.
Global deduplication: visible to all users; it avoids downloading a resource that another user has already downloaded (that is, it determines whether the requested resource has already been downloaded, so that repeated downloads do not increase server pressure). Specifically, once a resource has been downloaded by anyone, its deduplication feature is recorded in a globally visible table, the global deduplication table. When anyone else later downloads a resource, the deduplication feature of that resource is used to query the global deduplication table; if the resource is found to exist, it need not be downloaded again.
The deduplication feature can be regarded as the identifier of the resource. Resources with the same deduplication feature can be considered to have the same content. This makes it possible to check whether an identical resource already exists locally without obtaining the full content of the resource. For example, suppose a large file A has been saved locally and a file resource is then to be downloaded from the network without knowing whether it is file A. If the file is found to be identical to the local file A only after it has been downloaded in full, considerable resources have been consumed for nothing. By comparing deduplication features, it can be determined whether the same file already exists locally without downloading the full content of the file resource, which prevents a repeated download.
Specifically, the global deduplication table is a key-value structure. The key is the deduplication feature, which can include the resource address (URL), the resource size and a resource feature (such as content fragments of the resource). The value is fixed at 1 and indicates that the key exists in the global deduplication table, that is, that the resource has been downloaded and exists. When a user submits an offline downloading task (which can take the form of an offline download request), the resource address, resource size and resource feature of the corresponding resource are spliced into a character string, this character string is used as the key, and the global deduplication table is searched for a resource matching the key. If one exists, the task is deduplicated: the resource is not actually downloaded, the user is told directly that the download succeeded, and the user's request is served with the identical resource that was downloaded offline earlier. If none exists, global deduplication fails (in other words, it is not needed); the resource is then actually downloaded, and when the download completes the global deduplication table is updated by adding the deduplication feature of the resource, so that a later offline download of a resource matching this feature requires no actual download. The deduplication feature is added to the global deduplication table only after the resource has actually been downloaded. This ensures that one user's downloading task for a resource is not affected by the failure of another user's downloading task for the same resource. For example, if the deduplication feature were put into the global deduplication table as soon as a first user began downloading a resource, a second user's request for the same resource would be deduplicated because the table already lists the resource, and no actual download would be performed for the second user. If the first user's download then failed, the second user's download would inevitably fail as well.
For example, when a user requests an offline downloading task for resource A, the URL, size and content fragments of resource A are obtained from the task, and a deduplication feature key (a character string) is generated from them. The global deduplication table is then queried for a key' identical to this key. If key' is found in the global deduplication table, resource A has already been downloaded and need not be downloaded again; if key' is not found, resource A has not been downloaded and must be downloaded for the user.
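The global deduplication table and the rule that a key is added only after a successful download can be sketched as follows. The class, the `handle_task` function and its return strings are hypothetical, and an in-memory dictionary stands in for whatever shared store an actual deployment would use.

```python
class GlobalDedupTable:
    """Key-value table: key is the deduplication feature, value is fixed at 1."""

    def __init__(self):
        self._table = {}

    def contains(self, key):
        return key in self._table

    def mark_downloaded(self, key):
        # Called only after the resource has actually been downloaded.
        self._table[key] = 1


def handle_task(key, table, download):
    """Process one offline downloading task.

    download is a callable that performs the actual download and returns
    True on success. Returns "deduplicated", "downloaded" or "failed".
    """
    if table.contains(key):
        return "deduplicated"  # report success without downloading again
    if download():
        table.mark_downloaded(key)  # record the feature only on success
        return "downloaded"
    return "failed"  # the table is untouched, so other users are unaffected
```

Because a failed download leaves the table unchanged, a later request for the same resource still triggers an actual download instead of being wrongly reported as successful.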
Local deduplication: visible only to the individual user; it prevents repeated downloads caused by one user submitting the same task several times. Local deduplication is needed in addition to global deduplication because the corresponding information (the deduplication feature) is stored in the global deduplication table only after the downloading task for the resource has completed successfully; that is, the deduplication feature of a resource is added to the global deduplication table only once the resource has been downloaded in full. Therefore, before a file has finished downloading, if the same user submits the same URL several times in order to download the same resource (that is, submits several identical downloading tasks for the same resource), then without local deduplication the resource would be downloaded several times, which increases server pressure and reduces download efficiency. Local deduplication does not use the deduplication feature of the resource; it uses the URL directly: if the same user submits the same URL again, the task is deduplicated.
In one approach, local deduplication is performed before global deduplication. Specifically, each user can be assigned a unique user ID in advance, and a user's downloading task can be identified by the user ID together with the resource address (URL); the user ID and resource address are then used to check whether the user already has an identical downloading task. That is, the server that receives the user's offline downloading task (for example, a task server) queries the user task list, using the user ID and the URL of the resource as the key, to check whether the user has already submitted the same downloading task. If the task is found (that is, the user already has an identical task), the user already has a downloading task for this resource, and a "task already exists" message is returned to the user; otherwise (the task does not exist), global deduplication is performed next.
For example, suppose a first user and a second user offline-download the resource www.t.com/test.doc at the same time, so that each of them has a downloading task for this resource. Before the downloading tasks complete (that is, before the resource has been downloaded to the offline server), the global deduplication table does not contain the deduplication feature of this resource, so a failure of the second user's downloading task does not affect the first user's download of the resource. Without local deduplication, if the first user submitted the downloading task for this resource several times, the user task list would contain several identical downloading tasks for that user, all requesting the same resource. With local deduplication, the same user cannot download the same resource several times, which prevents an increase in server pressure.
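Local deduplication as described above keys each task on the user ID and the URL. A minimal sketch follows, assuming an in-memory set as the user task list; the class and method names are hypothetical.

```python
class UserTaskList:
    """Per-user task list keyed on (user ID, resource URL)."""

    def __init__(self):
        self._tasks = set()

    def submit(self, user_id, url):
        """Register a task; return False if this user already submitted it."""
        key = (user_id, url)
        if key in self._tasks:
            return False  # task already exists; report this to the user
        self._tasks.add(key)
        return True  # new task; continue with global deduplication
```

Two different users submitting the same URL both get `True`, since local deduplication only looks at tasks of the same user; a repeated submission by the same user gets `False`.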
In another approach, local deduplication may be performed after global deduplication.
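The local deduplication lookup described above may be sketched as follows. This is only an illustration under stated assumptions: the user task list is modelled as an in-memory set, and the names `UserTaskList` and `submit_task` are hypothetical and do not come from the embodiment.

```python
from typing import Set, Tuple

# Hypothetical in-memory stand-in for the user task list kept by the task server.
UserTaskList = Set[Tuple[str, str]]


def submit_task(tasks: UserTaskList, user_id: str, url: str) -> bool:
    """Return True if the task was added, False if this user already has it."""
    key = (user_id, url)  # user ID + resource URL identify one user's task
    if key in tasks:
        return False  # local deduplication hit: the caller reports "task exists"
    tasks.add(key)
    return True
```

A second submission of the same URL by the same user is rejected, while a different user submitting the same URL is left to global deduplication.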
As for generating the deduplication feature, according to one embodiment of the present invention, this may comprise: extracting the first 100k of content of the resource, 100k of content at a random position in the middle, and the last 100k of content as content fragments of the resource; splicing the URL of the resource, the resource size, and the content fragments into a character string; and performing an MD5 calculation on the character string to obtain the deduplication feature. For example, suppose the URL is www.t.com/test.doc, the resource size corresponding to this URL is 5000, and the three resource fragments are the head aaa... (100k bytes of data), the middle bbb... (100k bytes of data), and the tail ccc... (100k bytes of data). The deduplication feature is then the MD5 of the string "www.t.com/test.doc5000aaa...bbb...ccc...".
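The feature generation just described may be sketched as follows. The sketch assumes that "100k" means 100 KiB and that the resource content is available as a byte string; the function name `dedup_feature` and the `middle_offset` parameter are hypothetical. The text does not say how the random middle position is chosen or recorded, so the sketch takes it as an argument.

```python
import hashlib

CHUNK = 100 * 1024  # "100k" fragment size from the text, assumed to mean 100 KiB


def dedup_feature(url: str, content: bytes, middle_offset: int) -> str:
    """MD5 over URL + size + head, middle and tail fragments of the resource."""
    size = len(content)
    head = content[:CHUNK]
    middle = content[middle_offset:middle_offset + CHUNK]
    tail = content[-CHUNK:]
    # Splice URL, size and fragments into one byte string, then hash it.
    joined = url.encode("utf-8") + str(size).encode("ascii") + head + middle + tail
    return hashlib.md5(joined).hexdigest()
```

Two requests yield the same feature only if the URL, size, and all three fragments agree, which presupposes that the same middle offset is used for both.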
Fig. 3 is a flow chart of a resource offline downloading method according to one embodiment of the present invention.
In this embodiment, the method may further include deduplication processing of the download request, which comprises global deduplication and local deduplication. The resource requested for download may be a network resource, that is, any content that can be downloaded over a network, such as a game, software, music, or text.
In this embodiment, the resource to be downloaded is first parsed and verified according to the download request from the user, so that the network operator to which it belongs can be determined. Deduplication processing is applied before that final determination, which avoids repeated downloads of the same resource, reduces server pressure, and improves server responsiveness.
Step S001: receive the offline download request sent by the user.
Step S002: parse and verify the URL of the resource to be downloaded. The domain name information corresponding to the uniform resource locator (URL) of the resource is obtained, the IP address corresponding to that domain name information is resolved through the domain name system (DNS), and a verification request is sent (for example, to check whether the IP address exists and whether it is correct). If verification fails in step S002, a verification failure message is sent to notify the user, as in step S004.
If verification passes in step S002, the resource file name, resource size, and target domain name are returned, and the process enters the next step, step S003.
At step S003, the user may be verified, for example by verifying the user's identity. If user verification fails, a verification failure message is sent to notify the user, as in step S004. If user verification passes, global deduplication and local deduplication may be performed on the URL (in one embodiment, global deduplication is performed first, followed by local deduplication), and the process enters step S005.
Step S005: judge whether global deduplication applies to the URL. If the judgment is "Yes", meaning that the network resource at the URL has already been downloaded by another user and already exists, the download request is cancelled, as in step S011. If global deduplication does not apply, that is, the judgment is "No", the process enters step S006.
Step S006: judge whether local deduplication applies. If it does (this user has submitted a repeated request), that is, "Yes", the process enters step S011, the download request is cancelled, and the user is notified that the task is already in progress. If neither local nor global deduplication applies to the URL, that is, the judgment of step S006 is also "No", the task server creates a task for this download request, as in step S007. The initial state of the created task is "in task queue".
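The branching of steps S002 to S007 may be summarised by the following sketch. It is only an illustration: the function `route_request` and its boolean parameters are hypothetical, and each check is assumed to have been evaluated already.

```python
def route_request(url_valid: bool, user_valid: bool,
                  global_duplicate: bool, local_duplicate: bool) -> str:
    """Return the label of the step at which handling of the request ends up."""
    if not url_valid or not user_valid:
        return "S004"  # verification failed: notify the user
    if global_duplicate:
        return "S011"  # S005 answered "Yes": cancel the request
    if local_duplicate:
        return "S011"  # S006 answered "Yes": cancel, task already in progress
    return "S007"      # neither applies: create the offline download task
```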
The task created in step S007 is sent to the corresponding offline download server cluster, which requires finding the network operator to which the resource belongs. Step S008 determines the offline download server cluster (that is, the network operator to which the resource belongs). Each network operator and its corresponding IP addresses may be pre-stored in a database (such as an IP library). In the preceding step S002, the URL of the resource in the download request was parsed to obtain the corresponding domain name information, and the IP address corresponding to that domain name was then obtained through DNS resolution. At step S008, the pre-stored information in the database (IP addresses and their corresponding operators) is queried with this IP address to obtain the corresponding network operator, which is determined to be the network operator to which the resource belongs (such as Netcom, Telecom, the education network, or Mobile). The offline download server cluster number corresponding to this network operator is then obtained, and an offline download server in that cluster is determined to execute the download task. Specifically, a correspondence table between offline download server clusters and network operators may be set in advance. For example, Telecom corresponds to offline download server clusters No. 3 and No. 5, and Netcom corresponds to clusters No. 2 and No. 4. When the network attribute of the resource is Telecom, an offline download server may be selected at random from the clusters corresponding to Telecom to execute the offline download task, or an offline download server may be selected according to a certain rule (for example, minimum load).
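The lookup in step S008 may be sketched as follows. The cluster numbers follow the example in the text; everything else is an assumption made for illustration: the IP library is modelled as a dictionary of network prefixes (using documentation-only address ranges), and the names `IP_LIBRARY`, `OPERATOR_CLUSTERS`, `operator_for_ip`, and `pick_cluster` are hypothetical. The sketch picks a cluster number at random; choosing a server inside the cluster is a separate step.

```python
import ipaddress
import random
from typing import Dict, List, Optional

# Hypothetical IP library: network prefix -> operator (documentation-only ranges).
IP_LIBRARY: Dict[str, str] = {
    "192.0.2.0/24": "telecom",
    "198.51.100.0/24": "netcom",
}

# Cluster numbers per operator, following the example in the text.
OPERATOR_CLUSTERS: Dict[str, List[int]] = {
    "telecom": [3, 5],
    "netcom": [2, 4],
}


def operator_for_ip(ip: str) -> Optional[str]:
    """Look up the operator whose stored prefix contains the resolved IP."""
    addr = ipaddress.ip_address(ip)
    for prefix, operator in IP_LIBRARY.items():
        if addr in ipaddress.ip_network(prefix):
            return operator
    return None


def pick_cluster(ip: str, rng: random.Random) -> Optional[int]:
    """Pick one of the operator's cluster numbers at random, or None if unknown."""
    operator = operator_for_ip(ip)
    if operator is None:
        return None
    return rng.choice(OPERATOR_CLUSTERS[operator])
```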
Before the network operator to which the resource belongs is determined according to the network attribute of the resource to be downloaded offline, the deduplication feature of each resource that has already been downloaded offline may also be obtained and stored in the global deduplication table.
In this way, in step S005, the deduplication feature of the resource that the user requests to download offline is obtained. When judging whether global deduplication applies, it is judged whether this deduplication feature is identical to a deduplication feature stored in the global deduplication table for a resource already downloaded offline. At the same time, it may also be judged whether the time interval between the resource to be downloaded offline and the resource already downloaded offline is less than a set time value (this time value is discussed below in connection with the validity period of the deduplication feature). If the judgment is "Yes", the request to download the resource is cancelled, as in step S011. Otherwise, an offline download task for the resource is created, as in step S007.
This global resource deduplication processing (referred to as the global deduplication strategy) avoids the pressure that repeated downloads place on the servers, and it is implemented mainly with the deduplication feature described above. The deduplication feature may be generated from the URL, size, and content fragments of the resource. For example, the first 100k of content of the resource, 100k of content at a random position in the middle, and the last 100k of content may be extracted as fragments of the resource content, spliced into one character string together with the resource URL and resource size, and an MD5 feature value then generated from this string. As an example, suppose the URL of the resource the user needs to download is www.t.com/test.doc. The resource size corresponding to this URL and the three fragments corresponding to the head, middle, and tail of the resource can be obtained: for instance, the resource size is 5000, the head fragment is aaa... (100k bytes of data), the middle fragment is bbb... (100k bytes of data), and the tail fragment is ccc... (100k bytes of data). The deduplication feature is then md5("www.t.com/test.doc5000aaa...bbb...ccc..."). Further, the resource size corresponding to the URL can be obtained through an HTTP HEAD request, and partial content of the resource can be obtained through HTTP Range requests. When the MD5 feature value already exists in the global deduplication table, global deduplication is applied and the resource is not downloaded again.
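Fetching the three fragments with HTTP Range requests requires byte positions for each fragment. The following sketch computes them from a resource size that is assumed to have been learned beforehand (for example from a HEAD response). It assumes "100k" means 100 KiB and a non-empty resource; the names `fragment_ranges` and `range_header` are hypothetical, and no network request is made.

```python
from typing import List, Tuple

CHUNK = 100 * 1024  # "100k" fragment size from the text, assumed to mean 100 KiB


def fragment_ranges(size: int, middle_start: int,
                    chunk: int = CHUNK) -> List[Tuple[int, int]]:
    """Inclusive (first, last) byte positions of the head, middle and tail fragments."""
    last = size - 1
    head = (0, min(chunk, size) - 1)
    mid_first = min(max(middle_start, 0), last)
    middle = (mid_first, min(mid_first + chunk - 1, last))
    tail = (max(size - chunk, 0), last)
    return [head, middle, tail]


def range_header(first: int, last: int) -> str:
    """Value of an HTTP Range header covering bytes first..last inclusive."""
    return f"bytes={first}-{last}"
```

For a resource smaller than one fragment, such as the size-5000 example in the text, the head and tail ranges both cover the whole resource.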
When resources are deduplicated, deduplication may be applied only to resources whose type is on a type whitelist. Further, the resource type may be the file type of the requested resource. The file type may be judged from the file extension; for example, for the picture type, the extension may be .jpg, .gif, and so on. The resource types on the type whitelist may be types of resources that are seldom modified, such as pictures, videos, and software programs.
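The whitelist check may be sketched as follows. The extensions .jpg and .gif come from the text; the remaining whitelist entries and the name `is_dedup_candidate` are hypothetical, and the file type is judged only from the extension of the URL path.

```python
import posixpath
from urllib.parse import urlsplit

# .jpg and .gif come from the text; the other entries are illustrative assumptions.
TYPE_WHITELIST = {".jpg", ".gif", ".mp4", ".exe"}


def is_dedup_candidate(url: str) -> bool:
    """True if the extension of the URL path is on the type whitelist."""
    # Scheme-less URLs such as "www.t.com/a.jpg" need "//" to parse the host part.
    path = urlsplit(url if "://" in url else "//" + url).path
    ext = posixpath.splitext(path)[1].lower()
    return ext in TYPE_WHITELIST
```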
In one embodiment, the deduplication feature may have a validity period. For example, its validity period may be set to one week (this is only an illustration and does not limit the present invention); after it expires, the deduplicated resource needs to be downloaded again. The validity of the deduplication feature can be realised through the global deduplication table. Specifically, once the deduplication feature of a resource that has been downloaded offline is obtained, it can be stored in the global deduplication table and the table updated; when the deduplication feature expires, it can be released from the global deduplication table. In addition, when the global deduplication judgment is made in step S005, besides comparing deduplication features, it is also judged whether the time interval between the resource to be downloaded offline and the resource already downloaded offline is less than the set time value, this time value being the validity period. In this way it can be determined more quickly and more effectively whether global deduplication processing is needed.
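A global deduplication table with a validity period may be sketched as follows. The one-week default is the example value from the text; the class `GlobalDedupTable`, its methods, and the use of numeric timestamps passed in by the caller are assumptions made for illustration.

```python
from typing import Dict

VALIDITY_SECONDS = 7 * 24 * 3600  # one week, the example period in the text


class GlobalDedupTable:
    """Maps a deduplication feature to the time at which it was stored."""

    def __init__(self, validity: float = VALIDITY_SECONDS) -> None:
        self._validity = validity
        self._stored_at: Dict[str, float] = {}

    def store(self, feature: str, now: float) -> None:
        self._stored_at[feature] = now

    def is_duplicate(self, feature: str, now: float) -> bool:
        stored = self._stored_at.get(feature)
        if stored is None:
            return False
        if now - stored >= self._validity:
            del self._stored_at[feature]  # expired: release the feature
            return False
        return True  # same feature and interval below the set time value
```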
Step S009: dynamic task control may be performed based on the load of the offline download servers in the offline download server cluster, and a designated offline download server is selected, that is, the offline download server to which this task is to be distributed is determined.
When an offline download server in the cluster has been selected in step S009, an offline download asynchronous message may be sent to the corresponding offline download server cluster, and the offline download task then enters the corresponding task queue to await execution.
Dynamic task control determines which offline download server in the cluster the task is assigned to. For example, after a task is submitted to offline download server cluster 44 or 45 shown in Fig. 4, the task scheduler (not shown) in that cluster may assign the task to the online machine with the smallest current load weight for processing. The load weight is calculated as a weighted sum:
Load weight = k1 * CPU usage + k2 * remaining disk space + k3 * remaining memory + k4 * bandwidth resources;
where k1 to k4 are the weights of the respective computer resources. Because the offline download service relies mainly on disk resources, the weights adopted may be k2 = 5 and k1 = k3 = k4 = 1.
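The weighted sum and the minimum-weight selection may be sketched as follows. The weights k2 = 5 and k1 = k3 = k4 = 1 come from the text; the metric names, their units, and the functions `load_weight` and `pick_server` are hypothetical. The formula is taken as written, with the four metrics treated simply as numbers supplied by the caller.

```python
from typing import Mapping

# k1 (cpu), k2 (disk), k3 (memory), k4 (bandwidth) with the example values from the text.
WEIGHTS = {"cpu": 1.0, "disk": 5.0, "memory": 1.0, "bandwidth": 1.0}


def load_weight(metrics: Mapping[str, float],
                k: Mapping[str, float] = WEIGHTS) -> float:
    """Weighted sum of the four resource metrics of one server."""
    return sum(k[name] * metrics[name] for name in k)


def pick_server(servers: Mapping[str, Mapping[str, float]]) -> str:
    """Name of the server with the smallest load weight."""
    return min(servers, key=lambda name: load_weight(servers[name]))
```

Because the disk term carries five times the weight of the others, it dominates the choice between servers whose remaining metrics are similar.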
After the offline download task has been assigned to the designated offline download server, the offline download is carried out by that server (for example, by its offline download worker process). Specifically, the offline download task first enters a waiting stage in the task queue. The offline download worker process of the offline download server fetches the tasks in the task queue one by one, and for each fetched task it may notify the corresponding user cluster (for example, the user who sent the offline download request) and change the task state to "downloading".
Step S010: after the offline download task has been assigned to the offline download server determined in step S009, the designated offline download server receives the distributed task and begins to execute the offline download task. Thereafter, if the download succeeds, the downloaded content (such as a picture) is saved in a database (for example, the non-relational database Cassandra). After the save succeeds, the download result parameter is set to "success", an offline download feedback asynchronous message is sent to the corresponding user cluster, the page meta information is modified, and the task state may be updated to "download complete". If the download fails or the save fails, the download result parameter is set to "failure", the failure cause is recorded, an offline download feedback asynchronous message is sent to the corresponding user cluster, the page meta information is modified, and the task state may be updated to "download failed".
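The task states named in the description ("in task queue", "downloading", "download complete", "download failed") and the order in which they follow one another may be sketched as a small state machine. The enum `TaskState`, the table `ALLOWED`, and the function `advance` are hypothetical; the embodiment does not prescribe how transitions are enforced.

```python
from enum import Enum


class TaskState(Enum):
    QUEUED = "in task queue"
    DOWNLOADING = "downloading"
    DONE = "download complete"
    FAILED = "download failed"


# Transitions implied by the description: queue -> downloading -> complete or failed.
ALLOWED = {
    TaskState.QUEUED: {TaskState.DOWNLOADING},
    TaskState.DOWNLOADING: {TaskState.DONE, TaskState.FAILED},
    TaskState.DONE: set(),
    TaskState.FAILED: set(),
}


def advance(current: TaskState, new: TaskState) -> TaskState:
    """Return the new state, rejecting transitions the description does not allow."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {new.name}")
    return new
```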
Fig. 4 is a structural diagram of a resource offline downloading device according to an embodiment of the present invention.
The device of Fig. 4 comprises task server 41, message server 42, global deduplication device 43, and offline download server clusters 44 and 45, as well as the task scheduler (not shown) and offline download servers (not shown) within offline download server clusters 44 and 45, the Storm distributed computing platform (not shown) in each offline download server, cloud storage (not shown), and so on.
Task server 41 determines the network operator to which the resource belongs according to the network attribute of the resource to be downloaded offline. It receives the user's request and queries the global deduplication table to judge whether global deduplication applies. If the resource does not need to be deduplicated, a download message may be sent to message server 42; the message content may include the resource URL address, the number of the target offline download server cluster determined in step S008, and so on. Task server 41 is mainly used to process the user's request and to parse the resource URL to obtain the task information and the offline download server cluster number, so that the offline download cluster can be determined.
Specifically, task server 41 may first store the various network operators in association with their respective IP addresses. Task server 41 first parses the domain name information corresponding to the resource URL and obtains the IP information corresponding to the domain name through DNS resolution. It may then use this IP information to query the IP information library for the corresponding network operator (Netcom, Telecom, education network, Mobile, etc.) and compute the number of the offline download cluster belonging to the same operator. Task server 41 may perform the operations of steps S001 to S003 in Fig. 3: receiving the download request sent by the user, URL parsing and verification, and user verification. If any verification does not pass, a verification failure is indicated, as in step S004. When all verifications pass, it is judged whether the URL is to be deduplicated, as in steps S005 and S006. When it is determined that the resource is to be deduplicated, step S011 is performed, that is, the download request is cancelled and the resource is not downloaded. Deduplication uses the global deduplication table of global deduplication device 43 in this system (the deduplication strategy being the global resource deduplication strategy described above). If deduplication is not needed, that is, steps S005 and S006 both yield "No", a task for downloading the resource offline is created, as in step S007, and this task is sent to message server 42 in the system.
Global deduplication device 43 executes the global resource deduplication strategy, as described for the method of Fig. 3 above. Before the network operator to which the resource belongs is determined according to the network attribute of the resource to be downloaded offline, the deduplication feature of each resource already downloaded offline may be obtained and stored in the global deduplication table. The deduplication feature of the resource to be downloaded offline may also be obtained. It is judged whether the deduplication feature of the resource to be downloaded offline is identical to a deduplication feature stored in the global deduplication table for a resource already downloaded offline, and whether the time interval between the resource to be downloaded offline and the resource already downloaded offline is less than the set time value. If so, the resource is not downloaded, as in step S011. Otherwise, an offline download task for the resource is created, as in step S007. The deduplication feature of a resource is generated from the URL, size, and content fragments of the resource. After a download completes, the global deduplication table may be updated with the deduplication feature of the downloaded resource. The global deduplication table may be located in global deduplication device 43, which carries out the above global resource deduplication strategy and the corresponding implementation steps.
Message server 42 may receive messages, offline download tasks, and the corresponding processing information from the task server, and send them to the corresponding offline download server cluster. That is, it distributes the various messages, information, and tasks it receives to the corresponding offline download server cluster (for example, by cluster number), realising a message forwarding operation: it receives a message from task server 41 and sends it on to the correct destination, such as the target offline download cluster.
Fig. 4 shows two offline download server clusters (for example, offline download server cluster 44 corresponding to Netcom and offline download server cluster 45 corresponding to Telecom). Those skilled in the art should appreciate that the number of offline download server clusters of the present invention is not limited to this. An offline download server cluster may comprise a plurality of offline download servers. Further, each offline download server may comprise a Storm platform and cloud storage; the Storm platform may be used to download the target resource, and the cloud storage may be used to store resource information. Offline download server clusters 44 and 45 select, according to the set task control strategy, an offline download server from the offline download server cluster of the determined network operator, and distribute the offline download task of the resource to that offline download server.
Message server 42 sends the created task to the corresponding offline download server cluster according to the offline download server cluster number, as in the operation of step S008. Each offline download server in each offline download server cluster may also comprise a Storm distributed computing platform (not shown). The Storm distributed computing platform is a distributed, fault-tolerant real-time computation system. It provides a set of general primitives for distributed real-time computation and can be used for "stream processing", that is, processing messages and updating a database in real time. Storm can also be used for "continuous computation", running a continuous query over data streams and outputting the results to the user as a stream while the computation proceeds. The cloud storage can store the network resources downloaded offline, and the user can access those resources by accessing the cloud storage space.
The task control strategy comprises one or more task scheduling strategies, and the offline download server cluster, according to the one or more task scheduling strategies, distributes the offline download task of the resource to the offline download server with the smallest current load. A task scheduler is arranged in the offline download server cluster of the network operator to distribute offline download tasks. Specifically, the offline download task of the resource is sent to the task scheduler, which calculates the load weight of each offline download server according to the task scheduling strategy and designates the offline download server with the smallest current load weight as the offline download server for the task.
An offline download task assigned to this offline download cluster is distributed to the task scheduler. According to the resource usage of each offline download server in the cluster, the task scheduler assigns the task to the online machine (offline download server) with the smallest current load weight for processing, so that the download task is executed as in step S010. The load weight is calculated as a weighted sum:
Load weight = k1 * CPU usage + k2 * remaining disk space + k3 * remaining memory + k4 * bandwidth resources;
where k1 to k4 are the weights of the respective computer resources. Because the offline download service relies mainly on disk resources, the weights adopted may be k2 = 5 and k1 = k3 = k4 = 1.
Fig. 5 is a structural diagram of a resource offline downloading device according to one embodiment of the present invention. The device may comprise: network operator determination module 51, configured to determine the network operator to which the resource belongs according to the network attribute of the resource to be downloaded offline; offline download server selection module 52, configured to select a designated offline download server from the offline download server cluster of the network operator according to the set task control strategy; and task execution module 53, configured to distribute the offline download task of the resource to the selected offline download server to carry out the offline download. These modules are functional modules corresponding to the processing steps of the methods shown in Figs. 2 and 3.
Further, network operator determination module 51 may store the various network operators in association with their respective IP addresses. It obtains the domain name information corresponding to the uniform resource locator (URL) of the resource and resolves the IP address corresponding to that domain name information through the domain name system (DNS) (first acquisition module). It then queries the associated stored information with the IP address corresponding to the domain name information, obtains the network operator corresponding to that IP address, and determines this network operator to be the network operator to which the resource belongs (second acquisition module). Network operator determination module 51 thus comprises the first and second acquisition modules (not shown).
Offline download server selection module 52 may, according to one or more task scheduling strategies, distribute the offline download task of the resource to the offline download server with the smallest current load, that is, select the offline download server with the smallest current load in the offline download server cluster. Further, a task scheduler for distributing offline download tasks (such as the scheduler in the Storm distributed computing platform mentioned for Fig. 4) may be arranged in the offline download server cluster of the network operator. The offline download task of the resource is distributed to the task scheduler; the task scheduler calculates the load weight of each offline download server according to the resource usage of each offline download server in the offline download server cluster of the network operator, and determines the offline download server with the smallest load weight to be the designated offline download server. The formula by which offline download server selection module 52, through the task scheduler, calculates the load weight of each offline download server is as follows:
Load weight = k1 * CPU usage + k2 * remaining disk space + k3 * remaining memory + k4 * bandwidth resources;
where k1 is the weight corresponding to CPU usage, k2 is the weight corresponding to remaining disk space, k3 is the weight corresponding to remaining memory, and k4 is the weight corresponding to bandwidth resources.
Fig. 6 is a structural diagram of a resource offline downloading device according to one embodiment of the present invention. The device may comprise deduplication processing module 61, network operator determination module 62, offline download server selection module 63, and task execution module 64.
These modules are likewise functional modules corresponding to the steps of the methods shown in Figs. 2 and 3.
As shown in Fig. 6, this device can implement the functions of the deduplication processing strategy of the method described above. It comprises deduplication processing module 61, which may be used to obtain the deduplication feature of the resource to be downloaded offline and to judge whether that resource needs to be deduplicated. This processing is carried out before the network operator is determined, that is, before the processing of network operator determination module 62.
Network operator determination module 62 may be used to determine the network operator to which the resource belongs according to the network attribute of the resource to be downloaded offline.
Offline download server selection module 63 is used to select a designated offline download server from the offline download server cluster of the network operator according to the set task control strategy.
Task execution module 64 is used to distribute the offline download task of the resource to the selected offline download server to carry out the offline download.
Further, deduplication processing module 61 stores the deduplication features of resources already downloaded offline in the global deduplication table; the deduplication feature of a resource may be generated from the URL, size, and content fragments of the resource. Deduplication processing module 61 obtains the deduplication feature of the resource to be downloaded offline, and judges whether it is identical to a deduplication feature stored in the global deduplication table for a resource already downloaded offline and whether the time interval between the resource to be downloaded offline and the resource already downloaded offline is less than the set time value. If so, the resource is not downloaded; otherwise, an offline download task for the resource is created.
Further, the deduplication feature in the global deduplication table may be obtained by extracting the first 100k of content of the resource, 100k of content at a random position in the middle, and the last 100k of content as content fragments of the resource, splicing the URL of the resource, the resource size, and the content fragments into one character string, and applying Message Digest Algorithm 5 (MD5) to the character string.
This global resource deduplication strategy relies on the complete duplicate-checking rule described above, which ensures that the resources corresponding to deduplicated URLs are identical, or that the probability of their differing is within a tolerable range.
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other device. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such systems is apparent. Moreover, the present invention is not directed to any particular programming language. It should be understood that various programming languages may be used to implement the content of the invention described herein, and that the above description of a specific language is given in order to disclose the best mode of the invention.
In the specification provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practised without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, in the above description of exemplary embodiments of the invention the features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof. However, this method of disclosure is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art are appreciated that and can adaptively change and they are arranged in one or more equipment different from this embodiment the module in the equipment in embodiment.Can be combined into a module or unit or assembly to the module in embodiment or unit or assembly, and can put them into a plurality of submodules or subelement or sub-component in addition.At least some in such feature and/or process or unit are mutually repelling, and can adopt any combination to disclosed all features in this specification (comprising claim, summary and the accompanying drawing followed) and so all processes or the unit of disclosed any method or equipment make up.Unless clearly statement in addition, in this specification (comprising claim, summary and the accompanying drawing followed) disclosed each feature can be by providing identical, be equal to or the alternative features of similar purpose replaces.
In addition, those skilled in the art will understand that, although some embodiments described here include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
The various component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination of the two. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the device according to embodiments of the invention. The invention may also be implemented as device or apparatus programs (for example, computer programs and computer program products) for performing part or all of the method described here. Such programs implementing the invention may be stored on a computer-readable medium or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not denote any order; these words may be interpreted as names.
The invention discloses A1, a resource offline downloading method, comprising: determining the network operator to which a resource belongs according to the network attributes of the resource to be downloaded offline; selecting an offline download server from the offline download server cluster of the network operator according to a set task control strategy, wherein the offline download server is used to download the resource offline; and distributing the offline download task of the resource to the selected offline download server for offline downloading.
A2. The method of A1, wherein determining the network operator to which the resource belongs according to the network attributes of the resource to be downloaded offline further comprises: obtaining the domain name information corresponding to the uniform resource locator (URL) of the resource, and resolving the IP address corresponding to the domain name information; and querying a database according to the IP address corresponding to the domain name information to obtain the network operator corresponding to the IP address and determining it to be the network operator to which the resource belongs, the database storing network operators and their IP addresses.
A3. The method of A1 or A2, wherein the task control strategy is to distribute the offline download task of the resource to the offline download server with the smallest current load weight.
A4. The method of A3, wherein the load weight is: k1 * CPU usage + k2 * remaining disk space + k3 * remaining memory + k4 * bandwidth resources, where k1 is the weight for CPU usage, k2 is the weight for remaining disk space, k3 is the weight for remaining memory, and k4 is the weight for bandwidth resources.
A5. The method of any one of A1 to A4, further comprising, before determining the network operator to which the resource belongs according to the network attributes of the resource to be downloaded offline: obtaining the deduplication feature of the resource to be downloaded offline, the deduplication feature being an identifier of the resource generated from the resource's URL, size, and content fragments; judging whether the deduplication feature of the resource to be downloaded offline is identical to the deduplication feature of an already downloaded resource stored in a global deduplication table, and whether the time interval between the resource to be downloaded offline and the already downloaded resource is less than a set time value, the global deduplication table storing the deduplication features of resources already downloaded offline; if the deduplication features are identical and the time interval is less than the set time value, not downloading the resource; otherwise, creating the offline download task for the resource.
A6. The method of A5, wherein the deduplication feature is generated through the following steps: extracting the first 100k of content of the resource, 100k of content at a random position in the middle, and the last 100k of content as the content fragments of the resource; concatenating the URL of the resource, the resource size, and the content fragments into a character string; and performing an MD5 calculation on the character string to obtain the deduplication feature.
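The deduplication steps of A5 and A6 can be illustrated with a minimal Python sketch. Several points are assumptions not stated in the source: "100k" is read as 100 KiB, the resource content is held in memory as `data` (a real system would more likely fetch the three fragments separately), the "random" middle offset is derived from a generator seeded with the resource size so that the same resource always yields the same feature, and the names `content_fragments`, `dedup_feature`, and `GlobalDedupTable` are hypothetical.

```python
import hashlib
import random
import time
from typing import Dict, Optional

CHUNK = 100 * 1024  # "100k" of content, read here as 100 KiB (assumption)


def content_fragments(data: bytes) -> bytes:
    """Return the head, middle, and tail fragments of the resource content."""
    size = len(data)
    head = data[:CHUNK]
    tail = data[-CHUNK:]
    # Assumption: seed the "random" middle offset with the size so that the
    # same resource always produces the same fragments.
    span = max(size - CHUNK, 0)
    offset = random.Random(size).randrange(span + 1)
    middle = data[offset:offset + CHUNK]
    return head + middle + tail


def dedup_feature(url: str, data: bytes) -> str:
    """MD5 over the concatenation of URL, resource size, and content fragments."""
    payload = (
        url.encode("utf-8")
        + str(len(data)).encode("ascii")
        + content_fragments(data)
    )
    return hashlib.md5(payload).hexdigest()


class GlobalDedupTable:
    """In-memory stand-in for the global deduplication table (hypothetical)."""

    def __init__(self, min_interval_seconds: float) -> None:
        self.min_interval = min_interval_seconds
        self._seen: Dict[str, float] = {}  # feature -> time of last accepted task

    def should_download(self, feature: str, now: Optional[float] = None) -> bool:
        """Return False when the same feature was accepted inside the time window."""
        now = time.time() if now is None else now
        last = self._seen.get(feature)
        if last is not None and now - last < self.min_interval:
            return False  # identical feature and interval below the set value
        self._seen[feature] = now  # otherwise record it and create the task
        return True
```

A caller would compute `dedup_feature(url, data)` first and create the offline download task only when `should_download` returns `True`. MD5 serves here as an identifier, as in the source, not as a security measure.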
The invention also discloses B7, a resource offline downloading device, comprising: a network operator determination module, adapted to determine the network operator to which a resource belongs according to the network attributes of the resource to be downloaded offline; an offline download server selection module, adapted to select an offline download server from the offline download server cluster of the network operator according to a set task control strategy, wherein the offline download server is used to download the resource offline; and a task execution module, adapted to distribute the offline download task of the resource to the selected offline download server for offline downloading.
B8. The device of B7, wherein the network operator determination module further comprises: a first acquisition module, adapted to obtain the domain name information corresponding to the uniform resource locator (URL) of the resource and to resolve the IP address corresponding to the domain name information; and a second acquisition module, adapted to query a database according to the IP address corresponding to the domain name information to obtain the network operator corresponding to the IP address and to determine it to be the network operator to which the resource belongs, the database storing network operators and their IP addresses.
B9. The device of B7 or B8, wherein the task control strategy is to distribute the offline download task of the resource to the offline download server with the smallest current load weight.
B10. The device of B9, wherein the load weight is: k1 * CPU usage + k2 * remaining disk space + k3 * remaining memory + k4 * bandwidth resources, where k1 is the weight for CPU usage, k2 is the weight for remaining disk space, k3 is the weight for remaining memory, and k4 is the weight for bandwidth resources.
B11. The device of any one of B7 to B10, further comprising a deduplication processing module adapted to: obtain the deduplication feature of the resource to be downloaded offline, the deduplication feature being an identifier of the resource generated from the resource's URL, size, and content fragments; judge whether the deduplication feature of the resource to be downloaded offline is identical to the deduplication feature of an already downloaded resource stored in a global deduplication table, and whether the time interval between the resource to be downloaded offline and the already downloaded resource is less than a set time value, the global deduplication table storing the deduplication features of resources already downloaded offline; if the deduplication features are identical and the time interval is less than the set time value, not download the resource; otherwise, create the offline download task for the resource.
B12. The device of B11, further comprising a deduplication feature generation module, which comprises: an extraction unit, adapted to extract the first 100k of content of the resource, 100k of content at a random position in the middle, and the last 100k of content as the content fragments of the resource; a concatenation unit, adapted to concatenate the URL of the resource, the resource size, and the content fragments into a character string; and a computing unit, adapted to perform an MD5 calculation on the character string.
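As an illustration of the network operator determination module (B8), the following Python sketch extracts the host from a URL, resolves it to an IP address, and looks that address up in an operator database. The database layout (a dictionary keyed by CIDR block), the operator names, and the function names `resolve_ip` and `operator_for_url` are assumptions made for the example; the source only says that the database stores network operators and their IP addresses. The address ranges shown are reserved documentation networks, not real operator allocations.

```python
import ipaddress
import socket
from typing import Callable, Dict, Optional
from urllib.parse import urlsplit

# Hypothetical operator database: CIDR block -> operator name.
# The ranges are reserved documentation networks, not real allocations.
OPERATOR_DB: Dict[str, str] = {
    "192.0.2.0/24": "operator-a",
    "198.51.100.0/24": "operator-b",
}


def resolve_ip(hostname: str) -> str:
    """Resolve a hostname to an IPv4 address through the system resolver."""
    return socket.gethostbyname(hostname)


def operator_for_url(
    url: str,
    db: Dict[str, str] = OPERATOR_DB,
    resolver: Callable[[str], str] = resolve_ip,
) -> Optional[str]:
    """Return the operator whose address block contains the URL's host, if any."""
    host = urlsplit(url).hostname  # domain name information of the URL
    if host is None:
        raise ValueError(f"no host in URL: {url!r}")
    ip = ipaddress.ip_address(resolver(host))  # IP address for that domain name
    for cidr, operator in db.items():
        if ip in ipaddress.ip_network(cidr):
            return operator
    return None  # operator unknown; the source does not say what happens here
```

The resolver is passed in as a parameter so that the lookup can be exercised without network access, for example with a dictionary's `__getitem__` standing in for DNS.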

Claims (10)

1. A resource offline downloading method, comprising:
determining the network operator to which a resource belongs according to the network attributes of the resource to be downloaded offline;
selecting an offline download server from the offline download server cluster of the network operator according to a set task control strategy, wherein the offline download server is used to download the resource offline; and
distributing the offline download task of the resource to the selected offline download server for offline downloading.
2. The method of claim 1, wherein determining the network operator to which the resource belongs according to the network attributes of the resource to be downloaded offline further comprises:
obtaining the domain name information corresponding to the uniform resource locator (URL) of the resource, and resolving the IP address corresponding to the domain name information; and
querying a database according to the IP address corresponding to the domain name information to obtain the network operator corresponding to the IP address and determining it to be the network operator to which the resource belongs, the database storing network operators and their IP addresses.
3. The method of claim 1 or 2, wherein the task control strategy is to distribute the offline download task of the resource to the offline download server with the smallest current load weight.
4. The method of claim 3, wherein the load weight is: k1 * CPU usage + k2 * remaining disk space + k3 * remaining memory + k4 * bandwidth resources, where
k1 is the weight for CPU usage, k2 is the weight for remaining disk space, k3 is the weight for remaining memory, and k4 is the weight for bandwidth resources.
5. The method of any one of claims 1 to 4, further comprising, before determining the network operator to which the resource belongs according to the network attributes of the resource to be downloaded offline:
obtaining the deduplication feature of the resource to be downloaded offline, the deduplication feature being an identifier of the resource generated from the resource's URL, size, and content fragments;
judging whether the deduplication feature of the resource to be downloaded offline is identical to the deduplication feature of an already downloaded resource stored in a global deduplication table, and whether the time interval between the resource to be downloaded offline and the already downloaded resource is less than a set time value, the global deduplication table storing the deduplication features of resources already downloaded offline; and
if the deduplication features are identical and the time interval is less than the set time value, not downloading the resource to be downloaded offline; otherwise, creating the offline download task for the resource to be downloaded offline.
6. The method of claim 5, wherein the deduplication feature is generated through the following steps:
extracting the first 100k of content of the resource, 100k of content at a random position in the middle, and the last 100k of content as the content fragments of the resource;
concatenating the URL of the resource, the resource size, and the content fragments into a character string; and
performing an MD5 calculation on the character string to obtain the deduplication feature.
7. A resource offline downloading device, comprising:
a network operator determination module, adapted to determine the network operator to which a resource belongs according to the network attributes of the resource to be downloaded offline;
an offline download server selection module, adapted to select an offline download server from the offline download server cluster of the network operator according to a set task control strategy, wherein the offline download server is used to download the resource offline; and
a task execution module, adapted to distribute the offline download task of the resource to the selected offline download server for offline downloading.
8. The device of claim 7, wherein the network operator determination module further comprises:
a first acquisition module, adapted to obtain the domain name information corresponding to the uniform resource locator (URL) of the resource and to resolve the IP address corresponding to the domain name information; and
a second acquisition module, adapted to query a database according to the IP address corresponding to the domain name information to obtain the network operator corresponding to the IP address and to determine it to be the network operator to which the resource belongs, the database storing network operators and their IP addresses.
9. The device of claim 7 or 8, wherein the task control strategy is to distribute the offline download task of the resource to the offline download server with the smallest current load weight.
10. The device of claim 9, wherein the load weight is: k1 * CPU usage + k2 * remaining disk space + k3 * remaining memory + k4 * bandwidth resources, where
k1 is the weight for CPU usage, k2 is the weight for remaining disk space, k3 is the weight for remaining memory, and k4 is the weight for bandwidth resources.
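The load weight of claims 4 and 10, together with the "smallest current load weight" rule of claims 3 and 9, can be sketched in Python as follows. The claims give no values, signs, or units for k1 to k4, and do not say whether "bandwidth resources" means used or available bandwidth. The example therefore assumes normalized metrics in the range 0 to 1, treats bandwidth as available bandwidth, and uses negative weights for the three "remaining" terms so that a server with more free capacity scores lower. The names `ServerStats`, `load_weight`, and `pick_server` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class ServerStats:
    """Current metrics of one offline download server (normalized, assumption)."""
    name: str
    cpu_usage: float   # CPU usage
    disk_free: float   # remaining disk space
    mem_free: float    # remaining memory
    bandwidth: float   # bandwidth resources, read here as available bandwidth


def load_weight(s: ServerStats, k1: float, k2: float, k3: float, k4: float) -> float:
    """k1*CPU usage + k2*remaining disk + k3*remaining memory + k4*bandwidth."""
    return k1 * s.cpu_usage + k2 * s.disk_free + k3 * s.mem_free + k4 * s.bandwidth


def pick_server(
    cluster: Iterable[ServerStats],
    k: Tuple[float, float, float, float] = (1.0, -0.5, -0.5, -0.5),  # assumed weights
) -> ServerStats:
    """Return the server with the smallest current load weight."""
    return min(cluster, key=lambda s: load_weight(s, *k))
```

With these assumed weights, a server at 20% CPU usage and ample free disk, memory, and bandwidth is chosen over one at 90% CPU usage with little headroom. In a deployment the weights would need tuning against real measurements.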
CN201310259489.XA 2013-06-26 2013-06-26 Resource offline downloading method and device Active CN103391312B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310259489.XA CN103391312B (en) 2013-06-26 2013-06-26 Resource offline downloading method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310259489.XA CN103391312B (en) 2013-06-26 2013-06-26 Resource offline downloading method and device

Publications (2)

Publication Number Publication Date
CN103391312A true CN103391312A (en) 2013-11-13
CN103391312B CN103391312B (en) 2017-06-09

Family

ID=49535467

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310259489.XA Active CN103391312B (en) 2013-06-26 2013-06-26 Resource offline downloading method and device

Country Status (1)

Country Link
CN (1) CN103391312B (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104735041A (en) * 2013-12-24 2015-06-24 华为终端有限公司 Method and device for having access to offline resources
CN105391805A (en) * 2015-12-21 2016-03-09 天津海量信息技术有限公司 Data downloading system and downloading method based on multi-client cluster collaboration
CN106227734A (en) * 2016-07-08 2016-12-14 大唐融合通信股份有限公司 A kind of data processing method based on problem search system and system
CN106254561A (en) * 2016-10-12 2016-12-21 上海安馨信息科技有限公司 The real-time offline download method of a kind of Internet resources file and system
CN106487823A (en) * 2015-08-24 2017-03-08 上海斐讯数据通信技术有限公司 A kind of document transmission method based on SDN framework and system
WO2017050141A1 (en) * 2015-09-24 2017-03-30 网宿科技股份有限公司 Distributed storage-based file delivery system and method
CN106909627A (en) * 2017-01-22 2017-06-30 北京奇艺世纪科技有限公司 A kind of content loading method, device and mobile device
CN107169024A (en) * 2017-04-11 2017-09-15 微梦创科网络科技(中国)有限公司 The operation system and service implementation method of a kind of compatible type
CN109840844A (en) * 2017-11-27 2019-06-04 上海仪电(集团)有限公司中央研究院 A kind of financial big data acquisition processing device and system based on FPGA
CN109857539A (en) * 2017-11-30 2019-06-07 阿里巴巴集团控股有限公司 Resource regulating method and terminal
CN111314446A (en) * 2020-01-21 2020-06-19 北京达佳互联信息技术有限公司 Resource updating method, device, server and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101771931A (en) * 2008-12-26 2010-07-07 中国移动通信集团公司 P2P (peer 2 peer) resource downloading method and identification device
CN102316135A (en) * 2010-07-02 2012-01-11 深圳市快播科技有限公司 Network on-demand method and system
CN102387220A (en) * 2011-12-22 2012-03-21 乐视网信息技术(北京)股份有限公司 Offline downloading method and system based on cloud storage
CN102394898A (en) * 2011-04-07 2012-03-28 传聚互动(北京)科技有限公司 File downloading method and system based on P2P (point to point)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104735041A (en) * 2013-12-24 2015-06-24 华为终端有限公司 Method and device for having access to offline resources
CN104735041B (en) * 2013-12-24 2018-12-14 华为终端(东莞)有限公司 Offline resources acquisition methods and device
CN106487823A (en) * 2015-08-24 2017-03-08 上海斐讯数据通信技术有限公司 A kind of document transmission method based on SDN framework and system
WO2017050141A1 (en) * 2015-09-24 2017-03-30 网宿科技股份有限公司 Distributed storage-based file delivery system and method
US10776159B2 (en) 2015-09-24 2020-09-15 Wangsu Science & Technology Co., Ltd. Distributed storage-based filed delivery system and method using calculated dependencies between tasks to ensure consistancy of files
CN105391805A (en) * 2015-12-21 2016-03-09 天津海量信息技术有限公司 Data downloading system and downloading method based on multi-client cluster collaboration
CN106227734B (en) * 2016-07-08 2019-06-25 大唐融合通信股份有限公司 A kind of data processing method and system based on problem search system
CN106227734A (en) * 2016-07-08 2016-12-14 大唐融合通信股份有限公司 A kind of data processing method based on problem search system and system
CN106254561A (en) * 2016-10-12 2016-12-21 上海安馨信息科技有限公司 The real-time offline download method of a kind of Internet resources file and system
CN106254561B (en) * 2016-10-12 2019-12-17 上海安馨信息科技有限公司 real-time off-line downloading method and system for network resource file
CN106909627A (en) * 2017-01-22 2017-06-30 北京奇艺世纪科技有限公司 A kind of content loading method, device and mobile device
CN107169024A (en) * 2017-04-11 2017-09-15 微梦创科网络科技(中国)有限公司 The operation system and service implementation method of a kind of compatible type
CN109840844A (en) * 2017-11-27 2019-06-04 上海仪电(集团)有限公司中央研究院 A kind of financial big data acquisition processing device and system based on FPGA
CN109840844B (en) * 2017-11-27 2023-12-22 上海仪电(集团)有限公司中央研究院 Financial big data acquisition processing device and system based on FPGA
CN109857539A (en) * 2017-11-30 2019-06-07 阿里巴巴集团控股有限公司 Resource regulating method and terminal
CN109857539B (en) * 2017-11-30 2022-11-15 阿里巴巴集团控股有限公司 Resource scheduling method and terminal
CN111314446A (en) * 2020-01-21 2020-06-19 北京达佳互联信息技术有限公司 Resource updating method, device, server and storage medium

Also Published As

Publication number Publication date
CN103391312B (en) 2017-06-09

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220727

Address after: 300450 No. 9-3-401, No. 39, Gaoxin 6th Road, Binhai Science Park, Binhai New Area, Tianjin

Patentee after: 3600 Technology Group Co.,Ltd.

Address before: 100088 room 112, block D, 28 new street, new street, Xicheng District, Beijing (Desheng Park)

Patentee before: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Patentee before: Qizhi software (Beijing) Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20230714

Address after: 1765, floor 17, floor 15, building 3, No. 10 Jiuxianqiao Road, Chaoyang District, Beijing 100015

Patentee after: Beijing Hongxiang Technical Service Co.,Ltd.

Address before: 300450 No. 9-3-401, No. 39, Gaoxin 6th Road, Binhai Science Park, Binhai New Area, Tianjin

Patentee before: 3600 Technology Group Co.,Ltd.

CP03 Change of name, title or address

Address after: 1765, floor 17, floor 15, building 3, No. 10 Jiuxianqiao Road, Chaoyang District, Beijing 100015

Patentee after: Beijing 360 Zhiling Technology Co.,Ltd.

Country or region after: China

Address before: 1765, floor 17, floor 15, building 3, No. 10 Jiuxianqiao Road, Chaoyang District, Beijing 100015

Patentee before: Beijing Hongxiang Technical Service Co.,Ltd.

Country or region before: China