CN102929721A - Balanced scheduling system and method based on station quota - Google Patents

Balanced scheduling system and method based on station quota

Publication number
CN102929721A
CN102929721A CN2012103769223A CN201210376922A CN102929721A CN 102929721 A CN102929721 A CN 102929721A CN 2012103769223 A CN2012103769223 A CN 2012103769223A CN 201210376922 A CN201210376922 A CN 201210376922A CN 102929721 A CN102929721 A CN 102929721A
Authority
CN
China
Prior art keywords
time
website
domain name
page
last scheduled
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012103769223A
Other languages
Chinese (zh)
Other versions
CN102929721B (en)
Inventor
卢宏林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd and Qizhi Software Beijing Co Ltd
Priority to CN201210376922.3A
Publication of CN102929721A
Application granted
Publication of CN102929721B
Legal status: Expired - Fee Related

Landscapes

  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a balanced scheduling system based on site quotas and relates to the field of Internet technology. The system comprises a scheduling task acquisition module, which obtains scheduling tasks from the domain name queue of a site, and a scheduling module, which downloads a corresponding number of pages from the server specified by the scheduling task according to a preconfigured number of pages schedulable at one time. The invention also discloses a balanced scheduling method based on site quotas. With this system and method, every site gets some download opportunities under any conditions. In addition, different quota limits can be set according to the actual situation, so that efficiency and timeliness are balanced and the requirements of different search products are met. This lays a foundation for handling whole-web search and vertical search in a unified way.

Description

Balanced scheduling system and method based on site quota
Technical field
The present invention relates to the field of Internet technology, and in particular to a balanced scheduling system and method based on site quotas.
Background
For a search engine, crawling pages from the Internet is the first processing step. However, the number of pages accumulated on the Internet is huge, and the number of pages updated or newly created every day is also very large. Obtaining these pages in time is the first problem a search engine faces. To crawl this mass of pages in time, scheduling must be reasonable and efficient, so the choice of scheduling algorithm is extremely important.
At present, when web search schedules page crawling, newly discovered pages are queued in order. For historical pages, the rescheduling frequency is determined by how often each page is updated.
In web search, because all pages use one unified strategy, the download delay is almost always more than a day. Many vertical searches cannot tolerate this.
In special cases, sites with large data volumes affect the timely processing of other sites. If there are not enough servers, the pages of some large sites will take up most of the processing capacity, so that other sites cannot be processed in time.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a balanced scheduling system and method based on site quotas that overcomes the above problems or at least partially solves them.
According to one aspect of the present invention, a balanced scheduling system based on site quotas is provided, comprising:
a scheduling task acquisition module, adapted to obtain a scheduling task from the domain name queue of a site;
a scheduling module, adapted to download, from the server specified by the scheduling task, a corresponding number of pages according to a preconfigured number of pages schedulable at one time.
Optionally, the scheduling task acquisition module is adapted to obtain scheduling tasks from the domain name queue of the site according to a predetermined task priority.
Optionally, the system further comprises a feedback adjustment module, adapted to update, after a page is scheduled, the last scheduled time of the server to the last scheduled time plus the page timeout.
Optionally, the system further comprises a feedback adjustment module, adapted to update, after a page is downloaded, the last scheduled time of the server to the last scheduled time minus a rollback time, the rollback time being the difference between the page timeout and the page download time.
Optionally, when the actual download time of the page is less than a preset download duration, the page download time is the preset download duration; otherwise it is the actual download time of the page.
Optionally, the system further comprises a quota allocation module, which specifically comprises:
a site structure locating module, adapted to obtain the total number of sites on the current server and to locate the site structure directly by site index;
a domain name quota allocation module, adapted to obtain from the site structure the total number of domain names of the site and the indices of its first and last domain names, sort the site's domain names by last scheduled time, and select a predetermined number of domain names from the sorted domain names;
a domain name IP locating module, adapted to locate the IP addresses of each selected domain name from the domain name's total IP count and the indices of its first and last IPs, locate the address of each IP structure by the IP offset within the domain name's IPs, read and record the last scheduled time of each IP in turn, and select the server corresponding to the IP with the earliest last scheduled time;
a scheduling time setting module, adapted to compare, after the IP structure is located, the last scheduled time in the IP structure with the current time; if the last scheduled time is greater than or equal to the current time, the server is not allocated the number of pages schedulable at one time; if the last scheduled time is less than the current time, the server is allocated the number of pages schedulable at one time, and the last scheduled time of the IP is set to the current time;
a cyclic allocation module, adapted to: if all IP structures in the domain name have been allocated, move on to the next domain name, and otherwise select from the remaining IPs the IP with the earliest last scheduled time and continue; after all domain names of the current site have been allocated, increment the site index by 1 so as to process the next site; if the site index reaches the maximum, reset it to 0; and if no site could be allocated after all sites have been checked once, sleep for a predetermined time and then cycle through all sites again.
Optionally, the predetermined time is 1 second.
According to another aspect of the present invention, a balanced scheduling method based on site quotas is provided, comprising the following steps:
obtaining a scheduling task from the domain name queue of a site;
downloading, from the server specified by the scheduling task, a corresponding number of pages according to a preconfigured number of pages schedulable at one time.
Optionally, scheduling tasks are obtained from the domain name queue of the site according to a predetermined task priority.
Optionally, after a page is scheduled, the method further comprises: updating the last scheduled time of the server to the last scheduled time plus the page timeout.
Optionally, after a page is downloaded, the method further comprises: updating the last scheduled time of the server to the last scheduled time minus a rollback time, the rollback time being the difference between the page timeout and the page download time.
Optionally, when the actual download time of the page is less than a preset download duration, the page download time is the preset download duration; otherwise it is the actual download time of the page.
Optionally, downloading the corresponding number of pages from the server in the scheduling task according to the preconfigured number of pages schedulable at one time specifically comprises:
obtaining the total number of sites on the current server and locating the site structure directly by site index;
obtaining from the site structure the total number of domain names of the site and the indices of its first and last domain names;
sorting the site's domain names by last scheduled time and selecting a predetermined number of domain names from the sorted domain names;
locating the IP addresses of each selected domain name from the domain name's total IP count and the indices of its first and last IPs, locating the address of each IP structure by the IP offset within the domain name's IPs, reading and recording the last scheduled time of each IP in turn, and selecting the server corresponding to the IP with the earliest last scheduled time;
after the IP structure is located, comparing the last scheduled time in the IP structure with the current time; if the last scheduled time is greater than or equal to the current time, not allocating the number of pages schedulable at one time to the server; if the last scheduled time is less than the current time, allocating the number of pages schedulable at one time to the server and setting the last scheduled time of the IP to the current time;
if all IP structures in the domain name have been allocated, moving on to the next domain name, and otherwise selecting from the remaining IPs the IP with the earliest last scheduled time and continuing;
after all domain names of the current site have been allocated, incrementing the site index by 1 so as to process the next site; if the site index reaches the maximum, resetting it to 0; and if no site could be allocated after all sites have been checked once, sleeping for a predetermined time and then cycling through all sites again.
Optionally, the predetermined time is 1 second.
With the balanced scheduling system and method based on site quotas according to the present invention, every site is guaranteed some download opportunities under any circumstances. At the same time, different quota limits can be set according to actual conditions, balancing efficiency and timeliness and meeting the needs of different search products. This also lays a foundation for handling whole-web search and vertical search in a unified way.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features and advantages of the invention are more apparent, specific embodiments of the invention are set out below.
Description of drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered limiting of the invention. Throughout the drawings, the same reference symbols denote the same parts. In the drawings:
Fig. 1 shows a flowchart of a balanced scheduling method based on site quotas according to an embodiment of the invention;
Fig. 2 shows a detailed flowchart of step S120 in Fig. 1;
Fig. 3 shows a schematic structural diagram of a balanced scheduling system based on site quotas according to an embodiment of the invention;
Fig. 4 shows a schematic diagram of the detailed structure of the scheduling module in Fig. 3.
Embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and its scope fully conveyed to those skilled in the art.
The flow of the balanced scheduling method based on site quotas in this embodiment is shown in Fig. 1 and comprises:
Step S110: obtain a scheduling task from the domain name queue of a site. The scheduling task may be obtained from the domain name queue of a whole-web search system.
Step S120: download, from the server specified by the scheduling task, a corresponding number of pages according to the preconfigured number of pages schedulable at one time. Controlling the number of pages downloaded from a server guarantees that every site gets some download opportunities under any circumstances, and prevents the pages of a few large sites from taking up most of the download opportunities so that other sites cannot be downloaded and processed in time.
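The effect of a per-dispatch quota can be illustrated with a minimal sketch. The function name `dispatch_rounds`, the `pending` mapping, and the example host names are hypothetical and are not taken from the patent; the sketch only assumes that each site receives at most `quota` pages per turn before the next site is served.

```python
from collections import deque


def dispatch_rounds(pending, quota):
    """Yield (site, batch) pairs, giving each site at most `quota` pages per turn.

    `pending` maps a site name to its list of URLs waiting to be downloaded.
    (Hypothetical structure, used only for illustration.)
    """
    queues = deque((site, deque(urls)) for site, urls in pending.items() if urls)
    while queues:
        site, urls = queues.popleft()
        batch = [urls.popleft() for _ in range(min(quota, len(urls)))]
        yield site, batch
        if urls:  # the site still has pages: send it to the back of the line
            queues.append((site, urls))


pending = {
    "big.example": [f"/p{i}" for i in range(5)],
    "small.example": ["/a"],
}
order = list(dispatch_rounds(pending, quota=2))
```

With a quota of 2, the small site is served right after the first batch of the large site instead of waiting for all five of its pages.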
Further, if the tasks in the domain name queue have priorities, scheduling tasks are obtained from the domain name queue of the site according to the predetermined task priority.
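One possible way to take tasks from a queue by priority is a binary heap. The sketch below is an assumption for illustration only: the class name `DomainTaskQueue` is hypothetical, and the convention that a smaller number means a higher priority is not stated in the patent.

```python
import heapq
import itertools


class DomainTaskQueue:
    """Hypothetical priority queue of scheduling tasks (smaller number = served first)."""

    def __init__(self):
        self._heap = []
        # The counter breaks ties so tasks of equal priority leave in insertion order.
        self._counter = itertools.count()

    def put(self, priority, task):
        heapq.heappush(self._heap, (priority, next(self._counter), task))

    def get(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


queue = DomainTaskQueue()
queue.put(2, "crawl b.example")
queue.put(1, "crawl a.example")
queue.put(1, "crawl c.example")
served = [queue.get() for _ in range(len(queue))]
```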
When downloading pages from each site server, besides limiting the number of downloads, a scheduling algorithm must also control when the pages in this batch are downloaded, so that the load stays within what the site server can bear. Load control must be applied per IP, that is, per server. (In general, one IP corresponds to one server of a site, so load control must be applied per IP to avoid putting too much load on a particular server of the site. IPs correspond to domain names: a domain name may have several IPs or only one, and a URL may use a domain name or an IP directly.) One IP may also belong to several domain names at once, so over repeated scheduling of one IP, the domain names that belong to it are scheduled round-robin. For example, if two domain names correspond to the same IP and a batch of URLs from the first domain name is scheduled when this IP is scheduled this time, a batch of URLs from the other domain name should be scheduled the next time this IP is scheduled. The scheduling time of pages on the server therefore needs to be controlled.
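The round-robin over domain names that share one IP can be sketched as follows. The class `IpRotation` and its fields are hypothetical names chosen for this illustration; the patent does not describe how the rotation state is stored.

```python
class IpRotation:
    """Hypothetical per-IP state: rotates through the domain names mapped to one IP."""

    def __init__(self, domains):
        self.domains = list(domains)
        self.next_index = 0

    def next_domain(self):
        # Return the domain whose turn it is, then advance (wrapping around).
        domain = self.domains[self.next_index]
        self.next_index = (self.next_index + 1) % len(self.domains)
        return domain


rotation = IpRotation(["a.example", "b.example"])
turns = [rotation.next_domain() for _ in range(3)]
```

Each time the IP is scheduled, a batch of URLs would be taken from the domain returned by `next_domain()`, so no single domain monopolizes the shared server.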
After a page is scheduled, the last scheduled time of the page is updated to the last scheduled time plus the page timeout. The last scheduled time is an attribute of the page and is updated right after the page is dispatched. In other words, the page is not scheduled again within one page timeout.
Because download times vary from page to page, a page would not be scheduled again within the page timeout after its download finished, even if it did not actually time out. Therefore, if the download did not time out, the last scheduled time of the page is updated after the download to the last scheduled time minus a rollback time, where the rollback time is the difference between the page timeout and the page download time. To manage pages of the same site uniformly, an agreed time is set as the preset download time of a page: if the actual download time of the page is less than the preset download time, the page download time is taken to be the agreed time; otherwise it is the actual download time.
For example, suppose pages can be downloaded from an IP (a site server) no faster than one page every 5 seconds (the agreed time), but the site is sometimes so busy that a page cannot be downloaded in 10 seconds or even within the timeout. To avoid putting too much load on the server, one full timeout, say 60 seconds, is added to the next scheduling time at each dispatch, rather than the initial agreed time of 5 seconds. The IP then cannot be scheduled again within 60 seconds. How long a page takes is known only once its download has finished. If the page eventually times out, the download took the whole timeout, say 60 seconds. In that case no rollback is needed: the site really is extremely busy, and scheduling one page every 5 seconds would not get pages downloaded at all. If the download took less than the timeout, there are two cases. In the first, the download took less than the agreed fastest time of 5 seconds. The rollback time is then the timeout minus the agreed fastest time, that is, 55 seconds, so the last scheduled time becomes the original last scheduled time plus 5 seconds, and the page can be scheduled after those 5 seconds. Rolling back any further would break the agreement of downloading at most one page every 5 seconds. In the second case, the download took longer than the agreed time, say 15 seconds. The rollback time is then the timeout minus the actual download time, that is, 45 seconds. In this way, when downloads are fast, the rate is still limited by the agreed time; when downloads are slow, scheduling follows the actual conditions at a pace slower than the agreed time. This guarantees that the next page is scheduled only after the previous page has finished downloading, and avoids the situation where a slow site keeps being scheduled at the agreed time and pages pile up.
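The arithmetic of this example can be written out as a short sketch. The function and parameter names are hypothetical; the values (a 60-second timeout and a 5-second minimum interval) are the ones used in the example above, and times are plain numbers of seconds.

```python
def on_dispatch(last_scheduled, page_timeout):
    """At dispatch time, push the last scheduled time forward by one full timeout."""
    return last_scheduled + page_timeout


def on_download_finished(last_scheduled, page_timeout, agreed_time, actual_time):
    """After a download, roll the last scheduled time back unless the page timed out."""
    if actual_time >= page_timeout:
        return last_scheduled  # timed out: keep the full timeout penalty
    # Downloads faster than the agreed minimum interval are counted as the agreed time.
    download_time = max(actual_time, agreed_time)
    rollback = page_timeout - download_time
    return last_scheduled - rollback


start = 1000
pushed = on_dispatch(start, page_timeout=60)            # 1060
fast = on_download_finished(pushed, 60, 5, actual_time=2)    # rollback 55 -> 1005
slow = on_download_finished(pushed, 60, 5, actual_time=15)   # rollback 45 -> 1015
timed_out = on_download_finished(pushed, 60, 5, actual_time=60)  # no rollback -> 1060
```

The net effect is that the next scheduling time is the original time plus the larger of the agreed interval and the actual download time, capped at the timeout.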
Before pages are downloaded, the method also comprises a step of cyclically allocating download quotas to sites, shown in Fig. 2, which specifically comprises:
Step S210: obtain the total number of sites on the current server and locate the site structure directly by site index.
Step S220: obtain from the site structure the total number of domain names of the site and the indices of its first and last domain names, sort the site's domain names by last scheduled time, and select a predetermined number of domain names from the sorted domain names.
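Step S220 can be sketched as a sort followed by a slice. The `Domain` record and the function `pick_domains` are hypothetical, and the sketch assumes that "sorted by last scheduled time" means earliest first, which the text does not state explicitly for domain names.

```python
from dataclasses import dataclass


@dataclass
class Domain:
    name: str
    last_scheduled: float  # seconds since some epoch (assumed representation)


def pick_domains(domains, count):
    """Return up to `count` domains, least recently scheduled first (assumed order)."""
    return sorted(domains, key=lambda d: d.last_scheduled)[:count]


site_domains = [
    Domain("news.example", 300.0),
    Domain("blog.example", 100.0),
    Domain("shop.example", 200.0),
]
chosen = [d.name for d in pick_domains(site_domains, count=2)]
```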
Step S230: locate the IP addresses of each selected domain name from the domain name's total IP count and the indices of its first and last IPs, locate the address of each IP structure by the IP offset within the domain name's IPs, read and record the last scheduled time of each IP in turn, and select the server corresponding to the IP with the earliest last scheduled time.
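A minimal sketch of step S230 follows, assuming that all IP structures live in one flat table and that a domain name owns a contiguous window of it, identified by a first index and a count. This layout, the `IpRecord` type, and the function name are assumptions for illustration; the patent only mentions an IP total, first and last indices, and an IP offset.

```python
from dataclasses import dataclass


@dataclass
class IpRecord:
    address: str
    last_scheduled: float


def earliest_ip(ip_table, first_index, ip_count):
    """Scan the domain's window of the IP table and return the least recently scheduled IP."""
    window = ip_table[first_index:first_index + ip_count]
    return min(window, key=lambda record: record.last_scheduled)


ip_table = [
    IpRecord("192.0.2.1", 50.0),   # belongs to another domain in this example
    IpRecord("192.0.2.10", 400.0),
    IpRecord("192.0.2.11", 250.0),
    IpRecord("192.0.2.12", 300.0),
]
selected = earliest_ip(ip_table, first_index=1, ip_count=3)
```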
Step S240: after the IP structure is located, compare the last scheduled time in the IP structure with the current time. If the last scheduled time (the time value recorded in the last scheduled time attribute, obtained by adjusting this attribute in step S120 above) is greater than or equal to the current time, the server is not allocated the number of pages schedulable at one time. If the last scheduled time is less than the current time, the server is allocated the number of pages schedulable at one time, and the last scheduled time of the server corresponding to this IP is set to the current time.
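The comparison in step S240 is small enough to sketch directly. The function name `try_assign` and the dictionary holding the IP state are hypothetical; the rule itself (no quota when the last scheduled time is greater than or equal to the current time, otherwise grant the quota and stamp the current time) follows the text.

```python
def try_assign(ip_state, now, quota):
    """Return the number of pages granted to this IP for the current pass."""
    if ip_state["last_scheduled"] >= now:
        return 0  # still inside its back-off window: no pages this time
    ip_state["last_scheduled"] = now
    return quota


state = {"last_scheduled": 100}
first = try_assign(state, now=100, quota=10)   # equal to now: nothing granted
second = try_assign(state, now=101, quota=10)  # earlier than now: quota granted
third = try_assign(state, now=101, quota=10)   # just stamped with 101: nothing granted
```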
Step S250: if all IP structures in the domain name have been allocated, move on to the next domain name; otherwise select from the remaining IPs the IP with the earliest last scheduled time and continue.
Step S250: after all domain names of the current site have been allocated, increment the site index by 1 so as to process the next site. If the site index reaches the maximum, reset it to 0. If no site could be allocated after all sites have been checked once, sleep for a predetermined time (for example 1 second) and then cycle through all sites again. All sites are arranged consecutively: the first site has index 0 and each subsequent index is one higher. Scheduling starts from site 0, and the next round adds 1 and schedules site 1. After the last site has been scheduled, the next round wraps around and starts again from site 0.
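The outer loop over sites can be sketched as below. The function `allocation_loop`, the `allocate_site` callback (assumed to return True when at least one quota was handed out for that site), and the `passes` limit are hypothetical; a real scheduler would presumably run indefinitely, and the limit exists here only so the sketch terminates.

```python
import time


def allocation_loop(site_count, allocate_site, passes, sleep_seconds=1.0, sleep=time.sleep):
    """Visit sites 0..site_count-1 in a cycle; sleep after a pass that allocated nothing."""
    if site_count <= 0:
        return
    site_index = 0
    assigned_in_pass = False
    completed = 0
    while completed < passes:
        if allocate_site(site_index):
            assigned_in_pass = True
        site_index += 1
        if site_index == site_count:  # reached the maximum: wrap around to site 0
            site_index = 0
            completed += 1
            if not assigned_in_pass:
                sleep(sleep_seconds)
            assigned_in_pass = False


visited = []
naps = []


def fake_allocate(index):
    visited.append(index)
    return len(visited) == 2  # only the second visit hands out a quota


allocation_loop(3, fake_allocate, passes=2, sleep=naps.append)
```

In this run the first pass allocates something and does not sleep, while the second pass allocates nothing and sleeps once for the configured interval.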
The present invention also provides a balanced scheduling system based on site quotas. Its structure is shown in Fig. 3 and comprises a scheduling task acquisition module 310 and a scheduling module 320.
The scheduling task acquisition module 310 is adapted to obtain a scheduling task from the domain name queue of a site; if the scheduling tasks in the domain name queue have a priority order, scheduling tasks are obtained from the domain name queue of the site according to the predetermined task priority. The scheduling module 320 is adapted to download, from the server specified by the scheduling task, a corresponding number of pages according to the preconfigured number of pages schedulable at one time.
The system of this embodiment further comprises a feedback adjustment module, adapted to update, after a page is scheduled, the last scheduled time of the server to the last scheduled time plus the page timeout.
The system of this embodiment further comprises a feedback adjustment module, adapted to update, after a page is downloaded, the last scheduled time of the server to the last scheduled time minus a rollback time, the rollback time being the difference between the page timeout and the page download time. When the actual download time of the page is less than the preset download duration, the page download time is the preset download duration; otherwise it is the actual download time of the page.
The system of this embodiment further comprises a quota allocation module 4, shown in Fig. 4, which comprises:
a site structure locating module 410, adapted to obtain the total number of sites on the current server and to locate the site structure directly by site index;
a domain name quota allocation module 420, adapted to obtain from the site structure the total number of domain names of the site and the indices of its first and last domain names, sort the site's domain names by last scheduled time, and select a predetermined number of domain names from the sorted domain names;
a domain name IP locating module 430, adapted to locate the IP addresses of each selected domain name from the domain name's total IP count and the indices of its first and last IPs, locate the address of each IP structure by the IP offset within the domain name's IPs, read and record the last scheduled time of each IP in turn, and select the server corresponding to the IP with the earliest last scheduled time;
a scheduling time setting module 440, adapted to compare, after the IP structure is located, the last scheduled time in the IP structure with the current time; if the last scheduled time is greater than or equal to the current time, the server is not allocated the number of pages schedulable at one time; if the last scheduled time is less than the current time, the server is allocated the number of pages schedulable at one time, and the last scheduled time of the IP is set to the current time;
a cyclic allocation module 450, adapted to: if all IP structures in the domain name have been allocated, move on to the next domain name, and otherwise select from the remaining IPs the IP with the earliest last scheduled time and continue; after all domain names of the current site have been allocated, increment the site index by 1 so as to process the next site; if the site index reaches the maximum, reset it to 0; and if no site could be allocated after all sites have been checked once, sleep for a predetermined time (for example 1 second) and then cycle through all sites again.
The algorithms and displays provided here are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such systems is apparent. Moreover, the invention is not directed to any particular programming language. It should be understood that the content of the invention described here may be implemented in a variety of programming languages, and that the above description of a specific language is given to disclose the best mode of the invention.
In the specification provided here, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the inventive aspects, features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. However, this method of disclosure should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. The claims following the detailed description are therefore expressly incorporated into it, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will understand that the modules in the device of an embodiment may be changed adaptively and arranged in one or more devices different from that embodiment. The modules, units or components of an embodiment may be combined into one module, unit or component, and may furthermore be divided into several sub-modules, sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
In addition, those skilled in the art will understand that, although some embodiments described here include some features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the balanced scheduling system based on site quotas according to embodiments of the invention. The invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described here. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference symbol placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any order; these words may be interpreted as names.

Claims (14)

1. balance dispatching system based on the website quota comprises:
The scheduler task acquisition module is suitable for obtaining the scheduler task in the domain name formation of website;
Scheduler module is suitable for once can dispatching page number is downloaded respective numbers from described server the page to the specified server of described scheduler task according to pre-configured.
2. the balance dispatching system based on the website quota as claimed in claim 1 is characterized in that, described scheduler task acquisition module is suitable for obtaining scheduler task in the domain name formation of website by predetermined task priority.
3. the balance dispatching system based on the website quota as claimed in claim 1 or 2, it is characterized in that, described system also comprises: the feedback adjusting module, the last scheduled time with the described page of being suitable for behind page of scheduling is updated to the described last scheduled time and adds page time-out time.
4. The balanced scheduling system based on site quota according to any one of claims 1 to 3, characterized in that the system further comprises: a feedback adjustment module, adapted, after a page is downloaded, to update the last scheduled time of the page to the last scheduled time minus a rollback time, the rollback time being the difference between the page timeout and the page download time.
5. The balanced scheduling system based on site quota according to any one of claims 1 to 4, characterized in that, when the actual download time of the page is less than a preset download duration, the page download time is the preset download duration; otherwise, it is the actual download time of the page.
6. The balanced scheduling system based on site quota according to any one of claims 1 to 5, characterized in that the system further comprises a quota allocation module, which specifically comprises:
a site structure locating module, adapted to obtain the total number of sites on the current server and to locate the site structure directly by the site sequence number;
a domain name quota allocation module, adapted to obtain from the site structure the total number of domain names of the site and its first and last domain name sequence numbers, to sort the domain names of the site by last scheduled time, and to select a predetermined number of domain names from the sorted domain names;
a domain name IP locating module, adapted, for each selected domain name, to locate its IP addresses according to the total number of IPs of the domain name and its first and last IP sequence numbers, to locate the IP structure address by the IP offset within the IPs of the domain name, to read and record the last scheduled time of each IP in turn, and to select the server corresponding to the IP with the earliest last scheduled time;
a scheduling time setting module, adapted, after the IP structure is located, to compare the last scheduled time in the IP structure with the current time: if the last scheduled time is greater than or equal to the current time, the number of pages schedulable at one time is not allocated to this server; if the last scheduled time is less than the current time, the number of pages schedulable at one time is allocated to this server and the last scheduled time of this IP is set to the current time;
a loop allocation module, adapted, if all IP structures of the domain name have been allocated, to proceed to the next domain name, and otherwise to select from the remaining IPs the one with the earliest last scheduled time and continue processing; after all domain names of the current site have been allocated, to increment the site sequence number by 1 in order to process the next site; if the site sequence number reaches its maximum value, to reset it to 0; and if no site has been allocated after all sites have been checked once, to sleep for a predetermined time and then loop over all sites again.
7. The balanced scheduling system based on site quota according to any one of claims 1 to 6, characterized in that the predetermined time is 1 second.
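The timing feedback recited in claims 3 to 5 can be illustrated with a short sketch. This is an illustration under stated assumptions rather than the patented implementation: the names PageRecord, on_scheduled, on_downloaded and effective_download_time are hypothetical, and times are assumed to be plain seconds.

```python
from dataclasses import dataclass


@dataclass
class PageRecord:
    """Hypothetical per-page record; the field name is an assumption."""
    last_scheduled: float  # last scheduled time, in seconds


def on_scheduled(page: PageRecord, page_timeout: float) -> None:
    # Claim 3: after scheduling, last scheduled time += page timeout.
    page.last_scheduled += page_timeout


def effective_download_time(actual: float, preset: float) -> float:
    # Claim 5: an actual download time below the preset duration
    # is replaced by the preset duration.
    return preset if actual < preset else actual


def on_downloaded(page: PageRecord, page_timeout: float,
                  actual: float, preset: float) -> None:
    # Claim 4: rollback time = page timeout - page download time,
    # and last scheduled time -= rollback time.
    rollback = page_timeout - effective_download_time(actual, preset)
    page.last_scheduled -= rollback
```

Taken together, the two updates leave the last scheduled time advanced by the effective download time: a page scheduled at 100 s with a 30 s timeout moves to 130 s, and a 2 s download with a 5 s preset duration rolls it back by 25 s to 105 s.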
8. A balanced scheduling method based on site quota, comprising the following steps:
acquiring a scheduling task from the domain name queue of a site;
downloading, from the server specified by the scheduling task, a number of pages corresponding to a pre-configured number of pages schedulable at one time.
9. The balanced scheduling method based on site quota according to claim 8, characterized in that the scheduling task is acquired from the domain name queue of the site according to a predetermined task priority.
10. The balanced scheduling method based on site quota according to claim 8 or 9, characterized in that, after a page is scheduled, the method further comprises the step of: updating the last scheduled time of the page to the last scheduled time plus a page timeout.
11. The balanced scheduling method based on site quota according to any one of claims 8 to 10, characterized in that, after a page is downloaded, the method further comprises the step of: updating the last scheduled time of the page to the last scheduled time minus a rollback time, the rollback time being the difference between the page timeout and the page download time.
12. The balanced scheduling method based on site quota according to any one of claims 8 to 11, characterized in that, when the actual download time of the page is less than a preset download duration, the page download time is the preset download duration; otherwise, it is the actual download time of the page.
13. The balanced scheduling method based on site quota according to any one of claims 8 to 12, characterized in that, before the pages are downloaded, the method further comprises:
obtaining the total number of sites on the current server and locating the site structure directly by the site sequence number;
obtaining from the site structure the total number of domain names of the site and its first and last domain name sequence numbers, sorting the domain names of the site by last scheduled time, and selecting a predetermined number of domain names from the sorted domain names;
for each selected domain name, locating its IP addresses according to the total number of IPs of the domain name and its first and last IP sequence numbers, locating the IP structure address by the IP offset within the IPs of the domain name, reading and recording the last scheduled time of each IP in turn, and selecting the server corresponding to the IP with the earliest last scheduled time;
after the IP structure is located, comparing the last scheduled time in the IP structure with the current time: if the last scheduled time is greater than or equal to the current time, the number of pages schedulable at one time is not allocated to this server; if the last scheduled time is less than the current time, the number of pages schedulable at one time is allocated to this server and the last scheduled time of this IP is set to the current time;
if all IP structures of the domain name have been allocated, proceeding to the next domain name, and otherwise selecting from the remaining IPs the one with the earliest last scheduled time and continuing processing; after all domain names of the current site have been allocated, incrementing the site sequence number by 1 in order to process the next site; if the site sequence number reaches its maximum value, resetting it to 0; and if no site has been allocated after all sites have been checked once, sleeping for a predetermined time and then looping over all sites again.
14. The balanced scheduling method based on site quota according to any one of claims 8 to 13, characterized in that the predetermined time is 1 second.
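The allocation loop recited in claims 13 and 14 can be sketched as follows. This is a simplified illustration, not the patented implementation: the classes IpEntry, Domain and Site and the functions allocate_site, allocation_round and run are hypothetical names; Python lists stand in for the sequence-number and offset lookups of the claims; and a per-domain last_scheduled field, updated whenever one of the domain's IPs receives a quota, is an assumption made so that domains can be sorted.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class IpEntry:
    address: str
    last_scheduled: float = 0.0


@dataclass
class Domain:
    name: str
    ips: List[IpEntry] = field(default_factory=list)
    last_scheduled: float = 0.0  # assumption: tracked per domain for sorting


@dataclass
class Site:
    domains: List[Domain] = field(default_factory=list)


def allocate_site(site: Site, now: float, quota: int,
                  domains_per_pass: int) -> List[Tuple[str, int]]:
    """Return (IP address, page count) grants for one site."""
    grants = []
    # Sort domains by last scheduled time and take a predetermined number.
    chosen = sorted(site.domains, key=lambda d: d.last_scheduled)
    for domain in chosen[:domains_per_pass]:
        # Visit the domain's IPs, earliest last scheduled time first.
        for ip in sorted(domain.ips, key=lambda e: e.last_scheduled):
            if ip.last_scheduled >= now:
                continue  # not due yet: no quota for this server
            grants.append((ip.address, quota))
            ip.last_scheduled = now
            domain.last_scheduled = now
    return grants


def allocation_round(sites: List[Site], start: int, now: float, quota: int,
                     domains_per_pass: int) -> Tuple[List[Tuple[str, int]], int]:
    """Check every site once, starting at index `start`, wrapping to 0."""
    grants = []
    index = start
    for _ in range(len(sites)):
        grants.extend(allocate_site(sites[index], now, quota, domains_per_pass))
        index = (index + 1) % len(sites)
    return grants, index


def run(sites: List[Site], quota: int, domains_per_pass: int, rounds: int,
        idle_sleep: float = 1.0,
        clock: Callable[[], float] = time.time,
        sleep: Callable[[float], None] = time.sleep) -> List[Tuple[str, int]]:
    """Repeat allocation rounds; sleep when a full round grants nothing."""
    all_grants = []
    index = 0
    for _ in range(rounds):
        grants, index = allocation_round(sites, index, clock(), quota,
                                         domains_per_pass)
        if not grants:
            sleep(idle_sleep)  # claim 14: the predetermined time is 1 second
        all_grants.extend(grants)
    return all_grants
```

The clock and sleep parameters of run are injected only so the loop can be exercised without waiting in real time; a caller would normally leave them at their defaults.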
CN201210376922.3A 2012-09-29 2012-09-29 Balanced scheduling system and method based on station quota Expired - Fee Related CN102929721B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210376922.3A CN102929721B (en) 2012-09-29 2012-09-29 Balanced scheduling system and method based on station quota

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210376922.3A CN102929721B (en) 2012-09-29 2012-09-29 Balanced scheduling system and method based on station quota

Publications (2)

Publication Number Publication Date
CN102929721A (en) 2013-02-13
CN102929721B CN102929721B (en) 2015-04-08

Family

ID=47644528

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210376922.3A Expired - Fee Related CN102929721B (en) 2012-09-29 2012-09-29 Balanced scheduling system and method based on station quota

Country Status (1)

Country Link
CN (1) CN102929721B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101453361A (en) * 2007-12-07 2009-06-10 中国科学院声学研究所 Website request queue management method
CN102301677A (en) * 2008-11-25 2011-12-28 思杰系统有限公司 Systems And Methods For Gslb Site Persistence
CN101635718A (en) * 2009-08-26 2010-01-27 中兴通讯股份有限公司 Network crawler system and method for acquiring resource as well as network resource gripping device

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102868639A (en) * 2012-09-29 2013-01-09 北京奇虎科技有限公司 Balanced scheduling system and balanced scheduling method based on site quota
CN104750558A (en) * 2013-12-31 2015-07-01 伊姆西公司 Resource allocation management method and device of hierarchical quota system
CN104750558B (en) * 2013-12-31 2018-07-03 伊姆西公司 The method and apparatus that resource allocation is managed in quota system is layered
CN104639462A (en) * 2015-02-27 2015-05-20 北京奇虎科技有限公司 System and method for balanced scheduling based on station quota

Also Published As

Publication number Publication date
CN102929721B (en) 2015-04-08

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20150408

Termination date: 20210929