CN106656860A - Multi-site HTTP access frequency control method - Google Patents
- Publication number
- CN106656860A (application number CN201610920014.4A)
- Authority
- CN
- China
- Prior art keywords
- task
- queue
- frequency
- frequency control
- site http
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/50—Queue scheduling
Abstract
The invention relates to a multi-site HTTP access frequency control method. When a download task is received, it is matched against the frequency configuration and placed in a download task queue; the queues are then polled and their tasks downloaded. Once the system starts processing, one task is read from the head of the first queue and the method checks whether a frequency state is recorded for that task. If a state is recorded, the method determines whether the task may be executed at the current time; if it may, the task is downloaded, its finish time and waiting time are recorded, and the task is deleted from the queue. If no state is recorded, the task is downloaded directly, its finish time and waiting time are recorded, and the task is deleted from the queue. The method then determines whether to continue: if yes, a task is read from the head of the next queue and its frequency state is queried in the same way; if no, the thread ends. The method thereby solves the problems caused by frequency control.
Description
Technical field
The present invention belongs to the field of control, and in particular relates to a multi-site HTTP access frequency control method.
Background
With the rapid development of networks, the World Wide Web has become the carrier of a vast amount of information, and extracting and using that information efficiently has become a major challenge. HTTP access is the main way of extracting information from the network. As anti-crawler mechanisms keep getting stronger, controlling the HTTP access frequency has become the main means of avoiding access restrictions imposed by websites, but frequency control raises the problem of how efficiently resources are used.
At present, the following problems arise:
1. Frequency control is not always necessary: not every website requires it.
2. Frequency control interferes across tasks: in the original single download task queue, if one task is held back by frequency control, the tasks behind it cannot be downloaded in time.
3. Frequency control partitioning: one website may have different subdomains that each have their own frequency, and multiple websites may share one frequency.
Summary of the invention
The present invention provides a multi-site HTTP access frequency control method to solve the problems caused by frequency control.
In the multi-site HTTP access frequency control method, after a download task is received, frequency configuration is applied, the task enters a download task queue, the queues are polled for tasks, and the download is finally carried out. After the system thread starts, one task is read from the head of the first queue and it is checked whether a frequency state for this task exists in the record. If it does, it is judged whether the task can be executed at the current time; if it can, the task is downloaded, the end time and waiting time are recorded, and the task is deleted from the queue. If no frequency state for the task exists in the record, the task is downloaded, the end time and waiting time are recorded, and the task is deleted from the queue. It is then confirmed whether to continue: if so, one task is read from the head of the next queue and the frequency state of that task is queried in turn; if not, the thread ends.
Further, regular expressions are used in the frequency rules to divide frequency units. Regular expressions are highly flexible: a unit can be divided by the subdomain of a URL, or a single rule can be composed of multiple domain names.
Further, in the frequency unit queues, specifying frequency rules causes the tasks of different frequency units to be divided into different queues.
Further, in multi-queue scheduling, the tasks of different frequency units are in different queues, and a scheduler checks whether the task of each queue can be executed.
Description of the drawings
Fig. 1 is the system information architecture diagram of the multi-site HTTP access frequency control method.
Fig. 2 is the system processing flow chart of the multi-site HTTP access frequency control method.
Specific embodiment
Embodiment: In the multi-site HTTP access frequency control method, after a download task is received, frequency configuration is applied, the task enters a download task queue, the queues are polled for tasks, and the download is finally carried out. After system processing starts, one task is read from the head of the first queue and it is checked whether a frequency state for this task exists in the record. If it does, it is judged whether the task can be executed at the current time; if it can, the task is downloaded, the end time and waiting time are recorded, and the task is deleted from the queue. If no frequency state for the task exists in the record, the task is downloaded, the end time and waiting time are recorded, and the task is deleted from the queue. It is then confirmed whether to continue: if so, one task is read from the head of the next queue and the frequency state of that task is queried in turn; if not, the thread ends.
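The per-task decision in this flow can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `FrequencyState` and `can_execute`, and the use of seconds as the time unit, are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FrequencyState:
    """Hypothetical record kept for a frequency unit after a download."""
    end_time: float   # when the previous download of this unit finished
    wait_time: float  # how long to wait before the next download


def can_execute(state: Optional[FrequencyState], now: float) -> bool:
    """Return True if a task of this unit may be downloaded at time `now`."""
    if state is None:
        # No frequency state recorded: nothing to wait for, download at once.
        return True
    # A state is recorded: the wait must have elapsed since the last download.
    return now - state.end_time >= state.wait_time
```

A task whose unit has no recorded state is always eligible, which matches the branch of the flow that downloads directly when no frequency state is found.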
A frequency rule contains a regular expression, a minimum waiting time, and a maximum waiting time. The regular expression is matched against the task URL to distinguish different frequency control units; the minimum waiting time is the shortest time that must elapse before a task is executed, and the maximum waiting time is the longest time to wait before a task is executed. The frequency control system holds multiple frequency rules. When a download task is received, the regular expressions in the rules map the task to the corresponding task queue.
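One way to represent such rules and the mapping of tasks to queues is sketched below. The class `FrequencyRule`, the helper `unit_for`, the example domains, and the `DEFAULT_UNIT` queue for URLs that match no rule are all assumptions for illustration; the source does not specify them.

```python
import re
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class FrequencyRule:
    pattern: str     # regular expression matched against the task URL
    min_wait: float  # minimum seconds to wait before a task runs
    max_wait: float  # maximum seconds to wait before a task runs


# Hypothetical rule set; the domains are placeholders.
RULES = [
    FrequencyRule(r"^https?://news\.example\.com/", 1.0, 3.0),
    FrequencyRule(r"^https?://([^/]+\.)?example\.org/", 5.0, 10.0),
]

DEFAULT_UNIT = "unthrottled"  # assumed queue for URLs that match no rule


def unit_for(url: str) -> str:
    """Return the key of the frequency unit (queue) a URL belongs to."""
    for rule in RULES:
        if re.search(rule.pattern, url):
            return rule.pattern  # first matching rule wins
    return DEFAULT_UNIT


# Route incoming download tasks to one queue per frequency unit.
queues = defaultdict(deque)
for url in [
    "http://news.example.com/a",
    "https://www.example.org/b",
    "http://other.test/c",
]:
    queues[unit_for(url)].append(url)
```

Using the rule's pattern as the queue key is only a convenience here; any stable identifier per rule would serve.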
The scheduling flow records, for each frequency unit, the end time of its last download and the time it must wait. It polls each task queue and judges whether, at the current time, the task at the head of the queue satisfies the required waiting interval. If the task can be executed, it is taken out of the queue and downloaded; if not, the scheduler moves on to the next task. After a task completes, the scheduling flow records its download end time and calculates, according to the frequency rule, how long the next task must wait.
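The polling behaviour described above might look like the following sketch. The `Scheduler` class, its injected `download` and `clock` callables, and in particular the choice to draw the next waiting time uniformly between the minimum and maximum are assumptions; the source only says the wait is calculated according to the frequency rule.

```python
import random
from collections import deque


class Scheduler:
    """Round-robin over per-unit queues, honouring each unit's wait time."""

    def __init__(self, rules, download, clock):
        # rules: {unit: (min_wait, max_wait)}; units absent from it are unthrottled
        self.rules = rules
        self.download = download  # callable(url), assumed to block until done
        self.clock = clock        # callable() -> current time in seconds
        self.queues = {}          # unit -> deque of URLs
        self.state = {}           # unit -> (end_time, wait_time)

    def add(self, unit, url):
        self.queues.setdefault(unit, deque()).append(url)

    def _ready(self, unit, now):
        if unit not in self.state:
            return True           # no recorded state: run at once
        end_time, wait_time = self.state[unit]
        return now - end_time >= wait_time

    def poll_once(self):
        """Visit each queue once; return the URLs downloaded in this pass."""
        done = []
        for unit, queue in self.queues.items():
            if not queue or not self._ready(unit, self.clock()):
                continue          # skip: empty, or still inside its wait
            url = queue.popleft()
            self.download(url)
            done.append(url)
            if unit in self.rules:
                lo, hi = self.rules[unit]
                # Assumption: next wait drawn uniformly from [min, max].
                self.state[unit] = (self.clock(), random.uniform(lo, hi))
        return done
```

Because a queue that is still waiting is simply skipped, a throttled unit never delays tasks in the other queues.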
Regular expressions are used in the frequency rules to divide frequency units. Regular expressions are highly flexible: a unit can be divided by the subdomain of a URL, or a single rule can be composed of multiple domain names.
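The two ways of dividing units can be illustrated with a few patterns. The domain names and variable names below are placeholders chosen for this example, not values from the source.

```python
import re

# Hypothetical patterns; the domains are placeholders.
# One site split by subdomain: each subdomain is its own frequency unit.
API_ONLY = re.compile(r"^https?://api\.example\.com/")
IMG_ONLY = re.compile(r"^https?://img\.example\.com/")

# Several domains combined into one rule: they share one frequency unit.
SHARED = re.compile(r"^https?://(www\.)?(site-a\.example|site-b\.example)/")
```

With these patterns, `api.example.com` and `img.example.com` are throttled independently, while `site-a.example` and `site-b.example` draw on the same waiting interval.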
In the frequency unit queues, specifying frequency rules causes the tasks of different frequency units to be divided into different queues.
In multi-queue scheduling, the tasks of different frequency units are in different queues, and a scheduler checks whether the task of each queue can be executed. This ensures that the tasks of every queue are executed in time and thereby maximizes resource utilization.
Although an embodiment of the present invention has been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations can be made to the embodiment without departing from the principles and spirit of the invention; the scope of the invention is defined by the appended claims.
Claims (4)
1. A multi-site HTTP access frequency control method, characterized in that: after a download task is received, frequency configuration is applied, the task enters a download task queue, the queues are polled for tasks, and the download is finally carried out; after system processing starts, one task is read from the head of the first queue and it is checked whether a frequency state for this task exists in the record; if so, it is judged whether the task can be executed at the current time, and if it can, the task is downloaded, the end time and waiting time are recorded, the task is deleted from the queue, and it is confirmed whether to continue; if so, one task is read from the head of the next queue and the frequency state of that task is queried in turn; if no frequency state for the task exists in the record, the task is downloaded, the end time and waiting time are recorded, the task is deleted from the queue, and it is confirmed whether to continue; if not, the thread ends.
2. The multi-site HTTP access frequency control method according to claim 1, characterized in that: regular expressions are used in the frequency rules to divide frequency units; regular expressions are highly flexible, so a unit can be divided by the subdomain of a URL, or a single rule can be composed of multiple domain names.
3. The multi-site HTTP access frequency control method according to claim 1, characterized in that: in the frequency unit queues, specifying frequency rules causes the tasks of different frequency units to be divided into different queues.
4. The multi-site HTTP access frequency control method according to claim 1, characterized in that: in multi-queue scheduling, the tasks of different frequency units are in different queues, and a scheduler checks whether the task of each queue can be executed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610920014.4A CN106656860A (en) | 2016-10-21 | 2016-10-21 | Multi-site HTTP access frequency control method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610920014.4A CN106656860A (en) | 2016-10-21 | 2016-10-21 | Multi-site HTTP access frequency control method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106656860A true CN106656860A (en) | 2017-05-10 |
Family
ID=58856086
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610920014.4A Pending CN106656860A (en) | 2016-10-21 | 2016-10-21 | Multi-site HTTP access frequency control method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106656860A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108154431A (en) * | 2018-01-17 | 2018-06-12 | 北京网信云服信息科技有限公司 | A kind of target raises condition processing method and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102298622A (en) * | 2011-08-11 | 2011-12-28 | 中国科学院自动化研究所 | Search method for focused web crawler based on anchor text and system thereof |
CN102902785A (en) * | 2012-09-29 | 2013-01-30 | 合一网络技术(北京)有限公司 | Webpage information acquisition system and method |
CN103873597A (en) * | 2014-04-15 | 2014-06-18 | 厦门市美亚柏科信息股份有限公司 | Distributed webpage downloading method and system |
CN105260388A (en) * | 2015-09-11 | 2016-01-20 | 广州极数宝数据服务有限公司 | Optimization method of distributed vertical crawler service system |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108154431A (en) * | 2018-01-17 | 2018-06-12 | 北京网信云服信息科技有限公司 | A kind of target raises condition processing method and device |
CN108154431B (en) * | 2018-01-17 | 2021-07-06 | 北京网信云服信息科技有限公司 | Target recruitment state processing method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Ren et al. | Hopper: Decentralized speculation-aware cluster scheduling at scale | |
CN105915633B (en) | Automatic operation and maintenance system and method | |
CN104717636B (en) | Method for upgrading software, terminal device and aerial download server | |
CN103679392B (en) | A kind of task scheduling processing method and system | |
US20180349178A1 (en) | A method and system for scalable job processing | |
JP2011242991A (en) | Cloud computing system, document processing method, and computer program | |
JP4876138B2 (en) | Control computer and control system | |
CN103235835A (en) | Inquiry implementation method for database cluster and device | |
US20160019090A1 (en) | Data processing control method, computer-readable recording medium, and data processing control device | |
CN106664714A (en) | Periodic uplink grant alignment in a cellular network | |
Kim et al. | An analytical framework to characterize the efficiency and delay in a mobile data offloading system | |
CN104202386B (en) | A kind of high concurrent amount distributed file system and its secondary load equalization methods | |
JP2017168074A (en) | Method and apparatus for controlling data transmission | |
CN105138598A (en) | Method and system for remotely timing task | |
CN105791371A (en) | Cloud storage service system and method | |
CN102664950A (en) | Data communication method between welding power sources and computers | |
Chen et al. | DTS: dynamic TDMA scheduling for networked control systems | |
CN106656860A (en) | Multi-site HTTP access frequency control method | |
CN107015855A (en) | A kind of asynchronous service centralized dispatching method and device for supporting time parameter method | |
CN104750545A (en) | Process scheduling method and device | |
ATE447813T1 (en) | SYSTEM AND METHOD FOR TIME-BASED PLANNING | |
CN107819823A (en) | A kind of information processing method, server and computer-readable recording medium | |
CN106911739B (en) | Information distribution method and device | |
US20170279895A1 (en) | Information processing system and information processing method | |
CN106776032A (en) | The treating method and apparatus of the I/O Request of distributed block storage |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20170510 |