CN107784116A - Method and system for implementing task distribution in a distributed system - Google Patents

Method and system for implementing task distribution in a distributed system

Info

Publication number
CN107784116A
Authority
CN
China
Prior art keywords
equipment
task
distributed
average delay
interval
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711101744.2A
Other languages
Chinese (zh)
Inventor
马岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Creative Technology (shenzhen) Co Ltd
Original Assignee
Creative Technology (shenzhen) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Creative Technology (shenzhen) Co Ltd filed Critical Creative Technology (shenzhen) Co Ltd
Priority to CN201711101744.2A priority Critical patent/CN107784116A/en
Publication of CN107784116A publication Critical patent/CN107784116A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system

Abstract

The invention discloses a method for implementing task distribution in a distributed system. The method comprises the following steps: a distributed device receives or initiates a task message, the task message being used to allocate web page collection tasks in the distributed system; the distributed device sends N data packets in sequence to M other devices of the distributed system; the distributed device records M groups of N delays for the N data packets returned by the M devices; and the distributed device allocates web page tasks according to the average of the N delays in each group. The technical solution provided by the invention has the advantage of high efficiency.

Description

Method and system for implementing task distribution in a distributed system
Technical field
The present invention relates to the field of data processing, and in particular to a method and system for implementing task distribution in a distributed system.
Background
Web page collection refers to acquiring specified web pages. Existing web page collection is generally implemented in a distributed system, but existing approaches cannot allocate web page collection tasks according to actual conditions, nor can they select a suitable execution mode according to the specific task type, so collection efficiency is low.
Summary of the invention
The present application provides a method for implementing task distribution in a distributed system, which addresses the low efficiency of prior-art technical solutions.
In a first aspect, a method for implementing task distribution in a distributed system is provided, the method comprising the following steps:
A distributed device receives or initiates a task message, the task message being used to allocate web page collection tasks in the distributed system;
The distributed device sends N data packets in sequence to M other devices of the distributed system; the distributed device records M groups of N delays for the N data packets returned by the M devices; the distributed device allocates web page tasks to execution devices according to the average of the N delays in each group;
An execution device determines the type of the allocated web page task and executes the web page task in the execution mode allocated for that type.
Optionally, the allocation of web page collection tasks by the distributed device according to the delays specifically includes:
The distributed device allocates a first group of web page collection tasks to the X devices whose average delay falls in a first interval, and allocates a second group of web page collection tasks to the Y devices whose average delay falls in a second interval, where the average delay of the X devices in the first interval is less than the average delay of the Y devices in the second interval, and the first group of web page collection tasks is larger than the second group.
Optionally, executing the web page task in the execution mode allocated for the type includes:
The execution device allocates a single thread or multiple threads to execute the web page task.
In a second aspect, an application system for task distribution in a distributed system is provided, the system including a distributed device and M execution devices, the distributed device being connected to the M execution devices;
The distributed device is used to receive or initiate a task message, the task message being used to allocate web page collection tasks in the distributed system; to send N data packets in sequence to M other devices of the distributed system; to record M groups of N delays for the N data packets returned by the M devices; and to allocate web page tasks to the M execution devices according to the average of the N delays in each group;
The M devices are used to receive the allocated web page collection tasks, determine the type of each allocated web page task, and execute the web page task in the execution mode allocated for that type.
Optionally, the distributed device is further used to allocate a first group of web page collection tasks to the X devices whose average delay falls in a first interval, and to allocate a second group of web page collection tasks to the Y devices whose average delay falls in a second interval, where the average delay of the X devices in the first interval is less than the average delay of the Y devices in the second interval, and the first group of web page collection tasks is larger than the second group.
Optionally, the execution device is further used to allocate a single thread or multiple threads to execute the web page task.
In a third aspect, a distributed device is provided, including a processor, a wireless transceiver, a memory and a bus, the processor, wireless transceiver and memory being connected by the bus,
The wireless transceiver is used to receive or initiate a task message, the task message being used to allocate web page collection tasks in the distributed system;
The processor is used to send N data packets in sequence to M other devices of the distributed system; to record M groups of N delays for the N data packets returned by the M devices; and to allocate web page tasks according to the average of the N delays in each group.
Optionally, the processor is used to allocate a first group of web page collection tasks to the X devices whose average delay falls in a first interval, and to allocate a second group of web page collection tasks to the Y devices whose average delay falls in a second interval, where the average delay of the X devices in the first interval is less than the average delay of the Y devices in the second interval, and the first group of web page collection tasks is larger than the second group.
Optionally, the processor is used to, once the first web page collection task has been configured, broadcast the first web page collection task to the other devices of the distributed system and receive the confirmation messages returned by the other devices.
In a fourth aspect, a computer-readable storage medium is provided, which stores a computer program for electronic data interchange, where the computer program causes a computer to perform the method provided in the first aspect.
The technical solution provided by the invention allocates web page collection tasks according to average delay: devices with a smaller average delay are allocated more web page collection tasks, and devices with a larger average delay are allocated fewer. In addition, each task is executed in a mode selected to match its type, which improves efficiency.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of a method for implementing task distribution in a distributed system, provided by the first preferred embodiment of the invention;
Fig. 2 is a structural diagram of an application system for task distribution in a distributed system, provided by the second preferred embodiment of the invention.
Fig. 3 is a hardware structure diagram of a distributed device provided by the second preferred embodiment of the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the invention without creative effort fall within the scope of protection of the invention.
Referring to Fig. 1, Fig. 1 shows a method for implementing task distribution in a distributed system according to the first preferred embodiment of the invention. As shown in Fig. 1, the method comprises the following steps:
Step S101: the distributed device receives or initiates a task message, the task message being used to allocate web page collection tasks in the distributed system.
Step S102: the distributed device sends N data packets in sequence to M other devices of the distributed system and records M groups of N delays for the N data packets returned by the M devices, each group containing the delays of the N data packets.
Step S102 can be implemented as follows:
The distributed device obtains the sizes (that is, the capacity, for example a number of MB or KB) of historically shared data packets and extracts the size range of the historical data packets. It divides this size range into N subintervals and creates N virtual data packets, where the size of the i-th of the N data packets is the midpoint of the i-th of the N subintervals. The distributed device sends the N data packets in sequence to the M other distributed devices and records the delay of each of the N data packets for each of the M other distributed devices, obtaining M groups of N delays.
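The derivation of the N virtual packet sizes can be sketched in Python as follows. The text does not specify how the size range is divided, so this sketch assumes that the recorded sizes are sorted and split into N contiguous groups, and that each virtual packet takes the midpoint between the largest and smallest size in its group. The function name `virtual_packet_sizes` is hypothetical.

```python
def virtual_packet_sizes(history_sizes_mb, n):
    """Return one virtual packet size (in MB) per subinterval.

    Assumption: the historical sizes are sorted in descending order and
    split into n contiguous groups; each virtual packet is sized at the
    midpoint of its group's largest and smallest values.
    """
    if n <= 0 or n > len(history_sizes_mb):
        raise ValueError("n must be between 1 and the number of recorded sizes")
    ordered = sorted(history_sizes_mb, reverse=True)
    base, extra = divmod(len(ordered), n)
    sizes = []
    start = 0
    for i in range(n):
        # The first `extra` groups take one additional element each.
        end = start + base + (1 if i < extra else 0)
        group = ordered[start:end]
        sizes.append((group[0] + group[-1]) / 2)
        start = end
    return sizes


print(virtual_packet_sizes([6, 5, 4, 3, 2, 1], 2))
```

With recorded sizes of 1 MB to 6 MB and N = 2, the sketch yields packets of 5 MB and 2 MB.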
The following concrete example illustrates the calculation, with the feedback parameter taken as a sum of times.
Suppose the historical packet sizes are 6 MB, 5 MB, 4 MB, 3 MB, 2 MB and 1 MB, and N = 2. The two intervals can then be interval 1, [6 MB, 4 MB], and interval 2, [3 MB, 1 MB]. The distributed device creates two virtual data packets. For ease of description, packet A denotes the virtual data packet for the first interval and packet B the virtual data packet for the second interval; packet A is 5 MB and packet B is 2 MB. Packets A and B are sent in sequence to the M other devices (three APs are used as an example here: AP1, AP2 and AP3). After receiving packet A, AP1 returns ACK(1a), which is received at time $t_{\mathrm{ACK}(1a)}$; packet A was sent at time $t_{1a}$. After receiving packet B, AP1 returns ACK(1b), which is received at time $t_{\mathrm{ACK}(1b)}$; packet B was sent at time $t_{1b}$. The N delays of AP1 are therefore $t_{\mathrm{ACK}(1a)} - t_{1a}$ and $t_{\mathrm{ACK}(1b)} - t_{1b}$. The N delays of AP2 and AP3 are calculated in the same way. For AP1, the average delay is $[(t_{\mathrm{ACK}(1a)} - t_{1a}) + (t_{\mathrm{ACK}(1b)} - t_{1b})] / 2$.
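The delay bookkeeping in this example can be sketched in Python. The data layout (a dictionary mapping each device to its list of send times and list of ACK reception times) and the function names are assumptions made for illustration; the timestamps are invented sample values, not measurements.

```python
from statistics import mean


def group_delays(send_times, ack_times):
    """Delay of each packet: ACK reception time minus send time."""
    return [ack - sent for sent, ack in zip(send_times, ack_times)]


def average_delays(probes):
    """Map each device to the average of its N delays.

    `probes` maps a device name to (send_times, ack_times), both of length N.
    """
    return {
        device: mean(group_delays(send_times, ack_times))
        for device, (send_times, ack_times) in probes.items()
    }


# Hypothetical timestamps in seconds for packets A and B.
probes = {
    "AP1": ([0.0, 1.0], [0.5, 1.25]),
    "AP2": ([0.0, 1.0], [1.0, 2.0]),
}
print(average_delays(probes))
```

Each device ends up with a single average delay, which is the quantity used for allocation in step S103.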
Step S103: web page tasks are allocated to execution devices according to the average of the N delays in each group.
The technical solution provided by the invention allocates web page collection tasks according to average delay: devices with a smaller average delay are allocated more web page collection tasks, and devices with a larger average delay are allocated fewer, which improves efficiency.
Optionally, step S103 can be implemented as follows:
The distributed device allocates a first group of web page collection tasks to the X devices whose average delay falls in a first interval, and allocates a second group of web page collection tasks to the Y devices whose average delay falls in a second interval, where the average delay of the X devices in the first interval is less than the average delay of the Y devices in the second interval, and the first group of web page collection tasks is larger than the second group.
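One possible reading of this allocation rule is sketched below in Python. The text does not say how the two intervals are bounded or how large each task group is, so the single `threshold` separating the intervals, the `first_share` fraction, and the round-robin spread within each interval are all assumptions made for illustration; the function name is hypothetical.

```python
def allocate_by_delay(avg_delay, tasks, threshold, first_share=0.7):
    """Split tasks into a larger first group for low-delay devices and a
    smaller second group for high-delay devices.

    Assumptions: devices with average delay <= threshold form the first
    interval, the rest form the second; tasks are spread round-robin
    inside each interval.
    """
    if not avg_delay:
        raise ValueError("at least one device is required")
    fast = sorted(d for d, t in avg_delay.items() if t <= threshold)
    slow = sorted(d for d, t in avg_delay.items() if t > threshold)

    if not fast or not slow:
        # Only one interval is populated: all tasks go to those devices.
        groups = [(fast or slow, tasks)]
    else:
        # Ensure the first group holds strictly more than half of the tasks.
        cut = max(len(tasks) // 2 + 1, round(len(tasks) * first_share))
        cut = min(cut, len(tasks))
        groups = [(fast, tasks[:cut]), (slow, tasks[cut:])]

    plan = {device: [] for device in avg_delay}
    for devices, group in groups:
        for i, task in enumerate(group):
            plan[devices[i % len(devices)]].append(task)
    return plan


delays = {"AP1": 0.1, "AP2": 0.2, "AP3": 0.9}
urls = [f"u{i}" for i in range(10)]
print(allocate_by_delay(delays, urls, threshold=0.5))
```

In this sample run the two low-delay devices share seven of the ten tasks and the high-delay device receives the remaining three.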
Step S104: the execution device determines the type of the allocated web page task and executes the web page task in the execution mode allocated for that type.
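A minimal Python sketch of type-dependent execution on the execution device follows. The text only states that a single thread or multiple threads are allocated according to the task type; the type names `"light"` and `"heavy"`, the worker counts, and the injected `fetch` callable are hypothetical and stand in for whatever the real system uses.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical mapping from task type to number of worker threads.
WORKERS_BY_TYPE = {"light": 1, "heavy": 4}


def run_task(task_type, urls, fetch):
    """Execute `fetch` on every URL using the mode chosen for the task type.

    Unknown types fall back to single-threaded execution.
    """
    workers = WORKERS_BY_TYPE.get(task_type, 1)
    if workers == 1:
        # Single-threaded mode: process the pages one after another.
        return [fetch(url) for url in urls]
    # Multithreaded mode: pool.map returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, urls))


# A stand-in for real page collection, so the sketch needs no network access.
print(run_task("heavy", ["a", "b", "c"], str.upper))
```

Passing the page-collection function as a parameter keeps the sketch independent of any particular HTTP library.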
Optionally, the method may further include, after step S103:
Once the first web page collection task has been configured, the distributed device broadcasts the first web page collection task to the other devices of the distributed system and receives the confirmation messages returned by the other devices.
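The broadcast-and-confirm step can be sketched in Python as below. The transport is not described in the text, so the `send` callable (assumed to return True when a device confirms) and the function name are hypothetical.

```python
def broadcast_task(task, devices, send):
    """Send `task` to every device and separate confirmed from unconfirmed.

    `send(device, task)` is assumed to return True when the device
    returns a confirmation message.
    """
    confirmed, unconfirmed = [], []
    for device in devices:
        if send(device, task):
            confirmed.append(device)
        else:
            unconfirmed.append(device)
    return confirmed, unconfirmed


# A stand-in transport in which AP2 never confirms.
print(broadcast_task("task-1", ["AP1", "AP2", "AP3"], lambda d, t: d != "AP2"))
```

Tracking the unconfirmed devices separately leaves room for a retry policy, which the text does not specify.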
Referring to Fig. 2, Fig. 2 shows a distributed crawler implementation system according to the second preferred embodiment of the invention. As shown in Fig. 2, the system includes a distributed device 201 and M devices 202, the distributed device being connected to the devices;
The distributed device 201 is used to receive or initiate a task message, the task message being used to allocate web page collection tasks in the distributed system; to send N data packets in sequence to M other devices of the distributed system; to record M groups of N delays for the N data packets returned by the M devices; and to allocate web page tasks according to the average of the N delays in each group;
The M devices 202 are used to receive the allocated web page collection tasks and perform web page collection.
Optionally, the distributed device is further used to allocate a first group of web page collection tasks to the X devices whose average delay falls in a first interval, and to allocate a second group of web page collection tasks to the Y devices whose average delay falls in a second interval, where the average delay of the X devices in the first interval is less than the average delay of the Y devices in the second interval, and the first group of web page collection tasks is larger than the second group.
Optionally, the distributed device is further used to, once the first web page collection task has been configured, broadcast the first web page collection task to the other devices of the distributed system and receive the confirmation messages returned by the other devices.
Referring to Fig. 3, Fig. 3 shows a distributed device 30, including a processor 301, a wireless transceiver 302, a memory 303 and a bus 304. The wireless transceiver 302 is used to send and receive data to and from external devices. There may be one or more processors 301. In some embodiments of the present application, the processor 301, the wireless transceiver 302 and the memory 303 may be connected by the bus 304 or in another manner. The distributed device 30 may be used to perform the steps of Fig. 1. For the meaning of the terms used in this embodiment and for examples, refer to the embodiment corresponding to Fig. 1; details are not repeated here.
The wireless transceiver 302 is used to receive or initiate a task message, the task message being used to allocate web page collection tasks in the distributed system;
The processor 301 is used to send N data packets in sequence to M other devices of the distributed system; to record M groups of N delays for the N data packets returned by the M devices; and to allocate web page tasks according to the average of the N delays in each group.
Program code is stored in the memory 303. The processor 301 is used to call the program code stored in the memory 303 to perform the following operations:
The processor 301 allocates a first group of web page collection tasks to the X devices whose average delay falls in a first interval, and allocates a second group of web page collection tasks to the Y devices whose average delay falls in a second interval, where the average delay of the X devices in the first interval is less than the average delay of the Y devices in the second interval, and the first group of web page collection tasks is larger than the second group.
It should be noted that the processor 301 here may be a single processing element or a collective term for multiple processing elements. For example, the processing element may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application, such as one or more digital signal processors (DSP) or one or more field programmable gate arrays (FPGA).
The memory 303 may be a single storage device or a collective term for multiple storage elements, and is used to store executable program code or the parameters, data and the like required to run the application. The memory 303 may include random access memory (RAM) and may also include non-volatile memory, such as magnetic disk storage or flash memory (Flash).
The bus 304 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus can be divided into an address bus, a data bus, a control bus and so on. For ease of representation, only one thick line is shown in Fig. 3, but this does not mean that there is only one bus or one type of bus.
The terminal may also include an input/output unit connected to the bus 304, so that it is connected through the bus to other components such as the processor 301. The input/output unit can provide an input interface for an operator, so that the operator can select monitoring items through the input interface; it can also be another interface through which other devices can be attached.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. Those skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps can be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that all or some of the steps in the methods of the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, and the storage medium can include a flash disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, or the like.
The content download method and related devices and systems provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementation of the invention, and the description of the above embodiments is only intended to help in understanding the method of the invention and its core idea. Meanwhile, those of ordinary skill in the art may, following the idea of the invention, make changes to the specific implementation and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (10)

1. A method for implementing task distribution in a distributed system, characterized in that the method comprises the following steps:
A distributed device receives or initiates a task message, the task message being used to allocate web page collection tasks in the distributed system;
The distributed device sends N data packets in sequence to M other devices of the distributed system; the distributed device records M groups of N delays for the N data packets returned by the M devices; the distributed device allocates web page tasks to execution devices according to the average of the N delays in each group;
An execution device determines the type of the allocated web page task and executes the web page task in the execution mode allocated for that type.
2. The method according to claim 1, characterized in that the allocation of web page collection tasks by the distributed device according to the delays specifically includes:
The distributed device allocates a first group of web page collection tasks to the X devices whose average delay falls in a first interval, and allocates a second group of web page collection tasks to the Y devices whose average delay falls in a second interval, where the average delay of the X devices in the first interval is less than the average delay of the Y devices in the second interval, and the first group of web page collection tasks is larger than the second group.
3. The method according to claim 1, characterized in that executing the web page task in the execution mode allocated for the type includes:
The execution device allocates a single thread or multiple threads to execute the web page task.
4. An application system for task distribution in a distributed system, characterized in that the system includes a distributed device and M execution devices, the distributed device being connected to the M execution devices;
The distributed device is used to receive or initiate a task message, the task message being used to allocate web page collection tasks in the distributed system; to send N data packets in sequence to M other devices of the distributed system; to record M groups of N delays for the N data packets returned by the M devices; and to allocate web page tasks to the M execution devices according to the average of the N delays in each group;
The M devices are used to receive the allocated web page collection tasks, determine the type of each allocated web page task, and execute the web page task in the execution mode allocated for that type.
5. The system according to claim 4, characterized in that
The distributed device is further used to allocate a first group of web page collection tasks to the X devices whose average delay falls in a first interval, and to allocate a second group of web page collection tasks to the Y devices whose average delay falls in a second interval, where the average delay of the X devices in the first interval is less than the average delay of the Y devices in the second interval, and the first group of web page collection tasks is larger than the second group.
6. The system according to claim 4, characterized in that
The execution device is further used to allocate a single thread or multiple threads to execute the web page task.
7. A distributed device, including a processor, a wireless transceiver, a memory and a bus, the processor, wireless transceiver and memory being connected by the bus, characterized in that
The wireless transceiver is used to receive or initiate a task message, the task message being used to allocate web page collection tasks in the distributed system;
The processor is used to send N data packets in sequence to M other devices of the distributed system; to record M groups of N delays for the N data packets returned by the M devices; and to allocate web page tasks according to the average of the N delays in each group.
8. The server according to claim 7, characterized in that the processor is used to allocate a first group of web page collection tasks to the X devices whose average delay falls in a first interval, and to allocate a second group of web page collection tasks to the Y devices whose average delay falls in a second interval, where the average delay of the X devices in the first interval is less than the average delay of the Y devices in the second interval, and the first group of web page collection tasks is larger than the second group.
9. The server according to claim 7, characterized in that the processor is used to, once the first web page collection task has been configured, broadcast the first web page collection task to the other devices of the distributed system and receive the confirmation messages returned by the other devices.
10. A computer-readable storage medium, characterized in that it stores a computer program for electronic data interchange, where the computer program causes a computer to perform the method according to any one of claims 1-3.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711101744.2A CN107784116A (en) 2017-11-10 2017-11-10 Method and system for implementing task distribution in a distributed system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711101744.2A CN107784116A (en) 2017-11-10 2017-11-10 Method and system for implementing task distribution in a distributed system

Publications (1)

Publication Number Publication Date
CN107784116A true CN107784116A (en) 2018-03-09

Family ID=61433109

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711101744.2A Pending CN107784116A (en) 2017-11-10 2017-11-10 Method and system for implementing task distribution in a distributed system

Country Status (1)

Country Link
CN (1) CN107784116A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102045402A (en) * 2010-12-28 2011-05-04 云海创想信息技术(北京)有限公司 Data acquisition method and data acquisition system of distributed file system
CN103425519A (en) * 2012-05-16 2013-12-04 富士通株式会社 Distributed computing method and distributed computing system
CN104834722A (en) * 2015-05-12 2015-08-12 网宿科技股份有限公司 CDN (Content Delivery Network)-based content management system
US20170068735A1 (en) * 2015-09-08 2017-03-09 MOLBASE (Shanghai) Biotechnology Co., Ltd . Task-crawling system and task-crawling method for distributed crawler system

Similar Documents

Publication Publication Date Title
CN106934027A (en) Distributed reptile realization method and system
CN107819870A (en) Increment pulling data method, apparatus, storage medium, terminal device and server
CN109542512A (en) A kind of data processing method, device and storage medium
CN103338464B (en) Communication means and equipment
CN104484383A (en) JS file processing method and device
CN104571957B (en) A kind of method for reading data and assembling device
CN106130810A (en) Website monitoring method and device
CN107705838A (en) A kind of transmission method of medical image, device, server, medium and system
CN107589991A (en) The webpage distribution method and system of distributed system
CN107784116A (en) Task distributes the realization method and system in distributed system
CN109753012A (en) A kind of processing machine long-range control method, apparatus and system based on cloud platform
CN107679233A (en) Distributed reptile method for allocating tasks and system
CN107679243A (en) Task distributes the application process and system in distributed system
CN107707673A (en) Realization method and system based on webpage task
CN107729153A (en) Web retrieval method for allocating tasks and system
CN106656842A (en) Load balancing method and flow forwarding device
CN105847363A (en) Method and system used for cross-region file sharing
CN107766522A (en) The distribution method and system of task manager in distributed reptile system
CN107800789A (en) The distribution method and system of task manager in distributed reptile system
CN109359799A (en) Declaration form tune form processing method, device, computer equipment and storage medium
CN106506176A (en) A kind of strategy and charging regulation generation method and system
CN107656806A (en) A kind of resource allocation methods and resource allocation device
CN106873470A (en) The statistics and distribution method and system of coil winding machine
CN107562956A (en) Distributed reptile method for allocating tasks and system
CN107305581A (en) Table connection method and distributed data base system in distributed data base system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180309