CN107038069A - Dynamic label matching (DLMS) scheduling method under the Hadoop platform - Google Patents
- Publication number
- CN107038069A CN107038069A CN201710181055.0A CN201710181055A CN107038069A CN 107038069 A CN107038069 A CN 107038069A CN 201710181055 A CN201710181055 A CN 201710181055A CN 107038069 A CN107038069 A CN 107038069A
- Authority
- CN
- China
- Prior art keywords
- node
- label
- cpu
- data
- resource
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
Abstract
The invention discloses a dynamic label matching (DLMS) scheduling method under the Hadoop platform, belonging to the field of computer software. To address the large performance differences between Hadoop cluster nodes, the randomness of resource allocation, and excessively long execution times, the invention proposes a scheduler that dynamically matches node performance labels (hereinafter "node labels") with job classification labels (hereinafter "job labels"). Nodes are first classified and assigned an original node label; each node then detects its own performance indicators to generate a dynamic node label. Jobs are classified from partial job information to generate job labels, and the resource scheduler allocates node resources to jobs carrying the corresponding label. Test results show a substantial reduction in job execution time relative to the schedulers shipped with YARN.
Description
Technical field
The invention belongs to the field of computer software and relates to the design and implementation of a dynamic label matching (DLMS) scheduling method under the Hadoop platform.
Background art
In early Hadoop versions, resource scheduling management and the MapReduce framework were integrated in a single module, resulting in poorly decoupled code that could not be extended well and did not support multiple computation frameworks. The Hadoop open-source community therefore designed and implemented a new generation of Hadoop with a completely new architecture, the Hadoop 2.0 release, in which resource scheduling was extracted into a new resource scheduling framework: YARN. It is well known that, in a given environment, a suitable scheduling algorithm can satisfy user job requests while effectively improving the overall performance of the Hadoop platform and the resource utilization of the system. YARN ships with three schedulers by default: the FIFO Scheduler, the Fair Scheduler, and the Capacity Scheduler. Hadoop uses the FIFO scheduler by default; its first-in-first-out strategy is simple to implement, but it harms short jobs and supports neither shared clusters nor multi-user management. The fair scheduling algorithm proposed by Facebook considers the differing resource demands of different users and jobs and lets users share cluster resources fairly, but its job resource configuration strategy is inflexible, easily wastes resources, and does not support job preemption. The capacity scheduling algorithm proposed by Yahoo supports many queues shared by multiple users and configures computing capacity flexibly, but it does not support job preemption and easily falls into local optima.
In real enterprise production, as a company's data volume grows, new nodes are added to the cluster every year, so the performance of cluster nodes differs significantly; such heterogeneous clusters are very common in enterprise production environments. If a computation-heavy machine-learning task is assigned to a node with poor CPU performance, the overall execution time of the job clearly suffers. The three resource schedulers shipped with Hadoop do not solve this problem well. The present invention proposes a resource scheduling method (DLMS) that dynamically matches node performance labels with job classification labels: machines with relatively good CPU performance are given a CPU label, machines with relatively good disk I/O performance are given an IO label, and machines that are average at both are given a common label. Jobs are likewise labelled CPU, IO, or common according to their classification and then enter the corresponding label queue; the scheduler allocates the resources of a labelled node to jobs carrying the corresponding label whenever possible, thereby reducing job run time, improving resource utilization, and improving overall system efficiency.
Summary of the invention
The scheduling method proposed by the present invention first classifies cluster nodes and assigns each a corresponding label. Before sending a heartbeat, each NodeManager performs self-detection and dynamically adjusts its original label. A machine-learning classification algorithm classifies jobs and assigns them corresponding labels; job order is then determined dynamically from attributes such as the user-set job priority and the job waiting time, and the resources of each label are allocated to jobs in the corresponding label queue.
The scheduling method proposed by the invention mainly comprises the following modules:
(1) Initial classification of cluster nodes and dynamic node labels
Cluster nodes first need an initial classification according to each node's CPU and disk I/O performance. Every node in the cluster runs a task of a specified type in isolation, and the time the node takes to run that task is recorded. Comparing a node's single-task run time against the average run time over all cluster nodes, nodes are divided into CPU-type nodes, disk-I/O-type nodes, and common nodes.
While the cluster is running, if the jobs on a node cause excessive load, the node's label can be downgraded directly to the common label. Consider a node whose initial label is the CPU label and which is running CPU-type tasks: although the node still has some unused resources, its CPU performance advantage in the current environment has been lost. To avoid this situation, a dynamic label method is used: when a NodeManager sends a heartbeat to the ResourceManager, it dynamically detects the CPU and IO utilization of its machine, and if a utilization exceeds its threshold the node's label is set to the common label. This detection is performed on every heartbeat, which realizes dynamic node labels. The thresholds can be configured in a configuration file; if the user does not configure them, system defaults are used.
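As a sketch, the heartbeat-time label check described above might look like the following; the function name `adjust_label` and the default thresholds of 0.9 are illustrative assumptions, since the patent leaves the thresholds to the configuration file or system defaults.

```python
def adjust_label(original_label, cpu_utilization, io_utilization,
                 cpu_threshold=0.9, io_threshold=0.9):
    """Return the label this node should report on the current heartbeat.

    A CPU-labelled node whose CPU is saturated (or an IO-labelled node
    whose disk is saturated) has lost its performance advantage, so it
    is temporarily downgraded to the common label.
    """
    if original_label == "CPU" and cpu_utilization > cpu_threshold:
        return "COMMON"
    if original_label == "IO" and io_utilization > io_threshold:
        return "COMMON"
    return original_label
```

Because the check runs on every heartbeat, a node regains its original label automatically once its utilization falls back below the threshold.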
(2) Collection and return of Map execution information
Hadoop jobs are generally divided into a Map stage and a Reduce stage. A large job commonly has hundreds of map tasks or more, and most of a job's time is spent in the computation of the Map stage; however, every Map task executes the same logic, so the runtime information of the job's first map processes can be collected. The NodeManager delivers this information to the scheduler inside the ResourceManager when it sends a heartbeat, and the scheduler classifies the job according to the returned information. In enterprise production environments, jobs with identical logic are run every day, i.e. the user already knows which label a job belongs to; in that case the job-type label can be set for the job on the command line or in code. The scheduler checks for this at scheduling time: if the user has labelled the job, the classification step is skipped and the job is scheduled directly.
(3) Multi-priority queues
To meet the demands of different users and prevent small jobs from "starving", a job priority scheme is used. Five queues are created in the scheduler: the original queue, the wait priority queue, the CPU priority queue, the IO priority queue, and the common priority queue. A job submitted by a user first enters the original queue, where some of its map tasks are run and their runtime information is collected; the job then enters the wait priority queue until the Map information is returned and the job is classified; finally, according to its classification label, the job enters the queue of the corresponding label.
(4) Job classification
Data must be preprocessed before classification; data preprocessing refers to performing some early processing on the data. Data preprocessing techniques were developed to improve the quality of data mining and include several methods: data cleaning, data integration, data transformation, and data reduction. Applying these techniques before data mining greatly improves the quality of the mined patterns and reduces the time mining actually requires. Here, preprocessing mainly consists of data normalization: each variable is linearly transformed onto a new scale on which the minimum value becomes 0 and the maximum becomes 1, which guarantees that all variable values are at most 1.
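The min-max normalization just described can be sketched as follows; `normalize` is an illustrative name, and the handling of a constant column is an assumption the text does not address.

```python
def normalize(values):
    """Min-max normalization: linearly map each value onto [0, 1].

    After the transform the minimum becomes 0 and the maximum becomes 1,
    so every value is at most 1, as the preprocessing step requires.
    """
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant column: map everything to 0 (assumption)
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```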
For the classification itself, the simple, widely used Naive Bayes classifier, which also classifies well, is chosen. If the user has already added the job's type on the command line or in the task code, this step is skipped and the job enters the corresponding queue directly to wait for resources.
(5) Data locality
Hadoop follows the principle that "moving computation is better than moving data": moving the computation to the node that holds the data costs less and performs better than moving the data to a compute node. For data locality, the present invention adopts a delay degradation scheduling strategy.
The beneficial effects are:
1. The present invention proposes a dynamic label matching scheduling method for heterogeneous cluster environments. Nodes and jobs are classified, and job priority is computed jointly from attributes of the job itself and of the submitting user. When allocating resources, resources and nodes of the same type are matched, and, taking into account the relation between a node's performance and its current task load, a self-detection method dynamically adjusts node labels. The performance of the algorithm is analysed through experiments.
2. For the data locality problem, the present invention proposes a delay degradation algorithm. Degradation distinguishes three levels: the current node, the same rack, and a random node. By delaying scheduling within bounded thresholds and lowering the locality level only when the thresholds are exceeded, good data locality is obtained at a controlled delay.
3. Using the dynamic label method, the present invention first runs jobs of different types in advance and classifies each node by comparing its single-node run time with the average time over all cluster nodes; then, according to the load imposed by the tasks the node is running, the node self-detects its performance and generates a corresponding new label.
4. The present invention proposes classifying jobs: because the Map parts of a MapReduce job all execute the same processing logic, a job can be classified from the partial information of the maps that are run in advance.
Brief description of the drawings
Fig. 1: overall job scheduling framework flow chart;
Fig. 2: scheduling algorithm flow chart;
Fig. 3: comparison chart of the total run time of the three kinds of jobs under different scheduling algorithms;
Fig. 4: Container distribution counts under DLMS with a 500M data volume;
Fig. 5: Container distribution counts under DLMS with a 1G data volume;
Fig. 6: Container distribution counts under DLMS with a 1.5G data volume;
Fig. 7: comparison chart of job-group run time under different scheduling algorithms;
Embodiments
To make the purpose, technical scheme and features of the present invention clearer, the invention is explained in further detail below in combination with specific embodiments and with reference to the accompanying drawings. The YARN scheduling framework is shown in Fig. 1.
The steps are explained as follows:
(1) The user submits an application to YARN, including the user program and the command to start the ApplicationMaster.
(2) The ResourceManager allocates the first Container for the application and communicates with the corresponding NodeManager, asking it to start the application's ApplicationMaster.
(3) After the ApplicationMaster registers with the ResourceManager, it applies for resources for each task and monitors their running status until the run ends.
(4) Before sending a heartbeat, the NodeManager performs self-detection to generate its dynamic node label, and reports its resources to the ResourceManager.
(5) Classified tasks enter different label queues and are ordered by priority to await resource allocation.
(6) The ApplicationMaster applies for and obtains resources from the ResourceManager via an RPC protocol.
(7) According to the node labels and resources reported by the NodeManagers, the scheduler allocates a node's resources to jobs in the queue of the corresponding label.
(8) Once the ApplicationMaster has obtained resources, it communicates with the corresponding NodeManager, asking it to start tasks.
(9) The NodeManager sets up the running environment for the task (environment variables, JAR packages, binary programs, etc.), writes the task start command into a script, and starts the task by running that script.
(10) Each task reports its state and progress to the ApplicationMaster through an RPC protocol, so that failed tasks can be restarted.
(11) After the application finishes running, the ApplicationMaster deregisters from the ResourceManager and shuts itself down.
The cluster's physical nodes are first classified initially; the classification procedure is as follows:
(1) Let the set of cluster nodes be N = {N_i | i ∈ [1, n]}, where n is the total number of nodes, i is a positive integer starting from 1, and N_i denotes the i-th physical machine in the cluster.
(2) Run a CPU-type, an IO-type, and a common job of identical task size on every node and record the job execution times: T_cpu(i) denotes the time cost of executing the CPU job on node N_i, T_io(i) the time cost of executing the IO job on node N_i, and T_com(i) the time cost of executing the common job on node N_i.
(3) Compute the cluster average time of each kind of job:

Avg_j = (1/n) * Σ_{i=1}^{n} T_j(i),  j ∈ {cpu, io, com}

where j denotes the job type. For each node, compare its time under each job type with the average: if T_cpu(i) < Avg_cpu, the node receives the CPU-type original label; if T_cpu(i) > Avg_cpu, the node receives the common original label (and likewise for IO). A node may therefore end up with more than one candidate label; the label under which the node saves the most time is chosen as the node's final label.
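The initial labelling rule above can be sketched as follows, under the assumption that the measured run times are available as lists keyed by job type; treating "com" as the default when a node beats no average is how the tie-breaking rule is read here.

```python
def initial_labels(times):
    """Assign each node its original label from measured run times.

    times: dict mapping job type ("cpu", "io", "com") to a list of run
    times, one per node. A node gets the specialty label (cpu or io)
    under which it saves the most time versus the cluster average,
    otherwise the common label "com".
    """
    n = len(next(iter(times.values())))
    avg = {j: sum(t) / n for j, t in times.items()}   # Avg_j per job type
    labels = []
    for i in range(n):
        # Candidate labels: specialty types the node runs faster than average.
        savings = {j: avg[j] - times[j][i]
                   for j in ("cpu", "io") if times[j][i] < avg[j]}
        labels.append(max(savings, key=savings.get) if savings else "com")
    return labels
```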
Let the Map runtime information be M; it contains the following information to be collected: M = {MIn, MOut, Rate, Acpu, Mcpu, Zcpu, MRate}, where MIn denotes the map input data volume, MOut the map output data volume, Rate the ratio of input data volume to output data volume, Acpu the average CPU utilization, Mcpu the median CPU utilization, Zcpu the average of CPU utilization samples exceeding 90%, and MRate the memory usage. These data serve as the feature attributes of the subsequent job classification.
During experiments it was found that simply computing the average CPU time does not reflect the characteristics of a job well: the number of times the CPU utilization of a CPU-type job exceeds 90% is relatively large, while for other job types it is relatively small, so this information is also added to the information returned by the maps.
For queue priority, a user-defined two-layer weight design is used. The weight of the job size attribute is worthNum, where the attribute falls into three classes, num ∈ {long, mid, short}; the weight of the job owner attribute is worthUser, where the attribute is divided into two grades, user ∈ {root, others}; the weight of the job's urgency is worthEmergence, where the attribute is divided into three grades, priority ∈ {highPriority, midPriority, lowPriority}; and the weight of the job's waiting time is worthWait, where the waiting time is computed as waitTime = nowTime − submitTime. Each attribute is assigned its weight, the priority number of each task is computed, and the tasks are then sorted within their queue. The four task attribute weights sum to 100%; the specific formula is as follows:

worthNum + worthUser + worthEmergence + worthWait = 100%

The final weight calculation formula is:

finalWorth = worthNum*num + worthUser*user + worthEmergence*priority + worthWait*waitTime
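A sketch of the two-layer weighted priority computation above; the numeric scores assigned to each attribute grade and the example weight split are assumptions, since the patent fixes only the structure of the formula and that the four weights sum to 100%.

```python
# Illustrative grade-to-score mappings (assumptions, not from the patent).
NUM_SCORE = {"long": 0.2, "mid": 0.6, "short": 1.0}
USER_SCORE = {"root": 1.0, "others": 0.5}
PRIORITY_SCORE = {"highPriority": 1.0, "midPriority": 0.6, "lowPriority": 0.2}

def final_worth(job, now, weights=(0.3, 0.2, 0.3, 0.2)):
    """finalWorth = worthNum*num + worthUser*user
                  + worthEmergence*priority + worthWait*waitTime"""
    w_num, w_user, w_emergence, w_wait = weights      # must sum to 1.0 (100%)
    wait_time = now - job["submitTime"]               # waitTime = nowTime - submitTime
    return (w_num * NUM_SCORE[job["size"]]
            + w_user * USER_SCORE[job["user"]]
            + w_emergence * PRIORITY_SCORE[job["priority"]]
            + w_wait * wait_time)
```

Jobs within a queue would then be sorted by this value in descending order; since waitTime grows without bound, a waiting job's priority rises over time, which is what prevents starvation.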
For job classification, a Naive Bayes classifier is used; the specific classification steps are as follows:
(1) For a job with attribute values V_1, V_2, …, V_n, compute the conditional probability that it is a CPU-type, IO-type, or common job:

P(job = lab_cpu | V_1, V_2, …, V_n)
P(job = lab_io | V_1, V_2, …, V_n)
P(job = lab_com | V_1, V_2, …, V_n)

where job ∈ {cpu, io, com} denotes the job classification label and V_i is an attribute feature of the job.
(2) By the Bayes formula P(B|A) = P(A|B)P(B)/P(A):

P(job = lab_j | V_1, …, V_n) = P(V_1, …, V_n | lab_j) P(lab_j) / P(V_1, …, V_n)

Assuming the V_i are mutually independent, the independence assumption gives

P(V_1, …, V_n | lab_j) = Π_{i=1}^{n} P(V_i | lab_j)

(3) In the actual computation, P(V_1, V_2, …, V_n) is independent of the job label and can be ignored, so finally

P(job = lab_cpu | V_1, …, V_n) ∝ P(lab_cpu) Π_{i=1}^{n} P(V_i | lab_cpu)

and similarly for lab_io and lab_com. Whether a job is a CPU-type, IO-type, or common job depends on which probability value is largest.
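A minimal categorical Naive Bayes classifier in the spirit of the steps above; discretizing the Map attributes into categories, and the add-one smoothing used to avoid zero probabilities, are assumptions beyond what the patent specifies.

```python
from collections import Counter, defaultdict
import math

class NaiveBayes:
    """Categorical Naive Bayes: pick the label maximizing P(lab) * prod P(V_i | lab)."""

    def fit(self, samples, labels):
        self.labels = sorted(set(labels))
        self.prior = Counter(labels)                 # counts per label -> P(lab)
        self.n = len(labels)
        self.counts = defaultdict(Counter)           # (label, attr_index) -> value counts
        for x, y in zip(samples, labels):
            for i, v in enumerate(x):
                self.counts[(y, i)][v] += 1

    def predict(self, x):
        def log_post(y):
            s = math.log(self.prior[y] / self.n)
            for i, v in enumerate(x):
                c = self.counts[(y, i)]
                # add-one smoothing so unseen attribute values keep nonzero probability
                s += math.log((c[v] + 1) / (sum(c.values()) + len(c) + 1))
            return s
        return max(self.labels, key=log_post)
```

Working in log space avoids underflow from multiplying many small probabilities, and dropping the shared denominator P(V_1, …, V_n) corresponds to step (3) above.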
Locality here adopts a delay degradation scheduling strategy, whose concrete mechanism is as follows: a delay attribute is added for each job. Let T_i be the current delay count of the i-th job, i ∈ [1, n], where n is the number of nodes in the cluster; let T_local denote the local-node delay threshold and T_rack the rack delay threshold. When the scheduler allocates a resource to a job and the job's execution node and data input node are not the same node, T_i is incremented by 1, indicating that the job's scheduling has been delayed once, and the resource is allocated to another suitable job. Once T_i > T_local, the job's locality is lowered to rack locality, and any node in the same rack may then be allocated to the job; once T_i > T_rack, the job's locality is lowered to a random node. Both T_local and T_rack are configured by the user in a configuration file according to the cluster situation. This delay scheduling strategy guarantees good locality within a bounded delay.
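The threshold logic of the delay degradation strategy can be sketched as a small decision function; the level names mirror YARN's locality terms, and keeping NODE_LOCAL while the count has not yet exceeded the threshold follows the text's "until T_i > T_local" wording.

```python
def allowed_locality(delay_count, t_local, t_rack):
    """Map a job's delay count T_i to its currently acceptable locality level."""
    if delay_count <= t_local:       # still waiting for a data-local node
        return "NODE_LOCAL"
    if delay_count <= t_rack:        # degraded: any node in the same rack
        return "RACK_LOCAL"
    return "ANY"                     # degraded again: any node in the cluster
```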
The basic idea of the DLMS scheduling method is to run part of a job in advance, classify the job according to the information the run returns, and then allocate the resources of labelled nodes to tasks in the corresponding queue. The basic procedure is:
Step 1: when a node reports its resources to the resource manager via a heartbeat, if the original queue is not empty, traverse the jobs in the original queue; any job whose type label was specified on the command line or in the program is moved into the corresponding label priority queue and removed from the original queue.
Step 2: if the original queue is not empty, allocate the resources of this node to the original queue; the job enters the wait queue to await the next resource allocation, the original queue removes this job, and this allocation round ends.
Step 3: if the wait priority queue is not empty, classify the jobs in the wait priority queue into their corresponding label priority queues.
Step 4: if the job queue corresponding to the node's performance label is not empty, allocate the resources of this node to that queue, and this allocation round ends.
Step 5: maintain a counter of how many times a resource has been offered without being matched; if it exceeds the number of nodes in the cluster, allocate the node's resources to queues in the priority order CPU, IO, common, wait, and this scheduling round ends. This step prevents situations such as the following: the CPU queue holds too many jobs, exhausting CPU-type node resources, while nodes of other labels still have resources that jobs cannot be allocated.
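Steps 1-5 can be condensed into a sketch of one allocation round; the queue dictionary, job dicts, and `unmatched_offers` counter are simplified stand-ins for the patent's actual YARN structures (step 3's wait-queue classification is elided).

```python
FALLBACK_ORDER = ["cpu", "io", "com", "wait"]

def allocate(node, queues, unmatched_offers, max_offers):
    """One allocation round for a node reporting resources via heartbeat."""
    # Step 1: jobs with a user-supplied label skip classification entirely.
    for job in list(queues["original"]):
        if job.get("label"):
            queues["original"].remove(job)
            queues[job["label"]].append(job)
    # Step 2: unclassified jobs get the resource to run their sample maps,
    # then wait for classification once the Map information returns.
    if queues["original"]:
        job = queues["original"].pop(0)
        queues["wait"].append(job)
        return job
    # Step 4: prefer the queue matching this node's label.
    if queues[node["label"]]:
        return queues[node["label"]].pop(0)
    # Step 5: starvation guard - after too many unmatched offers, hand the
    # resource to the first non-empty queue in a fixed fallback order.
    if unmatched_offers > max_offers:
        for name in FALLBACK_ORDER:
            if queues[name]:
                return queues[name].pop(0)
    return None                      # resource not allocated this round
```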
The flow chart of the algorithm is shown in Fig. 2.
Experimental environment
This section verifies the actual effect of the DLMS scheduler proposed herein through experiments. The experimental environment is a fully distributed Hadoop cluster built from 5 PCs; the uniform node configuration is the Ubuntu 12.04.1 operating system, JDK 1.6, Hadoop 2.5.1, 2 GB of memory, and a 50 GB hard disk. The NameNode has 2 CPU cores; dataNode1 has 2 CPU cores, dataNode2 has 4, dataNode3 has 2, and dataNode4 has 4.
Experimental results and explanation
First, a WordCount job (IO type) and a Kmeans job (CPU type), each with a 128M data volume, were prepared and run 6 times on each of the 4 data nodes, recording the job run times. In Table 1, s denotes the time unit of seconds, avg denotes the node's average time running the task of the respective label, and allAvg denotes the overall average time of all nodes running that task. The calculation formula of rate is:

rate = (avg − allAvg) / allAvg × 100%

A negative sign indicates that the node's average time is below the overall average time; a positive sign indicates it is above.
As can be seen from Table 1, dataNode1 saves time on both tasks; the CPU job, on which it saves the most, is taken as the machine's original label. dataNode2 receives the IO label, and dataNode3 and dataNode4 are common machines.
Table 1: initial classification experiment results
Experimental results and analysis
Jobs whose types can be clearly distinguished were used. WordCount needs to read large amounts of data and write intermediate data in the Map stage, and neither its Map stage nor its Reduce stage involves heavy arithmetic, so this job is characterized as an IO-type job. Kmeans needs to compute a large number of point-to-point distances in both the Map and Reduce stages and writes little intermediate data, so it is characterized as a CPU-type job. TopK writes no large amount of data to disk in the Reduce stage and performs no heavy computation, involving only simple comparisons, so it is treated as a common-type task.
Verification used two groups of experiments. In the first group, the scheduler was set to FIFO, and WordCount, Kmeans and TopK jobs were each run 3 times under data volumes of 500M, 1G and 1.5G, recording the average of the 3 run times of each job as its final time; the scheduler was then switched to the Capacity and DLMS schedulers and the same experimental procedure repeated. The experiments recorded the distribution over the cluster of each kind of job's Containers under the DLMS scheduler. A Container is the unit into which cluster resources are divided, and the recorded distribution reflects where the job's splits ran in the cluster, since in YARN every Map and Reduce process is represented by a Container; the proportion of Containers a node receives indicates the share of the job's tasks that the node executed. In Fig. 3 the abscissa is the job data volume and the ordinate is the total time of running the 3 kinds of jobs WordCount, Kmeans and TopK together. As the data volume increases, the DLMS scheduler saves about 10%-20% of the time compared with the other schedulers, because DLMS can allocate the resources of labelled nodes to jobs with the corresponding label. The Map and Reduce tasks of a job run on nodes in the form of Containers, and Figs. 4 to 6 show the Container quantities for the different data volumes under the DLMS scheduler. According to the initial classification of the previous section, Node1 is the CPU-type label node, Node2 and Node3 are common label nodes, and Node4 is the IO label node; WordCount is an IO-type job, TopK a common job, and Kmeans a CPU-type job. As can be seen from the figures, the Container distribution follows the expected rule: WordCount jobs receive comparatively many Containers on Node4, TopK maps comparatively many onto the common nodes Node2 and Node3, and Kmeans maps comparatively many onto Node1. These distributions of the Containers of different jobs over the cluster nodes show that the DLMS scheduler raises the probability that the resources of a labelled node are allocated to jobs of the corresponding label.
In the second group of experiments, 5 jobs were prepared: WordCount jobs with 128M and 500M data volumes, Kmeans jobs with 128M and 500M data volumes, and a TopK job with a 500M data volume, forming one job group. The 5 jobs were submitted simultaneously, simulating continuous production operation in clusters with different schedulers, and the total time for the job group to complete was recorded; the job group was run 3 times under each scheduler. The concrete results are shown in Fig. 7. As Fig. 7 shows, the time saved by the DLMS scheduler proposed herein over Hadoop's bundled schedulers on the same job group is evident: the proposed DLMS scheduler saves about 20% of the time compared with Hadoop's bundled FIFO scheduler, and about 10% of the run time compared with the Capacity scheduler.
Claims (2)
- 1. A dynamic label matching (DLMS) scheduling method under the Hadoop platform, characterized in that: cluster nodes receive an initial classification and dynamic classification labels. Cluster nodes first need an initial classification according to each node's CPU and disk I/O performance: every node in the cluster runs a task of a specified type in isolation and records the time it takes, and, comparing the node's single-task run time with the average run time over all cluster nodes, nodes are divided into CPU-type nodes, disk-I/O-type nodes, and common nodes. While the cluster is running, if the jobs on a node cause excessive load, the node's label can be downgraded directly to the common label. For a node whose initial label is the CPU label and which is running CPU-type tasks, although the node still has some unused resources, its CPU performance advantage in the current environment has been lost; to avoid this, a dynamic label method is adopted: when the NodeManager sends a heartbeat to the ResourceManager, it dynamically detects the machine's CPU and IO utilization, and if a utilization exceeds its threshold the node's label is set to the common label; this detection is performed on every heartbeat, realizing dynamic node labels; the thresholds can be configured in a configuration file, and system defaults are used if the user does not configure them. (1) Collection and return of Map execution information: Hadoop jobs are generally divided into a Map stage and a Reduce stage; a large job commonly has hundreds of map tasks or more, and most of a job's time is spent in the computation of the Map stage, yet every Map executes the same logic, so the runtime information of the job's first map processes can be collected; the NodeManager delivers this information to the scheduler inside the ResourceManager when sending a heartbeat, and the scheduler classifies the job according to the returned information; in enterprise production environments jobs with identical logic run every day, i.e. the user knows which label a job belongs to and can set the job-type label on the command line or in code; the scheduler checks for this at scheduling time and, if the user has labelled the job, skips the classification step and schedules the job directly. (2) Multi-priority queues: to meet the demands of different users and prevent small jobs from "starving", a job priority scheme is used; five queues are created in the scheduler: the original queue, the wait priority queue, the CPU priority queue, the IO priority queue, and the common priority queue; a job submitted by a user first enters the original queue, where part of its maps are run and their information collected; the job then enters the wait priority queue until the Map information returns and the job is classified; finally, according to its classification label, the job enters the queue of the corresponding label. (3) Job classification: data must be preprocessed before classification; data preprocessing refers to performing some early processing on the data; data preprocessing techniques, developed to improve the quality of data mining, include data cleaning, data integration, data transformation, and data reduction; applying these techniques before data mining greatly improves the quality of the mined patterns and reduces the time mining requires; preprocessing here mainly consists of data normalization: each variable is linearly transformed onto a new scale on which the minimum value becomes 0 and the maximum is 1
1, this Sample ensures that all variable datas are both less than equal to 1;Have selected in terms of job class it is simple, entered using the preferable Naive Bayes Classifier of commonplace and classifying quality Row classification;If user is in order line and task code if the type of added operation, the step can save, directly Into the medium resource to be allocated of corresponding queue;(4) data localityIt is " mobile computing is more preferable than mobile data " that a principle is followed in Hadoop, and the calculate node for being moved to placement data will Cost is more saved than moving data to a calculate node, performance is more preferable;On data locality, this invention takes prolong When degrade scheduling strategy.
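The pre-classification and dynamic-label rules of claim 1 can be sketched as follows. This is an illustrative Python sketch, not the patented implementation: the benchmark times, the 90% threshold default, and all function names are assumptions.

```python
from statistics import mean

def initial_labels(times):
    """times: {node: {"cpu": t, "io": t, "common": t}} benchmark run times.
    A node gets a CPU or IO label when it beats the cluster average for
    that job type; if it beats several averages, the label saving the
    most time wins; otherwise it falls back to the common label."""
    avg = {j: mean(t[j] for t in times.values()) for j in ("cpu", "io", "common")}
    labels = {}
    for node, t in times.items():
        savings = {j: avg[j] - t[j] for j in ("cpu", "io") if t[j] < avg[j]}
        labels[node] = max(savings, key=savings.get) if savings else "common"
    return labels

def heartbeat_label(label, cpu_util, io_util, threshold=0.9):
    """Re-evaluated on every NodeManager heartbeat: an overloaded CPU/IO
    node is demoted to the common label (threshold is user-configurable,
    with a default if unset)."""
    if label != "common" and (cpu_util > threshold or io_util > threshold):
        return "common"
    return label

times = {
    "n1": {"cpu": 80, "io": 120, "common": 100},
    "n2": {"cpu": 120, "io": 70, "common": 100},
    "n3": {"cpu": 110, "io": 115, "common": 95},
}
labels = initial_labels(times)
print(labels)                              # {'n1': 'cpu', 'n2': 'io', 'n3': 'common'}
print(heartbeat_label("cpu", 0.95, 0.4))   # overloaded -> demoted to 'common'
```

Note that demotion here is one-way within a heartbeat, matching the claim: a degraded node simply carries the ordinary label until a later detection cycle would assign otherwise.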
- 2. The dynamic label matching DLMS scheduling method under the Hadoop platform according to claim 1, characterized in that: (1) a user submits an application program to YARN, including the user program and the command to start the ApplicationMaster; (2) the ResourceManager allocates the first Container to the application and communicates with the corresponding NodeManager, asking it to start the application's ApplicationMaster; (3) after the ApplicationMaster registers with the ResourceManager, it applies for resources for each task and monitors their running state until the run ends; (4) before sending a heartbeat, the NodeManager performs self-detection and generates its dynamic node label, and reports its resources to the ResourceManager; (5) jobs are classified into the different label queues and sorted by priority while waiting for resource allocation; (6) the ApplicationMaster applies for and obtains resources from the ResourceManager through an RPC protocol; (7) according to the node label and resources reported by the NodeManager, the scheduler allocates the node's resources to jobs of the queue with the corresponding label; (8) after the ApplicationMaster has obtained resources, it communicates with the corresponding NodeManager, asking it to start the task; (9) after the NodeManager sets up the running environment for the task (environment variables, JAR packages, binary programs, etc.), it writes the task start command into a script and starts the task by running the script; (10) each task reports its state and progress to the ApplicationMaster through some RPC protocol, so that a failed task can be restarted; (11) after the application program finishes running, the ApplicationMaster deregisters from the ResourceManager and shuts itself down.
The cluster's physical nodes are first pre-classified; the classification procedure is as follows: (1) let the set of cluster machine nodes be N = {Ni | i ∈ [1, n]}, where n is the total number of nodes, i is a positive integer from 1 to n, and Ni represents the i-th physical machine in the cluster; (2) CPU, IO, and ordinary jobs of identical task size are run on every node and the job execution times are recorded: Tcpu(i) denotes the time spent executing the CPU job on node Ni, Tio(i) denotes the time spent executing the IO job on node Ni, and Tcom(i) denotes the time spent executing the ordinary job on node Ni; (3) the cluster average time of each kind of job is computed, with j denoting the job type, as Avgj = (1/n)·ΣTj(i), summing over i = 1 to n; for each node, the difference between its time and the average time under each job type is then computed: if Tcpu(i) < Avgcpu, the node is given the original CPU-type label; if Tcpu(i) > Avgcpu, the node is given the ordinary original label (and likewise for the IO type). By this procedure a node may carry multiple labels; the label that saves the most time is selected as the node's final label.
Let the Map runtime information be M; it includes the following collected attributes: M = {MIn, MOut, Rate, Acpu, Mcpu, Zcpu, MRate}, where MIn represents the map input data volume, MOut the map output data volume, Rate the ratio of input data volume to output data volume, Acpu the average CPU utilization, Mcpu the median CPU utilization, Zcpu the mean of the CPU utilization samples above 90%, and MRate the memory usage; these data serve as the feature attributes for subsequent job classification.
For queue priority, a user-defined two-layer weight design is adopted. The weight of the job-size attribute is worthNum, the attribute falling into three classes num ∈ {long, mid, short}; the weight of the job-owner attribute is worthUser, the attribute divided into two grades user ∈ {root, others}; the weight of the job-urgency attribute is worthEmogence, the attribute divided into three grades priority ∈ {highPriority, midPriority, lowPriority}; the weight of the job waiting time is worthWait, with waitTime computed as waitTime = nowTime − submitTime. Corresponding weights are assigned, the priority number of each task is computed, and tasks are then sorted within their queue. The task-attribute weights above sum to 100%; specifically: worthNum + worthUser + worthEmogence + worthWait = 100%. The final weight is computed as: finalWorth = worthNum·num + worthUser·user + worthEmogence·priority + worthWait·waitTime.
For job classification, a naive Bayes classifier is used; the classification steps are as follows: (1) compute the conditional probability that a job is a CPU, IO, or ordinary job under the given conditions: P(job = labcpu | V1, V2, …, Vn), P(job = labio | V1, V2, …, Vn), P(job = labcom | V1, V2, …, Vn), where job ∈ {cpu, io, com} denotes the job classification label and Vi is a feature attribute of the job; (2) by the Bayes formula P(B | A) = P(AB)/P(A), and assuming the Vi are mutually independent, each probability expands under the independence assumption as P(job = lab | V1, …, Vn) = P(job = lab)·ΠP(Vi | job = lab) / P(V1, …, Vn); (3) in the actual computation, P(V1, V2, …, Vn) is independent of the job class and can be ignored, so finally P(job = lab | V1, …, Vn) ∝ P(job = lab)·ΠP(Vi | job = lab), and similarly for the other labels; the job is classified as a CPU-type, IO-type, or ordinary job according to which probability value is largest.
For locality, a delay-degradation scheduling strategy is adopted; the concrete idea of the strategy is as follows: a delay-time attribute is added for each job. Let Ti be the current delay count of the i-th job, i ∈ [1, n], with n the number of nodes in the cluster; Tlocal denotes the local-node delay threshold and Track denotes the rack-node delay threshold. When the scheduler allocates resources to a job, if the job's execution node and its data-input node are not the same node, Ti is incremented by 1, indicating the job has been delayed once, and the resource is allocated to another suitable job; once Ti > Tlocal, the job's locality requirement is degraded to rack locality, so that any node in the same rack may allocate resources to the job; once Ti > Track, the job's locality is degraded to an arbitrary node. Tlocal and Track are both configured by the user in a configuration file according to the cluster's situation. With this delay scheduling strategy, good locality can be obtained within a bounded delay time.
The basic idea of the DLMS scheduling method is to run part of a job in advance, classify the job according to the information returned by that run, and then allocate the resources of nodes carrying the corresponding label to the tasks in the corresponding queue. The basic flow is: Step 1: when a node reports resources to the resource manager via heartbeat, if the original queue is not empty, traverse the jobs in the original queue; jobs whose type label was specified on the command line or in the program are assigned to the corresponding label priority queue, and the original queue removes them. Step 2: if the original queue is still not empty, allocate this node's resources to the original queue; the job enters the wait queue to wait for the next resource allocation, the original queue removes the job, and this allocation round ends. Step 3: if the wait-priority queue is not empty, sort the jobs in it into their corresponding label priority queues. Step 4: if the job-classification queue corresponding to the node's performance label is not empty, allocate this node's resources to that queue, and this round ends. Step 5: maintain and check a resource-visit counter variable; if it exceeds the number of nodes in the cluster, allocate the node's resources to the queues in the order CPU, IO, ordinary, wait priority, and this scheduling round ends. This step prevents situations such as the following: the CPU queue has too many jobs, exhausting the resources of CPU-type nodes, while nodes with other labels still have resources but jobs cannot be allocated them.
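The two-layer weight scheme of claim 2 can be illustrated with a small sketch. The numeric encodings of the attribute grades and the example weight values are assumptions — the patent leaves both to the user, requiring only that the weights sum to 100%:

```python
def final_worth(num, user, priority, wait_time,
                worth_num=0.3, worth_user=0.2,
                worth_emergence=0.3, worth_wait=0.2):
    """finalWorth = worthNum*num + worthUser*user
                  + worthEmogence*priority + worthWait*waitTime,
    with the four attribute weights summing to 100%."""
    assert abs(worth_num + worth_user + worth_emergence + worth_wait - 1.0) < 1e-9
    return (worth_num * num + worth_user * user
            + worth_emergence * priority + worth_wait * wait_time)

# Assumed grade encodings: job size short=3 / mid=2 / long=1,
# owner root=2 / others=1, urgency high=3 / mid=2 / low=1;
# wait_time = nowTime - submitTime, normalized to [0, 1] here.
jobs = {
    "job_a": final_worth(num=3, user=2, priority=3, wait_time=0.1),
    "job_b": final_worth(num=1, user=1, priority=1, wait_time=0.9),
}
ranked = sorted(jobs, key=jobs.get, reverse=True)
print(ranked)  # ['job_a', 'job_b']: a small, root-owned, urgent job
               # outranks a long-waiting large one
```

With these encodings, the waiting-time term still grows without bound as waitTime increases, so a long-starved job eventually overtakes fresher high-priority jobs — the anti-starvation property the multi-priority queues aim for.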
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710181055.0A CN107038069B (en) | 2017-03-24 | 2017-03-24 | Dynamic label matching DLMS scheduling method under Hadoop platform |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710181055.0A CN107038069B (en) | 2017-03-24 | 2017-03-24 | Dynamic label matching DLMS scheduling method under Hadoop platform |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107038069A true CN107038069A (en) | 2017-08-11 |
CN107038069B CN107038069B (en) | 2020-05-08 |
Family
ID=59534217
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710181055.0A Expired - Fee Related CN107038069B (en) | 2017-03-24 | 2017-03-24 | Dynamic label matching DLMS scheduling method under Hadoop platform |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107038069B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2013015942A1 (en) * | 2011-07-28 | 2013-01-31 | Yahoo! Inc. | Method and system for distributed application stack deployment |
CN104915407A (en) * | 2015-06-03 | 2015-09-16 | 华中科技大学 | Resource scheduling method under Hadoop-based multi-job environment |
Non-Patent Citations (1)
Title |
---|
XU, PENG: "Research on Optimization of Job Scheduling Algorithms on Cloud Computing Platforms" (in Chinese), China Master's Theses Full-text Database, Information Science and Technology Series * |
Cited By (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107766150A (en) * | 2017-09-20 | 2018-03-06 | 电子科技大学 | A kind of job scheduling algorithm based on hadoop |
CN108052443A (en) * | 2017-10-30 | 2018-05-18 | 北京奇虎科技有限公司 | A kind of test assignment dispatching method, device, server and storage medium |
CN107832153B (en) * | 2017-11-14 | 2020-12-29 | 北京科技大学 | Hadoop cluster resource self-adaptive allocation method |
CN107832153A (en) * | 2017-11-14 | 2018-03-23 | 北京科技大学 | A kind of Hadoop cluster resources self-adapting distribution method |
CN107832134A (en) * | 2017-11-24 | 2018-03-23 | 平安科技(深圳)有限公司 | multi-task processing method, application server and storage medium |
CN107832134B (en) * | 2017-11-24 | 2021-07-20 | 平安科技(深圳)有限公司 | Multitasking method, application server and storage medium |
CN108509280A (en) * | 2018-04-23 | 2018-09-07 | 南京大学 | A kind of Distributed Calculation cluster locality dispatching method based on push model |
CN108509280B (en) * | 2018-04-23 | 2022-05-31 | 南京大学 | Distributed computing cluster locality scheduling method based on push model |
CN110532085B (en) * | 2018-05-23 | 2022-11-04 | 阿里巴巴集团控股有限公司 | Scheduling method and scheduling server |
CN110532085A (en) * | 2018-05-23 | 2019-12-03 | 阿里巴巴集团控股有限公司 | A kind of dispatching method and dispatch server |
CN108959580A (en) * | 2018-07-06 | 2018-12-07 | 深圳市彬讯科技有限公司 | A kind of optimization method and system of label data |
WO2020034646A1 (en) * | 2018-08-17 | 2020-02-20 | 华为技术有限公司 | Resource scheduling method and device |
WO2020119117A1 (en) * | 2018-12-14 | 2020-06-18 | 平安医疗健康管理股份有限公司 | Distributed computing method, apparatus and system, device and readable storage medium |
CN111930493A (en) * | 2019-05-13 | 2020-11-13 | 中国移动通信集团湖北有限公司 | NodeManager state management method and device in cluster and computing equipment |
CN111930493B (en) * | 2019-05-13 | 2023-08-01 | 中国移动通信集团湖北有限公司 | NodeManager state management method and device in cluster and computing equipment |
CN110278257A (en) * | 2019-06-13 | 2019-09-24 | 中信银行股份有限公司 | A kind of method of mobilism configuration distributed type assemblies node label |
CN111124765A (en) * | 2019-12-06 | 2020-05-08 | 中盈优创资讯科技有限公司 | Big data cluster task scheduling method and system based on node labels |
CN112039709A (en) * | 2020-09-02 | 2020-12-04 | 北京首都在线科技股份有限公司 | Resource scheduling method, device, equipment and computer readable storage medium |
CN112039709B (en) * | 2020-09-02 | 2022-01-25 | 北京首都在线科技股份有限公司 | Resource scheduling method, device, equipment and computer readable storage medium |
CN112445925A (en) * | 2020-11-24 | 2021-03-05 | 浙江大华技术股份有限公司 | Clustering archiving method, device, equipment and computer storage medium |
CN112445925B (en) * | 2020-11-24 | 2022-08-26 | 浙江大华技术股份有限公司 | Clustering archiving method, device, equipment and computer storage medium |
CN113590294A (en) * | 2021-07-30 | 2021-11-02 | 北京睿芯高通量科技有限公司 | Self-adaptive and rule-guided distributed scheduling method |
CN113590294B (en) * | 2021-07-30 | 2023-11-17 | 北京睿芯高通量科技有限公司 | Self-adaptive and rule-guided distributed scheduling method |
WO2023051233A1 (en) * | 2021-09-30 | 2023-04-06 | 华为技术有限公司 | Task scheduling method, device, apparatus and medium |
WO2023056618A1 (en) * | 2021-10-09 | 2023-04-13 | 国云科技股份有限公司 | Cross-cloud platform resource scheduling method and apparatus, terminal device, and storage medium |
CN114064294B (en) * | 2021-11-29 | 2022-10-04 | 郑州轻工业大学 | Dynamic resource allocation method and system in mobile edge computing environment |
CN114064294A (en) * | 2021-11-29 | 2022-02-18 | 郑州轻工业大学 | Dynamic resource allocation method and system in mobile edge computing environment |
CN114840343A (en) * | 2022-05-16 | 2022-08-02 | 江苏安超云软件有限公司 | Task scheduling method and system based on distributed system |
CN117056061A (en) * | 2023-10-13 | 2023-11-14 | 浙江远算科技有限公司 | Cross-supercomputer task scheduling method and system based on container distribution mechanism |
CN117056061B (en) * | 2023-10-13 | 2024-01-09 | 浙江远算科技有限公司 | Cross-supercomputer task scheduling method and system based on container distribution mechanism |
Also Published As
Publication number | Publication date |
---|---|
CN107038069B (en) | 2020-05-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107038069A (en) | Dynamic labels match DLMS dispatching methods under Hadoop platform | |
US9542223B2 (en) | Scheduling jobs in a cluster by constructing multiple subclusters based on entry and exit rules | |
CN104298550B (en) | A kind of dynamic dispatching method towards Hadoop | |
CN110166282A (en) | Resource allocation methods, device, computer equipment and storage medium | |
CN104317658A (en) | MapReduce based load self-adaptive task scheduling method | |
US8375228B2 (en) | Multiple-node system power utilization management | |
CN109408229A (en) | A kind of dispatching method and device | |
CN104881322A (en) | Method and device for dispatching cluster resource based on packing model | |
CN116560860B (en) | Real-time optimization adjustment method for resource priority based on machine learning | |
CN112068959A (en) | Self-adaptive task scheduling method and system and retrieval method comprising method | |
US8180823B2 (en) | Method of routing messages to multiple consumers | |
CN111144701B (en) | ETL job scheduling resource classification evaluation method under distributed environment | |
CN113127176A (en) | Multi-role task allocation method and system for working platform | |
CN103268261A (en) | Hierarchical computing resource management method suitable for large-scale high-performance computer | |
CN115665157B (en) | Balanced scheduling method and system based on application resource types | |
Garg et al. | Optimal virtual machine scheduling in virtualized cloud environment using VIKOR method | |
CN110084507A (en) | The scientific workflow method for optimizing scheduling of perception is classified under cloud computing environment | |
CN116755872A (en) | TOPSIS-based containerized streaming media service dynamic loading system and method | |
CN115391047A (en) | Resource scheduling method and device | |
Thamsen et al. | Hugo: a cluster scheduler that efficiently learns to select complementary data-parallel jobs | |
CN113553353A (en) | Scheduling system for distributed data mining workflow | |
CN110427217B (en) | Content-based publish-subscribe system matching algorithm lightweight parallel method and system | |
Seethalakshmi et al. | Job scheduling in big data-a survey | |
CN112579324A (en) | Commodity summary statistical method based on cost model | |
CN110532071A (en) | A kind of more application schedules system and method based on GPU |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20200508 |