CN109992404A - Cluster computing resource scheduling method, device, equipment and medium - Google Patents

Cluster computing resource scheduling method, device, equipment and medium

Info

Publication number
CN109992404A
Authority
CN
China
Prior art keywords
task
resource
neural network
feature
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711494651.0A
Other languages
Chinese (zh)
Other versions
CN109992404B (en)
Inventor
郭慈
颜海涛
罗泉
刘炼
孟晓莉
王子翔
张韬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Hubei Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Hubei Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Hubei Co Ltd
Priority to CN201711494651.0A
Publication of CN109992404A
Application granted
Publication of CN109992404B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Debugging And Monitoring (AREA)

Abstract

An embodiment of the invention discloses a cluster computing resource scheduling method, device, equipment and medium based on a neural network algorithm. The method comprises: submitting a task; preprocessing the task to obtain preprocessing information; extracting data features, task features and cluster computing resource features; computing on the preprocessing information, data features, task features and cluster computing resource features with a neural network algorithm to generate resource allocation information; running the task according to the resource allocation information to obtain task resource monitoring data, extracted log features and an accuracy verification result; training the neural network model on the task resource monitoring data, the extracted log features and the accuracy verification result; and submitting tasks again and repeating the above operations several times, after which the trained neural network model computes and outputs the final cluster computing resource scheduling scheme. The embodiment provides an intelligent, automated method that improves equipment utilization and data analysis efficiency and reduces the energy consumption of the data center.

Description

Cluster computing resource scheduling method, device, equipment and medium
Technical field
The present invention relates to the software, communication, internet and big data industries, in particular to the combined application of artificial intelligence and big data technology, and more particularly to a cluster computing resource scheduling method, device, equipment and medium based on a neural network algorithm.
Background
In a production cluster based on Hadoop, a task's request for computing resources should in theory be satisfied promptly. In practice, however, the cluster's computing resources are limited, tasks often have to queue when they need resources, and it is difficult in the prior art to find one strategy that suits all application scenarios.
Existing solutions generally use the schedulers and policies that ship with Hadoop, supplemented by operation and maintenance staff who manually schedule the data center's computing resources. The system administrator selects a scheduling policy, plans according to the tenants' resource requirements, creates all known tenant queues in advance, and sets the minimum and maximum computing resources that each tenant queue may use. Each tenant is then strictly required to submit its tasks to its designated tenant queue, so that every tenant is guaranteed a minimum amount of computing resources while resources remain isolated and limited. Later, the administrator assesses whether each tenant's computing resources are reasonable based on how efficiently that tenant's tasks run, and adjusts and optimizes the data center's computing resource parameters to keep tenant tasks running stably.
In summary, when a cluster is shared by multiple tenants, the prior art and its management approach have the following drawbacks: tenant task priority requirements conflict with the Hadoop scheduling policy; allocating resources in advance leaves resource use inflexible; and excessive resource preemption lowers task running efficiency. The prior art provides only basic scheduling means and methods. This not only makes data center operation and maintenance harder but, because everything relies on manual operation, also increases the risk of error. Data centers urgently need an intelligent, dynamic, automatic, real-time means of resource management and scheduling.
Summary of the invention
The embodiments of the invention provide a cluster computing resource scheduling method, device, equipment and medium based on a neural network algorithm, to solve the prior-art problems of inflexible task resource allocation, low task running efficiency, difficult data center operation and maintenance, and reliance on manual operation, and thereby achieve intelligent, dynamic, automatic, real-time resource management and scheduling.
In a first aspect, an embodiment of the invention provides a cluster computing resource scheduling method based on a neural network algorithm, the method comprising: submitting a task;
preprocessing the task to obtain preprocessing information;
extracting data features, task features and cluster computing resource features;
computing on the preprocessing information, data features, task features and cluster computing resource features with a neural network algorithm to generate resource allocation information;
running the task according to the resource allocation information to obtain task resource monitoring data, extracted log features and an accuracy verification result;
training the neural network model on the task resource monitoring data, the extracted log features and the accuracy verification result;
submitting tasks again and repeating the above operations several times, after which the trained neural network model computes and outputs the final cluster computing resource scheduling scheme.
In a second aspect, an embodiment of the invention provides a cluster computing resource scheduling device based on a neural network algorithm, the device comprising:
a preprocessing module, for preprocessing the task to obtain preprocessing information;
a feature extraction module, for extracting data features, task features and cluster computing resource features;
a neural network computing module, for computing on the preprocessing information, data features, task features and cluster computing resource features to generate resource allocation information;
a task running module, for running the task according to the resource allocation information to obtain task resource monitoring data, extracted log features and an accuracy verification result;
a model training module, for training the neural network model on the task resource monitoring data, the extracted log features and the accuracy verification result.
In a third aspect, an embodiment of the invention provides cluster computing resource scheduling equipment based on a neural network algorithm, comprising: at least one processor, at least one memory, and computer program instructions stored in the memory, wherein the method of the first aspect in the above embodiments is implemented when the computer program instructions are executed by the processor.
In a fourth aspect, an embodiment of the invention provides a computer-readable storage medium storing computer program instructions, wherein the method of the first aspect in the above embodiments is implemented when the computer program instructions are executed by a processor.
The cluster computing resource scheduling method, device, equipment and medium based on a neural network algorithm provided by the embodiments of the invention have the following beneficial effects:
(1) Using a neural network algorithm resolves the conflict between tenant task priority requirements and the Hadoop scheduling policy.
(2) Using a neural network algorithm solves the problem that allocating resources in advance leaves resource use inflexible.
(3) Using a neural network algorithm solves the problem that excessive resource preemption lowers task running efficiency.
Brief description of the drawings
To describe the technical solutions of the embodiments of the invention more clearly, the drawings used in the embodiments are briefly introduced below. A person of ordinary skill in the art can derive other drawings from these drawings without creative effort.
Fig. 1 shows a flowchart of the cluster computing resource scheduling method based on a neural network algorithm according to one embodiment of the invention;
Fig. 2 shows a flowchart of the intelligent scheduling operation according to one embodiment of the invention;
Fig. 3 shows a schematic diagram of the BP neural network structure according to one embodiment of the invention;
Fig. 4 shows a comparison diagram of the Yarn schedulers in the prior art;
Fig. 5 shows a schematic diagram of the modules of the cluster computing resource scheduling device based on a neural network algorithm according to one embodiment of the invention;
Fig. 6 shows a schematic diagram of the hardware structure of the cluster computing resource scheduling equipment provided by an embodiment of the invention.
Detailed description of the embodiments
Features and exemplary embodiments of various aspects of the invention are described in detail below. To make the objects, technical solutions and advantages of the invention clearer, the invention is described in further detail with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the invention, not to limit it. To those skilled in the art, the invention can be practiced without some of these specific details. The following description of the embodiments is intended only to provide a better understanding of the invention by showing examples of it.
It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise" and any variants are intended to cover non-exclusive inclusion, so that a process, method, article or device comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to that process, method, article or device. Without further limitation, an element defined by the phrase "including ..." does not exclude the presence of additional identical elements in the process, method, article or device that comprises it.
Fig. 1 shows a flowchart of the cluster computing resource scheduling method based on a neural network algorithm according to one embodiment of the invention. The method of this embodiment comprises:
S110: submit a task.
S120: preprocess the task to obtain preprocessing information.
S130: extract data features, task features and cluster computing resource features.
S140: compute on the preprocessing information, data features, task features and cluster computing resource features with a neural network algorithm to generate resource allocation information.
S150: run the task according to the resource allocation information to obtain task resource monitoring data, extracted log features and an accuracy verification result.
S160: train the neural network model on the task resource monitoring data, the extracted log features and the accuracy verification result.
S170: submit tasks again and repeat the above operations several times, after which the trained neural network model computes and outputs the final cluster computing resource scheduling scheme.
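The flow of steps S120 to S160 for a single submitted task can be sketched roughly as follows. This is only an illustrative skeleton: the function name `scheduling_round` and all of the callables passed to it are hypothetical placeholders, not components named in the patent.

```python
from typing import Callable, Dict, List


def scheduling_round(
    job: str,
    preprocess: Callable[[str], Dict],
    extract_features: Callable[[Dict], List[float]],
    predict: Callable[[List[float]], List[float]],
    run_task: Callable[[str, List[float]], Dict],
    train: Callable[[List[float], List[float], Dict], None],
) -> List[float]:
    """Run one pass of S120-S160 for a submitted job (hypothetical helper)."""
    info = preprocess(job)                  # S120: preprocessing information
    features = extract_features(info)       # S130: feature vector
    allocation = predict(features)          # S140: resource allocation information
    feedback = run_task(job, allocation)    # S150: monitoring, log features, accuracy
    train(features, allocation, feedback)   # S160: update the model with the feedback
    return allocation


# Toy stand-ins so the skeleton can be executed; every lambda is an assumption.
training_log = []
allocation = scheduling_round(
    "SELECT 1",
    preprocess=lambda job: {"sql": job},
    extract_features=lambda info: [float(len(info["sql"]))],
    predict=lambda feats: [feats[0] / 2.0],
    run_task=lambda job, alloc: {"runtime_s": 1.0},
    train=lambda feats, alloc, fb: training_log.append((feats, alloc, fb)),
)
```

Step S170 then corresponds to calling `scheduling_round` repeatedly as new tasks are submitted.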
In step S110, the task submission environment mainly refers to computing engines such as Hive and Spark running on a Hadoop cluster. In the prior-art approach, a job is first submitted by the ETL system and then received by the HiveServer2 and Spark Thrift Server servers.
In step S120, preprocessing the task comprises task proxying and task parsing. Task proxying is implemented by emulating the RPC communication mechanism of HiveServer2 and Spark Thrift Server, so that a proxy server intercepts the job content submitted by the application. Task parsing extracts the SQL statement, tenant name and permissions from the intercepted job content. Parsing the SQL statement yields the table names, field names, functions, SQL logic, data lineage and so on.
In step S130, data features, task features and cluster computing resource features are extracted. Data feature extraction uses the metadata management system together with the parsed table names and field names to extract the scale, distribution, quantity and so on of the data taking part in the task's computation. Task feature extraction searches the history logs for the historical run information of tasks of the same type (essentially identical SQL statements), including user name, status, minimum and maximum resources (CPU, memory), and the durations of task submission, queuing, allocation, execution and completion.
In step S140, the preprocessing information, data features, task features and cluster computing resource features are computed with a neural network algorithm to generate resource allocation information. The neural network computation comprises feature classification, model application, and output of the resource parameter results.
Feature classification: the data features and task features are input into the neural network model for predictive analysis.
Model application: the BP neural network algorithm operates on the input data features, task features and cluster computing resource features.
The quality indicators of the data features, task features and resource features serve as inputs, and the corresponding resource allocation data serve as outputs. In the implementation, these data are normalized using functions that come with TensorFlow.
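The normalization step can be illustrated with a small min-max scaler. The source does not say which TensorFlow function is used, so this sketch substitutes plain NumPy; the [-1, 1] target range is the one the description mentions for normalized data, and the helper names are hypothetical.

```python
import numpy as np


def fit_minmax(x: np.ndarray):
    """Return per-column minimum and span; constant columns get span 1."""
    lo = x.min(axis=0)
    hi = x.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return lo, span


def normalize(x: np.ndarray, lo: np.ndarray, span: np.ndarray) -> np.ndarray:
    """Map each column linearly so that its min is -1 and its max is +1."""
    return 2.0 * (x - lo) / span - 1.0


def denormalize(z: np.ndarray, lo: np.ndarray, span: np.ndarray) -> np.ndarray:
    """Invert normalize(), e.g. to turn network outputs back into resource units."""
    return (z + 1.0) * span / 2.0 + lo


# Two made-up feature columns: input size in GB and a constant column.
raw = np.array([[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]])
lo, span = fit_minmax(raw)
scaled = normalize(raw, lo, span)
```

The same `lo` and `span` fitted on the training data would be reused when new tasks are scored, so that inputs stay on a consistent scale.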
Fig. 3 shows a schematic diagram of the BP neural network structure according to one embodiment of the invention. The BP neural network structure of the invention is designed as follows:
Input and output layer design: the model takes the quality indicators of each group of data as input and the resource parameters as output, so the input layer has 58 nodes and the output layer has 4 nodes. Repeated trials determined the optimal hidden-layer structure to be <1, 40>, <2, 13>, that is, 40 neurons in the first hidden layer and 13 in the second.
The values 40 and 13 are empirical, tuned by repeated trials. The intuition is that the 58 parameters share common features and can be grouped into many categories; the top-level categories are data features, task features and resource features. The 40 neurons of layer 2 subdivide the 58 parameters into 40 fine-grained subcategories: the 58 input parameters activate 40 subcategory feature outputs. Layer 3 then derives 13 broad category features from those 40. Finally, the 13 broad category features activate the final one-dimensional output, that is, the combination of features determines which resource allocation parameters are used, as a group of 4 values. For better results the network could be refined further, to 5 or even 10 layers. The current work is theoretical and experimental; it emphasizes feasibility, not optimality.
Hidden layer design: the method builds the prediction model with a four-layer, multiple-input single-output BP network containing two hidden layers. In network design, determining the number of hidden nodes is very important. Too many hidden neurons increase the computational load and easily cause overfitting; too few degrade network performance, so the expected result is not reached. The number of hidden neurons is directly related to the complexity of the practical problem, the numbers of neurons in the input and output layers, and the expected error setting. To choose the number of hidden neurons, this method relies on repeated trials, which determined the optimal structure to be <1, 40>, <2, 13>.
Choice of activation function: BP neural networks generally use the differentiable Sigmoid function and linear functions as activation functions. This scheme selects the S-shaped tangent function tansig as the activation function of the hidden neurons. Because the network output is normalized to the range [-1, 1], the prediction model selects the S-shaped logarithmic function logsig as the activation function of the output-layer neurons.
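To make the architecture concrete, here is a minimal NumPy sketch of a forward pass through a 58-40-13-4 network with tansig (hyperbolic tangent) hidden activations and a logsig (logistic) output activation. The random initialization, the helper names and the use of NumPy rather than TensorFlow are assumptions for illustration. Note that logsig produces values in (0, 1), so training targets would need to be scaled into that interval.

```python
import numpy as np

LAYER_SIZES = [58, 40, 13, 4]  # input, hidden 1, hidden 2, output


def tansig(x: np.ndarray) -> np.ndarray:
    """S-shaped tangent activation, equivalent to tanh."""
    return np.tanh(x)


def logsig(x: np.ndarray) -> np.ndarray:
    """S-shaped logarithmic (logistic) activation."""
    return 1.0 / (1.0 + np.exp(-x))


def init_params(rng: np.random.Generator):
    """Small random weights and zero biases for each layer (an assumption)."""
    return [
        (rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(LAYER_SIZES[:-1], LAYER_SIZES[1:])
    ]


def forward(params, x: np.ndarray) -> np.ndarray:
    """Propagate a batch of 58-dimensional inputs to 4 resource parameters."""
    a = x
    for w, b in params[:-1]:
        a = tansig(a @ w + b)        # hidden layers: 58 -> 40 -> 13
    w, b = params[-1]
    return logsig(a @ w + b)         # output layer: 13 -> 4


rng = np.random.default_rng(0)
params = init_params(rng)
batch = rng.uniform(-1.0, 1.0, size=(5, 58))
outputs = forward(params, batch)
```

Each row of `outputs` is one group of 4 normalized resource parameters for the corresponding input task.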
Model application: this prediction uses the neural network toolbox in TensorFlow to train the network. The steps for the preliminary model are as follows: normalize the training sample data and feed it into the network; set the hidden-layer and output-layer activation functions to tansig and logsig respectively, the network training function to traingdx, and the network performance function to mse; set the network parameters, with the number of iterations epochs = 5000, the expected error goal = 0.00000001 and the learning rate lr = 0.01; after the parameters are set, start training the network.
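The training configuration above can be sketched as a small backpropagation loop. The names tansig, logsig, traingdx and mse match functions in MATLAB's neural network toolbox; this sketch re-implements the idea in NumPy and simplifies traingdx (gradient descent with momentum and an adaptive learning rate) to gradient descent with a fixed learning rate and momentum. The momentum value, initialization and synthetic data are assumptions.

```python
import numpy as np


def train_bp(x, y, sizes=(58, 40, 13, 4), epochs=5000, goal=1e-8,
             lr=0.01, momentum=0.9, seed=0):
    """Train a tanh/logistic BP network with MSE loss; returns weights, biases, last MSE."""
    rng = np.random.default_rng(seed)
    weights = [rng.normal(0.0, 0.1, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
    biases = [np.zeros(n) for n in sizes[1:]]
    vel_w = [np.zeros_like(w) for w in weights]
    vel_b = [np.zeros_like(b) for b in biases]
    mse = float("inf")
    for _ in range(epochs):
        # Forward pass: tansig on hidden layers, logsig on the output layer.
        acts = [x]
        for i, (w, b) in enumerate(zip(weights, biases)):
            z = acts[-1] @ w + b
            is_output = i == len(weights) - 1
            acts.append(1.0 / (1.0 + np.exp(-z)) if is_output else np.tanh(z))
        err = acts[-1] - y
        mse = float(np.mean(err ** 2))
        if mse <= goal:              # stop once the expected error is reached
            break
        # Backward pass: gradient of the MSE through the logistic output.
        delta = (2.0 / err.size) * err * acts[-1] * (1.0 - acts[-1])
        for i in reversed(range(len(weights))):
            grad_w = acts[i].T @ delta
            grad_b = delta.sum(axis=0)
            if i > 0:                # propagate through the tanh hidden layer
                delta = (delta @ weights[i].T) * (1.0 - acts[i] ** 2)
            vel_w[i] = momentum * vel_w[i] - lr * grad_w
            vel_b[i] = momentum * vel_b[i] - lr * grad_b
            weights[i] += vel_w[i]
            biases[i] += vel_b[i]
    return weights, biases, mse


# Synthetic data (an assumption): 20 samples, targets already scaled into (0, 1).
demo_rng = np.random.default_rng(1)
demo_x = demo_rng.uniform(-1.0, 1.0, (20, 58))
demo_y = np.full((20, 4), 0.9)
_, _, mse_start = train_bp(demo_x, demo_y, epochs=1)
_, _, mse_later = train_bp(demo_x, demo_y, epochs=300)
```

With real data, the defaults `epochs=5000`, `goal=1e-8` and `lr=0.01` correspond to the values quoted in the text; the short runs above only show that the error decreases.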
The network learns repeatedly from a certain amount of sample data and completes learning once the expected error is reached. After the neural network model is trained, the feature indicators only need to be input into the network model to obtain the expected result data. Combined with the overall operating process described above, intelligent resource allocation and scheduling can be achieved.
Output of the resource parameter results: after the neural network algorithm completes its analysis, the prediction result is output. The output mainly includes the resource allocation parameters (minimum and maximum CPU and memory, resource percentage) and so on.
Between steps S140 and S150, the method further includes resource allocation and proxy submission of the task.
Resource allocation: the proxy program reads the output parameters and applies them through the resource parameter configuration entry of the Hadoop platform, where they take effect. The main actions include queue creation and setting the queue scheduling policy and resource parameters.
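One way to picture the resource allocation step is to render the predicted parameters as a queue definition. The sketch below assumes, purely for illustration, that the platform's configuration entry is a YARN Fair Scheduler allocation file; the source does not say which scheduler or file is used, and the function name, queue name and values are hypothetical.

```python
import xml.etree.ElementTree as ET


def queue_allocation_xml(queue: str, min_mb: int, min_vcores: int,
                         max_mb: int, max_vcores: int, policy: str = "fair") -> str:
    """Render one queue as a Fair Scheduler style allocation fragment."""
    root = ET.Element("allocations")
    q = ET.SubElement(root, "queue", name=queue)
    ET.SubElement(q, "minResources").text = f"{min_mb} mb,{min_vcores} vcores"
    ET.SubElement(q, "maxResources").text = f"{max_mb} mb,{max_vcores} vcores"
    ET.SubElement(q, "schedulingPolicy").text = policy
    return ET.tostring(root, encoding="unicode")


# Hypothetical values standing in for the network's denormalized outputs.
fragment = queue_allocation_xml("task_20240101_001", 4096, 2, 16384, 8)
```

A proxy program could write such a fragment for each task-level queue before forwarding the task to the service receiving end.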
Proxy submission: the proxy program packages the task information together with the latest queue resource information and forwards it to the service receiving end, that is, HiveServer2 and Spark Thrift Server.
In step S150, the task is run according to the resource allocation information to obtain task resource monitoring data, extracted log features and an accuracy verification result. Specifically, this includes:
Task running: the task runs strictly according to the submitted parameters.
Running log: a large amount of running log information is generated while the task runs. It is stored as text and can also be obtained through an API.
Task completion: when the task finishes executing, its log is archived as a history log.
Resource monitoring: while the task runs, a monitoring program stores the resources occupied by the task, including system resources and queue resources, as structured data. The structured data are output to the neural network algorithm for training the model.
Log feature extraction: the extracted features include user name, status, minimum and maximum resources (CPU, memory), and the durations of task submission, queuing, allocation, execution and completion. These data are output to the neural network algorithm for training the model.
Accuracy verification: from the task execution result, the task running time per unit of occupied resource is compared with the expected value to judge whether model training has succeeded.
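The accuracy check can be illustrated as a comparison between the observed running time per unit of resource and an expected value. The source does not define the unit of resource or the acceptance threshold, so the 10% tolerance, the function names and the sample numbers below are assumptions.

```python
def runtime_per_unit(runtime_s: float, resource_units: float) -> float:
    """Running time divided by the amount of resource the task occupied."""
    if resource_units <= 0:
        raise ValueError("resource_units must be positive")
    return runtime_s / resource_units


def prediction_ok(runtime_s: float, resource_units: float,
                  expected: float, tolerance: float = 0.10) -> bool:
    """True when the observed value is within `tolerance` of the expected value."""
    actual = runtime_per_unit(runtime_s, resource_units)
    return abs(actual - expected) <= tolerance * expected


# A task that ran 120 s on 8 units matches an expectation of 15 s per unit;
# one that ran 200 s on the same allocation does not.
on_target = prediction_ok(120.0, 8.0, expected=15.0)
off_target = prediction_ok(200.0, 8.0, expected=15.0)
```

The boolean result is the kind of signal that could be fed back to the training step as the accuracy verification outcome.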
In step S160, the neural network model is trained on the task resource monitoring data, the extracted log features and the accuracy verification result. Specifically, in the early stage, dimension data such as resource features, task features, data features and result-driven data are input into the neural network model for supervised learning. In the later stage, the model parameters are self-optimized using the accuracy verification result and the task output feature data, thereby optimizing the model.
In step S170, tasks are submitted again and the above operations are repeated several times; the trained neural network model then computes and outputs the final cluster computing resource scheduling scheme, that is, the output parameters.
The Hadoop resource schedulers and policies referred to in the present invention are as follows:
Hadoop offers three schedulers to choose from: FIFO Scheduler, Capacity Scheduler and Fair Scheduler.
The first-in, first-out (FIFO) scheduler places applications in a single queue in order of submission. When allocating resources, it first serves the application at the head of the queue; once that application's demand is satisfied it serves the next one, and so on. The FIFO Scheduler is not suitable for shared clusters in a multi-tenant environment: a large application may occupy all cluster resources and block the other applications. In a shared multi-tenant cluster, the Capacity Scheduler or Fair Scheduler is more suitable, because both allow big tasks and small tasks to obtain some system resources as soon as they are submitted. The Yarn scheduler comparison diagram (Fig. 4) illustrates the differences between these schedulers; it shows that in the FIFO scheduler a small task can be blocked by a big task.
The Capacity scheduler has a dedicated queue for running small tasks. However, dedicating a queue to small tasks reserves a certain amount of cluster resources in advance, so a big task finishes later than it would under the FIFO scheduler.
The Fair scheduler does not need to reserve system resources in advance; it dynamically rebalances system resources among all running jobs. When the first big job is submitted and is the only job running, it obtains all cluster resources. When a second, small task is submitted, the Fair scheduler allocates half of the resources to the small task so that the two tasks share the cluster resources fairly.
Note that with the Fair scheduler in Fig. 4 there is some delay between the submission of the second task and the moment it obtains resources, because it has to wait for the first task to release the Containers it occupies. When the small task completes, it releases the resources it occupied and the big task again obtains all system resources. The net effect is that the Fair scheduler achieves high resource utilization while still ensuring that small tasks complete promptly.
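The difference between FIFO and fair sharing described above can be reproduced with a toy discrete-time simulation. This is a simplified model written for illustration: it ignores the Container release delay mentioned in the text, treats cluster capacity as a single number per time step, and uses made-up job sizes.

```python
def simulate(jobs, capacity: float, fair: bool) -> dict:
    """jobs: list of (name, arrival_step, work). Returns {name: finish_step}."""
    remaining = {name: work for name, _, work in jobs}
    arrival = {name: step for name, step, _ in jobs}
    order = [name for name, _, _ in sorted(jobs, key=lambda j: j[1])]
    finish = {}
    step = 0
    while len(finish) < len(jobs):
        active = [n for n in order if arrival[n] <= step and n not in finish]
        if active:
            if fair:
                # Fair sharing: every running job gets an equal slice.
                share = {n: capacity / len(active) for n in active}
            else:
                # FIFO: the earliest submitted job gets the whole cluster.
                share = {active[0]: capacity}
            for name, amount in share.items():
                remaining[name] -= amount
                if remaining[name] <= 1e-9:
                    finish[name] = step + 1
        step += 1
    return finish


jobs = [("big", 0, 100.0), ("small", 2, 10.0)]
fifo_result = simulate(jobs, capacity=10.0, fair=False)
fair_result = simulate(jobs, capacity=10.0, fair=True)
```

In this toy run the small job finishes at step 11 under FIFO but at step 4 under fair sharing, while the big job finishes one step later (11 instead of 10), which mirrors the trade-off described in the text.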
The prior art relies on manual management: a complex workflow of planning resources in advance, allocating them, observing how they are used, and adjusting them. Hadoop itself provides only three basic schedulers, which satisfy only simple resource management and scheduling. In a very large multi-tenant data center, resource allocation has many problems: resource queue allocation is too coarse; the update cycle for resource settings is long; the granularity of resource adjustment is coarse; adjustments are not backed by an actual algorithm; dynamic scheduling is not supported; tenant task priority requirements cannot be met; and a mixed strategy of complete isolation and full sharing cannot be supported.
The present invention uses a set of processes and programs to manage and allocate resources at the frequency of task submission, that is, it allocates resources on demand according to the features of each task, refining resource granularity from the tenant level to the task level. For algorithmic support it uses a neural network algorithm and machine learning, jointly considering more than 50 input factors and variables across the three dimensions of data features, task features and resource features to determine the resource allocation needs of a single task. It solves the problems described above and, by training itself, continuously improves its precision. It achieves intelligent resource allocation and scheduling in place of manual work, improves equipment utilization and data analysis efficiency, and reduces the energy consumption of the data center.
Fig. 5 shows a schematic diagram of the modules of a cluster computing resource scheduling device based on a neural network algorithm according to one embodiment of the present invention. The device includes:
S210: a submission module, for submitting a task.
S220: a preprocessing module, for preprocessing the task to obtain preprocessing information.
S230: an extraction module, for extracting data features, task features, and cluster computing resource features.
S240: a neural network computing module, for computing on the preprocessing information, data features, task features, and cluster computing resource features to generate resource allocation information.
S250: a running module, for running the task according to the resource allocation information and obtaining task resource monitoring, log feature extraction, and accuracy verification.
S260: a model training module, for training the neural network model according to the task resource monitoring, log feature extraction, and accuracy verification.
S270: an output module, for submitting tasks again, repeating the above operations multiple times, and outputting the final cluster computing resource scheduling scheme as computed by the trained neural network model.
The cluster computing resource scheduling device based on a neural network algorithm of the present invention can implement the cluster computing resource scheduling method based on a neural network algorithm of steps S110-S170.
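The closed loop formed by the modules above (submit, preprocess, extract, compute, run, train, repeat, output) can be sketched as follows. This is only a structural sketch under stated assumptions: `StandInModel` is a one-coefficient placeholder that stands in for the neural network, `run_task` simulates execution with a made-up memory-per-row constant, and all class names, function names, and SQL strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Task:
    name: str
    sql: str
    input_rows: int


@dataclass
class RunRecord:
    """Structured monitoring record produced after a task has run."""
    task_name: str
    input_rows: int
    allocated_mb: float
    used_mb: float


class StandInModel:
    """Placeholder for the neural network: a single learned MB-per-row coefficient."""

    def __init__(self) -> None:
        self.mb_per_row = 0.01  # deliberately poor initial guess

    def predict(self, features: Dict[str, float]) -> float:
        return features["input_rows"] * self.mb_per_row

    def train(self, history: List[RunRecord]) -> None:
        ratios = [r.used_mb / r.input_rows for r in history]
        self.mb_per_row = sum(ratios) / len(ratios)


def preprocess(task: Task) -> Dict[str, float]:
    # Preprocessing module: a crude stand-in for task parsing.
    return {"join_count": float(task.sql.upper().count(" JOIN "))}


def extract_features(task: Task) -> Dict[str, float]:
    # Extraction module: only one hypothetical data feature is used here.
    return {"input_rows": float(task.input_rows)}


def run_task(task: Task, allocated_mb: float) -> RunRecord:
    # Running module: execution is simulated with a made-up usage constant.
    used_mb = task.input_rows * 0.002
    return RunRecord(task.name, task.input_rows, allocated_mb, used_mb)


def schedule(tasks: List[Task], rounds: int = 3) -> Dict[str, float]:
    model = StandInModel()
    history: List[RunRecord] = []
    plan: Dict[str, float] = {}
    for _ in range(rounds):                      # output module: repeat several times
        for task in tasks:                       # submission module
            features = {**preprocess(task), **extract_features(task)}
            plan[task.name] = model.predict(features)         # computing module
            history.append(run_task(task, plan[task.name]))   # running module
        model.train(history)                     # model training module
    return plan


tasks = [
    Task("daily_report", "SELECT a.id FROM a JOIN b ON a.id = b.id", 1_000_000),
    Task("hourly_agg", "SELECT count(*) FROM c", 50_000),
]
final_plan = schedule(tasks)
```

In this toy setup the first round over-allocates because of the poor initial coefficient, and later rounds converge on the simulated usage, which mirrors the intent of retraining from monitoring data after each run.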
In a production cluster based on Hadoop technology, an application's requests for computing resources should in theory be satisfied promptly. In practice, however, the computing resources of the cluster are often limited, and applications frequently have to queue and wait when they need computing resources. Scheduling computing resources is itself a hard problem, and in the prior art it is difficult to find a single strategy that satisfies all application scenarios.
Existing technical solutions generally use the schedulers and policies built into Hadoop, supplemented by system operation and maintenance personnel who manually manage and schedule the computing resources of the data center.
The system administrator first selects a scheduling policy. Then, based on advance planning of the tenants' resource requirements, the administrator creates all known tenant queues in advance and sets the minimum and maximum computing resources that each tenant queue may use. Each tenant is then strictly required to submit tasks to its designated tenant queue, so that each tenant is guaranteed its minimum computing resources while resource isolation and limits are still enforced.
Later, the system administrator assesses whether each tenant's computing resources are reasonable according to how efficiently that tenant's tasks run, and adjusts and optimizes the computing resource parameters of the data center to keep the tenants' tasks running stably.
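The static tenant-queue approach described above can be sketched as follows. This is a deliberately simplified model, not the actual behaviour of any Hadoop scheduler: the queue names, the vcore figures, and the two-pass `allocate` function are hypothetical, and the sketch assumes the configured minimums never exceed the cluster total.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TenantQueue:
    name: str
    min_vcores: int  # guaranteed share configured by the administrator
    max_vcores: int  # hard upper limit configured by the administrator


def allocate(queues: List[TenantQueue], demand: Dict[str, int], total_vcores: int) -> Dict[str, int]:
    # Pass 1: every queue receives its demand up to the guaranteed minimum.
    grants = {q.name: min(demand.get(q.name, 0), q.min_vcores) for q in queues}
    free = total_vcores - sum(grants.values())
    # Pass 2: remaining capacity is handed out in queue order, capped at each maximum.
    for q in queues:
        unmet = demand.get(q.name, 0) - grants[q.name]
        headroom = q.max_vcores - grants[q.name]
        extra = max(0, min(unmet, headroom, free))
        grants[q.name] += extra
        free -= extra
    return grants


queues = [TenantQueue("etl", 20, 60), TenantQueue("report", 10, 40)]
grants = allocate(queues, demand={"etl": 90, "report": 5}, total_vcores=100)
idle_vcores = 100 - sum(grants.values())
```

With these figures the `etl` queue is capped at its maximum of 60 vcores although it asked for 90, while 35 vcores sit idle. This is the kind of tenant-level rigidity that per-task allocation is meant to avoid.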
Fig. 6 shows a schematic diagram of the hardware structure of the cluster computing resource scheduling equipment provided by an embodiment of the present invention.
The cluster computing resource scheduling equipment may include a processor 401 and a memory 402 storing computer program instructions.
Specifically, the processor 401 may include a central processing unit (CPU) or an application-specific integrated circuit (ASIC), or may be configured as one or more integrated circuits that implement the embodiments of the present invention.
The memory 402 may include mass storage for data or instructions. By way of example and not limitation, the memory 402 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disc, a magneto-optical disc, magnetic tape, a Universal Serial Bus (USB) drive, or a combination of two or more of these. Where appropriate, the memory 402 may include removable or non-removable (or fixed) media. Where appropriate, the memory 402 may be internal or external to the data processing device. In a particular embodiment, the memory 402 is non-volatile solid-state memory. In a particular embodiment, the memory 402 includes read-only memory (ROM). Where appropriate, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), flash memory, or a combination of two or more of these.
The processor 401 reads and executes the computer program instructions stored in the memory 402 to implement any one of the cluster computing resource scheduling methods in the above embodiments.
In one example, the cluster computing resource scheduling equipment may also include a communication interface 403 and a bus 410. As shown in Fig. 6, the processor 401, the memory 402, and the communication interface 403 are connected by the bus 410 and communicate with one another over it.
The communication interface 403 is mainly used to implement communication between the modules, devices, units, and/or equipment in the embodiments of the present invention.
The bus 410 includes hardware, software, or both, and couples the components of the cluster computing resource scheduling equipment to one another. By way of example and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a front-side bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a low-pin-count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local (VLB) bus, another suitable bus, or a combination of two or more of these. Where appropriate, the bus 410 may include one or more buses. Although the embodiments of the present invention describe and illustrate a particular bus, the present invention contemplates any suitable bus or interconnect.
In addition, in combination with the cluster computing resource scheduling method in the above embodiments, an embodiment of the present invention may provide a computer-readable storage medium for its implementation. Computer program instructions are stored on the computer-readable storage medium; when executed by a processor, the computer program instructions implement any one of the cluster computing resource scheduling methods in the above embodiments.
It should be clear that the present invention is not limited to the specific configurations and processes described above and shown in the figures. For brevity, detailed descriptions of known methods are omitted here. In the above embodiments, several specific steps are described and illustrated as examples. However, the method of the present invention is not limited to the specific steps described and illustrated; those skilled in the art, after understanding the spirit of the present invention, may make various changes, modifications, and additions, or change the order of the steps.
The functional blocks shown in the structural block diagrams described above may be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, a functional block may be, for example, an electronic circuit, an application-specific integrated circuit (ASIC), appropriate firmware, a plug-in, a function card, and so on. When implemented in software, the elements of the present invention are the programs or code segments used to perform the required tasks. The programs or code segments may be stored in a machine-readable medium, or transmitted over a transmission medium or communication link by a data signal carried in a carrier wave. A "machine-readable medium" may include any medium capable of storing or transmitting information. Examples of machine-readable media include electronic circuits, semiconductor memory devices, ROM, flash memory, erasable ROM (EROM), floppy disks, CD-ROMs, optical discs, hard disks, fiber-optic media, radio-frequency (RF) links, and so on. Code segments may be downloaded via a computer network such as the Internet or an intranet.
It should also be noted that the exemplary embodiments mentioned in the present invention describe certain methods or systems on the basis of a series of steps or devices. However, the present invention is not limited to the order of the above steps; that is, the steps may be executed in the order mentioned in the embodiments, in a different order, or with several steps executed simultaneously.
The above is merely a specific embodiment of the present invention. Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, modules, and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here. It should be understood that the scope of protection of the present invention is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or substitutions within the technical scope disclosed by the present invention, and such modifications or substitutions shall fall within the scope of protection of the present invention.

Claims (12)

1. A cluster computing resource scheduling method based on a neural network algorithm, characterized in that the method comprises:
submitting a task;
preprocessing the task to obtain preprocessing information;
extracting data features, task features, and cluster computing resource features;
computing on the preprocessing information, data features, task features, and cluster computing resource features with a neural network algorithm to generate resource allocation information;
running the task according to the resource allocation information, and obtaining task resource monitoring, log feature extraction, and accuracy verification;
training a neural network model according to the task resource monitoring, log feature extraction, and accuracy verification;
submitting tasks again, repeating the above operations multiple times, and outputting the final cluster computing resource scheduling scheme as computed by the trained neural network model.
2. The method according to claim 1, characterized in that preprocessing the task comprises: task proxying and task parsing.
3. The method according to claim 1, characterized in that obtaining the data features, task features, and cluster computing resource features comprises: obtaining the data features, task features, and cluster computing resource features from a data system.
4. The method according to claim 1, characterized in that the data features include: table name, field name, function, SQL logic, and data lineage.
5. The method according to claim 1, characterized in that the task features include: user name, state, maximum and minimum CPU and memory, and the durations of task submission, queuing, allocation, execution, and termination.
6. The method according to claim 1, characterized in that the neural network algorithm is a BP neural network algorithm.
7. The method according to claim 1, characterized in that the resource allocation information includes: maximum and minimum CPU and memory, and resource percentage.
8. The method according to claim 1, characterized in that the task resource monitoring comprises: a monitoring program storing the system resources and queue resources occupied while the task runs as structured data.
9. The method according to claim 1, characterized in that the log feature extraction includes: user name, state, maximum and minimum CPU and memory, and the durations of task submission, queuing, allocation, execution, and termination.
10. A cluster computing resource scheduling device based on a neural network algorithm, characterized in that the device comprises:
a submission module, for submitting a task;
a preprocessing module, for preprocessing the task to obtain preprocessing information;
an extraction module, for extracting data features, task features, and cluster computing resource features;
a neural network computing module, for computing on the preprocessing information, data features, task features, and cluster computing resource features to generate resource allocation information;
a running module, for running the task according to the resource allocation information and obtaining task resource monitoring, log feature extraction, and accuracy verification;
a model training module, for training a neural network model according to the task resource monitoring, log feature extraction, and accuracy verification;
an output module, for submitting tasks again, repeating the above operations multiple times, and outputting the final cluster computing resource scheduling scheme as computed by the trained neural network model.
11. Cluster computing resource scheduling equipment, characterized by comprising: at least one processor, at least one memory, and computer program instructions stored in the memory, wherein the method according to any one of claims 1-9 is implemented when the computer program instructions are executed by the processor.
12. A computer-readable storage medium on which computer program instructions are stored, characterized in that the method according to any one of claims 1-9 is implemented when the computer program instructions are executed by a processor.
CN201711494651.0A 2017-12-31 2017-12-31 Cluster computing resource scheduling method, device, equipment and medium Active CN109992404B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711494651.0A CN109992404B (en) 2017-12-31 2017-12-31 Cluster computing resource scheduling method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711494651.0A CN109992404B (en) 2017-12-31 2017-12-31 Cluster computing resource scheduling method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN109992404A true CN109992404A (en) 2019-07-09
CN109992404B CN109992404B (en) 2022-06-10

Family

ID=67110823

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711494651.0A Active CN109992404B (en) 2017-12-31 2017-12-31 Cluster computing resource scheduling method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN109992404B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5442730A (en) * 1993-10-08 1995-08-15 International Business Machines Corporation Adaptive job scheduling using neural network priority functions
CN103593323A (en) * 2013-11-07 2014-02-19 浪潮电子信息产业股份有限公司 Machine learning method for Map Reduce task resource allocation parameters
CN103699440A (en) * 2012-09-27 2014-04-02 北京搜狐新媒体信息技术有限公司 Method and device for cloud computing platform system to distribute resources to task
CN105337765A (en) * 2015-10-10 2016-02-17 上海新炬网络信息技术有限公司 Distributed hadoop cluster fault automatic diagnosis and restoration system
CN105718479A (en) * 2014-12-04 2016-06-29 中国电信股份有限公司 Execution strategy generation method and device under cross-IDC (Internet Data Center) big data processing architecture
CN106790529A (en) * 2016-12-20 2017-05-31 北京并行科技股份有限公司 The dispatching method of computing resource, control centre and scheduling system
CN106777079A (en) * 2016-12-13 2017-05-31 苏州蜗牛数字科技股份有限公司 A kind of daily record data Visualized Analysis System and method
CN107229693A (en) * 2017-05-22 2017-10-03 哈工大大数据产业有限公司 The method and system of big data system configuration parameter tuning based on deep learning

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112862085A (en) * 2019-11-27 2021-05-28 杭州海康威视数字技术股份有限公司 Storage space optimization method and device
CN112862085B (en) * 2019-11-27 2023-08-22 杭州海康威视数字技术股份有限公司 Storage space optimization method and device
CN113037800A (en) * 2019-12-09 2021-06-25 华为技术有限公司 Job scheduling method and job scheduling device
CN113037800B (en) * 2019-12-09 2024-03-05 华为云计算技术有限公司 Job scheduling method and job scheduling device
CN111209077A (en) * 2019-12-26 2020-05-29 中科曙光国际信息产业有限公司 Deep learning framework design method
CN110750363A (en) * 2019-12-26 2020-02-04 中科寒武纪科技股份有限公司 Computer storage management method and device, electronic equipment and storage medium
CN110795226A (en) * 2020-01-03 2020-02-14 中科寒武纪科技股份有限公司 Method for processing task using computer system, electronic device and storage medium
CN111190718A (en) * 2020-01-07 2020-05-22 第四范式(北京)技术有限公司 Method, device and system for realizing task scheduling
CN111381970A (en) * 2020-03-16 2020-07-07 第四范式(北京)技术有限公司 Cluster task resource allocation method and device, computer device and storage medium
WO2021185206A1 (en) * 2020-03-16 2021-09-23 第四范式(北京)技术有限公司 Resource allocation method and apparatus for cluster task, and computer apparatus and storage medium
CN113568425B (en) * 2020-04-28 2024-05-14 北京理工大学 Cluster collaborative guidance method based on neural network learning
CN113568425A (en) * 2020-04-28 2021-10-29 北京理工大学 Cluster cooperative guidance method based on neural network learning
CN112000478A (en) * 2020-08-24 2020-11-27 中国银行股份有限公司 Job operation resource allocation method and device
CN112000478B (en) * 2020-08-24 2024-02-23 中国银行股份有限公司 Method and device for distributing operation resources
CN111985831A (en) * 2020-08-27 2020-11-24 北京华胜天成科技股份有限公司 Scheduling method and device of cloud computing resources, computer equipment and storage medium
CN112241321A (en) * 2020-09-24 2021-01-19 北京影谱科技股份有限公司 Computing power scheduling method and device based on Kubernetes
CN112256418B (en) * 2020-10-26 2023-10-24 清华大学深圳国际研究生院 Big data task scheduling method
CN112256418A (en) * 2020-10-26 2021-01-22 清华大学深圳国际研究生院 Big data task scheduling method
CN114297808A (en) * 2020-12-02 2022-04-08 北京航空航天大学 Task allocation and resource scheduling method of avionics system
CN114297808B (en) * 2020-12-02 2023-04-07 北京航空航天大学 Task allocation and resource scheduling method of avionics system
CN112953767B (en) * 2021-02-05 2022-11-04 深圳前海微众银行股份有限公司 Resource allocation parameter setting method and device based on Hadoop platform and storage medium
CN112953767A (en) * 2021-02-05 2021-06-11 深圳前海微众银行股份有限公司 Resource allocation parameter setting method and device based on Hadoop platform and storage medium
CN113296907B (en) * 2021-04-29 2023-12-22 上海淇玥信息技术有限公司 Task scheduling processing method, system and computer equipment based on clusters
CN113296907A (en) * 2021-04-29 2021-08-24 上海淇玥信息技术有限公司 Task scheduling processing method and system based on cluster and computer equipment
CN113535399A (en) * 2021-07-15 2021-10-22 电子科技大学 NFV resource scheduling method, device and system
CN113612839A (en) * 2021-07-30 2021-11-05 国汽智控(北京)科技有限公司 Method and device for determining driving task calculation terminal and computer equipment
CN113886036A (en) * 2021-09-13 2022-01-04 天翼数字生活科技有限公司 Method and system for optimizing cluster configuration of distributed system
CN113886036B (en) * 2021-09-13 2024-04-19 天翼数字生活科技有限公司 Method and system for optimizing distributed system cluster configuration

Also Published As

Publication number Publication date
CN109992404B (en) 2022-06-10

Similar Documents

Publication Publication Date Title
CN109992404A (en) Cluster computing resource scheduling method, device, equipment and medium
Mei et al. An efficient feature selection algorithm for evolving job shop scheduling rules with genetic programming
CN103092683B (en) For data analysis based on didactic scheduling
Tay et al. Evolving dispatching rules using genetic programming for solving multi-objective flexible job-shop problems
CN110389820B (en) Private cloud task scheduling method for resource prediction based on v-TGRU model
US11436050B2 (en) Method, apparatus and computer program product for resource scheduling
Hunt et al. Evolving "less-myopic" scheduling rules for dynamic job shop scheduling with genetic programming
CN105373432B (en) A kind of cloud computing resource scheduling method based on virtual resource status predication
CN110008259A (en) The method and terminal device of visualized data analysis
CN113778646B (en) Task level scheduling method and device based on execution time prediction
CN110825522A (en) Spark parameter self-adaptive optimization method and system
Cheong et al. SCARL: Attentive reinforcement learning-based scheduling in a multi-resource heterogeneous cluster
CN109343972A (en) Task processing method and terminal device
CN110084507A (en) The scientific workflow method for optimizing scheduling of perception is classified under cloud computing environment
CN112148471A (en) Method and device for scheduling resources in distributed computing system
Đurasević et al. Collaboration methods for ensembles of dispatching rules for the dynamic unrelated machines environment
CN113608858A (en) MapReduce architecture-based block task execution system for data synchronization
CN116501505B (en) Method, device, equipment and medium for generating data stream of load task
RU2411574C2 (en) Intellectual grid-system for highly efficient data processing
CN114327925A (en) Power data real-time calculation scheduling optimization method and system
Prado et al. On providing quality of service in grid computing through multi-objective swarm-based knowledge acquisition in fuzzy schedulers
CN115599522A (en) Task scheduling method, device and equipment for cloud computing platform
Tuli et al. Optimizing the Performance of Fog Computing Environments Using AI and Co-Simulation
Liu et al. 5G/B5G Network Slice Management via Staged Reinforcement Learning
WO2017085454A1 (en) Fuzzy caching mechanism for thread execution layouts

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant