CN109992404A - Computing cluster resource scheduling method, device, equipment and medium - Google Patents
- Publication number: CN109992404A (application CN201711494651.0A)
- Authority: CN (China)
- Prior art keywords: task, resource, neural network, feature, cluster
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F9/50 — Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5016 — Allocation of resources to service a request, the resources being hardware resources other than CPUs, servers and terminals, the resource being the memory
- G06F9/5027 — Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
- G06F9/5083 — Techniques for rebalancing the load in a distributed system
- G06N3/045 — Combinations of networks
- Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The embodiments of the invention disclose a computing cluster resource scheduling method, device, equipment and medium based on a neural network algorithm. The method comprises: submitting a task; preprocessing the task to obtain preprocessing information; extracting data features, task features and cluster computing resource features; computing the preprocessing information, data features, task features and cluster computing resource features with a neural network algorithm to generate resource allocation information; running the task according to the resource allocation information, and obtaining task resource monitoring, log feature extraction and accuracy verification; training the neural network model according to the task resource monitoring, log feature extraction and accuracy verification; and submitting the task again and repeating the above operations several times, the trained neural network model outputting the final cluster resource scheduling scheme after computation. The embodiments of the invention achieve intelligent automation, improve equipment utilization and data analysis efficiency, and reduce the energy consumption of the data center.
Description
Technical field
The present invention relates to the software, communications, internet and big data industries, in particular to the field where artificial intelligence technology and big data technology are applied together, and more particularly to a computing cluster resource scheduling method, device, equipment and medium based on a neural network algorithm.
Background technique
In a production cluster based on Hadoop technology, a task's request for computing resources should in theory be satisfied in time, but in practice the computing resources of the cluster are often limited, and tasks frequently have to queue and wait for computing resources. It is difficult for the prior art to find one strategy that satisfies all application scenarios.
Existing technical solutions generally use the scheduler and strategies that come with Hadoop, supplemented by system operation and maintenance personnel manually scheduling the computing resources of the data center. The system administrator selects a scheduling strategy, plans according to the tenants' resource requirements, creates all known tenant queues in advance, and sets the minimum and maximum computing resources that each tenant queue may use. Each tenant is then strictly required to submit tasks into its designated tenant queue, so that every tenant obtains a guaranteed minimum of computing resources while resources are also isolated and limited. Later, the system administrator assesses the reasonableness of each tenant's computing resources according to the efficiency of that tenant's task runs, and adjusts and optimizes the data center's computing resource parameters to ensure that tenant tasks run stably.
In summary, when the prior art and its management methods are applied to a cluster shared by multiple tenants, they have the following disadvantages: tenant task priority requirements conflict with the Hadoop scheduling strategy; allocating resources in advance makes resource usage inflexible; and excessive resource preemption makes task execution inefficient. The prior art only provides basic scheduling means and methods, which not only increases the difficulty of data center operation and maintenance but, because everything is done by manual operation, also increases the risk of error. The data center urgently needs an intelligent, dynamic, automatic and real-time means of resource management and scheduling.
Summary of the invention
The embodiments of the invention provide a computing cluster resource scheduling method, device, equipment and medium based on a neural network algorithm, to solve the problems in the prior art of inflexible task resource allocation, inefficient task execution, difficult data center operation and maintenance, and manual operation, so as to achieve intelligent, dynamic, automatic and real-time resource management and scheduling.
In a first aspect, an embodiment of the invention provides a computing cluster resource scheduling method based on a neural network algorithm, the method comprising:
submitting a task;
preprocessing the task to obtain preprocessing information;
extracting data features, task features and cluster computing resource features;
computing the preprocessing information, data features, task features and cluster computing resource features with a neural network algorithm to generate resource allocation information;
running the task according to the resource allocation information, and obtaining task resource monitoring, log feature extraction and accuracy verification;
training the neural network model according to the task resource monitoring, log feature extraction and accuracy verification;
submitting the task again and repeating the above operations several times, the trained neural network model outputting the final cluster resource scheduling scheme after computation.
In a second aspect, an embodiment of the invention provides a computing cluster resource scheduling device based on a neural network algorithm, the device comprising:
a preprocessing module, configured to preprocess a task to obtain preprocessing information;
a feature extraction module, configured to extract data features, task features and cluster computing resource features;
a neural network computing module, configured to compute the preprocessing information, data features, task features and cluster computing resource features to generate resource allocation information;
a task running module, configured to run the task according to the resource allocation information and obtain task resource monitoring, log feature extraction and accuracy verification;
a model training module, configured to train the neural network model according to the task resource monitoring, log feature extraction and accuracy verification.
In a third aspect, an embodiment of the invention provides computing cluster resource scheduling equipment based on a neural network algorithm, comprising: at least one processor, at least one memory, and computer program instructions stored in the memory, the method of the first aspect of the above embodiments being realized when the computer program instructions are executed by the processor.
In a fourth aspect, an embodiment of the invention provides a computer-readable storage medium on which computer program instructions are stored, the method of the first aspect of the above embodiments being realized when the computer program instructions are executed by a processor.
The computing cluster resource scheduling method, device, equipment and medium based on a neural network algorithm provided by the embodiments of the invention have the following beneficial effects:
(1) By using a neural network algorithm, the conflict between task priority requirements and the Hadoop scheduling strategy is resolved.
(2) By using a neural network algorithm, the problem that allocating resources in advance makes resource usage inflexible is solved.
(3) By using a neural network algorithm, the problem that excessive resource preemption makes task execution inefficient is solved.
Brief description of the drawings
In order to explain the technical solutions of the embodiments of the invention more clearly, the drawings required by the embodiments of the invention are briefly described below. For those of ordinary skill in the art, other drawings may be obtained from these drawings without any creative labor.
Fig. 1 shows a flow chart of the computing cluster resource scheduling method based on a neural network algorithm of one embodiment of the invention;
Fig. 2 shows a flow chart of the intelligent scheduling operation of one embodiment of the invention;
Fig. 3 shows a schematic diagram of the BP neural network structure of one embodiment of the invention;
Fig. 4 shows a comparison diagram of Yarn schedulers in the prior art;
Fig. 5 shows a schematic diagram of the modules of the computing cluster resource scheduling device based on a neural network algorithm of one embodiment of the invention;
Fig. 6 shows a schematic diagram of the hardware structure of the computing cluster resource scheduling equipment provided by an embodiment of the invention.
Specific embodiments
The features and exemplary embodiments of the various aspects of the invention are described in detail below. In order to make the purpose, technical solution and advantages of the invention clearer, the invention is further described in detail with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the invention, not to limit it. To those skilled in the art, the invention may be practiced without some of these details. The following description of the embodiments is provided only to give a better understanding of the invention by showing examples of it.
It should be noted that, in this document, relational terms such as first and second are used merely to distinguish one entity or operation from another, without necessarily requiring or implying any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or piece of equipment comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements intrinsic to that process, method, article or equipment. In the absence of further restrictions, an element qualified by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or equipment that includes it.
Fig. 1 shows a flow chart of the computing cluster resource scheduling method based on a neural network algorithm of one embodiment of the invention. The computing cluster resource scheduling method based on a neural network algorithm of the embodiment of the invention comprises:
S110: a task is submitted.
S120: the task is preprocessed to obtain preprocessing information.
S130: data features, task features and cluster computing resource features are extracted.
S140: the preprocessing information, data features, task features and cluster computing resource features are computed with a neural network algorithm to generate resource allocation information.
S150: the task is run according to the resource allocation information, and task resource monitoring, log feature extraction and accuracy verification are obtained.
S160: the neural network model is trained according to the task resource monitoring, log feature extraction and accuracy verification.
S170: the task is submitted again and the above operations are repeated several times; the trained neural network model outputs the final cluster resource scheduling scheme after computation.
In step S110, the task submission environment mainly refers to computing engines based on a Hadoop cluster, such as Hive and Spark. In the prior art scheme, a job is first submitted by the ETL system and then received by the Hive Server2 and Spark Thrift Server server sides.
In step S120, preprocessing the task comprises task proxying and task parsing. Task proxying is implemented by simulating the RPC communication mechanisms of Hive Server2 and Spark Thrift Server; the proxy server intercepts the content of the job submitted by the application. Task parsing extracts job content information including the SQL statement, tenant name and permissions. The content obtained by parsing the SQL statement includes table names, field names, functions, SQL logic, lineage relations, etc.
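The task-parsing step described above can be sketched as follows. This is a minimal illustration only, assuming a simple regex-based extraction; a production parser for HiveQL or Spark SQL would use a real SQL grammar, and all names and the sample statement here are hypothetical.

```python
import re

def parse_job(sql: str, tenant: str) -> dict:
    """Illustrative task parsing: pull table names and aggregate functions
    out of an intercepted SQL statement (regex sketch, not a full parser)."""
    tables = re.findall(r"(?:from|join)\s+([\w.]+)", sql, re.IGNORECASE)
    functions = re.findall(r"\b(count|sum|avg|min|max)\s*\(", sql, re.IGNORECASE)
    return {
        "tenant": tenant,
        "tables": sorted(set(tables)),
        "functions": sorted({f.lower() for f in functions}),
        "sql": sql,
    }

info = parse_job(
    "SELECT dt, COUNT(*) FROM ods.orders o JOIN dim.users u ON o.uid = u.uid GROUP BY dt",
    tenant="tenant_a",
)
```

The returned dictionary is the kind of structured job content the later feature-extraction step would consume.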
In step S130, data features, task features and cluster computing resource features are extracted. Data feature extraction uses the metadata management system and the parsed table names and field names to extract the data volume scale, distribution, quantity, etc. of the data involved in the task computation. Task feature extraction searches the history log information for the running records of tasks of the same type (tasks whose SQL statements are essentially identical), including the user name, state, minimum and maximum resources (CPU and memory), and related information such as the task submission, queueing, allocation, execution and completion durations.
In step S140, the preprocessing information, data features, task features and cluster computing resource features are computed with a neural network algorithm to generate resource allocation information. The computation process of the neural network algorithm comprises feature classification, model application, and output of the resulting resource parameters.
Feature classification inputs the data feature and task feature data into the neural network algorithm model for predictive analysis.
Model application performs the operations of the BP neural network algorithm on the input data features, task features and cluster computing resource features.
Each quality indicator, such as the data features, task features and resource features, is taken as input, and the corresponding resource allocation data is taken as output. In terms of program development, these data are normalized using functions that come with TensorFlow.
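The normalization mentioned above can be illustrated with a small sketch. The text attributes this step to TensorFlow's built-in utilities; the NumPy version below only shows the underlying min-max scaling into [-1, 1] that the excitation-function discussion later assumes, and the sample values are invented.

```python
import numpy as np

def normalize(x: np.ndarray) -> np.ndarray:
    """Min-max scale each column (each quality indicator) into [-1, 1]."""
    lo, hi = x.min(axis=0), x.max(axis=0)
    return 2 * (x - lo) / (hi - lo) - 1

# Three hypothetical samples of two indicators, e.g. data volume and queue wait.
features = np.array([[10.0, 0.5], [20.0, 1.5], [30.0, 2.5]])
scaled = normalize(features)
```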
Fig. 3 shows a schematic diagram of the BP neural network structure of one embodiment of the invention. The design of the BP neural network structure of the invention includes:
Design of the input and output layers: the model takes each quality indicator of each group of data as input and the resource output parameters as output, so the number of nodes in the input layer is 58 and the number of nodes in the output layer is 4. Through repeated testing, the optimal neuron structure and quantities were determined to be <1,40>, <2,13>.
40 and 13 are empirical values adjusted through repeated testing. The common feature quantities among the 58 parameters can in fact be grouped into many categories; the overall major classes are data features, task features and resource features. The 40 neuron nodes of the second layer subdivide the 58 parameters into 40 small subclasses: the 58 parameters are input and 40 feature subclasses are produced as output. The third layer then finds 13 major category features among the 40 features. Finally, the 13 major category features produce a final set of dimension-table data; that is, such a feature combination determines which kind of resource allocation parameters are finally used, the parameter group being 4 values. If this were to be done better, it could be refined further, to 5 or even 10 layers. At present, in theory and experiment, only feasibility is emphasized, not optimality.
Hidden layer design: this method establishes the prediction model using a four-layer, multi-input single-output BP network containing two hidden layers. In the network design process, determining the number of hidden nodes is particularly important. Too many hidden neurons increase the amount of network computation and easily produce overfitting; too few neurons affect network performance and defeat the purpose. The number of hidden neurons in the network is directly related to the complexity of the practical problem, the number of neurons in the input and output layers, and the setting of the expected error. On the question of choosing the number of hidden neurons, this method determined through repeated testing that the optimal neuron structure and quantities are <1,40>, <2,13>.
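The 58-40-13-4 topology described above can be sketched as a forward pass. This is a minimal NumPy illustration with random weights, assuming tanh (tansig) hidden layers and a sigmoid (logsig) output layer as in the excitation-function discussion below; it is not the trained model itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes from the text: 58 input indicators -> 40 sub-class features
# -> 13 major category features -> 4 resource output parameters.
sizes = [58, 40, 13, 4]
weights = [rng.standard_normal((m, n)) * 0.1 for m, n in zip(sizes, sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x: np.ndarray) -> np.ndarray:
    """One forward pass: tanh on the two hidden layers, sigmoid on output."""
    for w, b in zip(weights[:-1], biases[:-1]):
        x = np.tanh(x @ w + b)
    return 1 / (1 + np.exp(-(x @ weights[-1] + biases[-1])))

y = forward(rng.standard_normal(58))  # 4 resource allocation parameters
```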
Selection of the excitation function: BP neural networks generally use the differentiable Sigmoid function and linear functions as the network excitation functions. In this scheme, the S-type tangent function tansig is selected as the excitation function of the hidden neurons; and since the network output is normalized into the range [-1, 1], the prediction model chooses the S-type logarithmic function logsig as the excitation function of the output layer neurons.
Application of the model: this prediction uses the neural network toolbox in TensorFlow to train the network. The specific implementation steps of the preliminary model are as follows: the training sample data is normalized and input into the network; the excitation functions of the network's hidden layers and output layer are set to the tansig and logsig functions respectively; the network training function is traingdx and the network performance function is mse; the network parameters are set with the number of iterations epochs at 5000, the expected error goal at 0.00000001, and the learning rate lr at 0.01. After the parameters are set, training of the network begins.
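The training configuration above can be sketched as a plain gradient-descent loop. To keep the example self-contained it fits a tiny linear model rather than the full 58-40-13-4 network, but it uses the quoted hyperparameters (epochs=5000, error goal=1e-8, lr=0.01) and the mse performance function; the data is synthetic and the setup is an assumption for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((64, 3))       # synthetic training samples
true_w = np.array([0.5, -0.2, 0.1])    # hypothetical target mapping
y = X @ true_w

w = np.zeros(3)
epochs, goal, lr = 5000, 1e-8, 0.01    # parameters quoted in the text
for epoch in range(epochs):
    err = X @ w - y
    mse = float(np.mean(err ** 2))     # network performance function: mse
    if mse < goal:                     # stop once the error goal is met
        break
    w -= lr * 2 * X.T @ err / len(X)   # gradient-descent weight update
```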
After repeated learning on a certain amount of sample data, the network reaches the expected error and learning is complete. Once the neural network model has been trained, the expected result data can be output simply by inputting the various feature data indicators into the network model. Combined with the overall operation process described above, intelligent resource allocation and scheduling can be realized.
Output of the resulting resource parameters: after the analysis by the neural network algorithm is completed, the prediction result is output; the output content mainly includes the resource allocation parameters (minimum and maximum CPU and memory, resource percentage), etc.
Between steps S140 and S150 there are also task resource allocation and task proxy submission.
Resource allocation: the agent program completes the reading of the output parameters, and adjusts and applies them through the resource parameter configuration entry of the Hadoop platform. The main actions include queue creation, queue scheduling strategy, resource parameters, etc.
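The mapping from the model's output to queue resource parameters could look like the following sketch. The property names are in the style of Hadoop's Capacity Scheduler configuration; whether this particular configuration entry is what the agent program uses is an assumption, and the queue name and percentages are invented.

```python
def queue_properties(queue: str, min_pct: float, max_pct: float) -> dict:
    """Map predicted resource allocation onto capacity-scheduler-style
    queue properties (illustrative; an agent would write these into the
    cluster's scheduler configuration and refresh the queues)."""
    prefix = f"yarn.scheduler.capacity.root.{queue}"
    return {
        f"{prefix}.capacity": min_pct,          # guaranteed share (%)
        f"{prefix}.maximum-capacity": max_pct,  # hard upper bound (%)
    }

props = queue_properties("tenant_a", min_pct=20.0, max_pct=40.0)
```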
Task proxy submission: the agent program encapsulates the latest queue resource information into the task information and forwards it for submission to the service receiving end, i.e. HiveServer2 and Spark Thrift Server.
In step S150, the task is run according to the resource allocation information, and task resource monitoring, log feature extraction and accuracy verification are obtained. Specifically this includes:
Task running: the task runs strictly according to the submitted parameters.
Running logs: a large amount of running log information is generated while the task runs; it is stored in text form and can also be obtained through an API.
Task completion: when task execution finishes, the logs are archived as history logs.
Resource monitoring: while the task runs, a monitoring program records the resources the task occupies, including system resources, queue resources, etc., and stores them as structured data. This structured data is exported to the neural network algorithm and used to train the model.
Log feature extraction: this includes the user name, state, minimum and maximum resources (CPU and memory), and related information such as the task submission, queueing, allocation, execution and completion durations. This data is exported to the neural network algorithm and used to train the model.
Accuracy verification: the result of the task execution, i.e. the task run duration under the occupied unit of resources, is compared with the expected value to judge whether the model training succeeded.
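The accuracy-verification comparison can be sketched as a simple check of run time per unit of occupied resource against the expected value. The tolerance and all numbers below are assumptions for illustration; the patent does not specify a threshold.

```python
def training_succeeded(runtime_s: float, resources: float,
                       expected_per_unit: float, tol: float = 0.1) -> bool:
    """Accuracy-verification sketch: compare measured run time per unit
    of occupied resource against the expected value, within a tolerance."""
    per_unit = runtime_s / resources
    return abs(per_unit - expected_per_unit) <= tol * expected_per_unit

# A task that ran 95 s on 10 resource units, against an expectation of
# 10 s per unit, passes; one that took 200 s does not.
ok = training_succeeded(runtime_s=95.0, resources=10.0, expected_per_unit=10.0)
```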
In step S160, the neural network model is trained according to the task resource monitoring, log feature extraction and accuracy verification. Specifically, in the initial stage, dimension data such as resource features, task features, data features and run results are input into the neural network model for supervised learning and training. In the later stage, the model optimizes itself through accuracy verification and the feature data output by tasks, self-tuning the model parameters.
In step S170, the task is submitted again and the above operations are repeated several times; after computation, the trained neural network model outputs the final cluster resource scheduling scheme, i.e. the output parameters.
The Hadoop resource schedulers and strategies used by the invention are specifically as follows.
Three schedulers can be chosen in Hadoop: the FIFO Scheduler, the Capacity Scheduler and the Fair Scheduler.
The first-in-first-out (FIFO) scheduler queues applications in the order of submission. When allocating resources, it first allocates resources to the application at the head of the queue, and moves on to the next application only after the demand of the application at the head has been satisfied, and so on. The FIFO Scheduler is not suitable for a shared cluster in a multi-tenant environment: a big application may occupy all the cluster resources, causing other applications to be blocked. In a shared cluster in a multi-tenant environment, it is more suitable to use the Capacity Scheduler or the Fair Scheduler; both of these schedulers allow big and small tasks submitted at the same time to obtain some system resources. The "Yarn scheduler comparison diagram" (Fig. 4) illustrates the differences between these schedulers; it can be seen from the figure that under the FIFO scheduler small tasks can be blocked by big tasks.
The Capacity scheduler has a dedicated queue for running small tasks, but setting up a dedicated queue for small tasks occupies a certain amount of cluster resources in advance, which causes the execution time of big tasks to lag behind the time they would take under the FIFO scheduler.
The Fair scheduler does not need to occupy a certain amount of system resources in advance; it dynamically adjusts the system resources for all running jobs. When the first big job is submitted, only this job is running, and at this time it obtains all the cluster resources; after the second small task is submitted, the Fair scheduler allocates half of the resources to this small task, letting the two tasks share the cluster resources fairly.
It should be noted that in the Fair scheduler of Fig. 4 there is a certain delay between the second task being submitted and it obtaining resources, because it has to wait for the first task to release the Containers it occupies. After the small task completes, it also releases the resources it occupies, and the big task obtains all the system resources again. The final effect is that the Fair scheduler achieves high resource utilization while also guaranteeing that small tasks are completed in time.
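The FIFO-versus-Fair behavior described above can be reproduced with a toy simulation. This is only an illustrative model of the comparison in Fig. 4, not Yarn's actual scheduling logic: a cluster does 10 units of work per tick, a big job (100 units) is running when a small job (4 units) arrives at t=1.

```python
def finish_times(policy: str) -> dict:
    """Toy tick-based simulation: FIFO gives all resources to the head of
    the queue; fair splits resources equally among running jobs."""
    remaining = {"big": 100.0, "small": 4.0}
    arrived = {"big": 0, "small": 1}
    done, t = {}, 0
    while remaining:
        active = [j for j in remaining if arrived[j] <= t]
        if policy == "fifo":
            active = active[:1]          # head of the queue gets everything
        share = 10.0 / len(active)       # equal split under fair sharing
        for j in active:
            remaining[j] -= share
            if remaining[j] <= 0:
                done[j] = t + 1          # tick at which the job finishes
                del remaining[j]
        t += 1
    return done

fifo, fair = finish_times("fifo"), finish_times("fair")
```

Under FIFO the small job waits behind the big one; under fair sharing it finishes almost immediately while the big job ends only slightly later, matching the high-utilization-plus-timely-small-tasks effect described in the text.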
The prior art is based on manual management, completed through a complicated set of procedures such as planning resources in advance, allocating resources, observing the effect of resource usage, and adjusting resources. Hadoop itself only provides three basic scheduler mechanisms that satisfy simple resource management and scheduling. In terms of resource allocation, a very large multi-tenant data center faces many problems: resource queue allocation is too coarse, the resource setting cycle and renewal time are long, resource adjustment is coarse-grained and has no real algorithmic support, dynamic scheduling ability is lacking, tenant task priority requirements cannot be satisfied, and a mixed strategy of complete isolation plus full sharing cannot be satisfied, among others.
The invention uses a set of processes and programs to complete task-based, per-submission resource management and allocation: resources are allocated on demand according to the features of each task, refining resource management from the tenant level to the task level. For algorithmic support it uses a neural network algorithm and machine learning techniques, considering the resource allocation requirements of a single task from more than 50 input factors and variables gathered across the three large dimensions of data features, task features and resource features. It solves the various problems described above, forms a self-training ability that continuously improves precision, replaces manual work with intelligent resource allocation and scheduling, improves equipment utilization and data analysis efficiency, and reduces the energy consumption of the data center.
Fig. 5 shows a schematic diagram of the modules of the computing cluster resource scheduling device based on a neural network algorithm of one embodiment of the invention. The device includes:
S210: a submission module, for submitting a task.
S220: a preprocessing module, for preprocessing the task to obtain preprocessing information.
S230: an extraction module, for extracting data features, task features and cluster computing resource features.
S240: a neural network computing module, for computing the preprocessing information, data features, task features and cluster computing resource features to generate resource allocation information.
S250: a running module, for running the task according to the resource allocation information and obtaining task resource monitoring, log feature extraction and accuracy verification.
S260: a model training module, for training the neural network model according to the task resource monitoring, log feature extraction and accuracy verification.
S270: an output module, for submitting the task again and repeating the above operations several times, the trained neural network model outputting the final cluster resource scheduling scheme after computation.
The computing cluster resource scheduling device based on a neural network algorithm of the invention can realize the computing cluster resource scheduling method based on a neural network algorithm of steps S110-S170.
In a production cluster based on Hadoop technology, an application's request for computing resources should, in theory, be satisfied promptly. In practice, however, the cluster's computing resources are often limited, and applications frequently have to queue for them. This is a problem in itself: for the scheduling of computing resources, the prior art struggles to find a single strategy that satisfies all application scenarios.
Existing technical solutions generally use the schedulers and policies built into Hadoop, supplemented by system operation and maintenance personnel who manually manage and schedule the data center's computing resources.
The system administrator first selects a scheduling policy, then plans ahead according to the tenants' resource requirements: all known tenant queues are created in advance, and the minimum and maximum computing resources each tenant queue may use are set. Each tenant is then strictly required to submit tasks to its designated queue. This guarantees each tenant a minimum of computing resources while also enforcing resource isolation and limits.
Later, the system administrator assesses the reasonableness of each tenant's computing resources based on the efficiency of that tenant's task runs, and adjusts and optimizes the data center's computing-resource parameters to ensure that the tenants' tasks run stably.
Fig. 6 shows a schematic diagram of the hardware structure of a cluster computing resource scheduling device provided by an embodiment of the present invention.
The cluster computing resource scheduling device may include a processor 401 and a memory 402 storing computer program instructions.
Specifically, the processor 401 may include a central processing unit (CPU), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits configured to implement embodiments of the present invention.
The memory 402 may include mass storage for data or instructions. By way of example and not limitation, the memory 402 may include a hard disk drive (Hard Disk Drive, HDD), a floppy disk drive, flash memory, an optical disc, a magneto-optical disc, magnetic tape, a Universal Serial Bus (USB) drive, or a combination of two or more of the above. Where appropriate, the memory 402 may include removable or non-removable (or fixed) media. Where appropriate, the memory 402 may be internal or external to the data processing device. In particular embodiments, the memory 402 is non-volatile solid-state memory. In particular embodiments, the memory 402 includes read-only memory (ROM). Where appropriate, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), flash memory, or a combination of two or more of the above.
The processor 401 implements any one of the cluster computing resource scheduling methods of the above embodiments by reading and executing the computer program instructions stored in the memory 402.
In one example, the cluster computing resource scheduling device may further include a communication interface 403 and a bus 410. As shown in Fig. 6, the processor 401, the memory 402 and the communication interface 403 are connected via the bus 410 and communicate with one another through it.
The communication interface 403 is mainly used to implement communication among the modules, apparatuses, units and/or devices in embodiments of the present invention.
The bus 410 includes hardware, software, or both, and couples the components of the cluster computing resource scheduling device to one another. By way of example and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a front-side bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCI-X) bus, a Serial Advanced Technology Attachment (SATA) bus, a VESA Local Bus (VLB), another suitable bus, or a combination of two or more of the above. Where appropriate, the bus 410 may include one or more buses. Although embodiments of the present invention describe and illustrate specific buses, the present invention contemplates any suitable bus or interconnect.
In addition, in combination with the cluster computing resource scheduling methods in the above embodiments, embodiments of the present invention may provide a computer-readable storage medium for their implementation. Computer program instructions are stored on the computer-readable storage medium; when executed by a processor, the computer program instructions implement any one of the cluster computing resource scheduling methods in the above embodiments.
It should be clear that the present invention is not limited to the specific configurations and processing described above and shown in the figures. For brevity, detailed descriptions of known methods are omitted here. In the above embodiments, several specific steps are described and illustrated as examples. However, the method processes of the present invention are not limited to the specific steps described and illustrated; those skilled in the art, after understanding the spirit of the present invention, may make various changes, modifications and additions, or change the order of the steps.
The functional blocks shown in the structural block diagrams described above may be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, they may be, for example, electronic circuits, application-specific integrated circuits (ASICs), appropriate firmware, plug-ins, function cards, and so on. When implemented in software, the elements of the present invention are the programs or code segments used to perform the required tasks. The programs or code segments may be stored in a machine-readable medium, or transmitted over a transmission medium or communication link by a data signal carried in a carrier wave. A "machine-readable medium" may include any medium capable of storing or transmitting information. Examples of machine-readable media include electronic circuits, semiconductor memory devices, ROM, flash memory, erasable ROM (EROM), floppy disks, CD-ROMs, optical discs, hard disks, fiber-optic media, radio-frequency (RF) links, and so on. The code segments may be downloaded via a computer network such as the Internet or an intranet.
It should also be noted that the exemplary embodiments mentioned in the present invention describe certain methods or systems based on a series of steps or apparatuses. However, the present invention is not limited to the order of the above steps; that is, the steps may be performed in the order mentioned in the embodiments, in an order different from that in the embodiments, or several steps may be performed simultaneously.
The above are only specific embodiments of the present invention. It is apparent to those skilled in the art that, for convenience and brevity of description, for the specific working processes of the systems, modules and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here. It should be understood that the protection scope of the present invention is not limited thereto; any person familiar with the art can readily conceive of various equivalent modifications or substitutions within the technical scope disclosed by the present invention, and such modifications or substitutions shall be covered by the protection scope of the present invention.
Claims (12)
1. A cluster computing resource scheduling method based on a neural network algorithm, characterized in that the method comprises:
submitting a task;
preprocessing the task to obtain preprocessing information;
extracting data features, task features and cluster computing resource features;
computing the preprocessing information, the data features, the task features and the cluster computing resource features with the neural network algorithm to generate resource allocation information;
running the task according to the resource allocation information, and obtaining task resource monitoring, log feature extraction and accuracy verification;
training a neural network model according to the task resource monitoring, the log feature extraction and the accuracy verification;
submitting the task again, repeating the above operations several times, and outputting a final cluster computing resource scheduling scheme after computation by the trained neural network model.
2. The method according to claim 1, characterized in that the preprocessing of the task comprises: task proxying and task parsing.
3. The method according to claim 1, characterized in that the extracting of the data features, task features and cluster computing resource features comprises: obtaining the data features, task features and cluster computing resource features from a data system.
4. The method according to claim 1, characterized in that the data features comprise: table names, field names, functions, SQL logic and lineage relationships.
5. The method according to claim 1, characterized in that the task features comprise: user name, status, maximum and minimum CPU and memory, and task submission, queuing, allocation, execution and completion durations.
6. The method according to claim 1, characterized in that the neural network algorithm is a BP neural network algorithm.
7. The method according to claim 1, characterized in that the resource allocation information comprises: maximum and minimum CPU and memory, and resource percentage.
8. The method according to claim 1, characterized in that the task resource monitoring comprises: a monitoring program storing the system resources and queue resources occupied during task operation as structured data.
9. The method according to claim 1, characterized in that the log feature extraction comprises: user name, status, maximum and minimum CPU and memory, and task submission, queuing, allocation, execution and completion durations.
10. A cluster computing resource scheduling apparatus based on a neural network algorithm, characterized in that the apparatus comprises:
a submission module, for submitting a task;
a preprocessing module, for preprocessing the task to obtain preprocessing information;
an extraction module, for extracting data features, task features and cluster computing resource features;
a neural network computing module, for computing the preprocessing information, data features, task features and cluster computing resource features to generate resource allocation information;
a run module, for running the task according to the resource allocation information, and obtaining task resource monitoring, log feature extraction and accuracy verification;
a model training module, for training a neural network model according to the task resource monitoring, log feature extraction and accuracy verification;
an output module, for submitting the task again, repeating the above operations several times, and outputting the final cluster computing resource scheduling scheme after computation by the trained neural network model.
11. A cluster computing resource scheduling device, characterized by comprising: at least one processor, at least one memory, and computer program instructions stored in the memory, wherein the method according to any one of claims 1-9 is implemented when the computer program instructions are executed by the processor.
12. A computer-readable storage medium on which computer program instructions are stored, characterized in that the method according to any one of claims 1-9 is implemented when the computer program instructions are executed by a processor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711494651.0A CN109992404B (en) | 2017-12-31 | 2017-12-31 | Cluster computing resource scheduling method, device, equipment and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711494651.0A CN109992404B (en) | 2017-12-31 | 2017-12-31 | Cluster computing resource scheduling method, device, equipment and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109992404A true CN109992404A (en) | 2019-07-09 |
CN109992404B CN109992404B (en) | 2022-06-10 |
Family
ID=67110823
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711494651.0A Active CN109992404B (en) | 2017-12-31 | 2017-12-31 | Cluster computing resource scheduling method, device, equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109992404B (en) |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110750363A (en) * | 2019-12-26 | 2020-02-04 | 中科寒武纪科技股份有限公司 | Computer storage management method and device, electronic equipment and storage medium |
CN110795226A (en) * | 2020-01-03 | 2020-02-14 | 中科寒武纪科技股份有限公司 | Method for processing task using computer system, electronic device and storage medium |
CN111190718A (en) * | 2020-01-07 | 2020-05-22 | 第四范式(北京)技术有限公司 | Method, device and system for realizing task scheduling |
CN111209077A (en) * | 2019-12-26 | 2020-05-29 | 中科曙光国际信息产业有限公司 | Deep learning framework design method |
CN111381970A (en) * | 2020-03-16 | 2020-07-07 | 第四范式(北京)技术有限公司 | Cluster task resource allocation method and device, computer device and storage medium |
CN111985831A (en) * | 2020-08-27 | 2020-11-24 | 北京华胜天成科技股份有限公司 | Scheduling method and device of cloud computing resources, computer equipment and storage medium |
CN112000478A (en) * | 2020-08-24 | 2020-11-27 | 中国银行股份有限公司 | Job operation resource allocation method and device |
CN112241321A (en) * | 2020-09-24 | 2021-01-19 | 北京影谱科技股份有限公司 | Computing power scheduling method and device based on Kubernetes |
CN112256418A (en) * | 2020-10-26 | 2021-01-22 | 清华大学深圳国际研究生院 | Big data task scheduling method |
CN112862085A (en) * | 2019-11-27 | 2021-05-28 | 杭州海康威视数字技术股份有限公司 | Storage space optimization method and device |
CN112953767A (en) * | 2021-02-05 | 2021-06-11 | 深圳前海微众银行股份有限公司 | Resource allocation parameter setting method and device based on Hadoop platform and storage medium |
CN113037800A (en) * | 2019-12-09 | 2021-06-25 | 华为技术有限公司 | Job scheduling method and job scheduling device |
CN113296907A (en) * | 2021-04-29 | 2021-08-24 | 上海淇玥信息技术有限公司 | Task scheduling processing method and system based on cluster and computer equipment |
CN113535399A (en) * | 2021-07-15 | 2021-10-22 | 电子科技大学 | NFV resource scheduling method, device and system |
CN113568425A (en) * | 2020-04-28 | 2021-10-29 | 北京理工大学 | Cluster cooperative guidance method based on neural network learning |
CN113612839A (en) * | 2021-07-30 | 2021-11-05 | 国汽智控(北京)科技有限公司 | Method and device for determining driving task calculation terminal and computer equipment |
CN113886036A (en) * | 2021-09-13 | 2022-01-04 | 天翼数字生活科技有限公司 | Method and system for optimizing cluster configuration of distributed system |
CN114297808A (en) * | 2020-12-02 | 2022-04-08 | 北京航空航天大学 | Task allocation and resource scheduling method of avionics system |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5442730A (en) * | 1993-10-08 | 1995-08-15 | International Business Machines Corporation | Adaptive job scheduling using neural network priority functions |
CN103593323A (en) * | 2013-11-07 | 2014-02-19 | 浪潮电子信息产业股份有限公司 | Machine learning method for Map Reduce task resource allocation parameters |
CN103699440A (en) * | 2012-09-27 | 2014-04-02 | 北京搜狐新媒体信息技术有限公司 | Method and device for cloud computing platform system to distribute resources to task |
CN105337765A (en) * | 2015-10-10 | 2016-02-17 | 上海新炬网络信息技术有限公司 | Distributed hadoop cluster fault automatic diagnosis and restoration system |
CN105718479A (en) * | 2014-12-04 | 2016-06-29 | 中国电信股份有限公司 | Execution strategy generation method and device under cross-IDC (Internet Data Center) big data processing architecture |
CN106790529A (en) * | 2016-12-20 | 2017-05-31 | 北京并行科技股份有限公司 | The dispatching method of computing resource, control centre and scheduling system |
CN106777079A (en) * | 2016-12-13 | 2017-05-31 | 苏州蜗牛数字科技股份有限公司 | A kind of daily record data Visualized Analysis System and method |
CN107229693A (en) * | 2017-05-22 | 2017-10-03 | 哈工大大数据产业有限公司 | The method and system of big data system configuration parameter tuning based on deep learning |
Cited By (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112862085A (en) * | 2019-11-27 | 2021-05-28 | 杭州海康威视数字技术股份有限公司 | Storage space optimization method and device |
CN112862085B (en) * | 2019-11-27 | 2023-08-22 | 杭州海康威视数字技术股份有限公司 | Storage space optimization method and device |
CN113037800A (en) * | 2019-12-09 | 2021-06-25 | 华为技术有限公司 | Job scheduling method and job scheduling device |
CN113037800B (en) * | 2019-12-09 | 2024-03-05 | 华为云计算技术有限公司 | Job scheduling method and job scheduling device |
CN111209077A (en) * | 2019-12-26 | 2020-05-29 | 中科曙光国际信息产业有限公司 | Deep learning framework design method |
CN110750363A (en) * | 2019-12-26 | 2020-02-04 | 中科寒武纪科技股份有限公司 | Computer storage management method and device, electronic equipment and storage medium |
CN110795226A (en) * | 2020-01-03 | 2020-02-14 | 中科寒武纪科技股份有限公司 | Method for processing task using computer system, electronic device and storage medium |
CN111190718A (en) * | 2020-01-07 | 2020-05-22 | 第四范式(北京)技术有限公司 | Method, device and system for realizing task scheduling |
CN111381970A (en) * | 2020-03-16 | 2020-07-07 | 第四范式(北京)技术有限公司 | Cluster task resource allocation method and device, computer device and storage medium |
WO2021185206A1 (en) * | 2020-03-16 | 2021-09-23 | 第四范式(北京)技术有限公司 | Resource allocation method and apparatus for cluster task, and computer apparatus and storage medium |
CN113568425B (en) * | 2020-04-28 | 2024-05-14 | 北京理工大学 | Cluster collaborative guidance method based on neural network learning |
CN113568425A (en) * | 2020-04-28 | 2021-10-29 | 北京理工大学 | Cluster cooperative guidance method based on neural network learning |
CN112000478A (en) * | 2020-08-24 | 2020-11-27 | 中国银行股份有限公司 | Job operation resource allocation method and device |
CN112000478B (en) * | 2020-08-24 | 2024-02-23 | 中国银行股份有限公司 | Method and device for distributing operation resources |
CN111985831A (en) * | 2020-08-27 | 2020-11-24 | 北京华胜天成科技股份有限公司 | Scheduling method and device of cloud computing resources, computer equipment and storage medium |
CN112241321A (en) * | 2020-09-24 | 2021-01-19 | 北京影谱科技股份有限公司 | Computing power scheduling method and device based on Kubernetes |
CN112256418B (en) * | 2020-10-26 | 2023-10-24 | 清华大学深圳国际研究生院 | Big data task scheduling method |
CN112256418A (en) * | 2020-10-26 | 2021-01-22 | 清华大学深圳国际研究生院 | Big data task scheduling method |
CN114297808A (en) * | 2020-12-02 | 2022-04-08 | 北京航空航天大学 | Task allocation and resource scheduling method of avionics system |
CN114297808B (en) * | 2020-12-02 | 2023-04-07 | 北京航空航天大学 | Task allocation and resource scheduling method of avionics system |
CN112953767B (en) * | 2021-02-05 | 2022-11-04 | 深圳前海微众银行股份有限公司 | Resource allocation parameter setting method and device based on Hadoop platform and storage medium |
CN112953767A (en) * | 2021-02-05 | 2021-06-11 | 深圳前海微众银行股份有限公司 | Resource allocation parameter setting method and device based on Hadoop platform and storage medium |
CN113296907B (en) * | 2021-04-29 | 2023-12-22 | 上海淇玥信息技术有限公司 | Task scheduling processing method, system and computer equipment based on clusters |
CN113296907A (en) * | 2021-04-29 | 2021-08-24 | 上海淇玥信息技术有限公司 | Task scheduling processing method and system based on cluster and computer equipment |
CN113535399A (en) * | 2021-07-15 | 2021-10-22 | 电子科技大学 | NFV resource scheduling method, device and system |
CN113612839A (en) * | 2021-07-30 | 2021-11-05 | 国汽智控(北京)科技有限公司 | Method and device for determining driving task calculation terminal and computer equipment |
CN113886036A (en) * | 2021-09-13 | 2022-01-04 | 天翼数字生活科技有限公司 | Method and system for optimizing cluster configuration of distributed system |
CN113886036B (en) * | 2021-09-13 | 2024-04-19 | 天翼数字生活科技有限公司 | Method and system for optimizing distributed system cluster configuration |
Also Published As
Publication number | Publication date |
---|---|
CN109992404B (en) | 2022-06-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109992404A (en) | Cluster computing resource scheduling method, device, equipment and medium | |
Mei et al. | An efficient feature selection algorithm for evolving job shop scheduling rules with genetic programming | |
CN103092683B (en) | For data analysis based on didactic scheduling | |
Tay et al. | Evolving dispatching rules using genetic programming for solving multi-objective flexible job-shop problems | |
CN110389820B (en) | Private cloud task scheduling method for resource prediction based on v-TGRU model | |
US11436050B2 (en) | Method, apparatus and computer program product for resource scheduling | |
Hunt et al. | Evolving" less-myopic" scheduling rules for dynamic job shop scheduling with genetic programming | |
CN105373432B (en) | A kind of cloud computing resource scheduling method based on virtual resource status predication | |
CN110008259A (en) | The method and terminal device of visualized data analysis | |
CN113778646B (en) | Task level scheduling method and device based on execution time prediction | |
CN110825522A (en) | Spark parameter self-adaptive optimization method and system | |
Cheong et al. | SCARL: Attentive reinforcement learning-based scheduling in a multi-resource heterogeneous cluster | |
CN109343972A (en) | Task processing method and terminal device | |
CN110084507A (en) | The scientific workflow method for optimizing scheduling of perception is classified under cloud computing environment | |
CN112148471A (en) | Method and device for scheduling resources in distributed computing system | |
Đurasević et al. | Collaboration methods for ensembles of dispatching rules for the dynamic unrelated machines environment | |
CN113608858A (en) | MapReduce architecture-based block task execution system for data synchronization | |
CN116501505B (en) | Method, device, equipment and medium for generating data stream of load task | |
RU2411574C2 (en) | Intellectual grid-system for highly efficient data processing | |
CN114327925A (en) | Power data real-time calculation scheduling optimization method and system | |
Prado et al. | On providing quality of service in grid computing through multi-objective swarm-based knowledge acquisition in fuzzy schedulers | |
CN115599522A (en) | Task scheduling method, device and equipment for cloud computing platform | |
Tuli et al. | Optimizing the Performance of Fog Computing Environments Using AI and Co-Simulation | |
Liu et al. | 5G/B5G Network Slice Management via Staged Reinforcement Learning | |
WO2017085454A1 (en) | Fuzzy caching mechanism for thread execution layouts |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |