CN110008028A - Computational resource allocation method, apparatus, computer equipment and storage medium - Google Patents
- Publication number: CN110008028A
- Application number: CN201910285304.XA
- Authority: China (CN)
- Prior art keywords
- machine
- operating instruction
- neural network
- instruction
- state
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
Abstract
This application relates to a computational resource allocation method, apparatus, computer device and storage medium. The method includes: traversing the topology graph of a distributed neural network to obtain the operating instructions of each machine; traversing the operating instructions of each machine according to the topology graph to obtain the single-machine neural network corresponding to each machine; obtaining the state of the operating instructions of each machine; and performing training or inference computation in the single-machine neural network corresponding to each machine according to the state of its operating instructions. This method facilitates resource allocation in a distributed deep neural network and improves the operational efficiency of the distributed deep neural network.
Description
Technical field
This application relates to the field of information processing, and in particular to a computational resource allocation method, apparatus, computer device and storage medium.
Background
With the development of Internet technology, deep neural networks have been widely applied in artificial intelligence fields such as image recognition and speech processing, owing to features such as good generalization and ease of training. Since neural network computation places high demands on the computing capability and memory capacity of a device, in order to run on devices with lower computing capability and smaller memory, the neural network generally needs to be transformed so that it can run in a distributed manner on multiple machines and multiple devices.
A distributed neural network can employ data-parallel, model-parallel and hybrid-parallel methods. Data parallelism means that different machines hold copies of the same model; the machines are assigned different data inputs, and the results computed by all machines are then merged in some manner. Model parallelism means that different machines in the distributed system are responsible for different parts of a single network model; for example, different arithmetic operations of the neural network model are assigned to different machines, or one large arithmetic operation is split into several smaller operations that are distributed to different machines. Hybrid parallelism means that each machine not only has different inputs but also differs to some extent in neural network structure.

However, resource allocation schemes in current distributed neural networks are very complex, and operational efficiency is easily reduced if they are handled improperly.
Summary of the invention
In view of the above technical problems, it is necessary to provide a computational resource allocation method, apparatus, computer device and storage medium that facilitate resource allocation in a distributed deep neural network and improve the operational efficiency of the distributed deep neural network.
A computational resource allocation method, the method comprising:
traversing the topology graph of a distributed neural network to obtain the operating instructions of each machine;
traversing the operating instructions of each machine according to the topology graph to obtain the single-machine neural network corresponding to each machine;
obtaining the state of the operating instructions of each machine;
performing training or inference computation in the single-machine neural network corresponding to each machine according to the state of the operating instructions of that machine.
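The four claimed steps can be sketched in Python. Every name below (the dict-based instruction format, `run`, etc.) is a hypothetical illustration under assumed data structures; the patent does not specify any API.

```python
# Illustrative sketch of the four claimed steps; every name here is
# hypothetical -- the patent does not define a concrete data format.

def extract_machine_instructions(topology, machine_id):
    """Step 1: traverse the topology graph and keep the operating
    instructions whose machine number matches the current machine."""
    return [op for op in topology if op["node"] == machine_id]

def build_single_machine_network(instructions):
    """Step 2: convert one machine's instructions into its
    single-machine network (here: just an ordered container)."""
    return {"ops": instructions}

def get_states(instructions):
    """Step 3: read each instruction's state via its state identifier."""
    return {op["name"]: op["tag"] for op in instructions}

def run(topology, machine_id):
    """Step 4: the states would drive training or inference on the
    single-machine network; here we only assemble the inputs."""
    ops = extract_machine_instructions(topology, machine_id)
    net = build_single_machine_network(ops)
    states = get_states(ops)
    return net, states

topology = [
    {"name": "add0", "node": 0, "tag": 0},
    {"name": "add1", "node": 1, "tag": 1},
    {"name": "mul0", "node": 0, "tag": 0},
]
net, states = run(topology, machine_id=0)
print([op["name"] for op in net["ops"]])   # ['add0', 'mul0']
```

Machine 0 ends up with only the instructions placed on it, which is the essence of the multi-machine to single-machine conversion.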
In one embodiment, traversing the topology graph of the distributed neural network to obtain the operating instructions of each machine comprises: traversing the topology graph, and if the machine number of an operating instruction in the topology graph is consistent with the number of the currently running machine, taking the operating instruction as an operating instruction on the currently running machine.
In one embodiment, traversing the operating instructions of each machine according to the topology graph to obtain the single-machine neural network corresponding to each machine comprises:
traversing the operating instructions of each machine according to the topology graph to obtain the machine number parameter and device number parameter of the operating instructions of each machine;
judging, according to the machine number parameter and device number parameter, whether a data copy instruction needs to be added between different devices, and whether a network transmission instruction needs to be added between different machines;
constructing, according to the topology graph, the forward subgraph and backward subgraph of each single-machine neural network;
performing gradient computation on the operating instructions with identical states in each single-machine neural network to obtain the single-machine neural network corresponding to each machine.
In one embodiment, traversing the operating instructions of each machine according to the topology graph to obtain the single-machine neural network corresponding to each machine further comprises:
traversing the operating instructions of each machine according to the topology graph to obtain the machine number parameter and device number parameter of the operating instructions of each machine;
judging, according to the machine number parameter and device number parameter, whether a data copy instruction needs to be added between different devices, and whether a network transmission instruction needs to be added between different machines;
constructing, according to the topology graph, the forward subgraph of each single-machine neural network to obtain the single-machine neural network corresponding to each machine.
In one embodiment, judging, according to the machine number parameter and device number parameter, whether a data copy instruction needs to be added between different devices, and whether a network transmission instruction needs to be added between different machines, comprises:
if the machine number parameter of the operating instruction is consistent with the input machine number of the operating instruction, but the device number parameter of the operating instruction is inconsistent with the input device number of the operating instruction, adding the data copy instruction between different devices;
if the machine number parameter of the operating instruction is inconsistent with the input machine number of the operating instruction, adding the network transmission instruction between different machines.
In one embodiment, constructing the forward subgraph and backward subgraph of each single-machine neural network according to the topology graph comprises:
constructing the forward subgraph by forward computation to obtain the endpoints of each single-machine neural network;
constructing the backward subgraph by backward computation, and updating the states of the operating instructions in each single-machine neural network.
In one embodiment, performing gradient computation on the operating instructions with identical states in each single-machine neural network comprises:
adding the original gradients of the operating instructions with identical states to obtain the updated gradient of each operating instruction.
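The gradient step above can be sketched as follows. The dict-based instruction representation is an assumption for illustration; the patent only specifies that raw gradients of shared-state instructions are summed.

```python
# Sketch: instructions that share a state (same tag) have their raw
# gradients summed; every instruction in a shared group then receives
# the summed gradient as its updated gradient.

from collections import defaultdict

def accumulate_gradients(ops):
    total = defaultdict(float)
    for op in ops:
        total[op["tag"]] += op["grad"]
    return {op["name"]: total[op["tag"]] for op in ops}

ops = [
    {"name": "w0_copy_a", "tag": 0, "grad": 0.5},
    {"name": "w0_copy_b", "tag": 0, "grad": 0.25},
    {"name": "w1", "tag": 1, "grad": 1.0},
]
print(accumulate_gradients(ops))
# {'w0_copy_a': 0.75, 'w0_copy_b': 0.75, 'w1': 1.0}
```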
In one embodiment, the method further comprises: constructing the topology graph of the distributed neural network.
In one embodiment, constructing the topology graph of the distributed neural network comprises:
distributing the operating instructions of each machine into instruction lists according to the states of the operating instructions of each machine;
computing the operating instructions in the instruction lists in the distributed neural network, and updating the states of the operating instructions in real time;
after the computation is completed, storing the state of each operating instruction in the instruction lists.
In one embodiment, distributing the operating instructions of each machine into instruction lists according to the states of the operating instructions of each machine comprises:
distributing multiple operating instructions with identical states into the same instruction list.
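A minimal sketch of this distribution rule, assuming instructions are dicts keyed by a `tag` state identifier (an illustrative format, not one defined by the patent):

```python
# Sketch: instructions with the same state identifier (tag) land in
# the same instruction list.

from collections import defaultdict

def distribute_to_instruction_lists(ops):
    lists = defaultdict(list)
    for op in ops:
        lists[op["tag"]].append(op["name"])
    return dict(lists)

ops = [
    {"name": "conv0", "tag": 0},
    {"name": "conv1", "tag": 1},
    {"name": "fc0", "tag": 0},
]
print(distribute_to_instruction_lists(ops))
# {0: ['conv0', 'fc0'], 1: ['conv1']}
```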
In one embodiment, computing the operating instructions in the instruction lists in the distributed neural network comprises:
computing the operating instructions in the instruction lists in the distributed neural network to obtain first calculation results;
merging the first calculation results corresponding to the multiple instruction lists to obtain a distributed computation result.
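The merge step can be sketched as below. The patent does not specify how per-list results are combined, so the concatenate-then-sum here is purely an assumed placeholder for whatever merge the network requires.

```python
# Sketch: each instruction list yields a first calculation result;
# merging is modelled (as an assumption) by concatenating the partial
# results and reducing them with a sum.

def merge_results(first_results):
    merged = [x for partial in first_results for x in partial]
    return merged, sum(merged)

r, total = merge_results([[1.0, 2.0], [3.0], [4.0]])
print(r, total)  # [1.0, 2.0, 3.0, 4.0] 10.0
```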
In one embodiment, the method further comprises:
during computational resource allocation and inference computation, performing timed storage of the states of the operating instructions; or
during computational resource allocation and inference computation, performing distributed storage of the states of the operating instructions.
In one embodiment, the operating instructions include single-device operating instructions, distributed operating instructions and parameter operating instructions.
In one embodiment, the parameters of the operating instructions further include distributed attributes, split operating instructions and merge operating instructions.
A computational resource allocation apparatus, the apparatus comprising:
an operating instruction obtaining module, configured to traverse the topology graph of a distributed neural network to obtain the operating instructions of each machine;
a single-machine neural network obtaining module, configured to traverse the operating instructions of each machine according to the topology graph to obtain the single-machine neural network corresponding to each machine;
an operating instruction state obtaining module, configured to obtain the state of the operating instructions of each machine;
a computing module, configured to perform training or inference computation in the single-machine neural network corresponding to each machine according to the state of the operating instructions of each machine.
A computer device, including a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the following steps:
traversing the topology graph of a distributed neural network to obtain the operating instructions of each machine;
traversing the operating instructions of each machine according to the topology graph to obtain the single-machine neural network corresponding to each machine;
obtaining the state of the operating instructions of each machine;
performing training or inference computation in the single-machine neural network corresponding to each machine according to the state of the operating instructions of that machine.
A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the following steps:
traversing the topology graph of a distributed neural network to obtain the operating instructions of each machine;
traversing the operating instructions of each machine according to the topology graph to obtain the single-machine neural network corresponding to each machine;
obtaining the state of the operating instructions of each machine;
performing training or inference computation in the single-machine neural network corresponding to each machine according to the state of the operating instructions of that machine.
According to the above computational resource allocation method, apparatus, computer device and storage medium, the operating instructions of each machine are extracted from the topology graph of a distributed neural network, the single-machine neural network corresponding to each machine is obtained according to the operating instructions of each machine, and finally training or inference computation is performed in the single-machine neural network corresponding to each machine according to the state of the operating instructions of that machine. This improves the convenience of storing and loading operating instruction states in the distributed neural network, enables multiple machines to perform computational resource allocation and inference computation synchronously, and improves the operational efficiency of the distributed deep neural network.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of a computational resource allocation system 100 in one embodiment;
Fig. 2 is a schematic flowchart of a computational resource allocation method in one embodiment;
Fig. 3 is a topology graph of a distributed neural network in one embodiment;
Fig. 4 is a schematic flowchart of performing model conversion during training in one embodiment;
Fig. 5 is a schematic flowchart of performing model conversion during inference computation in one embodiment;
Fig. 6 is a schematic flowchart of constructing a distributed neural network in one embodiment;
Fig. 7 is a structural block diagram of a computational resource allocation apparatus in one embodiment;
Fig. 8 is an internal structure diagram of a computer device in one embodiment.
Detailed description
In order to make the objects, technical solutions and advantages of this application more clearly understood, this application is further elaborated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain this application and are not intended to limit it.
In one embodiment, as shown in Fig. 1, a computational resource allocation system 100 is provided, comprising: an allocation unit 101, a conversion unit 102, a storage unit 103, a loading unit 104 and a computing unit 105. The allocation unit 101 is configured to distribute an operating instruction into a corresponding instruction list according to the state identifier of the operating instruction; the conversion unit 102 is configured to obtain the single-machine neural network corresponding to each machine according to the topology graph of the distributed neural network; the storage unit 103 is configured to store the states of operating instructions; the loading unit 104 is configured to load the states of operating instructions; and the computing unit 105 is configured to perform inference computation in the distributed neural network or the single-machine neural network. In addition, the computational resource allocation system 100 further includes a design unit 106 for providing a set of primitives (i.e., the parameters of operating instructions) that can describe the above distributed neural network.
In one embodiment, as shown in Fig. 2, a computational resource allocation method is provided. The method can run in the computational resource allocation system shown in Fig. 1 and comprises the following steps:
Step S202: traverse the topology graph of the distributed neural network to obtain the operating instructions of each machine.
Here, the distributed neural network refers to a multi-machine global neural network, and the topology graph of the distributed neural network refers to the topology graph according to which multiple machines perform distributed computation (see Fig. 3). An operating instruction indicates which kind of operation each machine performs; for example, an addition operating instruction instructs a machine to perform an addition operation on input data, and a comparison operating instruction instructs a machine to perform a comparison operation on multiple input data.
Further, the operating instructions include single-device operating instructions, distributed operating instructions and parameter operating instructions. A single-device operating instruction is an instruction that can only run on a single machine or device; single-device operating instructions include state-carrying single-device operating instructions and non-state-carrying single-device operating instructions. A distributed operating instruction is an operating instruction with distributed attributes; that is, a distributed operating instruction can run in a distributed manner on multiple machines or devices. Distributed operating instructions include state-carrying distributed operating instructions and non-state-carrying distributed operating instructions, and a distributed operating instruction may be composed of one or more single-device operating instructions. A parameter operating instruction is a kind of state-carrying operating instruction; that is, every parameter operating instruction carries state, and a parameter operating instruction can be either a state-carrying single-device operating instruction or a state-carrying distributed operating instruction.
Specifically, the computational resource allocation system can traverse the topology graph of the multi-machine global distributed neural network and obtain the operating instructions of each machine according to the parameters of the operating instructions in the topology graph.
Step S204: traverse the operating instructions of each machine according to the topology graph to obtain the single-machine neural network corresponding to each machine.
Here, a single-machine neural network is a neural network that can perform inference computation on a single machine. Specifically, the computational resource allocation system converts, according to certain conversion rules, the operating instructions of each machine obtained in step S202 into the single-machine neural network corresponding to each machine.
Step S206: obtain the state of the operating instructions of each machine.
Specifically, the computational resource allocation system can obtain the state of an operating instruction through the state identifier (tag) of the operating instruction of each machine.
As an optional embodiment, after the training of the distributed neural network is completed, the computational resource allocation system can store the states of the state-identifier-carrying (tag) single-device operating instructions on each machine.
As another optional embodiment, during computational resource allocation and inference computation, the computational resource allocation system can perform timed storage of the states of the state-identifier-carrying (tag) single-device operating instructions on each machine according to a preset time period. Optionally, the time period of the timed storage can be set according to allocation or operation requirements.
As another optional embodiment, during computational resource allocation and inference computation, the computational resource allocation system can perform distributed storage of the states of the state-identifier-carrying (tag) single-device operating instructions on each machine according to a preset storage rule. Optionally, the storage rule of the distributed storage can be set according to allocation or operation requirements, for example: multiple single-device operating instructions whose state identifier (tag) is 0 can be stored on the machine whose machine number is 0, and multiple single-device operating instructions whose state identifier (tag) is 1 can be stored on the machine whose machine number is 1.
Step S208: perform training or inference computation in the single-machine neural network corresponding to each machine according to the state of the operating instructions of that machine.
Here, performing inference computation in the single-machine neural network corresponding to each machine means performing inference computation using the trained single-machine neural network corresponding to each machine. Specifically, the computational resource allocation system first loads the pre-stored states of the single-device operating instructions and then, according to those states, performs training or inference computation in the single-machine neural network corresponding to each machine. Optionally, the computational resource allocation system can load the pre-stored states of the single-device operating instructions in a distributed manner.
In the above computational resource allocation method, the operating instructions of each machine are extracted from the topology graph of the distributed neural network, the single-machine neural network corresponding to each machine is obtained according to the operating instructions of each machine, and finally training or inference computation is performed in the single-machine neural network corresponding to each machine according to the state of the operating instructions of that machine. This facilitates the storage and loading of operating instruction states, as well as the model conversion from the multi-machine network to single-machine networks, while the distributed neural network performs computational resource allocation and inference computation, thereby improving the operational efficiency of the distributed deep neural network.
In one embodiment, step S202 specifically includes: traversing the topology graph, and if the machine number of an operating instruction in the topology graph is consistent with the number of the currently running machine, taking the operating instruction as an operating instruction on the currently running machine.
Here, the parameters of an operating instruction include a machine number parameter (node), a device number parameter (device), a state identifier (tag), distributed attributes, an instruction list (placement context), a split operating instruction (sub placement) and a merge operating instruction (contact placement).
The machine number parameter (node) indicates on which machine the operating instruction runs; multiple machines are connected through a network, and one machine may include multiple devices. The device number parameter (device) indicates on which device the operating instruction runs; the multiple devices inside one machine do not need to be connected through a network. The state identifier (tag) indicates the state of an operating instruction, and can be used to distinguish whether the state is shared between multiple operating instructions. The distributed attributes indicate the computation and output positions of an operating instruction; they are composed of node, device and tag, indicating on which machines, and on which device of each machine, the distributed operating instruction runs. The state identifier of a distributed operating instruction describes the state sharing among the multiple single-device operating instructions it runs; for example, if a distributed operating instruction includes 4 single-device operating instructions and its tag is defined as [0, 1, 0, 1], this indicates that the state is shared between the 0th and 2nd single-device operating instructions, and between the 1st and 3rd single-device operating instructions. The instruction list (placement context) assists in describing the distributed neural network; it is composed of node, device and tag, and the operating instructions in the same instruction list are distributed operating instructions. The split operating instruction (sub placement) is a special kind of operating instruction used to split a distributed operating instruction and obtain one of the single-device operating instructions within it. The merge operating instruction (contact placement) is also a special kind of operating instruction, used to merge multiple distributed operating instructions; the merged operating instruction includes all the single-device operating instructions of the multiple distributed operating instructions.
Further, the parameters of an operating instruction also include a series of communication parameters, such as allreduce, allgather and broadcast, which indicate which kind of collective communication operation the operating instruction performs.
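The tag example above ([0, 1, 0, 1]) can be made concrete: grouping the single-device instruction indices by tag recovers which instructions share state. The list-of-tags encoding is the document's own; the function name is illustrative.

```python
# Sketch: given a distributed operating instruction's tag list, return
# the groups of single-device instruction indices that share a state.

from collections import defaultdict

def shared_state_groups(tags):
    groups = defaultdict(list)
    for index, tag in enumerate(tags):
        groups[tag].append(index)
    # only groups with more than one member actually share state
    return {t: idx for t, idx in groups.items() if len(idx) > 1}

print(shared_state_groups([0, 1, 0, 1]))
# {0: [0, 2], 1: [1, 3]}
```

This matches the worked example: the 0th and 2nd instructions share one state, the 1st and 3rd share another.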
Specifically, the computational resource allocation system traverses the topology graph of the multi-machine global distributed neural network and obtains the machine number parameter of each operating instruction in the topology graph, including the machine number parameter of each single-device operating instruction in a distributed operating instruction, and compares the obtained machine number parameter of a single-device operating instruction with the machine number of the current machine. If the machine number parameter of the obtained single-device operating instruction is consistent with the machine number of the current machine, the single-device operating instruction is retained and used as an operating instruction on the currently running machine.
In the above computational resource allocation method, by comparing the machine number of an operating instruction in the distributed neural network with the machine number of the current machine to obtain the operating instructions on the currently running machine, the operating instructions of the single-machine neural network corresponding to each machine can be extracted conveniently, improving the operational efficiency of the distributed neural network.
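The expansion-and-filter step described above can be sketched as follows, assuming (as an illustration) that each distributed operating instruction carries a list of its single-device instructions.

```python
# Sketch: expand each distributed operating instruction into its
# single-device instructions, then retain only those whose machine
# number (node) matches the current machine.

def instructions_for_machine(distributed_ops, current_node):
    kept = []
    for dist_op in distributed_ops:
        for single in dist_op["singles"]:
            if single["node"] == current_node:
                kept.append(single["name"])
    return kept

dist_ops = [{"singles": [
    {"name": "matmul@m0", "node": 0},
    {"name": "matmul@m1", "node": 1},
]}]
print(instructions_for_machine(dist_ops, current_node=1))
# ['matmul@m1']
```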
In one embodiment, as shown in Fig. 4, during the training of the distributed neural network, step S204 specifically includes the following steps:
Step S2042: traverse the operating instructions of each machine according to the topology graph to obtain the machine number parameter and device number parameter of the operating instructions of each machine.
Specifically, according to the topology graph of the multi-machine global distributed neural network, the computational resource allocation system traverses the operating instructions of each machine obtained in step S202 and obtains the machine number parameter and device number parameter of the operating instructions of each machine, including the machine number parameter and device number parameter of each single-device operating instruction in a distributed operating instruction.
Step S2044: judge, according to the machine number parameter and device number parameter, whether a data copy instruction needs to be added between different devices, and whether a network transmission instruction needs to be added between different machines.
Here, a data copy instruction is used for data transmission between different devices of the same machine; a network transmission instruction is used for data transmission between different machines through a network.
Specifically, according to the machine number parameter and device number parameter of the operating instructions of each machine obtained in step S2042, the computational resource allocation system judges whether a data copy instruction needs to be added between different devices of the same machine, and whether a network transmission instruction needs to be added between different machines.
As an alternative embodiment, step S2044 specifically includes the following steps:
Step S20442, if the machine number parameter of the operating instruction is consistent with the input machine number of the operating instruction,
The device number parameter of the operating instruction and the input equipment number of the operating instruction are inconsistent, then increase between different devices
The data copy instruction.
Specifically, computational resource allocation system joins the machine number of the single device operating instruction got in step S2042
Several input machine numbers with the operating instruction compare, if the machine number parameter of the single device operating instruction got and the fortune
The input machine number of row instruction is consistent, then illustrates that the single device operating instruction inputs on same machine with it.Further, it counts
Resource allocation system is calculated by the defeated of the device number parameter of the single device operating instruction got in step S2042 and the operating instruction
Enter device number to compare, if the input equipment number of the device number parameter of the single device operating instruction got and the operating instruction
Unanimously, then illustrate that the single device operating instruction inputs in same equipment of same machine with it, do not need extra process;
If the device number parameter of the single device operating instruction got and the input equipment number of the operating instruction are inconsistent, illustrate the list
Equipment operating instruction inputs on the distinct device of same machine with it, needs to increase between the distinct device of same machine
Data copy is added to instruct, convenient for carrying out data transmission between different devices.
Step S20444: if the machine number parameter of the operating instruction is inconsistent with the input machine number of the operating instruction, add the network transmission instruction between the different machines.
Specifically, the computational resource allocation system compares the machine number parameter of the single-device operating instruction obtained in step S2042 with the input machine number of the operating instruction. If they are inconsistent, the single-device operating instruction and its input reside on different machines, so a network transmission instruction needs to be added between the different machines to facilitate data transmission between them.
Step S2046: construct the forward subgraph and backward subgraph of each single-machine neural network according to the topology structure diagram.
Here, the forward subgraph is the topology structure diagram obtained by traversing the distributed neural network in the forward direction, and the backward subgraph is the topology structure diagram obtained by traversing the distributed neural network in the reverse direction.
Specifically, according to the topology structure diagram of the multi-machine global distributed neural network, the computational resource allocation system can construct the forward subgraph and backward subgraph of the single-machine neural network corresponding to each machine by means of forward traversal and reverse traversal.
As an alternative embodiment, step S2046 specifically includes the following steps:
Step S20462: construct the forward subgraph by forward computation, obtaining the endpoints of each single-machine neural network.
Here, the purpose of forward computation is to compute the influence of input-layer nodes on hidden-layer nodes; that is, the distributed neural network is traversed forward in the order of input layer, hidden layer and output layer, computing the influence of each node in the topology structure diagram on the nodes in its next layer.
Specifically, the computational resource allocation system determines, from the topology structure diagram of the distributed neural network, which operating instructions require forward computation, constructs the forward subgraph corresponding to each single-machine neural network by performing forward computation, and obtains the endpoints (endpoints) of the subgraph corresponding to each single-machine neural network.
Step S20464: construct the backward subgraph by backward computation, and perform a state update on the operating instructions in each single-machine neural network.
Here, the purpose of backward computation is to adjust the weight relationships in the neural network and reduce the deviation between the output result and the actual result. The backward subgraph corresponding to each single-machine neural network is constructed by performing backward computation, and a state update is performed on the single-device operating instructions in each single-machine neural network.
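As a rough illustration of steps S20462 and S20464, the following Python sketch traverses an already topologically sorted instruction list forward to find the endpoints (instructions whose output feeds no other instruction) and backward to update each instruction's state. All names and the string state marker are assumptions made for illustration:

```python
def build_subgraphs(ops, edges):
    """ops: instruction names in topological order; edges: (src, dst) pairs
    meaning dst consumes the output of src."""
    consumed = {src for (src, dst) in edges}
    # Forward pass: an endpoint is an op whose output feeds no other op.
    endpoints = [op for op in ops if op not in consumed]
    # Backward pass: walk the ops in reverse and update each one's state.
    states = {op: "updated" for op in reversed(ops)}
    return endpoints, states
```

With ops ["data", "conv", "loss"] and edges [("data", "conv"), ("conv", "loss")], the only endpoint is "loss", and every instruction's state is marked updated by the backward walk.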
Step S2048: perform gradient computation on the operating instructions with identical states in each single-machine neural network, obtaining the single-machine neural network corresponding to each machine.
Specifically, if multiple parameter operating instructions with identical states exist in a single-machine neural network, gradient computation is performed on these parameter operating instructions during backward computation, and the computed gradient is taken as the new gradient of each parameter operating instruction, finally obtaining the single-machine neural network subgraph corresponding to each machine.
As an alternative embodiment, step S2048 specifically includes: adding up the original gradients of the operating instructions with identical states, obtaining the update gradient of each operating instruction.
Here, the original gradient of an operating instruction is its gradient in the multi-machine global distributed neural network; the update gradient of an operating instruction is its gradient in the single-machine neural network.
Specifically, during backward computation, the computational resource allocation system performs gradient computation on the parameter operating instructions with identical states through the communication primitive allreduce, and takes the computed gradient as the update gradient of each parameter operating instruction.
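The gradient merging described here can be illustrated with a minimal sketch: the original gradients of the same-state parameter instructions are summed element-wise, in the spirit of an allreduce, and the sum becomes their shared update gradient. Representing gradients as plain Python lists is an assumption of this sketch:

```python
def accumulate_gradients(original_grads):
    """Element-wise sum of the original gradients of same-state parameter
    instructions; the sum is the update gradient shared by all of them."""
    update = [0.0] * len(original_grads[0])
    for grad in original_grads:
        update = [u + g for u, g in zip(update, grad)]
    return update
```

For two same-state instructions with original gradients [1.0, 2.0] and [3.0, 4.0], the shared update gradient is [4.0, 6.0].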
In the above computational resource allocation method, the multi-machine global distributed neural network is converted, according to a predefined transformation rule, into the single-machine neural network corresponding to each machine, which conveniently realizes the model conversion from the global network to the single-machine network during neural network training.
In one of the embodiments, as shown in Fig. 5, during inference computation on the distributed neural network, step S204 specifically includes the following steps:
Step S2042a: traverse the operating instructions of each machine according to the topology structure diagram, obtaining the machine number parameter and device number parameter of the operating instructions of each machine.
Please refer to step S2042.
Step S2044a: according to the machine number parameter and device number parameter, judge whether a data copy instruction needs to be added between different devices, and whether a network transmission instruction needs to be added between different machines.
Please refer to step S2044.
Step S2046a: construct the forward subgraph of each single-machine neural network according to the topology structure diagram, obtaining the single-machine neural network corresponding to each machine.
Please refer to step S2046. Unlike step S2046, during inference computation on the distributed neural network there is no need to construct the backward subgraph by backward computation, and therefore no gradient computation needs to be performed on the operating instructions.
In the above computational resource allocation method, the multi-machine global distributed neural network is converted, according to a predefined transformation rule, into the single-machine inference neural network corresponding to each machine, which conveniently realizes the model conversion from the global network to the single-machine network when inference computation is performed on a trained neural network.
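The difference between the training conversion and the inference conversion can be illustrated with a toy filter that keeps only the forward operating instructions. The `_grad` naming convention for backward instructions is an assumption of this sketch, not part of the patent:

```python
def to_inference_graph(ops):
    """Keep only forward operating instructions; backward (gradient)
    instructions are dropped, so no gradient computation is needed."""
    return [op for op in ops if not op.endswith("_grad")]
```

Applied to ["data", "conv", "conv_grad", "fc"], the sketch keeps the forward instructions and discards "conv_grad".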
In one of the embodiments, as shown in Fig. 6, another computational resource allocation method is provided. The method can run in the computational resource allocation system shown in Fig. 1, and includes: before computational resource allocation and inference computation are performed, constructing the topology structure diagram of the distributed neural network, which specifically includes the following steps:
Step S302: according to the state of the operating instructions of each machine, distribute the operating instructions of each machine into instruction lists.
Specifically, the computational resource allocation system can obtain the state of the operating instructions of each machine through the status identifier (tag) of the operating instructions, and distribute multiple operating instructions with identical states into the same instruction list.
Step S304: compute the operating instructions in the instruction list in the distributed neural network, and update the state of the operating instructions in real time.
As an alternative embodiment, step S304 specifically includes the following steps:
Step S3022: compute the operating instructions in the instruction list in the distributed neural network, obtaining a first computation result.
Here, a first computation result is the result obtained by computing the operating instructions in one instruction list in the distributed neural network; the first computation results correspond one-to-one with the instruction lists.
Step S3024: merge the first computation results corresponding to the multiple instruction lists, obtaining a distributed computation result.
Specifically, the computational resource allocation system merges the multiple first computation results of step S3022 through a sequence of operations involving communication primitives such as allreduce, allgather and broadcast, obtaining one distributed computation result.
For example, suppose the distributed neural network contains two machines, node0 and node1, each machine contains two devices, device0 and device1, and two instruction lists, pcon1 and pcon2, are preset. In the topology structure diagram shown in Fig. 3, two operating instructions, data and conv, are defined as distributed operating instructions; the states of data and conv are identical, and data and conv each comprise four single-device operating instructions. data and conv are distributed into instruction list pcon1 and distributed computation is executed, obtaining the first computation result y1 corresponding to pcon1; the single-device operating instructions of the fully connected layer in the neural network are distributed into instruction list pcon2 and distributed computation is executed, obtaining the first computation result y2 corresponding to pcon2. y1 and y2 are merged through a sequence of operations involving communication primitives such as allreduce, allgather and broadcast, finally obtaining one distributed computation result y, where data_0_0 denotes the single-device operating instruction of data on node0, device0.
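The merging of the per-list first computation results into one distributed result (steps S3022 and S3024) can be sketched as follows. Treating allgather as concatenation and allreduce as an element-wise sum is a deliberate simplification of the actual communication primitives, and the list representation of results is an assumption:

```python
def merge_first_results(first_results, mode="allgather"):
    """Merge one first computation result per instruction list (e.g. y1
    from pcon1 and y2 from pcon2) into a single distributed result y."""
    if mode == "allreduce":
        # Element-wise reduction across the per-list results.
        return [sum(column) for column in zip(*first_results)]
    # allgather-style merge: concatenate the per-list results.
    return [x for result in first_results for x in result]
```

With y1 = [1, 2] and y2 = [3, 4], the allgather-style merge yields [1, 2, 3, 4], while the allreduce-style merge yields [4, 6].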
Step S306: after the computation is completed, store the state of each operating instruction in the instruction list.
Specifically, after the distributed neural network finishes computing, the computational resource allocation system can store the state of each operating instruction in each instruction list.
In the above computational resource allocation method, operating instructions are distributed into different instruction lists according to their states, and the operating instructions in each instruction list are computed in the distributed neural network, so the distributed neural network can be constructed conveniently and, at the same time, the states of the operating instructions can be stored conveniently.
It should be understood that although the steps in the flowcharts of Figs. 2-6 are shown sequentially as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless expressly stated otherwise herein, there is no strict ordering restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in Figs. 2-6 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments; the execution order of these sub-steps or stages is not necessarily sequential, and they may be executed in turn or alternately with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in Fig. 7, a computational resource allocation device is provided, comprising: an operating instruction acquisition module 401, a single-machine neural network acquisition module 402, an operating instruction state acquisition module 403 and a computing module 404, wherein:
the operating instruction acquisition module 401 is configured to traverse the topology structure diagram of the distributed neural network and obtain the operating instructions of each machine;
the single-machine neural network acquisition module 402 is configured to traverse the operating instructions of each machine according to the topology structure diagram and obtain the single-machine neural network corresponding to each machine;
the operating instruction state acquisition module 403 is configured to obtain the state of the operating instructions of each machine;
the computing module 404 is configured to perform training or inference computation in the single-machine neural network corresponding to each machine according to the state of the operating instructions of each machine.
In one of the embodiments, the operating instruction acquisition module 401 is specifically configured to traverse the topology structure diagram and, if the machine number of an operating instruction in the topology structure diagram is consistent with the currently running machine number, take that operating instruction as an operating instruction on the currently running machine.
In one of the embodiments, the single-machine neural network acquisition module 402 is specifically configured to: traverse the operating instructions of each machine according to the topology structure diagram, obtaining the machine number parameter and device number parameter of the operating instructions of each machine; judge, according to the machine number parameter and device number parameter, whether a data copy instruction needs to be added between different devices and whether a network transmission instruction needs to be added between different machines; construct the forward subgraph and backward subgraph of each single-machine neural network according to the topology structure diagram; and perform gradient computation on the operating instructions with identical states in each single-machine neural network, obtaining the single-machine neural network corresponding to each machine.
In one of the embodiments, the single-machine neural network acquisition module 402 is specifically configured to: traverse the operating instructions of each machine according to the topology structure diagram, obtaining the machine number parameter and device number parameter of the operating instructions of each machine; judge, according to the machine number parameter and device number parameter, whether a data copy instruction needs to be added between different devices and whether a network transmission instruction needs to be added between different machines; and construct the forward subgraph of each single-machine neural network according to the topology structure diagram, obtaining the single-machine neural network corresponding to each machine.
In one of the embodiments, the single-machine neural network acquisition module 402 is specifically configured to: if the machine number parameter of an operating instruction is consistent with the input machine number of the operating instruction while the device number parameter of the operating instruction is inconsistent with the input device number of the operating instruction, add a data copy instruction between the different devices; and if the machine number parameter of the operating instruction is inconsistent with the input machine number of the operating instruction, add a network transmission instruction between the different machines.
In one of the embodiments, the single-machine neural network acquisition module 402 is specifically configured to construct the forward subgraph by forward computation, obtaining the endpoints of each single-machine neural network, and to construct the backward subgraph by backward computation, performing a state update on the operating instructions in each single-machine neural network.
In one of the embodiments, the single-machine neural network acquisition module 402 is specifically configured to add up the original gradients of the operating instructions with identical states, obtaining the update gradient of each operating instruction.
In one of the embodiments, the device further includes a distributed neural network construction module 405 configured to construct the topology structure diagram of the distributed neural network.
In one of the embodiments, the distributed neural network construction module 405 is specifically configured to: distribute the operating instructions of each machine into instruction lists according to the state of the operating instructions of each machine; compute the operating instructions in the instruction lists in the distributed neural network and update the state of the operating instructions in real time; and, after the computation is completed, store the state of each operating instruction in the instruction lists.
In one of the embodiments, the distributed neural network construction module 405 is specifically configured to distribute multiple operating instructions with identical states into the same instruction list.
In one of the embodiments, the distributed neural network construction module 405 is specifically configured to compute the operating instructions in the instruction lists in the distributed neural network, obtaining first computation results, and to merge the first computation results corresponding to the multiple instruction lists, obtaining a distributed computation result.
In one of the embodiments, the device further includes a storage module 406 configured to perform scheduled storage of the states of the operating instructions during computational resource allocation and inference computation, or to perform distributed storage of the states of the operating instructions during computational resource allocation and inference computation.
For specific limitations on the computational resource allocation device, refer to the limitations on the computational resource allocation method above, which are not repeated here. Each module in the above computational resource allocation device may be implemented wholly or partly by software, hardware or a combination thereof. The above modules may be embedded in hardware form in, or be independent of, a processor in a computer device, or be stored in software form in a memory in a computer device, so that the processor can invoke and execute the operations corresponding to the above modules.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure diagram may be as shown in Fig. 8. The computer device includes a processor, a memory, a network interface and a database connected through a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The database of the computer device is configured to store computational resource allocation data. The network interface of the computer device is configured to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements a computational resource allocation method.
Those skilled in the art will understand that the structure shown in Fig. 8 is merely a block diagram of the part of the structure relevant to the solution of the present application and does not constitute a limitation on the computer device to which the solution of the present application is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different component arrangement.
In one embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program, and the processor implementing the following steps when executing the computer program:
traversing the topology structure diagram of a distributed neural network, obtaining the operating instructions of each machine;
traversing the operating instructions of each machine according to the topology structure diagram, obtaining the single-machine neural network corresponding to each machine;
obtaining the state of the operating instructions of each machine;
performing training or inference computation in the single-machine neural network corresponding to each machine according to the state of the operating instructions of each machine.
The processor, when executing the computer program, may also implement the steps of the computational resource allocation method in any of the above embodiments.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, the computer program implementing the following steps when executed by a processor:
traversing the topology structure diagram of a distributed neural network, obtaining the operating instructions of each machine;
traversing the operating instructions of each machine according to the topology structure diagram, obtaining the single-machine neural network corresponding to each machine;
obtaining the state of the operating instructions of each machine;
performing training or inference computation in the single-machine neural network corresponding to each machine according to the state of the operating instructions of each machine.
The computer program, when executed by the processor, may also implement the steps of the computational resource allocation method in any of the above embodiments.
Those of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments may be completed by instructing relevant hardware through a computer program. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, database or other media used in the embodiments provided in the present application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM), etc.
The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as no contradiction exists in a combination of these technical features, the combination shall be considered to be within the scope described in this specification.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they shall not therefore be construed as limiting the scope of the patent. It should be pointed out that, for those of ordinary skill in the art, various modifications and improvements can be made without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of the present application patent shall be subject to the appended claims.
Claims (17)
1. A computational resource allocation method, characterized in that the method comprises:
traversing the topology structure diagram of a distributed neural network, obtaining the operating instructions of each machine;
traversing the operating instructions of each machine according to the topology structure diagram, obtaining the single-machine neural network corresponding to each machine;
obtaining the state of the operating instructions of each machine;
performing training or inference computation in the single-machine neural network corresponding to each machine according to the state of the operating instructions of each machine.
2. The method according to claim 1, characterized in that traversing the topology structure diagram of the distributed neural network and obtaining the operating instructions of each machine comprises:
traversing the topology structure diagram, and if the machine number of an operating instruction in the topology structure diagram is consistent with the currently running machine number, taking the operating instruction as an operating instruction on the currently running machine.
3. The method according to claim 1, characterized in that traversing the operating instructions of each machine according to the topology structure diagram and obtaining the single-machine neural network corresponding to each machine comprises:
traversing the operating instructions of each machine according to the topology structure diagram, obtaining the machine number parameter and device number parameter of the operating instructions of each machine;
judging, according to the machine number parameter and device number parameter, whether a data copy instruction needs to be added between different devices and whether a network transmission instruction needs to be added between different machines;
constructing the forward subgraph and backward subgraph of each single-machine neural network according to the topology structure diagram;
performing gradient computation on the operating instructions with identical states in each single-machine neural network, obtaining the single-machine neural network corresponding to each machine.
4. The method according to claim 1, characterized in that traversing the operating instructions of each machine according to the topology structure diagram and obtaining the single-machine neural network corresponding to each machine further comprises:
traversing the operating instructions of each machine according to the topology structure diagram, obtaining the machine number parameter and device number parameter of the operating instructions of each machine;
judging, according to the machine number parameter and device number parameter, whether a data copy instruction needs to be added between different devices and whether a network transmission instruction needs to be added between different machines;
constructing the forward subgraph of each single-machine neural network according to the topology structure diagram, obtaining the single-machine neural network corresponding to each machine.
5. The method according to claim 3 or 4, characterized in that judging, according to the machine number parameter and device number parameter, whether a data copy instruction needs to be added between different devices and whether a network transmission instruction needs to be added between different machines comprises:
if the machine number parameter of the operating instruction is consistent with the input machine number of the operating instruction, while the device number parameter of the operating instruction is inconsistent with the input device number of the operating instruction, adding the data copy instruction between the different devices;
if the machine number parameter of the operating instruction is inconsistent with the input machine number of the operating instruction, adding the network transmission instruction between the different machines.
6. The method according to claim 3, characterized in that constructing the forward subgraph and backward subgraph of each single-machine neural network according to the topology structure diagram comprises:
constructing the forward subgraph by forward computation, obtaining the endpoints of each single-machine neural network;
constructing the backward subgraph by backward computation, performing a state update on the operating instructions in each single-machine neural network.
7. The method according to claim 3, characterized in that performing gradient computation on the operating instructions with identical states in each single-machine neural network comprises:
adding up the original gradients of the operating instructions with identical states, obtaining the update gradient of each operating instruction.
8. The method according to claim 1, characterized in that the method further comprises constructing the topology structure diagram of the distributed neural network.
9. The method according to claim 8, characterized in that constructing the topology structure diagram of the distributed neural network comprises:
distributing the operating instructions of each machine into instruction lists according to the state of the operating instructions of each machine;
computing the operating instructions in the instruction lists in the distributed neural network, and updating the state of the operating instructions in real time;
after the computation is completed, storing the state of each operating instruction in the instruction lists.
10. The method according to claim 9, characterized in that distributing the operating instructions of each machine into instruction lists according to the state of the operating instructions of each machine comprises:
distributing multiple operating instructions with identical states into the same instruction list.
11. The method according to claim 9, characterized in that computing the operating instructions in the instruction lists in the distributed neural network comprises:
computing the operating instructions in the instruction lists in the distributed neural network, obtaining first computation results;
merging the first computation results corresponding to the multiple instruction lists, obtaining a distributed computation result.
12. The method according to claim 1, characterized in that the method further comprises:
performing scheduled storage of the states of the operating instructions during computational resource allocation and inference computation; or
performing distributed storage of the states of the operating instructions during computational resource allocation and inference computation.
13. The method according to claim 1, characterized in that the operating instructions include single-device operating instructions, distributed operating instructions and parameter operating instructions.
14. The method according to claim 1, characterized in that the parameters of the operating instructions further include a distributed attribute, a split operating instruction and a merge operating instruction.
15. A computational resource allocation device, characterized in that the device comprises:
an operating instruction acquisition module, configured to traverse the topology structure diagram of a distributed neural network and obtain the operating instructions of each machine;
a single-machine neural network acquisition module, configured to traverse the operating instructions of each machine according to the topology structure diagram and obtain the single-machine neural network corresponding to each machine;
an operating instruction state acquisition module, configured to obtain the state of the operating instructions of each machine;
a computing module, configured to perform training or inference computation in the single-machine neural network corresponding to each machine according to the state of the operating instructions of each machine.
16. A computer device, including a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method according to any one of claims 1 to 14 when executing the computer program.
17. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program implements the steps of the method according to any one of claims 1 to 14 when executed by a processor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910285304.XA CN110008028B (en) | 2019-04-10 | 2019-04-10 | Computing resource allocation method and device, computer equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110008028A (en) | 2019-07-12 |
CN110008028B CN110008028B (en) | 2021-08-06 |
Family
ID=67170821
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910285304.XA Active CN110008028B (en) | 2019-04-10 | 2019-04-10 | Computing resource allocation method and device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110008028B (en) |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106953862A (en) * | 2017-03-23 | 2017-07-14 | State Grid Corporation of China | Method and apparatus for network security situation awareness, and method and apparatus for sensor model training |
CN107018184A (en) * | 2017-03-28 | 2017-08-04 | Huazhong University of Science and Technology | Group synchronization optimization method and system for distributed deep neural network clusters |
CN107783840A (en) * | 2017-10-27 | 2018-03-09 | Fuzhou Rockchip Electronics Co., Ltd. | Distributed hierarchical deep learning resource allocation method and apparatus |
CN108292374A (en) * | 2015-11-09 | 2018-07-17 | Google LLC | Training neural networks represented as computational graphs |
CN108304924A (en) * | 2017-12-21 | 2018-07-20 | Inner Mongolia University of Technology | Pipelined pre-training method for deep belief networks |
US20180300600A1 (en) * | 2017-04-17 | 2018-10-18 | Intel Corporation | Convolutional neural network optimization mechanism |
US20180322606A1 (en) * | 2017-05-05 | 2018-11-08 | Intel Corporation | Data parallelism and halo exchange for distributed machine learning |
CN108805798A (en) * | 2017-05-05 | 2018-11-13 | Intel Corporation | Fine-grained compute-communication execution for deep learning frameworks |
CN109032671A (en) * | 2018-06-25 | 2018-12-18 | University of Electronic Science and Technology of China | Distributed deep learning method and system based on a data-parallel strategy |
CN109145984A (en) * | 2018-08-20 | 2019-01-04 | Lenovo (Beijing) Co., Ltd. | Method and apparatus for machine training |
CN109254842A (en) * | 2017-07-12 | 2019-01-22 | Tencent Technology (Shenzhen) Co., Ltd. | Resource management method and apparatus for a distributed system, and readable storage medium |
CN109325541A (en) * | 2018-09-30 | 2019-02-12 | Beijing ByteDance Network Technology Co., Ltd. | Method and apparatus for training a model |
Non-Patent Citations (3)
Title |
---|
_SUNNY_: "Distributed Deep Learning (I): An Overview of Distributed Training of Neural Network Models", https://blog.csdn.net/s_sunnyy/article/details/79896647 *
JIACHEN MAO et al.: "AdaLearner: An Adaptive Distributed Mobile Learning", IEEE *
ZHANG Haijun: "Research on Cloud-Computing-Based Parallel Implementation of Neural Networks and Their Learning Methods", China Doctoral Dissertations Full-text Database, Information Science and Technology Series *
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110689114A (en) * | 2019-09-24 | 2020-01-14 | Oppo广东移动通信有限公司 | Network node processing method and device, storage medium and electronic equipment |
WO2021057811A1 (en) * | 2019-09-24 | 2021-04-01 | Oppo广东移动通信有限公司 | Network node processing method, device, storage medium, and electronic apparatus |
CN110689114B (en) * | 2019-09-24 | 2023-07-18 | Oppo广东移动通信有限公司 | Network node processing method and device, storage medium and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN110008028B (en) | 2021-08-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108122032A (en) | Neural network model training method, apparatus, chip, and system | |
CN110046702A (en) | Neural network computation accelerator and execution method thereof | |
CN103997544B (en) | Resource downloading method and apparatus | |
TWI731373B (en) | Chip, data processing method and computing equipment based on it | |
CN114008594A (en) | Scheduling operations on a computational graph | |
CN113592066B (en) | Hardware acceleration method, device, equipment and storage medium | |
CN108734288A (en) | Operation method and apparatus | |
CN111324630B (en) | MPI-based neural network architecture search parallelization method and equipment | |
CN108304926B (en) | Pooling computing device and method suitable for neural network | |
CN111459621B (en) | Cloud simulation integration and scheduling method and device, computer equipment and storage medium | |
CN110263059A (en) | Spark-Streaming intermediate data partition method, device, computer equipment and storage medium | |
CN111563584B (en) | Splitting method of neural network model and related product | |
CN115249315A (en) | Heterogeneous computing device-oriented deep learning image classification method and device | |
CN111563587B (en) | Splitting method of neural network model and related product | |
CN108985459A (en) | Method and apparatus for training a model | |
CN110008028A (en) | Computational resource allocation method, apparatus, computer equipment and storage medium | |
CN104866297B (en) | Method and apparatus for optimizing a kernel function | |
CN109309858B (en) | Display method, apparatus, device, and medium for mutually exclusive icons | |
CN105022896A (en) | Method and device for APDL modelling based on dynamic numbering | |
JP6795240B1 (en) | Controls, methods and programs | |
CN111563586B (en) | Splitting method of neural network model and related product | |
CN111563585B (en) | Splitting method of neural network model and related product | |
CN109407922B (en) | Icon display control method, apparatus, device, and medium | |
CN102866924B (en) | Method and device for scheduling content integration engine | |
CN111325339A (en) | Method for executing learning task by artificial intelligence processor and related product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |