WO2022083527A1 - Method for determining logical core arrangement, model training method, electronic device and medium - Google Patents

Method for determining logical core arrangement, model training method, electronic device and medium Download PDF

Info

Publication number
WO2022083527A1
WO2022083527A1 (application PCT/CN2021/124311, CN2021124311W)
Authority
WO
WIPO (PCT)
Prior art keywords
layout
time step
current time
cores
neural network
Prior art date
Application number
PCT/CN2021/124311
Other languages
French (fr)
Chinese (zh)
Inventor
邓磊 (Deng Lei)
李涵 (Li Han)
施路平 (Shi Luping)
祝夭龙 (Zhu Yaolong)
Original Assignee
北京灵汐科技有限公司 (Beijing Lynxi Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京灵汐科技有限公司 (Beijing Lynxi Technology Co., Ltd.)
Publication of WO2022083527A1 publication Critical patent/WO2022083527A1/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 15/00 - Digital computers in general; Data processing equipment in general
    • G06F 15/76 - Architectures of general purpose stored program computers
    • G06F 15/78 - Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F 15/7803 - System on board, i.e. computer system on one or more PCB, e.g. motherboards, daughterboards or blades
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/06 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods

Definitions

  • The embodiments of the present disclosure relate to the field of computer technologies, and in particular to methods for determining the layout of a logical core, a training method for a layout model, an electronic device, and a computer-readable medium.
  • Many-core architecture is a parallel processing architecture widely used to execute neural network models. As shown in Figure 1, in a many-core architecture, each physical core (CORE) can complete a certain computing function; a number of physical cores are connected through a certain topology to form a chip (CHIP), chips are connected through a certain topology to form a chip array board, and so on, allowing expansion to larger-scale systems.
  • The neural network model is deployed to the many-core architecture through the following steps: (1) split and map the neural network model into a logical core computation graph, which is composed of multiple logical cores connected by a certain topology; (2) lay out the logical cores onto physical cores.
  • In some related technologies, the effect of deploying the neural network model to the many-core architecture is not satisfactory.
  • Embodiments of the present disclosure provide a method for determining the layout of a logical core, a method for training a layout model, a method for determining the layout of a logical core, an electronic device, and a computer-readable medium.
  • An embodiment of the present disclosure provides a method for determining a logical core layout, for laying out multiple logical cores with a determined topology onto multiple physical cores with a determined topology, the method comprising: updating the parameters of a first neural network based on a reinforcement learning method to obtain a target layout according to the first neural network, where the target layout includes the mapping relationship between the logical cores and the physical cores. The first neural network is configured to generate a layout action according to the layout state data of the current time step; the layout state data represents the topology structure of the plurality of physical cores with the determined topology and the mapping relationships between the logical cores and physical cores that have already been laid out, and the layout action represents the mapping relationship between at least one logical core to be laid out and a physical core.
  • In some embodiments, before the step of updating the parameters of the first neural network based on the reinforcement learning method, the method further includes: determining a data representation structure, where the data representation structure represents the mapping relationship between logical cores and physical cores and the topology structure of the multiple physical cores, and the layout state data of the current time step conforms to the data representation structure.
  • In some embodiments, the step of updating the parameters of the first neural network based on the reinforcement learning method includes: generating, by the first neural network, the layout action of the current time step according to the layout state data of the current time step; updating the parameters of the first neural network according to the gain parameter of the current time step to increase the expected value of the gain parameter of the current time step, where the gain parameter includes at least the actual gain of the layout state of the current time step; and judging whether the learning termination condition is satisfied: if so, the learning ends; otherwise, the process returns to the step of generating the layout action of the current time step through the first neural network.
  • In some embodiments, the step of updating the parameters of the first neural network according to the gain parameter of the current time step includes: determining, through a second neural network, the overall gain of the current time step according to the layout state data and layout action of the current time step; updating the parameters of the second neural network according to the overall gain of the current time step, so that the overall gain of the current time step approaches the expected value of the cumulative gain of the current time step, where the cumulative gain of the current time step is determined by the actual gain of the current time step and the actual gains of all subsequent time steps; and updating the parameters of the first neural network according to the overall gain of the current time step to increase the expected value of the overall gain of the current time step.
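  • The update scheme above follows an actor-critic pattern: the first neural network acts as the policy and the second as the value estimator. The following is only a minimal numpy sketch of that pattern, with hypothetical linear stand-ins for both networks and arbitrary sizes; the patent does not fix the architectures.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_actions = 8, 4                    # hypothetical sizes
W_pi = rng.normal(size=(n_actions, n_features)) * 0.1   # "first" network (policy)
w_v = rng.normal(size=n_features) * 0.1                 # "second" network (value)

def policy(state):
    """Action probabilities from the policy network (softmax over logits)."""
    logits = W_pi @ state
    e = np.exp(logits - logits.max())
    return e / e.sum()

def value(state):
    """Critic's estimate of the overall gain for this layout state."""
    return float(w_v @ state)

def update(state, action, cumulative_gain, lr=0.01):
    """One update: move the critic toward the cumulative gain, then push
    the policy toward actions the critic rates above its estimate."""
    global W_pi, w_v
    advantage = cumulative_gain - value(state)
    w_v = w_v + lr * advantage * state           # critic regression step
    probs = policy(state)
    grad_log = -np.outer(probs, state)           # grad of log softmax prob
    grad_log[action] += state
    W_pi = W_pi + lr * advantage * grad_log      # policy-gradient step
    return advantage

state = rng.normal(size=n_features)
before = value(state)
update(state, action=2, cumulative_gain=1.0)
after = value(state)
# the critic's estimate moves toward the observed cumulative gain
assert abs(1.0 - after) < abs(1.0 - before)
```

  In a real implementation both networks would be the convolutional, recurrent, or graph networks the disclosure mentions, and the state would be the matrix-form layout state data.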
  • In some embodiments, the method further comprises: determining the actual gain of the current time step according to the layout state data of the current time step and the layout action of the current time step.
  • In some embodiments, the step of determining the actual gain of the current time step according to the layout state data and layout action of the current time step includes: in the case that there is still a logical core not laid out on a physical core at the current time step, determining a predetermined gain value as the actual gain of the current time step; and in the case that there is no logical core that is not laid out on a physical core at the current time step, determining the actual gain of the current time step according to the running performance of the layout state of the current time step.
  • The operational performance includes at least one of latency, throughput, and power consumption.
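  • The two-branch gain rule above can be sketched as follows. This is only an illustration: `measure_latency` and the zero predetermined value are hypothetical placeholders, since the disclosure fixes neither.

```python
# Per-step gain: a fixed predetermined value while unplaced logical cores
# remain, and a performance-based gain once every logical core is placed.

def step_gain(unplaced_cores, layout, predetermined=0.0,
              measure_latency=lambda layout: 1.0):
    if unplaced_cores:                 # placement still in progress
        return predetermined
    # All logical cores placed: gain from running performance.
    # Lower latency gives higher gain; throughput or power
    # consumption could be used instead or in combination.
    return -measure_latency(layout)

# mid-episode: core "lc3" still unplaced, so the predetermined value is used
assert step_gain(["lc3"], {}) == 0.0
# episode end: gain derived from a (hypothetical) latency measurement
assert step_gain([], {"lc0": (0, 0)}, measure_latency=lambda l: 2.5) == -2.5
```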
  • In some embodiments, the cumulative gain of the current time step is equal to the sum of the actual gain of the current time step and the actual gain of each subsequent time step weighted by that step's discount coefficient, where the discount coefficient represents the magnitude of the impact of the subsequent time step's layout action on the overall gain of the current time step.
  • The discount coefficients of successive subsequent time steps decrease step by step.
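  • A common concrete choice for such step-wise decreasing discount coefficients is powers of a factor gamma in (0, 1); the disclosure does not mandate this form, so the following is only an illustrative sketch.

```python
# Cumulative gain of the current time step: the current actual gain plus
# each subsequent actual gain weighted by a decreasing discount coefficient.

def cumulative_gain(gains, gamma=0.9):
    """gains[0] is the current step's actual gain; gamma**k is the
    (decreasing) discount coefficient of the k-th subsequent step."""
    return sum((gamma ** k) * g for k, g in enumerate(gains))

# current gain 1.0 and two subsequent gains of 1.0, discounted by 0.5:
# 1.0 + 0.5 + 0.25 = 1.75
assert abs(cumulative_gain([1.0, 1.0, 1.0], gamma=0.5) - 1.75) < 1e-9
```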
  • In some embodiments, the step of judging whether the learning termination condition is satisfied includes: judging whether there is any logical core not yet allocated to a physical core at the current time step; in the case that there is no logical core not allocated to a physical core at the current time step, judging whether the iteration termination condition is satisfied; and in the case that the iteration termination condition is satisfied, determining that the learning termination condition is satisfied.
  • In some embodiments, the iteration termination condition includes at least one of the following: the number of iterations of the current time step reaches a predetermined number of iterations; the overall gain of the current time step reaches a predetermined gain value; the parameters of the first neural network and the parameters of the second neural network both converge.
  • In some embodiments, the method further includes: generating the layout state data of the next time step according to the layout state data and layout action of the current time step; in the case that the running performance of the layout represented by the layout state data of the next time step is superior to the running performance of the optimal layout of the current time step, determining the layout represented by the layout state data of the next time step as the optimal layout of the next time step; and in the case that the running performance of the layout represented by the layout state data of the next time step is inferior to the running performance of the optimal layout of the current time step, taking the optimal layout of the current time step as the optimal layout of the next time step.
  • In some embodiments, the method further includes: resetting the layout state data of the next time step to the initial layout state data.
  • the step of obtaining the target layout includes: taking the optimal layout in the next time step as the target layout.
  • In some embodiments, the step of obtaining the target layout includes: among the layouts represented by the layout state data of the at least one time step, before the current time step, at which there was no logical core not placed on a physical core, determining the layout with the best running performance; and determining the layout with the best running performance as the target layout.
  • the first neural network is any one of a convolutional neural network, a recurrent neural network, and a graph neural network.
  • the method further includes: determining, according to a predetermined algorithm, identification information of a plurality of logical cores having a determined topology.
  • the step of determining the identification information of the plurality of logical cores having the determined topology includes: determining the identification information of the plurality of logical cores according to signal flow directions in the plurality of logical cores.
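  • One plausible reading of numbering the logical cores "according to signal flow directions" is a topological ordering of the logical-core computation graph, so that upstream cores receive smaller identifiers. The disclosure does not prescribe the algorithm; the sketch below uses Kahn's algorithm as one possibility, with a hypothetical edge-list input.

```python
from collections import deque

def assign_ids(edges, n_cores):
    """Assign IDs to logical cores in topological order of signal flow.
    edges: list of (src, dst) signal-flow edges between logical cores."""
    indeg = [0] * n_cores
    succ = [[] for _ in range(n_cores)]
    for s, d in edges:
        succ[s].append(d)
        indeg[d] += 1
    queue = deque(i for i in range(n_cores) if indeg[i] == 0)
    ids, next_id = {}, 0
    while queue:
        c = queue.popleft()
        ids[c] = next_id            # upstream cores get smaller IDs
        next_id += 1
        for d in succ[c]:
            indeg[d] -= 1
            if indeg[d] == 0:
                queue.append(d)
    return ids

# a small diamond-shaped computation graph: 0 feeds 1 and 2, which feed 3
ids = assign_ids([(0, 1), (0, 2), (1, 3), (2, 3)], 4)
assert ids[0] == 0 and ids[3] == 3   # source first, sink last
```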
  • the number of logical cores placed to physical cores in each placement action is a fixed value.
  • An embodiment of the present disclosure provides a training method for a layout model, where the layout model is used to lay out a plurality of logical cores with a determined topology onto a plurality of physical cores with a determined topology, the training method including: determining a plurality of samples, each of which includes information of a plurality of logical cores having a determined topology; performing any one of the above-mentioned methods for determining the layout of logical cores on the samples; and using the obtained first neural network as the layout model.
  • An embodiment of the present disclosure provides a method for determining a logical core layout, for laying out a plurality of logical cores with a determined topology onto a plurality of physical cores with a determined topology, the method comprising: inputting the identification information of the multiple logical cores into a layout model to obtain the target layout of the multiple logical cores with the determined topology, where the layout model is obtained according to the training method of the layout model described in the second aspect of the embodiments of the present disclosure.
  • Embodiments of the present disclosure provide an electronic device, which includes: one or more processors; and a storage device on which one or more programs are stored, the one or more programs, when executed by the one or more processors, causing the one or more processors to implement at least one of the following methods: the method for determining a logical core layout according to the first aspect of the embodiments of the present disclosure; the training method of the layout model according to the second aspect of the embodiments of the present disclosure; and the method for determining the layout of the logical core according to the third aspect of the embodiments of the present disclosure.
  • An embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored; when the program is executed by a processor, it implements at least one of the following methods: the method for determining the layout of the logical core according to the first aspect of the embodiments of the present disclosure; the training method for the layout model according to the second aspect of the embodiments of the present disclosure; and the method for determining the layout of the logical core according to the third aspect of the embodiments of the present disclosure.
  • In the embodiments of the present disclosure, the first neural network processes layout state data that represents the topology structure of the physical cores and the mapping relationship between logical cores and physical cores, and generates layout actions; the parameters of the first neural network are updated according to the gain determined by the running performance, so that a logical core layout whose running performance meets preset requirements can finally be obtained.
  • Optimizing the logical core layout based on reinforcement learning effectively reduces the search space of the logical core layout, which is conducive to finding a better layout of the logical cores, and can effectively alleviate the on-chip and inter-chip data transmission delay and the non-uniformity of data transmission delay when a many-core system runs a neural network model, thereby effectively ensuring the running performance of the neural network model after layout.
  • Figure 1 is a schematic diagram of a many-core architecture.
  • FIG. 2 is a flowchart of some steps in a method for determining the layout of a logic core provided by an embodiment of the present disclosure.
  • FIG. 3 is a schematic flowchart of optimizing the layout of logic cores based on reinforcement learning in an embodiment of the present disclosure.
  • FIG. 4 is a flowchart of some steps in another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
  • FIG. 5 is a schematic diagram of a model for optimizing logic core layout based on reinforcement learning in an embodiment of the present disclosure.
  • FIG. 6 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
  • FIG. 7 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
  • FIG. 8 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
  • FIG. 9 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
  • FIG. 10 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
  • FIG. 11 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
  • FIG. 12 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
  • FIG. 13 is a flowchart of a training method for a layout model provided by an embodiment of the present disclosure.
  • FIG. 14 is a flowchart of a method for determining the layout of a logic core provided by an embodiment of the present disclosure.
  • FIG. 15 is a block diagram of the composition of an electronic device according to an embodiment of the present disclosure.
  • FIG. 16 is a block diagram of the composition of a computer-readable medium provided by an embodiment of the present disclosure.
  • The following provides a detailed description, with reference to the accompanying drawings, of the method for determining the layout of the logical core, the training method for the layout model, the electronic device, and the computer-readable medium provided by the present disclosure.
  • The inventors of the present disclosure have found that, when a neural network model is deployed into a many-core architecture, even if the logical core computation graphs obtained by splitting the neural network model are identical, different layouts onto physical cores lead to different inter-core delays; moreover, as the scale of a system composed of multiple chips or multiple chip array boards grows, data transmission delay increases, and uneven on-chip or inter-chip communication capabilities cause non-uniform data transmission delay.
  • Sequential layout: laying out logical cores onto physical cores in order of the logical cores' sequence numbers.
  • Heuristic search: random search according to certain rules.
  • The sequential layout scheme is not optimized for the above-mentioned data transmission delay and its non-uniformity, so the running performance of the neural network model after layout is poor; when the many-core system is large, the heuristic search scheme cannot obtain the optimal layout because the search space is huge. Therefore, in some related technologies, schemes for deploying a neural network model to a many-core architecture cannot solve the above problems of data transmission delay and its non-uniformity, and thus cannot ensure the running performance of the neural network model after layout.
  • An embodiment of the present disclosure provides a method for determining a logical core layout, for laying out multiple logical cores having a determined topology onto multiple physical cores having a determined topology, the method including:
  • step S100 the parameters of the first neural network are updated based on the reinforcement learning method to obtain the target layout according to the first neural network; the first neural network is configured to generate layout actions according to the layout state data of the current time step.
  • the target layout may include topology structures of multiple physical cores, and a mapping relationship between logical cores and physical cores.
  • FIG. 3 is a schematic flowchart of optimizing the layout of logical cores based on reinforcement learning (RL) in an embodiment of the present disclosure.
  • In reinforcement learning, the agent (Agent) generates a layout action at each time step; the environment (Environment) then evaluates the gain of the current time step, and the parameters of the agent are updated, thereby updating the agent's layout strategy so as to obtain a greater expected gain. Through multiple iterations, a logical core layout that meets the requirements, that is, the target layout, can finally be obtained.
  • “Available physical cores” represents idle physical cores, that is, physical cores on which no logical core has been laid out.
  • “Placed logic cores” represents non-idle physical cores, that is, physical cores on which the corresponding logical cores have been laid out.
  • In some embodiments, a first neural network is constructed as the agent; the first neural network can process layout state data and generate layout actions, where the layout state data represents the topology structure of the physical cores and the mapping relationships between the logical cores and physical cores that have been laid out, and the layout action represents the mapping relationship between at least one logical core to be laid out and a physical core.
  • the benefit of the current time step is determined according to the running performance of the neural network model that needs to be deployed to the many-core architecture in the layout state of the current time step.
  • the embodiment of the present disclosure does not specifically limit the specific type of the running performance.
  • the operational performance may be at least one of latency, throughput, power consumption.
  • the latency (Latency) shown in FIG. 3 is only an exemplary illustration.
  • the benefit is determined according to the operating performance, so that the corresponding operating performance of the target layout obtained by executing step S100 can meet the preset requirements.
  • When the target layout of the multiple logical cores with the determined topology is obtained through step S100, the process of adjusting the parameters of the first neural network is also completed, so that running the first neural network subsequently can be expected to yield better results, that is, a target layout with better actual running performance.
  • In some embodiments, step S100 may be used to process multiple logical cores with a determined topology in order to improve the parameters of the first neural network, which is equivalent to "training" the first neural network.
  • the corresponding target layout may actually be obtained, but the target layout may not be actually applied.
  • a plurality of logic cores with different topologies may be processed by means of step S100 to improve parameters in the first neural network and complete the training of the first neural network. Therefore, in the subsequent process, the trained first neural network can be directly used to lay out a plurality of logic cores with an arbitrarily determined topology, and the parameters of the first neural network do not need to be changed in this process.
  • In the embodiments of the present disclosure, the first neural network processes layout state data that represents the topology structure of the physical cores and the mapping relationship between logical cores and physical cores, and generates layout actions; the parameters of the first neural network are updated according to the gain determined by the running performance, so that a logical core layout whose running performance meets preset requirements can finally be obtained.
  • Optimizing the logical core layout based on reinforcement learning effectively reduces the search space of the logical core layout, which is conducive to finding a better layout of the logical cores, and can effectively alleviate the on-chip and inter-chip data transmission delay and the non-uniformity of data transmission delay when a many-core system runs a neural network model, thereby effectively ensuring the running performance of the neural network model after layout.
  • The embodiments of the present disclosure define a data representation structure that matches the connection topology of the physical cores, and can digitize the topology structure of the physical cores and the mapping relationship between the laid-out logical cores and physical cores into data that the first neural network can identify and process, so that reinforcement learning can be applied to optimize the placement of logical cores.
  • the method further includes:
  • Step S200: a data representation structure is determined; the data representation structure represents the mapping relationship between logical cores and physical cores and the topology structure of the multiple physical cores, and the layout state data of the current time step conforms to the data representation structure.
  • step S100 the layout state data of the current time step conforms to the data representation structure; the generated layout actions may also be represented by data conforming to the data representation structure.
  • the embodiment of the present disclosure does not specifically limit the specific form of the data representation structure.
  • For example, the topology of multiple physical cores can be represented by coordinates, and the mapping relationship between logical cores and physical cores can be represented by the correspondence between logical core identifiers and physical core coordinates; a two-dimensional matrix can also represent the topology structure of multiple physical cores and the mapping relationship between logical cores and physical cores; or both can be represented by three-dimensional diagrams.
  • FIG. 5 shows an optional implementation manner of the data representation structure in the embodiment of the present disclosure.
  • the topology structure of a plurality of physical cores and the mapping relationship between logical cores and physical cores are represented by a two-dimensional matrix.
  • The elements of the matrix on the right correspond one-to-one to the physical cores in the many-core architecture on the left, that is, the topology structure of the multiple physical cores is digitized; an element with value 0 corresponds to an idle physical core (a physical core on which no logical core is laid out), while a non-zero element corresponds to a physical core on which a logical core is laid out, its value being the identifier of the logical core deployed on that physical core; in other words, the mapping relationship between logical cores and physical cores is digitized.
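  • The matrix encoding described above can be sketched in a few lines. The 4x4 grid size and the convention of starting logical core IDs at 1 (so that 0 can mean "idle") are illustrative assumptions, not details fixed by the disclosure.

```python
# Layout-state matrix: each element corresponds one-to-one to a physical
# core; 0 marks an idle core, and a non-zero value is the identifier of
# the logical core placed on that physical core.

grid = [[0] * 4 for _ in range(4)]       # a hypothetical 4x4 many-core chip

def place(state, logical_id, row, col):
    assert state[row][col] == 0, "physical core already occupied"
    state[row][col] = logical_id

place(grid, 1, 0, 0)     # logical core 1 -> physical core (0, 0)
place(grid, 2, 0, 1)     # logical core 2 -> physical core (0, 1)

idle = sum(v == 0 for row in grid for v in row)
assert grid[0][0] == 1 and grid[0][1] == 2
assert idle == 14        # 14 physical cores remain idle
```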
  • the output of step S100 is layout data that can represent the target layout and conforms to the data representation structure in the embodiment of the present disclosure.
  • From the output layout data, the topology structure of the multiple physical cores in the target layout can be obtained, as well as the mapping relationship between logical cores and physical cores in the target layout.
  • The mapping relationship between logical cores and physical cores in the target layout determines how to actually lay out the logical cores onto the actual physical cores in the many-core architecture.
  • the data representation structure is determined according to the topology structures of multiple physical cores.
  • In different embodiments, the same neural network may be used to process data conforming to different data representation structures, or corresponding neural networks may be constructed according to different data representation structures; this is not particularly limited in the embodiments of the present disclosure.
  • the step of updating the parameters of the first neural network based on the reinforcement learning method includes:
  • step S110 the layout action of the current time step is generated by the first neural network according to the layout state data of the current time step.
  • step S120 the parameters of the first neural network are updated according to the gain parameters of the current time step to increase the expected value of the gain parameters of the current time step; the gain parameters at least include the current time step The actual gain of the layout state.
  • step S130 it is judged whether the learning termination condition is satisfied, and if so, the learning ends, and if otherwise, the process returns to the step of generating the layout action of the current time step through the first neural network.
  • Steps S110 to S130 correspond to one time step in the reinforcement learning iteration; each iteration comprises multiple time steps, from the initial layout state in which no logical core is laid out on any physical core to the layout state in which all logical cores are deployed on physical cores.
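  • A skeleton of one such iteration (episode) is sketched below; `choose_action` is a hypothetical stand-in for the first neural network, and the parameter updates of step S120 are elided.

```python
# One episode: run from the empty initial layout until every logical core
# has been placed on a physical core, one placement per time step.

def run_episode(n_logical, n_physical, choose_action):
    layout = {}                           # logical core -> physical core
    free = list(range(n_physical))        # available (idle) physical cores
    for step in range(n_logical):         # one time step per placement
        lc = step                         # next logical core to place
        pc = choose_action(layout, free)  # S110: generate layout action
        layout[lc] = pc                   # environment applies the action
        free.remove(pc)
        # S120 (updating network parameters from the gain) is omitted here
    return layout                         # S130: all cores placed, episode ends

# trivial stand-in policy: always take the first free physical core
layout = run_episode(3, 9, choose_action=lambda layout, free: free[0])
assert sorted(layout) == [0, 1, 2]          # all three logical cores placed
assert len(set(layout.values())) == 3       # on three distinct physical cores
```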
  • a reward function can be constructed to calculate the benefit of the current time step according to the layout state data of the current time step and the layout actions of the current time step.
  • As the parameters of the first neural network change, the strategy by which it generates layout actions also changes; that is, different parameters of the first neural network correspond to different strategies.
  • The expected value of the gain parameter of the current time step described in step S120 refers to the gain that can be expected, with the parameters of the selected first neural network kept unchanged (that is, with the selected strategy unchanged), by continuing the layout from the current time step according to the selected strategy.
  • The first neural network may be a convolutional neural network, a recurrent neural network, or a graph neural network, which is not particularly limited in the embodiments of the present disclosure.
  • In some embodiments, updating the parameters of the first neural network according to the gain parameter of the current time step to increase the expected value of the gain parameter of the current time step includes:
  • Step S121: according to the layout state data of the current time step and the layout action of the current time step, the overall gain of the current time step is determined through the second neural network.
  • Step S122: the parameters of the second neural network are updated according to the overall gain of the current time step, so that the overall gain of the current time step approaches the expected value of the cumulative gain of the current time step.
  • The cumulative gain of the current time step is determined by the actual gain of the current time step and the actual gains of the layout states corresponding to all subsequent time steps.
  • Step S123: the parameters of the first neural network are updated according to the overall gain of the current time step, so as to increase the expected value of the overall gain of the current time step.
  • the expected value of the overall benefit of the current time step represents the expected benefit that can be obtained by selecting the layout action of the current time step under the condition that the parameters of the first neural network remain unchanged.
  • The cumulative gain of the current time step represents, with the parameters of the first neural network unchanged, the actual gain of the current time step plus the discounted actual gains of the time steps after it, and thus reflects the value of the layout action of the current time step relative to the value of the layout actions of subsequent time steps.
  • the second neural network also continuously learns and updates the parameters of the second neural network, so that the overall revenue of the current time step determined by the second neural network is close to the expected value of the accumulated revenue of the current time step.
  • the overall return of the current time step is close to the expected value of the cumulative return of the current time step, indicating that the overall return of the current time step generated by the second neural network is more accurate.
  • the cumulative revenue of the current time step can reflect the influence of the value of the layout action of the subsequent time step on the value of the layout action of the current time step
  • the second neural network continuously learns to make
  • the overall revenue of the current time step determined by the second neural network can also reflect the value of the layout action of the subsequent time step to the layout action of the current time step. value impact.
  • the first neural network increases the overall revenue of the current time step through continuous learning, it means that the first neural network considers the future revenue when generating the layout action of the current time step, which is more conducive to searching for multiple logic cores the best layout.
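The actor-critic style update described in steps S121 to S123 can be sketched as follows. This is a minimal illustration only, not the disclosed implementation: the linear "networks", the feature sizes, the greedy action pick, and the placeholder cumulative gain are all assumptions.

```python
import math
import random

random.seed(0)

N_PHYSICAL = 4   # candidate physical cores for the next placement
STATE_DIM = 6    # size of the (flattened) layout-state features
LR = 0.05

# First network ("actor"): a single linear layer with softmax output,
# standing in for whatever architecture the first neural network uses.
w_actor = [[random.gauss(0, 0.1) for _ in range(N_PHYSICAL)] for _ in range(STATE_DIM)]
# Second network ("critic"): linear estimate of the overall gain of (state, action).
w_critic = [random.gauss(0, 0.1) for _ in range(STATE_DIM + N_PHYSICAL)]

def actor_probs(state):
    # Softmax over candidate placements of the next logical core.
    logits = [sum(state[i] * w_actor[i][a] for i in range(STATE_DIM))
              for a in range(N_PHYSICAL)]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def critic_gain(state, onehot):
    feat = state + onehot
    return sum(f * w for f, w in zip(feat, w_critic))

state = [random.gauss(0, 1) for _ in range(STATE_DIM)]
probs = actor_probs(state)
action = max(range(N_PHYSICAL), key=lambda a: probs[a])  # greedy pick, for the sketch
onehot = [1.0 if a == action else 0.0 for a in range(N_PHYSICAL)]
overall_gain = critic_gain(state, onehot)

# S122: nudge the critic's overall-gain estimate toward the observed cumulative gain.
cumulative_gain = 1.0  # placeholder reward signal
feat = state + onehot
err = cumulative_gain - overall_gain
w_critic = [w + LR * err * f for w, f in zip(w_critic, feat)]

# S123: policy-gradient step on the actor, weighted by the critic's estimate,
# so the expected overall gain of the chosen placement increases.
for i in range(STATE_DIM):
    for a in range(N_PHYSICAL):
        grad = state[i] * ((1.0 if a == action else 0.0) - probs[a])
        w_actor[i][a] += LR * overall_gain * grad
```

In a full system the two linear maps would be replaced by the first and second neural networks of the embodiment, and the loop would run once per time step of the layout process.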
  • FIG. 5 shows a model for optimizing the layout of logic cores based on reinforcement learning in an embodiment of the present disclosure.
  • the inputs of the first neural network and the second neural network are the layout state data of the current time step, which conforms to the data representation structure defined in the embodiments of the present disclosure; the first neural network outputs the layout action (action) of the current time step, the layout action of the current time step is input to the second neural network, and the second neural network outputs the overall gain of the current time step.
  • the cumulative gain of the current time step equals the sum of the actual gain of the current time step and the actual gain of each subsequent time step weighted by the discount coefficient of that time step; the discount coefficient represents the magnitude of the influence of the layout actions of subsequent time steps on the overall gain of the current time step.
  • the discount coefficient for each subsequent time step decreases one by one.
  • formula (1) is used to calculate the cumulative gain:
  • R_t = Σ_{i=t}^{n} γ^(i-t)·r_i  (1)
  • R_t is the cumulative gain at time step t
  • n is the number of time steps in an iteration
  • r_i is the actual gain at time step i
  • γ^(i-t) is the discount coefficient at time step i
  • the value range of γ is (0, 1]. It can be understood that in formula (1), the discount coefficient of the current time step is 1, and the discount coefficients of the subsequent time steps decrease in turn.
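The discounted sum described above can be stated as a short computation (the function name and the sample gain values are illustrative):

```python
def cumulative_gain(actual_gains, t, gamma=0.9):
    # Discount coefficient at step i is gamma ** (i - t): it is 1 at the
    # current step t and decreases for each subsequent step (gamma in (0, 1]).
    return sum(gamma ** (i - t) * actual_gains[i]
               for i in range(t, len(actual_gains)))

# Intermediate steps earn nothing; only the finished layout is scored.
gains = [0.0, 0.0, 0.0, 2.0]
print(cumulative_gain(gains, 0))  # 2.0 discounted over three steps: 2 * 0.9**3
```

With `gamma` close to 1, the value of a finished layout propagates almost undiminished back to early placement actions; with a small `gamma`, early actions are credited mainly for near-term gains.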
  • the layout state data after executing the layout action of the current time step can be determined according to the layout state data of the current time step and the layout action of the current time step, so as to obtain the layout of the logic cores.
  • the running performance of the logical core layout can be evaluated after the logical core layout is determined.
  • the actual gain at the current time step represents the operational performance of the logical core layout.
  • the method further includes: in step S300, the actual gain of the current time step is determined according to the layout state data of the current time step and the layout action of the current time step.
  • step S300 includes:
  • step S301 in the case that there are logical cores not yet laid out on physical cores at the current time step, a predetermined gain value is determined as the actual gain of the layout state of the current time step.
  • step S302 in the case that there is no logical core not yet laid out on a physical core at the current time step, the actual gain of the layout state of the current time step is determined according to the running performance of the layout state of the current time step.
  • the predetermined gain value in step S301 is 0.
  • the embodiments of the present disclosure do not particularly limit how the running performance of the logical core layout is evaluated.
  • a hardware model can be used to simulate the layout of the logic cores to evaluate operational performance.
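Steps S301 and S302 amount to a simple reward function. The sketch below is an assumed shape: `evaluate_performance` stands in for whatever hardware model or simulator scores a finished layout.

```python
def actual_gain(layout, unplaced_cores, evaluate_performance, predetermined_value=0.0):
    """Actual gain of the current time step (steps S301/S302)."""
    if unplaced_cores:
        # S301: the layout is still partial, so return the predetermined value (0 here).
        return predetermined_value
    # S302: every logical core is placed; score the complete layout by its
    # running performance (e.g. latency, throughput, power consumption).
    return evaluate_performance(layout)

partial = actual_gain({"c0": 0}, ["c1", "c2"], lambda layout: 9.0)  # still partial
final = actual_gain({"c0": 0, "c1": 1}, [], lambda layout: 9.0)     # complete layout
```

Only completed layouts contribute a nonzero reward, which is why the discounting of formula (1) matters: it carries that terminal score back to the earlier placement actions.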
  • step S130 includes:
  • step S131 it is determined whether there is a logical core that is not allocated to a physical core at the current time step.
  • step S132 in the case that there is no logical core that is not laid out on the physical core at the current time step, it is determined whether the iteration termination condition is satisfied.
  • step S133 if the iteration termination condition is satisfied, it is determined that the learning termination condition is satisfied.
  • the embodiments of the present disclosure do not particularly limit how step S132 determines whether the iteration termination condition is satisfied.
  • the iteration termination condition is satisfied when at least one of the following holds: the number of iterations at the current time step reaches a predetermined number of iterations; the overall gain of the current time step reaches a predetermined gain value; the parameters of the first neural network and the parameters of the second neural network both converge.
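The termination test of steps S131 to S133 can be expressed as a small predicate; the argument names are illustrative, and the listed conditions are combined with a logical OR as in the embodiment above.

```python
def learning_terminated(unplaced_cores, n_iter, max_iter,
                        overall_gain, target_gain, params_converged):
    # S131: a partial layout can never end the learning loop.
    if unplaced_cores:
        return False
    # S132/S133: any one of the listed conditions terminates the iteration,
    # and a satisfied iteration termination condition ends the learning.
    return (n_iter >= max_iter
            or overall_gain >= target_gain
            or params_converged)
```

If none of the conditions hold, the method returns to generating the next layout action, as in step S110.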
  • the method further includes:
  • step S401 the layout state data of the next time step is generated according to the layout state data of the current time step and the layout action of the current time step.
  • step S402 in the case that there is no logical core not yet laid out on a physical core at the current time step, it is judged whether the running performance of the layout represented by the layout state data of the next time step is better than the running performance of the optimal layout of the current time step.
  • step S403 in the case that the running performance of the layout represented by the layout state data of the next time step is better than the running performance of the optimal layout of the current time step, the layout represented by the layout state data of the next time step is determined as the optimal layout of the next time step; in the case that the running performance of the layout represented by the layout state data of the next time step is inferior to the running performance of the optimal layout of the current time step, the optimal layout of the current time step is used as the optimal layout of the next time step.
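The bookkeeping of steps S402 and S403 is a running maximum over completed layouts. In this sketch, a higher `running_performance` score is assumed to mean a better layout:

```python
def next_optimal(current_best, candidate, running_performance):
    # S402/S403: a completed candidate replaces the stored optimum only if
    # it runs better; otherwise the current optimum is carried forward.
    if current_best is None:
        return candidate
    if running_performance(candidate) > running_performance(current_best):
        return candidate
    return current_best
```

Because the stored optimum is only ever replaced by a strictly better layout, the layout returned at the end of the iterations is the best one encountered at any time step.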
  • the method further includes:
  • step S404 the layout state data of the next time step is reset to the initial layout state data.
  • the stored optimal layout is used as the target layout.
  • the corresponding logical core layout is stored, and the target layout is determined according to the stored logical core layout.
  • the layout with the best running performance may be determined as the target layout.
  • the method further includes: in step S500, according to a predetermined algorithm, determining identification information of a plurality of logical cores having a determined topology.
  • the identification information of the multiple logic cores may be determined according to the flow directions of signals in the multiple logic cores.
  • the identification information of the logical core is the serial number of the logical core.
  • step S500 can be beneficial to obtain a better logical core layout.
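Numbering the logical cores by signal flow (step S500, determining the identification information) is, in effect, a topological ordering of the logical-core computation graph. Below is a sketch using Kahn's algorithm, with an assumed edge-list graph representation; the patent does not fix the predetermined algorithm, so this is one plausible choice.

```python
from collections import deque

def number_by_signal_flow(edges, cores):
    # edges: (src, dst) pairs giving the direction of signal flow
    # between logical cores; cores: all logical core identifiers.
    indeg = {c: 0 for c in cores}
    succ = {c: [] for c in cores}
    for src, dst in edges:
        succ[src].append(dst)
        indeg[dst] += 1
    queue = deque(c for c in cores if indeg[c] == 0)
    order = {}
    while queue:
        c = queue.popleft()
        order[c] = len(order)  # serial number = position in the signal flow
        for nxt in succ[c]:
            indeg[nxt] -= 1
            if indeg[nxt] == 0:
                queue.append(nxt)
    return order

numbering = number_by_signal_flow(
    [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")],
    ["a", "b", "c", "d"])
```

Cores that feed signals to others receive smaller serial numbers, so consecutive numbers tend to correspond to directly communicating cores, which is what makes such a numbering useful as input to the layout model.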
  • the number of logical cores placed to physical cores in each placement action is a fixed value.
  • the number of physical cores and logical cores laid out in each layout action can be fixed instead of changing.
  • an embodiment of the present disclosure provides a method for training a layout model, where the layout model is used to lay out a plurality of logical cores with a determined topology onto a plurality of physical cores with a determined topology, and the training method includes:
  • step S601 a plurality of samples are determined, and each sample includes information of a plurality of logical cores having a determined topology.
  • the information of multiple logical cores having a determined topology may include identification information of the multiple logical cores.
  • step S602 any one of the above-described methods for determining logical core layout is performed on the samples.
  • step S603 the obtained first neural network is used as a layout model.
  • any one of the above-described methods for determining logical core layout is performed on a plurality of logic cores, so as to determine the logical core layout of the plurality of logic cores and to determine the layout model.
  • an embodiment of the present disclosure provides a method for determining logical core layout, for placing a plurality of logical cores with a determined topology to a plurality of physical cores with a determined topology, the method comprising:
  • step S700 the identification information of the multiple logic cores with the determined topology is input into the layout model to obtain the target layout of the multiple logic cores with the determined topology; the layout model is obtained by the training method of the layout model according to the second aspect of the embodiments of the present disclosure.
  • the parameters of the first neural network in the layout model remain unchanged.
  • an embodiment of the present disclosure provides an electronic device, which includes: one or more processors 101; and a memory 102 on which one or more programs are stored. When the one or more programs are executed by the one or more processors, the one or more processors implement at least one of the following methods: the method for determining logical core layout according to the first aspect of the embodiments of the present disclosure; the training method of the layout model according to the second aspect of the embodiments of the present disclosure; the method for determining logical core layout according to the third aspect of the embodiments of the present disclosure.
  • One or more I/O interfaces 103 are connected between the processor and the memory, and are configured to realize the information exchange between the processor and the memory.
  • the processor 101 is a device with data processing capability, including but not limited to a central processing unit (CPU); the memory 102 is a device with data storage capability, including but not limited to random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), and flash memory (FLASH); the I/O interface (read-write interface) 103 is connected between the processor 101 and the memory 102 and can realize the information interaction between the processor 101 and the memory 102, including but not limited to a data bus (Bus) and the like.
  • the processor 101, the memory 102, and the I/O interface 103 are interconnected by a bus 104, which is in turn connected to other components of the computing device.
  • an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored, and the program, when executed by a processor, implements at least one of the following methods: the method for determining logical core layout according to the first aspect of the embodiments of the present disclosure; the training method of the layout model according to the second aspect of the embodiments of the present disclosure; and the method for determining logical core layout according to the third aspect of the embodiments of the present disclosure.
  • Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
  • computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for the storage of information, such as computer-readable instructions, data structures, program modules, or other data.
  • computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
  • communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media, as is well known to those of ordinary skill in the art.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Hardware Design (AREA)
  • Neurology (AREA)
  • Design And Manufacture Of Integrated Circuits (AREA)

Abstract

Provided in the present disclosure is a method for determining a logical core arrangement, the method being used to arrange multiple logical cores that have determined topologies to multiple physical cores that have determined topologies. The method comprises: updating parameters of a first neural network on the basis of reinforcement learning so as to obtain a target arrangement according to the first neural network. The target arrangement comprises a mapping relationship between the logical cores and the physical cores; the first neural network is configured to generate an arrangement action according to arrangement state data of a current time step, and the arrangement state data represents the topological structures of multiple physical cores that have determined topologies and the mapping relationship between the arranged logical cores and physical cores; and the arrangement action represents the mapping relationship between at least one logical core to be arranged and a physical core. Also provided in the present disclosure are an arrangement model training method, a method for determining a logical core arrangement, an electronic device and a computer-readable medium.

Description

Method for determining logical core layout, model training method, electronic device, medium

Technical Field

The embodiments of the present disclosure relate to the field of computer technologies, and in particular to a method for determining the layout of a logical core, a training method for a layout model, a method for determining the layout of a logical core, an electronic device, and a computer-readable medium.

Background

Many-core architecture is a parallel processing architecture widely used to execute neural network models. As shown in Figure 1, in a many-core architecture, each physical core (CORE) can complete a certain computing function; a certain number of physical cores (CORE) are connected through a certain topology to form a chip (CHIP); a certain number of chips (CHIP) are connected through a certain topology to form a chip array board; and so on, so that a larger-scale system can be obtained by extension.

A neural network model is deployed to a many-core architecture through the following steps: (1) the neural network model is split and mapped into a logical core computation graph, which is composed of multiple logical cores connected through a certain topology; (2) the logical cores are laid out to physical cores.

In some related technologies, the effect of schemes for deploying a neural network model to a many-core architecture is not satisfactory.
SUMMARY OF THE INVENTION

Embodiments of the present disclosure provide a method for determining the layout of a logical core, a training method for a layout model, a method for determining the layout of a logical core, an electronic device, and a computer-readable medium.

In a first aspect, an embodiment of the present disclosure provides a method for determining logical core layout, for laying out multiple logical cores with a determined topology onto multiple physical cores with a determined topology, the method comprising: updating parameters of a first neural network based on a reinforcement learning method, so as to obtain a target layout according to the first neural network, where the target layout includes the mapping relationship between the logical cores and the physical cores; the first neural network is configured to generate a layout action according to the layout state data of the current time step, where the layout state data represents the topology structure of the multiple physical cores with a determined topology and the mapping relationship between the logical cores that have been laid out and the physical cores; and the layout action represents the mapping relationship between at least one logical core to be laid out and a physical core.

In some embodiments, before the step of updating the parameters of the first neural network based on the reinforcement learning method, the method further includes: determining a data representation structure, where the data representation structure represents the mapping relationship between logical cores and physical cores and the topology structure of the multiple physical cores, and the layout state data of the current time step conforms to the data representation structure.

In some embodiments, the step of updating the parameters of the first neural network based on the reinforcement learning method includes: generating, by the first neural network, the layout action of the current time step according to the layout state data of the current time step; updating the parameters of the first neural network according to the gain parameter of the current time step, so as to increase the expected value of the gain parameter of the current time step, where the gain parameter includes at least the actual gain of the layout state of the current time step; and judging whether the learning termination condition is satisfied: if so, the learning ends; if not, returning to the step of generating the layout action of the current time step through the first neural network.

In some embodiments, updating the parameters of the first neural network according to the gain parameter of the current time step so as to increase the expected value of the gain parameter of the current time step includes: determining the overall gain of the current time step through a second neural network according to the layout state data of the current time step and the layout action of the current time step; updating the parameters of the second neural network according to the overall gain of the current time step, so that the overall gain of the current time step approaches the expected value of the cumulative gain of the current time step, where the cumulative gain of the current time step is determined by the actual gain of the current time step and the actual gains of all subsequent time steps; and updating the parameters of the first neural network according to the overall gain of the current time step, so as to increase the expected value of the overall gain of the current time step.

In some embodiments, the method further includes: determining the actual gain of the current time step according to the layout state data of the current time step and the layout action of the current time step.

In some embodiments, the step of determining the actual gain of the current time step according to the layout state data of the current time step and the layout action of the current time step includes: in the case that there are logical cores not yet laid out on physical cores at the current time step, determining a predetermined gain value as the actual gain of the current time step; and in the case that there is no logical core not yet laid out on a physical core at the current time step, determining the actual gain of the current time step according to the running performance of the layout state of the current time step.

In some embodiments, the running performance includes at least one of latency, throughput, and power consumption.

In some embodiments, the cumulative gain of the current time step equals the sum of the actual gain of the current time step and the actual gain of each subsequent time step weighted by the discount coefficient of that time step, where the discount coefficient represents the magnitude of the influence of the layout actions of subsequent time steps on the overall gain of the current time step.

In some embodiments, the discount coefficient of each subsequent time step decreases step by step.
In some embodiments, the step of judging whether the learning termination condition is satisfied includes: judging whether there are logical cores not yet laid out on physical cores at the current time step; in the case that there is no logical core not yet laid out on a physical core at the current time step, judging whether the iteration termination condition is satisfied; and in the case that the iteration termination condition is satisfied, determining that the learning termination condition is satisfied.

In some embodiments, the iteration termination condition is satisfied when at least one of the following holds: the number of iterations at the current time step reaches a predetermined number of iterations; the overall gain of the current time step reaches a predetermined gain value; the parameters of the first neural network and the parameters of the second neural network both converge.

In some embodiments, the method further includes: generating the layout state data of the next time step according to the layout state data of the current time step and the layout action of the current time step; in the case that there is no logical core not yet laid out on a physical core at the current time step, judging whether the running performance of the layout represented by the layout state data of the next time step is better than the running performance of the optimal layout of the current time step; in the case that it is better, determining the layout represented by the layout state data of the next time step as the optimal layout of the next time step; and in the case that it is inferior, using the optimal layout of the current time step as the optimal layout of the next time step.

In some embodiments, in the case that the iteration termination condition is not satisfied and there is no logical core not yet laid out on a physical core at the current time step, the method further includes: resetting the layout state data of the next time step to the initial layout state data.

In some embodiments, in the case that the iteration termination condition is satisfied, the step of obtaining the target layout includes: using the optimal layout of the next time step as the target layout.

In some embodiments, in the case that the iteration termination condition is satisfied, the step of obtaining the target layout includes: determining, among the layouts represented by the layout state data of at least one time step before the current time step at which there was no logical core not yet laid out on a physical core, the layout with the best running performance; and determining the layout with the best running performance as the target layout.

In some embodiments, the first neural network is any one of a convolutional neural network, a recurrent neural network, and a graph neural network.

In some embodiments, the method further includes: determining, according to a predetermined algorithm, the identification information of the multiple logical cores with a determined topology.

In some embodiments, the step of determining, according to a predetermined algorithm, the identification information of the multiple logical cores with a determined topology includes: determining the identification information of the multiple logical cores according to the signal flow directions among the multiple logical cores.

In some embodiments, the number of logical cores laid out to physical cores in each layout action is a fixed value.
In a second aspect, an embodiment of the present disclosure provides a training method for a layout model, where the layout model is used to lay out multiple logical cores with a determined topology onto multiple physical cores with a determined topology, the training method comprising: determining multiple samples, where each sample includes information of multiple logical cores with a determined topology; performing, on the samples, any one of the above methods for determining logical core layout; and using the obtained first neural network as the layout model.

In a third aspect, an embodiment of the present disclosure provides a method for determining logical core layout, for laying out multiple logical cores with a determined topology onto multiple physical cores with a determined topology, the method comprising: inputting the identification information of the multiple logical cores with a determined topology into a layout model, so as to obtain the target layout of the multiple logical cores with a determined topology, where the layout model is obtained according to the training method of the layout model described in the second aspect of the embodiments of the present disclosure.

In a fourth aspect, an embodiment of the present disclosure provides an electronic device, comprising: one or more processors; and a storage device on which one or more programs are stored, which, when executed by the one or more processors, cause the one or more processors to implement at least one of the following methods: the method for determining logical core layout according to the first aspect of the embodiments of the present disclosure; the training method of the layout model according to the second aspect of the embodiments of the present disclosure; the method for determining logical core layout according to the third aspect of the embodiments of the present disclosure.

In a fifth aspect, an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored, and the program, when executed by a processor, implements at least one of the following methods: the method for determining logical core layout according to the first aspect of the embodiments of the present disclosure; the training method of the layout model according to the second aspect of the embodiments of the present disclosure; the method for determining logical core layout according to the third aspect of the embodiments of the present disclosure.

In the method for determining logical core layout provided by the embodiments of the present disclosure, based on reinforcement learning, a first neural network processes layout state data that represents the topology structure of the physical cores and the mapping relationship between the logical cores and the physical cores, so as to generate layout actions, and the parameters of the first neural network are updated according to a gain determined from running performance, so that a layout of the logical cores whose running performance meets preset requirements can finally be obtained. Optimizing the logical core layout based on reinforcement learning effectively reduces the search space of logical core layouts, facilitates finding a better layout of the logical cores, and can effectively address on-chip and inter-chip data transmission delays, and the non-uniformity of such delays, when a many-core architecture system runs a neural network model, thereby effectively ensuring the running performance of the neural network model after layout.
Description of the Drawings
附图用来提供对本公开实施例的进一步理解,并且构成说明书的一部分,与本公开的实施例一起用于解释本公开,并不构成对本公开的限制。通过参考附图对详细示例实施例进行描述,以上和其它特征和优点对本领域技术人员将变得更加显而易见,在附图中:The accompanying drawings are used to provide a further understanding of the embodiments of the present disclosure, and constitute a part of the specification, and together with the embodiments of the present disclosure, they are used to explain the present disclosure, and do not constitute a limitation to the present disclosure. The above and other features and advantages will become more apparent to those skilled in the art by describing detailed example embodiments with reference to the accompanying drawings, in which:
图1为众核架构的示意图。Figure 1 is a schematic diagram of a many-core architecture.
图2为本公开实施例提供的一种确定逻辑核布局的方法中部分步骤的流程图。FIG. 2 is a flowchart of some steps in a method for determining the layout of a logic core provided by an embodiment of the present disclosure.
图3为本公开实施例中基于强化学习对逻辑核布局进行优化的流程示意图。FIG. 3 is a schematic flowchart of optimizing the layout of logic cores based on reinforcement learning in an embodiment of the present disclosure.
图4为本公开实施例提供的另一种确定逻辑核布局的方法中部分步骤的流程图。FIG. 4 is a flowchart of some steps in another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
图5为本公开实施例中基于强化学习对逻辑核布局进行优化的模型示意图。FIG. 5 is a schematic diagram of a model for optimizing logic core layout based on reinforcement learning in an embodiment of the present disclosure.
图6为本公开实施例提供的又一种确定逻辑核布局的方法中部分步骤的流程图。FIG. 6 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
图7为本公开实施例提供的再一种确定逻辑核布局的方法中部分步骤的流程图。FIG. 7 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
图8为本公开实施例提供的再一种确定逻辑核布局的方法中部分步骤的流程图。FIG. 8 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
图9为本公开实施例提供的再一种确定逻辑核布局的方法中部分步骤的流程图。FIG. 9 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
图10为本公开实施例提供的再一种确定逻辑核布局的方法中部分步骤的流程图。FIG. 10 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
图11为本公开实施例提供的再一种确定逻辑核布局的方法中部分步骤的流程图。FIG. 11 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
图12为本公开实施例提供的再一种确定逻辑核布局的方法中部分步骤的流程图。FIG. 12 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
图13为本公开实施例提供的一种布局模型的训练方法的流程图。FIG. 13 is a flowchart of a training method for a layout model provided by an embodiment of the present disclosure.
图14为本公开实施例提供的一种确定逻辑核布局的方法的流程图。FIG. 14 is a flowchart of a method for determining the layout of a logic core provided by an embodiment of the present disclosure.
图15为本公开实施例提供的一种电子设备的组成框图。FIG. 15 is a block diagram of the composition of an electronic device according to an embodiment of the present disclosure.
图16为本公开实施例提供的一种计算机可读介质的组成框图。FIG. 16 is a block diagram of the composition of a computer-readable medium provided by an embodiment of the present disclosure.
Detailed Description
To enable those skilled in the art to better understand the technical solutions of the present disclosure, the method for determining a logical core layout, the training method for a layout model, the further method for determining a logical core layout, the electronic device, and the computer-readable medium provided by the present disclosure are described in detail below with reference to the accompanying drawings.
Example embodiments are described more fully hereinafter with reference to the accompanying drawings, but the example embodiments may be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that the present disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
在不冲突的情况下,本公开各实施例及实施例中的各特征可相互组合。Various embodiments of the present disclosure and various features of the embodiments may be combined with each other without conflict.
如本文所使用的,术语“和/或”包括一个或多个相关列举条目的任何和所有组合。As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for describing particular embodiments only and is not intended to limit the present disclosure. As used herein, the singular forms "a" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that when the terms "comprising" and/or "made of" are used in this specification, they specify the presence of the stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
除非另外限定,否则本文所用的所有术语(包括技术和科学术语)的含义与本领域普通技术人员通常理解的含义相同。还将理解,诸如那些在常用字典中限定的那些术语应当被解释为具有与其在相关技术以及本公开的背景下的含义一致的含义,且将不解释为具有理想化或过度形式上的含义,除非本文明确如此限定。Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will also be understood that terms such as those defined in common dictionaries should be construed as having meanings consistent with their meanings in the context of the related art and the present disclosure, and will not be construed as having idealized or over-formal meanings, unless expressly so limited herein.
The inventors of the present disclosure have found that, when a neural network model is deployed onto a many-core architecture, even if the logical core computation graphs obtained by splitting the neural network model are identical, different physical layouts onto the physical cores lead to different inter-core delays. Moreover, as the scale of a system composed of multiple chips or multiple chip-array boards grows, the data transmission delay increases, and uneven on-chip or inter-chip communication capabilities further cause the data transmission delay to be non-uniform.
Some related technologies mostly use sequential layout (placing logical cores onto physical cores in the order of their serial numbers) or heuristic search (random search according to certain rules) to optimize the layout of logical cores onto physical cores. However, the sequential layout scheme is not optimized against the above-mentioned data transmission delay and its non-uniformity, so the running performance of the neural network model after layout is poor; and when the many-core system is large, the heuristic search scheme cannot find the optimal layout because the search space is enormous. Therefore, in some related technologies, none of the schemes for deploying a neural network model onto a many-core architecture can solve the above-mentioned data transmission delay and its non-uniformity, and thus none can ensure the running performance of the neural network model after layout.
In view of this, in a first aspect, referring to FIG. 2, an embodiment of the present disclosure provides a method for determining a logical core layout, used to lay out a plurality of logical cores having a determined topology onto a plurality of physical cores having a determined topology. The method includes:
在步骤S100中,基于强化学习方式更新第一神经网络的参数,以根据第一神经网络 得到目标布局;所述第一神经网络配置为根据当前时间步的布局状态数据生成布局动作。In step S100, the parameters of the first neural network are updated based on the reinforcement learning method to obtain the target layout according to the first neural network; the first neural network is configured to generate layout actions according to the layout state data of the current time step.
其中,目标布局可以包括多个物理核的拓扑结构,以及逻辑核与物理核之间的映射关系。The target layout may include topology structures of multiple physical cores, and a mapping relationship between logical cores and physical cores.
FIG. 3 is a schematic flowchart of optimizing a logical core layout based on reinforcement learning (RL) in an embodiment of the present disclosure. As shown in FIG. 3, in reinforcement learning, at each time step the agent generates a layout action, and the environment then evaluates the reward of the current time step and updates the agent's parameters, thereby updating the agent's layout policy so as to pursue a larger expected reward; after many iterations, a logical core layout that meets the requirements, i.e. the target layout, is finally obtained. In FIG. 3, "Available physical cores" denotes idle physical cores, i.e. physical cores onto which no logical core has been laid out, and "Placed logic cores" denotes non-idle physical cores, i.e. physical cores onto which logical cores have been laid out.
In an embodiment of the present disclosure, a first neural network is constructed as the agent. The first neural network can process layout state data and generate layout actions, where the layout state data represents the topology of the plurality of physical cores having the determined topology and the mapping relationship between the already-placed logical cores and the physical cores, and a layout action represents the mapping relationship between at least one logical core to be placed and the physical cores.
In the reinforcement learning process shown in FIG. 3, the reward of the current time step is determined according to the running performance of the neural network model to be deployed onto the many-core architecture under the layout state of the current time step. The embodiments of the present disclosure do not specifically limit the type of running performance; for example, the running performance may be at least one of latency, throughput, and power consumption, and the latency shown in FIG. 3 is only illustrative. Determining the reward according to the running performance enables the corresponding running performance of the target layout obtained by performing step S100 to meet preset requirements.
In the embodiments of the present disclosure, when the target layout of the plurality of logical cores having the determined topology is obtained through step S100, the process of adjusting the parameters of the first neural network is also completed, so that in subsequent runs the first neural network is expected to achieve better results, i.e. to obtain target layouts with better actual running performance.
As one application of the embodiments of the present disclosure, for a plurality of logical cores having a determined topology that need to be actually laid out, the desired optimal target layout may be obtained through step S100, and the plurality of logical cores are then actually laid out according to that optimal target layout.
As another application of the embodiments of the present disclosure, a plurality of logical cores having a determined topology may be processed through step S100 to improve the parameters of the first neural network, which amounts to "training" the first neural network. Of course, a corresponding target layout is actually obtained in this process as well, but that target layout need not be put to actual use.
As yet another application of the embodiments of the present disclosure, several pluralities of logical cores with different topologies may be processed through step S100 to improve the parameters of the first neural network and complete its training. In subsequent use, the trained first neural network can then directly lay out a plurality of logical cores having any determined topology, without changing the parameters of the first neural network in that process.
In the method for determining a logical core layout provided by the embodiments of the present disclosure, based on reinforcement learning, a first neural network processes layout state data that represents the topology of the physical cores and the mapping relationship between logical cores and physical cores, and generates layout actions; the parameters of the first neural network are updated according to a reward determined from the running performance, so that a layout of the logical cores whose running performance meets preset requirements is finally obtained. Optimizing the logical core layout based on reinforcement learning effectively reduces the search space of logical core layouts, which facilitates finding a better layout of the logical cores, and can effectively alleviate the on-chip and inter-chip data transmission delay, and the non-uniformity of that delay, that arise when a many-core system runs a neural network model, thereby effectively ensuring the running performance of the neural network model after layout.
The embodiments of the present disclosure define a data representation structure that matches the connection topology of the physical cores, and can turn the topology of the physical cores and the mapping relationship between the already-placed logical cores and the physical cores into data that the first neural network can recognize and process, so that reinforcement learning can be applied to optimizing the logical core layout.
相应地,参照图4,在一些实施例中,在步骤S100之前,所述方法还包括:Correspondingly, referring to FIG. 4, in some embodiments, before step S100, the method further includes:
在步骤S200中,确定数据化表征结构,所述数据化表征结构表征逻辑核与物理核的映射关系和多个物理核的拓扑结构,所述当前时间步的布局状态数据符合所述数据化表征结构。In step S200, a data-based representation structure is determined, the data-based representation structure represents the mapping relationship between logical cores and physical cores and the topology structures of multiple physical cores, and the layout state data of the current time step conforms to the data-based representation structure.
需要说明的是,在步骤S100中当前时间步的布局状态数据符合所述数据化表征结构;生成的布局动作也可以用符合所述数据化表征结构的数据进行表示。It should be noted that, in step S100, the layout state data of the current time step conforms to the data representation structure; the generated layout actions may also be represented by data conforming to the data representation structure.
The embodiments of the present disclosure do not specifically limit the concrete form of the data representation structure. For example, the topology of the plurality of physical cores may be represented by coordinates, with the correspondence between logical core identifiers and physical core coordinates representing the mapping relationship between logical cores and physical cores; alternatively, a two-dimensional matrix may represent both the topology of the plurality of physical cores and the mapping relationship between logical cores and physical cores; a three-dimensional graph may also be used to represent them.
FIG. 5 shows an optional implementation of the data representation structure in an embodiment of the present disclosure. As shown in FIG. 5, a two-dimensional matrix represents the topology of the plurality of physical cores and the mapping relationship between logical cores and physical cores. The elements of the matrix on the right correspond one-to-one to the physical cores of the many-core architecture on the left, i.e. the topology of the plurality of physical cores is turned into data. An element whose value is 0 corresponds to an idle physical core (a physical core onto which no logical core has been laid out); a non-zero element corresponds to a physical core onto which a logical core has been laid out, and its value is the identifier of the logical core deployed on that physical core, i.e. the mapping relationship between logical cores and physical cores is turned into data.
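As an illustration of this matrix representation, the following Python sketch builds such a placement matrix and lists the idle physical cores; the grid size, core identifiers, and function names are hypothetical and chosen only for illustration, not part of the disclosure:

```python
# Each matrix element corresponds to one physical core: 0 marks an idle core,
# a non-zero value is the identifier of the logical core placed on that core.

def make_state(rows, cols, placements):
    """Build the placement matrix from {logical_core_id: (row, col)} pairs."""
    state = [[0] * cols for _ in range(rows)]
    for core_id, (r, c) in placements.items():
        state[r][c] = core_id
    return state

def idle_positions(state):
    """Coordinates of physical cores that hold no logical core yet."""
    return [(r, c) for r, row in enumerate(state)
            for c, v in enumerate(row) if v == 0]

# Example: a 3x3 physical-core grid with logical cores 1 and 2 placed.
s = make_state(3, 3, {1: (0, 0), 2: (1, 2)})
```

Such a matrix both encodes the physical-core topology (by element position) and the logical-to-physical mapping (by element value), which is what allows a neural network to consume the layout state directly.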
It should be noted that, in the embodiments of the present disclosure, the output of step S100 is layout data that represents the target layout and conforms to the data representation structure of the embodiments of the present disclosure. From the output layout data, the topology of the plurality of physical cores in the target layout and the mapping relationship between logical cores and physical cores in the target layout can be obtained, and this mapping relationship determines how the logical cores are laid out onto the actual physical cores in an actual many-core architecture.
作为一种可选的实施方式,根据多个物理核的拓扑结构确定所述数据化表征结构。As an optional implementation manner, the data representation structure is determined according to the topology structures of multiple physical cores.
It should be noted that, in the embodiments of the present disclosure, the same neural network may process data conforming to different data representation structures, or corresponding neural networks may be constructed for different data representation structures; the embodiments of the present disclosure place no special restriction on this.
在一些实施例中,参照图6,所述基于强化学习方式更新第一神经网络的参数的步骤包括:In some embodiments, referring to FIG. 6 , the step of updating the parameters of the first neural network based on the reinforcement learning method includes:
在步骤S110中,根据当前时间步的布局状态数据,通过所述第一神经网络生成所述当前时间步的布局动作。In step S110, the layout action of the current time step is generated by the first neural network according to the layout state data of the current time step.
In step S120, the parameters of the first neural network are updated according to the reward parameter of the current time step, so as to increase the expected value of the reward parameter of the current time step; the reward parameter includes at least the actual reward of the layout state of the current time step.
In step S130, it is judged whether the learning termination condition is satisfied; if so, the learning ends, and otherwise the process returns to the step of generating the layout action of the current time step through the first neural network.
In the embodiments of the present disclosure, steps S110 to S130 correspond to one time step of a reinforcement learning iteration, and each iteration corresponds to the multiple time steps leading from the initial layout state, in which no logical core is laid out on a physical core, to the layout state in which all logical cores are deployed on physical cores.
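The per-time-step loop of steps S110 to S130 can be sketched as follows. The `policy`, `reward_fn`, and `update` callables are placeholders standing in for the first neural network, the reward computation, and the parameter update; they are not the disclosed implementation:

```python
# Schematic of one RL iteration: at each time step the policy proposes a
# placement action (S110), a reward is observed and parameters are
# (conceptually) updated (S120), and the loop ends once all logical cores
# are placed (S130).

def run_episode(num_logic_cores, policy, reward_fn, update):
    state = {}                           # logical core id -> physical position
    for t in range(num_logic_cores):
        action = policy(state, t)        # S110: propose a placement for core t
        reward = reward_fn(state, action)
        update(state, action, reward)    # S120: adjust network parameters
        state[t] = action                # apply the layout action
    return state                         # S130: all cores placed, episode ends

# Toy run: place core t at position (0, t); no-op parameter update.
placed = run_episode(
    4,
    policy=lambda s, t: (0, t),
    reward_fn=lambda s, a: 0.0,
    update=lambda s, a, r: None,
)
```

One outer loop over episodes, repeating `run_episode` while the updates change the policy, would correspond to the multiple iterations described above.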
The embodiments of the present disclosure do not specifically limit how the reward parameter of the current time step is determined. For example, a reward function may be constructed, and the reward of the current time step may be computed from the layout state data of the current time step and the layout action of the current time step.
It should be noted that when the parameters of the first neural network change, the policy by which the first neural network generates layout actions also changes; that is, different parameters of the first neural network correspond to different policies. The expected value of the reward parameter of the current time step described in step S120 refers to the reward expected to be obtained by completing the layout of all logical cores onto physical cores from the current time step onward according to the selected policy, with the selected parameters of the first neural network (i.e. the selected policy) held fixed.
In the embodiments of the present disclosure, the first neural network may be a convolutional neural network, a recurrent neural network, or a graph neural network; the embodiments of the present disclosure place no special restriction on this.
In some embodiments, referring to FIG. 7, updating the parameters of the first neural network according to the reward parameter of the current time step so as to increase the expected value of the reward parameter of the current time step includes:
In step S121, determining the overall return of the current time step through a second neural network according to the layout state data of the current time step and the layout action of the current time step.
In step S122, updating the parameters of the second neural network according to the overall return of the current time step, so that the overall return of the current time step approaches the expected value of the cumulative return of the current time step, where the cumulative return of the current time step is determined by the actual reward of the current time step and the actual rewards of the layout states corresponding to all subsequent time steps.
In step S123, updating the parameters of the first neural network according to the overall return of the current time step, so as to increase the expected value of the overall return of the current time step.
In the embodiments of the present disclosure, the expected value of the overall return of the current time step represents the expected return obtainable by selecting the layout action of the current time step while the parameters of the first neural network are held fixed.
The cumulative return of the current time step represents, with the parameters of the first neural network held fixed, the value at the current time step of the actual rewards of the subsequent time steps discounted back to the current time step, indicating that the value of the layout action of the current time step is related to the value of the layout actions of subsequent time steps.
In the embodiments of the present disclosure, the second neural network also continuously learns and updates its parameters, so that the overall return of the current time step determined by the second neural network approaches the expected value of the cumulative return of the current time step. The closer the overall return of the current time step is to that expected value, the more accurate the overall return generated by the second neural network.
It should be noted that, in the embodiments of the present disclosure, since the cumulative return of the current time step reflects how the value of the layout actions of subsequent time steps influences the value of the layout action of the current time step, once the second neural network has learned to make the overall return of the current time step approach the expected value of the cumulative return of the current time step, the overall return it determines also reflects that influence. When the first neural network learns to increase the overall return of the current time step, this means the first neural network takes future returns into account when generating the layout action of the current time step, which makes it more likely that the optimal layout of the plurality of logical cores is found.
FIG. 5 shows a model for optimizing the logical core layout based on reinforcement learning in an embodiment of the present disclosure. As shown in FIG. 5, the input of both the first neural network and the second neural network is the layout state data of the current time step, conforming to the data representation structure defined in the embodiments of the present disclosure; the first neural network outputs the layout action of the current time step, which is also fed into the second neural network; and the second neural network outputs the overall return Q(s, a) of the current time step.
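As a minimal numeric illustration of the second network's role (the actual networks of FIG. 5 are not reproduced here), the estimate Q(s, a) can be regressed toward the observed cumulative return; the learning rate and target value below are arbitrary placeholders:

```python
# The critic's regression target at a time step is the cumulative return R_t;
# each update nudges the current estimate toward it. With the target held
# fixed, repeated updates converge to the target.

def critic_update(q_value, cumulative_return, lr=0.1):
    """Move the overall-return estimate toward the observed cumulative return."""
    td_error = cumulative_return - q_value
    return q_value + lr * td_error, td_error

q = 0.0
for _ in range(100):
    q, _ = critic_update(q, cumulative_return=5.0)
```

The first network would then be pushed in whatever direction increases this Q(s, a) estimate, which is the sense in which step S123 "increases the expected value of the overall return".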
In some embodiments, the cumulative return of the current time step equals the sum of the actual reward of the current time step and the actual rewards of each subsequent time step weighted by that time step's discount coefficient, where the discount coefficient represents the magnitude of the influence of the layout action of a subsequent time step on the overall return of the current time step.
In some embodiments, the discount coefficients of the successive subsequent time steps decrease one by one.
As an optional implementation, the cumulative return is computed with formula (1):
R_t = Σ_{i=t}^{n} γ^(i−t) · r_i    (1)
where R_t is the cumulative return of time step t, n is the number of time steps in one iteration, r_i is the actual reward of time step i, and γ^(i−t) is the discount coefficient of time step i.
As an optional implementation, the value range of γ is (0, 1]. It can be understood that, in formula (1), the discount coefficient of the current time step is 1, and the discount coefficient of each successive subsequent time step decreases.
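Formula (1) can be transcribed directly; the helper below uses a 0-indexed reward list, with the list length playing the role of n, and its reward values are invented for illustration:

```python
# Cumulative discounted return at time step t:
#   R_t = sum over i = t .. n of gamma**(i - t) * r_i
# With gamma in (0, 1], the weight is 1 at i = t and shrinks for later steps.

def cumulative_return(rewards, t, gamma):
    """Cumulative return of time step t over a 0-indexed list of rewards."""
    return sum(gamma ** (i - t) * rewards[i] for i in range(t, len(rewards)))

r = [0.0, 0.0, 10.0]    # e.g. a reward only once the layout is complete
```

With γ = 1 every future reward counts fully; smaller γ makes the current layout action's value depend less on distant future rewards.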
In the embodiments of the present disclosure, the layout state data after the layout action of the current time step is executed can be determined from the layout state data and the layout action of the current time step, yielding the layout of the logical cores. With the topology of the physical cores determined, once the layout of the logical cores is determined, the running performance of that logical core layout can be evaluated. As an optional implementation, the actual reward of the current time step represents the running performance of the logical core layout.
Accordingly, in some embodiments, referring to FIG. 8, the method further includes: in step S300, determining the actual reward of the current time step according to the layout state data of the current time step and the layout action of the current time step.
When there are logical cores not yet laid out onto physical cores, the complete logical core layout is unavailable, so the running performance of the logical core layout cannot be evaluated.
相应地,在一些实施例中,参照图9,步骤S300包括:Correspondingly, in some embodiments, referring to FIG. 9 , step S300 includes:
In step S301, when there is a logical core not yet laid out onto a physical core at the current time step, a predetermined reward value is determined as the actual reward of the layout state of the current time step.
In step S302, when there is no logical core left unplaced onto the physical cores at the current time step, the actual reward of the layout state of the current time step is determined according to the running performance of the layout state of the current time step.
As an optional implementation, the predetermined reward value in step S301 is 0.
在本公开实施例中,如何对逻辑核布局的运行性能进行评估不做特殊限定。例如,可以通过硬件模型,根据逻辑核布局进行模拟,从而评估运行性能。In the embodiment of the present disclosure, how to evaluate the running performance of the logical core layout is not particularly limited. For example, a hardware model can be used to simulate the layout of the logic cores to evaluate operational performance.
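The gain rule of steps S301 and S302 can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `state` fields and the `simulate_performance` hardware-model scorer are assumed names introduced only for this example.

```python
def actual_gain(state, simulate_performance, predetermined_gain=0.0):
    """Actual gain (reward) of the current time step.

    state.unplaced       -- logical cores not yet mapped to physical cores
    simulate_performance -- assumed hardware-model simulator that scores a
                            complete layout (higher means better performance)
    """
    if state.unplaced:
        # Incomplete layout: running performance cannot be evaluated,
        # so return the predetermined gain value (step S301, e.g. 0).
        return predetermined_gain
    # Complete layout: gain is derived from running performance (step S302).
    return simulate_performance(state.layout)
```

With this shape, the reinforcement-learning loop receives a neutral reward until the last logical core is placed, and a performance-based reward once the layout is complete.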
The embodiments of the present disclosure place no particular limitation on how to determine, in step S130, whether the learning termination condition is satisfied. As an optional implementation, referring to FIG. 10, step S130 includes:
In step S131, determining whether there is a logical core that has not been placed onto a physical core at the current time step.
In step S132, if there is no logical core that has not been placed onto a physical core at the current time step, determining whether an iteration termination condition is satisfied.
In step S133, if the iteration termination condition is satisfied, determining that the learning termination condition is satisfied.
The embodiments of the present disclosure place no particular limitation on how step S132 determines whether the iteration termination condition is satisfied. For example, the iteration termination condition being satisfied includes at least one of the following: the number of iterations at the current time step reaches a predetermined number of iterations; the overall gain of the current time step reaches a predetermined gain value; the parameters of the first neural network and the parameters of the second neural network have both converged.
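The three example criteria for step S132 can be combined as a single check, any one of which ends the iteration loop. The function name and thresholds below are illustrative placeholders, not part of the disclosure.

```python
def iteration_terminated(iteration, max_iterations=None,
                         overall_gain=None, target_gain=None,
                         params_converged=False):
    """Return True if any of the example iteration-termination
    conditions of step S132 holds: iteration budget exhausted,
    overall gain reaching a predetermined value, or both networks'
    parameters having converged."""
    if max_iterations is not None and iteration >= max_iterations:
        return True
    if (target_gain is not None and overall_gain is not None
            and overall_gain >= target_gain):
        return True
    return params_converged
```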
In the embodiments of the present disclosure, after each iteration is completed, that is, after all logical cores have been placed onto physical cores, it is determined whether the stored optimal layout needs to be updated.
Correspondingly, in some embodiments, referring to FIG. 11, the method further includes:
In step S401, generating the layout state data of the next time step according to the layout state data of the current time step and the layout action of the current time step.
In step S402, if there is no logical core that has not been placed onto a physical core at the current time step, determining whether the running performance of the layout represented by the layout state data of the next time step is better than the running performance of the optimal layout of the current time step.
In step S403, if the running performance of the layout represented by the layout state data of the next time step is better than the running performance of the optimal layout of the current time step, determining the layout represented by the layout state data of the next time step as the optimal layout of the next time step; if the running performance of the layout represented by the layout state data of the next time step is worse than the running performance of the optimal layout of the current time step, taking the optimal layout of the current time step as the optimal layout of the next time step.
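The comparison of steps S402 and S403 amounts to keeping the better of the stored optimum and a newly completed layout. The sketch below assumes a hypothetical `performance` scorer where a higher value means a better layout; it is not the disclosed implementation.

```python
def update_best_layout(best, candidate, performance):
    """Keep the better of the stored optimal layout and a newly
    completed candidate layout (steps S402/S403)."""
    if best is None or performance(candidate) > performance(best):
        return candidate   # candidate becomes the next step's optimum
    return best            # otherwise carry the stored optimum forward
```

Calling this after every completed iteration means the stored optimum is monotonically non-decreasing in performance, which is what makes it a valid target layout once iteration terminates.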
In the embodiments of the present disclosure, after each iteration is completed, that is, after all logical cores have been placed onto physical cores, the next iteration starts from the initial layout state.
Correspondingly, in some embodiments, referring to FIG. 11, if the iteration termination condition is not satisfied and there is no logical core that has not been placed onto a physical core at the current time step, the method further includes:
In step S404, resetting the layout state data of the next time step to the initial layout state data.
In some embodiments, when the iteration termination condition is satisfied, the stored optimal layout is taken as the target layout.
As an optional implementation, after each iteration is completed, that is, after all logical cores have been placed onto physical cores, the corresponding logical core layout is stored, and the target layout is determined from the stored logical core layouts.
In some embodiments, the target layout may be determined as the best-performing layout among the layouts represented by the layout state data of those time steps, prior to the current time step, at which every logical core had been placed onto a physical core.
In some embodiments, referring to FIG. 12, the method further includes: in step S500, determining, according to a predetermined algorithm, the identification information of the plurality of logical cores having the determined topology.
The embodiments of the present disclosure place no particular limitation on the predetermined algorithm in step S500. For example, the identification information of the plurality of logical cores may be determined according to the flow directions of signals among the plurality of logical cores.
As an optional implementation, the identification information of a logical core is the serial number of the logical core.
In some embodiments, step S500 is conducive to obtaining a better logical core layout.
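One way to realize signal-flow-based numbering of the kind mentioned for step S500 is a topological ordering of the logical-core graph, so that a core tends to be numbered after the cores feeding signals into it. The graph encoding and function below are illustrative assumptions, not the disclosed predetermined algorithm.

```python
from collections import deque

def number_by_signal_flow(edges, num_cores):
    """Assign serial numbers to logical cores following signal flow,
    using Kahn's topological sort. `edges` is a list of (src, dst)
    pairs meaning a signal flows from core src to core dst."""
    indegree = [0] * num_cores
    successors = [[] for _ in range(num_cores)]
    for src, dst in edges:
        indegree[dst] += 1
        successors[src].append(dst)
    # Start from cores that receive no signals from other cores.
    queue = deque(i for i in range(num_cores) if indegree[i] == 0)
    order = {}
    while queue:
        core = queue.popleft()
        order[core] = len(order)   # serial number in visit order
        for nxt in successors[core]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    return order
```

For acyclic signal graphs this yields numbers consistent with the flow direction; a real core graph with feedback connections would need an additional tie-breaking rule, which is outside this sketch.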
In some embodiments, the number of logical cores placed onto physical cores in each layout action is a fixed value.
That is, the number of logical cores placed onto physical cores in each layout action may be fixed rather than variable.
In a second aspect, referring to FIG. 13, an embodiment of the present disclosure provides a method for training a layout model, where the layout model is used to place a plurality of logical cores having a determined topology onto a plurality of physical cores having a determined topology, and the training method includes:
In step S601, determining a plurality of samples, each sample including information of a plurality of logical cores having a determined topology.
Exemplarily, the information of the plurality of logical cores having a determined topology may include the identification information of the plurality of logical cores.
In step S602, performing, on the samples, any one of the above methods for determining a logical core layout.
In step S603, taking the obtained first neural network as the layout model.
Any one of the above methods for determining a logical core layout is performed on a plurality of logical cores, so as to determine the logical core layout of the plurality of logical cores and to determine the layout model.
In a third aspect, referring to FIG. 14, an embodiment of the present disclosure provides a method for determining a logical core layout, for placing a plurality of logical cores having a determined topology onto a plurality of physical cores having a determined topology, the method including:
In step S700, inputting the identification information of the plurality of logical cores having the determined topology into a layout model, to obtain a target layout of the plurality of logical cores having the determined topology; the layout model is obtained by the method for training a layout model according to the second aspect of the embodiments of the present disclosure.
In the embodiments of the present disclosure, when the logical core layout is determined through step S700, the parameters of the first neural network in the layout model remain unchanged.
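The inference use of step S700 can be sketched as a rollout of the trained layout model in which the policy is only queried, never updated, so the first neural network's parameters stay fixed. The `policy` and `step` callables and the state dictionary below are illustrative stand-ins, not the disclosed interfaces.

```python
def determine_layout(policy, initial_state, step):
    """Roll out a trained layout model (step S700): repeatedly ask the
    policy for a layout action and apply it until every logical core
    has been placed. No gradient update is performed anywhere."""
    state = initial_state
    while state["unplaced"]:
        action = policy(state)        # inference only, parameters unchanged
        state = step(state, action)   # place the chosen logical core(s)
    return state["layout"]
```

A toy environment suffices to exercise the loop: a policy that always picks the first unplaced core, and a step function that assigns it the next free physical core.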
In a fourth aspect, referring to FIG. 15, an embodiment of the present disclosure provides an electronic device, which includes: one or more processors 101; a memory 102 on which one or more programs are stored, which, when executed by the one or more processors, cause the one or more processors to implement at least one of the following methods: the method for determining a logical core layout according to the first aspect of the embodiments of the present disclosure; the method for training a layout model according to the second aspect of the embodiments of the present disclosure; the method for determining a logical core layout according to the third aspect of the embodiments of the present disclosure; and one or more I/O interfaces 103, connected between the processors and the memory and configured to implement information interaction between the processors and the memory.
The processor 101 is a device with data processing capability, including but not limited to a central processing unit (CPU); the memory 102 is a device with data storage capability, including but not limited to random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), and flash memory (FLASH); the I/O interface (read/write interface) 103 is connected between the processor 101 and the memory 102 and can implement information interaction between the processor 101 and the memory 102, including but not limited to a data bus (Bus) and the like.
In some embodiments, the processor 101, the memory 102, and the I/O interface 103 are interconnected via a bus 104, and are in turn connected to other components of a computing device.
In a fifth aspect, referring to FIG. 16, an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored, and when the program is executed by a processor, at least one of the following methods is implemented: the method for determining a logical core layout according to the first aspect of the embodiments of the present disclosure; the method for training a layout model according to the second aspect of the embodiments of the present disclosure; the method for determining a logical core layout according to the third aspect of the embodiments of the present disclosure.
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods disclosed above, and the functional modules/units in the systems and apparatuses, may be implemented as software, firmware, hardware, or suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be executed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for the storage of information, such as computer-readable instructions, data structures, program modules, or other data.
Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purposes of limitation. In some instances, as would be apparent to those skilled in the art, features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with features, characteristics, and/or elements described in connection with other embodiments, unless expressly indicated otherwise. Accordingly, those skilled in the art will appreciate that various changes in form and details may be made without departing from the scope of the present disclosure as set forth in the appended claims.

Claims (23)

  1. A method for determining a logical core layout, for placing a plurality of logical cores having a determined topology onto a plurality of physical cores having a determined topology, the method comprising:
    updating parameters of a first neural network based on reinforcement learning, so as to obtain a target layout according to the first neural network, the target layout comprising a mapping relationship between the logical cores and the physical cores; wherein the first neural network is configured to generate a layout action according to layout state data of a current time step, the layout state data representing the topology structure of the plurality of physical cores having the determined topology and the mapping relationship between the logical cores that have been placed and the physical cores; and the layout action represents a mapping relationship between at least one logical core to be placed and a physical core.
  2. The method according to claim 1, wherein, before the step of updating the parameters of the first neural network based on reinforcement learning, the method further comprises:
    determining a data representation structure, the data representation structure representing the mapping relationship between the logical cores and the physical cores and the topology structure of the plurality of physical cores, wherein the layout state data of the current time step conforms to the data representation structure.
  3. The method according to claim 1, wherein the step of updating the parameters of the first neural network based on reinforcement learning comprises:
    generating, by the first neural network, the layout action of the current time step according to the layout state data of the current time step;
    updating the parameters of the first neural network according to a gain parameter of the current time step, so as to increase an expected value of the gain parameter of the current time step, the gain parameter comprising at least an actual gain of the layout state of the current time step; and
    determining whether a learning termination condition is satisfied; if so, ending the learning; otherwise, returning to the step of generating, by the first neural network, the layout action of the current time step.
  4. The method according to claim 3, wherein updating the parameters of the first neural network according to the gain parameter of the current time step, so as to increase the expected value of the gain parameter of the current time step, comprises:
    determining, by a second neural network, an overall gain of the current time step according to the layout state data of the current time step and the layout action of the current time step;
    updating parameters of the second neural network according to the overall gain of the current time step, so that the overall gain of the current time step approaches an expected value of a cumulative gain of the current time step, the cumulative gain of the current time step being determined by the actual gain of the current time step and the actual gains of all subsequent time steps; and
    updating the parameters of the first neural network according to the overall gain of the current time step, so as to increase an expected value of the overall gain of the current time step.
  5. The method according to claim 4, wherein the method further comprises:
    determining the actual gain of the current time step according to the layout state data of the current time step and the layout action of the current time step.
  6. The method according to claim 5, wherein the step of determining the actual gain of the current time step according to the layout state data of the current time step and the layout action of the current time step comprises:
    if there are logical cores that have not been placed onto physical cores at the current time step, determining a predetermined gain value as the actual gain of the current time step; and
    if there is no logical core that has not been placed onto a physical core at the current time step, determining the actual gain of the current time step according to the running performance of the layout state of the current time step.
  7. The method according to claim 6, wherein the running performance comprises at least one of latency, throughput, and power consumption.
  8. The method according to claim 4, wherein the cumulative gain of the current time step is equal to the sum of the actual gain of the current time step and the actual gains of all subsequent time steps, each weighted by a discount coefficient of the corresponding subsequent time step, the discount coefficient representing the magnitude of the influence of the layout action of that subsequent time step on the overall gain of the current time step.
  9. The method according to claim 8, wherein the discount coefficients of the subsequent time steps decrease one by one.
  10. The method according to any one of claims 4 to 9, wherein the step of determining whether the learning termination condition is satisfied comprises:
    determining whether there is a logical core that has not been placed onto a physical core at the current time step;
    if there is no logical core that has not been placed onto a physical core at the current time step, determining whether an iteration termination condition is satisfied; and
    if the iteration termination condition is satisfied, determining that the learning termination condition is satisfied.
  11. The method according to claim 10, wherein the iteration termination condition being satisfied comprises at least one of the following: the number of iterations of the current time step reaches a predetermined number of iterations; the overall gain of the current time step reaches a predetermined gain value; the parameters of the first neural network and the parameters of the second neural network have both converged.
  12. The method according to claim 10, wherein the method further comprises:
    generating layout state data of a next time step according to the layout state data of the current time step and the layout action of the current time step;
    if there is no logical core that has not been placed onto a physical core at the current time step, determining whether the running performance of the layout represented by the layout state data of the next time step is better than the running performance of the optimal layout of the current time step; and
    if the running performance of the layout represented by the layout state data of the next time step is better than the running performance of the optimal layout of the current time step, determining the layout represented by the layout state data of the next time step as the optimal layout of the next time step; and if the running performance of the layout represented by the layout state data of the next time step is worse than the running performance of the optimal layout of the current time step, taking the optimal layout of the current time step as the optimal layout of the next time step.
  13. The method according to claim 12, wherein, if the iteration termination condition is not satisfied and there is no logical core that has not been placed onto a physical core at the current time step, the method further comprises:
    resetting the layout state data of the next time step to initial layout state data.
  14. The method according to claim 12, wherein, when the iteration termination condition is satisfied, the step of obtaining the target layout comprises:
    taking the optimal layout of the next time step as the target layout.
  15. The method according to claim 10, wherein, when the iteration termination condition is satisfied, the step of obtaining the target layout comprises:
    determining, among the layouts represented by the layout state data of at least one time step, prior to the current time step, at which no logical core remained unplaced onto a physical core, the layout with the best running performance; and
    determining the layout with the best running performance as the target layout.
  16. The method according to any one of claims 1 to 9, wherein the first neural network is any one of a convolutional neural network, a recurrent neural network, and a graph neural network.
  17. The method according to any one of claims 1 to 9, wherein the method further comprises:
    determining, according to a predetermined algorithm, identification information of the plurality of logical cores having the determined topology.
  18. The method according to claim 17, wherein the step of determining, according to a predetermined algorithm, the identification information of the plurality of logical cores having the determined topology comprises:
    determining the identification information of the plurality of logical cores according to the signal flow directions among the plurality of logical cores.
  19. The method according to any one of claims 1 to 9, wherein the number of logical cores placed onto physical cores in each layout action is a fixed value.
  20. A method for training a layout model, the training method comprising:
    determining a plurality of samples, each sample comprising information of a plurality of logical cores having a determined topology;
    performing, on the samples, the method for determining a logical core layout according to any one of claims 1 to 19; and
    taking the obtained first neural network as the layout model.
  21. A method for determining a logical core layout, for placing a plurality of logical cores having a determined topology onto a plurality of physical cores having a determined topology, the method comprising:
    inputting identification information of the plurality of logical cores having the determined topology into a layout model, to obtain a target layout of the plurality of logical cores having the determined topology, the layout model being obtained by the method for training a layout model according to claim 20.
  22. An electronic device, comprising:
    one or more processors; and
    a storage device having one or more programs stored thereon which, when executed by the one or more processors, cause the one or more processors to implement at least one of the following methods:
    the method for determining a logical core layout according to any one of claims 1 to 19;
    the method for training a layout model according to claim 20; and
    the method for determining a logical core layout according to claim 21.
  23. A computer-readable medium having a computer program stored thereon which, when executed by a processor, implements at least one of the following methods:
    the method for determining a logical core layout according to any one of claims 1 to 19;
    the method for training a layout model according to claim 20; and
    the method for determining a logical core layout according to claim 21.
PCT/CN2021/124311 2020-10-22 2021-10-18 Method for determining logical core arrangement, model training method, electronic device and medium WO2022083527A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011141034.4 2020-10-22
CN202011141034.4A CN112257848B (en) 2020-10-22 2020-10-22 Method for determining logic core layout, model training method, electronic device and medium

Publications (1)

Publication Number Publication Date
WO2022083527A1 true WO2022083527A1 (en) 2022-04-28

Family

ID=74263993

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/124311 WO2022083527A1 (en) 2020-10-22 2021-10-18 Method for determining logical core arrangement, model training method, electronic device and medium

Country Status (2)

Country Link
CN (1) CN112257848B (en)
WO (1) WO2022083527A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116962438A (en) * 2023-09-21 2023-10-27 浪潮电子信息产业股份有限公司 Gradient data synchronization method, system, electronic equipment and readable storage medium

Families Citing this family (2)

Publication number Priority date Publication date Assignee Title
CN112257848B (en) * 2020-10-22 2024-04-30 北京灵汐科技有限公司 Method for determining logic core layout, model training method, electronic device and medium
CN118210599A (en) * 2022-12-16 2024-06-18 华为技术有限公司 Chip resource scheduling method and related device

Citations (3)

Publication number Priority date Publication date Assignee Title
US20190095796A1 (en) * 2017-09-22 2019-03-28 Intel Corporation Methods and arrangements to determine physical resource assignments
CN110100255A (en) * 2017-01-06 2019-08-06 国际商业机器公司 Region is effective, reconfigurable, energy saving, the effective neural network substrate of speed
CN112257848A (en) * 2020-10-22 2021-01-22 北京灵汐科技有限公司 Method for determining logic core layout, model training method, electronic device, and medium

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
US9754221B1 (en) * 2017-03-09 2017-09-05 Alphaics Corporation Processor for implementing reinforcement learning operations
CN110737758B (en) * 2018-07-03 2022-07-05 百度在线网络技术(北京)有限公司 Method and apparatus for generating a model
US20200272905A1 (en) * 2019-02-26 2020-08-27 GE Precision Healthcare LLC Artificial neural network compression via iterative hybrid reinforcement learning approach
US20200320428A1 (en) * 2019-04-08 2020-10-08 International Business Machines Corporation Fairness improvement through reinforcement learning
CN111143148B (en) * 2019-12-30 2023-09-12 北京奇艺世纪科技有限公司 Model parameter determining method, device and storage medium
CN111798114B (en) * 2020-06-28 2024-07-02 纽扣互联(北京)科技有限公司 Model training and order processing method, device, equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116962438A (en) * 2023-09-21 2023-10-27 浪潮电子信息产业股份有限公司 Gradient data synchronization method, system, electronic equipment and readable storage medium
CN116962438B (en) * 2023-09-21 2024-01-23 浪潮电子信息产业股份有限公司 Gradient data synchronization method, system, electronic equipment and readable storage medium

Also Published As

Publication number Publication date
CN112257848A (en) 2021-01-22
CN112257848B (en) 2024-04-30

Similar Documents

Publication Publication Date Title
WO2022083527A1 (en) Method for determining logical core arrangement, model training method, electronic device and medium
US12086516B2 (en) Generating integrated circuit floorplans using neural networks
Hao et al. Adaptive infill sampling criterion for multi-fidelity gradient-enhanced kriging model
CN112513886B (en) Information processing method, information processing apparatus, and information processing program
CN111819578A (en) Asynchronous training for optimization of neural networks using distributed parameter servers with rush updates
CN115066694A (en) Computation graph optimization
US20210350230A1 (en) Data dividing method and processor for convolution operation
WO2024198502A1 (en) Method and apparatus for training optical neural network having robustness with respect to process error, and device
CN114218887A (en) Chip configuration design method, device and medium based on deep learning
CN116932174B (en) Dynamic resource scheduling method, device, terminal and medium for EDA simulation task
CN117371496A (en) Parameter optimization method, device, equipment and storage medium
TWI758223B (en) Computing method with dynamic minibatch sizes and computing system and computer-readable storage media for performing the same
KR20220032861A (en) Neural architecture search method and attaratus considering performance in hardware
CN109492759B (en) Neural network model prediction method, device and terminal
CN115269177A (en) Dynamic calibration optimization method for engine model parameters
US7346868B2 (en) Method and system for evaluating design costs of an integrated circuit
CN115345100A (en) Network-on-chip simulation model, dynamic path planning method and device, and multi-core chip
CN112116081A (en) Deep learning network optimization method and device
Wang et al. Automatically setting parameter-exchanging interval for deep learning
WO2023004593A1 (en) Method for simulating circuit, medium, program product, and electronic device
US20240028910A1 (en) Modeling method of neural network for simulation in semiconductor design process, simulation method in semiconductor design process using the same, manufacturing method of semiconductor device using the same, and semiconductor design system performing the same
CN115204086A (en) Network-on-chip simulation model, dynamic path planning method and device, and multi-core chip
CN118567816A (en) Automatic production line calculation task scheduling method, equipment, medium and product
KR20230087890A (en) System and method for optimizing integrated circuit layout based on hear source distribution image
KR20230087887A (en) Reinforcement learning device and method for automated integrated circuit layout design model

Legal Events

Date Code Title Description
121 Ep: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 21881944

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: PCT application non-entry in European phase

Ref document number: 21881944

Country of ref document: EP

Kind code of ref document: A1