WO2022083527A1 - Method for determining logical core arrangement, model training method, electronic device and medium - Google Patents

Method for determining logical core arrangement, model training method, electronic device and medium Download PDF

Info

Publication number
WO2022083527A1
WO2022083527A1 (application PCT/CN2021/124311, CN2021124311W)
Authority
WO
WIPO (PCT)
Prior art keywords
layout
time step
current time
cores
neural network
Prior art date
Application number
PCT/CN2021/124311
Other languages
French (fr)
Chinese (zh)
Inventor
邓磊 (Deng Lei)
李涵 (Li Han)
施路平 (Shi Luping)
祝夭龙 (Zhu Yaolong)
Original Assignee
北京灵汐科技有限公司 (Beijing Lynxi Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京灵汐科技有限公司 (Beijing Lynxi Technology Co., Ltd.)
Publication of WO2022083527A1 publication Critical patent/WO2022083527A1/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 15/00 - Digital computers in general; Data processing equipment in general
    • G06F 15/76 - Architectures of general purpose stored program computers
    • G06F 15/78 - Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F 15/7803 - System on board, i.e. computer system on one or more PCB, e.g. motherboards, daughterboards or blades
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/06 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods

Definitions

  • The embodiments of the present disclosure relate to the field of computer technologies, and in particular to methods for determining the layout of a logical core, a training method for a layout model, an electronic device, and a computer-readable medium.
  • Many-core architecture is a parallel processing architecture widely used to execute neural network models. As shown in Figure 1, in a many-core architecture, each physical core (CORE) can complete a certain computing function; a number of physical cores are connected through a certain topology to form a chip (CHIP), chips are connected through a certain topology to form a chip array board, and so on, allowing expansion to larger-scale systems.
  • The neural network model is deployed to the many-core architecture through the following steps: (1) split and map the neural network model into a logical core computation graph, which is composed of multiple logical cores connected by a certain topology; (2) lay out the logical cores onto physical cores.
  • In some related technologies, the effect of deploying the neural network model to the many-core architecture is not satisfactory.
  • Embodiments of the present disclosure provide a method for determining the layout of a logical core, a method for training a layout model, a method for determining the layout of a logical core, an electronic device, and a computer-readable medium.
  • An embodiment of the present disclosure provides a method for determining a logical core layout, for laying out multiple logical cores with a determined topology onto multiple physical cores with a determined topology, the method comprising: updating the parameters of a first neural network based on a reinforcement learning method to obtain a target layout according to the first neural network, where the target layout includes the mapping relationship between the logical cores and the physical cores. The first neural network is configured to generate a layout action according to the layout state data of the current time step; the layout state data represents the topology structure of the plurality of physical cores with the determined topology and the mapping relationships between the logical cores and physical cores that have already been laid out, and the layout action represents the mapping relationship between at least one logical core to be laid out and a physical core.
  • In some embodiments, before the step of updating the parameters of the first neural network based on the reinforcement learning method, the method further includes: determining a data representation structure, where the data representation structure represents the mapping relationship between logical cores and physical cores and the topology structure of the multiple physical cores, and the layout state data of the current time step conforms to the data representation structure.
  • In some embodiments, the step of updating the parameters of the first neural network based on the reinforcement learning method includes: generating, by the first neural network, the layout action of the current time step according to the layout state data of the current time step; updating the parameters of the first neural network according to the gain parameter of the current time step to increase the expected value of the gain parameter of the current time step, where the gain parameter includes at least the actual gain of the layout state of the current time step; and judging whether the learning termination condition is satisfied: if so, the learning ends; otherwise, the process returns to the step of generating the layout action of the current time step through the first neural network.
  • In some embodiments, the step of updating the parameters of the first neural network according to the gain parameter of the current time step includes: determining, through a second neural network, the overall gain of the current time step according to the layout state data and layout action of the current time step; updating the parameters of the second neural network according to the overall gain of the current time step, so that the overall gain of the current time step approaches the expected value of the cumulative gain of the current time step, where the cumulative gain of the current time step is determined by the actual gain of the current time step and the actual gains of all subsequent time steps; and updating the parameters of the first neural network according to the overall gain of the current time step to increase the expected value of the overall gain of the current time step.
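  • The update scheme above follows an actor-critic pattern: the first neural network acts as the policy and the second as the value estimator. The following is only a minimal numpy sketch of that pattern, with hypothetical linear stand-ins for both networks and arbitrary sizes; the patent does not fix the architectures.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_actions = 8, 4                    # hypothetical sizes
W_pi = rng.normal(size=(n_actions, n_features)) * 0.1   # "first" network (policy)
w_v = rng.normal(size=n_features) * 0.1                 # "second" network (value)

def policy(state):
    """Action probabilities from the policy network (softmax over logits)."""
    logits = W_pi @ state
    e = np.exp(logits - logits.max())
    return e / e.sum()

def value(state):
    """Critic's estimate of the overall gain for this layout state."""
    return float(w_v @ state)

def update(state, action, cumulative_gain, lr=0.01):
    """One update: move the critic toward the cumulative gain, then push
    the policy toward actions the critic rates above its estimate."""
    global W_pi, w_v
    advantage = cumulative_gain - value(state)
    w_v = w_v + lr * advantage * state           # critic regression step
    probs = policy(state)
    grad_log = -np.outer(probs, state)           # grad of log softmax prob
    grad_log[action] += state
    W_pi = W_pi + lr * advantage * grad_log      # policy-gradient step
    return advantage

state = rng.normal(size=n_features)
before = value(state)
update(state, action=2, cumulative_gain=1.0)
after = value(state)
# the critic's estimate moves toward the observed cumulative gain
assert abs(1.0 - after) < abs(1.0 - before)
```

  In a real implementation both networks would be the convolutional, recurrent, or graph networks the disclosure mentions, and the state would be the matrix-form layout state data.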
  • In some embodiments, the method further comprises: determining the actual gain of the current time step according to the layout state data of the current time step and the layout action of the current time step.
  • In some embodiments, the step of determining the actual gain of the current time step according to the layout state data and layout action of the current time step includes: in the case that there is still a logical core not laid out on a physical core at the current time step, determining a predetermined gain value as the actual gain of the current time step; and in the case that there is no logical core that is not laid out on a physical core at the current time step, determining the actual gain of the current time step according to the running performance of the layout state of the current time step.
  • The operational performance includes at least one of latency, throughput, and power consumption.
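  • The two-branch gain rule above can be sketched as follows. This is only an illustration: `measure_latency` and the zero predetermined value are hypothetical placeholders, since the disclosure fixes neither.

```python
# Per-step gain: a fixed predetermined value while unplaced logical cores
# remain, and a performance-based gain once every logical core is placed.

def step_gain(unplaced_cores, layout, predetermined=0.0,
              measure_latency=lambda layout: 1.0):
    if unplaced_cores:                 # placement still in progress
        return predetermined
    # All logical cores placed: gain from running performance.
    # Lower latency gives higher gain; throughput or power
    # consumption could be used instead or in combination.
    return -measure_latency(layout)

# mid-episode: core "lc3" still unplaced, so the predetermined value is used
assert step_gain(["lc3"], {}) == 0.0
# episode end: gain derived from a (hypothetical) latency measurement
assert step_gain([], {"lc0": (0, 0)}, measure_latency=lambda l: 2.5) == -2.5
```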
  • In some embodiments, the cumulative gain of the current time step is equal to the sum of the actual gain of the current time step and the actual gain of each subsequent time step weighted by that step's discount coefficient, where the discount coefficient represents the magnitude of the impact of the subsequent time step's layout action on the overall gain of the current time step.
  • The discount coefficients of successive subsequent time steps decrease step by step.
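  • A common concrete choice for such step-wise decreasing discount coefficients is powers of a factor gamma in (0, 1); the disclosure does not mandate this form, so the following is only an illustrative sketch.

```python
# Cumulative gain of the current time step: the current actual gain plus
# each subsequent actual gain weighted by a decreasing discount coefficient.

def cumulative_gain(gains, gamma=0.9):
    """gains[0] is the current step's actual gain; gamma**k is the
    (decreasing) discount coefficient of the k-th subsequent step."""
    return sum((gamma ** k) * g for k, g in enumerate(gains))

# current gain 1.0 and two subsequent gains of 1.0, discounted by 0.5:
# 1.0 + 0.5 + 0.25 = 1.75
assert abs(cumulative_gain([1.0, 1.0, 1.0], gamma=0.5) - 1.75) < 1e-9
```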
  • In some embodiments, the step of judging whether the learning termination condition is satisfied includes: judging whether there is any logical core not yet allocated to a physical core at the current time step; in the case that there is no logical core not allocated to a physical core at the current time step, judging whether the iteration termination condition is satisfied; and in the case that the iteration termination condition is satisfied, determining that the learning termination condition is satisfied.
  • In some embodiments, the iteration termination condition includes at least one of the following: the number of iterations of the current time step reaches a predetermined number of iterations; the overall gain of the current time step reaches a predetermined gain value; the parameters of the first neural network and the parameters of the second neural network both converge.
  • In some embodiments, the method further includes: generating the layout state data of the next time step according to the layout state data and layout action of the current time step; in the case that the running performance of the layout represented by the layout state data of the next time step is superior to the running performance of the optimal layout of the current time step, determining the layout represented by the layout state data of the next time step as the optimal layout of the next time step; and in the case that the running performance of the layout represented by the layout state data of the next time step is inferior to the running performance of the optimal layout of the current time step, taking the optimal layout of the current time step as the optimal layout of the next time step.
  • In some embodiments, the method further includes: resetting the layout state data of the next time step to the initial layout state data.
  • the step of obtaining the target layout includes: taking the optimal layout in the next time step as the target layout.
  • In some embodiments, the step of obtaining the target layout includes: among the layouts represented by the layout state data of the at least one time step, before the current time step, at which there was no logical core not placed on a physical core, determining the layout with the best running performance; and determining the layout with the best running performance as the target layout.
  • the first neural network is any one of a convolutional neural network, a recurrent neural network, and a graph neural network.
  • the method further includes: determining, according to a predetermined algorithm, identification information of a plurality of logical cores having a determined topology.
  • the step of determining the identification information of the plurality of logical cores having the determined topology includes: determining the identification information of the plurality of logical cores according to signal flow directions in the plurality of logical cores.
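  • One plausible reading of numbering the logical cores "according to signal flow directions" is a topological ordering of the logical-core computation graph, so that upstream cores receive smaller identifiers. The disclosure does not prescribe the algorithm; the sketch below uses Kahn's algorithm as one possibility, with a hypothetical edge-list input.

```python
from collections import deque

def assign_ids(edges, n_cores):
    """Assign IDs to logical cores in topological order of signal flow.
    edges: list of (src, dst) signal-flow edges between logical cores."""
    indeg = [0] * n_cores
    succ = [[] for _ in range(n_cores)]
    for s, d in edges:
        succ[s].append(d)
        indeg[d] += 1
    queue = deque(i for i in range(n_cores) if indeg[i] == 0)
    ids, next_id = {}, 0
    while queue:
        c = queue.popleft()
        ids[c] = next_id            # upstream cores get smaller IDs
        next_id += 1
        for d in succ[c]:
            indeg[d] -= 1
            if indeg[d] == 0:
                queue.append(d)
    return ids

# a small diamond-shaped computation graph: 0 feeds 1 and 2, which feed 3
ids = assign_ids([(0, 1), (0, 2), (1, 3), (2, 3)], 4)
assert ids[0] == 0 and ids[3] == 3   # source first, sink last
```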
  • the number of logical cores placed to physical cores in each placement action is a fixed value.
  • An embodiment of the present disclosure provides a training method for a layout model, where the layout model is used to lay out a plurality of logical cores with a determined topology onto a plurality of physical cores with a determined topology, the training method including: determining a plurality of samples, each of which includes information of a plurality of logical cores having a determined topology; performing any one of the above-mentioned methods for determining the layout of logical cores on the samples; and using the obtained first neural network as the layout model.
  • An embodiment of the present disclosure provides a method for determining a logical core layout, for laying out a plurality of logical cores with a determined topology onto a plurality of physical cores with a determined topology, the method comprising: inputting the identification information of the multiple logical cores into a layout model to obtain the target layout of the multiple logical cores with the determined topology, where the layout model is obtained according to the training method of the layout model described in the second aspect of the embodiments of the present disclosure.
  • Embodiments of the present disclosure provide an electronic device, which includes: one or more processors; and a storage device on which one or more programs are stored, the one or more programs, when executed by the one or more processors, causing the one or more processors to implement at least one of the following methods: the method for determining a logical core layout according to the first aspect of the embodiments of the present disclosure; the training method of the layout model according to the second aspect of the embodiments of the present disclosure; and the method for determining the layout of the logical core according to the third aspect of the embodiments of the present disclosure.
  • An embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored; when the program is executed by a processor, it implements at least one of the following methods: the method for determining the layout of the logical core according to the first aspect of the embodiments of the present disclosure; the training method for the layout model according to the second aspect of the embodiments of the present disclosure; and the method for determining the layout of the logical core according to the third aspect of the embodiments of the present disclosure.
  • In the embodiments of the present disclosure, the first neural network processes layout state data that represents the topology structure of the physical cores and the mapping relationship between logical cores and physical cores, and generates layout actions; the parameters of the first neural network are updated according to the gain determined by the running performance, so that a logical core layout whose running performance meets preset requirements can finally be obtained.
  • Optimizing the logical core layout based on reinforcement learning effectively reduces the search space of the logical core layout, which is conducive to finding a better layout of the logical cores, and can effectively alleviate the on-chip and inter-chip data transmission delay and the non-uniformity of data transmission delay when a many-core system runs a neural network model, thereby effectively ensuring the running performance of the neural network model after layout.
  • Figure 1 is a schematic diagram of a many-core architecture.
  • FIG. 2 is a flowchart of some steps in a method for determining the layout of a logic core provided by an embodiment of the present disclosure.
  • FIG. 3 is a schematic flowchart of optimizing the layout of logic cores based on reinforcement learning in an embodiment of the present disclosure.
  • FIG. 4 is a flowchart of some steps in another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
  • FIG. 5 is a schematic diagram of a model for optimizing logic core layout based on reinforcement learning in an embodiment of the present disclosure.
  • FIG. 6 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
  • FIG. 7 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
  • FIG. 8 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
  • FIG. 9 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
  • FIG. 10 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
  • FIG. 11 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
  • FIG. 12 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
  • FIG. 13 is a flowchart of a training method for a layout model provided by an embodiment of the present disclosure.
  • FIG. 14 is a flowchart of a method for determining the layout of a logic core provided by an embodiment of the present disclosure.
  • FIG. 15 is a block diagram of the composition of an electronic device according to an embodiment of the present disclosure.
  • FIG. 16 is a block diagram of the composition of a computer-readable medium provided by an embodiment of the present disclosure.
  • The following provides a detailed description, with reference to the accompanying drawings, of the method for determining the layout of the logical core, the training method for the layout model, the electronic device, and the computer-readable medium provided by the present disclosure.
  • The inventors of the present disclosure have found that, when a neural network model is deployed into a many-core architecture, even if the logical core computation graphs obtained by splitting the neural network model are identical, different layouts onto physical cores lead to different inter-core delays; moreover, as the scale of a system composed of multiple chips or multiple chip array boards grows, data transmission delay increases, and uneven on-chip or inter-chip communication capabilities cause non-uniform data transmission delay.
  • Sequential layout: laying out logical cores onto physical cores in order of the logical cores' sequence numbers.
  • Heuristic search: random search according to certain rules.
  • The sequential layout scheme is not optimized for the above-mentioned data transmission delay and its non-uniformity, so the running performance of the neural network model after layout is poor; when the many-core system is large, the heuristic search scheme cannot obtain the optimal layout because the search space is huge. Therefore, in some related technologies, schemes for deploying a neural network model to a many-core architecture cannot solve the above problems of data transmission delay and its non-uniformity, and thus cannot ensure the running performance of the neural network model after layout.
  • An embodiment of the present disclosure provides a method for determining a logical core layout, for laying out multiple logical cores having a determined topology onto multiple physical cores having a determined topology, the method including:
  • step S100 the parameters of the first neural network are updated based on the reinforcement learning method to obtain the target layout according to the first neural network; the first neural network is configured to generate layout actions according to the layout state data of the current time step.
  • the target layout may include topology structures of multiple physical cores, and a mapping relationship between logical cores and physical cores.
  • FIG. 3 is a schematic flowchart of optimizing the layout of logical cores based on reinforcement learning (RL) in an embodiment of the present disclosure.
  • In reinforcement learning, the agent (Agent) generates a layout action at each time step; the environment (Environment) then evaluates the gain of the current time step, and the parameters of the agent are updated, thereby updating the agent's layout strategy so as to obtain a greater expected gain. Through multiple iterations, a logical core layout that meets the requirements, that is, the target layout, can finally be obtained.
  • “Available physical cores” represents idle physical cores, that is, physical cores on which no logical core has been laid out.
  • “Placed logic cores” represents non-idle physical cores, that is, physical cores on which the corresponding logical cores have been laid out.
  • In some embodiments, a first neural network is constructed as the agent; the first neural network can process layout state data and generate layout actions, where the layout state data represents the topology structure of the physical cores and the mapping relationships between the logical cores and physical cores that have been laid out, and the layout action represents the mapping relationship between at least one logical core to be laid out and a physical core.
  • the benefit of the current time step is determined according to the running performance of the neural network model that needs to be deployed to the many-core architecture in the layout state of the current time step.
  • the embodiment of the present disclosure does not specifically limit the specific type of the running performance.
  • the operational performance may be at least one of latency, throughput, power consumption.
  • the latency (Latency) shown in FIG. 3 is only an exemplary illustration.
  • the benefit is determined according to the operating performance, so that the corresponding operating performance of the target layout obtained by executing step S100 can meet the preset requirements.
  • When the target layout of the multiple logical cores with the determined topology is obtained through step S100, the process of adjusting the parameters of the first neural network is also completed, so that running the first neural network subsequently can be expected to yield better results, that is, a target layout with better actual running performance.
  • In some embodiments, step S100 may be used to process multiple logical cores with a determined topology in order to improve the parameters of the first neural network, which is equivalent to "training" the first neural network.
  • the corresponding target layout may actually be obtained, but the target layout may not be actually applied.
  • a plurality of logic cores with different topologies may be processed by means of step S100 to improve parameters in the first neural network and complete the training of the first neural network. Therefore, in the subsequent process, the trained first neural network can be directly used to lay out a plurality of logic cores with an arbitrarily determined topology, and the parameters of the first neural network do not need to be changed in this process.
  • In the embodiments of the present disclosure, the first neural network processes layout state data that represents the topology structure of the physical cores and the mapping relationship between logical cores and physical cores, and generates layout actions; the parameters of the first neural network are updated according to the gain determined by the running performance, so that a logical core layout whose running performance meets preset requirements can finally be obtained.
  • Optimizing the logical core layout based on reinforcement learning effectively reduces the search space of the logical core layout, which is conducive to finding a better layout of the logical cores, and can effectively alleviate the on-chip and inter-chip data transmission delay and the non-uniformity of data transmission delay when a many-core system runs a neural network model, thereby effectively ensuring the running performance of the neural network model after layout.
  • The embodiments of the present disclosure define a data representation structure that matches the connection topology of the physical cores, and can digitize the topology structure of the physical cores and the mapping relationship between the laid-out logical cores and physical cores into data that the first neural network can identify and process, so that reinforcement learning can be applied to optimize the placement of logical cores.
  • the method further includes:
  • Step S200: a data representation structure is determined; the data representation structure represents the mapping relationship between logical cores and physical cores and the topology structure of the multiple physical cores, and the layout state data of the current time step conforms to the data representation structure.
  • step S100 the layout state data of the current time step conforms to the data representation structure; the generated layout actions may also be represented by data conforming to the data representation structure.
  • the embodiment of the present disclosure does not specifically limit the specific form of the data representation structure.
  • For example, the topology of multiple physical cores can be represented by coordinates, and the mapping relationship between logical cores and physical cores can be represented by the correspondence between logical core identifiers and physical core coordinates; a two-dimensional matrix can also represent the topology structure of multiple physical cores and the mapping relationship between logical cores and physical cores; or both can be represented by three-dimensional diagrams.
  • FIG. 5 shows an optional implementation manner of the data representation structure in the embodiment of the present disclosure.
  • the topology structure of a plurality of physical cores and the mapping relationship between logical cores and physical cores are represented by a two-dimensional matrix.
  • The elements of the matrix on the right correspond one-to-one to the physical cores in the many-core architecture on the left, that is, the topology structure of the multiple physical cores is digitized; an element with value 0 corresponds to an idle physical core (a physical core on which no logical core is laid out), while a non-zero element corresponds to a physical core on which a logical core is laid out, its value being the identifier of the logical core deployed on that physical core; in other words, the mapping relationship between logical cores and physical cores is digitized.
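  • The matrix encoding described above can be sketched in a few lines. The 4x4 grid size and the convention of starting logical core IDs at 1 (so that 0 can mean "idle") are illustrative assumptions, not details fixed by the disclosure.

```python
# Layout-state matrix: each element corresponds one-to-one to a physical
# core; 0 marks an idle core, and a non-zero value is the identifier of
# the logical core placed on that physical core.

grid = [[0] * 4 for _ in range(4)]       # a hypothetical 4x4 many-core chip

def place(state, logical_id, row, col):
    assert state[row][col] == 0, "physical core already occupied"
    state[row][col] = logical_id

place(grid, 1, 0, 0)     # logical core 1 -> physical core (0, 0)
place(grid, 2, 0, 1)     # logical core 2 -> physical core (0, 1)

idle = sum(v == 0 for row in grid for v in row)
assert grid[0][0] == 1 and grid[0][1] == 2
assert idle == 14        # 14 physical cores remain idle
```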
  • the output of step S100 is layout data that can represent the target layout and conforms to the data representation structure in the embodiment of the present disclosure.
  • From the output layout data, the topology structure of the multiple physical cores in the target layout can be obtained, as well as the mapping relationship between logical cores and physical cores in the target layout.
  • The mapping relationship between logical cores and physical cores in the target layout determines how to actually lay out the logical cores onto the actual physical cores in the many-core architecture.
  • the data representation structure is determined according to the topology structures of multiple physical cores.
  • In different embodiments, the same neural network may be used to process data conforming to different data representation structures, or corresponding neural networks may be constructed according to different data representation structures; this is not particularly limited in the embodiments of the present disclosure.
  • the step of updating the parameters of the first neural network based on the reinforcement learning method includes:
  • step S110 the layout action of the current time step is generated by the first neural network according to the layout state data of the current time step.
  • step S120 the parameters of the first neural network are updated according to the gain parameters of the current time step to increase the expected value of the gain parameters of the current time step; the gain parameters at least include the current time step The actual gain of the layout state.
  • step S130 it is judged whether the learning termination condition is satisfied, and if so, the learning ends, and if otherwise, the process returns to the step of generating the layout action of the current time step through the first neural network.
  • Steps S110 to S130 correspond to one time step in the reinforcement learning iteration; each iteration comprises multiple time steps, from the initial layout state in which no logical core is laid out on any physical core to the layout state in which all logical cores are deployed on physical cores.
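  • A skeleton of one such iteration (episode) is sketched below; `choose_action` is a hypothetical stand-in for the first neural network, and the parameter updates of step S120 are elided.

```python
# One episode: run from the empty initial layout until every logical core
# has been placed on a physical core, one placement per time step.

def run_episode(n_logical, n_physical, choose_action):
    layout = {}                           # logical core -> physical core
    free = list(range(n_physical))        # available (idle) physical cores
    for step in range(n_logical):         # one time step per placement
        lc = step                         # next logical core to place
        pc = choose_action(layout, free)  # S110: generate layout action
        layout[lc] = pc                   # environment applies the action
        free.remove(pc)
        # S120 (updating network parameters from the gain) is omitted here
    return layout                         # S130: all cores placed, episode ends

# trivial stand-in policy: always take the first free physical core
layout = run_episode(3, 9, choose_action=lambda layout, free: free[0])
assert sorted(layout) == [0, 1, 2]          # all three logical cores placed
assert len(set(layout.values())) == 3       # on three distinct physical cores
```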
  • a reward function can be constructed to calculate the benefit of the current time step according to the layout state data of the current time step and the layout actions of the current time step.
  • As the parameters of the first neural network change, the strategy by which it generates layout actions also changes; that is, different parameters of the first neural network correspond to different strategies.
  • The expected value of the gain parameter of the current time step described in step S120 refers to the gain that can be expected, with the parameters of the selected first neural network kept unchanged (that is, with the selected strategy unchanged), by continuing the layout from the current time step according to the selected strategy.
  • The first neural network may be a convolutional neural network, a recurrent neural network, or a graph neural network, which is not particularly limited in the embodiments of the present disclosure.
  • In some embodiments, updating the parameters of the first neural network according to the gain parameter of the current time step to increase the expected value of the gain parameter of the current time step includes:
  • Step S121: according to the layout state data of the current time step and the layout action of the current time step, the overall gain of the current time step is determined through the second neural network.
  • Step S122: the parameters of the second neural network are updated according to the overall gain of the current time step, so that the overall gain of the current time step approaches the expected value of the cumulative gain of the current time step.
  • The cumulative gain of the current time step is determined by the actual gain of the current time step and the actual gains of the layout states corresponding to all subsequent time steps.
  • Step S123: the parameters of the first neural network are updated according to the overall gain of the current time step, so as to increase the expected value of the overall gain of the current time step.
  • the expected value of the overall benefit of the current time step represents the expected benefit that can be obtained by selecting the layout action of the current time step under the condition that the parameters of the first neural network remain unchanged.
  • The cumulative gain of the current time step represents, with the parameters of the first neural network unchanged, the actual gain of the current time step plus the discounted actual gains of the time steps after it, and thus reflects the value of the layout action of the current time step relative to the value of the layout actions of subsequent time steps.
  • the second neural network also continuously learns and updates the parameters of the second neural network, so that the overall revenue of the current time step determined by the second neural network is close to the expected value of the accumulated revenue of the current time step.
  • the overall return of the current time step is close to the expected value of the cumulative return of the current time step, indicating that the overall return of the current time step generated by the second neural network is more accurate.
  • the cumulative revenue of the current time step can reflect the influence of the value of the layout action of the subsequent time step on the value of the layout action of the current time step
  • the second neural network continuously learns to make
  • the overall revenue of the current time step determined by the second neural network can also reflect the value of the layout action of the subsequent time step to the layout action of the current time step. value impact.
  • the first neural network increases the overall revenue of the current time step through continuous learning, it means that the first neural network considers the future revenue when generating the layout action of the current time step, which is more conducive to searching for multiple logic cores the best layout.
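The actor-critic style update described in steps S121 to S123 can be sketched as follows. This is a minimal illustration only, not the disclosed implementation: the linear "networks", the feature sizes, the greedy action pick, and the placeholder cumulative gain are all assumptions.

```python
import math
import random

random.seed(0)

N_PHYSICAL = 4   # candidate physical cores for the next placement
STATE_DIM = 6    # size of the (flattened) layout-state features
LR = 0.05

# First network ("actor"): a single linear layer with softmax output,
# standing in for whatever architecture the first neural network uses.
w_actor = [[random.gauss(0, 0.1) for _ in range(N_PHYSICAL)] for _ in range(STATE_DIM)]
# Second network ("critic"): linear estimate of the overall gain of (state, action).
w_critic = [random.gauss(0, 0.1) for _ in range(STATE_DIM + N_PHYSICAL)]

def actor_probs(state):
    # Softmax over candidate placements of the next logical core.
    logits = [sum(state[i] * w_actor[i][a] for i in range(STATE_DIM))
              for a in range(N_PHYSICAL)]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def critic_gain(state, onehot):
    feat = state + onehot
    return sum(f * w for f, w in zip(feat, w_critic))

state = [random.gauss(0, 1) for _ in range(STATE_DIM)]
probs = actor_probs(state)
action = max(range(N_PHYSICAL), key=lambda a: probs[a])  # greedy pick, for the sketch
onehot = [1.0 if a == action else 0.0 for a in range(N_PHYSICAL)]
overall_gain = critic_gain(state, onehot)

# S122: nudge the critic's overall-gain estimate toward the observed cumulative gain.
cumulative_gain = 1.0  # placeholder reward signal
feat = state + onehot
err = cumulative_gain - overall_gain
w_critic = [w + LR * err * f for w, f in zip(w_critic, feat)]

# S123: policy-gradient step on the actor, weighted by the critic's estimate,
# so the expected overall gain of the chosen placement increases.
for i in range(STATE_DIM):
    for a in range(N_PHYSICAL):
        grad = state[i] * ((1.0 if a == action else 0.0) - probs[a])
        w_actor[i][a] += LR * overall_gain * grad
```

In a full system the two linear maps would be replaced by the first and second neural networks of the embodiment, and the loop would run once per time step of the layout process.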
  • FIG. 5 shows a model for optimizing the layout of logic cores based on reinforcement learning in an embodiment of the present disclosure.
  • the inputs of the first neural network and the second neural network are the layout state data of the current time step, which conforms to the data representation structure defined in the embodiments of the present disclosure; the first neural network outputs the layout action (action) of the current time step, the layout action of the current time step is input to the second neural network, and the second neural network outputs the overall gain of the current time step.
  • the cumulative gain of the current time step equals the sum of the actual gain of the current time step and the actual gain of each subsequent time step weighted by the discount coefficient of that time step; the discount coefficient represents the magnitude of the influence of the layout actions of subsequent time steps on the overall gain of the current time step.
  • the discount coefficient for each subsequent time step decreases one by one.
  • formula (1) is used to calculate the cumulative gain:
  • R_t = Σ_{i=t}^{n} γ^(i-t)·r_i  (1)
  • R_t is the cumulative gain at time step t
  • n is the number of time steps in an iteration
  • r_i is the actual gain at time step i
  • γ^(i-t) is the discount coefficient at time step i
  • the value range of γ is (0, 1]. It can be understood that in formula (1), the discount coefficient of the current time step is 1, and the discount coefficients of the subsequent time steps decrease in turn.
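The discounted sum described above can be stated as a short computation (the function name and the sample gain values are illustrative):

```python
def cumulative_gain(actual_gains, t, gamma=0.9):
    # Discount coefficient at step i is gamma ** (i - t): it is 1 at the
    # current step t and decreases for each subsequent step (gamma in (0, 1]).
    return sum(gamma ** (i - t) * actual_gains[i]
               for i in range(t, len(actual_gains)))

# Intermediate steps earn nothing; only the finished layout is scored.
gains = [0.0, 0.0, 0.0, 2.0]
print(cumulative_gain(gains, 0))  # 2.0 discounted over three steps: 2 * 0.9**3
```

With `gamma` close to 1, the value of a finished layout propagates almost undiminished back to early placement actions; with a small `gamma`, early actions are credited mainly for near-term gains.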
  • the layout state data after executing the layout action of the current time step can be determined according to the layout state data of the current time step and the layout action of the current time step, so as to obtain the layout of the logic cores.
  • the running performance of the logical core layout can be evaluated after the logical core layout is determined.
  • the actual gain at the current time step represents the operational performance of the logical core layout.
  • the method further includes: in step S300, the actual gain of the current time step is determined according to the layout state data of the current time step and the layout action of the current time step.
  • step S300 includes:
  • step S301 in the case that there are logical cores not yet laid out on physical cores at the current time step, a predetermined gain value is determined as the actual gain of the layout state of the current time step.
  • step S302 in the case that there is no logical core not yet laid out on a physical core at the current time step, the actual gain of the layout state of the current time step is determined according to the running performance of the layout state of the current time step.
  • the predetermined gain value in step S301 is 0.
  • the embodiments of the present disclosure do not particularly limit how the running performance of the logical core layout is evaluated.
  • a hardware model can be used to simulate the layout of the logic cores to evaluate operational performance.
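Steps S301 and S302 amount to a simple reward function. The sketch below is an assumed shape: `evaluate_performance` stands in for whatever hardware model or simulator scores a finished layout.

```python
def actual_gain(layout, unplaced_cores, evaluate_performance, predetermined_value=0.0):
    """Actual gain of the current time step (steps S301/S302)."""
    if unplaced_cores:
        # S301: the layout is still partial, so return the predetermined value (0 here).
        return predetermined_value
    # S302: every logical core is placed; score the complete layout by its
    # running performance (e.g. latency, throughput, power consumption).
    return evaluate_performance(layout)

partial = actual_gain({"c0": 0}, ["c1", "c2"], lambda layout: 9.0)  # still partial
final = actual_gain({"c0": 0, "c1": 1}, [], lambda layout: 9.0)     # complete layout
```

Only completed layouts contribute a nonzero reward, which is why the discounting of formula (1) matters: it carries that terminal score back to the earlier placement actions.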
  • step S130 includes:
  • step S131 it is determined whether there is a logical core that is not allocated to a physical core at the current time step.
  • step S132 in the case that there is no logical core that is not laid out on the physical core at the current time step, it is determined whether the iteration termination condition is satisfied.
  • step S133 if the iteration termination condition is satisfied, it is determined that the learning termination condition is satisfied.
  • the embodiments of the present disclosure do not particularly limit how step S132 determines whether the iteration termination condition is satisfied.
  • the iteration termination condition is satisfied when at least one of the following holds: the number of iterations at the current time step reaches a predetermined number of iterations; the overall gain of the current time step reaches a predetermined gain value; the parameters of the first neural network and the parameters of the second neural network both converge.
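The termination test of steps S131 to S133 can be expressed as a small predicate; the argument names are illustrative, and the listed conditions are combined with a logical OR as in the embodiment above.

```python
def learning_terminated(unplaced_cores, n_iter, max_iter,
                        overall_gain, target_gain, params_converged):
    # S131: a partial layout can never end the learning loop.
    if unplaced_cores:
        return False
    # S132/S133: any one of the listed conditions terminates the iteration,
    # and a satisfied iteration termination condition ends the learning.
    return (n_iter >= max_iter
            or overall_gain >= target_gain
            or params_converged)
```

If none of the conditions hold, the method returns to generating the next layout action, as in step S110.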
  • the method further includes:
  • step S401 the layout state data of the next time step is generated according to the layout state data of the current time step and the layout action of the current time step.
  • step S402 in the case that there is no logical core not yet laid out on a physical core at the current time step, it is judged whether the running performance of the layout represented by the layout state data of the next time step is better than the running performance of the optimal layout of the current time step.
  • step S403 in the case that the running performance of the layout represented by the layout state data of the next time step is better than the running performance of the optimal layout of the current time step, the layout represented by the layout state data of the next time step is determined as the optimal layout of the next time step; in the case that the running performance of the layout represented by the layout state data of the next time step is inferior to the running performance of the optimal layout of the current time step, the optimal layout of the current time step is used as the optimal layout of the next time step.
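The bookkeeping of steps S402 and S403 is a running maximum over completed layouts. In this sketch, a higher `running_performance` score is assumed to mean a better layout:

```python
def next_optimal(current_best, candidate, running_performance):
    # S402/S403: a completed candidate replaces the stored optimum only if
    # it runs better; otherwise the current optimum is carried forward.
    if current_best is None:
        return candidate
    if running_performance(candidate) > running_performance(current_best):
        return candidate
    return current_best
```

Because the stored optimum is only ever replaced by a strictly better layout, the layout returned at the end of the iterations is the best one encountered at any time step.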
  • the method further includes:
  • step S404 the layout state data of the next time step is reset to the initial layout state data.
  • the stored optimal layout is used as the target layout.
  • the corresponding logical core layout is stored, and the target layout is determined according to the stored logical core layout.
  • the layout with the best running performance may be determined as the target layout.
  • the method further includes: in step S500, according to a predetermined algorithm, determining identification information of a plurality of logical cores having a determined topology.
  • the identification information of the multiple logic cores may be determined according to the flow directions of signals in the multiple logic cores.
  • the identification information of the logical core is the serial number of the logical core.
  • step S500 can be beneficial to obtain a better logical core layout.
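Numbering the logical cores by signal flow (step S500, determining the identification information) is, in effect, a topological ordering of the logical-core computation graph. Below is a sketch using Kahn's algorithm, with an assumed edge-list graph representation; the patent does not fix the predetermined algorithm, so this is one plausible choice.

```python
from collections import deque

def number_by_signal_flow(edges, cores):
    # edges: (src, dst) pairs giving the direction of signal flow
    # between logical cores; cores: all logical core identifiers.
    indeg = {c: 0 for c in cores}
    succ = {c: [] for c in cores}
    for src, dst in edges:
        succ[src].append(dst)
        indeg[dst] += 1
    queue = deque(c for c in cores if indeg[c] == 0)
    order = {}
    while queue:
        c = queue.popleft()
        order[c] = len(order)  # serial number = position in the signal flow
        for nxt in succ[c]:
            indeg[nxt] -= 1
            if indeg[nxt] == 0:
                queue.append(nxt)
    return order

numbering = number_by_signal_flow(
    [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")],
    ["a", "b", "c", "d"])
```

Cores that feed signals to others receive smaller serial numbers, so consecutive numbers tend to correspond to directly communicating cores, which is what makes such a numbering useful as input to the layout model.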
  • the number of logical cores placed to physical cores in each placement action is a fixed value.
  • the number of physical cores and logical cores laid out in each layout action can be fixed instead of changing.
  • an embodiment of the present disclosure provides a method for training a layout model, where the layout model is used to lay out a plurality of logical cores with a determined topology onto a plurality of physical cores with a determined topology, and the training method includes:
  • step S601 a plurality of samples are determined, and each sample includes information of a plurality of logical cores having a determined topology.
  • the information of multiple logical cores having a determined topology may include identification information of the multiple logical cores.
  • step S602 any one of the above-described methods for determining logical core layout is performed on the samples.
  • step S603 the obtained first neural network is used as a layout model.
  • any one of the above-described methods for determining logical core layout is performed on a plurality of logic cores, so as to determine the logical core layout of the plurality of logic cores and to determine the layout model.
  • an embodiment of the present disclosure provides a method for determining logical core layout, for placing a plurality of logical cores with a determined topology to a plurality of physical cores with a determined topology, the method comprising:
  • step S700 the identification information of the multiple logic cores with the determined topology is input into the layout model to obtain the target layout of the multiple logic cores with the determined topology; the layout model is obtained by the training method of the layout model according to the second aspect of the embodiments of the present disclosure.
  • the parameters of the first neural network in the layout model remain unchanged.
  • an embodiment of the present disclosure provides an electronic device, which includes: one or more processors 101; and a memory 102 on which one or more programs are stored. When the one or more programs are executed by the one or more processors, the one or more processors implement at least one of the following methods: the method for determining logical core layout according to the first aspect of the embodiments of the present disclosure; the training method of the layout model according to the second aspect of the embodiments of the present disclosure; the method for determining logical core layout according to the third aspect of the embodiments of the present disclosure.
  • One or more I/O interfaces 103 are connected between the processor and the memory, and are configured to realize the information exchange between the processor and the memory.
  • the processor 101 is a device with data processing capability, including but not limited to a central processing unit (CPU); the memory 102 is a device with data storage capability, including but not limited to random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), and flash memory (FLASH); the I/O interface (read-write interface) 103 is connected between the processor 101 and the memory 102 and can realize the information interaction between the processor 101 and the memory 102, including but not limited to a data bus (Bus) and the like.
  • the processor 101, the memory 102, and the I/O interface 103 are interconnected by a bus 104, which is in turn connected to other components of the computing device.
  • an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored, and the program, when executed by a processor, implements at least one of the following methods: the method for determining logical core layout according to the first aspect of the embodiments of the present disclosure; the training method of the layout model according to the second aspect of the embodiments of the present disclosure; and the method for determining logical core layout according to the third aspect of the embodiments of the present disclosure.
  • Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
  • computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for the storage of information, such as computer-readable instructions, data structures, program modules, or other data.
  • computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
  • communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media, as is well known to those of ordinary skill in the art.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Hardware Design (AREA)
  • Neurology (AREA)
  • Design And Manufacture Of Integrated Circuits (AREA)

Abstract

Provided in the present disclosure is a method for determining a logical core arrangement, the method being used to arrange multiple logical cores that have determined topologies to multiple physical cores that have determined topologies. The method comprises: updating parameters of a first neural network on the basis of reinforcement learning so as to obtain a target arrangement according to the first neural network. The target arrangement comprises a mapping relationship between the logical cores and the physical cores; the first neural network is configured to generate an arrangement action according to arrangement state data of a current time step, and the arrangement state data represents the topological structures of multiple physical cores that have determined topologies and the mapping relationship between the arranged logical cores and physical cores; and the arrangement action represents the mapping relationship between at least one logical core to be arranged and a physical core. Also provided in the present disclosure are an arrangement model training method, a method for determining a logical core arrangement, an electronic device and a computer-readable medium.

Description

Method for determining logical core layout, model training method, electronic device, medium

Technical Field

The embodiments of the present disclosure relate to the field of computer technologies, and in particular to a method for determining the layout of a logical core, a training method for a layout model, a method for determining the layout of a logical core, an electronic device, and a computer-readable medium.

Background

Many-core architecture is a parallel processing architecture widely used to execute neural network models. As shown in Figure 1, in a many-core architecture, each physical core (CORE) can complete a certain computing function; a certain number of physical cores (CORE) are connected through a certain topology to form a chip (CHIP); a certain number of chips (CHIP) are connected through a certain topology to form a chip array board; and so on, so that a larger-scale system can be obtained by extension.

A neural network model is deployed to a many-core architecture through the following steps: (1) the neural network model is split and mapped into a logical core computation graph, which is composed of multiple logical cores connected through a certain topology; (2) the logical cores are laid out to physical cores.

In some related technologies, the effect of schemes for deploying a neural network model to a many-core architecture is not satisfactory.
SUMMARY OF THE INVENTION

Embodiments of the present disclosure provide a method for determining the layout of a logical core, a training method for a layout model, a method for determining the layout of a logical core, an electronic device, and a computer-readable medium.

In a first aspect, an embodiment of the present disclosure provides a method for determining logical core layout, for laying out multiple logical cores with a determined topology onto multiple physical cores with a determined topology, the method comprising: updating parameters of a first neural network based on a reinforcement learning method, so as to obtain a target layout according to the first neural network, where the target layout includes the mapping relationship between the logical cores and the physical cores; the first neural network is configured to generate a layout action according to the layout state data of the current time step, where the layout state data represents the topology structure of the multiple physical cores with a determined topology and the mapping relationship between the logical cores that have been laid out and the physical cores; and the layout action represents the mapping relationship between at least one logical core to be laid out and a physical core.

In some embodiments, before the step of updating the parameters of the first neural network based on the reinforcement learning method, the method further includes: determining a data representation structure, where the data representation structure represents the mapping relationship between logical cores and physical cores and the topology structure of the multiple physical cores, and the layout state data of the current time step conforms to the data representation structure.

In some embodiments, the step of updating the parameters of the first neural network based on the reinforcement learning method includes: generating, by the first neural network, the layout action of the current time step according to the layout state data of the current time step; updating the parameters of the first neural network according to the gain parameter of the current time step, so as to increase the expected value of the gain parameter of the current time step, where the gain parameter includes at least the actual gain of the layout state of the current time step; and judging whether the learning termination condition is satisfied: if so, the learning ends; if not, returning to the step of generating the layout action of the current time step through the first neural network.

In some embodiments, updating the parameters of the first neural network according to the gain parameter of the current time step so as to increase the expected value of the gain parameter of the current time step includes: determining the overall gain of the current time step through a second neural network according to the layout state data of the current time step and the layout action of the current time step; updating the parameters of the second neural network according to the overall gain of the current time step, so that the overall gain of the current time step approaches the expected value of the cumulative gain of the current time step, where the cumulative gain of the current time step is determined by the actual gain of the current time step and the actual gains of all subsequent time steps; and updating the parameters of the first neural network according to the overall gain of the current time step, so as to increase the expected value of the overall gain of the current time step.

In some embodiments, the method further includes: determining the actual gain of the current time step according to the layout state data of the current time step and the layout action of the current time step.

In some embodiments, the step of determining the actual gain of the current time step according to the layout state data of the current time step and the layout action of the current time step includes: in the case that there are logical cores not yet laid out on physical cores at the current time step, determining a predetermined gain value as the actual gain of the current time step; and in the case that there is no logical core not yet laid out on a physical core at the current time step, determining the actual gain of the current time step according to the running performance of the layout state of the current time step.

In some embodiments, the running performance includes at least one of latency, throughput, and power consumption.

In some embodiments, the cumulative gain of the current time step equals the sum of the actual gain of the current time step and the actual gain of each subsequent time step weighted by the discount coefficient of that time step, where the discount coefficient represents the magnitude of the influence of the layout actions of subsequent time steps on the overall gain of the current time step.

In some embodiments, the discount coefficient of each subsequent time step decreases step by step.
In some embodiments, the step of judging whether the learning termination condition is satisfied includes: judging whether there are logical cores not yet laid out on physical cores at the current time step; in the case that there is no logical core not yet laid out on a physical core at the current time step, judging whether the iteration termination condition is satisfied; and in the case that the iteration termination condition is satisfied, determining that the learning termination condition is satisfied.

In some embodiments, the iteration termination condition is satisfied when at least one of the following holds: the number of iterations at the current time step reaches a predetermined number of iterations; the overall gain of the current time step reaches a predetermined gain value; the parameters of the first neural network and the parameters of the second neural network both converge.

In some embodiments, the method further includes: generating the layout state data of the next time step according to the layout state data of the current time step and the layout action of the current time step; in the case that there is no logical core not yet laid out on a physical core at the current time step, judging whether the running performance of the layout represented by the layout state data of the next time step is better than the running performance of the optimal layout of the current time step; in the case that it is better, determining the layout represented by the layout state data of the next time step as the optimal layout of the next time step; and in the case that it is inferior, using the optimal layout of the current time step as the optimal layout of the next time step.

In some embodiments, in the case that the iteration termination condition is not satisfied and there is no logical core not yet laid out on a physical core at the current time step, the method further includes: resetting the layout state data of the next time step to the initial layout state data.

In some embodiments, in the case that the iteration termination condition is satisfied, the step of obtaining the target layout includes: using the optimal layout of the next time step as the target layout.

In some embodiments, in the case that the iteration termination condition is satisfied, the step of obtaining the target layout includes: determining, among the layouts represented by the layout state data of at least one time step before the current time step at which there was no logical core not yet laid out on a physical core, the layout with the best running performance; and determining the layout with the best running performance as the target layout.

In some embodiments, the first neural network is any one of a convolutional neural network, a recurrent neural network, and a graph neural network.

In some embodiments, the method further includes: determining, according to a predetermined algorithm, the identification information of the multiple logical cores with a determined topology.

In some embodiments, the step of determining, according to a predetermined algorithm, the identification information of the multiple logical cores with a determined topology includes: determining the identification information of the multiple logical cores according to the signal flow directions among the multiple logical cores.

In some embodiments, the number of logical cores laid out to physical cores in each layout action is a fixed value.
In a second aspect, an embodiment of the present disclosure provides a training method for a layout model, where the layout model is used to lay out multiple logical cores with a determined topology onto multiple physical cores with a determined topology, the training method comprising: determining multiple samples, where each sample includes information of multiple logical cores with a determined topology; performing, on the samples, any one of the above methods for determining logical core layout; and using the obtained first neural network as the layout model.

In a third aspect, an embodiment of the present disclosure provides a method for determining logical core layout, for laying out multiple logical cores with a determined topology onto multiple physical cores with a determined topology, the method comprising: inputting the identification information of the multiple logical cores with a determined topology into a layout model, so as to obtain the target layout of the multiple logical cores with a determined topology, where the layout model is obtained according to the training method of the layout model described in the second aspect of the embodiments of the present disclosure.

In a fourth aspect, an embodiment of the present disclosure provides an electronic device, comprising: one or more processors; and a storage device on which one or more programs are stored, which, when executed by the one or more processors, cause the one or more processors to implement at least one of the following methods: the method for determining logical core layout according to the first aspect of the embodiments of the present disclosure; the training method of the layout model according to the second aspect of the embodiments of the present disclosure; the method for determining logical core layout according to the third aspect of the embodiments of the present disclosure.

In a fifth aspect, an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored, and the program, when executed by a processor, implements at least one of the following methods: the method for determining logical core layout according to the first aspect of the embodiments of the present disclosure; the training method of the layout model according to the second aspect of the embodiments of the present disclosure; the method for determining logical core layout according to the third aspect of the embodiments of the present disclosure.

In the method for determining logical core layout provided by the embodiments of the present disclosure, based on reinforcement learning, a first neural network processes layout state data that represents the topology structure of the physical cores and the mapping relationship between the logical cores and the physical cores, so as to generate layout actions, and the parameters of the first neural network are updated according to a gain determined from running performance, so that a layout of the logical cores whose running performance meets preset requirements can finally be obtained. Optimizing the logical core layout based on reinforcement learning effectively reduces the search space of logical core layouts, facilitates finding a better layout of the logical cores, and can effectively address on-chip and inter-chip data transmission delays, and the non-uniformity of such delays, when a many-core architecture system runs a neural network model, thereby effectively ensuring the running performance of the neural network model after layout.
Description of the Drawings
附图用来提供对本公开实施例的进一步理解,并且构成说明书的一部分,与本公开的实施例一起用于解释本公开,并不构成对本公开的限制。通过参考附图对详细示例实施例进行描述,以上和其它特征和优点对本领域技术人员将变得更加显而易见,在附图中:The accompanying drawings are used to provide a further understanding of the embodiments of the present disclosure, and constitute a part of the specification, and together with the embodiments of the present disclosure, they are used to explain the present disclosure, and do not constitute a limitation to the present disclosure. The above and other features and advantages will become more apparent to those skilled in the art by describing detailed example embodiments with reference to the accompanying drawings, in which:
图1为众核架构的示意图。Figure 1 is a schematic diagram of a many-core architecture.
图2为本公开实施例提供的一种确定逻辑核布局的方法中部分步骤的流程图。FIG. 2 is a flowchart of some steps in a method for determining the layout of a logic core provided by an embodiment of the present disclosure.
图3为本公开实施例中基于强化学习对逻辑核布局进行优化的流程示意图。FIG. 3 is a schematic flowchart of optimizing the layout of logic cores based on reinforcement learning in an embodiment of the present disclosure.
图4为本公开实施例提供的另一种确定逻辑核布局的方法中部分步骤的流程图。FIG. 4 is a flowchart of some steps in another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
图5为本公开实施例中基于强化学习对逻辑核布局进行优化的模型示意图。FIG. 5 is a schematic diagram of a model for optimizing logic core layout based on reinforcement learning in an embodiment of the present disclosure.
图6为本公开实施例提供的又一种确定逻辑核布局的方法中部分步骤的流程图。FIG. 6 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
图7为本公开实施例提供的再一种确定逻辑核布局的方法中部分步骤的流程图。FIG. 7 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
图8为本公开实施例提供的再一种确定逻辑核布局的方法中部分步骤的流程图。FIG. 8 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
图9为本公开实施例提供的再一种确定逻辑核布局的方法中部分步骤的流程图。FIG. 9 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
图10为本公开实施例提供的再一种确定逻辑核布局的方法中部分步骤的流程图。FIG. 10 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
图11为本公开实施例提供的再一种确定逻辑核布局的方法中部分步骤的流程图。FIG. 11 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
图12为本公开实施例提供的再一种确定逻辑核布局的方法中部分步骤的流程图。FIG. 12 is a flowchart of some steps in still another method for determining the layout of a logic core provided by an embodiment of the present disclosure.
图13为本公开实施例提供的一种布局模型的训练方法的流程图。FIG. 13 is a flowchart of a training method for a layout model provided by an embodiment of the present disclosure.
图14为本公开实施例提供的一种确定逻辑核布局的方法的流程图。FIG. 14 is a flowchart of a method for determining the layout of a logic core provided by an embodiment of the present disclosure.
图15为本公开实施例提供的一种电子设备的组成框图。FIG. 15 is a block diagram of the composition of an electronic device according to an embodiment of the present disclosure.
图16为本公开实施例提供的一种计算机可读介质的组成框图。FIG. 16 is a block diagram of the composition of a computer-readable medium provided by an embodiment of the present disclosure.
Detailed Description
To enable those skilled in the art to better understand the technical solutions of the present disclosure, the method for determining a logical core layout, the training method for a layout model, the further method for determining a logical core layout, the electronic device, and the computer-readable medium provided by the present disclosure are described in detail below with reference to the accompanying drawings.
Example embodiments are described more fully hereinafter with reference to the accompanying drawings, but the example embodiments may be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that the present disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
在不冲突的情况下,本公开各实施例及实施例中的各特征可相互组合。Various embodiments of the present disclosure and various features of the embodiments may be combined with each other without conflict.
如本文所使用的,术语“和/或”包括一个或多个相关列举条目的任何和所有组合。As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for describing particular embodiments only and is not intended to limit the present disclosure. As used herein, the singular forms "a" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that when the terms "comprising" and/or "made of" are used in this specification, they specify the presence of the stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
除非另外限定,否则本文所用的所有术语(包括技术和科学术语)的含义与本领域普通技术人员通常理解的含义相同。还将理解,诸如那些在常用字典中限定的那些术语应当被解释为具有与其在相关技术以及本公开的背景下的含义一致的含义,且将不解释为具有理想化或过度形式上的含义,除非本文明确如此限定。Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will also be understood that terms such as those defined in common dictionaries should be construed as having meanings consistent with their meanings in the context of the related art and the present disclosure, and will not be construed as having idealized or over-formal meanings, unless expressly so limited herein.
The inventors of the present disclosure have found that, when a neural network model is deployed onto a many-core architecture, even if the logical core computation graphs obtained by splitting the neural network model are identical, different physical layouts onto the physical cores lead to different inter-core delays. Moreover, as the scale of a system composed of multiple chips or multiple chip-array boards grows, the data transmission delay increases, and uneven on-chip or inter-chip communication capabilities further cause the data transmission delay to be non-uniform.
Some related technologies mostly use sequential layout (placing logical cores onto physical cores in the order of their serial numbers) or heuristic search (random search according to certain rules) to optimize the layout of logical cores onto physical cores. However, the sequential layout scheme is not optimized against the above-mentioned data transmission delay and its non-uniformity, so the running performance of the neural network model after layout is poor; and when the many-core system is large, the heuristic search scheme cannot find the optimal layout because the search space is enormous. Therefore, in some related technologies, none of the schemes for deploying a neural network model onto a many-core architecture can solve the above-mentioned data transmission delay and its non-uniformity, and thus none can ensure the running performance of the neural network model after layout.
In view of this, in a first aspect, referring to FIG. 2, an embodiment of the present disclosure provides a method for determining a logical core layout, used to lay out a plurality of logical cores having a determined topology onto a plurality of physical cores having a determined topology. The method includes:
在步骤S100中,基于强化学习方式更新第一神经网络的参数,以根据第一神经网络 得到目标布局;所述第一神经网络配置为根据当前时间步的布局状态数据生成布局动作。In step S100, the parameters of the first neural network are updated based on the reinforcement learning method to obtain the target layout according to the first neural network; the first neural network is configured to generate layout actions according to the layout state data of the current time step.
其中,目标布局可以包括多个物理核的拓扑结构,以及逻辑核与物理核之间的映射关系。The target layout may include topology structures of multiple physical cores, and a mapping relationship between logical cores and physical cores.
FIG. 3 is a schematic flowchart of optimizing a logical core layout based on reinforcement learning (RL) in an embodiment of the present disclosure. As shown in FIG. 3, in reinforcement learning, at each time step the agent generates a layout action, and the environment then evaluates the reward of the current time step and updates the agent's parameters, thereby updating the agent's layout policy so as to pursue a larger expected reward; after many iterations, a logical core layout that meets the requirements, i.e. the target layout, is finally obtained. In FIG. 3, "Available physical cores" denotes idle physical cores, i.e. physical cores onto which no logical core has been laid out, and "Placed logic cores" denotes non-idle physical cores, i.e. physical cores onto which logical cores have been laid out.
In an embodiment of the present disclosure, a first neural network is constructed as the agent. The first neural network can process layout state data and generate layout actions, where the layout state data represents the topology of the plurality of physical cores having the determined topology and the mapping relationship between the already-placed logical cores and the physical cores, and a layout action represents the mapping relationship between at least one logical core to be placed and the physical cores.
In the reinforcement learning process shown in FIG. 3, the reward of the current time step is determined according to the running performance of the neural network model to be deployed onto the many-core architecture under the layout state of the current time step. The embodiments of the present disclosure do not specifically limit the type of running performance; for example, the running performance may be at least one of latency, throughput, and power consumption, and the latency shown in FIG. 3 is only illustrative. Determining the reward according to the running performance enables the corresponding running performance of the target layout obtained by performing step S100 to meet preset requirements.
In the embodiments of the present disclosure, when the target layout of the plurality of logical cores having the determined topology is obtained through step S100, the process of adjusting the parameters of the first neural network is also completed, so that in subsequent runs the first neural network is expected to achieve better results, i.e. to obtain target layouts with better actual running performance.
As one application of the embodiments of the present disclosure, for a plurality of logical cores having a determined topology that need to be actually laid out, the desired optimal target layout may be obtained through step S100, and the plurality of logical cores are then actually laid out according to that optimal target layout.
As another application of the embodiments of the present disclosure, a plurality of logical cores having a determined topology may be processed through step S100 to improve the parameters of the first neural network, which amounts to "training" the first neural network. Of course, a corresponding target layout is actually obtained in this process as well, but that target layout need not be put to actual use.
As yet another application of the embodiments of the present disclosure, several pluralities of logical cores with different topologies may be processed through step S100 to improve the parameters of the first neural network and complete its training. In subsequent use, the trained first neural network can then directly lay out a plurality of logical cores having any determined topology, without changing the parameters of the first neural network in that process.
In the method for determining a logical core layout provided by the embodiments of the present disclosure, based on reinforcement learning, a first neural network processes layout state data that represents the topology of the physical cores and the mapping relationship between logical cores and physical cores, and generates layout actions; the parameters of the first neural network are updated according to a reward determined from the running performance, so that a layout of the logical cores whose running performance meets preset requirements is finally obtained. Optimizing the logical core layout based on reinforcement learning effectively reduces the search space of logical core layouts, which facilitates finding a better layout of the logical cores, and can effectively alleviate the on-chip and inter-chip data transmission delay, and the non-uniformity of that delay, that arise when a many-core system runs a neural network model, thereby effectively ensuring the running performance of the neural network model after layout.
The embodiments of the present disclosure define a data representation structure that matches the connection topology of the physical cores, and can turn the topology of the physical cores and the mapping relationship between the already-placed logical cores and the physical cores into data that the first neural network can recognize and process, so that reinforcement learning can be applied to optimizing the logical core layout.
相应地,参照图4,在一些实施例中,在步骤S100之前,所述方法还包括:Correspondingly, referring to FIG. 4, in some embodiments, before step S100, the method further includes:
在步骤S200中,确定数据化表征结构,所述数据化表征结构表征逻辑核与物理核的映射关系和多个物理核的拓扑结构,所述当前时间步的布局状态数据符合所述数据化表征结构。In step S200, a data-based representation structure is determined, the data-based representation structure represents the mapping relationship between logical cores and physical cores and the topology structures of multiple physical cores, and the layout state data of the current time step conforms to the data-based representation structure.
需要说明的是,在步骤S100中当前时间步的布局状态数据符合所述数据化表征结构;生成的布局动作也可以用符合所述数据化表征结构的数据进行表示。It should be noted that, in step S100, the layout state data of the current time step conforms to the data representation structure; the generated layout actions may also be represented by data conforming to the data representation structure.
The embodiments of the present disclosure do not specifically limit the concrete form of the data representation structure. For example, the topology of the plurality of physical cores may be represented by coordinates, with the correspondence between logical core identifiers and physical core coordinates representing the mapping relationship between logical cores and physical cores; alternatively, a two-dimensional matrix may represent both the topology of the plurality of physical cores and the mapping relationship between logical cores and physical cores; a three-dimensional graph may also be used to represent them.
FIG. 5 shows an optional implementation of the data representation structure in an embodiment of the present disclosure. As shown in FIG. 5, a two-dimensional matrix represents the topology of the plurality of physical cores and the mapping relationship between logical cores and physical cores. The elements of the matrix on the right correspond one-to-one to the physical cores of the many-core architecture on the left, i.e. the topology of the plurality of physical cores is turned into data. An element whose value is 0 corresponds to an idle physical core (a physical core onto which no logical core has been laid out); a non-zero element corresponds to a physical core onto which a logical core has been laid out, and its value is the identifier of the logical core deployed on that physical core, i.e. the mapping relationship between logical cores and physical cores is turned into data.
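As an illustration of this matrix representation, the following Python sketch builds such a placement matrix and lists the idle physical cores; the grid size, core identifiers, and function names are hypothetical and chosen only for illustration, not part of the disclosure:

```python
# Each matrix element corresponds to one physical core: 0 marks an idle core,
# a non-zero value is the identifier of the logical core placed on that core.

def make_state(rows, cols, placements):
    """Build the placement matrix from {logical_core_id: (row, col)} pairs."""
    state = [[0] * cols for _ in range(rows)]
    for core_id, (r, c) in placements.items():
        state[r][c] = core_id
    return state

def idle_positions(state):
    """Coordinates of physical cores that hold no logical core yet."""
    return [(r, c) for r, row in enumerate(state)
            for c, v in enumerate(row) if v == 0]

# Example: a 3x3 physical-core grid with logical cores 1 and 2 placed.
s = make_state(3, 3, {1: (0, 0), 2: (1, 2)})
```

Such a matrix both encodes the physical-core topology (by element position) and the logical-to-physical mapping (by element value), which is what allows a neural network to consume the layout state directly.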
It should be noted that, in the embodiments of the present disclosure, the output of step S100 is layout data that represents the target layout and conforms to the data representation structure of the embodiments of the present disclosure. From the output layout data, the topology of the plurality of physical cores in the target layout and the mapping relationship between logical cores and physical cores in the target layout can be obtained, and this mapping relationship determines how the logical cores are laid out onto the actual physical cores in an actual many-core architecture.
作为一种可选的实施方式,根据多个物理核的拓扑结构确定所述数据化表征结构。As an optional implementation manner, the data representation structure is determined according to the topology structures of multiple physical cores.
It should be noted that, in the embodiments of the present disclosure, the same neural network may process data conforming to different data representation structures, or corresponding neural networks may be constructed for different data representation structures; the embodiments of the present disclosure place no special restriction on this.
在一些实施例中,参照图6,所述基于强化学习方式更新第一神经网络的参数的步骤包括:In some embodiments, referring to FIG. 6 , the step of updating the parameters of the first neural network based on the reinforcement learning method includes:
在步骤S110中,根据当前时间步的布局状态数据,通过所述第一神经网络生成所述当前时间步的布局动作。In step S110, the layout action of the current time step is generated by the first neural network according to the layout state data of the current time step.
In step S120, the parameters of the first neural network are updated according to the reward parameter of the current time step, so as to increase the expected value of the reward parameter of the current time step; the reward parameter includes at least the actual reward of the layout state of the current time step.
In step S130, it is judged whether the learning termination condition is satisfied; if so, the learning ends, and otherwise the process returns to the step of generating the layout action of the current time step through the first neural network.
In the embodiments of the present disclosure, steps S110 to S130 correspond to one time step of a reinforcement learning iteration, and each iteration corresponds to the multiple time steps leading from the initial layout state, in which no logical core is laid out on a physical core, to the layout state in which all logical cores are deployed on physical cores.
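The per-time-step loop of steps S110 to S130 can be sketched as follows. The `policy`, `reward_fn`, and `update` callables are placeholders standing in for the first neural network, the reward computation, and the parameter update; they are not the disclosed implementation:

```python
# Schematic of one RL iteration: at each time step the policy proposes a
# placement action (S110), a reward is observed and parameters are
# (conceptually) updated (S120), and the loop ends once all logical cores
# are placed (S130).

def run_episode(num_logic_cores, policy, reward_fn, update):
    state = {}                           # logical core id -> physical position
    for t in range(num_logic_cores):
        action = policy(state, t)        # S110: propose a placement for core t
        reward = reward_fn(state, action)
        update(state, action, reward)    # S120: adjust network parameters
        state[t] = action                # apply the layout action
    return state                         # S130: all cores placed, episode ends

# Toy run: place core t at position (0, t); no-op parameter update.
placed = run_episode(
    4,
    policy=lambda s, t: (0, t),
    reward_fn=lambda s, a: 0.0,
    update=lambda s, a, r: None,
)
```

One outer loop over episodes, repeating `run_episode` while the updates change the policy, would correspond to the multiple iterations described above.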
The embodiments of the present disclosure do not specifically limit how the reward parameter of the current time step is determined. For example, a reward function may be constructed, and the reward of the current time step may be computed from the layout state data of the current time step and the layout action of the current time step.
It should be noted that when the parameters of the first neural network change, the policy by which the first neural network generates layout actions also changes; that is, different parameters of the first neural network correspond to different policies. The expected value of the reward parameter of the current time step described in step S120 refers to the reward expected to be obtained by completing the layout of all logical cores onto physical cores from the current time step onward according to the selected policy, with the selected parameters of the first neural network (i.e. the selected policy) held fixed.
In the embodiments of the present disclosure, the first neural network may be a convolutional neural network, a recurrent neural network, or a graph neural network; the embodiments of the present disclosure place no special restriction on this.
In some embodiments, referring to FIG. 7, updating the parameters of the first neural network according to the reward parameter of the current time step so as to increase the expected value of the reward parameter of the current time step includes:
In step S121, determining the overall return of the current time step through a second neural network according to the layout state data of the current time step and the layout action of the current time step.
In step S122, updating the parameters of the second neural network according to the overall return of the current time step, so that the overall return of the current time step approaches the expected value of the cumulative return of the current time step, where the cumulative return of the current time step is determined by the actual reward of the current time step and the actual rewards of the layout states corresponding to all subsequent time steps.
In step S123, updating the parameters of the first neural network according to the overall return of the current time step, so as to increase the expected value of the overall return of the current time step.
In the embodiments of the present disclosure, the expected value of the overall return of the current time step represents the expected return obtainable by selecting the layout action of the current time step while the parameters of the first neural network are held fixed.
The cumulative return of the current time step represents, with the parameters of the first neural network held fixed, the value at the current time step of the actual rewards of the subsequent time steps discounted back to the current time step, indicating that the value of the layout action of the current time step is related to the value of the layout actions of subsequent time steps.
In the embodiments of the present disclosure, the second neural network also continuously learns and updates its parameters, so that the overall return of the current time step determined by the second neural network approaches the expected value of the cumulative return of the current time step. The closer the overall return of the current time step is to that expected value, the more accurate the overall return generated by the second neural network.
It should be noted that, in the embodiments of the present disclosure, since the cumulative return of the current time step reflects how the value of the layout actions of subsequent time steps influences the value of the layout action of the current time step, once the second neural network has learned to make the overall return of the current time step approach the expected value of the cumulative return of the current time step, the overall return it determines also reflects that influence. When the first neural network learns to increase the overall return of the current time step, this means the first neural network takes future returns into account when generating the layout action of the current time step, which makes it more likely that the optimal layout of the plurality of logical cores is found.
FIG. 5 shows a model for optimizing the logical core layout based on reinforcement learning in an embodiment of the present disclosure. As shown in FIG. 5, the input of both the first neural network and the second neural network is the layout state data of the current time step, conforming to the data representation structure defined in the embodiments of the present disclosure; the first neural network outputs the layout action of the current time step, which is also fed into the second neural network; and the second neural network outputs the overall return Q(s, a) of the current time step.
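As a minimal numeric illustration of the second network's role (the actual networks of FIG. 5 are not reproduced here), the estimate Q(s, a) can be regressed toward the observed cumulative return; the learning rate and target value below are arbitrary placeholders:

```python
# The critic's regression target at a time step is the cumulative return R_t;
# each update nudges the current estimate toward it. With the target held
# fixed, repeated updates converge to the target.

def critic_update(q_value, cumulative_return, lr=0.1):
    """Move the overall-return estimate toward the observed cumulative return."""
    td_error = cumulative_return - q_value
    return q_value + lr * td_error, td_error

q = 0.0
for _ in range(100):
    q, _ = critic_update(q, cumulative_return=5.0)
```

The first network would then be pushed in whatever direction increases this Q(s, a) estimate, which is the sense in which step S123 "increases the expected value of the overall return".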
In some embodiments, the cumulative return of the current time step equals the sum of the actual reward of the current time step and the actual rewards of each subsequent time step weighted by that time step's discount coefficient, where the discount coefficient represents the magnitude of the influence of the layout action of a subsequent time step on the overall return of the current time step.
In some embodiments, the discount coefficients of the successive subsequent time steps decrease one by one.
As an optional implementation, the cumulative return is computed with formula (1):
R_t = Σ_{i=t}^{n} γ^(i−t) · r_i    (1)
where R_t is the cumulative return of time step t, n is the number of time steps in one iteration, r_i is the actual reward of time step i, and γ^(i−t) is the discount coefficient of time step i.
As an optional implementation, the value range of γ is (0, 1]. It can be understood that, in formula (1), the discount coefficient of the current time step is 1, and the discount coefficient of each successive subsequent time step decreases.
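Formula (1) can be transcribed directly; the helper below uses a 0-indexed reward list, with the list length playing the role of n, and its reward values are invented for illustration:

```python
# Cumulative discounted return at time step t:
#   R_t = sum over i = t .. n of gamma**(i - t) * r_i
# With gamma in (0, 1], the weight is 1 at i = t and shrinks for later steps.

def cumulative_return(rewards, t, gamma):
    """Cumulative return of time step t over a 0-indexed list of rewards."""
    return sum(gamma ** (i - t) * rewards[i] for i in range(t, len(rewards)))

r = [0.0, 0.0, 10.0]    # e.g. a reward only once the layout is complete
```

With γ = 1 every future reward counts fully; smaller γ makes the current layout action's value depend less on distant future rewards.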
In the embodiments of the present disclosure, the layout state data after the layout action of the current time step is executed can be determined from the layout state data and the layout action of the current time step, yielding the layout of the logical cores. With the topology of the physical cores determined, once the layout of the logical cores is determined, the running performance of that logical core layout can be evaluated. As an optional implementation, the actual reward of the current time step represents the running performance of the logical core layout.
Accordingly, in some embodiments, referring to FIG. 8, the method further includes: in step S300, determining the actual reward of the current time step according to the layout state data of the current time step and the layout action of the current time step.
When there are logical cores not yet laid out onto physical cores, the complete logical core layout is unavailable, so the running performance of the logical core layout cannot be evaluated.
相应地,在一些实施例中,参照图9,步骤S300包括:Correspondingly, in some embodiments, referring to FIG. 9 , step S300 includes:
In step S301, when there is a logical core not yet laid out onto a physical core at the current time step, a predetermined reward value is determined as the actual reward of the layout state of the current time step.
In step S302, when there is no logical core left unplaced onto the physical cores at the current time step, the actual reward of the layout state of the current time step is determined according to the running performance of the layout state of the current time step.
As an optional implementation, the predetermined reward value in step S301 is 0.
在本公开实施例中,如何对逻辑核布局的运行性能进行评估不做特殊限定。例如,可以通过硬件模型,根据逻辑核布局进行模拟,从而评估运行性能。In the embodiment of the present disclosure, how to evaluate the running performance of the logical core layout is not particularly limited. For example, a hardware model can be used to simulate the layout of the logic cores to evaluate operational performance.
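The gain rule of steps S301 and S302 can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `state` fields and the `simulate_performance` hardware-model scorer are assumed names introduced only for this example.

```python
def actual_gain(state, simulate_performance, predetermined_gain=0.0):
    """Actual gain (reward) of the current time step.

    state.unplaced       -- logical cores not yet mapped to physical cores
    simulate_performance -- assumed hardware-model simulator that scores a
                            complete layout (higher means better performance)
    """
    if state.unplaced:
        # Incomplete layout: running performance cannot be evaluated,
        # so return the predetermined gain value (step S301, e.g. 0).
        return predetermined_gain
    # Complete layout: gain is derived from running performance (step S302).
    return simulate_performance(state.layout)
```

With this shape, the reinforcement-learning loop receives a neutral reward until the last logical core is placed, and a performance-based reward once the layout is complete.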
The embodiments of the present disclosure place no particular limitation on how to determine, in step S130, whether the learning termination condition is satisfied. As an optional implementation, referring to FIG. 10, step S130 includes:
In step S131, determining whether there is a logical core that has not been placed onto a physical core at the current time step.
In step S132, if there is no logical core that has not been placed onto a physical core at the current time step, determining whether an iteration termination condition is satisfied.
In step S133, if the iteration termination condition is satisfied, determining that the learning termination condition is satisfied.
The embodiments of the present disclosure place no particular limitation on how step S132 determines whether the iteration termination condition is satisfied. For example, the iteration termination condition being satisfied includes at least one of the following: the number of iterations at the current time step reaches a predetermined number of iterations; the overall gain of the current time step reaches a predetermined gain value; the parameters of the first neural network and the parameters of the second neural network have both converged.
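The three example criteria for step S132 can be combined as a single check, any one of which ends the iteration loop. The function name and thresholds below are illustrative placeholders, not part of the disclosure.

```python
def iteration_terminated(iteration, max_iterations=None,
                         overall_gain=None, target_gain=None,
                         params_converged=False):
    """Return True if any of the example iteration-termination
    conditions of step S132 holds: iteration budget exhausted,
    overall gain reaching a predetermined value, or both networks'
    parameters having converged."""
    if max_iterations is not None and iteration >= max_iterations:
        return True
    if (target_gain is not None and overall_gain is not None
            and overall_gain >= target_gain):
        return True
    return params_converged
```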
In the embodiments of the present disclosure, after each iteration is completed, that is, after all logical cores have been placed onto physical cores, it is determined whether the stored optimal layout needs to be updated.
Correspondingly, in some embodiments, referring to FIG. 11, the method further includes:
In step S401, generating the layout state data of the next time step according to the layout state data of the current time step and the layout action of the current time step.
In step S402, if there is no logical core that has not been placed onto a physical core at the current time step, determining whether the running performance of the layout represented by the layout state data of the next time step is better than the running performance of the optimal layout of the current time step.
In step S403, if the running performance of the layout represented by the layout state data of the next time step is better than the running performance of the optimal layout of the current time step, determining the layout represented by the layout state data of the next time step as the optimal layout of the next time step; if the running performance of the layout represented by the layout state data of the next time step is worse than the running performance of the optimal layout of the current time step, taking the optimal layout of the current time step as the optimal layout of the next time step.
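The comparison of steps S402 and S403 amounts to keeping the better of the stored optimum and a newly completed layout. The sketch below assumes a hypothetical `performance` scorer where a higher value means a better layout; it is not the disclosed implementation.

```python
def update_best_layout(best, candidate, performance):
    """Keep the better of the stored optimal layout and a newly
    completed candidate layout (steps S402/S403)."""
    if best is None or performance(candidate) > performance(best):
        return candidate   # candidate becomes the next step's optimum
    return best            # otherwise carry the stored optimum forward
```

Calling this after every completed iteration means the stored optimum is monotonically non-decreasing in performance, which is what makes it a valid target layout once iteration terminates.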
In the embodiments of the present disclosure, after each iteration is completed, that is, after all logical cores have been placed onto physical cores, the next iteration starts from the initial layout state.
Correspondingly, in some embodiments, referring to FIG. 11, if the iteration termination condition is not satisfied and there is no logical core that has not been placed onto a physical core at the current time step, the method further includes:
In step S404, resetting the layout state data of the next time step to the initial layout state data.
In some embodiments, when the iteration termination condition is satisfied, the stored optimal layout is taken as the target layout.
As an optional implementation, after each iteration is completed, that is, after all logical cores have been placed onto physical cores, the corresponding logical core layout is stored, and the target layout is determined from the stored logical core layouts.
In some embodiments, the target layout may be determined as the best-performing layout among the layouts represented by the layout state data of those time steps, prior to the current time step, at which every logical core had been placed onto a physical core.
In some embodiments, referring to FIG. 12, the method further includes: in step S500, determining, according to a predetermined algorithm, the identification information of the plurality of logical cores having the determined topology.
The embodiments of the present disclosure place no particular limitation on the predetermined algorithm in step S500. For example, the identification information of the plurality of logical cores may be determined according to the flow directions of signals among the plurality of logical cores.
As an optional implementation, the identification information of a logical core is the serial number of the logical core.
In some embodiments, step S500 is conducive to obtaining a better logical core layout.
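One way to realize signal-flow-based numbering of the kind mentioned for step S500 is a topological ordering of the logical-core graph, so that a core tends to be numbered after the cores feeding signals into it. The graph encoding and function below are illustrative assumptions, not the disclosed predetermined algorithm.

```python
from collections import deque

def number_by_signal_flow(edges, num_cores):
    """Assign serial numbers to logical cores following signal flow,
    using Kahn's topological sort. `edges` is a list of (src, dst)
    pairs meaning a signal flows from core src to core dst."""
    indegree = [0] * num_cores
    successors = [[] for _ in range(num_cores)]
    for src, dst in edges:
        indegree[dst] += 1
        successors[src].append(dst)
    # Start from cores that receive no signals from other cores.
    queue = deque(i for i in range(num_cores) if indegree[i] == 0)
    order = {}
    while queue:
        core = queue.popleft()
        order[core] = len(order)   # serial number in visit order
        for nxt in successors[core]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    return order
```

For acyclic signal graphs this yields numbers consistent with the flow direction; a real core graph with feedback connections would need an additional tie-breaking rule, which is outside this sketch.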
In some embodiments, the number of logical cores placed onto physical cores in each layout action is a fixed value.
That is, the number of logical cores placed onto physical cores in each layout action may be fixed rather than variable.
In a second aspect, referring to FIG. 13, an embodiment of the present disclosure provides a method for training a layout model, where the layout model is used to place a plurality of logical cores having a determined topology onto a plurality of physical cores having a determined topology, and the training method includes:
In step S601, determining a plurality of samples, each sample including information of a plurality of logical cores having a determined topology.
Exemplarily, the information of the plurality of logical cores having a determined topology may include the identification information of the plurality of logical cores.
In step S602, performing, on the samples, any one of the above methods for determining a logical core layout.
In step S603, taking the obtained first neural network as the layout model.
Any one of the above methods for determining a logical core layout is performed on a plurality of logical cores, so as to determine the logical core layout of the plurality of logical cores and to determine the layout model.
In a third aspect, referring to FIG. 14, an embodiment of the present disclosure provides a method for determining a logical core layout, for placing a plurality of logical cores having a determined topology onto a plurality of physical cores having a determined topology, the method including:
In step S700, inputting the identification information of the plurality of logical cores having the determined topology into a layout model, to obtain a target layout of the plurality of logical cores having the determined topology; the layout model is obtained by the method for training a layout model according to the second aspect of the embodiments of the present disclosure.
In the embodiments of the present disclosure, when the logical core layout is determined through step S700, the parameters of the first neural network in the layout model remain unchanged.
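The inference use of step S700 can be sketched as a rollout of the trained layout model in which the policy is only queried, never updated, so the first neural network's parameters stay fixed. The `policy` and `step` callables and the state dictionary below are illustrative stand-ins, not the disclosed interfaces.

```python
def determine_layout(policy, initial_state, step):
    """Roll out a trained layout model (step S700): repeatedly ask the
    policy for a layout action and apply it until every logical core
    has been placed. No gradient update is performed anywhere."""
    state = initial_state
    while state["unplaced"]:
        action = policy(state)        # inference only, parameters unchanged
        state = step(state, action)   # place the chosen logical core(s)
    return state["layout"]
```

A toy environment suffices to exercise the loop: a policy that always picks the first unplaced core, and a step function that assigns it the next free physical core.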
In a fourth aspect, referring to FIG. 15, an embodiment of the present disclosure provides an electronic device, which includes: one or more processors 101; a memory 102 on which one or more programs are stored, which, when executed by the one or more processors, cause the one or more processors to implement at least one of the following methods: the method for determining a logical core layout according to the first aspect of the embodiments of the present disclosure; the method for training a layout model according to the second aspect of the embodiments of the present disclosure; the method for determining a logical core layout according to the third aspect of the embodiments of the present disclosure; and one or more I/O interfaces 103, connected between the processors and the memory and configured to implement information interaction between the processors and the memory.
The processor 101 is a device with data processing capability, including but not limited to a central processing unit (CPU); the memory 102 is a device with data storage capability, including but not limited to random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), and flash memory (FLASH); the I/O interface (read/write interface) 103 is connected between the processor 101 and the memory 102 and can implement information interaction between the processor 101 and the memory 102, including but not limited to a data bus (Bus) and the like.
In some embodiments, the processor 101, the memory 102, and the I/O interface 103 are interconnected via a bus 104, and are in turn connected to other components of a computing device.
In a fifth aspect, referring to FIG. 16, an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored, and when the program is executed by a processor, at least one of the following methods is implemented: the method for determining a logical core layout according to the first aspect of the embodiments of the present disclosure; the method for training a layout model according to the second aspect of the embodiments of the present disclosure; the method for determining a logical core layout according to the third aspect of the embodiments of the present disclosure.
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods disclosed above, and the functional modules/units in the systems and apparatuses, may be implemented as software, firmware, hardware, or suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be executed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for the storage of information, such as computer-readable instructions, data structures, program modules, or other data.
Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purposes of limitation. In some instances, as would be apparent to those skilled in the art, features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with features, characteristics, and/or elements described in connection with other embodiments, unless expressly indicated otherwise. Accordingly, those skilled in the art will appreciate that various changes in form and details may be made without departing from the scope of the present disclosure as set forth in the appended claims.

Claims (23)

  1. A method for determining a logical core layout, for placing a plurality of logical cores having a determined topology onto a plurality of physical cores having a determined topology, the method comprising:
    updating parameters of a first neural network based on reinforcement learning, so as to obtain a target layout according to the first neural network, the target layout comprising a mapping relationship between the logical cores and the physical cores; wherein the first neural network is configured to generate a layout action according to layout state data of a current time step, the layout state data representing the topology structure of the plurality of physical cores having the determined topology and the mapping relationship between the logical cores that have been placed and the physical cores; and the layout action represents a mapping relationship between at least one logical core to be placed and a physical core.
  2. The method according to claim 1, wherein, before the step of updating the parameters of the first neural network based on reinforcement learning, the method further comprises:
    determining a data representation structure, the data representation structure representing the mapping relationship between the logical cores and the physical cores and the topology structure of the plurality of physical cores, wherein the layout state data of the current time step conforms to the data representation structure.
  3. The method according to claim 1, wherein the step of updating the parameters of the first neural network based on reinforcement learning comprises:
    generating, by the first neural network, the layout action of the current time step according to the layout state data of the current time step;
    updating the parameters of the first neural network according to a gain parameter of the current time step, so as to increase an expected value of the gain parameter of the current time step, the gain parameter comprising at least an actual gain of the layout state of the current time step; and
    determining whether a learning termination condition is satisfied; if so, ending the learning; otherwise, returning to the step of generating, by the first neural network, the layout action of the current time step.
  4. The method according to claim 3, wherein updating the parameters of the first neural network according to the gain parameter of the current time step, so as to increase the expected value of the gain parameter of the current time step, comprises:
    determining, by a second neural network, an overall gain of the current time step according to the layout state data of the current time step and the layout action of the current time step;
    updating parameters of the second neural network according to the overall gain of the current time step, so that the overall gain of the current time step approaches an expected value of a cumulative gain of the current time step, the cumulative gain of the current time step being determined by the actual gain of the current time step and the actual gains of all subsequent time steps; and
    updating the parameters of the first neural network according to the overall gain of the current time step, so as to increase an expected value of the overall gain of the current time step.
  5. The method according to claim 4, wherein the method further comprises:
    determining the actual gain of the current time step according to the layout state data of the current time step and the layout action of the current time step.
  6. The method according to claim 5, wherein the step of determining the actual gain of the current time step according to the layout state data of the current time step and the layout action of the current time step comprises:
    if there are logical cores that have not been placed onto physical cores at the current time step, determining a predetermined gain value as the actual gain of the current time step; and
    if there is no logical core that has not been placed onto a physical core at the current time step, determining the actual gain of the current time step according to the running performance of the layout state of the current time step.
  7. The method according to claim 6, wherein the running performance comprises at least one of latency, throughput, and power consumption.
  8. The method according to claim 4, wherein the cumulative gain of the current time step is equal to the sum of the actual gain of the current time step and the actual gains of all subsequent time steps, each weighted by a discount coefficient of the corresponding subsequent time step, the discount coefficient representing the magnitude of the influence of the layout action of that subsequent time step on the overall gain of the current time step.
  9. The method according to claim 8, wherein the discount coefficients of the subsequent time steps decrease one by one.
  10. The method according to any one of claims 4 to 9, wherein the step of determining whether the learning termination condition is satisfied comprises:
    determining whether there is a logical core that has not been placed onto a physical core at the current time step;
    if there is no logical core that has not been placed onto a physical core at the current time step, determining whether an iteration termination condition is satisfied; and
    if the iteration termination condition is satisfied, determining that the learning termination condition is satisfied.
  11. The method according to claim 10, wherein the iteration termination condition being satisfied comprises at least one of the following: the number of iterations of the current time step reaches a predetermined number of iterations; the overall gain of the current time step reaches a predetermined gain value; the parameters of the first neural network and the parameters of the second neural network have both converged.
  12. The method according to claim 10, wherein the method further comprises:
    generating layout state data of a next time step according to the layout state data of the current time step and the layout action of the current time step;
    if there is no logical core that has not been placed onto a physical core at the current time step, determining whether the running performance of the layout represented by the layout state data of the next time step is better than the running performance of the optimal layout of the current time step; and
    if the running performance of the layout represented by the layout state data of the next time step is better than the running performance of the optimal layout of the current time step, determining the layout represented by the layout state data of the next time step as the optimal layout of the next time step; and if the running performance of the layout represented by the layout state data of the next time step is worse than the running performance of the optimal layout of the current time step, taking the optimal layout of the current time step as the optimal layout of the next time step.
  13. The method according to claim 12, wherein, if the iteration termination condition is not satisfied and there is no logical core that has not been placed onto a physical core at the current time step, the method further comprises:
    resetting the layout state data of the next time step to initial layout state data.
  14. The method according to claim 12, wherein, when the iteration termination condition is satisfied, the step of obtaining the target layout comprises:
    taking the optimal layout of the next time step as the target layout.
  15. The method according to claim 10, wherein, when the iteration termination condition is satisfied, the step of obtaining the target layout comprises:
    determining, among the layouts represented by the layout state data of at least one time step, prior to the current time step, at which no logical core remained unplaced onto a physical core, the layout with the best running performance; and
    determining the layout with the best running performance as the target layout.
  16. The method according to any one of claims 1 to 9, wherein the first neural network is any one of a convolutional neural network, a recurrent neural network, and a graph neural network.
  17. The method according to any one of claims 1 to 9, wherein the method further comprises:
    determining, according to a predetermined algorithm, identification information of the plurality of logical cores having the determined topology.
  18. The method according to claim 17, wherein the step of determining, according to a predetermined algorithm, the identification information of the plurality of logical cores having the determined topology comprises:
    determining the identification information of the plurality of logical cores according to the signal flow directions among the plurality of logical cores.
  19. The method according to any one of claims 1 to 9, wherein the number of logical cores placed onto physical cores in each layout action is a fixed value.
  20. A method for training a layout model, the training method comprising:
    determining a plurality of samples, each sample comprising information of a plurality of logical cores having a determined topology;
    performing, on the samples, the method for determining a logical core layout according to any one of claims 1 to 19; and
    taking the obtained first neural network as the layout model.
  21. A method for determining a logical core layout, for placing a plurality of logical cores having a determined topology onto a plurality of physical cores having a determined topology, the method comprising:
    inputting identification information of the plurality of logical cores having the determined topology into a layout model, to obtain a target layout of the plurality of logical cores having the determined topology, the layout model being obtained by the method for training a layout model according to claim 20.
  22. An electronic device, comprising:
    one or more processors; and
    a storage device having one or more programs stored thereon which, when executed by the one or more processors, cause the one or more processors to implement at least one of the following methods:
    the method for determining a logical core layout according to any one of claims 1 to 19;
    the method for training a layout model according to claim 20; and
    the method for determining a logical core layout according to claim 21.
  23. A computer-readable medium having a computer program stored thereon which, when executed by a processor, implements at least one of the following methods:
    the method for determining a logical core layout according to any one of claims 1 to 19;
    the method for training a layout model according to claim 20; and
    the method for determining a logical core layout according to claim 21.
PCT/CN2021/124311 2020-10-22 2021-10-18 Method for determining logical core arrangement, model training method, electronic device and medium WO2022083527A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011141034.4 2020-10-22
CN202011141034.4A CN112257848B (en) 2020-10-22 2020-10-22 Method for determining logic core layout, model training method, electronic device and medium

Publications (1)

Publication Number Publication Date
WO2022083527A1 true WO2022083527A1 (en) 2022-04-28

Family

ID=74263993

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/124311 WO2022083527A1 (en) 2020-10-22 2021-10-18 Method for determining logical core arrangement, model training method, electronic device and medium

Country Status (2)

Country Link
CN (1) CN112257848B (en)
WO (1) WO2022083527A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116962438A (en) * 2023-09-21 2023-10-27 浪潮电子信息产业股份有限公司 Gradient data synchronization method, system, electronic equipment and readable storage medium

Families Citing this family (2)

Publication number Priority date Publication date Assignee Title
CN112257848B (en) * 2020-10-22 2024-04-30 北京灵汐科技有限公司 Method for determining logic core layout, model training method, electronic device and medium
CN118210599A (en) * 2022-12-16 2024-06-18 华为技术有限公司 Chip resource scheduling method and related device

Citations (3)

Publication number Priority date Publication date Assignee Title
US20190095796A1 (en) * 2017-09-22 2019-03-28 Intel Corporation Methods and arrangements to determine physical resource assignments
CN110100255A (en) * 2017-01-06 2019-08-06 国际商业机器公司 Region is effective, reconfigurable, energy saving, the effective neural network substrate of speed
CN112257848A (en) * 2020-10-22 2021-01-22 北京灵汐科技有限公司 Method for determining logic core layout, model training method, electronic device, and medium

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
US9754221B1 (en) * 2017-03-09 2017-09-05 Alphaics Corporation Processor for implementing reinforcement learning operations
CN110737758B (en) * 2018-07-03 2022-07-05 百度在线网络技术(北京)有限公司 Method and apparatus for generating a model
US20200272905A1 (en) * 2019-02-26 2020-08-27 GE Precision Healthcare LLC Artificial neural network compression via iterative hybrid reinforcement learning approach
US20200320428A1 (en) * 2019-04-08 2020-10-08 International Business Machines Corporation Fairness improvement through reinforcement learning
CN111143148B (en) * 2019-12-30 2023-09-12 北京奇艺世纪科技有限公司 Model parameter determining method, device and storage medium
CN111798114B (en) * 2020-06-28 2024-07-02 纽扣互联(北京)科技有限公司 Model training and order processing method, device, equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116962438A (en) * 2023-09-21 2023-10-27 浪潮电子信息产业股份有限公司 Gradient data synchronization method, system, electronic equipment and readable storage medium
CN116962438B (en) * 2023-09-21 2024-01-23 浪潮电子信息产业股份有限公司 Gradient data synchronization method, system, electronic equipment and readable storage medium

Also Published As

Publication number Publication date
CN112257848A (en) 2021-01-22
CN112257848B (en) 2024-04-30

Similar Documents

Publication Publication Date Title
WO2022083527A1 (en) Method for determining logical core arrangement, model training method, electronic device and medium
US12086516B2 (en) Generating integrated circuit floorplans using neural networks
Hao et al. Adaptive infill sampling criterion for multi-fidelity gradient-enhanced kriging model
CN112513886B (en) Information processing method, information processing apparatus, and information processing program
CN111819578A (en) Asynchronous training for optimization of neural networks using distributed parameter servers with rush updates
CN115066694A (en) Computation graph optimization
US20210350230A1 (en) Data dividing method and processor for convolution operation
WO2024198502A1 (en) Method and apparatus for training optical neural network having robustness with respect to process error, and device
CN114218887A (en) Chip configuration design method, device and medium based on deep learning
CN116932174B (en) Dynamic resource scheduling method, device, terminal and medium for EDA simulation task
CN117371496A (en) Parameter optimization method, device, equipment and storage medium
TWI758223B (en) Computing method with dynamic minibatch sizes and computing system and computer-readable storage media for performing the same
KR20220032861A (en) Neural architecture search method and attaratus considering performance in hardware
CN109492759B (en) Neural network model prediction method, device and terminal
CN115269177A (en) Dynamic calibration optimization method for engine model parameters
US7346868B2 (en) Method and system for evaluating design costs of an integrated circuit
CN115345100A (en) Network-on-chip simulation model, dynamic path planning method and device, and multi-core chip
CN112116081A (en) Deep learning network optimization method and device
Wang et al. Automatically setting parameter-exchanging interval for deep learning
WO2023004593A1 (en) Method for simulating circuit, medium, program product, and electronic device
US20240028910A1 (en) Modeling method of neural network for simulation in semiconductor design process, simulation method in semiconductor design process using the same, manufacturing method of semiconductor device using the same, and semiconductor design system performing the same
CN115204086A (en) Network-on-chip simulation model, dynamic path planning method and device, and multi-core chip
CN118567816A (en) Automatic production line calculation task scheduling method, equipment, medium and product
KR20230087890A (en) System and method for optimizing integrated circuit layout based on hear source distribution image
KR20230087887A (en) Reinforcement learning device and method for automated integrated circuit layout design model

Legal Events

Date Code Title Description
121 Ep: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 21881944

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: PCT application non-entry in European phase

Ref document number: 21881944

Country of ref document: EP

Kind code of ref document: A1