CN112257848A - Method for determining logic core layout, model training method, electronic device, and medium - Google Patents
- Publication number
- CN112257848A (application CN202011141034.4A)
- Authority
- CN
- China
- Prior art keywords
- layout
- time step
- current time
- neural network
- core
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06N3/04 — Neural networks; architecture, e.g. interconnection topology
- G06N3/045 — Neural networks; combinations of networks
- G06F15/7803 — Digital computers; system on board, i.e. computer system on one or more PCBs, e.g. motherboards, daughterboards or blades
- G06N3/06 — Neural networks; physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/08 — Neural networks; learning methods
Abstract
The present disclosure provides a method for determining a logical core layout, used to lay out a plurality of logical cores having a determined topology onto a plurality of physical cores having a determined topology. The method comprises: updating parameters of a first neural network based on reinforcement learning, so as to obtain a target layout according to the first neural network, where the first neural network is configured to generate a layout action based on the layout state data of the current time step. The disclosure also provides a training method for a layout model, a method for determining a logical core layout using the trained model, an electronic device, and a computer-readable medium.
Description
Technical Field
The disclosed embodiments relate to the field of computer technology, and in particular to a method for determining a logical core layout, a training method for a layout model, a further method for determining a logical core layout using the trained model, an electronic device, and a computer-readable medium.
Background
The many-core architecture is a parallel processing architecture widely used to execute neural network models. As shown in Fig. 1, in a many-core architecture each physical core performs a particular computation; a number of physical cores connected in a particular topology form a chip, a number of chips connected in a particular topology form a chip array board, and so on, so that a larger-scale system can be obtained by expansion.
Deploying a neural network model to a many-core architecture involves two steps: (1) the neural network model is split and mapped into a logical core computation graph, in which a plurality of logical cores are connected in a particular topology; (2) the logical cores are laid out onto the physical cores.
In some related technologies, the approaches for deploying a neural network model onto a many-core architecture are not ideal.
Disclosure of Invention
The embodiment of the disclosure provides a method for determining logic core layout, a method for training a layout model, a method for determining logic core layout, an electronic device and a computer readable medium.
In a first aspect, an embodiment of the present disclosure provides a method for determining a logical core layout, configured to lay out a plurality of logical cores having a determined topology to a plurality of physical cores having a determined topology, where the method includes:
updating parameters of the first neural network based on a reinforcement learning mode so as to obtain a target layout according to the first neural network; the first neural network is configured to generate a layout action based on the layout state data for the current time step.
In some embodiments, before the step of updating the parameters of the first neural network based on the reinforcement learning manner, the method further comprises:
determining a data representation structure, where the data representation structure represents the topology of the plurality of physical cores and the mapping relationship between logical cores and physical cores, and the layout state data of the current time step conforms to the data representation structure.
In some embodiments, the step of updating the parameters of the first neural network based on the reinforcement learning manner includes:
generating a layout action of the current time step through the first neural network according to the layout state data of the current time step;
updating the parameters of the first neural network according to the benefit parameter of the current time step so as to increase the expectation of the benefit parameter of the current time step, where the benefit parameter comprises at least the actual benefit of the layout state of the current time step;
judging whether a learning termination condition is met; if so, ending learning, and if not, returning to the step of generating the layout action of the current time step through the first neural network.
In some embodiments, said updating said first neural network parameter in accordance with the benefit parameter for the current time step to increase the expectation of the benefit parameter for the current time step comprises:
determining the overall benefit of the current time step through a second neural network according to the layout state data of the current time step and the layout action of the current time step;
updating the parameters of the second neural network according to the overall benefit of the current time step, so that the overall benefit of the current time step approaches the expectation of the cumulative benefit of the current time step, where the cumulative benefit of the current time step is determined by the actual benefit of the current time step and the actual benefits of all subsequent time steps;
updating the parameters of the first neural network according to the overall benefit of the current time step to increase the expectation of the overall benefit of the current time step.
In some embodiments, the method further comprises:
determining the actual benefit of the current time step according to the layout state data of the current time step and the layout action of the current time step.
In some embodiments, the step of determining the actual benefit of the current time step based on the layout state data for the current time step and the layout actions for the current time step comprises:
determining a predetermined benefit value as the actual benefit of the current time step when, at the current time step, a logical core remains that has not been laid out onto a physical core;
when no logical core remains to be laid out onto a physical core at the current time step, determining the actual benefit of the current time step according to the run performance of the layout state of the current time step.
In some embodiments, the run performance includes at least one of latency, throughput, and power consumption.
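A hedged sketch of such a performance-based actual benefit follows. The weights, signs, and parameter names are illustrative assumptions; the patent does not prescribe a formula, only that an incomplete layout receives a predetermined value and a complete layout is scored from its run performance.

```python
def actual_benefit(placement_complete, latency_ms=None, throughput=None,
                   power_w=None, incomplete_value=0.0,
                   w_lat=1.0, w_thr=1.0, w_pow=1.0):
    """Illustrative per-time-step actual benefit:
    - while some logical cores are still unplaced, return a fixed
      predetermined value;
    - once all logical cores are placed, score the layout from its
      measured run performance (lower latency/power and higher
      throughput are better)."""
    if not placement_complete:
        return incomplete_value
    benefit = 0.0
    if latency_ms is not None:
        benefit -= w_lat * latency_ms
    if throughput is not None:
        benefit += w_thr * throughput
    if power_w is not None:
        benefit -= w_pow * power_w
    return benefit
```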
In some embodiments, the cumulative benefit of the current time step is equal to the sum of the actual benefit of the current time step and the actual benefit of each subsequent time step weighted by a discount coefficient for that time step, the discount coefficient characterizing the magnitude of the impact of that subsequent time step's layout action on the overall benefit of the current time step.
In some embodiments, the discount coefficients of the subsequent time steps decrease step by step.
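One way to realize the weighted cumulative benefit with step-by-step decreasing discount coefficients is a geometric schedule, as sketched below. The choice of `gamma ** k` is an assumption for illustration; the patent only requires the coefficients to decrease.

```python
def cumulative_benefit(actual_benefits, gamma=0.95):
    """Cumulative benefit of the current time step: the actual benefit
    of the current step (index 0, weight gamma**0 == 1) plus the actual
    benefit of each subsequent step weighted by a strictly decreasing
    discount coefficient gamma**k, so later layout actions influence
    the current step's cumulative benefit less."""
    return sum(b * (gamma ** k) for k, b in enumerate(actual_benefits))
```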
In some embodiments, the step of determining whether the learning termination condition is satisfied includes:
judging whether, at the current time step, any logical core remains that has not been laid out onto a physical core;
if no such logical core remains, judging whether an iteration termination condition is met;
if the iteration termination condition is met, determining that the learning termination condition is met.
In some embodiments, the iteration termination condition includes at least one of: the number of iterations of the current time step reaching a predetermined number of iterations, the overall benefit of the current time step reaching a predetermined benefit value, and the parameters of the first neural network and the parameters of the second neural network converging.
In some embodiments, the method further comprises:
generating the layout state data of the next time step according to the layout state data of the current time step and the layout action of the current time step;
when no logical core remains to be laid out onto a physical core at the current time step, judging whether the run performance of the layout represented by the layout state data of the next time step is better than that of the optimal layout of the current time step;
if it is better, determining the layout represented by the layout state data of the next time step as the optimal layout of the next time step; otherwise, taking the optimal layout of the current time step as the optimal layout of the next time step.
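The optimal-layout bookkeeping above reduces to keeping, at each completed time step, whichever of the current best layout and the newly produced layout performs better. A minimal sketch; the names and the lower-is-better default (e.g. latency) are illustrative assumptions:

```python
def update_best_layout(best_layout, best_perf, candidate_layout, candidate_perf,
                       better=lambda a, b: a < b):
    """Keep the better of the current optimal layout and a newly
    completed candidate; with no best layout yet, the candidate wins."""
    if best_layout is None or better(candidate_perf, best_perf):
        return candidate_layout, candidate_perf
    return best_layout, best_perf
```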
In some embodiments, when the iteration termination condition is not satisfied and no logical core at the current time step remains to be laid out onto a physical core, the method further comprises:
resetting the layout state data of the next time step to the initial layout state data.
In some embodiments, in the case that the iteration termination condition is satisfied, the step of obtaining the target layout includes:
taking the optimal layout of the next time step as the target layout.
In some embodiments, when the termination condition is satisfied, the step of obtaining the target layout comprises:
determining, among the layouts represented by the layout state data of each time step before the current time step at which no logical core remained to be laid out onto a physical core, the layout with the best run performance;
determining that best-performing layout as the target layout.
In some embodiments, the first neural network is any one of a convolutional neural network, a recurrent neural network, or a graph neural network.
In some embodiments, the method further comprises:
identification information of a plurality of logical cores having a determined topology is determined according to a predetermined algorithm.
In some embodiments, the step of determining identification information of a plurality of logical cores having a determined topology according to a predetermined algorithm comprises: determining the identification information of the plurality of logic cores according to signal flow directions in the plurality of logic cores.
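Numbering logical cores by signal flow can be read as a topological ordering of the logical-core graph. The sketch below is one such interpretation (a breadth-first Kahn ordering); the patent does not fix the algorithm, and the function name is hypothetical. An edge `(a, b)` means data flows from core `a` to core `b`.

```python
from collections import deque

def assign_ids_by_signal_flow(num_cores, edges):
    """Assign identifiers 1..num_cores in a topological order of the
    logical-core graph, so a core's identifier never precedes those of
    the cores that feed it data."""
    indegree = [0] * num_cores
    successors = [[] for _ in range(num_cores)]
    for a, b in edges:
        successors[a].append(b)
        indegree[b] += 1
    queue = deque(i for i in range(num_cores) if indegree[i] == 0)
    ids, next_id = {}, 1
    while queue:
        node = queue.popleft()
        ids[node] = next_id
        next_id += 1
        for succ in successors[node]:
            indegree[succ] -= 1
            if indegree[succ] == 0:
                queue.append(succ)
    return ids
```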
In some embodiments, the number of logical cores laid out onto physical cores in each layout action is a fixed value.
In a second aspect, an embodiment of the present disclosure provides a training method for a layout model, where the layout model is used to lay out a plurality of logical cores with a certain topology to a plurality of physical cores with a certain topology, the training method includes:
determining a plurality of samples, each sample comprising a plurality of logical core layouts having a determined topology;
performing, for each sample, any of the methods for determining a logical core layout described above;
taking the obtained first neural network as the layout model.
In a third aspect, an embodiment of the present disclosure provides a method for determining a logical core layout, configured to lay out a plurality of logical cores having a determined topology to a plurality of physical cores having a determined topology, where the method includes:
inputting identification information of a plurality of logical cores with a determined topology into a layout model to obtain a target layout of the plurality of logical cores, where the layout model is obtained by the training method of the layout model in the second aspect of the embodiments of the disclosure.
In a fourth aspect, an embodiment of the present disclosure provides an electronic device, including:
one or more processors;
a storage device having one or more programs stored thereon that, when executed by the one or more processors, cause the one or more processors to implement at least one of the following methods:
the method for determining the layout of the logic core according to the first aspect of the embodiment of the disclosure;
the method for training the layout model according to the second aspect of the embodiment of the disclosure;
the method for determining the layout of the logic core according to the third aspect of the embodiment of the disclosure.
In a fifth aspect, the disclosed embodiments provide a computer readable medium having stored thereon a computer program that, when executed by a processor, implements at least one of the following methods:
the method for determining the layout of the logic core according to the first aspect of the embodiment of the disclosure;
the method for training the layout model according to the second aspect of the embodiment of the disclosure;
the method for determining the layout of the logic core according to the third aspect of the embodiment of the disclosure.
In the method for determining a logical core layout provided by the embodiments of the disclosure, layout state data that represents the topology of the physical cores and the mapping relationship between logical cores and physical cores is processed, based on reinforcement learning, by the first neural network to generate layout actions, and the parameters of the first neural network are updated according to a benefit determined from the run performance, so that a logical core layout whose run performance meets a preset requirement can finally be obtained. Optimizing the logical core layout with reinforcement learning effectively reduces the search space and finds better layouts, alleviating the problems of data transmission delay and non-uniform data transmission delay, and thereby ensuring the run performance of the neural network model after layout.
Drawings
The accompanying drawings are included to provide a further understanding of the embodiments of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the principles of the disclosure and not to limit the disclosure. The above and other features and advantages will become more apparent to those skilled in the art by describing in detail exemplary embodiments thereof with reference to the attached drawings, in which:
FIG. 1 is a schematic diagram of a many-core architecture;
FIG. 2 is a flowchart illustrating some steps in a method for determining a logical core layout according to an embodiment of the present disclosure;
FIG. 3 is a schematic flow chart illustrating optimization of a logic core layout based on reinforcement learning according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a model for optimizing a logic core layout based on reinforcement learning according to an embodiment of the present disclosure;
FIG. 5 is a flowchart illustrating some of the steps in another method for determining a layout of a logic core according to an embodiment of the present disclosure;
FIG. 6 is a flowchart illustrating some steps in a method for determining a logical core layout according to an embodiment of the present disclosure;
FIG. 7 is a flowchart illustrating some steps in a method for determining a logical core layout according to an embodiment of the present disclosure;
FIG. 8 is a flowchart illustrating some steps in a method for determining a logical core layout according to an embodiment of the present disclosure;
FIG. 9 is a flowchart illustrating some steps in a method for determining a logical core layout according to an embodiment of the present disclosure;
FIG. 10 is a flowchart illustrating some steps in a method for determining a logical core layout according to an embodiment of the present disclosure;
FIG. 11 is a flowchart illustrating some steps in a method for determining a logical core layout according to an embodiment of the present disclosure;
FIG. 12 is a flowchart illustrating some steps in a method for determining a logical core layout according to an embodiment of the present disclosure;
FIG. 13 is a flowchart of a method for training a layout model according to an embodiment of the present disclosure;
FIG. 14 is a flowchart of a method of determining a logical core layout according to an embodiment of the present disclosure;
fig. 15 is a block diagram of an electronic device according to an embodiment of the disclosure;
fig. 16 is a block diagram of a computer-readable medium according to an embodiment of the disclosure.
Detailed Description
In order to make those skilled in the art better understand the technical solution of the present disclosure, the following describes in detail a method for determining a layout of a logic core, a method for training a layout model, a method for determining a layout of a logic core, an electronic device, and a computer-readable medium provided in the present disclosure with reference to the accompanying drawings.
Example embodiments will be described more fully hereinafter with reference to the accompanying drawings, but which may be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Embodiments of the present disclosure and features of embodiments may be combined with each other without conflict.
As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The inventors of the present disclosure found that, when a neural network model is deployed onto a many-core architecture, even if the logical core computation graphs obtained by splitting the model are identical, different physical layouts of the physical cores lead to different inter-core delays. As the system formed by multiple chips or multiple chip array boards grows in scale, data transmission delays increase, and non-uniform communication capacity between chips makes those delays non-uniform.
In some related technologies, the layout from logical cores to physical cores is mostly performed by sequential layout (the logical cores are laid out onto the physical cores in order of their numbers) or by heuristic search (random search under certain rules). However, sequential layout does not optimize for data transmission delay or its non-uniformity, so the run performance of the laid-out neural network model is poor; and when the many-core system is large, heuristic search cannot find an optimal layout because the search space is enormous. Neither scheme for deploying a neural network model onto a many-core architecture therefore resolves the above problems of data transmission delay and its non-uniformity, and neither can ensure the run performance of the neural network model after layout.
In view of the above, in a first aspect, referring to fig. 2, an embodiment of the present disclosure provides a method for determining a logical core layout, configured to lay out a plurality of logical cores having a determined topology to a plurality of physical cores having a determined topology, where the method includes:
in step S100, updating parameters of the first neural network based on a reinforcement learning manner to obtain a target layout according to the first neural network; the first neural network is configured to generate a layout action based on the layout state data for the current time step.
Fig. 3 is a schematic flowchart illustrating the optimization of a logical core layout based on reinforcement learning (RL) according to an embodiment of the present disclosure. As shown in Fig. 3, at each time step an agent generates a layout action, the environment evaluates the benefit of the current time step, and the agent's parameters, and hence its layout policy, are updated to obtain a larger expected benefit; after multiple iterations, a logical core layout meeting the requirements can finally be obtained.
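The agent–environment loop above can be sketched as follows. This is an illustrative toy, not the patent's implementation: a random-sampling policy stands in for the first neural network, the environment places logical cores onto a small grid of physical cores, and the benefit of a completed layout is the negative total Manhattan distance over the logical-core connections, a stand-in for the delay-based run performance. All class and function names are hypothetical.

```python
import random

class LayoutEnv:
    """Toy environment: place logical cores 1..n onto a grid of physical
    cores. The benefit is 0 until the layout is complete; a completed
    layout scores the negative total Manhattan distance over the
    logical-core connections."""

    def __init__(self, n_logical, edges, grid=(4, 4)):
        self.n, self.edges, self.grid = n_logical, edges, grid

    def reset(self):
        self.placement = {}   # logical core id -> (x, y) of its physical core
        self.free = [(x, y) for x in range(self.grid[0])
                     for y in range(self.grid[1])]
        return dict(self.placement)

    def step(self, action):
        core_id, pos = action            # lay out one logical core
        self.placement[core_id] = pos
        self.free.remove(pos)
        done = len(self.placement) == self.n
        benefit = -self._total_distance() if done else 0.0
        return dict(self.placement), benefit, done

    def _total_distance(self):
        return sum(abs(self.placement[a][0] - self.placement[b][0]) +
                   abs(self.placement[a][1] - self.placement[b][1])
                   for a, b in self.edges)

def search_layout(env, episodes=200, seed=0):
    """Stand-in for the first neural network: sample layout actions at
    random and keep the best completed layout, mirroring the
    iterate-and-keep-best outer loop of the reinforcement-learning flow."""
    rng = random.Random(seed)
    best_layout, best_benefit = None, float("-inf")
    for _ in range(episodes):
        env.reset()
        benefit = 0.0
        for core_id in range(1, env.n + 1):
            _, benefit, _ = env.step((core_id, rng.choice(env.free)))
        if benefit > best_benefit:
            best_layout, best_benefit = dict(env.placement), benefit
    return best_layout, best_benefit
```

With a real agent, the inner loop would also update the policy parameters after each step toward a larger expected benefit; that update is omitted here.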
In the embodiments of the disclosure, a first neural network is constructed as the agent. The first neural network processes layout state data and generates layout actions, where the layout state data represents the topology of the plurality of physical cores with a determined topology and the mapping relationship between the already-laid-out logical cores and the physical cores, and a layout action characterizes the mapping relationship between at least one logical core to be laid out and a physical core.
In the reinforcement learning process shown in Fig. 3, the benefit of the current time step is determined according to the run performance, in the layout state of the current time step, of the neural network model to be deployed onto the many-core architecture. The embodiments of the disclosure do not specially limit the specific type of run performance; for example, it may be at least one of latency, throughput, and power consumption, and the latency shown in Fig. 3 is merely an example. Determining the benefit from the run performance ensures that the run performance of the target layout obtained by performing step S100 can meet the preset requirement.
In the embodiments of the present disclosure, by the time the target layout of the plurality of logical cores with a determined topology is obtained in step S100, the adjustment of the first neural network's parameters is also complete, so that in subsequent runs the first neural network is expected to perform better, i.e., to produce target layouts with better actual run performance.
As an application of the embodiment of the present disclosure, a required optimal target layout may be obtained by the method of step S100 for a plurality of logic cores with determined topologies, which need to be actually laid out, and the actual layout of the plurality of logic cores may be performed by using the optimal target layout.
As another application of the embodiment of the present disclosure, a plurality of logic cores with a certain topology may be processed in the manner of step S100 to improve parameters in the first neural network, which is equivalent to "training" the first neural network. Of course, in this process, the corresponding target layout may also be actually obtained, but the target layout may not be actually applied.
As another application of the embodiment of the present disclosure, a plurality of logic cores with different topologies may be processed in the manner of step S100 to improve parameters in the first neural network, so as to complete training of the first neural network. Therefore, in the subsequent process, the trained first neural network can be directly used for laying out the plurality of logic cores with the arbitrarily determined topology, and the parameters of the first neural network are not required to be changed in the process.
In the method for determining a logical core layout provided by the embodiments of the disclosure, layout state data that represents the topology of the physical cores and the mapping relationship between logical cores and physical cores is processed, based on reinforcement learning, by the first neural network to generate layout actions, and the parameters of the first neural network are updated according to a benefit determined from the run performance, so that a logical core layout whose run performance meets a preset requirement can finally be obtained. Optimizing the logical core layout with reinforcement learning effectively reduces the search space and finds better layouts, alleviating the problems of data transmission delay and non-uniform data transmission delay, and thereby ensuring the run performance of the neural network model after layout.
The embodiments of the disclosure define a data representation structure that matches the connection topology of the physical cores and encodes the topology of the physical cores and the mapping relationship between the already-laid-out logical cores and the physical cores as data that the first neural network can recognize and process, so that reinforcement learning can be applied to the optimization of the logical core layout.
Accordingly, referring to fig. 5, in some embodiments, prior to step S100, the method further comprises:
in step S200, a data representation structure is determined, where the data representation structure represents the topology of the plurality of physical cores and the mapping relationship between logical cores and physical cores, and the layout state data of the current time step conforms to this structure.
It should be noted that, in step S100, the layout state data of the current time step conforms to the data representation structure, and the generated layout actions may also be represented by data conforming to it.
The embodiments of the present disclosure do not specially limit the specific form of the data representation structure. For example, the topology of the physical cores may be represented by coordinates, with the mapping between logical and physical cores expressed as a correspondence between logical core identifiers and physical core coordinates; the topology and the mapping may also be represented by a two-dimensional matrix; or they may be represented by a three-dimensional graph.
FIG. 4 shows an alternative implementation of the data representation structure in the embodiments of the present disclosure. As shown in fig. 4, the topology of the plurality of physical cores and the mapping relationship between logical cores and physical cores are represented by a two-dimensional matrix. The elements of the matrix on the right correspond one-to-one to the physical cores of the many-core architecture on the left, which digitizes the topology of the physical cores. Elements whose value is 0 correspond to idle physical cores (i.e., physical cores carrying no logical core); elements with nonzero values correspond to physical cores on which a logical core is deployed, the value being the identifier of that logical core, which digitizes the mapping relationship between logical cores and physical cores.
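As an illustration only, this two-dimensional-matrix encoding can be sketched as follows; the grid size, logical-core identifiers, and function names are assumptions for the sketch, not the patent's actual data structures.

```python
import numpy as np

# Hypothetical 4x4 many-core grid; sizes and names are illustrative.
GRID = (4, 4)

def empty_state(grid=GRID):
    """All-zero matrix: every physical core is idle (value 0 = no logical core)."""
    return np.zeros(grid, dtype=np.int32)

def place(state, logical_id, row, col):
    """Record the mapping of logical core `logical_id` (nonzero) onto the
    physical core at matrix position (row, col)."""
    if state[row, col] != 0:
        raise ValueError("physical core already occupied")
    new_state = state.copy()
    new_state[row, col] = logical_id
    return new_state

# Lay logical core 3 onto the physical core at row 1, column 2.
state = place(empty_state(), logical_id=3, row=1, col=2)
```

A layout action can then be expressed as such a placement, and the resulting matrix is the layout state data of the next time step.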
It should be noted that, in the embodiments of the present disclosure, the output of step S100 is layout data that characterizes the target layout and conforms to the data representation structure. From the output layout data, the topology of the target layout and the mapping relationship between logical cores and physical cores in the target layout can be obtained, and according to that mapping relationship it can be determined how to lay out the logical cores onto the actual physical cores of the actual many-core architecture.
As an alternative embodiment, the data representation structure is determined according to the topology of the plurality of physical cores.
It should be noted that, in the embodiments of the present disclosure, the same neural network may be used to process data conforming to different data representation structures, or a corresponding neural network may be constructed for each data representation structure. The embodiments of the present disclosure are not particularly limited in this regard.
In some embodiments, referring to fig. 6, the step of updating the parameters of the first neural network based on the reinforcement learning manner includes:
in step S110, generating a layout action at a current time step through the first neural network according to layout state data at the current time step;
in step S120, updating the parameters of the first neural network according to the benefit parameter of the current time step so as to increase the expectation of the benefit parameter of the current time step, the benefit parameter comprising at least the actual benefit of the layout state of the current time step;
in step S130, it is determined whether a learning termination condition is satisfied, if so, learning is terminated, otherwise, the step of generating the layout action of the current time step through the first neural network is returned.
In the disclosed embodiment, steps S110 to S130 correspond to one time step in the reinforcement learning iteration, and each iteration corresponds to a plurality of time steps from an initial layout state in which no logical core is laid out on the physical core to a layout state in which all logical cores are deployed on the physical core.
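Under purely illustrative assumptions (the state is simplified to a list of placed logical-core ids, and the policy and environment are stand-ins), one such iteration can be sketched as:

```python
# Toy episode loop: one reinforcement-learning iteration runs from the empty
# layout to all logical cores placed, one layout action per time step.
NUM_LOGICAL_CORES = 3

def run_iteration(policy, step):
    state, trajectory = [], []
    while len(state) < NUM_LOGICAL_CORES:          # until every core is laid out
        action = policy(state)                     # S110: the first network acts
        next_state, reward = step(state, action)
        trajectory.append((tuple(state), action, reward))
        state = next_state
    return state, trajectory

# Dummy policy/environment: place the next core id; reward 1.0 on completion.
final_state, trajectory = run_iteration(
    policy=lambda s: len(s) + 1,
    step=lambda s, a: (s + [a], 1.0 if len(s) + 1 == NUM_LOGICAL_CORES else 0.0),
)
```

Each loop pass corresponds to one time step (steps S110 to S130), and the whole loop corresponds to one iteration from the initial layout state to the fully deployed state.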
The embodiments of the present disclosure do not place any special restriction on how the benefit parameter of the current time step is determined. For example, a reward function may be constructed that calculates the benefit at the current time step from the layout state data of the current time step and the layout action of the current time step.
It should be noted that when the parameters of the first neural network change, the strategy by which the first neural network generates layout actions also changes; that is, different parameters of the first neural network correspond to different strategies. The expectation of the benefit parameter of the current time step in step S120 refers to the benefit obtainable by completing, from the current time step onward and according to the current strategy, the layout of all logical cores onto physical cores, with the parameters of the first neural network (i.e., the strategy) held unchanged.
In the embodiments of the present disclosure, the first neural network may be a convolutional neural network, a recurrent neural network, or a graph neural network. The embodiments of the present disclosure are not particularly limited in this regard.
In some embodiments, referring to fig. 7, updating the parameters of the first neural network according to the benefit parameter of the current time step to increase the expectation of the benefit parameter of the current time step comprises:
in step S121, determining the overall benefit of the current time step through a second neural network according to the layout state data of the current time step and the layout action of the current time step;
in step S122, updating the parameters of the second neural network according to the overall benefit of the current time step, so that the overall benefit of the current time step approaches the expectation of the cumulative benefit of the current time step, where the cumulative benefit of the current time step is determined by the actual benefit of the current time step and the actual benefits of the layout states corresponding to all subsequent time steps;
in step S123, updating the parameters of the first neural network according to the overall benefit of the current time step, so as to increase the expectation of the overall benefit of the current time step.
In the embodiments of the present disclosure, the overall benefit of the current time step represents the expected benefit obtainable by selecting the layout action of the current time step while the parameters of the first neural network remain unchanged.
The cumulative benefit of the current time step represents, with the parameters of the first neural network held unchanged, the actual benefits of the time steps after the current one discounted back to the current time step; in other words, the value of the layout action of the current time step is related to the values of the layout actions of subsequent time steps.
In the embodiments of the present disclosure, the second neural network also learns continuously to update its parameters, so that the overall benefit of the current time step it determines approaches the expectation of the cumulative benefit of the current time step. That the overall benefit of the current time step approaches the cumulative benefit of the current time step means that the overall benefit generated by the second neural network becomes more accurate.
It should be noted that, in the embodiments of the present disclosure, since the cumulative benefit of the current time step reflects the influence of the values of the layout actions of subsequent time steps on the value of the layout action of the current time step, when the second neural network, through continuous learning, makes the overall benefit of the current time step approach the cumulative benefit of the current time step, the overall benefit it determines also reflects that influence. When the first neural network increases the overall benefit of the current time step through continuous learning, it is therefore taking future benefits into account when generating the layout action of the current time step, which facilitates finding the optimal layout.
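A minimal numeric sketch of this two-network scheme follows. It assumes linear "networks", made-up dimensions, a fixed seed, and a single fixed state; it only illustrates the two update directions described above (critic toward the cumulative benefit, policy toward higher Q), not the patent's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)
state_dim, action_dim = 4, 4
W_actor = rng.normal(0, 0.1, (action_dim, state_dim))   # "first neural network" (policy)
w_critic = rng.normal(0, 0.1, state_dim + action_dim)   # "second neural network" (critic)
lr = 0.05

def q_value(s, a):
    """Overall benefit Q(s, a) of layout action a in layout state s (linear critic)."""
    return float(w_critic @ np.concatenate([s, a]))

s = rng.normal(size=state_dim)
a = W_actor @ s
R = 1.0                                  # observed cumulative benefit R_t (made up)

# Critic update (cf. S122): regress Q(s, a) toward the cumulative benefit R_t.
err_before = abs(q_value(s, a) - R)
for _ in range(100):
    w_critic -= lr * (q_value(s, a) - R) * np.concatenate([s, a])
err_after = abs(q_value(s, a) - R)

# Policy update (cf. S123): adjust the policy so its action increases Q(s, a)
# (a deterministic-policy-gradient-style ascent step).
dq_da = w_critic[state_dim:]             # gradient of Q with respect to the action
W_actor = W_actor + lr * np.outer(dq_da, s)
q_after = q_value(s, W_actor @ s)
```

After the critic step the Q estimate is closer to the observed cumulative benefit, and after the policy step the generated action yields a higher Q under the current critic.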
Fig. 4 is a schematic diagram of a model for optimizing a logical core layout based on reinforcement learning in an embodiment of the present disclosure. As shown in fig. 4, the inputs of the first neural network and the second neural network are both the current layout state data conforming to the data representation structure defined in the embodiments of the present disclosure; the first neural network outputs the layout action (action) of the current time step, which is also input to the second neural network; the second neural network outputs the overall benefit Q(s, a) of the current time step.
In some embodiments, the cumulative benefit of the current time step is equal to the sum of the actual benefit of the current time step and the actual benefits of the subsequent time steps, each weighted by the discount coefficient of its time step, the discount coefficient characterizing the magnitude of the influence of the layout action of that subsequent time step on the overall benefit of the current time step.
In some embodiments, the discount coefficients decrease monotonically from one subsequent time step to the next.
As an alternative embodiment, the cumulative benefit is calculated using equation (1):

R_t = Σ_{i=t}^{n} γ^(i−t) · r_i    (1)

wherein R_t is the cumulative benefit of time step t, n is the number of time steps in one iteration, r_i is the actual benefit of time step i, and γ^(i−t) is the discount coefficient of time step i.
As an alternative embodiment, γ has a value range of (0, 1).
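For illustration, the discounted cumulative benefit defined by equation (1) can be computed with the usual backward recursion R_i = r_i + γ·R_{i+1}; the reward values below are made up.

```python
def cumulative_benefits(rewards, gamma=0.9):
    """R_t = sum_{i=t}^{n} gamma**(i-t) * r_i for every time step t of one iteration."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for i in range(len(rewards) - 1, -1, -1):   # backward pass
        running = rewards[i] + gamma * running
        returns[i] = running
    return returns

# Example: only the final, complete layout yields a nonzero actual benefit.
rs = cumulative_benefits([0.0, 0.0, 1.0], gamma=0.5)
```

With γ = 0.5, the final reward of 1.0 is discounted to 0.5 one step earlier and 0.25 two steps earlier, matching the monotonically decreasing discount coefficients described above.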
In the embodiments of the present disclosure, the layout state data after the layout action of the current time step is executed can be determined from the layout state data of the current time step and the layout action of the current time step, thereby obtaining the layout of the logical cores. When the topology of the physical cores is determined, determining the layout of the logical cores makes it possible to evaluate the operational performance of that layout. As an alternative embodiment, the actual benefit of the current time step characterizes the operational performance of the logical core layout.
Accordingly, in some embodiments, with reference to fig. 8, the method further comprises:
in step S300, the actual benefit of the current time step is determined according to the layout state data of the current time step and the layout action of the current time step.
In the case where there is a logical core that is not laid out on a physical core, since a complete logical core layout cannot be obtained, the operation performance of the logical core layout cannot be evaluated.
Accordingly, in some embodiments, referring to fig. 9, step S300 includes:
in step S301, in a case where there is a logical core that is not laid out on a physical core at the current time step, determining a predetermined profit value as an actual profit of a layout state at the current time step;
in step S302, when there is no logical core that is not laid out on a physical core at the current time step, an actual benefit of the layout state at the current time step is determined according to the operation performance of the layout state at the current time step.
As an alternative embodiment, the predetermined profit value in step S301 is 0.
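A hedged sketch of steps S301/S302 follows: the benefit is the predetermined value 0 while the layout is incomplete, and otherwise an evaluated performance score. The completeness rule, core count, and performance evaluator here are illustrative assumptions (the evaluator stands in for, e.g., a hardware-model simulation).

```python
import numpy as np

NUM_LOGICAL_CORES = 3

def actual_benefit(state, evaluate_performance):
    placed = set(int(v) for v in state[state != 0])
    if placed != set(range(1, NUM_LOGICAL_CORES + 1)):
        return 0.0                          # S301: predetermined benefit value 0
    return evaluate_performance(state)      # S302: evaluate the complete layout

grid = np.zeros((2, 2), dtype=int)
grid[0, 0], grid[0, 1] = 1, 2               # core 3 still unplaced
partial = actual_benefit(grid, lambda s: 5.0)
grid[1, 0] = 3                               # layout now complete
complete = actual_benefit(grid, lambda s: 5.0)
```

The incomplete layout earns the predetermined value 0.0; once all three cores are placed, the evaluator's score is returned.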
In the embodiment of the present disclosure, how to evaluate the operation performance of the logic core layout is not particularly limited. For example, the operational performance may be evaluated by performing simulations based on the logical core layout via a hardware model.
The embodiment of the present disclosure does not make any special limitation on how to determine whether the learning termination condition is satisfied in step S130. As an alternative embodiment, referring to fig. 10, step S130 includes:
in step S131, it is determined whether there is a logical core that is not laid out on the physical core at the current time step;
in step S132, in a case that there is no logical core not laid out on the physical core at the current time step, determining whether an iteration termination condition is satisfied;
in step S133, when the iteration end condition is satisfied, it is determined that the learning end condition is satisfied.
The embodiments of the present disclosure do not place any special limitation on how step S132 determines whether the iteration termination condition is satisfied. For example, the iteration termination condition may include at least one of: the number of iterations reaching a predetermined number of iterations, the overall benefit of the current time step reaching a predetermined benefit value, and the parameters of the first neural network and the second neural network converging.
In the embodiments of the present disclosure, each time an iteration completes, that is, each time all logical cores have been laid out onto physical cores, it is determined whether the stored optimal layout needs to be updated.
Accordingly, in some embodiments, with reference to fig. 11, the method further comprises:
in step S401, generating the layout state data of the next time step according to the layout state data of the current time step and the layout action of the current time step;
in step S402, when there is no logical core not laid out on a physical core at the current time step, determining whether the operation performance of the layout represented by the layout state data at the next time step is better than the operation performance of the optimal layout at the current time step;
in step S403, in a case that the operation performance of the layout represented by the layout state data of the next time step is better than the operation performance of the optimal layout of the current time step, determining the layout represented by the layout state data of the next time step as the optimal layout of the next time step; and under the condition that the running performance of the layout represented by the layout state data of the next time step is inferior to that of the optimal layout of the current time step, taking the optimal layout of the current time step as the optimal layout of the next time step.
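The bookkeeping in steps S402/S403 amounts to keeping the best layout seen so far. In this sketch, "performance" is a made-up scalar where larger is better, and the layout names are placeholders.

```python
def update_best(best, candidate):
    """Return whichever of the two (layout, performance) pairs performs better."""
    if best[0] is None or candidate[1] > best[1]:
        return candidate       # S403, first branch: candidate becomes the optimum
    return best                # S403, second branch: keep the current optimum

best = (None, float("-inf"))
for layout, perf in [("layout-A", 1.0), ("layout-B", 3.0), ("layout-C", 2.0)]:
    best = update_best(best, (layout, perf))
```

After all iterations, `best` holds the stored optimal layout that step S130's termination path can return as the target layout.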
In the embodiments of the present disclosure, each time an iteration completes, that is, each time all logical cores have been laid out onto physical cores, the next iteration is executed from the initial layout state.
Accordingly, in some embodiments, referring to fig. 11, in a case that the iteration termination condition is not satisfied and there is no logical core not laid out onto a physical core at the current time step, the method further comprises:
in step 404, the layout state data of the next time step is reset to the initial layout state data.
In some embodiments, the stored optimal layout is taken as the target layout in case the iteration end condition is met.
As an optional implementation manner, each time an iteration completes, that is, each time all logical cores have been laid out onto physical cores, the corresponding logical core layout is stored, and the target layout is determined from the stored logical core layouts.
In some embodiments, among the layouts characterized by the layout state data of at least one time step before the current time step at which no logical core remained un-laid-out on a physical core, the layout with the best operational performance may be determined as the target layout.
In some embodiments, referring to fig. 12, the method further comprises:
in step S500, identification information of a plurality of logical cores having a determined topology is determined according to a predetermined algorithm.
The predetermined algorithm described in step S500 is not particularly limited in the embodiment of the present disclosure. For example, the identification information of the plurality of logical cores may be determined according to the flow direction of signals in the plurality of logical cores.
As an alternative embodiment, the identification information of the logical core is a serial number of the logical core.
Step S500 helps to obtain a better logical core layout.
In some embodiments, the number of logical cores laid out to the physical core in each layout action is a fixed value.
That is, the number of logical cores laid out to the physical core per layout action may be fixed, rather than varying.
In a second aspect, referring to fig. 13, an embodiment of the present disclosure provides a training method of a layout model, where the layout model is used to lay out a plurality of logical cores with a certain topology to a plurality of physical cores with a certain topology, the training method includes:
in step S601, a plurality of samples are determined, each sample including a plurality of logical core layouts having a determined topology;
in step S602, the method of determining a logical core layout according to the first aspect is performed on the sample;
in step S603, the obtained first neural network is used as a layout model.
In a third aspect, referring to fig. 14, an embodiment of the present disclosure provides a method for determining a logical core layout, configured to lay out a plurality of logical cores having a determined topology to a plurality of physical cores having a determined topology, where the method includes:
in step S700, inputting identification information of a plurality of logic cores having a determined topology into a layout model to obtain a target layout of the plurality of logic cores having the determined topology; the layout model is obtained according to the training method of the layout model in the second aspect of the embodiment of the disclosure.
In the embodiment of the present disclosure, when determining the logic core layout through step S700, the parameters of the first neural network in the layout model remain unchanged.
In a fourth aspect, referring to fig. 15, an embodiment of the present disclosure provides an electronic device, including:
one or more processors 101;
a memory 102 having one or more programs stored thereon that, when executed by the one or more processors, cause the one or more processors to implement at least one of the following methods:
the method for determining the layout of the logic core according to the first aspect of the embodiment of the disclosure;
the method for training the layout model according to the second aspect of the embodiment of the disclosure;
the method for determining the layout of the logic core according to the third aspect of the embodiment of the disclosure.
One or more I/O interfaces 103 coupled between the processor and the memory and configured to enable information interaction between the processor and the memory.
The processor 101 is a device with data processing capability, and includes but is not limited to a Central Processing Unit (CPU) and the like; memory 102 is a device having data storage capabilities including, but not limited to, random access memory (RAM, more specifically SDRAM, DDR, etc.), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), FLASH memory (FLASH); an I/O interface (read/write interface) 103 is connected between the processor 101 and the memory 102, and can realize information interaction between the processor 101 and the memory 102, which includes but is not limited to a data Bus (Bus) and the like.
In some embodiments, the processor 101, memory 102, and I/O interface 103 are interconnected via a bus 104, which in turn connects with other components of the computing device.
In a fifth aspect, referring to fig. 16, an embodiment of the present disclosure provides a computer readable medium, on which a computer program is stored, which when executed by a processor, implements at least one of the following methods:
the method for determining the layout of the logic core according to the first aspect of the embodiment of the disclosure;
the method for training the layout model according to the second aspect of the embodiment of the disclosure;
the method for determining the layout of the logic core according to the third aspect of the embodiment of the disclosure.
It will be understood by those of ordinary skill in the art that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. 
In addition, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media as known to those skilled in the art.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purposes of limitation. In some instances, features, characteristics and/or elements described in connection with a particular embodiment may be used alone or in combination with features, characteristics and/or elements described in connection with other embodiments, unless expressly stated otherwise, as would be apparent to one skilled in the art. Accordingly, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the disclosure as set forth in the appended claims.
Claims (23)
1. A method of determining a logical core placement for placing a plurality of logical cores having a determined topology to a plurality of physical cores having a determined topology, the method comprising:
updating parameters of the first neural network based on a reinforcement learning mode so as to obtain a target layout according to the first neural network; the first neural network is configured to generate a layout action based on the layout state data for the current time step.
2. The method of claim 1, wherein the step of updating parameters of the first neural network based on reinforcement learning is preceded by the method further comprising:
and determining a datamation representation structure, wherein the datamation representation structure represents the topology of a plurality of physical cores and the mapping relation between the logic cores and the physical cores, and the layout state data of the current time step conforms to the datamation representation structure.
3. The method of claim 1, wherein the step of updating parameters of the first neural network based on a reinforcement learning approach comprises:
generating a layout action of the current time step through the first neural network according to the layout state data of the current time step;
updating the first neural network parameter according to the profit parameter of the current time step to increase the expectation of the profit parameter of the current time step; the profit parameter at least comprises the actual profit of the layout state of the current time step;
and judging whether a learning termination condition is met, if so, finishing learning, and if not, returning to the step of generating the layout action of the current time step through the first neural network.
4. The method of claim 3, wherein the updating the first neural network parameter as a function of the benefit parameter for the current time step to increase the expectation of the benefit parameter for the current time step comprises:
determining the overall yield of the current time step through a second neural network according to the layout state data of the current time step and the layout action of the current time step;
updating parameters of the second neural network according to the overall profit of the current time step so as to enable the overall profit of the current time step to approach the expectation of the accumulated profit of the current time step, wherein the accumulated profit of the current time step is determined by the actual profit of the current time step and the actual profits of all subsequent time steps;
updating the first neural network parameters according to the overall gain of the current time step to increase the expectation of the overall gain of the current time step.
5. The method of claim 3, wherein the method further comprises:
and determining the actual profit of the current time step according to the layout state data of the current time step and the layout action of the current time step.
6. The method of claim 5, wherein determining the actual revenue for the current time step based on the layout state data for the current time step and the layout actions for the current time step comprises:
determining a predetermined profit value as an actual profit of the current time step when a logical core which is not laid out on a physical core exists in the current time step;
and under the condition that the current time step does not have a logic core which is not laid out on a physical core, determining the actual benefit of the current time step according to the running performance of the layout state of the current time step.
7. The method of claim 6, wherein the operational performance comprises at least one of latency, throughput, power consumption.
8. The method of claim 4, wherein the cumulative revenue for the current time step is equal to the sum of the actual revenue for the current time step and the actual revenue for each subsequent time step weighted by a discount coefficient for each subsequent time step, the discount coefficient characterizing the magnitude of the impact of layout actions for the subsequent time step on the overall revenue for the current time step.
9. The method of claim 8, wherein the discount coefficients for each subsequent time step are decremented one by one.
10. The method according to any one of claims 4 to 8, wherein the step of determining whether the learning termination condition is satisfied includes:
judging whether a logic core which is not laid out on the physical core exists in the current time step;
judging whether an iteration termination condition is met or not under the condition that no logic core which is not laid out on the physical core exists in the current time step;
in a case where the iteration end condition is satisfied, it is determined that the learning end condition is satisfied.
11. The method of claim 10, wherein the iteration termination condition comprises at least one of: the number of iterations of the current time step reaching a predetermined number of iterations, the overall profit of the current time step reaching a predetermined profit value, and the parameters of the first neural network and the second neural network converging.
12. The method of claim 10, wherein the method further comprises:
generating layout state data of the next time step according to the layout state data of the current time step and the layout action of the current time step;
under the condition that no logic core which is not laid out on the physical core exists in the current time step, judging whether the running performance of the layout represented by the layout state data of the next time step is better than the running performance of the optimal layout of the current time step;
determining the layout represented by the layout state data of the next time step as the optimal layout of the next time step under the condition that the operation performance of the layout represented by the layout state data of the next time step is better than the operation performance of the optimal layout of the current time step; and under the condition that the running performance of the layout represented by the layout state data of the next time step is inferior to that of the optimal layout of the current time step, taking the optimal layout of the current time step as the optimal layout of the next time step.
13. The method of claim 12, wherein, in a case that the iteration termination condition is not met and there is no logical core not laid out onto a physical core at the current time step, the method further comprises:
and resetting the layout state data of the next time step to initial layout state data.
14. The method of claim 12, wherein, in the case that the iteration termination condition is satisfied, the step of deriving the target layout comprises:
and taking the optimal layout of the next time step as the target layout.
15. The method of claim 10, wherein, in a case that the iteration termination condition is satisfied, the step of deriving the target layout comprises:
determining, among the layouts represented by the layout state data of at least one time step before the current time step at which there is no logical core not laid out on a physical core, the layout with the optimal operational performance;
and determining the layout with the optimal running performance as the target layout.
16. The method of any one of claims 1 to 9, wherein the first neural network is any one of a convolutional neural network, a recurrent neural network, and a graph neural network.
17. The method of any of claims 1 to 9, wherein the method further comprises:
identification information of a plurality of logical cores having a determined topology is determined according to a predetermined algorithm.
18. The method of claim 17, wherein determining identification information of a plurality of logical cores having a determined topology according to a predetermined algorithm comprises:
determining the identification information of the plurality of logic cores according to signal flow directions in the plurality of logic cores.
19. The method of any one of claims 1 to 9,
the number of logical cores laid out to the physical core in each layout action is a fixed value.
20. A method of training a layout model, the method comprising:
determining a plurality of samples, each sample comprising a plurality of logical core layouts having a determined topology;
performing, on the sample, the method of determining a logical core layout according to any one of claims 1 to 19;
and taking the obtained first neural network as a layout model.
21. A method of determining a logical core placement for placing a plurality of logical cores having a determined topology to a plurality of physical cores having a determined topology, the method comprising:
inputting identification information of a plurality of logic cores with determined topology into a layout model to obtain a target layout of the plurality of logic cores with determined topology; the layout model is obtained by the method for training a layout model according to claim 20.
22. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon that, when executed by the one or more processors, cause the one or more processors to implement at least one of the following methods:
a method of determining a logical core layout according to any one of claims 1 to 19;
the method of training a layout model according to claim 20;
the method of determining a logical core layout according to claim 21.
23. A computer-readable medium, on which a computer program is stored which, when executed by a processor, implements at least one of the following methods:
a method of determining a logical core layout according to any one of claims 1 to 19;
the method of training a layout model according to claim 20;
the method of determining a logical core layout according to claim 21.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011141034.4A CN112257848B (en) | 2020-10-22 | 2020-10-22 | Method for determining logic core layout, model training method, electronic device and medium |
PCT/CN2021/124311 WO2022083527A1 (en) | 2020-10-22 | 2021-10-18 | Method for determining logical core arrangement, model training method, electronic device and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011141034.4A CN112257848B (en) | 2020-10-22 | 2020-10-22 | Method for determining logic core layout, model training method, electronic device and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112257848A true CN112257848A (en) | 2021-01-22 |
CN112257848B CN112257848B (en) | 2024-04-30 |
Family
ID=74263993
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011141034.4A Active CN112257848B (en) | 2020-10-22 | 2020-10-22 | Method for determining logic core layout, model training method, electronic device and medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112257848B (en) |
WO (1) | WO2022083527A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022083527A1 (en) * | 2020-10-22 | 2022-04-28 | 北京灵汐科技有限公司 | Method for determining logical core arrangement, model training method, electronic device and medium |
WO2024125402A1 (en) * | 2022-12-16 | 2024-06-20 | 华为技术有限公司 | Chip resource scheduling method and related apparatus |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116962438B (en) * | 2023-09-21 | 2024-01-23 | 浪潮电子信息产业股份有限公司 | Gradient data synchronization method, system, electronic equipment and readable storage medium |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9754221B1 (en) * | 2017-03-09 | 2017-09-05 | Alphaics Corporation | Processor for implementing reinforcement learning operations |
US20190095796A1 (en) * | 2017-09-22 | 2019-03-28 | Intel Corporation | Methods and arrangements to determine physical resource assignments |
CN110100255A (en) * | 2017-01-06 | 2019-08-06 | 国际商业机器公司 | Region is effective, reconfigurable, energy saving, the effective neural network substrate of speed |
CN110737758A (en) * | 2018-07-03 | 2020-01-31 | 百度在线网络技术(北京)有限公司 | Method and apparatus for generating a model |
CN111143148A (en) * | 2019-12-30 | 2020-05-12 | 北京奇艺世纪科技有限公司 | Model parameter determination method, device and storage medium |
US20200272905A1 (en) * | 2019-02-26 | 2020-08-27 | GE Precision Healthcare LLC | Artificial neural network compression via iterative hybrid reinforcement learning approach |
US20200320428A1 (en) * | 2019-04-08 | 2020-10-08 | International Business Machines Corporation | Fairness improvement through reinforcement learning |
CN111798114A (en) * | 2020-06-28 | 2020-10-20 | 北京百度网讯科技有限公司 | Model training and order processing method, device, equipment and storage medium |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112257848B (en) * | 2020-10-22 | 2024-04-30 | 北京灵汐科技有限公司 | Method for determining logic core layout, model training method, electronic device and medium |
- 2020-10-22: CN application CN202011141034.4A filed (granted as CN112257848B, active)
- 2021-10-18: PCT application PCT/CN2021/124311 filed (published as WO2022083527A1, application filing)
Non-Patent Citations (3)
Title |
---|
BRENDAN et al.: "Combining policy gradient and Q-learning", arXiv, 6 March 2017 (2017-03-06) *
DENG Lei et al.: "Core Placement Optimization for Multi-chip Many-core Neural Network Systems with Reinforcement Learning", ACM, pages 1-28 *
WU Chunyi: "Research on Key Issues of Large-Scale Task Processing for Big Data under Cloud Computing", China Master's Theses Full-text Database, Information Science and Technology, vol. 2019, no. 11, 15 November 2019 (2019-11-15) *
Also Published As
Publication number | Publication date |
---|---|
CN112257848B (en) | 2024-04-30 |
WO2022083527A1 (en) | 2022-04-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112257848B (en) | Method for determining logic core layout, model training method, electronic device and medium | |
JP7413580B2 (en) | Generating integrated circuit floorplans using neural networks | |
US10755026B1 (en) | Circuit design including design rule violation correction utilizing patches based on deep reinforcement learning | |
US8555221B2 (en) | Partitioning for hardware-accelerated functional verification | |
JP2891328B2 (en) | A Method of Generating Delay Time Values for Multilevel Hierarchical Circuit Design | |
KR20190086134A (en) | Method and apparatus for selecting optiaml training model from various tarining models included in neural network | |
CN111819578A (en) | Asynchronous training for optimization of neural networks using distributed parameter servers with rush updates | |
CN115066694A (en) | Computation graph optimization | |
CN112163601A (en) | Image classification method, system, computer device and storage medium | |
CN113962161B (en) | Optimal QoS service selection method and device based on black oligopolistic optimization algorithm | |
US20210350230A1 (en) | Data dividing method and processor for convolution operation | |
CN114925651A (en) | Circuit routing determination method and related equipment | |
KR20230087888A (en) | System and method for optimizing integrated circuit layout based on neural network | |
CN113962163A (en) | Optimization method, device and equipment for realizing efficient design of passive microwave device | |
CN114218887A (en) | Chip configuration design method, device and medium based on deep learning | |
CN115688668A (en) | Integrated circuit global layout optimization method and related equipment | |
JP7488375B2 (en) | Method, device and computer-readable storage medium for generating neural networks | |
CN113988283A (en) | Mapping method and device of logic node, electronic equipment and storage medium | |
CN117371496A (en) | Parameter optimization method, device, equipment and storage medium | |
CN117422037A (en) | Training method for automatic layout model of simulation chip and automatic layout method | |
CN116011383A (en) | Circuit schematic diagram route planning system for avoiding signal line coverage | |
US7346868B2 (en) | Method and system for evaluating design costs of an integrated circuit | |
KR102689100B1 (en) | Method and system for utilizing thin sub networks for anytime prediction | |
CN110533158B (en) | Model construction method, system and non-volatile computer readable recording medium | |
CN109492759B (en) | Neural network model prediction method, device and terminal |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||