CN116861972A - Neuron cluster distribution method and device for network simulation - Google Patents

Neuron cluster distribution method and device for network simulation

Info

Publication number
CN116861972A
CN116861972A CN202310816481.2A
Authority
CN
China
Prior art keywords
cluster
neuron
clusters
current
memory size
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310816481.2A
Other languages
Chinese (zh)
Inventor
杨岚雁
李涵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Lynxi Technology Co Ltd
Original Assignee
Beijing Lynxi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Lynxi Technology Co Ltd filed Critical Beijing Lynxi Technology Co Ltd
Priority to CN202310816481.2A priority Critical patent/CN116861972A/en
Publication of CN116861972A publication Critical patent/CN116861972A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The present disclosure provides a neuron cluster allocation method and apparatus for network simulation. The method includes: acquiring a plurality of target neuron clusters corresponding to a neural network; pre-allocating the target neuron clusters to a plurality of processing cores, wherein at least one target neuron cluster is pre-allocated to each of at least some of the processing cores, the current cluster memory size occupied by all target neuron clusters pre-allocated to a processing core is less than or equal to the current available cluster memory size corresponding to that core, and the current available cluster memory size is equal to the difference between the available memory size and the current preset routing memory size; obtaining the estimated routing memory size currently occupied by all target neuron clusters pre-allocated to each processing core; and, when the estimated routing memory size corresponding to at least one processing core is greater than the current preset routing memory size, adjusting the current preset routing memory size and returning to the step of pre-allocating the target neuron clusters to the processing cores of the many-core system.

Description

Neuron cluster distribution method and device for network simulation
Technical Field
The disclosure relates to the technical field of network simulation, and in particular relates to a neuron cluster allocation method and device for network simulation, electronic equipment and a computer readable storage medium.
Background
The term neuron cluster is borrowed from the concept of neuron clusters in the human brain and denotes a collection of neurons. A core is the smallest computing unit in a brain-like chip (such as a many-core chip); it may also be called a computing core or processing core, and a brain-like chip is composed of multiple cores. A logical neuron cluster is a neuron cluster represented in a brain-like computing network model (e.g., a spiking neural network model or a convolutional neural network model). In the related art, before network simulation is performed, each logical neuron cluster of the network model must be allocated and deployed to a corresponding core in the brain-like chip so that the brain-like computing network model can run.
Disclosure of Invention
The disclosure provides a neuron cluster allocation method and device for network simulation, electronic equipment and a computer readable storage medium.
In a first aspect, the present disclosure provides a neuron cluster allocation method for network simulation, the neuron cluster allocation method comprising:
acquiring a plurality of target neuron clusters to be distributed corresponding to a neural network to be simulated;
pre-allocating the target neuron clusters to a plurality of processing cores of a many-core system, wherein at least one target neuron cluster is pre-allocated to each of at least some of the processing cores, the current cluster memory size occupied by all target neuron clusters pre-allocated to a processing core is less than or equal to the current available cluster memory size corresponding to that processing core, and the current available cluster memory size corresponding to a processing core is equal to the difference between the available memory size of the processing core and a current preset routing memory size;
for each of the at least some processing cores, obtaining the estimated routing memory size currently occupied by all target neuron clusters pre-allocated to the processing core; and
when the estimated routing memory size currently corresponding to at least one of the at least some processing cores is greater than the current preset routing memory size, adjusting the current preset routing memory size and returning to the step of pre-allocating the target neuron clusters to the processing cores of the many-core system, until the estimated routing memory size currently corresponding to each of the at least some processing cores is less than or equal to the current preset routing memory size.
In a second aspect, the present disclosure provides a neuron cluster allocation device for network simulation, the neuron cluster allocation device comprising:
an acquisition unit configured to acquire a plurality of target neuron clusters to be allocated corresponding to a neural network to be simulated;
a pre-allocation unit configured to pre-allocate the target neuron clusters to a plurality of processing cores of a many-core system, wherein at least one target neuron cluster is pre-allocated to each of at least some of the processing cores, the current cluster memory size occupied by all target neuron clusters pre-allocated to a processing core is less than or equal to the current available cluster memory size corresponding to that processing core, and the current available cluster memory size corresponding to a processing core is equal to the difference between the available memory size of the processing core and a current preset routing memory size;
a computing unit configured to obtain, for each of the at least some processing cores, the estimated routing memory size currently occupied by all target neuron clusters pre-allocated to the processing core; and
an iteration unit configured to adjust the current preset routing memory size when the estimated routing memory size currently corresponding to at least one of the at least some processing cores is greater than the current preset routing memory size, and to return to the step of pre-allocating the target neuron clusters to the processing cores of the many-core system, until the estimated routing memory size currently corresponding to each of the at least some processing cores is less than or equal to the current preset routing memory size.
In a third aspect, the present disclosure provides an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores one or more computer programs executable by the at least one processor to enable the at least one processor to perform the neuron cluster allocation method according to the first aspect described above.
In a fourth aspect, the present disclosure provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method for assigning a neuron cluster according to the first aspect described above.
According to the technical solution of the neuron cluster allocation method for network simulation in the embodiments of the present disclosure, a current preset routing memory size is set, and the current available cluster memory size corresponding to each processing core is determined as the difference between the available memory size of the processing core and the current preset routing memory size. The target neuron clusters to be allocated are then pre-allocated to at least some of the processing cores based on each core's current available cluster memory size. After pre-allocation, the estimated routing memory size corresponding to each processing core that has been pre-allocated at least one target neuron cluster is evaluated and compared with the current preset routing memory size. If the estimated routing memory size corresponding to at least one processing core exceeds the current preset routing memory size, the current preset routing memory size is adjusted and pre-allocation is performed again, until the estimated routing memory size corresponding to every such processing core is less than or equal to the current preset routing memory size. Because the routing memory actually consumed by communication between neuron clusters is thus taken into account during allocation, the method helps alleviate memory overflow when the actually allocated neuron clusters run on the processing cores.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The accompanying drawings are included to provide a further understanding of the disclosure, and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure, without limitation to the disclosure. The above and other features and advantages will become more readily apparent to those skilled in the art by describing in detail exemplary embodiments with reference to the attached drawings, in which:
fig. 1 is a schematic flow chart of a neuron cluster allocation method for network simulation according to an embodiment of the present disclosure;
FIG. 2 is a schematic structural diagram of a neuron cluster distribution device for network simulation according to an embodiment of the present disclosure;
fig. 3 is a block diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
For a better understanding of the technical solutions of the present disclosure, exemplary embodiments of the present disclosure will be described below with reference to the accompanying drawings, in which various details of the embodiments of the present disclosure are included to facilitate understanding, and they should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Embodiments of the disclosure and features of embodiments may be combined with each other without conflict.
As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The terms "connected" or "connected," and the like, are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
In the embodiments of the present disclosure, the network simulation system can be used to run a brain-like computing network (such as a convolutional neural network or a spiking neural network), which simulates the intelligent behavior of the brain by modeling the structure and information processing mechanism of the human brain's neural network. The brain-like computing network may be used to perform image processing tasks, voice processing tasks, text processing tasks, and the like; the present disclosure does not limit the specific task types performed by the brain-like computing network. The network simulation system can comprise a plurality of computing nodes (such as chips or servers), and the brain-like computing network is deployed on these computing nodes to realize parallel simulation.
In some embodiments, the network simulation system may run on a many-core system. The many-core system may include one or more many-core chips, each composed of multiple processing cores and the NoCs (Networks-on-Chip) between them. The many-core chips are based on a unified many-core architecture; a processing core is the smallest unit on a many-core chip that can be independently scheduled and possesses complete computing capability, and each processing core has independent on-chip memory. The processing cores are responsible for performing the main computation, and the NoCs are responsible for transferring data between the processing cores.
In some embodiments, where the network simulation system is operating in a many-core system, the compute nodes in the network simulation system may include one or more processing cores in the many-core chip for operating a number of neurons in a brain-like computing network.
In some embodiments, in a network simulation system, the brain-like computing network includes a plurality of neuron clusters, wherein each neuron cluster is a logical neuron cluster.
Before network simulation, a plurality of neuron clusters of the brain-like computing network are required to be distributed and deployed into corresponding processing cores in the many-core system so as to run the brain-like computing network on the many-core system, thereby realizing corresponding computing task processing.
In spiking neural network simulation, for each computing node, a neuron running on that node computes and updates its own membrane potential in response to receiving spike information from other neurons (on the same computing node or on other computing nodes); when its membrane potential reaches the firing threshold, it releases spike information to its target neurons (again on the same or other computing nodes) to drive those target neurons to compute.
In the related art, when the neuron clusters of the brain-like computing network are allocated and deployed to the corresponding processing cores in the many-core system, allocation is generally based only on the memory size that the neuron clusters themselves need to occupy. In actual operation, however, different neuron clusters must perform data communication with one another, and this communication also consumes a certain amount of memory (hereinafter referred to as the routing memory size). Because the routing memory size is not considered during allocation and deployment, when high-level compilation assigns too many neuron clusters to a single underlying processing core, the routing memory consumption can easily cause memory overflow, which in turn degrades the operation of the brain-like computing network.
Therefore, the embodiment of the disclosure provides a neuron cluster allocation method for network simulation, which aims to effectively solve the technical problems in the related art.
Fig. 1 is a flowchart of a neuron cluster allocation method for network simulation according to an embodiment of the present disclosure. As shown in Fig. 1, the method includes:
Step S11, a plurality of target neuron clusters to be distributed corresponding to the neural network to be simulated are obtained.
The neural network to be simulated is the brain-like computing network described above; for example, it may be a spiking neural network or a convolutional neural network, and it may be used to perform image processing tasks, voice processing tasks, text processing tasks, and the like.
In some embodiments, before the neural network is deployed on the many-core system for network simulation, the network configuration of the neural network to be simulated is obtained. The network configuration includes the original neuron clusters contained in the network and their number, the number of neurons in each cluster, the connection relationships between clusters, and, for each cluster, information on its corresponding destination neuron clusters and source neuron clusters. A source neuron cluster of a given cluster is a cluster whose neurons are connected to neurons in the given cluster and that must transmit information to them during simulation; a destination neuron cluster of a given cluster is a cluster whose neurons are connected to neurons in the given cluster and to which the given cluster must transmit information during simulation. In other words, a source neuron cluster sends information and a destination neuron cluster receives it: when one cluster must transmit information to another during simulation, the former is called a source neuron cluster of the latter, and the latter is called a destination neuron cluster of the former.
In addition, for any neuron cluster, since neurons within the cluster may be connected to each other as well as to neurons in other clusters, a source neuron cluster of the cluster may be the cluster itself or another cluster.
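The network configuration described above can be represented in many ways. The following is a minimal, hedged sketch of one such representation; the field names (`sources`, `destinations`, etc.) and the `connect` helper are illustrative assumptions, since the patent only enumerates the kinds of information the configuration contains.

```python
from dataclasses import dataclass, field

@dataclass
class NeuronCluster:
    name: str
    num_neurons: int
    sources: list = field(default_factory=list)       # clusters that send information to this one
    destinations: list = field(default_factory=list)  # clusters this one sends information to

def connect(src: NeuronCluster, dst: NeuronCluster) -> None:
    """Record that src must transmit information to dst during simulation,
    making src a source cluster of dst and dst a destination cluster of src."""
    src.destinations.append(dst.name)
    dst.sources.append(src.name)

# A cluster may be its own source when its neurons are interconnected:
a = NeuronCluster("A", 256)
connect(a, a)
```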
In step S11, a plurality of target neuron clusters to be allocated are obtained according to a plurality of original neuron clusters corresponding to the neural network to be simulated.
Step S12, pre-allocating the target neuron clusters to a plurality of processing cores of the many-core system, wherein at least one target neuron cluster is pre-allocated to each of at least some of the processing cores, the current cluster memory size occupied by all target neuron clusters pre-allocated to a processing core is less than or equal to the current available cluster memory size corresponding to that processing core, and the current available cluster memory size corresponding to a processing core is equal to the difference between the available memory size of the processing core and the current preset routing memory size.
In the embodiments of the present disclosure, pre-allocation refers to a tentative, temporary allocation rather than a final one: the target neuron clusters are pre-allocated to the processing cores of the many-core system, and the target neuron clusters tentatively assigned to each of at least some of the processing cores are determined. The processing cores involved may be located on the same many-core chip in the many-core system or on different many-core chips.
In the embodiments of the present disclosure, the current cluster memory size occupied by all target neuron clusters pre-allocated to a processing core is the sum of the memory sizes occupied by those clusters. For example, if three target neuron clusters A, B, and C are pre-allocated to processing core 1 and occupy memory sizes a, b, and c respectively, the current cluster memory size occupied by all target neuron clusters pre-allocated to processing core 1 is d = a + b + c.
In the embodiments of the present disclosure, the available memory size of a processing core is the maximum memory size available to that core. A route is a path for data communication between neuron clusters on different processing cores. The current preset routing memory size is the currently set total memory size occupied by all routes used for data communication; it is set to an initial value before neuron clusters are pre-allocated for the first time.
Since the routing memory size that will actually be consumed cannot be determined before the first pre-allocation of neuron clusters, an initial value must be set as the current preset routing memory size; for example, the initial value may be set to 480 B (bytes), an empirical value. After the first pre-allocation is completed, if the calculated estimated routing memory size exceeds the current preset routing memory size, the current preset routing memory size is adjusted and the neuron clusters are pre-allocated again according to the adjusted value.
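The pre-allocation of Step S12 can be sketched as follows. The 480 B initial preset value comes from the description; the greedy first-fit packing order is an assumption, since the patent does not fix a particular strategy for choosing which core receives each cluster.

```python
def preallocate(clusters, cores, available_memory, preset_routing_memory=480):
    """Tentatively assign each target neuron cluster to a processing core whose
    remaining cluster memory can still hold it. The budget per core is the
    current available cluster memory size: available memory minus the current
    preset routing memory size.

    clusters: dict mapping cluster id -> cluster memory size (bytes)
    cores:    list of core ids
    """
    budget = available_memory - preset_routing_memory  # current available cluster memory
    assignment = {core: [] for core in cores}
    used = {core: 0 for core in cores}
    for cluster_id, cluster_size in clusters.items():
        for core in cores:                       # first-fit: first core with room
            if used[core] + cluster_size <= budget:
                assignment[core].append(cluster_id)
                used[core] += cluster_size
                break
        else:
            raise MemoryError(f"cluster {cluster_id} does not fit on any core")
    return assignment
```

With `available_memory=1000` and the default preset, each core's cluster budget is 520 B, so small clusters pile onto the first core until it is full.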
Step S13, for each processing core in at least part of the processing cores, obtaining the pre-estimated routing memory size currently occupied by all target neuron clusters pre-allocated in the processing cores.
After all the target neuron clusters have been pre-allocated to at least some of the processing cores, for each of those processing cores, the routing memory size that each target neuron cluster pre-allocated to the core needs to occupy is obtained, and the estimated routing memory size currently occupied by all target neuron clusters pre-allocated to the core is determined as the sum of those routing memory sizes.
The routing memory size that each target neuron cluster needs to occupy is the product of the number of routes corresponding to that cluster and the basic routing memory size, where the number of routes corresponding to a target neuron cluster is the number of its source neuron clusters that are located on the same many-core chip as the cluster itself.
In some embodiments, after all target neuron clusters have been pre-allocated to at least some of the processing cores, the many-core chip on which each of those processing cores is located is determined. Each target neuron cluster is then traversed and all of its source neuron clusters are determined, so that the number of routes corresponding to each target neuron cluster can be obtained: the number of its source neuron clusters within the same many-core chip, i.e., the number of on-chip routes. The routing memory size each target neuron cluster needs to occupy is then determined as the product of its number of routes and the basic routing memory size. The basic routing memory size is the unit routing memory size, i.e., the memory size each route needs to occupy; it is a predetermined value, for example 48 B.
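The per-core estimate of Step S13 can be sketched as below. The 48 B basic routing memory size is the example value from the text; the dictionary-based data structures (`sources`, `chip_of_core`, `core_of_cluster`) are illustrative assumptions.

```python
BASIC_ROUTING_MEMORY = 48  # bytes per route; example value from the description

def estimated_routing_memory(core_clusters, sources, chip_of_core, core_of_cluster):
    """Estimated routing memory for one processing core: for every target
    neuron cluster pre-allocated to it, count the source clusters that sit on
    the same many-core chip (on-chip routes) and multiply by the basic
    routing memory size.

    core_clusters:   cluster ids pre-allocated to this core
    sources:         dict cluster id -> list of its source cluster ids
    chip_of_core:    dict core id -> chip id
    core_of_cluster: dict cluster id -> core id it is pre-allocated to
    """
    total = 0
    for cluster in core_clusters:
        chip = chip_of_core[core_of_cluster[cluster]]
        on_chip_routes = sum(
            1
            for src in sources.get(cluster, [])
            if chip_of_core[core_of_cluster[src]] == chip
        )
        total += on_chip_routes * BASIC_ROUTING_MEMORY
    return total
```

A cluster whose two source clusters sit on the same chip contributes 2 × 48 B = 96 B; sources on another chip contribute nothing, since only on-chip routes are counted here.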
Step S14, when the estimated routing memory size currently corresponding to at least one of the at least some processing cores is greater than the current preset routing memory size, adjusting the current preset routing memory size and returning to the step of pre-allocating the target neuron clusters to the processing cores of the many-core system, until the estimated routing memory size currently corresponding to each of those processing cores is less than or equal to the current preset routing memory size.
After the estimated routing memory size currently required by all target neuron clusters pre-allocated to each of the at least some processing cores has been calculated, the estimated routing memory size of each core is compared with the current preset routing memory size. When the estimated routing memory size of at least one processing core is greater than the current preset routing memory size, the current preset routing memory size is adjusted, and the step of pre-allocating the target neuron clusters to the processing cores of the many-core system is executed again so that the neuron clusters are re-allocated according to the adjusted value, until the estimated routing memory size of each of the at least some processing cores is less than or equal to the current preset routing memory size.
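Steps S12 through S14 together form an outer iteration, sketched below as one self-contained function. The adjustment rule used here (raising the preset to the largest observed estimate) is one plausible reading; the patent only says the current preset routing memory size "is adjusted". The first-fit packing and the callback-style `estimate_routing` parameter are likewise illustrative assumptions.

```python
def allocate_with_routing_budget(clusters, cores, available_memory,
                                 estimate_routing, initial_preset=480,
                                 max_iters=32):
    """clusters: dict cluster id -> cluster memory size (bytes).
    estimate_routing: callable mapping the list of cluster ids pre-allocated
    to one core to the estimated routing memory they consume there."""
    preset = initial_preset
    for _ in range(max_iters):
        # Step S12: greedy first-fit pre-allocation under the current budget.
        budget = available_memory - preset  # current available cluster memory
        assignment = {c: [] for c in cores}
        used = {c: 0 for c in cores}
        for cid, size in clusters.items():
            core = next(c for c in cores if used[c] + size <= budget)  # raises if none fits
            assignment[core].append(cid)
            used[core] += size
        # Step S13: worst per-core routing-memory estimate.
        worst = max((estimate_routing(assignment[c]) for c in cores), default=0)
        # Step S14: accept the pre-allocation, or adjust the preset and re-run.
        if worst <= preset:
            return assignment  # the pre-allocation becomes the actual allocation
        preset = worst
    raise RuntimeError("did not converge within max_iters")
```

When every core's estimate already fits within the initial 480 B preset, the first pre-allocation is accepted; otherwise the budget per core shrinks and packing is repeated.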
According to the neuron cluster allocation method for network simulation provided by the embodiments of the present disclosure, a current preset routing memory size is set, and the current available cluster memory size corresponding to each processing core is determined according to the difference between the available memory size of the processing core and the current preset routing memory size; the target neuron clusters to be allocated are pre-allocated to at least some of the processing cores based on each core's current available cluster memory size. After pre-allocation, the estimated routing memory size corresponding to each processing core that has been pre-allocated at least one target neuron cluster is evaluated and compared with the current preset routing memory size; if the estimated routing memory size corresponding to at least one processing core exceeds the current preset routing memory size, the current preset routing memory size is adjusted and pre-allocation is repeated, until the estimated routing memory size corresponding to every such processing core is less than or equal to the current preset routing memory size. By accounting in this way for the routing memory consumed by communication between neuron clusters, the method helps alleviate memory overflow when the actually allocated neuron clusters run on the processing cores.
In some embodiments, obtaining the plurality of target neuron clusters to be allocated corresponding to the neural network to be simulated includes: acquiring a plurality of original neuron clusters corresponding to the neural network to be simulated; for each original neuron cluster, judging whether the original neuron cluster meets a preset splitting condition; splitting the original neuron cluster into a plurality of sub-clusters when it meets the splitting condition; and determining the original neuron clusters that do not meet the splitting condition and the sub-clusters, respectively, as target neuron clusters.
The splitting condition includes: the total memory size required to be occupied by the original neuron cluster is larger than the available memory size of a single processing core. The total memory size required to be occupied by the original neuron cluster is the sum of the cluster memory size required to be occupied by the original neuron cluster and the total routing memory size required to be occupied by the original neuron cluster; the total routing memory size required to be occupied by the original neuron cluster is the product of the number of source neuron clusters corresponding to the original neuron cluster and the basic routing memory size, where the source neuron clusters corresponding to an original neuron cluster are the original neuron clusters that have a connection relationship with neurons in that cluster and need to transmit information to those neurons during simulation.
When an original neuron cluster meets the splitting condition, the total memory size required to be occupied by that single original neuron cluster exceeds the available memory size of a single processing core, so the original neuron cluster needs to be split into a plurality of sub-clusters, each of which is taken as a target neuron cluster; the total memory size required to be occupied by each sub-cluster is smaller than or equal to the available memory size of a single processing core. When an original neuron cluster does not meet the splitting condition, its total required memory size does not exceed the available memory size of a single processing core, so it does not need to be split and is directly determined as a target neuron cluster.
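The splitting condition reduces to a single comparison. The following one-liner is a hedged illustration; all parameter names are hypothetical:

```python
def needs_split(cluster_mem, num_source_clusters, base_route_mem, core_mem):
    """Splitting condition from the description: total memory (cluster
    memory plus one base-route entry per source cluster) exceeds the
    available memory of a single processing core."""
    total = cluster_mem + num_source_clusters * base_route_mem
    return total > core_mem
```

With a 100-unit core and a 10-unit base routing memory, a cluster of 90 units with two source clusters (total 110) must be split, while one of 80 units (total 100) need not be.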
In some embodiments, splitting the original neuron cluster into a plurality of sub-clusters includes: constructing a current sub-cluster, where the current sub-cluster is an empty list; adding the currently remaining neurons in the original neuron cluster to the current sub-cluster one by one; each time a neuron is added to the current sub-cluster, calculating the total memory size required to be occupied by the current sub-cluster; when the total memory size required to be occupied by the current sub-cluster is larger than the available memory size of a single processing core, stopping adding neurons to the current sub-cluster and removing the currently added neuron from the current sub-cluster; and constructing a new sub-cluster as the current sub-cluster, and returning to execute the step of adding the currently remaining neurons in the original neuron cluster to the current sub-cluster one by one, until all the neurons in the original neuron cluster have been assigned to sub-clusters.
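A minimal Python sketch of this greedy add-then-remove loop follows. The `mem_of` callback standing in for "total memory size required to be occupied by the current sub-cluster" is an assumption, as is the premise that any single neuron fits in one core:

```python
def split_cluster(neurons, mem_of, core_mem):
    """Greedy split: add neurons one by one; when the current sub-cluster
    would exceed a single core's available memory, remove the neuron just
    added and start a new sub-cluster with it."""
    sub_clusters, current = [], []
    for n in neurons:
        current.append(n)
        if mem_of(current) > core_mem and len(current) > 1:
            current.pop()                  # remove the overflowing neuron
            sub_clusters.append(current)   # close the current sub-cluster
            current = [n]                  # it seeds the new sub-cluster
    if current:
        sub_clusters.append(current)
    return sub_clusters
```

With a uniform 10-unit cost per neuron and a 25-unit core, five neurons split into sub-clusters of two, two, and one.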
In some embodiments, splitting the original neuron cluster into a plurality of sub-clusters includes: sorting the neurons in the original neuron cluster in descending order of the number of other neurons correspondingly connected to each neuron; constructing a current sub-cluster; adding the currently remaining neurons in the original neuron cluster to the current sub-cluster one by one in the sorted order; each time a neuron is added to the current sub-cluster, calculating the total memory size required to be occupied by the current sub-cluster; when the total memory size required to be occupied by the current sub-cluster is larger than the available memory size of a single processing core, stopping adding neurons to the current sub-cluster and removing the currently added neuron from the current sub-cluster; and constructing a new sub-cluster as the current sub-cluster, and returning to execute the step of adding the currently remaining neurons to the current sub-cluster one by one in the sorted order, until all the neurons in the original neuron cluster have been assigned to sub-clusters.
Each original neuron cluster includes one or more neurons. Neurons in different original neuron clusters may be connected to each other, and neurons within the same original neuron cluster may also be connected to each other. Therefore, for each original neuron cluster, each neuron in the cluster may be connected to other neurons within the cluster and/or to neurons in other original neuron clusters. The number of other neurons correspondingly connected to a neuron refers to how many other neurons that neuron is connected to.
Therefore, when an original neuron cluster is split, neurons connected to more other neurons can be placed in the same sub-cluster as far as possible, which reduces the number of routes added between the sub-clusters after splitting.
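Under the simplifying assumption of a fixed per-neuron memory cost, the degree-first variant can be sketched as follows; the `connections` mapping (neuron to the set of its neighbors) is an illustrative data structure, not one prescribed by the description:

```python
def split_by_degree(connections, mem_per_neuron, core_mem):
    """Degree-first variant: sort neurons by how many other neurons they
    connect to (descending) before the greedy fill, so highly connected
    neurons tend to land in the same sub-cluster."""
    order = sorted(connections, key=lambda n: len(connections[n]),
                   reverse=True)          # descending connection count
    subs, cur = [], []
    for n in order:
        # With a fixed per-neuron cost, the memory check is a count check.
        if cur and (len(cur) + 1) * mem_per_neuron > core_mem:
            subs.append(cur)
            cur = []
        cur.append(n)
    if cur:
        subs.append(cur)
    return subs
```

Python's `sorted` is stable, so neurons with equal degree keep their original relative order.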
In some embodiments, splitting the original neuron clusters into a plurality of sub-clusters comprises: for each neuron in the original neuron cluster, marking the neuron as an important neuron when the number of other neurons correspondingly connected with the neuron is larger than a preset number, and marking the neuron as a non-important neuron when the number of other neurons correspondingly connected with the neuron is smaller than or equal to the preset number; constructing a corresponding number of first sub-clusters according to the number of important neurons in the original neuron clusters, and correspondingly adding one important neuron in each first sub-cluster; traversing each first sub-cluster, and adding non-important neurons in the original neuron cluster to the first sub-clusters one by one when the total memory size required to be occupied by the first sub-clusters is smaller than the available memory size of a single processing core; when each non-important neuron is added in the first sub-cluster, calculating the total memory size required to be occupied by the first sub-cluster; and stopping adding non-important neurons to the first sub-cluster when the total memory size required to be occupied by the first sub-cluster is larger than the available memory size of the single processing core, and removing the currently added non-important neurons from the first sub-cluster.
Thus, after the original neuron clusters are split, each sub-cluster only comprises one important neuron and at least one non-important neuron, which is beneficial to improving the uniformity of the total memory size required to be occupied by the split sub-clusters.
In some embodiments, when unassigned non-important neurons exist in the original neuron cluster, the method further includes: constructing a second sub-cluster; adding the currently remaining non-important neurons in the original neuron cluster to the second sub-cluster one by one; each time a non-important neuron is added to the second sub-cluster, calculating the total memory size required to be occupied by the second sub-cluster; when the total memory size required to be occupied by the second sub-cluster is larger than the available memory size of a single processing core, stopping adding non-important neurons to the second sub-cluster and removing the currently added non-important neuron from the second sub-cluster; and constructing a new second sub-cluster, and returning to execute the step of adding the currently remaining non-important neurons in the original neuron cluster to the second sub-cluster one by one, until all the neurons in the original neuron cluster have been assigned to sub-clusters.
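The important/non-important scheme (first sub-clusters seeded with one important neuron each, leftovers packed into second sub-clusters) can be sketched as below. The `degree` mapping, the `mem_of` callback, and the premise that any single neuron fits in one core are assumptions for illustration:

```python
def split_by_importance(neurons, degree, threshold, mem_of, core_mem):
    """Seed one first-sub-cluster per important neuron (degree above the
    preset threshold), fill each with non-important neurons until a core
    would overflow, then pack leftovers into second sub-clusters."""
    important = [n for n in neurons if degree[n] > threshold]
    others = [n for n in neurons if degree[n] <= threshold]
    subs = [[n] for n in important]     # one important neuron per seed
    remaining = list(others)
    for sub in subs:                    # fill each first sub-cluster
        while remaining:
            sub.append(remaining[0])
            if mem_of(sub) > core_mem:
                sub.pop()               # undo the overflowing add
                break
            remaining.pop(0)
    cur = []                            # second sub-clusters for leftovers
    while remaining:
        cur.append(remaining.pop(0))
        if mem_of(cur) > core_mem and len(cur) > 1:
            remaining.insert(0, cur.pop())
            subs.append(cur)
            cur = []
    if cur:
        subs.append(cur)
    return subs
```

With two important neurons (degree 5), four non-important neurons, a 10-unit per-neuron cost, and 20-unit cores, each first sub-cluster takes one non-important neuron and the last two form a second sub-cluster.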
In some embodiments, pre-allocating the plurality of target neuron clusters to the plurality of processing cores of the many-core system includes: for a current processing core among the plurality of processing cores of the many-core system, pre-allocating the currently remaining target neuron clusters to the current processing core one by one; each time a target neuron cluster is allocated to the current processing core, judging whether the current cluster memory size required to be occupied by all the target neuron clusters currently allocated to the current processing core is larger than the current available cluster memory size corresponding to the current processing core; if so, stopping allocating target neuron clusters to the current processing core and removing the currently added target neuron cluster from the current processing core; and taking the next processing core among the plurality of processing cores as the current processing core, and returning to execute the step of pre-allocating the currently remaining target neuron clusters to the current processing core one by one, until all the target neuron clusters have been pre-allocated to at least some of the processing cores.
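This core-by-core pre-allocation is essentially sequential first-fit packing. A hedged sketch, with the overflow behavior (a `MemoryError` when the cores run out) chosen here for illustration rather than taken from the description:

```python
def pre_allocate(cluster_mems, avail_cluster_mem, num_cores):
    """Fill the current core with target clusters until adding one would
    exceed the current available cluster memory, then move to the next
    core. Returns one list of cluster indices per used core."""
    cores, cur, used = [], [], 0
    for i, m in enumerate(cluster_mems):
        if cur and used + m > avail_cluster_mem:
            cores.append(cur)           # close the current core
            cur, used = [], 0
            if len(cores) == num_cores:
                raise MemoryError("not enough processing cores")
        cur.append(i)                   # the cluster lands on the new core
        used += m
    if cur:
        if len(cores) == num_cores:
            raise MemoryError("not enough processing cores")
        cores.append(cur)
    return cores
```

For instance, four 30-unit clusters with 50 units of available cluster memory per core occupy four cores, one cluster each.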
In some embodiments, for each of at least some of the processing cores, the pre-estimated routing memory size currently required to be occupied by all of the target neuron clusters pre-allocated in that processing core is the sum of the routing memory sizes required to be occupied by each of the target neuron clusters pre-allocated in that processing core; the size of the routing memory which is needed to be occupied by the target neuron cluster is the product of the number of source neuron clusters which correspond to the target neuron cluster and are preassigned on the same many-core chip with the target neuron cluster and the size of the basic routing memory, and the source neuron cluster which corresponds to the target neuron cluster is the target neuron cluster which has a connection relation with the neurons in the target neuron cluster and needs to transmit information to the neurons in the target neuron cluster during simulation.
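The per-core estimate counts, for each pre-allocated target cluster, only the source clusters pre-allocated on the same many-core chip. A minimal sketch, where `sources` and `chip_of` are hypothetical lookup tables:

```python
def estimated_route_mem(core_clusters, sources, chip_of, base_route_mem):
    """Sum, over the clusters on one core, the number of source clusters
    on the SAME chip times the basic routing memory size."""
    total = 0
    for c in core_clusters:
        same_chip = [s for s in sources[c] if chip_of[s] == chip_of[c]]
        total += len(same_chip) * base_route_mem
    return total
```

Cluster 0 below has two source clusters, but only one shares its chip, so the estimate counts a single base-route entry.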
In some embodiments, adjusting the current preset routing memory size includes: and adjusting the size of the current preset routing memory to be the sum of the size of the current preset routing memory and a preset value.
In some embodiments, if the difference between the predicted routing memory size corresponding to the processing core and the current preset routing memory size is smaller than 0, which indicates that the predicted routing memory size corresponding to the processing core is smaller than the current preset routing memory size, then no further adjustment is required to the current preset routing memory size.
In some embodiments, if the difference between the pre-estimated routing memory size corresponding to the processing core and the current preset routing memory size is greater than 0 and smaller than or equal to the initial preset routing memory size, the pre-estimated routing memory size exceeds the current preset routing memory size, but the excess is no larger than the initial preset routing memory size (i.e., the initial value); in this case, the predetermined value is the initial preset routing memory size, i.e., the initial value.
In some embodiments, if the difference between the pre-estimated routing memory size corresponding to the processing core and the current preset routing memory size is greater than the initial preset routing memory size, the pre-estimated routing memory size exceeds the current preset routing memory size by a relatively large amount; in order to reduce the number of iterative adjustments, the predetermined value is half of the difference between the pre-estimated routing memory size and the current preset routing memory size.
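The three cases above combine into one piecewise rule for the predetermined value. A sketch, assuming integer byte counts:

```python
def adjustment_step(overflow, init_route_mem):
    """Predetermined value added to the preset routing memory, given the
    overflow (pre-estimated minus current preset routing memory size)."""
    if overflow <= 0:
        return 0                 # already fits: no adjustment needed
    if overflow <= init_route_mem:
        return init_route_mem    # small excess: step by the initial value
    return overflow // 2         # large excess: step by half the difference
```

So an overflow of 7 against an initial value of 10 steps by 10, while an overflow of 40 steps by 20.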
It will be appreciated that the above method embodiments of the present disclosure may be combined with each other to form combined embodiments without departing from the principle and logic, which, for brevity, are not described again in the present disclosure. It will also be appreciated by those skilled in the art that, in the above methods of the embodiments, the specific execution order of the steps should be determined by their functions and possible inherent logic.
In addition, the present disclosure further provides a neuron cluster allocation device for network simulation, an electronic device, and a computer-readable storage medium, each of which can be used to implement the neuron cluster allocation method for network simulation provided by the present disclosure; for the corresponding technical solutions and descriptions, reference may be made to the corresponding descriptions of the method parts, which are not repeated herein.
Fig. 2 is a schematic structural diagram of a neuron cluster allocation device for network simulation according to an embodiment of the present disclosure.
As shown in fig. 2, the embodiment of the present disclosure further provides a neuron cluster allocation device 200 for network simulation, the neuron cluster allocation device 200 comprising:
an obtaining unit 201, configured to obtain a plurality of target neuron clusters to be allocated corresponding to the neural network to be simulated.
A pre-allocation unit 202 configured to pre-allocate a plurality of target neuron clusters to a plurality of processing cores of the many-core system, wherein at least a portion of the processing cores are each pre-allocated with at least one target neuron cluster, and a current cluster memory size occupied by all target neuron clusters pre-allocated to the processing cores is less than or equal to a current available cluster memory size corresponding to the processing cores, and the current available cluster memory size corresponding to the processing cores is equal to a difference value between the available memory size of the processing cores and a current preset routing memory size.
A computing unit 203 configured to obtain, for each of at least part of the processing cores, an estimated routing memory size currently required to be occupied by all target neuron clusters pre-allocated in the processing core.
And an iteration unit 204 configured to, when at least one of the at least some processing cores has a currently corresponding estimated routing memory size greater than the current preset routing memory size, adjust the current preset routing memory size, and return to perform the step of pre-allocating the plurality of target neuron clusters to the plurality of processing cores of the many-core system until each of the at least some processing cores has a currently corresponding estimated routing memory size less than or equal to the current preset routing memory size.
The neuron cluster allocation device 200 provided in the embodiments of the present disclosure is configured to implement the neuron cluster allocation method for network simulation provided in any of the above embodiments; for a specific description, reference may be made to the description of that method in the above embodiments, which is not repeated herein.
Fig. 3 is a block diagram of an electronic device according to an embodiment of the present disclosure, and referring to fig. 3, an embodiment of the present disclosure provides an electronic device, including: at least one processor 31; at least one memory 32, and one or more I/O interfaces 33 connected between the processor 31 and the memory 32; the memory 32 stores one or more computer programs executable by the at least one processor 31, and the one or more computer programs are executed by the at least one processor 31 to enable the at least one processor 31 to perform the above-described neuron cluster allocation method for network simulation.
The disclosed embodiments also provide a computer readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the above-described neuron cluster allocation method for network simulation. Wherein the computer readable storage medium may be a volatile or non-volatile computer readable storage medium.
Embodiments of the present disclosure also provide a computer program product comprising computer readable code, or a non-transitory computer readable storage medium carrying computer readable code, which when executed in a processor of an electronic device, performs the above-described neuron cluster allocation method for network simulation.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, functional modules/units in the apparatus, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer-readable storage media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable program instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, Random Access Memory (RAM), Read-Only Memory (ROM), Erasable Programmable Read-Only Memory (EPROM), Static Random Access Memory (SRAM), flash memory or other memory technology, portable Compact Disc Read-Only Memory (CD-ROM), Digital Versatile Discs (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer-readable program instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
Computer program instructions for performing the operations of the present disclosure can be assembly instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The computer-readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present disclosure are implemented by personalizing electronic circuitry, such as programmable logic circuitry, Field-Programmable Gate Arrays (FPGAs), or Programmable Logic Arrays (PLAs), with state information of computer-readable program instructions, and the electronic circuitry can execute the computer-readable program instructions.
The computer program product described herein may be embodied in hardware, software, or a combination thereof. In an alternative embodiment, the computer program product is embodied as a computer storage medium, and in another alternative embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK), or the like.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purpose of limitation. In some instances, it will be apparent to one skilled in the art that features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with other embodiments unless explicitly stated otherwise. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the disclosure as set forth in the appended claims.

Claims (14)

1. A method for assigning a cluster of neurons for network simulation, comprising:
acquiring a plurality of target neuron clusters to be distributed corresponding to a neural network to be simulated;
pre-distributing the target neuron clusters to a plurality of processing cores of a many-core system, wherein at least one target neuron cluster is pre-distributed to each processing core in at least part of the processing cores, the current cluster memory size occupied by all target neuron clusters pre-distributed to the processing cores is smaller than or equal to the current available cluster memory size corresponding to the processing cores, and the current available cluster memory size corresponding to the processing cores is equal to the difference value between the available memory size of the processing cores and the current preset routing memory size;
aiming at each processing core in at least part of the processing cores, obtaining the pre-estimated routing memory size currently occupied by all target neuron clusters pre-allocated in the processing cores;
when the size of the pre-estimated routing memory currently corresponding to at least one processing core in the at least part of processing cores is larger than the size of the current preset routing memory, the size of the current preset routing memory is adjusted, and the step of pre-distributing the target neuron clusters to the processing cores of the many-core system is performed in a returning mode until the size of the pre-estimated routing memory currently corresponding to each processing core in the at least part of processing cores is smaller than or equal to the size of the current preset routing memory.
2. The method of claim 1, wherein the obtaining a plurality of target neuron clusters to be assigned corresponding to the neural network to be simulated comprises:
acquiring a plurality of original neuron clusters corresponding to a neural network to be simulated;
judging whether the original neuron clusters meet a preset splitting condition or not according to each original neuron cluster;
splitting the original neuron clusters into a plurality of sub-clusters when the original neuron clusters meet a splitting condition;
and respectively determining the original neuron clusters and the sub-clusters which do not meet the splitting condition as the target neuron clusters.
3. The method of claim 2, wherein the splitting conditions comprise: the total memory size required to be occupied by the original neuron clusters is larger than the available memory size of a single processing core;
the total memory size occupied by the original neuron clusters is the sum of the cluster memory size occupied by the original neuron clusters and the total routing memory size occupied by the original neuron clusters, the total routing memory size occupied by the original neuron clusters is the product of the number of source neuron clusters corresponding to the original neuron clusters and the basic routing memory size, and the source neuron clusters corresponding to the original neuron clusters are the original neuron clusters which have connection relation with neurons in the original neuron clusters and need to transmit information to the neurons in the original neuron clusters during simulation.
4. The method of claim 2, wherein the splitting the original neuron cluster into a plurality of sub-clusters comprises:
constructing a current sub-cluster;
adding the neurons currently remained in the original neuron clusters to the current sub-cluster one by one;
when each neuron is added in the current sub-cluster, calculating the total memory size required to be occupied by the current sub-cluster;
stopping adding neurons to the current sub-cluster when the total memory size required to be occupied by the current sub-cluster is larger than the available memory size of a single processing core, and removing the currently added neurons from the current sub-cluster;
and constructing a new sub-cluster and taking the new sub-cluster as a current sub-cluster, and returning to execute the step of adding the current rest neurons in the original neuron cluster to the current sub-cluster one by one until the neurons in the original neuron cluster are distributed to all the sub-clusters.
5. The method of claim 2, wherein the splitting the original neuron cluster into a plurality of sub-clusters comprises:
sorting neurons in the original neuron clusters according to the sequence from the big to the small of the number of other neurons correspondingly connected with the neurons;
constructing a current sub-cluster;
adding the neurons currently remaining in the original neuron clusters to the current sub-cluster one by one according to the sequence in the sorting;
when each neuron is added in the current sub-cluster, calculating the total memory size required to be occupied by the current sub-cluster;
stopping adding neurons to the current sub-cluster when the total memory size required to be occupied by the current sub-cluster is larger than the available memory size of a single processing core, and removing the currently added neurons from the current sub-cluster;
and constructing a new sub-cluster and taking the new sub-cluster as a current sub-cluster, and returning to execute the step of adding the current rest neurons in the original neuron cluster to the current sub-cluster one by one according to the sequence in the sequence until the neurons in the original neuron cluster are distributed to all the sub-clusters.
6. The method of claim 2, wherein the splitting the original neuron cluster into a plurality of sub-clusters comprises:
for each neuron in the original neuron cluster, marking the neuron as an important neuron when the number of other neurons connected to it is larger than a preset number, and marking the neuron as a non-important neuron when that number is smaller than or equal to the preset number;
constructing a corresponding number of first sub-clusters according to the number of important neurons in the original neuron cluster, and adding one important neuron to each first sub-cluster;
traversing each first sub-cluster, and adding the non-important neurons in the original neuron cluster to the first sub-cluster one by one while the total memory size that the first sub-cluster needs to occupy is smaller than the available memory size of a single processing core;
each time a non-important neuron is added to the first sub-cluster, calculating the total memory size that the first sub-cluster needs to occupy;
and stopping adding non-important neurons to the first sub-cluster when the total memory size that the first sub-cluster needs to occupy is larger than the available memory size of a single processing core, and removing the most recently added non-important neuron from the first sub-cluster.
7. The method of claim 6, wherein, when unassigned non-important neurons remain in the original neuron cluster, the method further comprises:
constructing a second sub-cluster;
adding the currently remaining non-important neurons in the original neuron cluster to the second sub-cluster one by one;
each time a non-important neuron is added to the second sub-cluster, calculating the total memory size that the second sub-cluster needs to occupy;
stopping adding non-important neurons to the second sub-cluster when the total memory size that the second sub-cluster needs to occupy is larger than the available memory size of a single processing core, and removing the most recently added non-important neuron from the second sub-cluster;
and constructing a new second sub-cluster, and returning to the step of adding the currently remaining non-important neurons in the original neuron cluster to the second sub-cluster one by one, until all neurons in the original neuron cluster are assigned to sub-clusters.
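Claims 6 and 7 together describe a two-phase split: seed one sub-cluster per "important" (highly connected) neuron, greedily fill those with non-important neurons, then pack any leftovers into additional sub-clusters. A sketch under those assumptions (all names are illustrative, not from the patent):

```python
def split_by_importance(neurons, degree, mem_of, core_mem, threshold):
    """Phase 1: one first sub-cluster per important neuron, filled with
    non-important neurons up to the per-core memory budget.
    Phase 2: leftover non-important neurons go into second sub-clusters."""
    important = [n for n in neurons if degree[n] > threshold]
    remaining = [n for n in neurons if degree[n] <= threshold]
    subs = [[n] for n in important]            # seed first sub-clusters
    for sub in subs:                            # traverse each first sub-cluster
        used = sum(mem_of(n) for n in sub)
        while remaining and used + mem_of(remaining[0]) <= core_mem:
            used += mem_of(remaining[0])
            sub.append(remaining.pop(0))
    current, used = [], 0                       # pack leftovers (claim 7)
    for n in remaining:
        if current and used + mem_of(n) > core_mem:
            subs.append(current)
            current, used = [], 0
        current.append(n)
        used += mem_of(n)
    if current:
        subs.append(current)
    return subs
```

Seeding each first sub-cluster with exactly one important neuron keeps the heavily connected neurons apart, which tends to spread routing load across cores.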
8. The method of claim 1, wherein the pre-assigning the plurality of target neuron clusters to a plurality of processing cores of a many-core system comprises:
for a current processing core among the plurality of processing cores of the many-core system, pre-allocating the currently remaining target neuron clusters to the current processing core one by one;
each time a target neuron cluster is allocated to the current processing core, judging whether the current cluster memory size occupied by all target neuron clusters currently allocated to the current processing core is larger than the current available cluster memory size corresponding to the current processing core;
if the current cluster memory size occupied by all target neuron clusters currently allocated to the current processing core is larger than the current available cluster memory size corresponding to the current processing core, stopping allocating target neuron clusters to the current processing core, and removing the most recently added target neuron cluster from the current processing core;
and taking the next processing core among the plurality of processing cores of the many-core system as the current processing core, and returning to the step of pre-allocating the currently remaining target neuron clusters to the current processing core one by one, until all target neuron clusters are pre-allocated to at least part of the processing cores.
9. The method of claim 1, wherein the estimated routing memory size currently required by all target neuron clusters pre-allocated to a processing core is the sum of the routing memory sizes required by each target neuron cluster pre-allocated to that processing core;
the routing memory size required by a target neuron cluster is the product of the basic routing memory size and the number of source neuron clusters that correspond to the target neuron cluster and are pre-allocated on the same many-core chip as the target neuron cluster, wherein a source neuron cluster corresponding to a target neuron cluster is a target neuron cluster that has a connection relationship with neurons in that target neuron cluster and needs to transmit information to those neurons during simulation.
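Claim 9's estimate is, per core, a sum over its clusters of (same-chip source clusters) x (basic routing memory size). A sketch of that computation (the lookups `sources_of` and `chip_of` are illustrative, not from the patent):

```python
def estimated_routing_mem(core_clusters, sources_of, chip_of, base_route_mem):
    """Estimated routing memory of one core: for each pre-allocated
    cluster, count its source clusters placed on the same many-core
    chip and multiply by the basic routing memory size, then sum."""
    total = 0
    for cluster in core_clusters:
        same_chip_sources = sum(
            1 for src in sources_of(cluster) if chip_of(src) == chip_of(cluster)
        )
        total += same_chip_sources * base_route_mem
    return total
```

Only same-chip sources count because, per the claim, cross-chip traffic is routed differently and does not consume this per-core routing memory.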
10. The method of claim 1, wherein the adjusting the current preset routing memory size comprises:
and adjusting the current preset routing memory size to the sum of the current preset routing memory size and a predetermined value.
11. The method of claim 10, wherein, if the difference between the estimated routing memory size corresponding to the processing core and the current preset routing memory size is greater than 0 and less than or equal to the initial preset routing memory size, the predetermined value is the initial preset routing memory size;
and if the difference between the estimated routing memory size corresponding to the processing core and the current preset routing memory size is greater than the initial preset routing memory size, the predetermined value is half of that difference.
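The adjustment rule of claims 10 and 11 can be sketched as follows (variable names are illustrative; the patent does not fix an API):

```python
def adjust_preset(preset, estimated, initial_preset):
    """Grow the preset routing memory: by the initial preset size for
    small overshoots (0 < diff <= initial), or by half the overshoot
    for large ones (diff > initial). No overshoot means no change."""
    diff = estimated - preset
    if 0 < diff <= initial_preset:
        step = initial_preset
    elif diff > initial_preset:
        step = diff / 2
    else:
        return preset   # estimate fits: iteration can stop
    return preset + step
```

Taking half the overshoot for large gaps makes the preset converge in a few iterations instead of creeping up by a fixed increment each round.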
12. A neuron cluster allocation device for network simulation, comprising:
an acquisition unit configured to acquire a plurality of target neuron clusters to be allocated corresponding to a neural network to be simulated;
a pre-allocation unit configured to pre-allocate the plurality of target neuron clusters to a plurality of processing cores of a many-core system, wherein at least one target neuron cluster is pre-allocated to each of at least part of the processing cores, the current cluster memory size occupied by all target neuron clusters pre-allocated to a processing core is smaller than or equal to the current available cluster memory size corresponding to that processing core, and the current available cluster memory size corresponding to a processing core is equal to the difference between the available memory size of the processing core and the current preset routing memory size;
a computing unit configured to acquire, for each processing core in the at least part of the processing cores, the estimated routing memory size currently required by all target neuron clusters pre-allocated to the processing core;
and an iteration unit configured to adjust the current preset routing memory size when the estimated routing memory size currently corresponding to at least one processing core in the at least part of the processing cores is larger than the current preset routing memory size, and to return to the step of pre-allocating the plurality of target neuron clusters to the plurality of processing cores of the many-core system, until the estimated routing memory size currently corresponding to each processing core in the at least part of the processing cores is smaller than or equal to the current preset routing memory size.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores one or more computer programs executable by the at least one processor to enable the at least one processor to perform the neuron cluster allocation method according to any one of claims 1-11.
14. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the neuron cluster allocation method according to any one of claims 1-11.
CN202310816481.2A 2023-07-04 2023-07-04 Neuron cluster distribution method and device for network simulation Pending CN116861972A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310816481.2A CN116861972A (en) 2023-07-04 2023-07-04 Neuron cluster distribution method and device for network simulation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310816481.2A CN116861972A (en) 2023-07-04 2023-07-04 Neuron cluster distribution method and device for network simulation

Publications (1)

Publication Number Publication Date
CN116861972A true CN116861972A (en) 2023-10-10

Family

ID=88229780

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310816481.2A Pending CN116861972A (en) 2023-07-04 2023-07-04 Neuron cluster distribution method and device for network simulation

Country Status (1)

Country Link
CN (1) CN116861972A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117194051A (en) * 2023-11-01 2023-12-08 北京灵汐科技有限公司 Brain simulation processing method and device, electronic equipment and computer readable storage medium
CN117194051B (en) * 2023-11-01 2024-01-23 北京灵汐科技有限公司 Brain simulation processing method and device, electronic equipment and computer readable storage medium
CN117648956A (en) * 2024-01-29 2024-03-05 之江实验室 Resource mapping method, device and storage medium for impulse neural network model

Similar Documents

Publication Publication Date Title
CN116861972A (en) Neuron cluster distribution method and device for network simulation
US11630706B2 (en) Adaptive limited-duration edge resource management
US11379341B2 (en) Machine learning system for workload failover in a converged infrastructure
CN111966500B (en) Resource scheduling method and device, electronic equipment and storage medium
CN108337109B (en) Resource allocation method and device and resource allocation system
US20110202925A1 (en) Optimized capacity planning
WO2016205978A1 (en) Techniques for virtual machine migration
CN115220918A (en) Memory allocation method and device for neural network
EP3944082A1 (en) Extending the kubernetes api in-process
US20130219022A1 (en) Hypothetical policy and event evaluation
US20080256223A1 (en) Scale across in a grid computing environment
CN105808328A (en) Task scheduling method, device and system
US20110196908A1 (en) Optimized capacity planning
WO2017106997A1 (en) Techniques for co-migration of virtual machines
CN112148468B (en) Resource scheduling method and device, electronic equipment and storage medium
CN113946431B (en) Resource scheduling method, system, medium and computing device
US20170026305A1 (en) System to place virtual machines onto servers based upon backup runtime constraints
US10031781B2 (en) Estimating job start times on workload management systems
CN114841322A (en) Processing method and processing device of neural network computation graph
CN114424214A (en) Hybrid data-model parallelism for efficient deep learning
CN114970814A (en) Processing method and processing device of neural network computation graph
CN114330735A (en) Method, electronic device and computer program product for processing machine learning model
WO2022111456A1 (en) Core sharing method and apparatus based on many-core system, electronic device, and medium
JP6938944B2 (en) Information processing device and load balancing control method
CN116134419A (en) Runtime environment determination of software containers

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination