CN112242963B - Rapid high-concurrency neural pulse data packet distribution and transmission method and system - Google Patents

Info

Publication number
CN112242963B
Authority
CN
China
Prior art keywords
neuron
data packet
target
core
pulse data
Prior art date
Legal status
Active
Application number
CN202011096640.9A
Other languages
Chinese (zh)
Other versions
CN112242963A (en)
Inventor
杨培超
刘怡俊
林文杰
叶武剑
陈靖宇
Current Assignee
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date
Application filed by Guangdong University of Technology
Priority to CN202011096640.9A
Publication of CN112242963A
Application granted
Publication of CN112242963B
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 49/00: Packet switching elements
    • H04L 49/90: Buffering arrangements
    • H04L 49/9005: Buffering arrangements using dynamic buffer space allocation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/049: Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs

Abstract

The invention discloses a rapid high-concurrency neural pulse data packet distribution and transmission method applied to a computing node. When the computing node receives a neural pulse data packet, the FPGA module sends the packet to the target CPU module identified by the CPU number carried in the packet. Within the target CPU module, the routing core determines the target computing core from the computing core number in the packet and a buffer data distribution table, and forwards the packet accordingly. When the target computing core receives the packet, it determines the first target neuron from a neuron data distribution table and the neuron number in the packet, delivers the packet, and thereby activates the first target neuron. Routing transmission is thus performed rapidly and flexibly according to the neural pulses themselves, improving resource utilization efficiency while requiring no hardware redesign and reducing implementation cost.

Description

Rapid high-concurrency neural pulse data packet distribution and transmission method and system
Technical Field
The invention relates to the technical field of data transmission, in particular to a rapid high-concurrency nerve pulse data packet distribution and transmission method and system.
Background
A spiking neural network (SNN) transmits information between neurons in the form of pulses. It is the artificial neural network closest to biological neural networks and is commonly known as the third generation of artificial neural networks. It simulates the working principle of human brain neurons: when a neuron's membrane potential exceeds a response threshold, a pulse event occurs and the generated pulse information is transmitted to the neurons connected to it; after each neuron receives a pulse, it updates its state according to a kinetic equation, producing the next round of neuron behavior. The fidelity of an SNN is typically measured by the simulation of neuron behavior (the computation part) and the transmission of neural pulses (the communication part). The main feature of the communication part is that information is represented between neurons by the timing and frequency of pulses. The computation part, meanwhile, is limited by bottlenecks of computer architecture such as the operating frequency and the small high-speed storage capacity of a single processor; if a brain-like computer is to support real-time SNN simulation of hundreds of millions of neurons, a distributed cluster environment must be built, with a matched communication scheme to accelerate parallel operation. In the multi-level node-CPU-core-neuron architecture, designing a routing mechanism that supports efficient, low-delay transceiving of massive pulse data packets is therefore a critical link.
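The neuron update rule described above (integrate incoming pulses, fire a pulse event when the membrane potential crosses the response threshold, then reset) can be illustrated with a minimal leaky integrate-and-fire sketch. The patent does not specify the kinetic equation; the leak model, constants, and names below are purely illustrative:

```python
def lif_step(v, input_current, v_rest=0.0, v_threshold=1.0, leak=0.9):
    """One update of an illustrative leaky integrate-and-fire neuron.

    Returns the new membrane potential and whether a pulse fired.
    The leak factor and threshold are assumed values, not from the patent.
    """
    v = v_rest + leak * (v - v_rest) + input_current  # kinetic update
    if v >= v_threshold:
        return v_rest, True   # pulse event: fire and reset to rest
    return v, False

# A constant sub-threshold input accumulates until a pulse fires.
v, fired = 0.0, False
steps = 0
while not fired:
    v, fired = lif_step(v, input_current=0.2)
    steps += 1
```

After firing, the generated pulse would be packed into a neural pulse data packet and handed to the routing mechanism described below.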
In the prior art, dedicated links are generally designed between nodes for transmission, or one-to-many transmission is performed in a multicast or cluster manner; each function in a computing node is split into co-processing modules, and generated pulse data packets are sent to the routing communication unit in the node over a data bus. However, in these approaches the routing mechanism is usually designed from a hardware perspective, so it is not very extensible, its resource utilization is low, it is difficult to optimize flexibly according to the characteristics of the pulses, and its implementation cost is high.
Disclosure of Invention
The invention provides a rapid high-concurrency neural pulse data packet distribution and transmission method and system, solving the technical problems of the prior art, in which a data routing mechanism designed in hardware has poor expandability and low resource utilization, is difficult to optimize flexibly according to the characteristics of pulses, and is costly to implement.
The invention provides a rapid high-concurrency neural pulse data packet distribution and transmission method which is applied to a computing node, wherein the computing node comprises an FPGA module and a plurality of CPU modules, each CPU module comprises a routing core and a plurality of computing cores, each computing core comprises at least one neuron, and the method comprises the following steps:
when the computing node receives a nerve pulse data packet, the FPGA module sends the nerve pulse data packet to a target CPU module corresponding to the CPU number according to the CPU number in the nerve pulse data packet;
when the target CPU module receives the nerve pulse data packet, sending the nerve pulse data packet to a target computing core corresponding to the computing core number through the routing core according to a preset buffer area data distribution table and the computing core number in the nerve pulse data packet;
when the target computing core receives the nerve pulse data packet, sending the nerve pulse data packet to a first target neuron corresponding to the neuron number through the target computing core according to a preset neuron data distribution table and the neuron number in the nerve pulse data packet;
activating, by the target computing core, the first target neuron with the neural pulse data packet.
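The four steps above amount to a three-stage table-driven dispatch keyed on the (CPU number, compute core number, neuron number) fields carried in the packet. A minimal sketch of that flow follows; the packet fields echo the description, but the node layout and all function names are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class SpikePacket:
    cpu_id: int      # number of the target CPU module
    core_id: int     # number of the target compute core within the CPU
    neuron_id: int   # number of the target neuron within the core
    weight: float    # pulse strength

# Hypothetical node layout: node[cpu_id][core_id] -> set of neuron ids.
node = {
    0: {0: {0, 1}, 1: {2, 3}},
    1: {0: {4, 5}, 1: {6, 7}},
}

activated = []  # records (neuron_id, weight) activations

def fpga_dispatch(pkt):
    """Stage 1: the FPGA forwards the packet to the CPU named in it."""
    routing_core_dispatch(node[pkt.cpu_id], pkt)

def routing_core_dispatch(cpu, pkt):
    """Stage 2: the routing core forwards to the target compute core."""
    compute_core_deliver(cpu[pkt.core_id], pkt)

def compute_core_deliver(core_neurons, pkt):
    """Stage 3: the compute core activates the first target neuron."""
    if pkt.neuron_id in core_neurons:
        activated.append((pkt.neuron_id, pkt.weight))

fpga_dispatch(SpikePacket(cpu_id=1, core_id=1, neuron_id=6, weight=0.5))
```

Each stage inspects only its own field of the packet, which is what keeps the dispatch fast under high concurrency.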
Optionally, the method further comprises:
generating a new neural impulse data packet by the first target neuron and returning to the target computational core;
reselecting a second target neuron by the target computing core according to the new neural pulse data packet and a preset neuron data sending table;
sending, by the target computing core, the new neural impulse data packet to the second target neuron.
Optionally, the CPU module further includes a communication core, the routing core is provided with a receiving buffer, and when the target CPU module receives the neural pulse data packet, the step of sending the neural pulse data packet to a target computing core corresponding to the computing core number through the routing core according to a preset buffer data distribution table and the computing core number in the neural pulse data packet includes:
when the target CPU module receives the nerve pulse data packet, storing the nerve pulse data packet to the receiving buffer area through the communication core;
searching a calculation core number in the nerve pulse data packet in a preset buffer data distribution table through the routing core;
and sending the nerve pulse data packet in the receiving buffer area to a target computing core corresponding to the computing core number through the routing core.
Optionally, the computing node has hardware configuration information and neuron scale information, and before the step of receiving the neural impulse data packet by the computing node, the method further comprises:
according to the hardware configuration information, the neuron scale information and a preset adjacency matrix, carrying out hierarchical division on a plurality of neurons, and determining the neuron number of each neuron;
performing column traversal operation on the adjacency matrix by taking a computing core as a unit, determining the computing core to which each neuron belongs, and constructing a buffer area data distribution table;
performing column traversal operation on the adjacency matrix by taking the neuron as a unit, determining the neurons receiving the nerve pulse data packet, and constructing a neuron data distribution table; and
performing row traversal operation on the adjacency matrix by taking the neuron as a unit, determining the neurons receiving the new nerve pulse data packet, and constructing a neuron data sending table.
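The table-construction steps above (hierarchical numbering, then traversing the adjacency matrix to derive routing tables) can be sketched as follows. The flat dictionaries stand in for the cascaded lists described later in the embodiment, and the matrix values, core size, and names are all illustrative:

```python
import numpy as np

# Adjacency matrix W: W[src, dst] is the weight of the pulse sent from
# neuron `src` to neuron `dst` (0.0 means no connection).
W = np.array([
    [0.0, 0.2, 0.0, 0.7],
    [0.0, 0.0, 0.5, 0.0],
    [0.3, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
])

neurons_per_core = 2  # hierarchical division: assumed 2 neurons per core

def core_of(nid):
    """Compute core to which a (renumbered) neuron belongs."""
    return nid // neurons_per_core

# Buffer data distribution table analogue: for each source neuron, the
# compute cores its pulse must reach.
buffer_table = {
    src: sorted({core_of(int(dst)) for dst in np.nonzero(W[src])[0]})
    for src in range(W.shape[0])
}

# Neuron data distribution table analogue: for each source neuron, the
# target neurons and the weights to apply on reception.
distribution_table = {
    src: [(int(dst), float(W[src, dst])) for dst in np.nonzero(W[src])[0]]
    for src in range(W.shape[0])
}
```

A neuron data sending table would be derived the same way per compute core, keeping only the targets hosted on that core.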
Optionally, the routing core is further provided with a sending buffer, and the step of sending the new neural impulse packet to the second target neuron through the target computing core includes:
sending, by the target computing core, the new neural impulse data packet to the second target neuron when the second target neuron is in the target computing core;
when the second target neuron is not in the target computing core, saving the new neural impulse data packet to the sending buffer through the target computing core;
sending, by the communication core, the new neural impulse data packet to the FPGA module;
sending, by the FPGA module, the new neural impulse data packet to the second target neuron.
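The branch above (deliver directly when the second target neuron is local to the compute core, otherwise stage the new packet in the routing core's sending buffer for the communication core and FPGA to forward) can be sketched as follows; the buffer shapes and neuron sets are illustrative:

```python
send_buffer = []      # routing core's sending buffer, drained by the comm core
local_delivered = []  # pulses applied inside the current compute core

local_neurons = {4, 5}  # hypothetical neurons hosted by this compute core

def send_new_packet(target_neuron, packet):
    """Deliver locally if possible; otherwise queue for off-core routing."""
    if target_neuron in local_neurons:
        local_delivered.append((target_neuron, packet))
    else:
        send_buffer.append((target_neuron, packet))  # to FPGA via comm core

send_new_packet(4, "pulse-a")   # local: applied directly
send_new_packet(9, "pulse-b")   # remote: staged in the sending buffer
```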
The invention also provides a rapid high-concurrency neural pulse data packet distribution and transmission system, which comprises a computing node, wherein the computing node comprises an FPGA module and a plurality of CPU modules, each CPU module comprises a routing core and a plurality of computing cores, and each computing core comprises at least one neuron;
the FPGA module comprises:
the target CPU data sending submodule is used for sending the nerve pulse data packet to a target CPU module corresponding to a CPU number according to the CPU number in the nerve pulse data packet when the computing node receives the nerve pulse data packet;
the routing core includes:
the target calculation core data sending submodule is used for sending the nerve pulse data packet to a target calculation core corresponding to the calculation core number according to a preset buffer area data distribution table and the calculation core number in the nerve pulse data packet when the target CPU module receives the nerve pulse data packet;
the computing core includes:
the target neuron data sending submodule is used for sending the nerve pulse data packet to a first target neuron corresponding to the neuron number according to a preset neuron data distribution table and the neuron number in the nerve pulse data packet when the target computing core receives the nerve pulse data packet;
an activation sub-module for activating the first target neuron using the neural pulse data packet.
Optionally, the first target neuron comprises:
the data packet returning submodule is used for generating a new nerve pulse data packet and returning the new nerve pulse data packet to the target computing core;
the computing core further comprises:
the neuron reselection submodule is used for reselecting a second target neuron according to the new nerve pulse data packet and a preset neuron data sending table;
and the data packet sending submodule sends the new nerve pulse data packet to the second target neuron through the target computing core.
Optionally, the CPU module further includes a communication core, and the routing core is provided with a receiving buffer;
the communication core includes:
the data packet storage submodule is used for storing the neural pulse data packet to the receiving buffer area when the target CPU module receives the neural pulse data packet;
the target computing kernel data sending submodule comprises:
the target calculation core determining unit is used for retrieving the calculation core number in the nerve pulse data packet in a preset buffer area data distribution table;
and the calculation core data sending unit is used for sending the nerve pulse data packet in the receiving buffer area to a target calculation core corresponding to the calculation core number.
Optionally, the compute node has hardware configuration information and neuron scale information, and the system further includes:
the neuron number determining module is used for performing hierarchical division on a plurality of neurons according to the hardware configuration information, the neuron scale information and a preset adjacency matrix to determine the neuron number of each neuron;
the buffer area data distribution table construction module is used for executing a column traversal operation on the adjacency matrix by taking a computing core as a unit, determining the computing core to which each neuron belongs, and constructing a buffer area data distribution table;
a neuron data distribution table construction module, configured to perform a column traversal operation on the adjacency matrix by taking a neuron as a unit, determine the neurons receiving the nerve pulse data packet, and construct a neuron data distribution table;
and the neuron data sending table building module is used for executing a row traversal operation on the adjacency matrix by taking neurons as units, determining the neurons receiving the new nerve pulse data packets, and building a neuron data sending table.
Optionally, the routing core is further provided with a sending buffer, and the data packet sending sub-module includes:
a data packet sending unit, configured to send the new neural impulse data packet to the second target neuron when the second target neuron is in the target computational core;
a saving unit, configured to save the new neural pulse data packet to the sending buffer when the second target neuron is not in the target computational core;
the communication core is used for sending the new nerve pulse data packet to the FPGA module;
the FPGA module is used for sending the new nerve pulse data packet to the second target neuron.
According to the technical scheme, the invention has the following advantages:
when the computing node receives the neural pulse data packet, the FPGA module sends the packet to the target CPU module based on the CPU number carried in the packet. Within the target CPU module, the routing core determines the target computing core from the computing core number in the packet and the buffer data distribution table, and forwards the packet. When the target computing core receives the packet, it determines the first target neuron from the neuron data distribution table and the neuron number, delivers the packet, and thereby activates the first target neuron. Rapid routing transmission is thus performed flexibly according to the neural pulses, improving resource utilization efficiency while requiring no hardware redesign and reducing implementation cost.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from these drawings without inventive labor.
Fig. 1 is a flowchart illustrating steps of a method for fast and highly concurrent nerve impulse packet distribution and transmission according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a hierarchical structure of a compute node according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating steps of a method for fast packet distribution and delivery of high-concurrency spikes according to an alternative embodiment of the present invention;
fig. 4 is a flowchart of steps of a routing table construction process according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an adjacency matrix according to an embodiment of the present invention;
fig. 6 is a schematic diagram of a buffer data distribution table C according to an embodiment of the present invention;
FIG. 7 is a diagram of a neuron data distribution table B according to an embodiment of the present invention;
FIG. 8 is a diagram of a neuron data transmission table A according to an embodiment of the present invention;
FIG. 9 is a diagram illustrating an encoding format of a neural impulse data packet according to an embodiment of the present invention;
fig. 10 is a schematic routing diagram of a fast high-concurrency neural pulse data packet distribution and transmission method according to an embodiment of the present invention;
fig. 11 is a block diagram of a fast high-concurrency neural pulse data packet distribution and transmission system according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention provide a rapid high-concurrency neural pulse data packet distribution and transmission method and system, used to solve the technical problems of the prior art, in which a hardware-based data routing mechanism leads to poor expandability, low resource utilization, difficulty in flexible optimization according to the characteristics of pulses, and high implementation cost.
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the embodiments described below are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a flowchart illustrating a method for fast and highly concurrent nerve impulse packet distribution and transmission according to an embodiment of the present invention.
The invention provides a rapid high-concurrency neural pulse data packet distribution and transmission method which is applied to a computing node, wherein the computing node comprises an FPGA module and a plurality of CPU modules, each CPU module comprises a routing core and a plurality of computing cores, each computing core comprises at least one neuron, and the method comprises the following steps:
step 101, when the computing node receives a nerve pulse data packet, the FPGA module sends the nerve pulse data packet to a target CPU module corresponding to a CPU number according to the CPU number in the nerve pulse data packet;
referring to fig. 2, it can be seen that each compute node is composed of an FPGA and a plurality of CPUs, and the CPUs have cores divided into different functions, including a routing core for distributing pulses in the CPUs, a plurality of compute cores responsible for neuron computation, a system core responsible for resource scheduling, and a communication core responsible for CPU pulse transceiving. And the number of CPUs which can exist in one FPGA is often less, so that when the computing node receives the nerve pulse data packet, the CPU number carried in the nerve pulse data packet can be detected through the FPGA module so as to determine a target CPU module which needs to be sent and distribute the target CPU module to the corresponding target CPU module.
102, when the target CPU module receives the neural pulse data packet, sending the neural pulse data packet to a target computation core corresponding to the computation core number through the routing core according to a preset buffer data distribution table and the computation core number in the neural pulse data packet;
103, when the target computing core receives the nerve pulse data packet, sending the nerve pulse data packet to a first target neuron corresponding to the neuron number through the target computing core according to a preset neuron data distribution table and the neuron number in the nerve pulse data packet;
step 104, activating the first target neuron by the target computational core using the neural pulse data packet.
In the embodiment of the invention, when a computing node receives a nerve pulse data packet, the FPGA module sends the packet to the target CPU module based on the CPU number carried in the packet. Within the target CPU module, the routing core determines the target computing core from the computing core number in the packet and the buffer data distribution table, and forwards the packet. When the target computing core receives the packet, it determines the first target neuron from the neuron data distribution table and the neuron number, delivers the packet, and thereby activates the first target neuron. Rapid routing transmission is thus performed flexibly according to the neural pulses, improving resource utilization efficiency while requiring no hardware redesign and reducing implementation cost.
Referring to fig. 3, fig. 3 is a flowchart illustrating a method for fast and highly concurrent nerve impulse packet distribution according to an alternative embodiment of the present invention.
The invention provides a rapid high-concurrency neural pulse data packet distribution and transmission method which is applied to a computing node, wherein the computing node comprises an FPGA module and a plurality of CPU modules, each CPU module comprises a routing core and a plurality of computing cores, each computing core comprises at least one neuron, and the method comprises the following steps:
step 301, when the computing node receives a neural pulse data packet, sending the neural pulse data packet to a target CPU module corresponding to a CPU number through the FPGA module according to the CPU number in the neural pulse data packet;
referring to fig. 4, fig. 4 is a flow chart showing steps of a routing table construction process in the embodiment of the present invention, where the computing node has hardware configuration information and neuron scale information, and the method further includes the following steps S11-S14:
step S11, according to the hardware configuration information, the neuron scale information and a preset adjacency matrix, carrying out hierarchical division on a plurality of neurons, and determining the neuron number of each neuron;
step S12, executing a column traversal operation on the adjacency matrix by taking a computing core as a unit, determining the computing core to which each neuron belongs, and constructing a buffer area data distribution table;
step S13, performing a column traversal operation on the adjacency matrix with the neuron as a unit, determining the neurons that receive the nerve pulse data packet, and constructing a neuron data distribution table;
Step S14, performing a row traversal operation on the adjacency matrix by taking a neuron as a unit, determining a neuron that receives the new nerve pulse data packet, and constructing a neuron data transmission table.
Referring to fig. 5, fig. 5 is a schematic diagram of the structure of an adjacency matrix in an embodiment of the present invention. The neurons numbered on the left send out nerve pulses, the neurons numbered along the top receive them, and each intersection holds the weight of the pulse sent between the two. The adjacency matrix covers a CPU0 and a CPU1, each of which contains compute cores Core0 and Core1.
The hardware configuration information gives the number of CPUs and compute cores available to the neural pulse data packets; the neuron scale information gives the total number of neurons. Hierarchical division by "CPU - compute core - neuron" decides how many compute cores are set in each CPU and how many neurons are set in each compute core, so that all neurons are assigned to different compute cores and each neuron is renumbered. The adjacency matrix contains information in both directions, receiving pulses and sending pulses. A traversal operation can be performed on the adjacency matrix with the compute core as the unit to determine the compute core to which each neuron belongs, i.e., the compute core on which a neuron's pulse data packet acts; the information is then organized hierarchically and counted by length, entry, and target compute core number to construct the buffer data distribution table. A column traversal operation with the neuron as the unit collects which neurons within a compute core each neuron influences; the information is arranged hierarchically and counted by length, entry, target neuron number, and weight to construct the neuron data distribution table. A row traversal operation with the neuron as the unit collects the pulses generated by the compute core that need to stay local, counting which neurons of the core are influenced by each of the core's own neurons; the information is again organized by length, entry, target neuron number, and weight to construct the neuron data sending table.
In the embodiment of the invention, in order to determine the relations among neurons, compute cores, and CPU modules before a nerve pulse data packet is received, several routing tables are designed: a buffer data distribution table, one per CPU; a neuron data distribution table, one per compute core; and a neuron data sending table, one per compute core. Because the number of neurons is usually large, a single flat list would be too long and each query traversal would take too long; the information of each routing table is therefore reorganized and split into a multi-level cascaded list, reducing traversal time and improving efficiency. The specific design details of each table are as follows:
referring to fig. 6, fig. 6 is a schematic diagram of a buffer data distribution table C in an embodiment of the present invention, where recorded information needs to be distributed to specific computing cores after each pulse is sent to the CPU, where the table C is composed of two-level concatenation lists C1 and C2. The row index of the table C1 is the number nid of the neuron in the core in the pulse packet, and two pieces of information, namely the row index entry st of the table C2 of the next stage and the length dr of the search row, are obtained, the length means the number of neurons affected by a certain pulse to the local, the row index of the table C2 is determined by taking the entry of the table C1 and the number Coreid of the core in the pulse packet as an offset, and the information mask CoreMask of the core which needs to send the pulse to the CPU is obtained.
Referring to fig. 7, fig. 7 shows the neuron data distribution table B in an embodiment of the present invention. It records the specific neurons to which weights must be applied after a compute core receives a pulse, together with the corresponding weight values. Table B is composed of a three-level cascade of lists B1, B2, and B3. The row index of table B1 is the in-core neuron number Snid in the pulse packet; each row yields the entry st (the row index into the next-level table B2) and the length dr. The row index of table B2 is formed by combining the entry from B1 with the core number ccid in the pulse packet as an offset; each row stores the entry st and length dr of the next-level table B3. The row index of table B3 is determined by B2; each row records the destination neuron Dnid on which this pulse acts and the weight w, which represents the pulse strength of the neural pulse data packet.
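Table B's three-level cascade resolves a received pulse to its (target neuron, weight) pairs. A minimal sketch under the same layout assumptions as before, with illustrative row contents:

```python
# B1[Snid] -> (st, dr): entry row and row count in B2.
B1 = [(0, 2)]
# B2[st + ccid] -> (st, dr): entry row and row count in B3.
B2 = [(0, 1), (1, 2)]
# B3 rows: (Dnid, w) -- destination neuron and synaptic weight.
B3 = [(3, 0.4), (5, 0.7), (6, 0.1)]

def resolve_targets(snid, ccid):
    """B1 -> B2 -> B3 cascade: pulse source to weighted target neurons."""
    st1, dr1 = B1[snid]
    st2, dr2 = B2[st1 + ccid]
    return B3[st2:st2 + dr2]

targets = resolve_targets(snid=0, ccid=1)
```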
Referring to fig. 8, fig. 8 shows the neuron data sending table A in an embodiment of the present invention. It records the influence of the pulses generated after neuron activation on certain compute cores, i.e., like table B, it records target neuron numbers and weights. Since the pulses handled by a compute core's sending table come only from that core itself, the core number in the pulse packet need not be queried, and table A is composed of a two-level cascade of lists A1 and A2. The row index of table A1 is the in-core neuron number nid in the pulse packet; each row yields the entry st (the row index into the next-level table A2) and the length dr. The row index of table A2 is determined by A1; each row records the destination neuron on which this pulse acts and the weight. Table A1 also records the in-core neuron number nid and a sending mask SendMask: if the first bit of the mask is 1, the pulse needs to be sent locally; if it is zero, the pulse needs to be sent to other destinations. If a pulse is recorded in table A1 with length 0, i.e., it is not sent locally at all, it is forwarded to the sending buffer of the routing core and sent on to other destinations.
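Table A adds a SendMask that decides whether a locally generated pulse is applied in this core, forwarded elsewhere, or both. A sketch, taking the mask convention from the description (first bit set means "apply locally", zero-length rows mean "forward only") and inventing the rest:

```python
# A1[nid] -> (st, dr, send_mask). dr == 0 means no local targets at all.
A1 = [
    (0, 2, 0b1),   # nid 0: two local targets, bit 0 set -> apply locally
    (0, 0, 0b10),  # nid 1: no local rows -> forward via sending buffer
]
# A2 rows: (Dnid, w) -- local destination neuron and weight.
A2 = [(2, 0.6), (3, 0.2)]

send_buffer = []  # routing core's sending buffer for off-core pulses

def route_generated_pulse(nid):
    """Split a freshly generated pulse into local targets and forwards."""
    st, dr, mask = A1[nid]
    local = A2[st:st + dr] if mask & 0b1 else []
    if dr == 0 or mask & ~0b1:   # other destinations exist
        send_buffer.append(nid)  # hand off to the routing core
    return local

local_targets = route_generated_pulse(0)   # applied in this core
route_generated_pulse(1)                   # forwarded off-core
```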
Step 302, when the target CPU module receives the neural pulse data packet, sending the neural pulse data packet to a target computation core corresponding to the computation core number through the routing core according to a preset buffer data distribution table and the computation core number in the neural pulse data packet;
in an example of the present invention, the CPU module further includes a communication core, the routing core is provided with a receiving buffer, and the step 302 may include the following sub-steps:
when the target CPU module receives the neural pulse data packet, storing the neural pulse data packet in the receiving buffer through the communication core;
looking up, through the routing core, the computing core number carried in the neural pulse data packet in the preset buffer data distribution table;
and sending the neural pulse data packet in the receiving buffer to the target computing core corresponding to the computing core number through the routing core.
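The routing core's dispatch sub-steps can be sketched as a simple table lookup followed by a hand-off into the target core's queue. The field names and the use of plain Python queues are illustrative assumptions standing in for the hardware buffers.

```python
# Minimal sketch of the routing core's dispatch step. The buffer data
# distribution table maps a computing core number to that core's
# receiving buffer (here a deque standing in for a hardware queue).

from collections import deque

def route_packet(packet, buffer_table):
    """packet: dict with 'core' and 'neuron' fields set by the sender."""
    core_buffer = buffer_table[packet["core"]]  # table lookup by core number
    core_buffer.append(packet)                  # hand off to the target core

buffer_table = {0: deque(), 1: deque()}
route_packet({"core": 1, "neuron": 42}, buffer_table)
print(len(buffer_table[1]))  # -> 1
```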
Step 303, when the target computing core receives the neural pulse data packet, sending the neural pulse data packet to a first target neuron corresponding to the neuron number through the target computing core according to a preset neuron data distribution table and the neuron number in the neural pulse data packet;
step 304, activating, by the target computing core, the first target neuron with the neural pulse data packet.
In the embodiment of the present invention, since the computing core is responsible for updating the behavior of specific neurons, it is the final producer and consumer of pulse data. Therefore, before the target computing core updates the first target neuron, the target computing core needs to apply the neural pulse data packet to that specific neuron, so as to activate the first target neuron.
After the first target neuron is activated, that is, after the update operation, the target computing core further needs to send any newly generated neural pulse data packet to other neurons of the current computing core or to neurons of other computing cores.
Further, the method further comprises the following steps 305-307:
step 305, generating a new neural pulse data packet by the first target neuron and returning it to the target computing core;
step 306, reselecting a second target neuron by the target computing core according to the new neural pulse data packet and a preset neuron data sending table;
step 307, sending the new neural pulse data packet to the second target neuron by the target computing core.
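Steps 305 to 307 can be sketched as follows. The function name, the dict-based packet, and the send-table shape are illustrative assumptions, not structures fixed by the patent.

```python
# Sketch of steps 305-307: an activated neuron emits a new pulse packet,
# and the computing core consults its neuron data sending table to
# reselect the second target neuron(s).

def fire_and_reselect(nid, send_table):
    new_packet = {"neuron": nid}               # step 305: new pulse packet
    targets = send_table.get(nid, [])          # step 306: reselect targets
    return [(t, new_packet) for t in targets]  # step 307: deliver to each

send_table = {3: [8, 9]}                       # neuron 3 projects to 8 and 9
print(fire_and_reselect(3, send_table))
```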
Further, the routing core is further provided with a sending buffer, and the step 307 may include the following sub-steps:
sending, by the target computing core, the new neural pulse data packet to the second target neuron when the second target neuron is in the target computing core;
when the second target neuron is not in the target computing core, saving the new neural pulse data packet to the sending buffer through the target computing core;
sending the new neural pulse data packet to the FPGA module through the communication core;
and sending, by the FPGA module, the new neural pulse data packet to the second target neuron.
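The local/remote branch in these sub-steps can be sketched as a single comparison on the target core. All names here are illustrative; plain lists stand in for the hardware sending buffer and the in-core delivery path.

```python
# Sketch of the local/remote branch of step 307: a target on the same
# computing core is delivered directly; otherwise the packet goes to the
# routing core's sending buffer, to be shipped out via the communication
# core and the FPGA module.

def deliver(packet, target_core, my_core, local_deliver, send_buffer):
    if target_core == my_core:
        local_deliver(packet)       # stays inside the computing core
    else:
        send_buffer.append(packet)  # forwarded to the FPGA for other cores

delivered, buf = [], []
deliver({"neuron": 1}, target_core=0, my_core=0,
        local_deliver=delivered.append, send_buffer=buf)
deliver({"neuron": 2}, target_core=3, my_core=0,
        local_deliver=delivered.append, send_buffer=buf)
print(len(delivered), len(buf))  # -> 1 1
```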
Optionally, referring to fig. 9, fig. 9 is a schematic diagram of the encoding format of a neural pulse data packet in the embodiment of the present invention. The encoding format consists of a CPU number D, a computing core number E, and a neuron number F. A pulse in a neural pulse data packet essentially represents an event message, namely that a certain neuron has been activated, so the packet content must express the number of that neuron. In this architecture, each neural pulse data packet is assigned to a neuron on a specific computing core according to the three-layer CPU-core-neuron structure.
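One plausible realization of the D + E + F encoding is fixed-width bit packing. The field widths (8/8/16 bits) are an assumption for illustration; the patent does not specify them.

```python
# Hypothetical bit-packing of the CPU number D, computing core number E,
# and neuron number F from fig. 9 into one integer packet.

CPU_BITS, CORE_BITS, NEURON_BITS = 8, 8, 16

def encode(cpu, core, neuron):
    return (cpu << (CORE_BITS + NEURON_BITS)) | (core << NEURON_BITS) | neuron

def decode(packet):
    neuron = packet & ((1 << NEURON_BITS) - 1)
    core = (packet >> NEURON_BITS) & ((1 << CORE_BITS) - 1)
    cpu = packet >> (CORE_BITS + NEURON_BITS)
    return cpu, core, neuron

p = encode(2, 5, 1000)
print(decode(p))  # -> (2, 5, 1000)
```

Each routing stage then only inspects its own field: the FPGA reads the CPU number, the routing core the core number, and the computing core the neuron number.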
In the embodiment of the invention, when a computing node receives a neural pulse data packet, the FPGA module sends it to the target CPU module based on the CPU number in the packet. In the target CPU module, the routing core determines the target computing core based on the computing core number in the packet and the buffer data distribution table, and forwards the packet. When the target computing core receives the packet, it determines the first target neuron according to the neuron data distribution table and the neuron number, delivers the packet, and thus activates the first target neuron. In this way, pulses are routed and transmitted rapidly and flexibly, resource utilization efficiency is improved, no hardware design changes are required, and implementation cost is reduced.
Referring to fig. 10, fig. 10 is a schematic routing diagram of the fast high-concurrency neural pulse data packet distribution and transmission method according to an embodiment of the present invention, and the method may include the following steps:
1. the FPGA sends a pulse to the CPU, and the CPU stores the pulse in the receiving buffer of the routing core;
2. the routing core looks up the buffer data distribution table and distributes the pulse to the receiving buffer of the target computing core;
3. the computing core looks up the neuron data distribution table, applies the pulse to the target neuron, and updates the neuron;
4. the computing core looks up the neuron data sending table and sends the pulses of newly activated neurons either within the core or to other destinations.
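The table-lookup update of step 3, together with the threshold check that produces the outgoing pulses of step 4, can be sketched as below. The accumulate-and-threshold neuron model and all names are simplifying assumptions for illustration.

```python
# Sketch of step 3 (apply pulses via the neuron data distribution table)
# feeding step 4 (collect newly activated neurons). Neurons are modeled
# as accumulated potentials with a firing threshold.

def step_update(core_buffer, dist_table, potentials, threshold=1.0):
    fired = []
    for pkt in core_buffer:                       # step 3: drain the buffer
        for nid, w in dist_table[pkt["neuron"]]:
            potentials[nid] = potentials.get(nid, 0.0) + w
            if potentials[nid] >= threshold and nid not in fired:
                fired.append(nid)                 # input to step 4
    core_buffer.clear()
    return fired

buf = [{"neuron": 0}, {"neuron": 0}]
dist = {0: [(5, 0.6)]}                            # pulses on neuron 0 hit neuron 5
print(step_update(buf, dist, {}))  # -> [5]
```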
Referring to fig. 11, fig. 11 is a block diagram illustrating a fast high-concurrency neural pulse data packet distribution and transmission system according to an embodiment of the present invention.
A rapid high-concurrency neural pulse data packet distribution and transmission system comprises a computing node, wherein the computing node comprises an FPGA module 111 and a plurality of CPU modules 114, each CPU module 114 comprises a routing core 112 and a plurality of computing cores 113, and each computing core comprises at least one neuron;
the FPGA module 111 includes:
the target CPU data sending submodule 1111 is configured to send, when the compute node receives a neural impulse data packet, the neural impulse data packet to a target CPU module corresponding to a CPU number according to the CPU number in the neural impulse data packet;
the routing core 112 includes:
a target computing core data sending submodule 1121, configured to send the neural pulse data packet to the target computing core corresponding to the computing core number according to a preset buffer data distribution table and the computing core number in the neural pulse data packet when the target CPU module receives the neural pulse data packet;
the computing core 113 includes:
a target neuron data sending sub-module 1131, configured to send, when the target computational core receives the neural pulse data packet, the neural pulse data packet to a first target neuron corresponding to the neuron number according to a preset neuron data distribution table and the neuron number in the neural pulse data packet;
an activation sub-module 1132 for activating the first target neuron with the neural pulse data packet.
Optionally, the first target neuron comprises:
the data packet returning submodule is used for generating a new nerve pulse data packet and returning the new nerve pulse data packet to the target computing core;
the computing core 113 further includes:
the neuron reselection submodule is used for reselecting a second target neuron according to the new nerve pulse data packet and a preset neuron data sending table;
and the data packet sending submodule sends the new nerve pulse data packet to the second target neuron through the target computing core.
Optionally, the CPU module further includes a communication core, and the routing core is provided with a receiving buffer;
the communication core includes:
the data packet storage submodule is used for storing the neural pulse data packet to the receiving buffer area when the target CPU module receives the neural pulse data packet;
the target computing core data sending submodule 1121 includes:
the target computing core determination unit is used for looking up, in the preset buffer data distribution table, the computing core number carried in the neural pulse data packet;
and the computing core data sending unit is used for sending the neural pulse data packet in the receiving buffer to the target computing core corresponding to the computing core number.
Optionally, the compute node has hardware configuration information and neuron scale information, and the system further includes:
the neuron number determination module is used for hierarchically dividing the plurality of neurons according to the hardware configuration information, the neuron scale information, and a preset adjacency matrix, to determine the neuron number of each neuron;
the buffer data distribution table construction module is used for performing a column traversal operation on the adjacency matrix in units of computing cores, determining the computing core to which each neuron belongs, and constructing the buffer data distribution table;
the neuron data distribution table construction module is used for performing a column traversal operation on the adjacency matrix in units of neurons, determining the neurons that receive the neural pulse data packet, and constructing the neuron data distribution table;
and the neuron data sending table construction module is used for performing a row traversal operation on the adjacency matrix in units of neurons, determining the neurons that receive the new neural pulse data packet, and constructing the neuron data sending table.
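The row and column traversals over the adjacency matrix can be sketched as follows. The even split of neurons across cores and the dense-matrix representation are assumptions for illustration; the patent only specifies which axis each table is built from.

```python
# Sketch of deriving the tables from an adjacency matrix:
# adj[i][j] != 0 means neuron i projects to neuron j with that weight.
# Columns (who receives) build the distribution table; rows (who is
# sent to) build the sending table.

def build_tables(adj, neurons_per_core):
    n = len(adj)
    core_of = {i: i // neurons_per_core for i in range(n)}  # core assignment
    recv = {}  # neuron data distribution table: column traversal
    send = {}  # neuron data sending table: row traversal
    for j in range(n):
        recv[j] = [(i, adj[i][j]) for i in range(n) if adj[i][j]]
    for i, row in enumerate(adj):
        send[i] = [(j, w) for j, w in enumerate(row) if w]
    return core_of, recv, send

adj = [[0, 0.5], [0, 0]]  # neuron 0 -> neuron 1, weight 0.5
core_of, recv, send = build_tables(adj, neurons_per_core=1)
print(send[0], recv[1])  # -> [(1, 0.5)] [(0, 0.5)]
```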
Optionally, the routing core is further provided with a sending buffer, and the data packet sending sub-module includes:
a data packet sending unit, configured to send the new neural pulse data packet to the second target neuron when the second target neuron is in the target computing core;
a saving unit, configured to save the new neural pulse data packet to the sending buffer when the second target neuron is not in the target computing core;
the communication core is used for sending the new nerve pulse data packet to the FPGA module;
the FPGA module 111 is configured to send the new neural impulse data packet to the second target neuron.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, the module and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present invention, it should be understood that the disclosed system and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (6)

1. A fast high-concurrency neural pulse data packet distribution and transmission method is applied to a computing node, the computing node comprises an FPGA module and a plurality of CPU modules, each CPU module comprises a routing core and a plurality of computing cores, each computing core comprises at least one neuron, and the method comprises the following steps:
when the computing node receives a nerve pulse data packet, the FPGA module sends the nerve pulse data packet to a target CPU module corresponding to the CPU number according to the CPU number in the nerve pulse data packet;
when the target CPU module receives the nerve pulse data packet, sending the nerve pulse data packet to a target computing core corresponding to the computing core number through the routing core according to a preset buffer area data distribution table and the computing core number in the nerve pulse data packet;
when the target computing core receives the nerve pulse data packet, sending the nerve pulse data packet to a first target neuron corresponding to the neuron number through the target computing core according to a preset neuron data distribution table and the neuron number in the nerve pulse data packet;
activating, by the target computing core, the first target neuron with the neural pulse data packet;
the method further comprises the following steps:
generating a new neural impulse data packet by the first target neuron and returning to the target computational core;
reselecting a second target neuron by the target computing core according to the new nerve pulse data packet and a preset neuron data sending table;
sending, by the target computing core, the new neural impulse data packet to the second target neuron;
the CPU module further comprises a communication core, the routing core is provided with a receiving buffer area, and when the target CPU module receives the nerve pulse data packet, the step of sending the nerve pulse data packet to a target computing core corresponding to the computing core number through the routing core according to a preset buffer area data distribution table and the computing core number in the nerve pulse data packet comprises the following steps:
when the target CPU module receives the nerve pulse data packet, storing the nerve pulse data packet to the receiving buffer area through the communication core;
searching a calculation core number in the nerve pulse data packet in a preset buffer data distribution table through the routing core;
and sending the nerve pulse data packet in the receiving buffer area to a target computing core corresponding to the computing core number through the routing core.
2. The method of claim 1, wherein the compute node has hardware configuration information and neuron size information, and wherein prior to the step of the compute node receiving a neural impulse packet, the method further comprises:
hierarchically dividing the plurality of neurons according to the hardware configuration information, the neuron scale information and a preset adjacency matrix, and determining the neuron number of each neuron;
performing a column traversal operation on the adjacency matrix in units of computing cores, determining the computing core to which each neuron belongs, and constructing the buffer data distribution table;
performing a column traversal operation on the adjacency matrix in units of neurons, determining the neurons receiving the nerve pulse data packet, and constructing the neuron data distribution table;
and performing a row traversal operation on the adjacency matrix in units of neurons, determining the neurons receiving the new nerve pulse data packet, and constructing the neuron data sending table.
3. The method of claim 1, wherein the routing core is further provided with a sending buffer, and the step of sending the new nerve pulse data packet to the second target neuron through the target computing core comprises:
sending, by the target computing core, the new nerve pulse data packet to the second target neuron when the second target neuron is in the target computing core;
when the second target neuron is not in the target computing core, saving the new nerve pulse data packet to the sending buffer through the target computing core;
sending the new nerve pulse data packet to the FPGA module through the communication core;
and sending, by the FPGA module, the new nerve pulse data packet to the second target neuron.
4. A rapid high-concurrency neural pulse data packet distribution and transmission system is characterized by comprising a computing node, wherein the computing node comprises an FPGA module and a plurality of CPU modules, each CPU module comprises a routing core and a plurality of computing cores, and each computing core comprises at least one neuron;
the FPGA module includes:
the target CPU data sending submodule is used for sending the nerve pulse data packet to a target CPU module corresponding to a CPU number according to the CPU number in the nerve pulse data packet when the computing node receives the nerve pulse data packet;
the routing core comprises:
the target calculation core data sending submodule is used for sending the neural pulse data packet to a target calculation core corresponding to the calculation core number according to a preset buffer area data distribution table and the calculation core number in the neural pulse data packet when the target CPU module receives the neural pulse data packet;
the computing core includes:
the target neuron data sending submodule is used for sending the nerve pulse data packet to a first target neuron corresponding to the neuron number according to a preset neuron data distribution table and the neuron number in the nerve pulse data packet when the target computing core receives the nerve pulse data packet;
an activation sub-module for activating the first target neuron using the neural pulse data packet;
the first target neuron comprises:
the data packet returning submodule is used for generating a new nerve pulse data packet and returning the new nerve pulse data packet to the target computing core;
the computing core further comprises:
the neuron reselection submodule is used for reselecting a second target neuron according to the new nerve pulse data packet and a preset neuron data sending table;
a data packet sending submodule for sending the new neural impulse data packet to the second target neuron through the target computing core;
the CPU module also comprises a communication core, and the routing core is provided with a receiving buffer area;
the communication core includes:
the data packet storage submodule is used for storing the neural pulse data packet to the receiving buffer area when the target CPU module receives the neural pulse data packet;
the target computing core data sending submodule comprises:
the target calculation core determining unit is used for retrieving the calculation core number in the nerve pulse data packet in a preset buffer area data distribution table;
and the calculation core data sending unit is used for sending the nerve pulse data packet in the receiving buffer area to a target calculation core corresponding to the calculation core number.
5. The system of claim 4, wherein the compute node has hardware configuration information and neuron scale information, the system further comprising:
the neuron number determining module is used for hierarchically dividing the plurality of neurons according to the hardware configuration information, the neuron scale information and a preset adjacency matrix, to determine the neuron number of each neuron;
the buffer data distribution table construction module is used for performing a column traversal operation on the adjacency matrix in units of computing cores, determining the computing core to which each neuron belongs, and constructing the buffer data distribution table;
the neuron data distribution table construction module is used for performing a column traversal operation on the adjacency matrix in units of neurons, determining the neurons receiving the nerve pulse data packet, and constructing the neuron data distribution table;
and the neuron data sending table construction module is used for performing a row traversal operation on the adjacency matrix in units of neurons, determining the neurons receiving the new nerve pulse data packet, and constructing the neuron data sending table.
6. The system of claim 4, wherein the routing core is further provided with a sending buffer, and the data packet sending submodule comprises:
a data packet sending unit, configured to send the new nerve pulse data packet to the second target neuron when the second target neuron is in the target computing core;
a saving unit, configured to save the new nerve pulse data packet to the sending buffer when the second target neuron is not in the target computing core;
the communication core is used for sending the new nerve pulse data packet to the FPGA module;
and the FPGA module is used for sending the new nerve pulse data packet to the second target neuron.
CN202011096640.9A 2020-10-14 2020-10-14 Rapid high-concurrency neural pulse data packet distribution and transmission method and system Active CN112242963B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011096640.9A CN112242963B (en) 2020-10-14 2020-10-14 Rapid high-concurrency neural pulse data packet distribution and transmission method and system


Publications (2)

Publication Number Publication Date
CN112242963A CN112242963A (en) 2021-01-19
CN112242963B true CN112242963B (en) 2022-06-24

Family

ID=74169157


Country Status (1)

Country Link
CN (1) CN112242963B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116468086A (en) * 2022-01-11 2023-07-21 北京灵汐科技有限公司 Data processing method and device, electronic equipment and computer readable medium
CN116011563B (en) * 2023-03-28 2023-07-21 之江实验室 High-performance pulse transmission simulation method and device for pulse relay

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106056212A (en) * 2016-05-25 2016-10-26 清华大学 Artificial neural network calculating core
CN110991626A (en) * 2019-06-28 2020-04-10 广东工业大学 Multi-CPU brain simulation system
CN111082949A (en) * 2019-10-29 2020-04-28 广东工业大学 Method for efficiently transmitting pulse data packets in brain-like computer
CN111210014A (en) * 2020-01-06 2020-05-29 清华大学 Control method and device of neural network accelerator and neural network accelerator

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105095961B (en) * 2015-07-16 2017-09-29 清华大学 A kind of hybrid system of artificial neural network and impulsive neural networks
WO2018137412A1 (en) * 2017-01-25 2018-08-02 清华大学 Neural network information reception method, sending method, system, apparatus and readable storage medium


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"A multicast routing scheme for a universal spiking neural network architecture."; Wu, Jian, and Steve Furber; The Computer Journal; 20100330; vol. 53, no. 3, pp. 280-288 *
"Design of a multi-core parallel spiking neural network simulator" (多核并行脉冲神经网络模拟器的设计); Liu Jiahua, Chen Jingyu; Computer Engineering and Applications (计算机工程与应用); 20191231; vol. 56, no. 22, pp. 244-250 *
"A survey of the ideas and architecture of brain-inspired computers" (类脑机的思想与体系结构综述); Liu Yijun et al.; Journal of Computer Research and Development (计算机研究与发展); 20190615; vol. 56, no. 06, pp. 1135-1148 *

Also Published As

Publication number Publication date
CN112242963A (en) 2021-01-19

Similar Documents

Publication Publication Date Title
US9818058B2 (en) Time-division multiplexed neurosynaptic module with implicit memory addressing for implementing a universal substrate of adaptation
US8429107B2 (en) System for address-event-representation network simulation
CN112242963B (en) Rapid high-concurrency neural pulse data packet distribution and transmission method and system
US10713561B2 (en) Multiplexing physical neurons to optimize power and area
US8843425B2 (en) Hierarchical routing for two-way information flow and structural plasticity in neural networks
CN111082949B (en) Method for efficiently transmitting pulse data packets in brain-like computer
CN107430704A (en) Neural network algorithm is realized in nerve synapse substrate based on the metadata associated with neural network algorithm
US10198692B2 (en) Scalable neural hardware for the noisy-OR model of Bayesian networks
US20140180987A1 (en) Time-division multiplexed neurosynaptic module with implicit memory addressing for implementing a neural network
CN111131304B (en) Cloud platform-oriented large-scale virtual machine fine-grained abnormal behavior detection method and system
CN106056212A (en) Artificial neural network calculating core
CN106056211A (en) Neuron computing unit, neuron computing module and artificial neural network computing core
CN110991626B (en) Multi-CPU brain simulation system
CN109684290A (en) Log storing method, device, equipment and computer readable storage medium
CN110086731A (en) A kind of cloud framework lower network data stabilization acquisition method
CN103530304A (en) On-line recommendation method, system and mobile terminal based on self-adaption distributed computation
CN112149815A (en) Population clustering and population routing method for large-scale brain-like computing network
Joshi et al. Scalable event routing in hierarchical neural array architecture with global synaptic connectivity
CN112491572B (en) Method and device for predicting connection state between terminals and analysis equipment
EP0559100B1 (en) Method and apparatus for data distribution
US20200242457A1 (en) High dynamic range, high class count, high input rate winner-take-all on neuromorphic hardware
KR20210091688A (en) Data processing module, data processing system and data processing method
JPH0769893B2 (en) Neural network simulator
Singh et al. Advanced pre-fetching based dynamic data replication under small world network model
KR20210004342A (en) Neuromorphic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant