CN112242963B - Rapid high-concurrency neural pulse data packet distribution and transmission method and system - Google Patents

Info

Publication number
CN112242963B
Authority
CN
China
Prior art keywords
neuron
data packet
target
core
pulse data
Prior art date
Legal status
Active
Application number
CN202011096640.9A
Other languages
Chinese (zh)
Other versions
CN112242963A (en)
Inventor
杨培超
刘怡俊
林文杰
叶武剑
陈靖宇
Current Assignee
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date
Application filed by Guangdong University of Technology
Priority to CN202011096640.9A
Publication of CN112242963A
Application granted
Publication of CN112242963B
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 49/00: Packet switching elements
    • H04L 49/90: Buffering arrangements
    • H04L 49/9005: Buffering arrangements using dynamic buffer space allocation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/049: Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs

Abstract

The invention discloses a rapid high-concurrency neural pulse data packet distribution and transmission method applied to a computing node. When the computing node receives a neural pulse data packet, the FPGA module sends the packet to the target CPU module identified by the CPU number carried in the packet. Within the target CPU module, the routing core determines the target computing core from the computing core number in the packet and a buffer data distribution table, and forwards the packet accordingly. When the target computing core receives the packet, it determines the first target neuron from a neuron data distribution table and the neuron number in the packet, delivers the packet, and thereby activates the first target neuron. Routing transmission is thus performed rapidly and flexibly according to the neural pulses themselves, improving resource utilization efficiency while requiring no hardware redesign and reducing implementation cost.

Description

Rapid high-concurrency neural pulse data packet distribution and transmission method and system
Technical Field
The invention relates to the technical field of data transmission, in particular to a rapid high-concurrency nerve pulse data packet distribution and transmission method and system.
Background
A spiking neural network (SNN) transmits information between neurons in the form of pulses. It is the artificial neural network closest to biological neural networks and is commonly known as the third generation of artificial neural networks. It simulates the working principle of human brain neurons: when a neuron's membrane potential exceeds a response threshold, a pulse event occurs and the generated pulse information is transmitted to the neurons connected to it; after each neuron receives a pulse, it updates its state according to a kinetic equation, producing the next round of neuron behavior. The fidelity of an SNN is typically measured by the simulation of neuron behavior (the computation part) and the transmission of neural pulses (the communication part). The main feature of the communication part is that information is represented between neurons by the timing and frequency of pulses. The computation part, meanwhile, is limited by bottlenecks of computer architecture such as the operating frequency and the small high-speed storage capacity of a single processor; if a brain-like computer is to support real-time SNN simulation of hundreds of millions of neurons, a distributed cluster environment must be built, with a matched communication scheme to accelerate parallel operation. In the multi-level node-CPU-core-neuron architecture, designing a routing mechanism that supports efficient, low-delay transceiving of massive pulse data packets is therefore a critical link.
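The neuron update rule described above (integrate incoming pulses, fire a pulse event when the membrane potential crosses the response threshold, then reset) can be illustrated with a minimal leaky integrate-and-fire sketch. The patent does not specify the kinetic equation; the leak model, constants, and names below are purely illustrative:

```python
def lif_step(v, input_current, v_rest=0.0, v_threshold=1.0, leak=0.9):
    """One update of an illustrative leaky integrate-and-fire neuron.

    Returns the new membrane potential and whether a pulse fired.
    The leak factor and threshold are assumed values, not from the patent.
    """
    v = v_rest + leak * (v - v_rest) + input_current  # kinetic update
    if v >= v_threshold:
        return v_rest, True   # pulse event: fire and reset to rest
    return v, False

# A constant sub-threshold input accumulates until a pulse fires.
v, fired = 0.0, False
steps = 0
while not fired:
    v, fired = lif_step(v, input_current=0.2)
    steps += 1
```

After firing, the generated pulse would be packed into a neural pulse data packet and handed to the routing mechanism described below.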
In the prior art, dedicated links are generally designed between nodes for transmission, or one-to-many transmission is performed in a multicast or cluster manner; each function in a computing node is split into co-processing modules, and generated pulse data packets are sent to the routing communication unit in the node over a data bus. However, in these approaches the routing mechanism is usually designed from a hardware perspective, so it is not very extensible, its resource utilization is low, it is difficult to optimize flexibly according to the characteristics of the pulses, and its implementation cost is high.
Disclosure of Invention
The invention provides a rapid high-concurrency neural pulse data packet distribution and transmission method and system, solving the technical problems of the prior art, in which a data routing mechanism designed in hardware has poor expandability and low resource utilization, is difficult to optimize flexibly according to the characteristics of pulses, and is costly to implement.
The invention provides a rapid high-concurrency neural pulse data packet distribution and transmission method which is applied to a computing node, wherein the computing node comprises an FPGA module and a plurality of CPU modules, each CPU module comprises a routing core and a plurality of computing cores, each computing core comprises at least one neuron, and the method comprises the following steps:
when the computing node receives a nerve pulse data packet, the FPGA module sends the nerve pulse data packet to a target CPU module corresponding to the CPU number according to the CPU number in the nerve pulse data packet;
when the target CPU module receives the nerve pulse data packet, sending the nerve pulse data packet to a target computing core corresponding to the computing core number through the routing core according to a preset buffer area data distribution table and the computing core number in the nerve pulse data packet;
when the target computing core receives the nerve pulse data packet, sending the nerve pulse data packet to a first target neuron corresponding to the neuron number through the target computing core according to a preset neuron data distribution table and the neuron number in the nerve pulse data packet;
activating, by the target computing core, the first target neuron with the neural pulse data packet.
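The four steps above amount to a three-stage table-driven dispatch keyed on the (CPU number, compute core number, neuron number) fields carried in the packet. A minimal sketch of that flow follows; the packet fields echo the description, but the node layout and all function names are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class SpikePacket:
    cpu_id: int      # number of the target CPU module
    core_id: int     # number of the target compute core within the CPU
    neuron_id: int   # number of the target neuron within the core
    weight: float    # pulse strength

# Hypothetical node layout: node[cpu_id][core_id] -> set of neuron ids.
node = {
    0: {0: {0, 1}, 1: {2, 3}},
    1: {0: {4, 5}, 1: {6, 7}},
}

activated = []  # records (neuron_id, weight) activations

def fpga_dispatch(pkt):
    """Stage 1: the FPGA forwards the packet to the CPU named in it."""
    routing_core_dispatch(node[pkt.cpu_id], pkt)

def routing_core_dispatch(cpu, pkt):
    """Stage 2: the routing core forwards to the target compute core."""
    compute_core_deliver(cpu[pkt.core_id], pkt)

def compute_core_deliver(core_neurons, pkt):
    """Stage 3: the compute core activates the first target neuron."""
    if pkt.neuron_id in core_neurons:
        activated.append((pkt.neuron_id, pkt.weight))

fpga_dispatch(SpikePacket(cpu_id=1, core_id=1, neuron_id=6, weight=0.5))
```

Each stage inspects only its own field of the packet, which is what keeps the dispatch fast under high concurrency.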
Optionally, the method further comprises:
generating a new neural impulse data packet by the first target neuron and returning to the target computational core;
reselecting a second target neuron by the target computing core according to the new neural pulse data packet and a preset neuron data sending table;
sending, by the target computing core, the new neural impulse data packet to the second target neuron.
Optionally, the CPU module further includes a communication core, the routing core is provided with a receiving buffer, and when the target CPU module receives the neural pulse data packet, the step of sending the neural pulse data packet to a target computing core corresponding to the computing core number through the routing core according to a preset buffer data distribution table and the computing core number in the neural pulse data packet includes:
when the target CPU module receives the nerve pulse data packet, storing the nerve pulse data packet to the receiving buffer area through the communication core;
searching a calculation core number in the nerve pulse data packet in a preset buffer data distribution table through the routing core;
and sending the nerve pulse data packet in the receiving buffer area to a target computing core corresponding to the computing core number through the routing core.
Optionally, the computing node has hardware configuration information and neuron scale information, and before the step of receiving the neural impulse data packet by the computing node, the method further comprises:
according to the hardware configuration information, the neuron scale information and a preset adjacency matrix, carrying out hierarchical division on a plurality of neurons, and determining the neuron number of each neuron;
performing column traversal operation on the adjacency matrix by taking a computing core as a unit, determining the computing core to which each neuron belongs, and constructing a buffer area data distribution table;
performing column traversal operation on the adjacency matrix by taking the neuron as a unit, determining the neurons receiving the nerve pulse data packet, and constructing a neuron data distribution table; and
performing row traversal operation on the adjacency matrix by taking the neuron as a unit, determining the neurons receiving the new nerve pulse data packet, and constructing a neuron data sending table.
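The table-construction steps above (hierarchical numbering, then traversing the adjacency matrix to derive routing tables) can be sketched as follows. The flat dictionaries stand in for the cascaded lists described later in the embodiment, and the matrix values, core size, and names are all illustrative:

```python
import numpy as np

# Adjacency matrix W: W[src, dst] is the weight of the pulse sent from
# neuron `src` to neuron `dst` (0.0 means no connection).
W = np.array([
    [0.0, 0.2, 0.0, 0.7],
    [0.0, 0.0, 0.5, 0.0],
    [0.3, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
])

neurons_per_core = 2  # hierarchical division: assumed 2 neurons per core

def core_of(nid):
    """Compute core to which a (renumbered) neuron belongs."""
    return nid // neurons_per_core

# Buffer data distribution table analogue: for each source neuron, the
# compute cores its pulse must reach.
buffer_table = {
    src: sorted({core_of(int(dst)) for dst in np.nonzero(W[src])[0]})
    for src in range(W.shape[0])
}

# Neuron data distribution table analogue: for each source neuron, the
# target neurons and the weights to apply on reception.
distribution_table = {
    src: [(int(dst), float(W[src, dst])) for dst in np.nonzero(W[src])[0]]
    for src in range(W.shape[0])
}
```

A neuron data sending table would be derived the same way per compute core, keeping only the targets hosted on that core.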
Optionally, the routing core is further provided with a sending buffer, and the step of sending the new neural impulse packet to the second target neuron through the target computing core includes:
sending, by the target computing core, the new neural impulse data packet to the second target neuron when the second target neuron is in the target computing core;
when the second target neuron is not in the target computing core, saving the new neural impulse data packet to the sending buffer through the target computing core;
sending, by the communication core, the new neural impulse data packet to the FPGA module;
sending, by the FPGA module, the new neural impulse data packet to the second target neuron.
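The branch above (deliver directly when the second target neuron is local to the compute core, otherwise stage the new packet in the routing core's sending buffer for the communication core and FPGA to forward) can be sketched as follows; the buffer shapes and neuron sets are illustrative:

```python
send_buffer = []      # routing core's sending buffer, drained by the comm core
local_delivered = []  # pulses applied inside the current compute core

local_neurons = {4, 5}  # hypothetical neurons hosted by this compute core

def send_new_packet(target_neuron, packet):
    """Deliver locally if possible; otherwise queue for off-core routing."""
    if target_neuron in local_neurons:
        local_delivered.append((target_neuron, packet))
    else:
        send_buffer.append((target_neuron, packet))  # to FPGA via comm core

send_new_packet(4, "pulse-a")   # local: applied directly
send_new_packet(9, "pulse-b")   # remote: staged in the sending buffer
```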
The invention also provides a rapid high-concurrency neural pulse data packet distribution and transmission system, which comprises a computing node, wherein the computing node comprises an FPGA module and a plurality of CPU modules, each CPU module comprises a routing core and a plurality of computing cores, and each computing core comprises at least one neuron;
the FPGA module comprises:
the target CPU data sending submodule is used for sending the nerve pulse data packet to a target CPU module corresponding to a CPU number according to the CPU number in the nerve pulse data packet when the computing node receives the nerve pulse data packet;
the routing core includes:
the target calculation core data sending submodule is used for sending the nerve pulse data packet to a target calculation core corresponding to the calculation core number according to a preset buffer area data distribution table and the calculation core number in the nerve pulse data packet when the target CPU module receives the nerve pulse data packet;
the computing core includes:
the target neuron data sending submodule is used for sending the nerve pulse data packet to a first target neuron corresponding to the neuron number according to a preset neuron data distribution table and the neuron number in the nerve pulse data packet when the target computing core receives the nerve pulse data packet;
an activation sub-module for activating the first target neuron using the neural pulse data packet.
Optionally, the first target neuron comprises:
the data packet returning submodule is used for generating a new nerve pulse data packet and returning the new nerve pulse data packet to the target computing core;
the computing core further comprises:
the neuron reselection submodule is used for reselecting a second target neuron according to the new nerve pulse data packet and a preset neuron data sending table;
and the data packet sending submodule sends the new nerve pulse data packet to the second target neuron through the target computing core.
Optionally, the CPU module further includes a communication core, and the routing core is provided with a receiving buffer;
the communication core includes:
the data packet storage submodule is used for storing the neural pulse data packet to the receiving buffer area when the target CPU module receives the neural pulse data packet;
the target computing kernel data sending submodule comprises:
the target calculation core determining unit is used for retrieving the calculation core number in the nerve pulse data packet in a preset buffer area data distribution table;
and the calculation core data sending unit is used for sending the nerve pulse data packet in the receiving buffer area to a target calculation core corresponding to the calculation core number.
Optionally, the compute node has hardware configuration information and neuron scale information, and the system further includes:
the neuron number determining module is used for performing hierarchical division on a plurality of neurons according to the hardware configuration information, the neuron scale information and a preset adjacency matrix to determine the neuron number of each neuron;
the buffer area data distribution table construction module is used for executing a column traversal operation on the adjacency matrix by taking a computing core as a unit, determining the computing core to which each neuron belongs, and constructing a buffer area data distribution table;
a neuron data distribution table construction module, configured to perform a column traversal operation on the adjacency matrix by taking a neuron as a unit, determine the neurons receiving the nerve pulse data packet, and construct a neuron data distribution table;
and the neuron data sending table building module is used for executing a row traversal operation on the adjacency matrix by taking neurons as units, determining the neurons receiving the new nerve pulse data packets, and building a neuron data sending table.
Optionally, the routing core is further provided with a sending buffer, and the data packet sending sub-module includes:
a data packet sending unit, configured to send the new neural impulse data packet to the second target neuron when the second target neuron is in the target computational core;
a saving unit, configured to save the new neural pulse data packet to the sending buffer when the second target neuron is not in the target computational core;
the communication core is used for sending the new nerve pulse data packet to the FPGA module;
the FPGA module is used for sending the new nerve pulse data packet to the second target neuron.
According to the technical scheme, the invention has the following advantages:
when the computing node receives the neural pulse data packet, the FPGA module sends the packet to the target CPU module based on the CPU number carried in the packet. Within the target CPU module, the routing core determines the target computing core from the computing core number in the packet and the buffer data distribution table, and forwards the packet. When the target computing core receives the packet, it determines the first target neuron from the neuron data distribution table and the neuron number, delivers the packet, and thereby activates the first target neuron. Rapid routing transmission is thus performed flexibly according to the neural pulses, improving resource utilization efficiency while requiring no hardware redesign and reducing implementation cost.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from these drawings without inventive labor.
Fig. 1 is a flowchart illustrating steps of a method for fast and highly concurrent nerve impulse packet distribution and transmission according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a hierarchical structure of a compute node according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating steps of a method for fast packet distribution and delivery of high-concurrency spikes according to an alternative embodiment of the present invention;
fig. 4 is a flowchart of steps of a routing table construction process according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an adjacency matrix according to an embodiment of the present invention;
fig. 6 is a schematic diagram of a buffer data distribution table C according to an embodiment of the present invention;
FIG. 7 is a diagram of a neuron data distribution table B according to an embodiment of the present invention;
FIG. 8 is a diagram of a neuron data transmission table A according to an embodiment of the present invention;
FIG. 9 is a diagram illustrating an encoding format of a neural impulse data packet according to an embodiment of the present invention;
fig. 10 is a schematic routing diagram of a fast high-concurrency neural pulse data packet distribution and transmission method according to an embodiment of the present invention;
fig. 11 is a block diagram of a fast high-concurrency neural pulse data packet distribution and transmission system according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention provide a rapid high-concurrency neural pulse data packet distribution and transmission method and system, used to solve the technical problems of the prior art, in which a hardware-based data routing mechanism leads to poor expandability, low resource utilization, difficulty in flexible optimization according to the characteristics of pulses, and high implementation cost.
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the embodiments described below are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a flowchart illustrating a method for fast and highly concurrent nerve impulse packet distribution and transmission according to an embodiment of the present invention.
The invention provides a rapid high-concurrency neural pulse data packet distribution and transmission method which is applied to a computing node, wherein the computing node comprises an FPGA module and a plurality of CPU modules, each CPU module comprises a routing core and a plurality of computing cores, each computing core comprises at least one neuron, and the method comprises the following steps:
step 101, when the computing node receives a nerve pulse data packet, the FPGA module sends the nerve pulse data packet to a target CPU module corresponding to a CPU number according to the CPU number in the nerve pulse data packet;
referring to fig. 2, it can be seen that each compute node is composed of an FPGA and a plurality of CPUs, and the CPUs have cores divided into different functions, including a routing core for distributing pulses in the CPUs, a plurality of compute cores responsible for neuron computation, a system core responsible for resource scheduling, and a communication core responsible for CPU pulse transceiving. And the number of CPUs which can exist in one FPGA is often less, so that when the computing node receives the nerve pulse data packet, the CPU number carried in the nerve pulse data packet can be detected through the FPGA module so as to determine a target CPU module which needs to be sent and distribute the target CPU module to the corresponding target CPU module.
102, when the target CPU module receives the neural pulse data packet, sending the neural pulse data packet to a target computation core corresponding to the computation core number through the routing core according to a preset buffer data distribution table and the computation core number in the neural pulse data packet;
103, when the target computing core receives the nerve pulse data packet, sending the nerve pulse data packet to a first target neuron corresponding to the neuron number through the target computing core according to a preset neuron data distribution table and the neuron number in the nerve pulse data packet;
step 104, activating the first target neuron by the target computational core using the neural pulse data packet.
In the embodiment of the invention, when a computing node receives a nerve pulse data packet, the FPGA module sends the packet to the target CPU module based on the CPU number carried in the packet. Within the target CPU module, the routing core determines the target computing core from the computing core number in the packet and the buffer data distribution table, and forwards the packet. When the target computing core receives the packet, it determines the first target neuron from the neuron data distribution table and the neuron number, delivers the packet, and thereby activates the first target neuron. Rapid routing transmission is thus performed flexibly according to the neural pulses, improving resource utilization efficiency while requiring no hardware redesign and reducing implementation cost.
Referring to fig. 3, fig. 3 is a flowchart illustrating a method for fast and highly concurrent nerve impulse packet distribution according to an alternative embodiment of the present invention.
The invention provides a rapid high-concurrency neural pulse data packet distribution and transmission method which is applied to a computing node, wherein the computing node comprises an FPGA module and a plurality of CPU modules, each CPU module comprises a routing core and a plurality of computing cores, each computing core comprises at least one neuron, and the method comprises the following steps:
step 301, when the computing node receives a neural pulse data packet, sending the neural pulse data packet to a target CPU module corresponding to a CPU number through the FPGA module according to the CPU number in the neural pulse data packet;
referring to fig. 4, fig. 4 is a flow chart showing steps of a routing table construction process in the embodiment of the present invention, where the computing node has hardware configuration information and neuron scale information, and the method further includes the following steps S11-S14:
step S11, according to the hardware configuration information, the neuron scale information and a preset adjacency matrix, carrying out hierarchical division on a plurality of neurons, and determining the neuron number of each neuron;
step S12, executing a column traversal operation on the adjacency matrix by taking a computing core as a unit, determining the computing core to which each neuron belongs, and constructing a buffer area data distribution table;
step S13, performing a column traversal operation on the adjacency matrix with the neuron as a unit, determining the neurons that receive the nerve pulse data packet, and constructing a neuron data distribution table;
Step S14, performing a row traversal operation on the adjacency matrix by taking a neuron as a unit, determining a neuron that receives the new nerve pulse data packet, and constructing a neuron data transmission table.
Referring to fig. 5, fig. 5 is a schematic diagram of the structure of an adjacency matrix in an embodiment of the present invention. The neurons numbered on the left send out nerve pulses, the neurons numbered along the top receive them, and each intersection holds the weight of the pulse sent between the two. The adjacency matrix covers a CPU0 and a CPU1, each of which contains compute cores Core0 and Core1.
The hardware configuration information gives the number of CPUs and compute cores available to the neural pulse data packets; the neuron scale information gives the total number of neurons. Hierarchical division by "CPU - compute core - neuron" decides how many compute cores are set in each CPU and how many neurons are set in each compute core, so that all neurons are assigned to different compute cores and each neuron is renumbered. The adjacency matrix contains information in both directions, receiving pulses and sending pulses. A traversal operation can be performed on the adjacency matrix with the compute core as the unit to determine the compute core to which each neuron belongs, i.e., the compute core on which a neuron's pulse data packet acts; the information is then organized hierarchically and counted by length, entry, and target compute core number to construct the buffer data distribution table. A column traversal operation with the neuron as the unit collects which neurons within a compute core each neuron influences; the information is arranged hierarchically and counted by length, entry, target neuron number, and weight to construct the neuron data distribution table. A row traversal operation with the neuron as the unit collects the pulses generated by the compute core that need to stay local, counting which neurons of the core are influenced by each of the core's own neurons; the information is again organized by length, entry, target neuron number, and weight to construct the neuron data sending table.
In the embodiment of the invention, in order to determine the relations among neurons, compute cores, and CPU modules before a nerve pulse data packet is received, several routing tables are designed: a buffer data distribution table, one per CPU; a neuron data distribution table, one per compute core; and a neuron data sending table, one per compute core. Because the number of neurons is usually large, a single flat list would be too long and each query traversal would take too long; the information of each routing table is therefore reorganized and split into a multi-level cascaded list, reducing traversal time and improving efficiency. The specific design details of each table are as follows:
referring to fig. 6, fig. 6 is a schematic diagram of a buffer data distribution table C in an embodiment of the present invention, where recorded information needs to be distributed to specific computing cores after each pulse is sent to the CPU, where the table C is composed of two-level concatenation lists C1 and C2. The row index of the table C1 is the number nid of the neuron in the core in the pulse packet, and two pieces of information, namely the row index entry st of the table C2 of the next stage and the length dr of the search row, are obtained, the length means the number of neurons affected by a certain pulse to the local, the row index of the table C2 is determined by taking the entry of the table C1 and the number Coreid of the core in the pulse packet as an offset, and the information mask CoreMask of the core which needs to send the pulse to the CPU is obtained.
Referring to fig. 7, fig. 7 shows the neuron data distribution table B in an embodiment of the present invention. It records the specific neurons to which weights must be applied after a compute core receives a pulse, together with the corresponding weight values. Table B is composed of a three-level cascade of lists B1, B2, and B3. The row index of table B1 is the in-core neuron number Snid in the pulse packet; each row yields the entry st (the row index into the next-level table B2) and the length dr. The row index of table B2 is formed by combining the entry from B1 with the core number ccid in the pulse packet as an offset; each row stores the entry st and length dr of the next-level table B3. The row index of table B3 is determined by B2; each row records the destination neuron Dnid on which this pulse acts and the weight w, which represents the pulse strength of the neural pulse data packet.
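Table B's three-level cascade resolves a received pulse to its (target neuron, weight) pairs. A minimal sketch under the same layout assumptions as before, with illustrative row contents:

```python
# B1[Snid] -> (st, dr): entry row and row count in B2.
B1 = [(0, 2)]
# B2[st + ccid] -> (st, dr): entry row and row count in B3.
B2 = [(0, 1), (1, 2)]
# B3 rows: (Dnid, w) -- destination neuron and synaptic weight.
B3 = [(3, 0.4), (5, 0.7), (6, 0.1)]

def resolve_targets(snid, ccid):
    """B1 -> B2 -> B3 cascade: pulse source to weighted target neurons."""
    st1, dr1 = B1[snid]
    st2, dr2 = B2[st1 + ccid]
    return B3[st2:st2 + dr2]

targets = resolve_targets(snid=0, ccid=1)
```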
Referring to fig. 8, fig. 8 shows the neuron data sending table A in an embodiment of the present invention. It records the influence of the pulses generated after neuron activation on certain compute cores, i.e., like table B, it records target neuron numbers and weights. Since the pulses handled by a compute core's sending table come only from that core itself, the core number in the pulse packet need not be queried, and table A is composed of a two-level cascade of lists A1 and A2. The row index of table A1 is the in-core neuron number nid in the pulse packet; each row yields the entry st (the row index into the next-level table A2) and the length dr. The row index of table A2 is determined by A1; each row records the destination neuron on which this pulse acts and the weight. Table A1 also records the in-core neuron number nid and a sending mask SendMask: if the first bit of the mask is 1, the pulse needs to be sent locally; if it is zero, the pulse needs to be sent to other destinations. If a pulse is recorded in table A1 with length 0, i.e., it is not sent locally at all, it is forwarded to the sending buffer of the routing core and sent on to other destinations.
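Table A adds a SendMask that decides whether a locally generated pulse is applied in this core, forwarded elsewhere, or both. A sketch, taking the mask convention from the description (first bit set means "apply locally", zero-length rows mean "forward only") and inventing the rest:

```python
# A1[nid] -> (st, dr, send_mask). dr == 0 means no local targets at all.
A1 = [
    (0, 2, 0b1),   # nid 0: two local targets, bit 0 set -> apply locally
    (0, 0, 0b10),  # nid 1: no local rows -> forward via sending buffer
]
# A2 rows: (Dnid, w) -- local destination neuron and weight.
A2 = [(2, 0.6), (3, 0.2)]

send_buffer = []  # routing core's sending buffer for off-core pulses

def route_generated_pulse(nid):
    """Split a freshly generated pulse into local targets and forwards."""
    st, dr, mask = A1[nid]
    local = A2[st:st + dr] if mask & 0b1 else []
    if dr == 0 or mask & ~0b1:   # other destinations exist
        send_buffer.append(nid)  # hand off to the routing core
    return local

local_targets = route_generated_pulse(0)   # applied in this core
route_generated_pulse(1)                   # forwarded off-core
```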
Step 302, when the target CPU module receives the neural pulse data packet, sending the neural pulse data packet to a target computation core corresponding to the computation core number through the routing core according to a preset buffer data distribution table and the computation core number in the neural pulse data packet;
in an example of the present invention, the CPU module further includes a communication core, the routing core is provided with a receiving buffer, and the step 302 may include the following sub-steps:
when the target CPU module receives the neural pulse data packet, storing the neural pulse data packet in the receiving buffer through the communication core;
looking up, through the routing core, the computing core number carried in the neural pulse data packet in the preset buffer data distribution table;
and sending the neural pulse data packet in the receiving buffer to the target computing core corresponding to the computing core number through the routing core.
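The routing core's dispatch sub-steps can be sketched as a simple table lookup followed by a hand-off into the target core's queue. The field names and the use of plain Python queues are illustrative assumptions standing in for the hardware buffers.

```python
# Minimal sketch of the routing core's dispatch step. The buffer data
# distribution table maps a computing core number to that core's
# receiving buffer (here a deque standing in for a hardware queue).

from collections import deque

def route_packet(packet, buffer_table):
    """packet: dict with 'core' and 'neuron' fields set by the sender."""
    core_buffer = buffer_table[packet["core"]]  # table lookup by core number
    core_buffer.append(packet)                  # hand off to the target core

buffer_table = {0: deque(), 1: deque()}
route_packet({"core": 1, "neuron": 42}, buffer_table)
print(len(buffer_table[1]))  # -> 1
```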
Step 303, when the target computing core receives the neural pulse data packet, sending the neural pulse data packet to a first target neuron corresponding to the neuron number through the target computing core according to a preset neuron data distribution table and the neuron number in the neural pulse data packet;
step 304, activating, by the target computing core, the first target neuron with the neural pulse data packet.
In the embodiment of the present invention, since the computing core is responsible for updating the behavior of specific neurons, it is the final producer and consumer of pulse data. Therefore, before the target computing core updates the first target neuron, the target computing core needs to apply the neural pulse data packet to that specific neuron, so as to activate the first target neuron.
After the first target neuron is activated, that is, after the update operation, the target computing core further needs to send any newly generated neural pulse data packet to other neurons of the current computing core or to neurons of other computing cores.
Further, the method further comprises the following steps 305-307:
step 305, generating a new neural pulse data packet by the first target neuron and returning it to the target computing core;
step 306, reselecting a second target neuron by the target computing core according to the new neural pulse data packet and a preset neuron data sending table;
step 307, sending the new neural pulse data packet to the second target neuron by the target computing core.
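Steps 305 to 307 can be sketched as follows. The function name, the dict-based packet, and the send-table shape are illustrative assumptions, not structures fixed by the patent.

```python
# Sketch of steps 305-307: an activated neuron emits a new pulse packet,
# and the computing core consults its neuron data sending table to
# reselect the second target neuron(s).

def fire_and_reselect(nid, send_table):
    new_packet = {"neuron": nid}               # step 305: new pulse packet
    targets = send_table.get(nid, [])          # step 306: reselect targets
    return [(t, new_packet) for t in targets]  # step 307: deliver to each

send_table = {3: [8, 9]}                       # neuron 3 projects to 8 and 9
print(fire_and_reselect(3, send_table))
```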
Further, the routing core is further provided with a sending buffer, and the step 307 may include the following sub-steps:
sending, by the target computing core, the new neural pulse data packet to the second target neuron when the second target neuron is in the target computing core;
when the second target neuron is not in the target computing core, saving the new neural pulse data packet to the sending buffer through the target computing core;
sending the new neural pulse data packet to the FPGA module through the communication core;
and sending, by the FPGA module, the new neural pulse data packet to the second target neuron.
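The local/remote branch in these sub-steps can be sketched as a single comparison on the target core. All names here are illustrative; plain lists stand in for the hardware sending buffer and the in-core delivery path.

```python
# Sketch of the local/remote branch of step 307: a target on the same
# computing core is delivered directly; otherwise the packet goes to the
# routing core's sending buffer, to be shipped out via the communication
# core and the FPGA module.

def deliver(packet, target_core, my_core, local_deliver, send_buffer):
    if target_core == my_core:
        local_deliver(packet)       # stays inside the computing core
    else:
        send_buffer.append(packet)  # forwarded to the FPGA for other cores

delivered, buf = [], []
deliver({"neuron": 1}, target_core=0, my_core=0,
        local_deliver=delivered.append, send_buffer=buf)
deliver({"neuron": 2}, target_core=3, my_core=0,
        local_deliver=delivered.append, send_buffer=buf)
print(len(delivered), len(buf))  # -> 1 1
```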
Optionally, referring to fig. 9, fig. 9 is a schematic diagram of the encoding format of a neural pulse data packet in the embodiment of the present invention. The encoding format consists of a CPU number D, a computing core number E, and a neuron number F. A pulse in a neural pulse data packet essentially represents an event message, namely that a certain neuron has been activated, so the packet content must express the number of that neuron. In this architecture, each neural pulse data packet is assigned to a neuron on a specific computing core according to the three-layer CPU-core-neuron structure.
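One plausible realization of the D + E + F encoding is fixed-width bit packing. The field widths (8/8/16 bits) are an assumption for illustration; the patent does not specify them.

```python
# Hypothetical bit-packing of the CPU number D, computing core number E,
# and neuron number F from fig. 9 into one integer packet.

CPU_BITS, CORE_BITS, NEURON_BITS = 8, 8, 16

def encode(cpu, core, neuron):
    return (cpu << (CORE_BITS + NEURON_BITS)) | (core << NEURON_BITS) | neuron

def decode(packet):
    neuron = packet & ((1 << NEURON_BITS) - 1)
    core = (packet >> NEURON_BITS) & ((1 << CORE_BITS) - 1)
    cpu = packet >> (CORE_BITS + NEURON_BITS)
    return cpu, core, neuron

p = encode(2, 5, 1000)
print(decode(p))  # -> (2, 5, 1000)
```

Each routing stage then only inspects its own field: the FPGA reads the CPU number, the routing core the core number, and the computing core the neuron number.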
In the embodiment of the invention, when a computing node receives a neural pulse data packet, the FPGA module sends it to the target CPU module based on the CPU number in the packet. In the target CPU module, the routing core determines the target computing core based on the computing core number in the packet and the buffer data distribution table, and forwards the packet. When the target computing core receives the packet, it determines the first target neuron according to the neuron data distribution table and the neuron number, delivers the packet, and thus activates the first target neuron. In this way, pulses are routed and transmitted rapidly and flexibly, resource utilization efficiency is improved, no hardware design changes are required, and implementation cost is reduced.
Referring to fig. 10, fig. 10 is a schematic routing diagram of the fast high-concurrency neural pulse data packet distribution and transmission method according to an embodiment of the present invention, and the method may include the following steps:
1. the FPGA sends a pulse to the CPU, and the CPU stores the pulse in the receiving buffer of the routing core;
2. the routing core looks up the buffer data distribution table and distributes the pulse to the receiving buffer of the target computing core;
3. the computing core looks up the neuron data distribution table, applies the pulse to the target neuron, and updates the neuron;
4. the computing core looks up the neuron data sending table and sends the pulses of newly activated neurons either within the core or to other destinations.
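The table-lookup update of step 3, together with the threshold check that produces the outgoing pulses of step 4, can be sketched as below. The accumulate-and-threshold neuron model and all names are simplifying assumptions for illustration.

```python
# Sketch of step 3 (apply pulses via the neuron data distribution table)
# feeding step 4 (collect newly activated neurons). Neurons are modeled
# as accumulated potentials with a firing threshold.

def step_update(core_buffer, dist_table, potentials, threshold=1.0):
    fired = []
    for pkt in core_buffer:                       # step 3: drain the buffer
        for nid, w in dist_table[pkt["neuron"]]:
            potentials[nid] = potentials.get(nid, 0.0) + w
            if potentials[nid] >= threshold and nid not in fired:
                fired.append(nid)                 # input to step 4
    core_buffer.clear()
    return fired

buf = [{"neuron": 0}, {"neuron": 0}]
dist = {0: [(5, 0.6)]}                            # pulses on neuron 0 hit neuron 5
print(step_update(buf, dist, {}))  # -> [5]
```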
Referring to fig. 11, fig. 11 is a block diagram illustrating a fast high-concurrency neural pulse data packet distribution and transmission system according to an embodiment of the present invention.
A rapid high-concurrency neural pulse data packet distribution and transmission system comprises a computing node, wherein the computing node comprises an FPGA module 111 and a plurality of CPU modules 114, each CPU module 114 comprises a routing core 112 and a plurality of computing cores 113, and each computing core comprises at least one neuron;
the FPGA module 111 includes:
the target CPU data sending submodule 1111 is configured to send, when the compute node receives a neural impulse data packet, the neural impulse data packet to a target CPU module corresponding to a CPU number according to the CPU number in the neural impulse data packet;
the routing core 112 includes:
a target computing core data sending submodule 1121, configured to send the neural pulse data packet to the target computing core corresponding to the computing core number according to a preset buffer data distribution table and the computing core number in the neural pulse data packet when the target CPU module receives the neural pulse data packet;
the computing core 113 includes:
a target neuron data sending sub-module 1131, configured to send, when the target computational core receives the neural pulse data packet, the neural pulse data packet to a first target neuron corresponding to the neuron number according to a preset neuron data distribution table and the neuron number in the neural pulse data packet;
an activation sub-module 1132 for activating the first target neuron with the neural pulse data packet.
Optionally, the first target neuron comprises:
the data packet returning submodule is used for generating a new nerve pulse data packet and returning the new nerve pulse data packet to the target computing core;
the computing core 113 further includes:
the neuron reselection submodule is used for reselecting a second target neuron according to the new nerve pulse data packet and a preset neuron data sending table;
and the data packet sending submodule sends the new nerve pulse data packet to the second target neuron through the target computing core.
Optionally, the CPU module further includes a communication core, and the routing core is provided with a receiving buffer;
the communication core includes:
the data packet storage submodule is used for storing the neural pulse data packet to the receiving buffer area when the target CPU module receives the neural pulse data packet;
the target computing core data sending submodule 1121 includes:
the target computing core determination unit is used for looking up, in the preset buffer data distribution table, the computing core number carried in the neural pulse data packet;
and the computing core data sending unit is used for sending the neural pulse data packet in the receiving buffer to the target computing core corresponding to the computing core number.
Optionally, the compute node has hardware configuration information and neuron scale information, and the system further includes:
the neuron number determination module is used for hierarchically dividing the plurality of neurons according to the hardware configuration information, the neuron scale information, and a preset adjacency matrix, to determine the neuron number of each neuron;
the buffer data distribution table construction module is used for performing a column traversal operation on the adjacency matrix in units of computing cores, determining the computing core to which each neuron belongs, and constructing the buffer data distribution table;
the neuron data distribution table construction module is used for performing a column traversal operation on the adjacency matrix in units of neurons, determining the neurons that receive the neural pulse data packet, and constructing the neuron data distribution table;
and the neuron data sending table construction module is used for performing a row traversal operation on the adjacency matrix in units of neurons, determining the neurons that receive the new neural pulse data packet, and constructing the neuron data sending table.
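The row and column traversals over the adjacency matrix can be sketched as follows. The even split of neurons across cores and the dense-matrix representation are assumptions for illustration; the patent only specifies which axis each table is built from.

```python
# Sketch of deriving the tables from an adjacency matrix:
# adj[i][j] != 0 means neuron i projects to neuron j with that weight.
# Columns (who receives) build the distribution table; rows (who is
# sent to) build the sending table.

def build_tables(adj, neurons_per_core):
    n = len(adj)
    core_of = {i: i // neurons_per_core for i in range(n)}  # core assignment
    recv = {}  # neuron data distribution table: column traversal
    send = {}  # neuron data sending table: row traversal
    for j in range(n):
        recv[j] = [(i, adj[i][j]) for i in range(n) if adj[i][j]]
    for i, row in enumerate(adj):
        send[i] = [(j, w) for j, w in enumerate(row) if w]
    return core_of, recv, send

adj = [[0, 0.5], [0, 0]]  # neuron 0 -> neuron 1, weight 0.5
core_of, recv, send = build_tables(adj, neurons_per_core=1)
print(send[0], recv[1])  # -> [(1, 0.5)] [(0, 0.5)]
```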
Optionally, the routing core is further provided with a sending buffer, and the data packet sending sub-module includes:
a data packet sending unit, configured to send the new neural pulse data packet to the second target neuron when the second target neuron is in the target computing core;
a saving unit, configured to save the new neural pulse data packet to the sending buffer when the second target neuron is not in the target computing core;
the communication core is used for sending the new nerve pulse data packet to the FPGA module;
the FPGA module 111 is configured to send the new neural impulse data packet to the second target neuron.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, the module and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present invention, it should be understood that the disclosed system and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (6)

1. A fast high-concurrency neural pulse data packet distribution and transmission method is applied to a computing node, the computing node comprises an FPGA module and a plurality of CPU modules, each CPU module comprises a routing core and a plurality of computing cores, each computing core comprises at least one neuron, and the method comprises the following steps:
when the computing node receives a nerve pulse data packet, the FPGA module sends the nerve pulse data packet to a target CPU module corresponding to the CPU number according to the CPU number in the nerve pulse data packet;
when the target CPU module receives the nerve pulse data packet, sending the nerve pulse data packet to a target computing core corresponding to the computing core number through the routing core according to a preset buffer area data distribution table and the computing core number in the nerve pulse data packet;
when the target computing core receives the nerve pulse data packet, sending the nerve pulse data packet to a first target neuron corresponding to the neuron number through the target computing core according to a preset neuron data distribution table and the neuron number in the nerve pulse data packet;
activating, by the target computing core, the first target neuron with the neural pulse data packet;
the method further comprises the following steps:
generating a new neural impulse data packet by the first target neuron and returning to the target computational core;
reselecting a second target neuron by the target computing core according to the new nerve pulse data packet and a preset neuron data sending table;
sending, by the target computing core, the new neural impulse data packet to the second target neuron;
the CPU module further comprises a communication core, the routing core is provided with a receiving buffer area, and when the target CPU module receives the nerve pulse data packet, the step of sending the nerve pulse data packet to a target computing core corresponding to the computing core number through the routing core according to a preset buffer area data distribution table and the computing core number in the nerve pulse data packet comprises the following steps:
when the target CPU module receives the nerve pulse data packet, storing the nerve pulse data packet to the receiving buffer area through the communication core;
searching a calculation core number in the nerve pulse data packet in a preset buffer data distribution table through the routing core;
and sending the nerve pulse data packet in the receiving buffer area to a target computing core corresponding to the computing core number through the routing core.
2. The method of claim 1, wherein the compute node has hardware configuration information and neuron size information, and wherein prior to the step of the compute node receiving a neural impulse packet, the method further comprises:
hierarchically dividing the plurality of neurons according to the hardware configuration information, the neuron scale information and a preset adjacency matrix, and determining the neuron number of each neuron;
performing a column traversal operation on the adjacency matrix in units of computing cores, determining the computing core to which each neuron belongs, and constructing the buffer data distribution table;
performing a column traversal operation on the adjacency matrix in units of neurons, determining the neurons receiving the nerve pulse data packet, and constructing the neuron data distribution table;
and performing a row traversal operation on the adjacency matrix in units of neurons, determining the neurons receiving the new nerve pulse data packet, and constructing the neuron data sending table.
3. The method of claim 1, wherein the routing core is further provided with a sending buffer, and the step of sending the new nerve pulse data packet to the second target neuron through the target computing core comprises:
sending, by the target computing core, the new nerve pulse data packet to the second target neuron when the second target neuron is in the target computing core;
when the second target neuron is not in the target computing core, saving the new nerve pulse data packet to the sending buffer through the target computing core;
sending the new nerve pulse data packet to the FPGA module through the communication core;
and sending, by the FPGA module, the new nerve pulse data packet to the second target neuron.
4. A rapid high-concurrency neural pulse data packet distribution and transmission system is characterized by comprising a computing node, wherein the computing node comprises an FPGA module and a plurality of CPU modules, each CPU module comprises a routing core and a plurality of computing cores, and each computing core comprises at least one neuron;
the FPGA module includes:
the target CPU data sending submodule is used for sending the nerve pulse data packet to a target CPU module corresponding to a CPU number according to the CPU number in the nerve pulse data packet when the computing node receives the nerve pulse data packet;
the routing core comprises:
the target calculation core data sending submodule is used for sending the neural pulse data packet to a target calculation core corresponding to the calculation core number according to a preset buffer area data distribution table and the calculation core number in the neural pulse data packet when the target CPU module receives the neural pulse data packet;
the computing core includes:
the target neuron data sending submodule is used for sending the nerve pulse data packet to a first target neuron corresponding to the neuron number according to a preset neuron data distribution table and the neuron number in the nerve pulse data packet when the target computing core receives the nerve pulse data packet;
an activation sub-module for activating the first target neuron using the neural pulse data packet;
the first target neuron comprises:
the data packet returning submodule is used for generating a new nerve pulse data packet and returning the new nerve pulse data packet to the target computing core;
the computing core further comprises:
the neuron reselection submodule is used for reselecting a second target neuron according to the new nerve pulse data packet and a preset neuron data sending table;
a data packet sending submodule for sending the new neural impulse data packet to the second target neuron through the target computing core;
the CPU module also comprises a communication core, and the routing core is provided with a receiving buffer area;
the communication core includes:
the data packet storage submodule is used for storing the neural pulse data packet to the receiving buffer area when the target CPU module receives the neural pulse data packet;
the target computing core data sending submodule comprises:
the target calculation core determining unit is used for retrieving the calculation core number in the nerve pulse data packet in a preset buffer area data distribution table;
and the calculation core data sending unit is used for sending the nerve pulse data packet in the receiving buffer area to a target calculation core corresponding to the calculation core number.
5. The system of claim 4, wherein the compute node has hardware configuration information and neuron scale information, the system further comprising:
the neuron number determining module is used for hierarchically dividing the plurality of neurons according to the hardware configuration information, the neuron scale information and a preset adjacency matrix, to determine the neuron number of each neuron;
the buffer data distribution table construction module is used for performing a column traversal operation on the adjacency matrix in units of computing cores, determining the computing core to which each neuron belongs, and constructing the buffer data distribution table;
the neuron data distribution table construction module is used for performing a column traversal operation on the adjacency matrix in units of neurons, determining the neurons receiving the nerve pulse data packet, and constructing the neuron data distribution table;
and the neuron data sending table construction module is used for performing a row traversal operation on the adjacency matrix in units of neurons, determining the neurons receiving the new nerve pulse data packet, and constructing the neuron data sending table.
6. The system of claim 4, wherein the routing core is further provided with a sending buffer, and the data packet sending submodule comprises:
a data packet sending unit, configured to send the new nerve pulse data packet to the second target neuron when the second target neuron is in the target computing core;
a saving unit, configured to save the new nerve pulse data packet to the sending buffer when the second target neuron is not in the target computing core;
the communication core is used for sending the new nerve pulse data packet to the FPGA module;
and the FPGA module is used for sending the new nerve pulse data packet to the second target neuron.
CN202011096640.9A 2020-10-14 2020-10-14 Rapid high-concurrency neural pulse data packet distribution and transmission method and system Active CN112242963B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011096640.9A CN112242963B (en) 2020-10-14 2020-10-14 Rapid high-concurrency neural pulse data packet distribution and transmission method and system


Publications (2)

Publication Number Publication Date
CN112242963A CN112242963A (en) 2021-01-19
CN112242963B true CN112242963B (en) 2022-06-24

Family

ID=74169157


Country Status (1)

Country Link
CN (1) CN112242963B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116468086A (en) * 2022-01-11 2023-07-21 北京灵汐科技有限公司 Data processing method and device, electronic equipment and computer readable medium
CN116011563B (en) * 2023-03-28 2023-07-21 之江实验室 High-performance pulse transmission simulation method and device for pulse relay

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106056212A (en) * 2016-05-25 2016-10-26 清华大学 Artificial neural network calculating core
CN110991626A (en) * 2019-06-28 2020-04-10 广东工业大学 Multi-CPU brain simulation system
CN111082949A (en) * 2019-10-29 2020-04-28 广东工业大学 Method for efficiently transmitting pulse data packets in brain-like computer
CN111210014A (en) * 2020-01-06 2020-05-29 清华大学 Control method and device of neural network accelerator and neural network accelerator

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105095961B (en) * 2015-07-16 2017-09-29 清华大学 A kind of hybrid system of artificial neural network and impulsive neural networks
WO2018137412A1 (en) * 2017-01-25 2018-08-02 清华大学 Neural network information reception method, sending method, system, apparatus and readable storage medium


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"A multicast routing scheme for a universal spiking neural network architecture."; Wu, Jian, and Steve Furber; The Computer Journal; 20100330; vol. 53, no. 3, pp. 280-288 *
"Design of a multi-core parallel spiking neural network simulator" (多核并行脉冲神经网络模拟器的设计); Liu Jiahua, Chen Jingyu; Computer Engineering and Applications (计算机工程与应用); 20191231; vol. 56, no. 22, pp. 244-250 *
"A survey of the ideas and architecture of brain-inspired computers" (类脑机的思想与体系结构综述); Liu Yijun et al.; Journal of Computer Research and Development (计算机研究与发展); 20190615; vol. 56, no. 06, pp. 1135-1148 *

Also Published As

Publication number Publication date
CN112242963A (en) 2021-01-19

Similar Documents

Publication Publication Date Title
US9818058B2 (en) Time-division multiplexed neurosynaptic module with implicit memory addressing for implementing a universal substrate of adaptation
US8429107B2 (en) System for address-event-representation network simulation
CN112242963B (en) Rapid high-concurrency neural pulse data packet distribution and transmission method and system
US10713561B2 (en) Multiplexing physical neurons to optimize power and area
US8843425B2 (en) Hierarchical routing for two-way information flow and structural plasticity in neural networks
CN111082949B (en) Method for efficiently transmitting pulse data packets in brain-like computer
CN107430704A (en) Neural network algorithm is realized in nerve synapse substrate based on the metadata associated with neural network algorithm
US10198692B2 (en) Scalable neural hardware for the noisy-OR model of Bayesian networks
US20140180987A1 (en) Time-division multiplexed neurosynaptic module with implicit memory addressing for implementing a neural network
CN111131304B (en) Cloud platform-oriented large-scale virtual machine fine-grained abnormal behavior detection method and system
CN106056212A (en) Artificial neural network calculating core
CN106056211A (en) Neuron computing unit, neuron computing module and artificial neural network computing core
CN110991626B (en) Multi-CPU brain simulation system
CN109684290A (en) Log storing method, device, equipment and computer readable storage medium
CN110086731A (en) A kind of cloud framework lower network data stabilization acquisition method
CN103530304A (en) On-line recommendation method, system and mobile terminal based on self-adaption distributed computation
CN112149815A (en) Population clustering and population routing method for large-scale brain-like computing network
Joshi et al. Scalable event routing in hierarchical neural array architecture with global synaptic connectivity
CN112491572B (en) Method and device for predicting connection state between terminals and analysis equipment
EP0559100B1 (en) Method and apparatus for data distribution
US20200242457A1 (en) High dynamic range, high class count, high input rate winner-take-all on neuromorphic hardware
KR20210091688A (en) Data processing module, data processing system and data processing method
JPH0769893B2 (en) Neural network simulator
Singh et al. Advanced pre-fetching based dynamic data replication under small world network model
KR20210004342A (en) Neuromorphic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant