CN110991626A - Multi-CPU brain simulation system - Google Patents

Multi-CPU brain simulation system

Info

Publication number
CN110991626A
CN110991626A (application CN201910582931.XA; also published as CN110991626B)
Authority
CN
China
Prior art keywords
computing node
cpu
brain
core
computing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910582931.XA
Other languages
Chinese (zh)
Other versions
CN110991626B (en)
Inventor
刘怡俊
梁君泽
叶武剑
翁韶伟
张子文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN201910582931.XA
Publication of CN110991626A
Application granted
Publication of CN110991626B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/06 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/061 - Physical realisation using biological neurons, e.g. biological neurons connected to an integrated circuit
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 - Partitioning or combining of resources
    • G06F9/5066 - Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/06 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 - Physical realisation using electronic means
    • G06N3/065 - Analogue means
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a multi-CPU brain-like simulation system comprising a plurality of brain-like simulation system motherboards that are connected in sequence. Each brain-like simulation system motherboard consists of six computing nodes, and the computing nodes communicate with the motherboard through SATA interfaces. Each computing node comprises a plurality of CPUs and a routing system; the routing system consists of an FPGA and a CAM, and the CPUs are connected to the FPGA through RGMII communication interfaces. The computing nodes on the same motherboard are logically connected in a regular-hexagon interconnection structure, in which each edge of a regular hexagon represents one logical connection. Based on the routing system composed of the FPGA and the CAM and on the regular-hexagon interconnection structure, the invention effectively connects a large number of CPUs, so that the number of physical connections is reduced while normal communication between the computing nodes is maintained, the difficulty of implementing the system is reduced, and the system has good expansion capability.

Description

Multi-CPU brain simulation system
Technical Field
The invention relates to the technical field of computers, in particular to a multi-CPU brain simulation system.
Background
Through long-term natural development and biological evolution, the human brain has gradually formed extremely strong logical thinking ability and excellent intelligent perception ability. The human brain can reason about the things and living beings that humans encounter, and can draw broader inferences from a single instance. By means of touch, vision, hearing, logical reasoning and decision-making strategies, the human brain can easily deal with a wide variety of problems. These abilities are something modern computers cannot yet match, and they are also what modern computer technology continually strives toward. For the further development and study of intelligent brain-like computing, biological brain mechanisms, especially the mechanisms of the human brain, should be taken as a reference, and constructing a general intelligent system on that basis should be the primary approach.
Brain-like computing refers to simulating the nervous system of the brain and its information-processing process in order to realize a high-performance, low-power computing system. Although existing supercomputers are capable of brain-like computation, their high power consumption, high cost, large volume and low efficiency make it difficult to advance brain-like computing research and to realize applications.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and to provide a multi-CPU brain-like simulation system that has low power consumption, low implementation difficulty and reasonable cost, and that can be scaled up or down according to the computational load of the task.
In order to achieve the purpose, the technical scheme provided by the invention is as follows:
a multi-CPU brain analog system comprises a plurality of brain analog system main boards which are sequentially connected;
the brain-like simulation system mainboard consists of six computing nodes, and the computing nodes and the brain-like simulation system mainboard are communicated by adopting SATA interfaces;
the computing node comprises a plurality of CPUs and a routing system; the routing system consists of an FPGA and a CAM, and the CPU is connected with the FGPA by adopting an RGMII communication interface;
the computing nodes in the same brain simulation system mainboard are connected logically, a regular hexagon interconnection structure is adopted, and each edge of the regular hexagon represents a logical connection.
Further, the six computing nodes are a first, a second, a third, a fourth, a fifth and a sixth computing node, respectively;
in the logical connection, the regular-hexagon interconnection structure is as follows:
the first, second, third, fourth, fifth and sixth computing nodes are all regular hexagons;
wherein the first, second and third computing nodes are connected edge to edge in sequence; the fifth and sixth computing nodes fit seamlessly between the first and second computing nodes and between the second and third computing nodes, respectively; after this seamless fitting, the fifth and sixth computing nodes are connected to each other; and two adjacent edges of the fourth computing node coincide with one edge of the first computing node and one edge of the fifth computing node, respectively.
Further, in the physical connection, the first computing node is connected to the second, fourth and fifth computing nodes respectively; the second computing node is connected to the first, third, fifth and sixth computing nodes respectively; the third computing node is connected to the second and sixth computing nodes respectively; the fourth computing node is connected to the first and fifth computing nodes respectively; the fifth computing node is connected to the first, second, fourth and sixth computing nodes respectively; and the physical connections between the computing nodes correspond one to one to the logical connections between the six regular hexagons of the regular-hexagon interconnection structure.
Furthermore, the first, third, fourth and sixth computing nodes are each provided with an external link for connection to the motherboard of an external brain-like simulation system;
the external link provided on the first computing node is connected to the link between the first and second computing nodes, the link between the first and fourth computing nodes, and the link between the first and fifth computing nodes;
the external link provided on the third computing node is connected to the link between the third and second computing nodes and the link between the third and sixth computing nodes;
the external link provided on the fourth computing node is connected to the link between the fourth and first computing nodes and the link between the fourth and fifth computing nodes;
and the external link provided on the sixth computing node is connected to the link between the sixth and second computing nodes, the link between the sixth and third computing nodes, and the link between the sixth and fifth computing nodes.
Furthermore, the CPU is a single-core CPU; a thread for simulating neurons and a clock synchronization thread are bound to the core of the CPU, and the core is provided with an internal cache for storing the pulse data packets generated by the neurons in the thread, an intra-core routing table, and a weight table.
Further, the intra-core routing table contains the in-core neuron ID, a weight address and the number of weights; the weight table contains an offset address, a destination neuron ID and a weight.
Furthermore, the CPU is a multi-core CPU; one core of the CPU is bound with a routing thread that handles the routing of pulse data packets between neuron threads, one core is bound with a clock synchronization thread, and the remaining cores are bound with threads for simulating neurons, each provided with an internal cache for storing the pulse data packets generated by the neurons in the thread and a receiving cache for storing the pulse data packets sent by other threads; and the multi-core CPU is provided with a first out-of-core routing table and a second out-of-core routing table for addressing pulse data packets arriving from outside the thread.
Further, the first out-of-core routing table contains the in-core neuron ID, an offset address and the number of entries; the second out-of-core routing table contains a CPU number, a core number, a weight address and the number of weights.
Furthermore, the CPU is provided with an external receiving buffer for receiving the pulse data packet sent by the routing system.
Compared with the prior art, the principles and advantages of this solution are as follows:
1. Based on the routing system consisting of an FPGA and a CAM and on the regular-hexagon interconnection structure, a large number of CPUs are effectively connected, so that the number of physical connections is reduced while normal communication between the computing nodes is maintained, the difficulty of implementing the system is reduced, the system scales well, and a hardware basis is provided for simulating neurons with multiple CPUs.
2. The first out-of-core routing table is built with the in-core neuron ID as its index, which greatly reduces the number of entries in the out-of-core routing table and shortens the time needed to traverse the weight table.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a multi-CPU brain-like simulation system according to the present invention;
FIG. 2 is a schematic structural diagram of a brain-like simulation system motherboard in a multi-CPU brain-like simulation system according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a computing node in a multi-CPU brain-like simulation system according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the interconnection within a brain-like simulation system motherboard in a multi-CPU brain-like simulation system according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the interconnection between brain-like simulation system motherboards in a multi-CPU brain-like simulation system according to an embodiment of the present invention;
FIG. 6 shows the addressing process of a pulse data packet when a single core of a single CPU simulates neurons;
FIG. 7 is a block diagram of a system in which multiple cores of a single CPU simulate neurons;
FIG. 8 shows the addressing process of an external pulse data packet when multiple cores of a single CPU simulate neurons;
FIG. 9 is a schematic diagram of a system in which multiple cores of multiple CPUs simulate neurons;
FIG. 10 is a schematic diagram of interconnection structures based on regular quadrangles, regular hexagons and regular octagons.
Detailed Description
The invention is further illustrated below with reference to three specific examples:
example 1:
As shown in FIG. 1, the multi-CPU brain-like simulation system according to this embodiment includes a plurality of brain-like simulation system motherboards 1, which are connected in sequence. Each brain-like simulation system motherboard 1 is composed of six computing nodes 2, namely a first, a second, a third, a fourth, a fifth and a sixth computing node, and the computing nodes 2 communicate with the brain-like simulation system motherboard 1 through SATA interfaces, as shown in FIG. 2.
As shown in FIG. 3, each computing node 2 includes eight CPUs and a routing system; the routing system consists of an FPGA and a CAM, and the CPUs are connected to the FPGA through RGMII communication interfaces.
Because the nodes of a multi-CPU brain-like simulation system require a large number of connections, the connections between circuit boards need to be symmetrical and allow unlimited expansion. Therefore, in this embodiment, the computing nodes 2 on the same brain-like simulation system motherboard 1 are logically connected in a regular-hexagon interconnection structure, in which each edge of a regular hexagon represents one logical connection.
As shown in FIG. 4(a), the regular-hexagon interconnection structure is as follows:
The first, second, third, fourth, fifth and sixth computing nodes are all regular hexagons. The first, second and third computing nodes are connected edge to edge in sequence; the fifth and sixth computing nodes fit seamlessly between the first and second computing nodes and between the second and third computing nodes, respectively; after this seamless fitting, the fifth and sixth computing nodes are connected to each other; and two adjacent edges of the fourth computing node coincide with one edge of the first computing node and one edge of the fifth computing node, respectively.
As shown in FIG. 4(b), in the physical connection, the first computing node is connected to the second, fourth and fifth computing nodes respectively; the second computing node is connected to the first, third, fifth and sixth computing nodes respectively; the third computing node is connected to the second and sixth computing nodes respectively; the fourth computing node is connected to the first and fifth computing nodes respectively; and the fifth computing node is connected to the first, second, fourth and sixth computing nodes respectively. The physical connections between the computing nodes correspond one to one to the logical connections between the six regular hexagons of the regular-hexagon interconnection structure. Computing node pairs whose hexagons share no logically adjacent edge, for example the third and fifth computing nodes, have no physical connection on the brain-like simulation system motherboard 1.
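For illustration only (this sketch is not part of the original disclosure), the nine intra-board physical links listed above can be written down as a small adjacency table in C; the array name and the use of node numbers 1 to 6 are assumptions made for the example.

    #include <stdio.h>

    /* The nine physical links between the six computing nodes of one motherboard,
     * exactly as listed in the text above (node numbering 1-6). */
    static const int INTRA_BOARD_LINKS[][2] = {
        {1, 2}, {1, 4}, {1, 5},   /* first node  <-> second, fourth, fifth */
        {2, 3}, {2, 5}, {2, 6},   /* second node <-> third, fifth, sixth   */
        {3, 6},                   /* third node  <-> sixth                 */
        {4, 5},                   /* fourth node <-> fifth                 */
        {5, 6},                   /* fifth node  <-> sixth                 */
    };

    int main(void)
    {
        size_t n = sizeof(INTRA_BOARD_LINKS) / sizeof(INTRA_BOARD_LINKS[0]);
        printf("%zu physical links per motherboard:\n", n);
        for (size_t i = 0; i < n; i++)
            printf("  node %d <-> node %d\n",
                   INTRA_BOARD_LINKS[i][0], INTRA_BOARD_LINKS[i][1]);
        return 0;
    }

Each pair corresponds to one shared hexagon edge in FIG. 4(a), which is why nine physical links per motherboard suffice.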
Based on the abundant routing resources and multi-way switching capability of the FPGA and on the regular-hexagon communication network structure, the number of connections between motherboards is reduced by sharing lines. The logical connections remain unchanged; physically, the first, third, fourth and sixth computing nodes are each provided with an external link for connection to an external brain-like simulation system motherboard 1. The external link provided on the first computing node is connected to the link between the first and second computing nodes, the link between the first and fourth computing nodes, and the link between the first and fifth computing nodes; the external link provided on the third computing node is connected to the link between the third and second computing nodes and the link between the third and sixth computing nodes; the external link provided on the fourth computing node is connected to the link between the fourth and first computing nodes and the link between the fourth and fifth computing nodes; and the external link provided on the sixth computing node is connected to the link between the sixth and second computing nodes, the link between the sixth and third computing nodes, and the link between the sixth and fifth computing nodes, as shown in FIG. 5.
In this embodiment 1, the CPUs are single-core CPUs, and neurons are simulated with a single core of a single CPU, as follows:
In a biological neural network, neurons of the same type cluster together and are connected over short distances, while neurons of different types are connected over relatively long distances. Therefore, when neurons are partitioned onto CPUs, neurons of the same type are generally assigned to the same CPU or the same CPU core. A brain-like simulation system must simulate a very large number of neurons, and the computation time of a single neuron is short; if each neuron were simulated by a separate process, process switching would be extremely frequent and system resources would be used inefficiently. In this embodiment 1, a core of the CPU is therefore bound with a thread that simulates a certain number of neurons and with a clock synchronization thread, and the core is provided with an internal cache for storing the pulse data packets generated by the neurons in the thread, an intra-core routing table, and a weight table. The intra-core routing table contains the in-core neuron ID, a weight address and the number of weights; the weight table contains an offset address, a destination neuron ID and a weight.
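As an editorial illustration (not taken from the patent), the per-core binding and the two tables described above might look as follows in C on Linux/glibc; the struct layouts, field names and the use of pthread_setaffinity_np are assumptions made for the sketch.

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>
    #include <stdint.h>

    struct intra_core_route {        /* intra-core routing table entry */
        uint32_t neuron_id;          /* in-core (source) neuron ID     */
        uint32_t weight_addr;        /* offset into the weight table   */
        uint32_t weight_count;       /* number of weight entries       */
    };

    struct weight_entry {            /* weight table entry             */
        uint32_t dst_neuron_id;      /* destination neuron ID          */
        float    weight;             /* synaptic weight                */
    };

    /* Pin the calling thread to one core, as the embodiment binds the
     * neuron-simulation thread and the clock synchronization thread. */
    static int bind_to_core(int core)
    {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(core, &set);
        return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    }

    int main(void)
    {
        return bind_to_core(0);      /* e.g. pin the main thread to core 0 */
    }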
When the clock synchronization signal arrives, the thread first updates the state of its neurons and then reads the internally cached pulse data packets for addressing. As shown in FIG. 6, the internal cache holds a pulse data packet whose in-core neuron ID is D1. The corresponding entry is found in the intra-core routing table using this in-core neuron ID, and the offset address into the weight table and the number of weight entries are obtained from that entry. Using these two items, the destination neuron IDs and weights are looked up in the weight table, and each weight is finally added to the input variable of the corresponding destination neuron.
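A minimal sketch of the FIG. 6 addressing step, assuming the table layouts introduced above; all identifiers and example values (source neuron D1 represented as ID 7) are illustrative and not taken from the patent.

    #include <stdint.h>
    #include <stddef.h>

    struct route_entry  { uint32_t neuron_id, weight_addr, weight_count; };  /* intra-core routing table row */
    struct weight_entry { uint32_t dst_neuron_id; float weight; };           /* weight table row             */

    /* Look up the source neuron in the intra-core routing table, then add every
     * listed weight to the input variable of its destination neuron. */
    static void deliver_spike(uint32_t src_neuron_id,
                              const struct route_entry *routes, size_t n_routes,
                              const struct weight_entry *weights,
                              float *neuron_input)
    {
        for (size_t i = 0; i < n_routes; i++) {
            if (routes[i].neuron_id != src_neuron_id)
                continue;
            const struct weight_entry *w = &weights[routes[i].weight_addr];
            for (uint32_t k = 0; k < routes[i].weight_count; k++)
                neuron_input[w[k].dst_neuron_id] += w[k].weight;
            break;
        }
    }

    int main(void)
    {
        struct route_entry  routes[]  = { { 7 /* "D1" */, 0, 2 } };
        struct weight_entry weights[] = { { 3, 0.5f }, { 4, -0.25f } };
        float input[8] = { 0 };
        deliver_spike(7, routes, 1, weights, input);   /* input[3] and input[4] are updated */
        return 0;
    }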
Example 2:
compared with embodiment 1, in this embodiment 2, the CPU is a multi-core CPU, and a single-CPU multi-core neuron is simulated, specifically as follows:
the single CPU multi-core simulation neuron simulates the neuron by using a plurality of threads to work cooperatively. Two key problems to be solved for realizing single-CPU multi-core simulation of neurons are respectively: the issue of sending packets to other threads and the problem of addressing received burst packets.
To solve these two problems, one core of the multi-core CPU is bound with a routing thread that handles the routing of pulse data packets between neuron threads, one core is bound with a clock synchronization thread, and the remaining cores are bound with threads that simulate neurons, each provided with an internal cache for storing the pulse data packets generated by the neurons in the thread and a receiving cache for storing the pulse data packets sent by other threads. The multi-core CPU is also provided with a first out-of-core routing table and a second out-of-core routing table for addressing pulse data packets arriving from outside the thread. The first out-of-core routing table contains the in-core neuron ID, an offset address and the number of entries; the second out-of-core routing table contains a CPU number, a core number, a weight address and the number of weights. A block diagram of the system in which multiple cores of a single CPU simulate neurons is shown in FIG. 7.
Neurons of the same type are assigned to the same thread as far as possible, so that there are fewer external connections, and the connections between neurons are known when the routing tables are built. A forwarding flag variable is included in the neuron data structure; its lower 7 bits indicate which threads the pulse data packet should be forwarded to. When a bit is set to 1, the packet is forwarded to the corresponding thread; when it is 0, the packet is not forwarded to that thread.
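A short illustrative sketch (not patent code) of how the lower seven bits of the forwarding flag variable could be tested; the mapping of bit i to thread i is an assumption.

    #include <stdint.h>
    #include <stdio.h>

    #define FWD_MASK 0x7Fu    /* lower 7 bits of the forwarding flag: one bit per neuron thread */

    int main(void)
    {
        uint8_t fwd_flag = 0x05;                      /* example: bits 0 and 2 are set */
        for (int t = 0; t < 7; t++)
            if ((fwd_flag & FWD_MASK) & (1u << t))
                printf("forward pulse packet to thread %d\n", t);
        return 0;
    }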
When the clock synchronization signal arrives, the thread first updates the state of its neurons and then reads the internally cached pulse data packets for addressing. When multiple cores of a single CPU simulate neurons, the forwarding flag variable of the neuron is examined at this point to check whether forwarding to the routing thread is needed. If any bit of the forwarding flag variable is set to 1, a data packet is assembled from the CPU number, the CPU core number, the neuron ID and the forwarding flag variable, and is sent to the routing thread through the SOCKET interface. After receiving the data packet, the routing thread checks the forwarding flag variable; if several bits are set to 1, it removes the forwarding flag variable from the data packet, makes the corresponding number of copies, and finally sends them through the SOCKET interface to the receiving caches of the corresponding threads, where they wait to be processed when the next synchronization signal arrives.
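The fan-out performed by the routing thread can be sketched as follows; the packet layout and the send_to_thread() placeholder are assumptions, and the SOCKET transport used in the embodiment is abstracted away.

    #include <stdint.h>
    #include <stdio.h>

    struct pulse_packet {
        uint8_t  cpu_no;        /* CPU number of the source neuron      */
        uint8_t  core_no;       /* CPU core number of the source neuron */
        uint32_t neuron_id;     /* in-core neuron ID                    */
        uint8_t  fwd_flag;      /* lower 7 bits: destination threads    */
    };

    /* Placeholder for the SOCKET send into a thread's receiving cache. */
    static void send_to_thread(int thread, struct pulse_packet pkt)
    {
        printf("to thread %d: neuron %u from CPU %u core %u\n",
               thread, (unsigned)pkt.neuron_id, (unsigned)pkt.cpu_no, (unsigned)pkt.core_no);
    }

    /* The routing thread strips the forwarding flag and sends one copy of the
     * packet to each neuron thread whose bit is set. */
    static void route_packet(struct pulse_packet pkt)
    {
        uint8_t flags = pkt.fwd_flag;
        pkt.fwd_flag = 0;
        for (int t = 0; t < 7; t++)
            if (flags & (1u << t))
                send_to_thread(t, pkt);
    }

    int main(void)
    {
        struct pulse_packet p = { 1, 3, 42, 0x06 };   /* forward to threads 1 and 2 */
        route_packet(p);
        return 0;
    }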
After the internally cached pulse data packets have been processed, the pulse data packets in the receiving cache are processed. As shown in FIG. 8, the receiving cache holds a pulse data packet whose in-core neuron ID is D1. The entry for D1 is looked up in the first out-of-core routing table to obtain the offset address and the number of entries in the second out-of-core routing table. The CPU number and CPU core number in the pulse data packet are compared with those at address 0 of the second out-of-core routing table, and the match succeeds. The weight address and the number of entries are obtained from the entry at address 0, the corresponding destination neuron IDs and weights are found in the weight table, and the weights are added to the input variables of the destination neurons.
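An illustrative sketch of the two-level lookup of FIG. 8, assuming the table layouts described above; all identifiers are assumptions and the tables are filled with toy values.

    #include <stdint.h>
    #include <stddef.h>

    struct first_route  { uint32_t neuron_id, offset, count; };                     /* keyed by in-core neuron ID */
    struct second_route { uint8_t cpu_no, core_no; uint32_t weight_addr, weight_count; };
    struct weight_entry { uint32_t dst_neuron_id; float weight; };

    static void deliver_external_spike(uint32_t neuron_id, uint8_t cpu_no, uint8_t core_no,
                                       const struct first_route *t1, size_t n1,
                                       const struct second_route *t2,
                                       const struct weight_entry *weights,
                                       float *neuron_input)
    {
        for (size_t i = 0; i < n1; i++) {
            if (t1[i].neuron_id != neuron_id)
                continue;
            for (uint32_t j = 0; j < t1[i].count; j++) {        /* slice of the second table   */
                const struct second_route *r = &t2[t1[i].offset + j];
                if (r->cpu_no != cpu_no || r->core_no != core_no)
                    continue;                                    /* source CPU/core must match  */
                for (uint32_t k = 0; k < r->weight_count; k++) {
                    const struct weight_entry *w = &weights[r->weight_addr + k];
                    neuron_input[w->dst_neuron_id] += w->weight; /* accumulate the input        */
                }
            }
            break;                                               /* entry for this neuron done  */
        }
    }

    int main(void)
    {
        struct first_route  t1[] = { { 7 /* "D1" */, 0, 1 } };
        struct second_route t2[] = { { 1, 1, 0, 1 } };           /* packet came from CPU 1, core 1 */
        struct weight_entry w[]  = { { 3, 0.5f } };
        float input[8] = { 0 };
        deliver_external_spike(7, 1, 1, t1, 1, t2, w, input);    /* input[3] += 0.5 */
        return 0;
    }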
Example 3:
and combining a hardware platform of a multi-CPU brain simulation system, and further optimizing the single-CPU multi-core simulated neuron system to realize the multi-CPU multi-core simulated neuron.
Compared with embodiment 2, in this embodiment 3 the multi-core CPU is provided with an external receiving cache for receiving the pulse data packets sent by the routing system, and a network service based on raw sockets is added for receiving data from and sending data to the routing system, as shown in FIG. 9. An outgoing flag bit is also added to the forwarding flag variable of the neuron; setting this bit to 1 indicates that the pulse data packet is to be sent out of the CPU.
The forwarding of a pulse data packet from a thread to the routing thread is the same as when multiple cores of a single CPU simulate neurons. When the forwarding flag variable is examined, the outgoing flag bit must also be checked; if it is 1, the pulse data packet is forwarded to the routing system. After receiving the pulse data packet, the routing system performs routing table matching, and in this example the matching result indicates that the packet should be forwarded to the port of CPU 2. After the routing thread of CPU 2 receives the pulse data packet, it performs a lookup using the first and second out-of-core routing tables, and the result indicates forwarding to thread 2. The routing thread forwards the pulse data packet to thread 2 through the SOCKET interface, where it is stored in the receiving cache and waits to be processed when the next synchronization clock arrives.
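A sketch of the outgoing-flag check added in this embodiment; which bit of the forwarding flag variable serves as the outgoing flag is not stated in the text, so bit 7 is assumed here purely for illustration, and send_to_routing_system() merely stands in for the raw-socket path to the FPGA/CAM routing system.

    #include <stdint.h>
    #include <stdio.h>

    #define LOCAL_THREAD_MASK 0x7Fu   /* bits 0-6: neuron threads on this CPU      */
    #define OUTGOING_BIT      0x80u   /* assumed position of the outgoing flag bit */

    /* Placeholder for the raw-socket path towards the FPGA/CAM routing system. */
    static void send_to_routing_system(uint32_t neuron_id)
    {
        printf("neuron %u: pulse packet leaves the CPU via the routing system\n",
               (unsigned)neuron_id);
    }

    static void dispatch(uint32_t neuron_id, uint8_t fwd_flag)
    {
        if (fwd_flag & OUTGOING_BIT)                     /* outgoing flag set: off-CPU delivery */
            send_to_routing_system(neuron_id);
        for (int t = 0; t < 7; t++)                      /* local threads are still served      */
            if (fwd_flag & LOCAL_THREAD_MASK & (1u << t))
                printf("neuron %u: forward to local thread %d\n", (unsigned)neuron_id, t);
    }

    int main(void)
    {
        dispatch(42, OUTGOING_BIT | 0x02);   /* off-CPU and to local thread 1 */
        return 0;
    }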
In the three embodiments above, the regular-hexagon interconnection structure reduces the number of physical connections while maintaining normal communication between the computing nodes, lowers the difficulty of implementing the system, and provides good expansion capability. In the general design, network structures based on regular quadrangles, regular hexagons or regular octagons could be considered. The regular-hexagon and regular-octagon structures have the same network diameter, while the regular-quadrangle structure has the largest network diameter of the three and therefore longer transmission times; the regular-quadrangle structure has the fewest physical links of the three, and the regular-octagon structure has the most. As shown in FIG. 10, a regular-octagon network contains both regular octagons and regular quadrilateral regions, so it is not an expandable, fully symmetric structure, and the regular octagon is therefore not adopted. Because the nodes of the brain-like simulation system require many connections, each computing node should offer as many connections to other nodes as possible, and the regular quadrangle offers fewer connections per node than the regular hexagon. The regular-hexagon structure is therefore chosen as the interconnection structure of the brain-like simulation system.
In addition, keeping the number of routing table entries small reduces query time and improves system performance. When neurons are assigned to CPUs, neurons of the same type are placed on the same CPU or the same core as far as possible. Suppose that all the neurons in core 1 of CPU 1, with in-core IDs 0-999, send pulse data packets to core 2 of CPU 1, and that a small number of pulse data packets from other CPU cores are also sent to core 2 of CPU 1. If the first out-of-core routing table were built with the CPU number and core ID as the index and the second out-of-core routing table with the core ID as the index, the first out-of-core routing table would contain a large number of empty rows, and each traversal of the second out-of-core routing table would take a long time. If instead the first out-of-core routing table is built with the in-core neuron ID as the index, its number of entries is greatly reduced, and traversing the weight table does not take too long.
The embodiments described above are merely preferred embodiments of the present invention, and the scope of protection of the present invention is not limited thereto; variations based on the shapes and principles of the present invention shall also fall within the scope of protection of the present invention.

Claims (9)

1. A multi-CPU brain-like simulation system, characterized by comprising a plurality of brain-like simulation system motherboards (1), wherein the brain-like simulation system motherboards (1) are connected in sequence;
the brain-like simulation system motherboard (1) consists of six computing nodes (2), and the computing nodes (2) communicate with the brain-like simulation system motherboard (1) through SATA interfaces;
the computing node (2) comprises a plurality of CPUs and a routing system; the routing system consists of an FPGA and a CAM, and the CPUs are connected to the FPGA through RGMII communication interfaces;
the computing nodes (2) on the same brain-like simulation system motherboard (1) are logically connected in a regular-hexagon interconnection structure, and each edge of a regular hexagon represents one logical connection.
2. The multi-CPU brain-like simulation system according to claim 1, wherein the six computing nodes (2) are a first, a second, a third, a fourth, a fifth and a sixth computing node, respectively;
in the logical connection, the regular-hexagon interconnection structure is as follows:
the first, second, third, fourth, fifth and sixth computing nodes are all regular hexagons;
wherein the first, second and third computing nodes are connected edge to edge in sequence; the fifth and sixth computing nodes fit seamlessly between the first and second computing nodes and between the second and third computing nodes, respectively; after this seamless fitting, the fifth and sixth computing nodes are connected to each other; and two adjacent edges of the fourth computing node coincide with one edge of the first computing node and one edge of the fifth computing node, respectively.
3. The multi-CPU brain-like simulation system according to claim 2, wherein the first computing node is connected to the second, fourth and fifth computing nodes respectively; the second computing node is connected to the first, third, fifth and sixth computing nodes respectively; the third computing node is connected to the second and sixth computing nodes respectively; the fourth computing node is connected to the first and fifth computing nodes respectively; the fifth computing node is connected to the first, second, fourth and sixth computing nodes respectively; and the physical connections between the computing nodes correspond one to one to the logical connections between the six regular hexagons of the regular-hexagon interconnection structure.
4. The multi-CPU brain-like simulation system according to claim 3, wherein, in the physical connection, the first, third, fourth and sixth computing nodes are each provided with an external link for connection to an external brain-like simulation system motherboard (1);
the external link provided on the first computing node is connected to the link between the first and second computing nodes, the link between the first and fourth computing nodes, and the link between the first and fifth computing nodes;
the external link provided on the third computing node is connected to the link between the third and second computing nodes and the link between the third and sixth computing nodes;
the external link provided on the fourth computing node is connected to the link between the fourth and first computing nodes and the link between the fourth and fifth computing nodes;
and the external link provided on the sixth computing node is connected to the link between the sixth and second computing nodes, the link between the sixth and third computing nodes, and the link between the sixth and fifth computing nodes.
5. The multi-CPU brain-like simulation system according to claim 1, wherein the CPU is a single-core CPU, a thread for simulating neurons and a clock synchronization thread are bound to the core of the CPU, and the core is provided with an internal cache for storing the pulse data packets generated by the neurons in the thread, an intra-core routing table, and a weight table.
6. The multi-CPU brain-like simulation system according to claim 5, wherein the intra-core routing table contains the in-core neuron ID, a weight address and the number of weights; the weight table contains an offset address, a destination neuron ID and a weight.
7. The multi-CPU brain-like simulation system according to claim 1, wherein the CPU is a multi-core CPU, one core of the CPU is bound with a routing thread that handles the routing of pulse data packets between neuron threads, one core is bound with a clock synchronization thread, and the remaining cores are bound with threads for simulating neurons, each provided with an internal cache for storing the pulse data packets generated by the neurons in the thread and a receiving cache for storing the pulse data packets sent by other threads; and the multi-core CPU is provided with a first out-of-core routing table and a second out-of-core routing table for addressing pulse data packets arriving from outside the thread.
8. The multi-CPU brain-like simulation system according to claim 7, wherein the first out-of-core routing table contains the in-core neuron ID, an offset address and the number of entries; the second out-of-core routing table contains a CPU number, a core number, a weight address and the number of weights.
9. The multi-CPU brain-like simulation system according to claim 7, wherein the CPU is provided with an external receiving cache for receiving the pulse data packets sent by the routing system.
CN201910582931.XA 2019-06-28 2019-06-28 Multi-CPU brain simulation system Active CN110991626B (en)

Priority Applications (1)

Application number: CN201910582931.XA (granted as CN110991626B) - Priority date: 2019-06-28 - Filing date: 2019-06-28 - Title: Multi-CPU brain simulation system

Applications Claiming Priority (1)

Application number: CN201910582931.XA (granted as CN110991626B) - Priority date: 2019-06-28 - Filing date: 2019-06-28 - Title: Multi-CPU brain simulation system

Publications (2)

CN110991626A - published 2020-04-10
CN110991626B - published 2023-04-28

Family

ID=70081580

Family Applications (1)

Application number: CN201910582931.XA - Title: Multi-CPU brain simulation system - Status: Active - Granted as CN110991626B

Country Status (1)

Country Link
CN (1) CN110991626B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108073982A (en) * 2016-11-18 2018-05-25 上海磁宇信息科技有限公司 Class brain computing system
CN108182473A (en) * 2017-12-12 2018-06-19 中国科学院自动化研究所 Full-dimension distributed full brain modeling system based on class brain impulsive neural networks
CN109858620A (en) * 2018-12-29 2019-06-07 北京灵汐科技有限公司 One type brain computing system

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111552563A (en) * 2020-04-20 2020-08-18 南昌嘉研科技有限公司 Multithreading data architecture, multithreading message transmission method and system
CN112242963A (en) * 2020-10-14 2021-01-19 广东工业大学 Rapid high-concurrency neural pulse data packet distribution and transmission method
CN112242963B (en) * 2020-10-14 2022-06-24 广东工业大学 Rapid high-concurrency neural pulse data packet distribution and transmission method and system
CN112270407A (en) * 2020-11-11 2021-01-26 浙江大学 Brain-like computer supporting hundred-million neurons
CN112270407B (en) * 2020-11-11 2022-09-13 浙江大学 Brain-like computer supporting hundred-million neurons
CN113837354A (en) * 2021-08-19 2021-12-24 北京他山科技有限公司 R-SpiNNaker chip

Also Published As

Publication number Publication date
CN110991626B (en) 2023-04-28

Similar Documents

Publication Publication Date Title
CN110991626B (en) Multi-CPU brain simulation system
US11132127B2 (en) Interconnect systems and methods using memory links to send packetized data between different data handling devices of different memory domains
US9009648B2 (en) Automatic deadlock detection and avoidance in a system interconnect by capturing internal dependencies of IP cores using high level specification
US4247892A (en) Arrays of machines such as computers
US8050256B1 (en) Configuring routing in mesh networks
EP2159694B1 (en) Method and device for barrier synchronization, and multicore processor
CN102270180B (en) Multicore processor cache and management method thereof
US6138166A (en) Interconnection subsystem for interconnecting a predetermined number of nodes to form a Moebius strip topology
Ma et al. Process distance-aware adaptive MPI collective communications
US20040073702A1 (en) Shortest path search method "Midway"
US20140177473A1 (en) Hierarchical asymmetric mesh with virtual routers
WO2020078470A1 (en) Network-on-chip data processing method and device
CN104794100A (en) Heterogeneous multi-core processing system based on on-chip network
CN106569896B (en) A kind of data distribution and method for parallel processing and system
CN103744644A (en) Quad-core processor system built in quad-core structure and data switching method thereof
Paul et al. MG-Join: A scalable join for massively parallel multi-GPU architectures
US11573898B2 (en) System and method for facilitating hybrid hardware-managed and software-managed cache coherency for distributed computing
CN113312283A (en) Heterogeneous image learning system based on FPGA acceleration
Li et al. Scalable Graph500 design with MPI-3 RMA
CN111901257B (en) Switch, message forwarding method and electronic equipment
Sun et al. Multi-node acceleration for large-scale GCNs
CN114764374A (en) Method and equipment for executing communication task in accelerator card system
US20230305991A1 (en) Network Computer with Two Embedded Rings
DE102022129890A1 (en) ERROR CORRECTION CODE WITH LOW OVERHEAD
CN110297802A (en) Interconnection architecture between a kind of new types of processors

Legal Events

Code - Description
PB01 - Publication
SE01 - Entry into force of request for substantive examination
GR01 - Patent grant