CN112231269A - Data processing method of multiprocessor system and multiprocessor system - Google Patents


Info

Publication number
CN112231269A
CN112231269A (application CN202011056600.1A)
Authority
CN
China
Prior art keywords
memory
processor
mapper
bit width
data
Prior art date
Legal status
Pending
Application number
CN202011056600.1A
Other languages
Chinese (zh)
Inventor
赖振楠
Current Assignee
Hosin Global Electronics Co Ltd
Original Assignee
Hosin Global Electronics Co Ltd
Priority date
Filing date
Publication date
Application filed by Hosin Global Electronics Co Ltd filed Critical Hosin Global Electronics Co Ltd
Priority to CN202011056600.1A
Publication of CN112231269A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163Interprocessor communication
    • G06F15/167Interprocessor communication using a common memory, e.g. mailbox

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention provides a data processing method for a multiprocessor system, and the multiprocessor system itself. The multiprocessor system comprises a central processing unit, a memory, at least one processor whose bit width differs from that of the memory, and at least one memory mapper, each memory mapper being connected to a corresponding processor. The method comprises the following steps: each memory mapper converts data from the memory into preset-bit-width data, where the preset bit width matches the bit width of the processor connected to that memory mapper; and the processor reads and processes the preset-bit-width data generated by its corresponding memory mapper. Because the memory mapper converts the memory's data into data matching the processor's bit width, the processors need not have the same bit width as the memory. This greatly improves the flexibility of the multiprocessor system, avoids transferring data between processors, and reduces the cost of the system.

Description

Data processing method of multiprocessor system and multiprocessor system
Technical Field
The present invention relates to the field of computer systems, and more particularly, to a data processing method for a multiprocessor system and the multiprocessor system.
Background
A multiprocessor system comprises two or more processors of similar capability that can exchange data with one another and that share a common memory, I/O devices, controllers, and peripherals. The whole hardware system is controlled by a single operating system, which exploits parallelism between processors and programs at the level of jobs, tasks, programs, arrays, and individual operations, thereby improving data processing speed. Multiprocessor systems are now widely used in products such as artificial intelligence, multimedia, and voice communication.
However, because the processors must share the memory, they all need to have the same bit width as the memory. This greatly limits the choice of processors, increases the cost of the multiprocessor system, and makes it inflexible to deploy.
Disclosure of Invention
The technical problem addressed by the embodiments of the present invention is that a conventional multiprocessor system must use processors with the same bit width as its memory, which makes the system costly and inflexible to apply. The embodiments provide a data processing method for a multiprocessor system, and a corresponding multiprocessor system, to solve this problem.
In an embodiment of the present invention, a data processing method for a multiprocessor system is provided, where the multiprocessor system includes a central processing unit, a memory, at least one processor with a bit width different from that of the memory, and at least one memory mapper, and each memory mapper is connected to a corresponding processor, and the method includes:
each memory mapper converts the data of the memory into preset bit width data, and the preset bit width data corresponds to the bit width of a processor connected with the memory mapper;
and the processor reads and processes the preset bit width data generated by the corresponding memory mapper.
Preferably, the memory is PCM, NRAM, MRAM, ReRAM or FeRAM, each memory mapper includes a register buffer, and the register buffer and a processor to which the memory mapper is connected have the same bit width; each memory mapper converts the data of the memory into data with a preset bit width, and the method comprises the following steps:
each memory mapper sequentially writes the narrow bit width data in the memory into different bits of the register buffer according to a clock signal;
and the register buffer outputs data of all bits at the same time to form the preset bit width data.
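The serial-to-parallel conversion in the two steps above can be sketched in a few lines. This is a minimal illustrative model rather than the claimed hardware; the 8-bit memory words and 16-bit processor width in the example are assumptions for demonstration only:

```python
def widen(narrow_words, narrow_bits, wide_bits):
    """Assemble consecutive narrow words read from the memory into one word
    of the processor's bit width, mimicking the register buffer: each clock
    cycle latches one narrow word into the next group of bits, and the full
    register is then driven out at once."""
    assert wide_bits % narrow_bits == 0, "widths assumed to divide evenly"
    slices = wide_bits // narrow_bits
    mask = (1 << narrow_bits) - 1
    reg = 0
    for i, word in enumerate(narrow_words[:slices]):
        # clock i: write the narrow word into bits [i*narrow_bits, (i+1)*narrow_bits)
        reg |= (word & mask) << (i * narrow_bits)
    return reg  # all bits output simultaneously as the preset-bit-width word


# Two 8-bit memory words become one 16-bit processor word.
assert widen([0x34, 0x12], narrow_bits=8, wide_bits=16) == 0x1234
```

The example fills the register starting from the low-order bits; real hardware could equally fill from the high-order end, which is a design choice the patent leaves open.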
Preferably, the memory is a persistent storage-class memory, which includes a memory interface, a control chip, a DRAM chipset, and a flash memory integrated on the same substrate; the memory interface, the DRAM chipset, and the flash memory are each connected to the control chip, and the memory interface is connected to the central processing unit via a memory bus; the method further comprises the following steps:
when receiving a first read-write request of the central processing unit, the control chip acquires an instruction corresponding to the first read-write request from the DRAM chipset and returns the instruction corresponding to the first read-write request to the central processing unit through a memory interface;
and when the instructions in the DRAM chipset meet preset conditions, the control chip acquires a subsequent instruction set of the instructions in the DRAM chipset from the flash memory and moves the subsequent instruction set to the DRAM chipset.
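The control chip's two duties above (serving CPU reads from DRAM and refilling DRAM from flash when a preset condition is met) can be modeled as follows. The threshold-based "preset condition", the class name, and the batch size are illustrative assumptions, not values from the patent:

```python
from collections import deque

class ControlChip:
    """Toy model of the control chip's instruction staging: serve CPU reads
    from the DRAM chipset and, when the pending instructions drop below a
    threshold (one possible "preset condition"), refill from flash."""

    def __init__(self, flash_instructions, threshold=2, capacity=4):
        self.flash = deque(flash_instructions)  # large, slow backing store
        self.dram = deque()                     # small, fast staging area
        self.threshold = threshold
        self.capacity = capacity
        self._refill()

    def _refill(self):
        # move the subsequent instruction set from flash into the DRAM chipset
        while self.flash and len(self.dram) < self.capacity:
            self.dram.append(self.flash.popleft())

    def read(self):
        """Serve one CPU read request, then prefetch if running low."""
        instruction = self.dram.popleft()
        if len(self.dram) < self.threshold:
            self._refill()
        return instruction


chip = ControlChip(range(10))
# the CPU sees the full flash-sized instruction stream, in order
assert [chip.read() for _ in range(10)] == list(range(10))
```

The point of the sketch is that the CPU only ever talks to the small DRAM staging area, yet observes the capacity of the much larger flash.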
Preferably, the processor comprises a graphics processor, the memory mapper comprises a first memory mapper, the persistent storage class memory comprises GDDRs respectively connected to the first memory mapper, and the memory interface, the DRAM chipset, the control chip, the flash memory, the GDDR, the first memory mapper are integrated into the same substrate, the first memory mapper is connected to the graphics processor via a GDDR bus;
the reading and processing of the preset bit width data generated by the corresponding memory mapper by the processor includes: when receiving a second read-write request of the graphics processor, the first memory mapper acquires a graphics processing instruction corresponding to the second read-write request from the GDDR and returns the graphics processing instruction corresponding to the second read-write request to the graphics processor;
each memory mapper converts the data of the memory into data with a preset bit width, and the method comprises the following steps: and when the graphic processing instruction in the GDDR meets a preset condition, the first memory mapper acquires a subsequent graphic processing instruction set of the graphic processing instruction in the GDDR from the flash memory, converts the subsequent graphic processing instruction set into preset bit width data corresponding to the bit width of the graphic processor and then moves the preset bit width data to the GDDR.
Preferably, the processor includes an AI processor, the memory mapper includes a second memory mapper, the persistent storage class memory includes HBMs respectively connected to the second memory mapper, and the memory interface, the DRAM chipset, the control chip, the flash memory, the HBM, and the second memory mapper are integrated into the same substrate, and the second memory mapper is connected to the AI processor via an HBM bus;
the reading and processing of the preset bit width data generated by the corresponding memory mapper by the processor includes: when receiving a third read/write request of the AI processor, the second memory mapper acquires an AI instruction corresponding to the third read/write request from the HBM and returns the AI instruction corresponding to the third read/write request to the AI processor;
each memory mapper converts the data of the memory into data with a preset bit width, and the method comprises the following steps: and when the AI instruction in the HBM meets a preset condition, the second memory mapper acquires a subsequent AI instruction set of the AI instruction in the HBM from the flash memory, converts the subsequent AI instruction set into preset bit width data corresponding to the bit width of the AI processor and then moves the preset bit width data to the HBM.
The embodiment of the invention also provides a multiprocessor system, which comprises a central processing unit, a memory, at least one processor with different bit widths from the memory and at least one memory mapper, wherein each memory mapper is respectively connected with a corresponding processor; wherein:
the memory mapper converts data from the memory into preset bit width data, and the preset bit width data corresponds to the bit width of a processor connected with the memory mapper;
and the processor is used for reading and processing the preset bit width data generated by the corresponding memory mapper.
Preferably, the memory is PCM, NRAM, MRAM, ReRAM or FeRAM, each memory mapper includes a register buffer, and the register buffer and a processor to which the memory mapper is connected have the same bit width;
and the memory mapper sequentially writes the narrow bit width data in the memory into different bits of the register buffer according to a clock signal, and simultaneously outputs data of all bits through the register buffer to form the preset bit width data.
Preferably, the memory is a persistent storage-class memory, which includes a memory interface, a control chip, a DRAM chipset, and a flash memory integrated on the same substrate; the memory interface, the DRAM chipset, and the flash memory are each connected to the control chip, and the memory interface is connected to the central processing unit via a memory bus;
when receiving a first read-write request of the central processing unit, the control chip acquires an instruction corresponding to the first read-write request from the DRAM chipset and returns the instruction corresponding to the first read-write request to the central processing unit through a memory interface;
and when the instructions in the DRAM chipset meet preset conditions, the control chip acquires a subsequent instruction set of the instructions in the DRAM chipset from the flash memory and moves the subsequent instruction set to the DRAM chipset.
Preferably, the processor comprises a graphics processor, the memory mapper comprises a first memory mapper, the persistent storage class memory comprises GDDRs respectively connected to the first memory mapper, and the memory interface, the DRAM chipset, the control chip, the flash memory, the GDDR, the first memory mapper are integrated into the same substrate, the first memory mapper is connected to the graphics processor via a GDDR bus;
when receiving a second read-write request of the graphics processor, the first memory mapper acquires a graphics processing instruction corresponding to the second read-write request from the GDDR and returns the graphics processing instruction corresponding to the second read-write request to the graphics processor;
and when the graphic processing instruction in the GDDR meets a preset condition, the first memory mapper acquires a subsequent graphic processing instruction set of the graphic processing instruction in the GDDR from the flash memory, converts the subsequent graphic processing instruction set into preset bit width data corresponding to the bit width of the graphic processor and then moves the preset bit width data to the GDDR.
Preferably, the processor includes an AI processor, the memory mapper includes a second memory mapper, the persistent storage class memory includes HBMs respectively connected to the second memory mapper, and the memory interface, the DRAM chipset, the control chip, the flash memory, the HBM, and the second memory mapper are integrated into the same substrate, and the second memory mapper is connected to the AI processor via an HBM bus;
when receiving a third read/write request of the AI processor, the second memory mapper acquires an AI instruction corresponding to the third read/write request from the HBM and returns the AI instruction corresponding to the third read/write request to the AI processor;
and when the AI instruction in the HBM meets a preset condition, the second memory mapper acquires a subsequent AI instruction set of the AI instruction in the HBM from the flash memory, converts the subsequent AI instruction set into preset bit width data corresponding to the bit width of the AI processor and then moves the preset bit width data to the HBM.
According to the data processing method and the multiprocessor system of the embodiment of the invention, the data of the memory is converted into the preset bit width data corresponding to the bit width of the processor through the memory mapper, so that the processor does not need to have the same bit width as the memory, the flexibility of the multiprocessor system is greatly improved, the data transfer among the processors is avoided, and the cost of the multiprocessor system is saved.
Drawings
FIG. 1 is a schematic diagram of a multiprocessor system provided in a first embodiment of the present invention;
FIG. 2 is a schematic diagram of a register buffer in the multiprocessor system of FIG. 1 for data conversion;
FIG. 3 is a diagram of a multiprocessor system provided in a second embodiment of the present invention;
FIG. 4 is a schematic diagram of a memory in the multiprocessor system of FIG. 3;
fig. 5 is a flowchart illustrating a data processing method of a multiprocessor system according to a first embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Fig. 1 is a schematic diagram of a multiprocessor system according to a first embodiment of the present invention. The multiprocessor system may be the control system of a product such as an artificial-intelligence, multimedia, or voice-communication device, or a server in a cloud storage system. The multiprocessor system of this embodiment includes a central processing unit (CPU) 11, a memory 12, a memory mapper 16, and a processor 17, with the CPU 11 and the memory 12 each connected to a memory bus. Furthermore, like an existing computer system, the multiprocessor system may further include a direct memory access (DMA) controller 13, a PCIe bridge 14, and a hard disk 15 connected to the PCIe bridge 14 through a PCIe bus, where the hard disk 15 may be a hard disk drive (HDD), a solid-state drive (SSD), or the like. The multiprocessor system may run an embedded operating system, and the DMA controller 13 may move data from the hard disk 15 to the memory 12, or from the memory 12 to the hard disk 15, according to instructions from the central processing unit 11.
The central processing unit 11 and the memory 12 have the same bit width, so the central processing unit 11 can directly access data in the memory 12 through the memory bus. In this embodiment, the processor 17 is connected to the memory bus via the memory mapper 16. The processor 17 may be a graphics processor, an AI processor, or the like, whose bit width differs from that of the memory 12 (i.e., from that of the central processing unit 11), so the processor 17 cannot directly access data in the memory 12.
In this embodiment, the memory mapper 16 may convert the data from the memory 12 into the preset bit width data, where the preset bit width data corresponds to a bit width of the processor 17 connected to the memory mapper 16, and the processor 17 reads and processes the preset bit width data generated by the memory mapper 16. Meanwhile, the memory mapper 16 may also convert the data processed by the processor 17 into data corresponding to the bit width of the memory 12, so that the operation result of the processor 17 may be written back to the memory 12.
In practical applications, the multiprocessor system may include a plurality of memory mappers 16 and a plurality of processors 17 having different bit widths from the memory, and each processor 17 is connected to the memory bus via one memory mapper 16, and the memory mapper 16 performs bit width conversion on the data, so that each processor 17 can process the data in the memory 12 separately.
According to the multiprocessor system, the data of the memory is converted into the preset bit width data corresponding to the bit width of the processor through the memory mapper, so that the processor does not need to have the same bit width as the memory, the flexibility of the multiprocessor system is greatly improved, and the cost of the multiprocessor system is saved.
In this embodiment, the memory 12 may be PCM, NRAM, MRAM, ReRAM, FeRAM, or the like. As shown in fig. 2, each memory mapper 16 includes a register buffer 161, and the register buffer 161 and the processor 17 connected to the memory mapper 16 have the same bit width.
The memory mapper 16 may sequentially write the narrow-bit-width data in the memory 12 to different bits of the register buffer 161 according to the system clock signal CLK, and output data of all bits through the register buffer 161 to form preset-bit-width data, so that the processor 17 may process the data in the memory 12.
Of course, in practical applications, the memory mapper 16 may further include a buffer whose bit width is the same as that of the processor 17. The register buffer 161 writes each assembled word into this buffer, and the processor 17 reads the data from the buffer, which improves data processing efficiency.
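The combination of register buffer plus same-width output buffer can be sketched as a small streaming model. The class and method names, and the use of a FIFO for the buffer, are illustrative assumptions about one way this variant could behave:

```python
from collections import deque

class BufferedMapper:
    """Sketch of the buffered variant: the register buffer assembles each
    wide word slice by slice on successive clock edges, then hands the
    finished word to a FIFO of the processor's bit width, so the processor
    reads completed words without waiting on the assembly."""

    def __init__(self, narrow_bits, wide_bits):
        self.narrow_bits, self.wide_bits = narrow_bits, wide_bits
        self.reg = 0          # register buffer being filled
        self.filled = 0       # number of bits latched so far
        self.fifo = deque()   # output buffer, same bit width as the processor

    def clock_in(self, word):
        # one clock edge: latch a narrow memory word into the next bit slice
        self.reg |= (word & ((1 << self.narrow_bits) - 1)) << self.filled
        self.filled += self.narrow_bits
        if self.filled == self.wide_bits:
            self.fifo.append(self.reg)  # word complete: move it to the buffer
            self.reg, self.filled = 0, 0

    def processor_read(self):
        return self.fifo.popleft()


mapper = BufferedMapper(narrow_bits=8, wide_bits=16)
mapper.clock_in(0xCD)
mapper.clock_in(0xAB)
assert mapper.processor_read() == 0xABCD
```

Decoupling assembly from consumption in this way is what lets the conversion overlap with the processor's own work.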
Fig. 3 to 4 are schematic diagrams of a multiprocessor system according to a second embodiment of the present invention, and similarly, the multiprocessor system may be a control system in a product such as artificial intelligence, multimedia, and voice communication, or may be a server in a cloud storage system.
Similarly, the multiprocessor system of the embodiment includes a central processing unit 31, a memory 32, a DMA controller 33, a PCIe bridge 34, a hard disk 35 connected to the PCIe bridge 34 through a PCIe bus, a memory mapper, and a processor 37, and the central processing unit 31, the memory 32, the DMA controller 33, and the PCIe bridge 34 are respectively connected to the memory bus.
In this embodiment, the memory 32 is a persistent storage-class memory, which includes a memory interface, a control chip 321, a DRAM chipset 322, and a flash memory 328 integrated on the same substrate. The memory interface, the DRAM chipset 322, and the flash memory 328 are each connected to the control chip 321, and the memory interface is connected to the central processing unit 31 via the memory bus. The storage capacity of the flash memory 328 is much greater than that of the DRAM chipset 322.
In this embodiment, in response to a read/write request from the central processing unit 31 connected to the memory interface, the control chip 321 may fetch an instruction set from the DRAM chipset 322 and transmit it to the central processing unit 31 through the memory interface, and may write execution-result data from the central processing unit 31 back into the DRAM chipset 322, thereby implementing data interaction between the central processing unit 31 and the DRAM chipset 322. Specifically, the central processing unit 31 fetches and executes instruction sets from the DRAM chipset 322 according to its program pointer.
The control chip 321 also manages data movement between the DRAM chipset 322 and the flash memory 328. Specifically, when the instruction set (instruction codes and data) remaining for the central processing unit 31 to read in the DRAM chipset 322 meets a preset condition (for example, fewer than a preset number of instructions remain unread), the control chip 321 fetches the subsequent instruction set (instruction codes and data) from the flash memory 328 and stores it in the DRAM chipset 322 for subsequent access by the central processing unit 31.
In this way, the instruction set staged in the DRAM chipset 322 is updated automatically according to the operating state of each central processing unit 31, so that the DRAM chipset 322 appears to have a storage capacity close to that of the flash memory 328 and the central processing unit 31 can remain in a highly efficient operating state. This makes the design suitable for fields with heavy demands on computing resources, such as cloud computing, and can greatly improve the operating efficiency of the system.
In another embodiment of the present invention, the processor 37 includes a graphics processor 371, and the memory mapper includes a first memory mapper 323, which may be implemented as a control chip. Accordingly, the persistent storage-class memory includes a GDDR (Graphics Double Data Rate) memory 324 connected to the first memory mapper 323; the memory interface, the DRAM chipset 322, the control chip 321, the flash memory 328, the GDDR 324, and the first memory mapper 323 are integrated on the same substrate (i.e., the memory mapper is integrated into the memory 32), and the first memory mapper 323 is connected to the graphics processor 371 through a GDDR bus.
In this embodiment, when receiving a second read/write request from the graphics processor 371, the first memory mapper 323 fetches the graphics processing instruction corresponding to that request from the GDDR 324 and returns it to the graphics processor 371 through the GDDR interface. The second read/write request may be, for example, an image or data display instruction; by executing the graphics processing instructions returned by the first memory mapper 323, the graphics processor 371 outputs images or data to a display device such as a monitor.
When the graphics processing instructions in the GDDR 324 meet a preset condition (for example, fewer than a preset number of instructions remain for the graphics processor 371 to read), the first memory mapper 323 further fetches the subsequent graphics processing instruction set from the flash memory 328, converts it into preset-bit-width data matching the bit width of the graphics processor 371, and moves it into the GDDR 324 for subsequent access by the graphics processor 371.
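The first memory mapper thus combines the two mechanisms described earlier: threshold-triggered prefetch from flash, plus bit-width conversion on the way into GDDR. A combined sketch follows; all names, widths, and thresholds are illustrative assumptions rather than values taken from the patent:

```python
def narrow_to_wide(words, narrow_bits, wide_bits):
    # pack consecutive narrow flash words, lowest bits first, into wide words
    per = wide_bits // narrow_bits
    return [sum(w << (j * narrow_bits) for j, w in enumerate(words[i:i + per]))
            for i in range(0, len(words), per)]

class FirstMemoryMapper:
    """Toy model of the first memory mapper: serve graphics-processor read
    requests from GDDR and, when GDDR runs low, fetch the subsequent
    instruction set from flash, convert it to the GPU's bit width, and
    move it into GDDR."""

    def __init__(self, flash_words, flash_bits=8, gpu_bits=32, threshold=2):
        self.flash = list(flash_words)  # narrow words still in flash
        self.gddr = []                  # wide words staged for the GPU
        self.flash_bits, self.gpu_bits = flash_bits, gpu_bits
        self.threshold = threshold
        self._prefetch()

    def _prefetch(self, batch=4):
        # take up to `batch` wide words' worth of narrow words from flash
        per = self.gpu_bits // self.flash_bits
        take, self.flash = self.flash[:batch * per], self.flash[batch * per:]
        self.gddr.extend(narrow_to_wide(take, self.flash_bits, self.gpu_bits))

    def read(self):
        """Serve one GPU read request, prefetching again when GDDR runs low."""
        instruction = self.gddr.pop(0)
        if len(self.gddr) < self.threshold:
            self._prefetch()
        return instruction


# four 8-bit flash words are converted into one 32-bit GPU instruction
mapper = FirstMemoryMapper([0x78, 0x56, 0x34, 0x12])
assert mapper.read() == 0x12345678
```

The second memory mapper of the HBM embodiment below follows the same pattern with the AI processor's bit width substituted.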
In another embodiment of the present invention, the processor 37 includes an AI processor 372, and the memory mapper includes a second memory mapper 325, which may likewise be implemented as a control chip. Accordingly, the persistent storage-class memory includes a high-bandwidth memory (HBM) 326 connected to the second memory mapper 325; the memory interface, the DRAM chipset 322, the control chip 321, the flash memory 328, the HBM 326, and the second memory mapper 325 are integrated on the same substrate (i.e., the memory mapper is integrated into the memory 32), and the second memory mapper 325 is connected to the AI processor 372 via an HBM bus.
In this embodiment, when receiving a third read/write request from the AI processor 372, the second memory mapper 325 fetches the AI instruction corresponding to that request from the HBM 326 and returns it to the AI processor 372.
When the AI instructions in the HBM 326 meet a preset condition (for example, fewer than a preset number of instructions remain for the AI processor 372 to read), the second memory mapper 325 fetches the subsequent AI instruction set from the flash memory 328, converts it into preset-bit-width data matching the bit width of the AI processor 372, and moves it into the HBM 326 for subsequent access by the AI processor 372.
As shown in fig. 5, the present invention further provides a data processing method for a multiprocessor system. The method may be applied to the control system of a product such as an artificial-intelligence, multimedia, or voice-communication device, or to a server in a cloud storage system. Referring to fig. 1, the multiprocessor system includes a central processing unit, a memory, at least one processor whose bit width differs from that of the memory, and at least one memory mapper, each memory mapper being connected to a corresponding processor. The central processing unit and the memory have the same bit width, so the central processing unit can directly access data in the memory through the memory bus. The processor may be a graphics processor, an AI processor, or the like, whose bit width differs from that of the memory (i.e., from that of the central processing unit), so the processor cannot directly access data in the memory.
The method of the embodiment comprises the following steps:
step S51: each memory mapper converts data stored in the memory into preset bit width data, and the preset bit width data corresponds to the bit width of a processor connected with the memory mapper.
Specifically, when the memory adopts PCM, NRAM, MRAM, ReRAM, FeRAM, or the like, each memory mapper includes a register buffer, and the register buffer and the processor to which the memory mapper is connected have the same bit width. The memory mapper can sequentially write the narrow-bit wide data in the memory into different bits of the register buffer according to the system clock signal CLK, and simultaneously output the data of all the bits through the register buffer to form preset bit wide data.
Of course, in practical applications, the memory mapper may further include a buffer whose bit width is the same as that of the processor. The register buffer writes each assembled word into this buffer, and the processor reads the data from the buffer, which improves data processing efficiency.
Step S52: and the processor reads and processes the preset bit width data generated by the corresponding memory mapper. The bit width of the data generated by the memory mapper is the same as the bit width of the processor, so the processor can directly process the data.
The method can also be applied to a multiprocessor system with persistent memory, i.e., the memory is a persistent storage-class memory comprising a memory interface, a control chip, a DRAM chipset, and a flash memory integrated on the same substrate, where the memory interface, the DRAM chipset, and the flash memory are each connected to the control chip, and the memory interface is connected to the central processing unit through a memory bus. The storage capacity of the flash memory is far larger than that of the DRAM chipset.
At this time, the multiprocessor system data processing method includes, in addition to the above-described steps S51 to S52:
when receiving a first read/write request from the central processing unit, the control chip acquires the instruction corresponding to the first read/write request from the DRAM chipset and returns it to the central processing unit through the memory interface;
when the instructions in the DRAM chipset meet a preset condition, the control chip acquires the subsequent instruction set from the flash memory, converts it into preset-bit-width data matching the bit width of the central processing unit, and moves it into the DRAM chipset.
The multiprocessor system may use an embedded operating system, i.e., the central processing unit performs overall operation control based on the embedded operating system, while the control chip automatically updates the instruction set in the DRAM chipset according to the operating state of each central processing unit. The DRAM chipset thus appears to have a storage capacity close to that of the flash memory, and the central processing unit can remain in a highly efficient operating state. This makes the method suitable for fields with heavy demands on computing resources, such as cloud computing, and can greatly improve the operating efficiency of the system.
In another embodiment of the data processing method of the multiprocessor system of the present invention, the processor includes a graphics processor. Accordingly, the memory mapper includes a first memory mapper, the persistent storage-class memory includes a GDDR connected to the first memory mapper, and the memory interface, the DRAM chipset, the control chip, the flash memory, the GDDR, and the first memory mapper are integrated on the same substrate, with the first memory mapper connected to the graphics processor via a GDDR bus.
The converting, by the memory mapper in the step S51, the data of the memory into the data with the preset bit width specifically includes: when the graphic processing instruction in the GDDR meets a preset condition, the first memory mapper acquires a subsequent graphic processing instruction set of the graphic processing instruction in the GDDR from the flash memory, converts the subsequent graphic processing instruction set into preset bit width data corresponding to the bit width of the graphic processor and then moves the preset bit width data to the GDDR.
The reading and processing, by the processor in the step S52, of the preset bit width data generated by the corresponding memory mapper specifically includes: when receiving a second read-write request from the graphics processor, the first memory mapper acquires the graphics processing instruction corresponding to the second read-write request from the GDDR and returns it to the graphics processor.
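The narrow-to-wide conversion a memory mapper performs can be sketched as a serial-in, parallel-out register buffer of the kind described in the register-buffer embodiment: one narrow word is clocked into the next bit positions on each clock, and the data of all bits is then output at once to form the processor-width word. The widths used below (8-bit narrow words, a 32-bit processor bus) are illustrative assumptions, not values taken from the patent.

```python
# Behavioral sketch of a serial-in, parallel-out register buffer that
# widens narrow memory words to a processor's bus width. The widths and
# the clocked-write interface are illustrative assumptions.

class RegisterBuffer:
    def __init__(self, narrow_bits, wide_bits):
        assert wide_bits % narrow_bits == 0
        self.narrow_bits = narrow_bits
        self.slots = wide_bits // narrow_bits  # narrow words per wide word
        self.filled = 0
        self.value = 0

    def clock_in(self, word):
        """On each clock, write one narrow word into the next bit positions."""
        mask = (1 << self.narrow_bits) - 1
        self.value |= (word & mask) << (self.filled * self.narrow_bits)
        self.filled += 1

    def output(self):
        """Output the data of all bits at once, then clear the buffer."""
        assert self.filled == self.slots, "buffer not yet full"
        wide, self.value, self.filled = self.value, 0, 0
        return wide
```

With these assumed widths, four sequential 8-bit writes produce one 32-bit word on the output side, which is the mapper's "preset bit width data" for a 32-bit processor.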
In another embodiment of the data processing method of the multiprocessor system according to the present invention, the processor includes an AI processor, the memory mapper includes a second memory mapper, and the persistent storage class memory includes HBMs respectively connected to the second memory mapper; the memory interface, the DRAM chipset, the control chip, the flash memory, the HBM, and the second memory mapper are integrated into the same substrate, and the second memory mapper is connected to the AI processor via an HBM bus.
At this time, the converting, by the memory mapper in the step S51, of the data of the memory into the data with the preset bit width specifically includes: when the AI instruction in the HBM meets a preset condition, the second memory mapper acquires a subsequent AI instruction set of the AI instruction in the HBM from the flash memory, converts the subsequent AI instruction set into preset bit width data corresponding to the bit width of the AI processor, and then moves the preset bit width data to the HBM.
The reading and processing, by the processor in the step S52, of the preset bit width data generated by the corresponding memory mapper specifically includes: when receiving a third read-write request from the AI processor, the second memory mapper acquires the AI instruction corresponding to the third read-write request from the HBM and returns it to the AI processor.
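A point common to the GDDR and HBM embodiments is that each memory mapper repacks the same flash-resident instruction stream to the bit width of its own processor. The following minimal sketch illustrates that idea with two mappers of different widths sharing one flash image; the widths, the index-based request interface, and all names are assumptions made for the example, not the patent's design.

```python
# Sketch of per-processor width matching: two memory mappers repack one
# shared flash-resident instruction stream to different bus widths.

def repack(words, src_bits, dst_bits):
    """Concatenate src_bits-wide words and re-slice them into dst_bits-wide words."""
    total = 0
    for i, w in enumerate(words):
        total |= (w & ((1 << src_bits) - 1)) << (i * src_bits)
    n_bits = len(words) * src_bits
    return [(total >> off) & ((1 << dst_bits) - 1)
            for off in range(0, n_bits, dst_bits)]

class MemoryMapper:
    def __init__(self, flash_words, flash_bits, proc_bits):
        # Stage the flash image locally, already converted to the
        # connected processor's bit width.
        self.local = repack(flash_words, flash_bits, proc_bits)

    def read(self, index):
        # Read-write request path: return the staged instruction
        # matching the request (simplified to an index lookup).
        return self.local[index]

flash = [0xAA, 0xBB, 0xCC, 0xDD]          # 8-bit words in shared flash
gpu_mapper = MemoryMapper(flash, 8, 16)   # mapper for a 16-bit consumer
ai_mapper = MemoryMapper(flash, 8, 32)    # mapper for a 32-bit consumer
```

Each mapper thus presents the same underlying content at a different preset bit width, which is what lets processors of unequal bus widths share one memory in this architecture.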
The above description covers only preferred embodiments of the present invention, but the scope of the present invention is not limited thereto; any change or substitution that can readily be conceived by those skilled in the art within the technical scope disclosed by the present invention shall fall within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A data processing method for a multiprocessor system, wherein the multiprocessor system includes a central processing unit, a memory, at least one processor having a bit width different from that of the memory, and at least one memory mapper, and each memory mapper is connected to a corresponding processor, the method comprising:
each memory mapper converts the data of the memory into preset bit width data, and the preset bit width data corresponds to the bit width of a processor connected with the memory mapper;
and the processor reads and processes the preset bit width data generated by the corresponding memory mapper.
2. The multiprocessor system data processing method of claim 1, wherein the memory is PCM, NRAM, MRAM, ReRAM or FeRAM, each of the memory mappers includes a register buffer, and the register buffer has the same bit width as the processor to which the memory mapper is connected; and the converting, by each memory mapper, of the data of the memory into data with a preset bit width includes:
each memory mapper sequentially writes the narrow bit width data in the memory into different bits of the register buffer according to a clock signal;
and the register buffer outputs data of all bits at the same time to form the preset bit width data.
3. The multiprocessor system data processing method of claim 1, wherein the memory is a persistent storage class memory, the persistent storage class memory includes a memory interface, a control chip, a DRAM chipset, and a flash memory integrated into the same substrate, the memory interface, the DRAM chipset, and the flash memory are respectively connected to the control chip, and the memory interface is connected to the central processing unit via a memory bus; the method further comprises:
when receiving a first read-write request from the central processing unit, the control chip acquires an instruction corresponding to the first read-write request from the DRAM chipset and returns the instruction corresponding to the first read-write request to the central processing unit through the memory interface; and
when the instructions in the DRAM chipset meet a preset condition, the control chip acquires a subsequent instruction set of the instructions in the DRAM chipset from the flash memory and moves the subsequent instruction set to the DRAM chipset.
4. The multiprocessor system data processing method of claim 3, wherein the processor comprises a graphics processor, the memory mapper comprises a first memory mapper, the persistent storage class memory comprises GDDRs respectively connected to the first memory mapper, the memory interface, the DRAM chipset, the control chip, the flash memory, the GDDR, and the first memory mapper are integrated into the same substrate, and the first memory mapper is connected to the graphics processor via a GDDR bus;
the reading and processing, by the processor, of the preset bit width data generated by the corresponding memory mapper includes: when receiving a second read-write request from the graphics processor, the first memory mapper acquires a graphics processing instruction corresponding to the second read-write request from the GDDR and returns the graphics processing instruction corresponding to the second read-write request to the graphics processor; and
the converting, by each memory mapper, of the data of the memory into data with a preset bit width includes: when the graphics processing instruction in the GDDR meets a preset condition, the first memory mapper acquires a subsequent graphics processing instruction set of the graphics processing instruction in the GDDR from the flash memory, converts the subsequent graphics processing instruction set into preset bit width data corresponding to the bit width of the graphics processor, and then moves the preset bit width data to the GDDR.
5. The multiprocessor system data processing method of claim 3, wherein the processor comprises an AI processor, the memory mapper comprises a second memory mapper, the persistent storage class memory comprises HBMs respectively connected to the second memory mapper, the memory interface, the DRAM chipset, the control chip, the flash memory, the HBM, and the second memory mapper are integrated into the same substrate, and the second memory mapper is connected to the AI processor via an HBM bus;
the reading and processing, by the processor, of the preset bit width data generated by the corresponding memory mapper includes: when receiving a third read-write request from the AI processor, the second memory mapper acquires an AI instruction corresponding to the third read-write request from the HBM and returns the AI instruction corresponding to the third read-write request to the AI processor; and
the converting, by each memory mapper, of the data of the memory into data with a preset bit width includes: when the AI instruction in the HBM meets a preset condition, the second memory mapper acquires a subsequent AI instruction set of the AI instruction in the HBM from the flash memory, converts the subsequent AI instruction set into preset bit width data corresponding to the bit width of the AI processor, and then moves the preset bit width data to the HBM.
6. A multiprocessor system, characterized in that the multiprocessor system comprises a central processing unit, a memory, at least one processor having a bit width different from that of the memory, and at least one memory mapper, each memory mapper being connected to a corresponding processor; wherein:
the memory mapper is used for converting the data of the memory into preset bit width data, and the preset bit width data corresponds to the bit width of the processor connected to the memory mapper;
and the processor is used for reading and processing the preset bit width data generated by the corresponding memory mapper.
7. The multiprocessor system of claim 6, wherein the memory is PCM, NRAM, MRAM, ReRAM, or FeRAM, each of the memory mappers includes a register buffer, and the register buffer has the same bit width as the processor to which the memory mapper is connected;
and the memory mapper sequentially writes the narrow bit width data in the memory into different bits of the register buffer according to a clock signal, and then outputs the data of all bits at the same time through the register buffer to form the preset bit width data.
8. The multiprocessor system according to claim 6, wherein the memory is a persistent storage class memory, the persistent storage class memory includes a memory interface, a control chip, a DRAM chipset, and a flash memory integrated into the same substrate, and the memory interface, the DRAM chipset, and the flash memory are respectively connected to the control chip, and the memory interface is connected to the central processing unit via a memory bus;
when receiving a first read-write request from the central processing unit, the control chip acquires an instruction corresponding to the first read-write request from the DRAM chipset and returns the instruction corresponding to the first read-write request to the central processing unit through the memory interface;
and when the instructions in the DRAM chipset meet preset conditions, the control chip acquires a subsequent instruction set of the instructions in the DRAM chipset from the flash memory and moves the subsequent instruction set to the DRAM chipset.
9. The multiprocessor system of claim 8, wherein the processor comprises a graphics processor, the memory mapper comprises a first memory mapper, the persistent storage class memory comprises GDDRs respectively connected to the first memory mapper, the memory interface, the DRAM chipset, the control chip, the flash memory, the GDDR, and the first memory mapper are integrated into the same substrate, and the first memory mapper is connected to the graphics processor via a GDDR bus;
when receiving a second read-write request of the graphics processor, the first memory mapper acquires a graphics processing instruction corresponding to the second read-write request from the GDDR and returns the graphics processing instruction corresponding to the second read-write request to the graphics processor;
and when the graphics processing instruction in the GDDR meets a preset condition, the first memory mapper acquires a subsequent graphics processing instruction set of the graphics processing instruction in the GDDR from the flash memory, converts the subsequent graphics processing instruction set into preset bit width data corresponding to the bit width of the graphics processor, and then moves the preset bit width data to the GDDR.
10. The multiprocessor system of claim 8, wherein the processor comprises an AI processor, the memory mapper comprises a second memory mapper, the persistent storage class memory comprises HBMs respectively coupled to the second memory mapper, the memory interface, the DRAM chipset, the control chip, the flash memory, the HBM, and the second memory mapper are integrated into the same substrate, and the second memory mapper is coupled to the AI processor via an HBM bus;
when receiving a third read/write request of the AI processor, the second memory mapper acquires an AI instruction corresponding to the third read/write request from the HBM and returns the AI instruction corresponding to the third read/write request to the AI processor;
and when the AI instruction in the HBM meets a preset condition, the second memory mapper acquires a subsequent AI instruction set of the AI instruction in the HBM from the flash memory, converts the subsequent AI instruction set into preset bit width data corresponding to the bit width of the AI processor and then moves the preset bit width data to the HBM.
CN202011056600.1A 2020-09-29 2020-09-29 Data processing method of multiprocessor system and multiprocessor system Pending CN112231269A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011056600.1A CN112231269A (en) 2020-09-29 2020-09-29 Data processing method of multiprocessor system and multiprocessor system

Publications (1)

Publication Number Publication Date
CN112231269A true CN112231269A (en) 2021-01-15

Family

ID=74120885

Country Status (1)

Country Link
CN (1) CN112231269A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101226481A (en) * 2008-02-02 2008-07-23 上海华为技术有限公司 Method, device and system for loading field programmable gate array
CN102110072A (en) * 2009-12-29 2011-06-29 中兴通讯股份有限公司 Complete mutual access method and system for multiple processors
CN104750557A (en) * 2013-12-27 2015-07-01 华为技术有限公司 Method and device for managing memories
CN107943727A (en) * 2017-12-08 2018-04-20 深圳市德赛微电子技术有限公司 A kind of high efficient DMA controller
CN110941395A (en) * 2019-11-15 2020-03-31 深圳宏芯宇电子股份有限公司 Dynamic random access memory, memory management method, system and storage medium

Similar Documents

Publication Publication Date Title
KR102541302B1 (en) Flash-integrated high bandwidth memory appliance
US11733870B2 (en) Near-memory compute module
TWI699646B (en) Memory device, memory addressing method, and article comprising non-transitory storage medium
US10824574B2 (en) Multi-port storage device multi-socket memory access system
JP2013206474A (en) Memory device and method of operating the same
CN110941395B (en) Dynamic random access memory, memory management method, system and storage medium
US10102884B2 (en) Distributed serialized data buffer and a memory module for a cascadable and extended memory subsystem
US20140040541A1 (en) Method of managing dynamic memory reallocation and device performing the method
JP2021086611A (en) Energy-efficient compute-near-memory binary neural network circuits
JP2021043975A (en) Interface circuit, memory device, and operation method for the same
JP2018152112A (en) Memory device and method of operating the same
US20240103755A1 (en) Data processing system and method for accessing heterogeneous memory system including processing unit
US7743195B2 (en) Interrupt mailbox in host memory
CN113994314A (en) Extended memory interface
US10853255B2 (en) Apparatus and method of optimizing memory transactions to persistent memory using an architectural data mover
CN112231269A (en) Data processing method of multiprocessor system and multiprocessor system
US20220283962A1 (en) Storage controller managing completion timing, and operating method thereof
CN111177027B (en) Dynamic random access memory, memory management method, system and storage medium
US20140331006A1 (en) Semiconductor memory devices
US11093432B1 (en) Multi-channel DIMMs
CN115114186A (en) Techniques for near data acceleration for multi-core architectures
US10942672B2 (en) Data transfer method and apparatus for differential data granularities
US20200327049A1 (en) Method and system for memory expansion with low overhead latency
CN113490915A (en) Expanding memory operations
US6401151B1 (en) Method for configuring bus architecture through software control

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination