CN115454907A - RISC-V instruction set-based multi-matrix node bus topological structure and working method thereof

Info

Publication number: CN115454907A
Application number: CN202211080020.5A
Authority: CN
Inventors: 周莉; 牟进正; 薛立晓; 贾思敏; 王肖丛; 孙田弋
Original assignee: Shandong University
Current assignee: Shandong University
Priority date: 2022-09-05
Filing date: 2022-09-05
Publication date: 2022-12-09

Abstract

The invention relates to a multi-matrix-node bus topology based on the RISC-V instruction set and a working method thereof. The topology comprises one high-speed bus matrix node, a plurality of medium-speed bus matrix nodes and a plurality of low-speed bus matrix nodes. Each matrix node controls its power consumption through a different bus channel width, working frequency or working voltage, which creates a performance differential between the nodes. A multi-stage working voltage mode lets matrix nodes of the same type operate at different voltages or working frequencies, so that matrix nodes at the same level can be connected to each other while maintaining normal communication and working states. The invention also designs the internal nodes of each matrix node and determines access permissions according to the different paths within the matrix node.

Description

RISC-V instruction set-based multi-matrix node bus topological structure and working method thereof

Technical Field

The invention relates to a multi-matrix-node bus topology based on the RISC-V instruction set and a working method thereof, and belongs to the technical field of hierarchical structure design for integrated circuit processors.

Background

In recent years, as chip design technology has improved and its range of applications has widened, RISC-V has shown advantages, such as being fully open source and having a simple architecture, that the traditional ARM and x86 architectures lack. RISC-V is now widely used, but its compatibility with the current market is poor because RISC-V officially provides only the TileLink bus protocol.

Meanwhile, SoC system buses are becoming more and more diverse, and traditional system buses no longer meet the increasingly differentiated requirements of SoC design. Because the core bus interfaces of different CPUs are inconsistent, a lower-performance processor is often used to drive a complex SoC, so the performance of the SoC does not match that of the core, which causes various problems. One remedy is to increase the number of cores and use more of them to drive the whole SoC, but this requires extremely complex coherence checking in the network on chip formed by the cores and core clusters. Another remedy is to modify the bus interface inside the CORE to increase the number of parallel bus channels, but this is impossible when the CORE IP vendor supplies a hard IP or a fixed IP.

Disclosure of Invention

In view of the defects of the prior art, the invention provides a multi-matrix-node bus topology based on the RISC-V instruction set, which solves the bus protocol compatibility problem of RISC-V on mid-range and high-end processors;

the invention also provides a working method of the hardware architecture;

the invention uses a core based on the RISC-V instruction set. It is composed of three types of matrix node, namely the high-speed bus matrix node (HSMN), the medium-speed bus matrix node (MSMN) and the low-speed bus matrix node (LSMN). Each matrix node can control its power consumption through a different bus channel width, working frequency or working voltage, creating a performance differential between the nodes;

Interpretation of terms:

1. DMAP: Direct Memory Access for Peripherals. A DMA that operates in a low-speed node and, in addition to the three normal DMA paths (memory to peripheral, peripheral to memory, memory to memory), can transfer data from peripheral to peripheral.

2. Master: of two devices exchanging data, the one that issues the instruction.

3. Slave: of two devices exchanging data, the one that receives the instruction.

4. DMA: direct memory access, an off-core unit that can operate in every node to move data.

5. CORE: an arithmetic unit that can work in the MSMN and the HSMN; the core unit that processes data in a node.

6. MATRIX: the communication matrix formed by the interaction of masters and slaves. The intersection of each master and each slave it can access is called an internal node.

7. FLASH: flash memory, a non-volatile storage unit that can be configured as NOR flash or NAND flash.

8. SRAM: static random access memory, a volatile storage unit.

9. LOW SPEED BUS: a low-speed bus, commonly used to connect peripherals.

10. BRIDGE: buses of different speeds must be connected through a bridge, which handles address allocation and chip select.

The technical scheme of the invention is as follows:

A multi-matrix-node bus topology based on the RISC-V instruction set comprises one high-speed bus matrix node, a plurality of medium-speed bus matrix nodes and a plurality of low-speed bus matrix nodes;

each matrix node controls its power consumption through a different bus channel width, working frequency or working voltage, creating a performance differential while keeping the correct direction of communication between nodes.

Preferably, according to the invention, in a low-speed bus matrix node, the DMAP and other medium-speed or high-speed bus matrix nodes act as masters of the low-speed bus matrix node; in addition, control signals from the high-speed bus matrix node reach the low-speed bus matrix node through a BRIDGE. The low-speed bus matrix node and an SRAM smaller than 128 KB serve as the slaves of the low-speed bus matrix node. This small SRAM has its own independent working voltage domain, so it keeps working normally after the voltage domain of the low-speed bus matrix node is switched off and retains the data that peripherals have placed in it.

Preferably, according to the invention, in a medium-speed bus matrix node, the DMA, the configurable core and the control signals sent by the high-speed bus matrix node act as masters of the medium-speed bus matrix node; control signals from the high-speed bus matrix node reach the medium-speed bus matrix node through a BRIDGE; the peripherals act as slaves of the medium-speed bus matrix node.

Preferably, according to the invention, in the high-speed bus matrix node, a high-performance core, a DMA and other configurable accelerators act as masters of the high-speed bus matrix node. When a medium-speed bus matrix node has its own core but no FLASH in which to store a Bootloader, storage for the boot program of that medium-speed bus matrix node is configured on its slave side; after the SoC starts working, a program on the high-performance core controls the DMA to carry the boot program to the SRAM or another storage unit of the medium-speed bus matrix node.

Preferably, according to the invention, the SoC as a whole is divided into three voltage domains, namely a main domain, a backup domain and an analog voltage domain. Either each matrix node works in a single main domain and operates independently through its own power gating, or the matrix nodes work in different main domains, where the voltage difference produces a frequency difference, so that matrix nodes of the same type have different working frequencies and can meet diverse low-power requirements.

Preferably, according to the invention, a unidirectional instruction filter is added to the BRIDGE between the high-speed bus matrix node and the medium-speed bus matrix node.

Preferably, according to the invention, the high-speed bus matrix node exchanges data with each medium-speed bus matrix node and each low-speed bus matrix node, and follows the instruction filtering principle when carrying instructions;

the high-speed bus matrix node acts as the master of the medium-speed and low-speed bus matrix nodes;

medium-speed bus matrix nodes at the same level are connected in such a way that they differ in main frequency: of two connected medium-speed bus matrix nodes, the one with the higher working frequency acts as the master and the one with the lower working frequency acts as the slave;

when the high-speed bus matrix node is connected to a medium-speed bus matrix node, the medium-speed bus matrix node acts as the slave; when a medium-speed bus matrix node is connected to a low-speed bus matrix node of the same or a lower frequency, the medium-speed bus matrix node acts as the master;

when a low-speed bus matrix node is connected to a medium-speed bus matrix node of the same or a higher frequency, it acts as the slave of that medium-speed bus matrix node; when a low-speed bus matrix node is connected to the high-speed bus matrix node, the higher-frequency matrix node acts as the master and the lower-frequency matrix node acts as the slave.

Preferably, according to the invention, in a medium-speed bus matrix node, the control signal sent by the high-speed bus matrix node and three groups of buses, from two DMAs and one CORE, act as the masters of the MATRIX, and FLASH, SRAM, LOW SPEED BUS and BRIDGE act as the slaves of the MATRIX. Each master is configured with access permissions for the corresponding slaves, and two or more groups of channels can operate simultaneously as long as they do not access the same internal node.

The working method of the multi-matrix node bus topological structure based on the RISC-V instruction set comprises the following steps:

each matrix node works independently in its own voltage domain, where a matrix node is any one of a high-speed, medium-speed or low-speed bus matrix node;

at start-up, each matrix node that has a FLASH reads the Bootloader from its FLASH; for matrix nodes without a FLASH, the data is carried from a matrix node with a FLASH to the node without one through the DMA or the CORE; once the matrix nodes are working, data to be exchanged is likewise carried to the matrix node with the FLASH through the DMA or the CORE;

before the voltage of a matrix node is switched off, an algorithm inserts a delay of several cycles, and the voltage domain is switched off through the voltage switch only after data processing in the matrix node has finished; if data is to be transferred to other matrix nodes, the voltage domain is switched off through the voltage switch only after the transfer has completed.
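
The start-up step of the working method can be sketched as a simple ordering of actions. This is only an illustration under stated assumptions: the function name `plan_boot`, the action strings and the idea of a single named donor node are hypothetical and do not come from the patent text.

```python
def plan_boot(nodes, donor):
    """Return the ordered boot actions for a set of matrix nodes.

    nodes: dict mapping node name -> True if the node has its own FLASH.
    donor: name of a FLASH-equipped node that supplies boot data
           (a single donor is an assumption made for this sketch).
    """
    if not nodes.get(donor, False):
        raise ValueError("donor node must have a FLASH")
    actions = []
    # Step 1: nodes with a FLASH read their own Bootloader.
    for name, has_flash in nodes.items():
        if has_flash:
            actions.append((name, "read_bootloader_from_own_flash"))
    # Step 2: nodes without a FLASH receive their boot data from the donor,
    # carried by the DMA or the CORE.
    for name, has_flash in nodes.items():
        if not has_flash:
            actions.append((name, "copy_from_" + donor + "_via_dma_or_core"))
    return actions


plan = plan_boot({"node1": True, "node2": False, "node3": True}, "node1")
```

Here node 2 has no FLASH, so its entry comes last and depends on node 1, which mirrors the order described in the method.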

The invention has the beneficial effects that:

1. The invention designs three types of matrix node to meet the power-to-performance ratio required under different conditions. Because each matrix node can work independently, the current power consumption is determined, as far as possible, solely by the current performance requirement. This is a typical design that trades area for power consumption.

2. The invention designs a multi-stage working voltage mode so that matrix nodes of the same type can operate at different voltages or working frequencies; matrix nodes at the same level can therefore be connected to each other while maintaining normal communication and working states.

3. The invention designs the internal nodes of each matrix node and determines access permissions according to the different paths within the matrix node. It also provides a DMAP working mode for the extremely-low-power mode, in which peripheral data can be exchanged in nodes without a CORE.

4. The invention provides two design methods that ensure data consistency without a second-level cache: the first uses instruction filtering to ensure that every instruction entering a matrix node can be recognized and executed by the core of that node; the second gives each matrix node an independent storage space and provides cross-node memory protection.

5. Combining the above points, the invention designs a brand-new bus topology. The topology is composed of the three node types, and each node type except the HSMN can work in up to three different voltage domains, giving at most seven data nodes. The relationships between data nodes are divided into absolute links and passive links: an absolute link is a node relationship in which access is assured, and a passive link is a node relationship in which the node at the start of the link is not recommended to act as the master.
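
The count of seven data nodes in point 5 follows from one HSMN plus two other node types in up to three voltage domains each (1 + 2 × 3 = 7). A minimal sketch that enumerates them is given below; the domain labels and the placement of the HSMN in the first domain are assumptions made only for illustration.

```python
from itertools import product

# Domain labels are hypothetical; the text only says "three different voltage domains".
VOLTAGE_DOMAINS = ("high", "medium", "low")


def enumerate_data_nodes():
    """List every (node type, voltage domain) pair allowed by point 5."""
    # The HSMN exists once; placing it in the first domain is an assumption.
    nodes = [("HSMN", VOLTAGE_DOMAINS[0])]
    # MSMN and LSMN may each appear in any of the three domains.
    nodes.extend(product(("MSMN", "LSMN"), VOLTAGE_DOMAINS))
    return nodes


data_nodes = enumerate_data_nodes()
```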

Drawings

FIG. 1 is a schematic diagram of the RISC-V instruction set-based multi-matrix-node bus topology, showing the connection relationships between nodes and the classification of nodes;

FIG. 2 is a schematic diagram of a five-node, three-core SoC designed using the bus topology of the invention.

Detailed Description

The invention is further described below with reference to the drawings and examples, but is not limited thereto.

Example 1

Each matrix node controls its power consumption through a different bus channel width, working frequency or working voltage, implemented by means of a voltage controller or a frequency divider, creating a performance differential while keeping the correct direction of communication between nodes.

Example 2

A RISC-V instruction set-based multi-matrix-node bus topology as described in Example 1, differing in that:

in a low-speed bus matrix node, the DMAP and other medium-speed or high-speed bus matrix nodes act as masters of the low-speed bus matrix node; in addition, control signals from the high-speed bus matrix node reach the low-speed bus matrix node through a BRIDGE. The low-speed bus matrix node and an SRAM smaller than 128 KB serve as the slaves of the low-speed bus matrix node. This small SRAM has its own independent working voltage domain, so it keeps working normally after the voltage domain of the low-speed bus matrix node is switched off and retains the data that peripherals have placed in it.

In a medium-speed bus matrix node, the DMA, the configurable core and the control signals sent by the high-speed bus matrix node act as masters of the medium-speed bus matrix node; control signals from the high-speed bus matrix node must pass through a BRIDGE to reach the medium-speed bus matrix node; peripherals such as the usual FLASH and SRAM act as slaves of the medium-speed bus matrix node.

In the high-speed bus matrix node, a high-performance core, a DMA and other configurable accelerators act as masters of the high-speed bus matrix node. When a medium-speed bus matrix node has its own core but no FLASH in which to store a Bootloader, storage for the boot program of that medium-speed bus matrix node is configured on its slave side; after the SoC starts working, a program on the high-performance core controls the DMA to carry the boot program to the SRAM or another storage unit of the medium-speed bus matrix node. Generally, a system has only one high-speed bus matrix node, and it uses more data channels than the medium-speed and low-speed bus matrix nodes.

Example 3

A RISC-V instruction set-based multi-matrix-node bus topology according to Example 1 or 2, differing in that:

the SoC as a whole is divided into three voltage domains, namely a main domain, a backup domain and an analog voltage domain. Either each matrix node works in a single main domain and operates independently through its own power gating, or the matrix nodes work in different main domains, where the voltage difference produces a frequency difference, so that matrix nodes of the same type have different working frequencies and can meet diverse low-power requirements. Analog circuits such as the ADC (analog-to-digital converter) and the DAC (digital-to-analog converter) work in the analog voltage domain, and some peripherals work in the backup domain, which need not be controlled by power gating.

Data consistency between different matrix nodes: several cores are likely to coexist in the system, so data conflicts are possible. The design does not use a multi-level cache to resolve them, mainly because, with several working voltages possibly present, a cache would involve extremely complicated cross-clock-domain signal processing and cache coherence checking. Instead, the invention resolves potential data conflicts in the system as follows:

Unidirectional RISC-V instruction filtering. An MSMN may be configured with a single CORE but without a FLASH storing the corresponding Bootloader, in which case the boot program held in the HSMN has to be carried into the MSMN through the DMA. Because the instruction sets of the two nodes are not necessarily compatible, code that was written for the HSMN and cannot be recognized by the MSMN can easily be carried into the memory of the MSMN. If the program then runs incorrectly, an unexecutable instruction sent to the MSMN through the BRIDGE leaves the original data unchanged, a short interrupt instruction is issued, and the address following the offending instruction is used as the interrupt vector of the interrupt signal.
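
The filtering idea can be illustrated with a small software model. The patent does not specify how the filter decides that an instruction is unexecutable, so the sketch below assumes, purely for illustration, that the target MSMN core lacks the RISC-V M (multiply/divide) extension and that the filter inspects the major opcode and funct7 fields of each 32-bit instruction. The function names and the return convention are hypothetical.

```python
OPCODE_OP = 0b0110011       # RISC-V major opcode for register-register ALU ops
FUNCT7_MULDIV = 0b0000001   # funct7 value used by the M extension under OP


def is_m_extension(word):
    """True if a 32-bit instruction word is an M-extension OP instruction."""
    opcode = word & 0x7F
    funct7 = (word >> 25) & 0x7F
    return opcode == OPCODE_OP and funct7 == FUNCT7_MULDIV


def filter_transfer(words, base_addr, target_has_m):
    """Model a one-way filter on a DMA transfer of instructions.

    Returns (accepted_words, interrupt_vector). The vector is None when every
    instruction is accepted; otherwise it is the address that follows the
    first rejected instruction, as described in the text.
    """
    accepted = []
    for index, word in enumerate(words):
        if not target_has_m and is_m_extension(word):
            fault_addr = base_addr + 4 * index
            return accepted, fault_addr + 4
        accepted.append(word)
    return accepted, None


program = [
    0x00100093,  # addi x1, x0, 1
    0x007302B3,  # add  x5, x6, x7
    0x027302B3,  # mul  x5, x6, x7 (M extension)
]
accepted, vector = filter_transfer(program, 0x1000, target_has_m=False)
```

In this run the first two instructions pass, the `mul` at address 0x1008 is rejected, and the reported vector is 0x100C. A real filter would have to cover every extension that differs between the two cores; only one case is modelled here.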

The high-speed bus matrix node exchanges data with each medium-speed bus matrix node and each low-speed bus matrix node, and follows the instruction filtering principle when carrying instructions;

medium-speed bus matrix nodes at the same level are connected in such a way that they differ in main frequency by a certain amount (assigned by a voltage and frequency regulating unit): of two connected medium-speed bus matrix nodes, the one with the higher working frequency acts as the master and the one with the lower working frequency acts as the slave;

when the high-speed bus matrix node is connected to a medium-speed bus matrix node, the medium-speed bus matrix node acts as the slave; when a medium-speed bus matrix node is connected to a low-speed bus matrix node of the same or a lower frequency, the medium-speed bus matrix node acts as the master; when it is connected to a higher-frequency LSMN, that LSMN must first be reduced in frequency or voltage, otherwise data conflicts are likely;

when a low-speed bus matrix node is connected to a medium-speed bus matrix node of the same or a higher frequency, it acts as the slave of that medium-speed bus matrix node; when a low-speed bus matrix node is connected to the high-speed bus matrix node, the higher-frequency matrix node acts as the master and the lower-frequency matrix node acts as the slave. This design is unlikely to produce data conflicts; it keeps control in every low-power condition and preserves the soundness of the design.
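
The master and slave rules above can be condensed into a small decision function. This is a sketch under assumptions: the names `Tier`, `Node` and `resolve_roles` and the frequency values are hypothetical, and the treatment of a lower-tier node that runs faster than its higher-tier partner (flagging it for frequency or voltage reduction) generalizes the one case the text spells out for an MSMN and an LSMN.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    LSMN = 0   # low-speed bus matrix node
    MSMN = 1   # medium-speed bus matrix node
    HSMN = 2   # high-speed bus matrix node


@dataclass(frozen=True)
class Node:
    name: str
    tier: Tier
    freq_mhz: int   # illustrative working frequency


def resolve_roles(a, b):
    """Return (master, slave, slave_needs_downscaling) for two connected nodes."""
    if a.tier == b.tier:
        if a.tier is not Tier.MSMN:
            raise ValueError("only MSMN-to-MSMN same-level links are described")
        if a.freq_mhz == b.freq_mhz:
            raise ValueError("same-level MSMNs must differ in main frequency")
        # Same level: the higher-frequency MSMN is the master.
        return (a, b, False) if a.freq_mhz > b.freq_mhz else (b, a, False)
    # Different levels: the higher tier is the master.
    master, slave = (a, b) if a.tier > b.tier else (b, a)
    # A faster lower-tier node must be slowed down before it is accessed.
    return master, slave, slave.freq_mhz > master.freq_mhz


hs = Node("node1", Tier.HSMN, 400)
ms_hi = Node("node2", Tier.MSMN, 200)
ms_lo = Node("node3", Tier.MSMN, 100)
ls_hi = Node("node4", Tier.LSMN, 200)
```

For example, `resolve_roles(ls_hi, ms_lo)` makes node 3 the master and flags node 4 for frequency or voltage reduction, which corresponds to the dotted connections described for FIG. 1.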

Each MATRIX node contains several groups of internal nodes, and each internal node represents a slave that a given master can address. In a medium-speed bus matrix node, the control signal sent by the high-speed bus matrix node and three groups of buses, from two DMAs (DMA1, DMA2) and one CORE, act as the masters of the MATRIX, and FLASH, SRAM, LOW SPEED BUS and BRIDGE act as the slaves of the MATRIX. Each master is configured with access permissions for the corresponding slaves, and two or more groups of channels can operate simultaneously as long as they do not access the same internal node. For example, while DMA1 accesses FLASH, DMA2 can access SRAM at the same time, since there is no internal node conflict, but it cannot access FLASH.
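
The DMA1/DMA2 example can be modelled with a small permission and occupancy table. The class name `BusMatrix`, the permission sets and the three result strings are assumptions chosen for this sketch; the patent does not give a concrete permission table.

```python
class BusMatrix:
    """Toy model of the internal nodes of one MATRIX."""

    def __init__(self, permissions):
        # permissions: master name -> set of slaves it is allowed to access
        self.permissions = permissions
        # busy: slave name -> master currently holding that internal node
        self.busy = {}

    def request(self, master, slave):
        """Return 'denied', 'wait' or 'granted' for an access request."""
        if slave not in self.permissions.get(master, set()):
            return "denied"          # no access permission configured
        holder = self.busy.get(slave)
        if holder is not None and holder != master:
            return "wait"            # internal node conflict
        self.busy[slave] = master
        return "granted"

    def release(self, master, slave):
        """Free the internal node if this master holds it."""
        if self.busy.get(slave) == master:
            del self.busy[slave]


# Hypothetical permission table for a medium-speed bus matrix node.
matrix = BusMatrix({
    "DMA1": {"FLASH", "SRAM"},
    "DMA2": {"FLASH", "SRAM", "LOW_SPEED_BUS"},
    "CORE": {"FLASH", "SRAM", "LOW_SPEED_BUS", "BRIDGE"},
})

first = matrix.request("DMA1", "FLASH")    # DMA1 takes FLASH
second = matrix.request("DMA2", "SRAM")    # proceeds in parallel
third = matrix.request("DMA2", "FLASH")    # conflicts with DMA1
```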

Each of the three types of matrix node has an independent storage space, and the high-speed and medium-speed bus matrix nodes each execute their own boot program. The SoC nevertheless allows cross-node data interaction: the DMAs of the high-speed and medium-speed bus matrix nodes perform only memory-to-memory transfers across nodes, and data can be carried only when the relevant internal node of the matrix is unoccupied. When the internal node is occupied, the DMA must wait until the occupation ends, or the occupying access must be forcibly interrupted through an interrupt routine.

FIG. 1 is a schematic diagram of the proposed RISC-V instruction set-based multi-matrix-node bus topology, showing the connection relationships between nodes and the classification of nodes. FIG. 1 lists seven types of matrix node. The type of matrix node varies from top to bottom and the working voltage range varies from left to right; voltage or frequency ranges are separated by solid and dotted lines, and the node types are distinguished by the names in FIG. 1. Arrows between matrix nodes indicate the direction of connection from master to slave. A solid connecting line indicates a matrix node that can be accessed directly; a dotted line indicates a matrix node that can be accessed only after frequency or voltage reduction. The circuit shown in FIG. 1 is the actual circuit architecture of an enlarged matrix node.

Example 4

The working method of the RISC-V instruction set-based multi-matrix-node bus topology described in any of Examples 1-3 includes the following steps:

at start-up, each matrix node that has a FLASH reads the Bootloader from its FLASH; for matrix nodes without a FLASH, the data is carried from a matrix node with a FLASH to the node without one through the DMA or the CORE; once the matrix nodes are working, data to be exchanged is likewise carried to the matrix node with the FLASH through the DMA or the CORE;

because each matrix node can be switched off independently, before the voltage of a matrix node is switched off an algorithm inserts a delay of several cycles, and the voltage domain is switched off through the voltage switch only after data processing in the matrix node has finished; if data is to be transferred to other matrix nodes, the voltage domain is switched off through the voltage switch only after the transfer has completed. For example, when data in a medium-speed bus matrix node is to be transferred to the high-speed bus matrix node, the internal voltage domains are switched off in stages: the instructions and data in the cache are first moved to the SRAM or FLASH, then the voltage of the CORE in the medium-speed bus matrix node is switched off, and after the executable instructions have been transmitted to the high-speed bus matrix node, the voltage of the storage is switched off.
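
The shutdown ordering (delay, finish processing, transfer data, then switch off the voltage domain) can be sketched as follows. The class `MatrixNode`, the function `shut_down`, the log strings and the default delay of three cycles are all hypothetical; the text only says the delay lasts "several cycles".

```python
class MatrixNode:
    """Minimal stand-in for a matrix node that can be power-gated."""

    def __init__(self, name, pending_jobs=0, outbound=None):
        self.name = name
        self.pending_jobs = pending_jobs     # data processing still in flight
        self.outbound = list(outbound or [])  # data owed to another node
        self.powered = True


def shut_down(node, receiver, delay_cycles=3):
    """Switch a node off in the order described in the text; return a log."""
    log = []
    # 1. Wait a few cycles before touching the voltage switch.
    for _ in range(delay_cycles):
        log.append("delay")
    # 2. Let data processing inside the node finish.
    while node.pending_jobs > 0:
        node.pending_jobs -= 1
        log.append("finish_job")
    # 3. Move any data destined for another node.
    while node.outbound:
        receiver.append(node.outbound.pop(0))
        log.append("transfer")
    # 4. Only now open the voltage switch.
    node.powered = False
    log.append("power_off")
    return log


node2 = MatrixNode("node2", pending_jobs=1, outbound=["instr_a", "instr_b"])
node1_memory = []
shutdown_log = shut_down(node2, node1_memory, delay_cycles=2)
```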

Example 5

The working method of the RISC-V instruction set-based multi-matrix-node bus topology according to Example 4, characterized in that:

for convenience of illustration, FIG. 2 is a schematic diagram of a five-node, three-core SoC designed using the bus topology. The system comprises one high-speed bus matrix node, two medium-speed bus matrix nodes and two low-speed bus matrix nodes. The high-speed bus matrix node works in the high-frequency domain, the two medium-speed bus matrix nodes work in the medium-frequency and high-frequency domains respectively, and the two low-speed bus matrix nodes likewise work in the medium-frequency and high-frequency domains respectively. The matrix nodes in FIG. 2 are renamed as follows: the high-speed bus matrix node is node 1, and the two medium-speed bus matrix nodes working in the high-frequency and medium-frequency domains are node 2 and node 3, so that the three matrices in the upper part of FIG. 2 are node 1, node 2 and node 3 from left to right; the two low-speed bus matrix nodes working in the high-frequency and medium-frequency domains are node 4 and node 5, so that the two matrices in the lower part of FIG. 2 are node 4 and node 5 from left to right;

the components of each node are briefly as follows. Node 1 has a higher-performance CORE and a DMA with more than 11 channels as its masters, and two FLASHes (with their controllers), an SRAM, and a BRIDGE leading to one of the medium-speed bus matrix nodes and one of the low-speed bus matrix nodes as its slaves (the other medium-speed bus matrix node has its own FLASH, and no path to it is included in the SoC design, because the connecting wires would be too long and would cause timing violations). The two medium-speed bus matrix nodes are essentially identical: each has the access signal from the high-speed node, two DMAs with 7 or 5 channels and a CORE as its masters, and its slaves are a storage unit and the low-speed bus connected to the peripherals. The two low-speed bus matrix nodes are also essentially identical, but their slave side has no storage unit, only two groups of low-speed buses connected to the peripherals.

First, the working mode of each node when all nodes work together is described:

node 1 and node 3 simultaneously carry the instructions stored in the FLASH of their respective matrices to the tightly coupled instruction memory or cache in the core; after the core of node 1 receives an instruction to carry the data of its other FLASH to node 2, it sends a command to the DMA of the same node;

because there is no internal node conflict at this moment, the DMA transfers the data of the other FLASH to the storage unit in node 2 (in this cross-frequency transfer mode, since the write rate is higher than the read rate, the write operation does not occupy the access node of the dual-port memory in node 2); meanwhile node 1 and node 3 continue to execute their own instructions. Node 3 then receives an instruction to access data in node 5, and node 2 receives an instruction to access the same peripheral in node 4 and node 5 in turn; multi-master access may therefore occur at node 5, which triggers arbitration;

node 2 accesses node 4 normally because no arbitration is needed there; at node 5, simultaneous access by node 2 and node 3 triggers arbitration. The arbitration algorithm is ordinary round-robin (polling) arbitration, with polling starting from the high-frequency domain, so node 2 accesses node 5 first;

the DMA of node 1 is still carrying instructions to node 2, and during the transfer RISC-V instruction filtering finds an instruction that node 2 cannot execute. The transfer program of node 1 is then interrupted (the instruction error is reported by the DMA and the interrupt vector is fixed at the instruction following the offending one; because such instruction errors occur only occasionally, the program can be terminated through the interrupt). Meanwhile node 2 executes its instructions and accesses data in node 3, and the program ends when the access is finished;

the above mainly describes, for the five-node topology, how all the nodes are connected, the cross-frequency-domain DMA working mode and node occupation, multi-master arbitration at internal nodes, and program interruption caused by instruction filtering.
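
The polling arbitration at node 5 can be illustrated with a round-robin arbiter whose polling order starts from the highest-frequency master. The class name, the frequency values and the rule that the pointer advances past the last winner are assumptions for this sketch; the patent states only that ordinary polling arbitration is used and that polling starts from the high-frequency domain.

```python
class RoundRobinArbiter:
    """Round-robin arbiter whose polling order starts at the fastest master."""

    def __init__(self, masters):
        # masters: list of (name, frequency); higher frequency is polled first.
        ordered = sorted(masters, key=lambda m: -m[1])
        self.order = [name for name, _ in ordered]
        self.next_idx = 0

    def grant(self, requesters):
        """Grant access to one requesting master, or None if nobody asks."""
        count = len(self.order)
        for offset in range(count):
            idx = (self.next_idx + offset) % count
            candidate = self.order[idx]
            if candidate in requesters:
                # Next round starts after the winner, so others get a turn.
                self.next_idx = (idx + 1) % count
                return candidate
        return None


# Node 2 (high-frequency domain) and node 3 (medium-frequency domain)
# both request node 5; the frequencies are illustrative values.
arbiter = RoundRobinArbiter([("node3", 100), ("node2", 200)])
first_grant = arbiter.grant({"node2", "node3"})
second_grant = arbiter.grant({"node2", "node3"})
```

With both masters requesting, node 2 is served first and node 3 second, matching the order given in the example.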

Next, the working modes in which nodes work independently or in partial groups under different power consumption conditions are described:

starting from all nodes operating simultaneously, the performance requirement falls and the high-power mode of the high-speed bus matrix node is no longer needed, so the high-speed bus matrix node is switched off through power gating. Before it is switched off, the read channel of the FLASH is closed and the data in the current registers is saved to always-on cells. The two medium-speed and two low-speed bus matrix nodes continue to work normally, and the CORE and DMA of the medium-speed bus matrix nodes can still access the other nodes and the PDMA of the high-speed bus matrix node;

as the performance requirement falls further, the power gating of the two medium-speed bus matrix nodes is closed in turn, so that the system enters the single-core and then the coreless working mode. The shutdown sequence in single-core mode is the same as that of the high-speed bus matrix node, and the program remains stored in each unit;

on entering coreless mode, because the DMAP cannot interpret instructions, the transfer commands for the peripheral storage space must be sent to the two low-speed bus matrix nodes before the medium-speed bus matrix nodes are switched off. Once the two low-speed bus matrix nodes have received the transfer commands, the DMAP can move data between different peripherals while the COREs are dormant. Because nodes 4 and 5 in FIG. 2 have no storage units, once the medium-speed and high-speed bus matrix nodes are off, the DMAP of nodes 4 and 5 can perform only peripheral-to-peripheral transfers;

finally, in the lowest-power mode only node 5 is still working. It can execute the unfinished instruction, sent by node 2 to node 4, to access node 5 data, or the unfinished instructions of node 2 or node 3 to access node 5. The DMAP can be driven by the cores of node 2 and node 3 before they are switched off, or triggered by a peripheral interrupt in coreless mode.
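
The coreless DMAP mode can be pictured as a queue of transfer commands that a core loads before it is switched off and that the DMAP later drains on its own. The sketch below is an assumption-laden model: the class `Dmap`, the `(source, destination, length)` command format and the peripheral names `uart_rx` and `spi_tx` are hypothetical and not taken from the patent.

```python
from collections import deque


class Dmap:
    """Toy DMAP that replays preloaded peripheral-to-peripheral transfers."""

    def __init__(self):
        self.commands = deque()

    def preload(self, src, dst, length):
        """Called by a core before it is powered down."""
        self.commands.append((src, dst, length))

    def run_coreless(self, peripherals):
        """Drain the queue without any core; return the number of items moved."""
        moved = 0
        while self.commands:
            src, dst, length = self.commands.popleft()
            data = peripherals[src][:length]
            del peripherals[src][:length]      # consume from the source FIFO
            peripherals[dst].extend(data)      # deliver to the destination FIFO
            moved += len(data)
        return moved


# Peripheral FIFOs modelled as lists; names are illustrative only.
fifos = {"uart_rx": [1, 2, 3], "spi_tx": []}
dmap = Dmap()
dmap.preload("uart_rx", "spi_tx", 2)   # issued while a core is still on
moved = dmap.run_coreless(fifos)       # executed after the cores are off
```

Because the model has no memory-backed source or destination, it reflects the restriction that nodes 4 and 5 can perform only peripheral-to-peripheral transfers once the other nodes are off.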

The above examples describe how nodes work alone or in combination in low-power modes, which is one of the performance advantages of the multi-matrix-node bus topology. Taken together, the two examples show that the matrix node topology supports multi-core cooperation and a wider, more controllable choice of power modes without relying on a coherence protocol. At the same time, the RISC-V instruction filter and the independent per-node storage units allow the topology to accommodate, as far as possible, the interface choices of different RISC-V cores.

Claims

1. A multi-matrix node bus topological structure based on RISC-V instruction set is characterized by comprising a high-speed bus matrix node, a plurality of slow bus matrix nodes and a plurality of low-speed bus matrix nodes;

2. A RISC-V instruction set based multi-matrix node bus topology as claimed in claim 1, wherein, in the low speed bus matrix nodes, DMAP and other slow speed bus matrix nodes or high speed bus matrix nodes are used as the master of the low speed bus matrix nodes, and in addition, the control signal from the high speed bus matrix node needs to reach the low speed bus matrix node through a BRIDGE; the low-speed bus matrix node and an SRAM smaller than 128KB serve as the slave ends of the low-speed bus matrix node; the SRAM with smaller capacity is configured with an independent working voltage domain, and can still keep normal work after the voltage domain of the low-speed bus matrix node is closed, and stores data externally arranged in the voltage domain.

3. The RISC-V instruction set based multi-matrix node bus topology of claim 1, wherein, in a slow bus matrix node, the DMA, the configurable core and the control signal from the high-speed bus matrix node serve as the master ends of the slow bus matrix node; the control signal from the high-speed bus matrix node reaches the slow bus matrix node through a BRIDGE; and the peripherals serve as the slave ends of the slow bus matrix node.

4. A RISC-V instruction set based multi-matrix node bus topology as recited in claim 1, wherein the cores, DMA, and other configurable accelerators in the high-speed bus matrix node serve as the master ends of the high-speed bus matrix node; when the cache bus matrix node has an independent core but no FLASH storing a Bootloader, the boot program of the cache bus matrix node is stored at the slave end of the cache bus matrix node, and after the SoC starts working, the program of the high-performance core controls the DMA to carry the boot program of the cache bus matrix node to the SRAM or another storage unit of the cache bus matrix node.

5. The RISC-V instruction set-based multi-matrix node bus topology of claim 1, wherein the whole SoC is divided into three voltage domains, namely a master domain, a backup domain and an analog voltage domain.

6. A RISC-V instruction set based multi-matrix node bus topology according to claim 5, wherein either each matrix node operates independently within a single master domain under different power gating, or the matrix nodes operate in different master domains whose voltages create a frequency difference, so that matrix nodes of the same type have different operating frequencies.

7. A RISC-V instruction set based multi-matrix node bus topology as recited in claim 1, wherein a one-way instruction filter is added to the BRIDGE between the high-speed bus matrix node and the slow bus matrix nodes.
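
As an illustration of how a one-way filter on the BRIDGE might behave, the following Python sketch forwards instruction words from the high-speed side to the slow side only when their major opcode is in a set that the target core is assumed to support. The claim does not state the filtering criterion or the filtered direction, so the opcode-based rule, the `Bridge` class and the `supported_opcodes` parameter are all assumptions; only the opcode values themselves come from the RISC-V base encoding.

```python
# Major opcodes (bits [6:0]) from the RISC-V unprivileged specification.
OP = 0x33       # integer register-register (e.g. add)
OP_IMM = 0x13   # integer register-immediate (e.g. addi)
LOAD = 0x03
STORE = 0x23
OP_FP = 0x53    # floating-point computation (F/D extensions)


class Bridge:
    """Hypothetical BRIDGE with a filter on the high-speed -> slow direction."""

    def __init__(self, supported_opcodes):
        # Assumption: the slow node's core accepts only these major opcodes.
        self.supported_opcodes = frozenset(supported_opcodes)

    def high_to_slow(self, instruction_word: int):
        """Return the word if it passes the filter, else None."""
        opcode = instruction_word & 0x7F
        return instruction_word if opcode in self.supported_opcodes else None

    def slow_to_high(self, instruction_word: int) -> int:
        """The filter is one-way, so this direction is not filtered."""
        return instruction_word


bridge = Bridge({OP, OP_IMM, LOAD, STORE})  # an integer-only core, for example
ADD_X1_X2_X3 = 0x003100B3     # add x1, x2, x3
FADD_S_F1_F2_F3 = 0x003100D3  # fadd.s f1, f2, f3 (rounding mode RNE)

passed = bridge.high_to_slow(ADD_X1_X2_X3)      # forwarded unchanged
blocked = bridge.high_to_slow(FADD_S_F1_F2_F3)  # dropped by the filter
```

A real filter could equally inspect `funct3`/`funct7` fields or translate unsupported instructions instead of dropping them; the sketch only shows the direction-dependent behaviour.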

8. A RISC-V instruction set based multi-matrix node bus topology as recited in claim 1, wherein the high-speed bus matrix node exchanges data with each of the slow bus matrix nodes and each of the low-speed bus matrix nodes, and instruction filtering is applied when instructions are carried;

among the slow bus matrix nodes, slow bus matrix nodes at the same level are connected on the condition that they differ in main frequency; of two connected slow bus matrix nodes, the one with the higher working frequency serves as the master end and the one with the lower working frequency serves as the slave end;

for a low-speed bus matrix node, when it is connected with a slow bus matrix node of the same or a higher frequency, the low-speed bus matrix node serves as the slave end of that slow bus matrix node; when the low-speed bus matrix node is connected with the high-speed bus matrix node, the higher-frequency matrix node serves as the master end and the lower-frequency matrix node serves as the slave end.
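
The master/slave rules of claim 8 can be summarized as a small decision function. The following Python sketch is illustrative only: `NodeKind`, `MatrixNode`, `assign_roles`, the MHz unit and the example frequencies are hypothetical, and link types or frequency relations that the claim does not describe raise an error rather than being guessed.

```python
from dataclasses import dataclass
from enum import Enum


class NodeKind(Enum):
    HIGH_SPEED = "high-speed"
    SLOW = "slow"
    LOW_SPEED = "low-speed"


@dataclass(frozen=True)
class MatrixNode:
    name: str
    kind: NodeKind
    freq_mhz: float  # working frequency; the unit is an assumption


def assign_roles(a: MatrixNode, b: MatrixNode):
    """Return (master, slave) for a link between two nodes per claim 8."""
    kinds = {a.kind, b.kind}
    if kinds == {NodeKind.SLOW}:
        # Same-level slow nodes must differ in main frequency.
        if a.freq_mhz == b.freq_mhz:
            raise ValueError("slow nodes need a main-frequency difference")
        return (a, b) if a.freq_mhz > b.freq_mhz else (b, a)
    if kinds == {NodeKind.SLOW, NodeKind.LOW_SPEED}:
        slow, low = (a, b) if a.kind is NodeKind.SLOW else (b, a)
        # The claim covers a slow node at the same or a higher frequency.
        if slow.freq_mhz < low.freq_mhz:
            raise ValueError("case not covered by claim 8")
        return slow, low
    if kinds == {NodeKind.HIGH_SPEED, NodeKind.LOW_SPEED}:
        # The higher-frequency node is the master end.
        if a.freq_mhz == b.freq_mhz:
            raise ValueError("case not covered by claim 8")
        return (a, b) if a.freq_mhz > b.freq_mhz else (b, a)
    raise ValueError("link type not described in claim 8")


# Example nodes with assumed frequencies.
high = MatrixNode("high", NodeKind.HIGH_SPEED, 400.0)
slow_a = MatrixNode("slow_a", NodeKind.SLOW, 200.0)
slow_b = MatrixNode("slow_b", NodeKind.SLOW, 100.0)
low = MatrixNode("low", NodeKind.LOW_SPEED, 100.0)

slow_pair = assign_roles(slow_b, slow_a)  # (slow_a, slow_b)
low_slow = assign_roles(low, slow_b)      # (slow_b, low): same frequency
low_high = assign_roles(low, high)        # (high, low)
```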

9. A RISC-V instruction set based multi-matrix node bus topology as recited in claim 1, wherein, in the cache bus matrix node, the control signal issued by the cache bus matrix node and three sets of buses, namely two DMAs and one CORE, serve as the master ends of the MATRIX, and the FLASH, SRAM, LOW SPEED BUS and BRIDGE serve as the slave ends of the MATRIX; each master end is configured with access rights to the corresponding slave ends, and two or more paths that do not access a common node can operate simultaneously.
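
The per-master access rights and the parallel use of paths in claim 9 can be sketched as a permission table plus a simple arbiter. This Python example is a sketch under stated assumptions: the contents of `ACCESS`, the names `CTRL`, `DMA0` and `DMA1`, and the fixed request-order priority in `schedule` are hypothetical, since the claim says rights are configured per master but does not list them. It also reads the claim as allowing paths that share no node to run at the same time.

```python
MASTERS = ("CTRL", "DMA0", "DMA1", "CORE")
SLAVES = ("FLASH", "SRAM", "LOW_SPEED_BUS", "BRIDGE")

# Assumed access-right table; the claim does not give the actual rights.
ACCESS = {
    "CTRL": {"LOW_SPEED_BUS", "BRIDGE"},
    "DMA0": {"SRAM", "LOW_SPEED_BUS", "BRIDGE"},
    "DMA1": {"FLASH", "SRAM"},
    "CORE": {"FLASH", "SRAM", "LOW_SPEED_BUS", "BRIDGE"},
}


def can_access(master: str, slave: str) -> bool:
    """True if the master has a configured right to reach the slave."""
    return slave in ACCESS.get(master, set())


def schedule(requests):
    """Grant, in order, each permitted request whose master and slave are free.

    Granted requests share no node, so they can proceed in the same cycle.
    """
    busy_masters, busy_slaves, granted = set(), set(), []
    for master, slave in requests:
        if not can_access(master, slave):
            continue  # no access right configured for this path
        if master in busy_masters or slave in busy_slaves:
            continue  # path shares a node with an already granted one
        busy_masters.add(master)
        busy_slaves.add(slave)
        granted.append((master, slave))
    return granted


granted = schedule([
    ("CORE", "FLASH"),
    ("DMA0", "SRAM"),
    ("DMA1", "SRAM"),   # SRAM already taken by DMA0 in this cycle
    ("CTRL", "FLASH"),  # CTRL has no right to FLASH in this table
    ("DMA1", "FLASH"),  # FLASH already taken by CORE in this cycle
])
```

Here the CORE-to-FLASH and DMA0-to-SRAM paths are granted together because they touch disjoint nodes, while the remaining requests are either unauthorized or wait for a later cycle.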

10. A method of operating a RISC-V instruction set based multi-matrix node bus topology according to any of claims 1 to 9, comprising the steps of: