CN113297097A - Mixed address programming method for packaging-level multiprocessor - Google Patents

Publication number
CN113297097A
Authority
CN
China
Prior art keywords
address space
shared
cluster
address
mpus
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110586042.8A
Other languages
Chinese (zh)
Other versions
CN113297097B (en)
Inventor
魏敬和
黄乐天
于宗光
冯敏刚
王淑芬
鞠虎
高营
顾林
郑利华
刘国柱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 58 Research Institute
Original Assignee
CETC 58 Research Institute
Priority date
2021-05-27
Filing date
2021-05-27
Publication date
2021-08-24
Application filed by CETC 58 Research Institute
Priority to CN202110586042.8A
Publication of CN113297097A
Application granted
Publication of CN113297097B
Status
Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0223 User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023 Free address space management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0223 User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/0284 Multiple user address space allocation, e.g. using different base addresses
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to the field of multi-die integration, and in particular to a mixed address programming method for a package-level multiprocessor. Allocating a shared storage address space increases the flexibility of data access, reduces memory access latency, and improves communication efficiency and system performance.

Description

Mixed address programming method for packaging-level multiprocessor
Technical Field
The invention relates to the field of multi-die integration, in particular to a mixed address programming method for a packaging-level multiprocessor.
Background
In a monolithic ASIC, all components are designed and fabricated in the same process on a single silicon wafer. As process dimensions shrink, the cost and development cycle of such integrated circuits become extremely high. Through continued research, developers have found that multi-die integration is a necessary path for integrated circuit development: several functionally distinct, verified, unpackaged dies are interconnected, assembled, and sealed in a single package as one chip, forming a package-level network, or NoP (Network on Package). The dies can come from different processes and different manufacturers, which greatly shortens the development cycle and reduces development difficulty. The challenge of multi-die integration is to interconnect the dies efficiently and to achieve higher microsystem performance under a power constraint. Existing communication protocols for multi-die integration are either proprietary with poor generality, or belong to a technical system that is too large and hard to use. While multi-die interconnect bus protocols remain immature, the main task at the current stage is to define, based on China's practical situation and present technical level, a multi-die interconnect bus protocol that meets the needs of current integrated circuit development in China, and to complete the construction of a new generation of integrated microsystems.
Designing a complete microsystem architecture requires determining the data sharing mode, the data exchange flow, and the interface design. The microsystem architecture covers interconnection, data transmission, and data sharing among multiple processors, and is the core technology for realizing the microsystem. A data sharing mode and an addressing mode must therefore be designed to ensure fast interaction among the processors and so build a high-performance microsystem. There are two basic schemes for multiprocessor data sharing: one based on shared storage and one based on distributed storage. The shared-storage scheme uses global sharing with a unified address space, so every processor can access all storage. Although shared memory reduces data movement between processors to simple memory access operations, global sharing also raises the problem of maintaining memory consistency. This places higher design demands on the interconnect die (a practically usable die built around a die-level network (NoD) together with standard protocol interface conversion, configuration units, clock management, and other circuits): the interconnect die must support a coherence protocol between the caches and the DDR (Double Data Rate) main memory, and a dedicated coherence hardware system must be designed to keep all memories globally consistent, which greatly increases hardware overhead and design difficulty. The distributed-storage scheme uses mutually independent address spaces, so no processor can access the storage of another processor, and programs cannot be distributed and scheduled flexibly. Each processing unit must then be assigned work and programmed appropriately, which requires programmers to be familiar with the specific hardware architecture and greatly increases programming difficulty.
Clearly, neither scheme alone can meet the microsystem's design requirements of flexible customization and rapid integration, and a new data sharing and addressing mode is needed to suit the microsystem architecture.
Disclosure of Invention
To address the deficiencies of the prior art, the invention provides a mixed address programming method for a package-level multiprocessor. The technical problem to be solved is how to share data efficiently among different processors.
To solve this technical problem, the invention provides the following technical scheme: a mixed address programming method for a package-level multiprocessor, comprising the following steps:
S1: A multi-level hybrid addressing mode, namely a "hybrid storage" mode, is adopted. The MPUs are connected to the interconnect die in DDR mode, and the shared DDR memory, the external PCIe interface, and the external RapidIO interface are mapped into the address space of the MPUs. When different MPUs need to interact, communication among the processors is realized in mailbox fashion by reading and writing an area to be determined;
S2: Division of the microsystem's system address space. The system address space is divided into a private address space, a shared storage address space, a shared peripheral address space, and a system address space. The private address space is the independent address space of each master device. The shared storage space is divided into the shared storage address space and the shared peripheral address space. The shared storage address space is a shared address range accessible to the whole multi-die integrated microsystem; it is used for data exchange and temporary data storage between devices, improving data locality and reducing access latency. The shared peripheral address space maps the peripheral interfaces to addresses for read/write access control by the master devices. The system address space is the address space for managing the interconnect die and its own auxiliary resources, including address mapping changes, batch reads and writes, DMA operations, protocol conversion modes, interrupt management, and software debugging.
Further, in S2, for a multi-core system the shared address space is further divided by cluster. A system consists of several clusters, and a cluster consists of several MPUs. Each MPU in a cluster has its own private address space, and a further portion of the address space can be shared by the MPUs within the cluster, but devices outside the cluster cannot share that address space. Sharing within a cluster is realized by software accessing the mailbox address space. When the address space is divided by cluster, the address space of the whole system falls into three categories: private address space, shared address space, and system address space. The private address space is divided by cluster, and within each cluster it is divided into the private address spaces of the individual MPUs and a cluster-shared address space. The shared address space is further divided into a shared peripheral address space and a shared storage address space.
The beneficial effects of this technical scheme are as follows. The mixed address programming method for a package-level multiprocessor uses a multi-level hybrid addressing mode that combines global sharing with private addresses to divide the system address space at several levels. This enables sharing at several levels with different scopes, enriches the structural hierarchy of the system, makes thread synchronization across different scopes easier, and increases programming flexibility. Allocating a shared storage address space increases the flexibility of data access, reduces memory access latency, and improves communication efficiency and system performance.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a block diagram of a multiprocessor data sharing scheme using "hybrid storage";
FIG. 2 is an address space division diagram from the perspective of MPU0;
FIG. 3 is an address space division diagram from the perspective of the DSP;
FIG. 4 is a multi-level address space division diagram of the system;
FIG. 5 is a multi-level address space division diagram from the perspective of MPU0;
FIG. 6 is a multi-level address space division diagram from the perspective of the DSP.
Detailed Description
The preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. The preferred embodiments described here serve only to illustrate and explain the invention and are not intended to limit it. A mixed address programming method for a package-level multiprocessor comprises the following steps:
S1: A multi-level hybrid addressing mode, namely a "hybrid storage" mode, is adopted. The MPUs are connected to the interconnect die in DDR mode, and the shared DDR memory, the external PCIe interface, and the external RapidIO interface are mapped into the address space of the MPUs. When different MPUs need to interact, communication among the processors is realized in mailbox fashion by reading and writing an area to be determined;
S2: Division of the microsystem's system address space. The system address space is divided into a private address space, a shared storage address space, a shared peripheral address space, and a system address space. The private address space is the independent address space of each master device. The shared storage space is divided into the shared storage address space and the shared peripheral address space. The shared storage address space is a shared address range accessible to the whole multi-die integrated microsystem; it is used for data exchange and temporary data storage between devices, improving data locality and reducing access latency. The shared peripheral address space maps the peripheral interfaces to addresses for read/write access control by the master devices. The system address space is the address space for managing the interconnect die and its own auxiliary resources, including address mapping changes, batch reads and writes, DMA operations, protocol conversion modes, interrupt management, and software debugging.
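As a rough illustration of the division in S2, the sketch below classifies an address into one of the four kinds of address space. The base addresses, sizes, and the names `Region`, `REGIONS`, and `classify` are assumptions made for this example; the description does not give concrete values.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    name: str
    base: int  # first address of the region (hypothetical value)
    size: int  # region length in bytes (hypothetical value)


# Hypothetical layout: the source names the four kinds of space
# but does not specify where they sit or how large they are.
REGIONS = [
    Region("private",           0x0000_0000, 0x4000_0000),
    Region("shared_storage",    0x4000_0000, 0x4000_0000),
    Region("shared_peripheral", 0x8000_0000, 0x2000_0000),
    Region("system",            0xA000_0000, 0x1000_0000),
]


def classify(addr: int) -> Optional[str]:
    """Return the name of the region containing addr, or None if unmapped."""
    for region in REGIONS:
        if region.base <= addr < region.base + region.size:
            return region.name
    return None
```

With this layout, `classify(0x4000_0000)` falls in the shared storage space, and an address past the end of the system space is reported as unmapped.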
The method divides the whole address space using a multi-level hybrid addressing mode that combines global sharing with private addresses, and the different processors are programmed on that basis. Building on the system-level division, the following first examines the address space division from the perspective of different processors and then describes the division for a more complex multi-core system in detail, to further explain the microsystem's data sharing mode.
FIGS. 2 and 3 show the address space division as seen from the MPU and the DSP, respectively. As shown in FIG. 2, the address space visible to MPU0 comprises the private address space of MPU0, the shared address space, and the part of the system address space visible to MPU0. As shown in FIG. 3, the address space visible to the DSP comprises the private address space of the DSP, the shared address space, and the part of the system address space visible to the DSP. The system address space visible to MPU0 differs from that visible to the DSP. Taking MPU0 as an example, its visible system address space includes its own interrupt address space, the address space for managing the interconnect die, and so on. MPU0 can also run an operating system as the master core, so its system address space can include the address space used for that purpose as well.
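The per-processor views of FIGS. 2 and 3 can be pictured as a visibility table. The sketch below is only an illustration: the region names are invented, it assumes each master sees only its own private space (in line with the S2 definition of private space as independent per master device), and the DSP's system-space entry is a placeholder because the description only states that it differs from that of MPU0.

```python
# Hypothetical region names; each master sees only a subset of the system map.
VISIBLE = {
    "MPU0": {
        "private:MPU0",
        "shared",
        "system:MPU0_interrupt",   # MPU0's own interrupt address space
        "system:die_management",   # space for managing the interconnect die
    },
    "DSP": {
        "private:DSP",
        "shared",
        "system:DSP_part",         # placeholder for the DSP-visible system part
    },
}


def is_visible(master: str, region: str) -> bool:
    """Return True if the named region appears in the master's view."""
    return region in VISIBLE.get(master, set())
```

Both masters see the shared space, while `is_visible("DSP", "system:die_management")` is false in this assumed configuration.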
In this embodiment, in S2, for a multi-core system the shared address space is further divided by cluster. A system consists of several clusters, and a cluster consists of several MPUs. Each MPU in a cluster has its own private address space, and a further portion of the address space can be shared by the MPUs within the cluster, but devices outside the cluster cannot share that address space.
Sharing within a cluster is realized by software accessing the mailbox address space. When the address space is divided by cluster, the address space of the whole system falls into three categories: private address space, shared address space, and system address space. The private address space is divided by cluster, and within each cluster it is divided into the private address spaces of the individual MPUs and a cluster-shared address space. The shared address space is further divided into a shared peripheral address space and a shared storage address space.
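The cluster-level rule (an MPU's private space is its own, a cluster-shared space is open only to MPUs of that cluster, and the global shared space is open to all) can be sketched as a small access check. The cluster membership, the device names, and the `private:<owner>` / `cluster:<name>` / `shared` naming scheme are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Cluster:
    name: str
    mpus: Set[str] = field(default_factory=set)


# Hypothetical system: two clusters of two MPUs each; a DSP sits outside both.
CLUSTERS: Dict[str, Cluster] = {
    "cluster0": Cluster("cluster0", {"MPU0", "MPU1"}),
    "cluster1": Cluster("cluster1", {"MPU2", "MPU3"}),
}


def can_access(device: str, space: str) -> bool:
    """Check access to 'private:<owner>', 'cluster:<name>', or 'shared'."""
    kind, _, target = space.partition(":")
    if kind == "shared":    # globally shared storage or peripheral space
        return True
    if kind == "private":   # only the owning device
        return device == target
    if kind == "cluster":   # only MPUs that belong to the named cluster
        cluster = CLUSTERS.get(target)
        return cluster is not None and device in cluster.mpus
    return False
```

Here `can_access("MPU0", "cluster:cluster0")` succeeds, while the same request from `MPU2` or from the DSP is refused.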
This multi-level data sharing mode makes the address space division more flexible. As shown in FIG. 4, the address space of the whole system is divided into three categories: private address space, shared address space, and system address space. The private address space is divided by cluster, and each cluster is divided into the private address spaces of its MPUs and a cluster-shared address space. The shared address space can be further divided into a shared peripheral address space and a shared storage address space. Under this multi-level division, the address ranges visible to the DSP and to the MPUs still differ: an MPU can access the shared address space inside its cluster, but the DSP cannot. FIG. 5 shows the multi-level address space division from the perspective of MPU0, and FIG. 6 shows it from the perspective of the DSP. The division must be fixed when the microsystem is first constructed. In normal use, initialization depends on the configuration file, and the division is determined by the configuration information and the address translation table.
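One way to picture initialization from a configuration file and an address translation table is sketched below. The JSON layout, the field names (`local_base`, `size`, `target_base`), the addresses, and the functions `build_table` and `translate` are all hypothetical; the description does not define the configuration format.

```python
import json

# Hypothetical configuration: one list of address windows per master device.
CONFIG_TEXT = """
{
  "MPU0": [
    {"region": "private", "local_base": "0x00000000",
     "size": "0x10000000", "target_base": "0x100000000"},
    {"region": "cluster_shared", "local_base": "0x10000000",
     "size": "0x01000000", "target_base": "0x200000000"}
  ]
}
"""


def build_table(config_text):
    """Parse the configuration into {master: [(base, size, target, region)]}."""
    table = {}
    for master, entries in json.loads(config_text).items():
        rows = sorted(
            (int(e["local_base"], 16), int(e["size"], 16),
             int(e["target_base"], 16), e["region"])
            for e in entries
        )
        # Reject overlapping windows: the division is fixed at build time.
        for (b0, s0, _, _), (b1, _, _, _) in zip(rows, rows[1:]):
            if b0 + s0 > b1:
                raise ValueError(f"overlapping windows for {master}")
        table[master] = rows
    return table


def translate(table, master, addr):
    """Map a master-local address to (region, target address)."""
    for base, size, target, region in table.get(master, []):
        if base <= addr < base + size:
            return region, target + (addr - base)
    raise LookupError(f"{master}: address {addr:#x} is not mapped")


TABLE = build_table(CONFIG_TEXT)
```

Building the table once at start-up and rejecting overlapping windows mirrors the statement that the division is determined when the microsystem is constructed.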
In a traditional architecture, communication between dies spends a great deal of time moving data, data cannot be shared, and the communication process requires heavy involvement of the MPUs and DSPs, which degrades system performance. In this microsystem, both the MPU and the DSP are connected through the CIB (Chip Interconnect Bus), and the DSP can share access to data in the storage space. This increases the flexibility of data access, reduces bulk data movement and the involvement of the MPU and DSP, and improves communication efficiency and system performance. At the same time, the multi-level hybrid addressing mode that combines global sharing with private addresses simplifies the design of the interconnect die and increases programming flexibility.
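The mailbox-style exchange named in S1, in which processors communicate by reading and writing a shared region, can be sketched with a byte array standing in for the shared storage address space. The slot size, the header layout (a full/empty flag and a payload length), and the function names are assumptions for illustration. This single-threaded model also leaves out the memory barriers and cache handling that real hardware would need.

```python
import struct

SLOT_SIZE = 64                  # hypothetical mailbox slot size in bytes
HEADER = struct.Struct("<BxH")  # flag (1 = full), pad byte, payload length


def mailbox_send(shared: bytearray, slot: int, payload: bytes) -> bool:
    """Write payload into a slot; return False if the slot is full or too small."""
    base = slot * SLOT_SIZE
    flag, _ = HEADER.unpack_from(shared, base)
    if flag or len(payload) > SLOT_SIZE - HEADER.size:
        return False
    start = base + HEADER.size
    shared[start:start + len(payload)] = payload
    # Set the flag last so a reader never sees a half-written message.
    HEADER.pack_into(shared, base, 1, len(payload))
    return True


def mailbox_receive(shared: bytearray, slot: int):
    """Return the pending payload and free the slot, or None if it is empty."""
    base = slot * SLOT_SIZE
    flag, length = HEADER.unpack_from(shared, base)
    if not flag:
        return None
    start = base + HEADER.size
    payload = bytes(shared[start:start + length])
    HEADER.pack_into(shared, base, 0, 0)  # mark the slot empty again
    return payload
```

A sender and a receiver that agree on a slot number exchange a message with one `mailbox_send` and one `mailbox_receive` on the same `bytearray`, with no separate copy step between private memories.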
The method has the following advantages:
The mixed address programming method for a package-level multiprocessor uses a multi-level hybrid addressing mode that combines global sharing with private addresses to divide the system address space at several levels. This enables sharing at several levels with different scopes, enriches the structural hierarchy of the system, makes thread synchronization across different scopes easier, and increases programming flexibility. Allocating a shared storage address space increases the flexibility of data access, reduces memory access latency, and improves communication efficiency and system performance.
Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (2)

1. A mixed address programming method for a package-level multiprocessor, characterized in that the method comprises the following steps:
S1: a multi-level hybrid addressing mode, namely a "hybrid storage" mode, is adopted; the MPUs are connected to the interconnect die in DDR mode; the shared DDR memory, the external PCIe interface, and the external RapidIO interface are mapped into the address space of the MPUs; and when different MPUs need to interact, communication among the processors is realized in mailbox fashion by reading and writing an area to be determined;
S2: the system address space of the microsystem is divided into a private address space, a shared storage address space, a shared peripheral address space, and a system address space, wherein the private address space is the independent address space of each master device; the shared storage space is divided into the shared storage address space and the shared peripheral address space; the shared storage address space is a shared address range accessible to the whole multi-die integrated microsystem and is used for data exchange and temporary data storage between devices, improving data locality and reducing access latency; the shared peripheral address space maps the peripheral interfaces to addresses for read/write access control by the master devices; and the system address space is the address space for managing the interconnect die and its own auxiliary resources, including address mapping changes, batch reads and writes, DMA operations, protocol conversion modes, interrupt management, and software debugging.
2. The mixed address programming method for a package-level multiprocessor according to claim 1, wherein: in S2, for a multi-core system, the shared address space is further divided by cluster; a system consists of several clusters, and a cluster consists of several MPUs; each MPU in a cluster has its own private address space, and a further portion of the address space can be shared by the MPUs within the cluster, but devices outside the cluster cannot share that address space;
sharing within a cluster is realized by software accessing the mailbox address space; when the address space is divided by cluster, the address space of the whole system falls into three categories, namely private address space, shared address space, and system address space; the private address space is divided by cluster, and within each cluster it is divided into the private address spaces of the individual MPUs and a cluster-shared address space; and the shared address space is further divided into a shared peripheral address space and a shared storage address space.
CN202110586042.8A 2021-05-27 2021-05-27 Mixed address programming method for package level multiprocessor Active CN113297097B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110586042.8A CN113297097B (en) 2021-05-27 2021-05-27 Mixed address programming method for package level multiprocessor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110586042.8A CN113297097B (en) 2021-05-27 2021-05-27 Mixed address programming method for package level multiprocessor

Publications (2)

Publication Number Publication Date
CN113297097A (en) 2021-08-24
CN113297097B (en) 2022-09-02

Family

ID=77325626

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110586042.8A Active CN113297097B (en) 2021-05-27 2021-05-27 Mixed address programming method for package level multiprocessor

Country Status (1)

Country Link
CN (1) CN113297097B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102075578A (en) * 2011-01-19 2011-05-25 南京大学 Distributed storage unit-based hierarchical network on chip architecture
CN102497411A (en) * 2011-12-08 2012-06-13 南京大学 Intensive operation-oriented hierarchical heterogeneous multi-core on-chip network architecture
CN104202391A (en) * 2014-08-28 2014-12-10 浪潮(北京)电子信息产业有限公司 RDMA (Remote Direct Memory Access) communication method between non-tightly-coupled systems of sharing system address space
CN104699631A (en) * 2015-03-26 2015-06-10 中国人民解放军国防科学技术大学 Storage device and fetching method for multilayered cooperation and sharing in GPDSP (General-Purpose Digital Signal Processor)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114679424A (en) * 2022-03-31 2022-06-28 中科芯集成电路有限公司 DMA implementation method for multi-die integrated microsystem
CN114679424B (en) * 2022-03-31 2023-07-07 中科芯集成电路有限公司 DMA (direct memory access) implementation method of multi-die integrated microsystem

Also Published As

Publication number Publication date
CN113297097B (en) 2022-09-02

Similar Documents

Publication Publication Date Title
CN107341053B (en) Heterogeneous multi-core programmable system and memory configuration and programming method of computing unit thereof
Swan et al. Cm* a modular, multi-microprocessor
CN104699631A (en) Storage device and fetching method for multilayered cooperation and sharing in GPDSP (General-Purpose Digital Signal Processor)
CN106681949B (en) Direct memory operation implementation method based on consistency acceleration interface
Gschwandtner et al. Performance analysis and benchmarking of the intel scc
KR101830685B1 (en) On-chip mesh interconnect
CN102541804A (en) Multi-GPU (graphic processing unit) interconnection system structure in heterogeneous system
KR101003102B1 (en) Memory assignmen method for multi-processing unit, and memory controller using the same
CN112817902B (en) Interconnected bare chip interface management system and initialization method thereof
CA2506522A1 (en) Methods and apparatus for distributing system management signals
CN113297097B (en) Mixed address programming method for package level multiprocessor
US9703516B2 (en) Configurable interface controller
CN106844263B (en) Configurable multiprocessor-based computer system and implementation method
US20040064748A1 (en) Methods and apparatus for clock domain conversion in digital processing systems
CN104714906A (en) Dynamic processor-memory revectoring architecture
US11301295B1 (en) Implementing an application specified as a data flow graph in an array of data processing engines
US20030229721A1 (en) Address virtualization of a multi-partitionable machine
CN114240731B (en) Distributed storage interconnection structure, video card and memory access method of graphics processor
CN116757132A (en) Heterogeneous multi-core FPGA circuit architecture, construction method and data transmission method
CN106502923B (en) Storage accesses ranks two-stage switched circuit in cluster in array processor
WO2021139733A1 (en) Memory allocation method and device, and computer readable storage medium
CN111666104A (en) DSP processor design method supporting starting from RapidO
US20040064662A1 (en) Methods and apparatus for bus control in digital signal processors
JPWO2012127534A1 (en) Barrier synchronization method, barrier synchronization apparatus, and arithmetic processing apparatus
US20220066923A1 (en) Dynamically configurable multi-mode memory allocation in an accelerator multi-core system on chip

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant