CN111506540A - Hardware programmable heterogeneous multi-core system on chip - Google Patents


Info

Publication number
CN111506540A
Authority
CN
China
Prior art keywords
core
dsp
bus
chip
mpu
Prior art date
Legal status
Granted
Application number
CN202010333344.XA
Other languages
Chinese (zh)
Other versions
CN111506540B (en)
Inventor
谢长生
黄旭东
张猛华
陈振娇
Current Assignee
CETC 58 Research Institute
Original Assignee
CETC 58 Research Institute
Priority date
2020-04-24
Filing date
2020-04-24
Publication date
2020-08-07
Application filed by CETC 58 Research Institute
Priority to CN202010333344.XA
Publication of CN111506540A
Application granted
Publication of CN111506540B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7807 System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package

Abstract

The invention discloses a hardware programmable heterogeneous multi-core system on a chip, belonging to the technical field of integrated circuits. The system on chip combines a multi-core DSP, a multi-core MPU, a GPU, an FPGA and a plurality of IP components through an on-chip high-speed bus interconnection network, forming a system on chip with programmable hardware and multi-core heterogeneous processors that plays to the strengths of each heterogeneous core. An area consisting of the multi-core DSP and the FPGA handles services such as high-throughput data preprocessing, intensive data operations and bottom-layer algorithms; an area consisting of the multi-core MPU, the GPU and the FPGA accelerator handles services such as the user interface, high-level algorithms, application execution and network transmission. The FPGA can also implement the hardware acceleration functions required by the multi-core DSP and the multi-core MPU. Because the hardware is reconfigurable, the chip architecture improves the flexibility of system-on-chip integration and the expandability and upgradability of later products.

Description

Hardware programmable heterogeneous multi-core system on chip
Technical Field
The invention relates to the technical field of integrated circuits, in particular to a hardware programmable heterogeneous multi-core system on a chip.
Background
Currently, application fields such as communications, machine vision, assisted driving, medical/biological imaging, avionics, big data analysis and the Internet of Things all require high-performance digital signal processing, intensive data computing capability and strong graphics and image processing and display functions, and further require systems and algorithms to be flexible and adaptive. Complete systems are now updated and upgraded ever more quickly, so they need a certain upgrade capability: after a product has been manufactured, the chips in the system should still allow a degree of modification, optimization and reconfiguration.
To meet these requirements, the industry has made many attempts and introduced corresponding products. The first is the heterogeneous multi-core processor: a heterogeneous architecture that combines a RISC general-purpose processor with a DSP offers better performance and better energy efficiency. Under this architecture, the DSP handles intensive data processing and complex algorithms such as data filtering and FFT operations. The DSP executes these efficiently thanks to its Harvard architecture, its SIMD architecture, and hardware support for special addressing modes, zero-overhead loops and the like. A general-purpose processor such as an ARM core handles the user interface, control tasks and network connections and runs the operating system and application programs, so the overall performance of the hybrid system is greatly improved.
A traditional DSP processor core is a high-performance arithmetic unit, but it executes an algorithm serially, so a more complex computation may take hundreds of loop iterations to complete, and the overall speed at which a DSP processes an algorithm is not very high. Large-scale data computation often needs parallel processing and high data throughput, which makes the disadvantage of a single DSP more obvious. When a single processor cannot meet the performance requirement, the usual remedies are to raise the processor's operating frequency or to increase the number of processors; however, the headroom for raising the operating frequency is limited by process technology, circuit structure, power consumption and the like.
The field programmable gate array (FPGA) has a great performance advantage in parallel computing: it can process signals and data with a parallel structure and can run dozens or even hundreds of operation units simultaneously. However, a traditional FPGA is limited in function, focusing mainly on data operations and circuit control, and when the circuit is too large the chip configuration time becomes too long. Although the dynamic reconfiguration function of the FPGA gives the implemented circuit a certain flexibility, it cannot meet the demands for fast dynamic switching between algorithms and for ever greater intelligence and system flexibility. In addition, to keep the programmable logic blocks and interconnections flexible and general, an FPGA circuit has large delays after synthesis and implementation, runs at a low operating frequency and uses circuit area less efficiently than an ASIC, so it is difficult for it to reach the high performance, high integration and low power consumption of an ASIC.
In general-purpose processors and DSP chips, dedicated accelerator modules are added to the processor to support complex protocols and algorithms in specific application fields and to improve data processing capacity, but the function of a dedicated accelerator is difficult to modify or extend after the device is manufactured, which limits the flexibility of the system. The bandwidth required for communication depends on the number and type of accelerators, and the accelerators themselves need adaptability and flexibility, so the expandability and adaptability of a heterogeneous multi-core SOC are also a problem. A hardware programmable heterogeneous multi-core system on chip can overcome this problem: it combines the flexibility of a general-purpose processor with the parallel computing and hardware acceleration capabilities of an FPGA, and is an ideal solution for large-scale computing. However, current research on hardware programmable systems on chip mainly focuses on combining a general-purpose RISC processor such as ARM with an FPGA. A general-purpose processor, a DSP and an FPGA are rarely integrated in one chip so that each can exert its own characteristics and form a single-chip, fully programmable heterogeneous multi-core system on chip.
A search of the prior art shows that the patent with application number 201410273439.1, entitled 'heterogeneous multi-core processor based on ARM, DSP and FPGA and task scheduling method', discloses a heterogeneous multi-core processor technology. In that scheme, processors of the same type do not adopt a multi-core structure, and the heterogeneous processors are interconnected through a PCI peripheral interconnection bus or a simple bus network; the scheme therefore cannot solve the bottleneck of current SOC memory interfaces and buses, provides no programmable accelerator function, and its intensive data processing capability and practical applicability remain to be improved. The patent with application number 201110008399.4, entitled 'multi-core DSP reconfigurable special integrated circuit system', discloses a multi-core DSP reconfigurable application-specific integrated circuit system for digital signal processing, in which the multi-core DSP serves as the operation core, an FPGA implements the interconnection topology between the DSPs according to the task being executed, a control processor acquires the operation data of the multi-core DSP through a memory interface and transmits it to the DSPs, and a central processing unit schedules and controls the multi-core DSP tasks. The drawbacks of that system are as follows: the interconnection topology among the DSP cores is hard-wired according to the needs of the arithmetic process and itself participates in the operation, so although the distribution of DSP task blocks is clear, application adaptability is somewhat poor, extending the number of DSP cores is inconvenient, and flexible scheduling among the DSP cores is inconvenient. In addition, task scheduling of the multi-core DSP must still be controlled by a central processing unit, and the application of that system is focused on signal processing and bottom-layer data processing.
Disclosure of Invention
The invention aims to provide a hardware programmable heterogeneous multi-core system on a chip that enhances processor capability while keeping the flexible software programming and low power consumption of the DSP (digital signal processor) and the MPU (micro processing unit), and that offers hardware programmability, parallel computing capability, balanced and comprehensive service functions and a wide application range.
To solve the above technical problem, the present invention provides a hardware programmable heterogeneous multi-core system on a chip, including:
a multi-core DSP, a multi-core MPU, a GPU, an FPGA, an FPGA configuration peripheral module, an IO peripheral A and an IO peripheral B; wherein:
the IO peripheral A is connected with the multi-core DSP and the FPGA through an FE bus and transmits data to be processed to the multi-core DSP and the FPGA;
the multi-core DSP and the FPGA switch signal and data processing tasks through a DSP-PL bus, and are used for high-throughput data preprocessing, intensive data processing and bottom-layer algorithm operation;
the multi-core DSP and the multi-core MPU are both connected with a PS bus; the FPGA configuration peripheral module is connected with the PS bus; the FPGA is connected with the IO peripheral B through a BE bus;
the FPGA provides programmable hardware acceleration for the multi-core DSP and the multi-core MPU through an AIP_DSP bus and an AIP_MPU bus respectively; the GPU is connected with the multi-core MPU, the multi-core MPU implements system control, user interface and high-level algorithm operation, and the GPU is used for graphics acceleration.
Optionally, the FPGA includes a programmable DSP accelerator and a programmable MPU accelerator, and provides programmable hardware acceleration for the multi-core DSP and the multi-core MPU; the programmable DSP accelerator is connected with the multi-core DSP through the AIP_DSP bus, and the programmable MPU accelerator is connected with the multi-core MPU through the AIP_MPU bus.
Optionally, the AIP_DSP bus and the AIP_MPU bus are high-speed buses supporting Cache coherence, and are respectively connected to the L2 Cache memory of the multi-core DSP and the L1 Cache memory of the multi-core MPU, and the PS bus is a high-speed bus between the multi-core DSP and the multi-core MPU.
Optionally, the multi-core DSP and the multi-core MPU share the FPGA configuration peripheral module, and the FPGA configuration peripheral module is configured to perform static configuration and dynamic reconfiguration on the FPGA.
Optionally, the multi-core DSP configures an on-chip SRAM and a multi-level Cache to implement access of an acceleration instruction and data, and the L1, L2 Cache/SRAM of the multi-core DSP can be configured as a Cache or an on-chip SRAM.
Optionally, the FE bus, the DSP-PL bus, the PS bus and the BE bus are connected by topological interconnection, which gives the interconnection of the whole system its hierarchy and flexibility.
Optionally, the FPGA further includes an SRAM, a hardware IP, a software definable IP, a macro module, an on-chip memory, and an interface controller, and is connected to the processor system through a programmable cross bus or an on-chip network to implement parallel data operation, hardware acceleration, and SOC resource configuration.
Optionally, the hardware programmable heterogeneous multi-core system on a chip further includes DSP accelerator firmware and MPU accelerator firmware, which are respectively connected to the multi-core DSP and the multi-core MPU.
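The connections listed above can be pictured as a small graph. The following is only an illustrative sketch, not part of the disclosed design: the bus names come from the text, while the short block labels (`IO_A`, `FPGA_CFG`, and so on), the `GPU_LINK` label for the unnamed GPU connection, and the point-to-point treatment of each bus are assumptions made for the example.

```python
from collections import deque

# (bus, endpoint, endpoint) triples read from the description.
# Block labels and the "GPU_LINK" name are hypothetical; the text names no bus
# for the GPU connection.
LINKS = [
    ("FE", "IO_A", "DSP"),
    ("FE", "IO_A", "FPGA"),
    ("DSP-PL", "DSP", "FPGA"),
    ("PS", "DSP", "MPU"),
    ("PS", "FPGA_CFG", "DSP"),
    ("PS", "FPGA_CFG", "MPU"),
    ("BE", "FPGA", "IO_B"),
    ("AIP_DSP", "FPGA", "DSP"),
    ("AIP_MPU", "FPGA", "MPU"),
    ("GPU_LINK", "GPU", "MPU"),
]


def neighbours(block):
    """Return {neighbouring block: sorted list of buses that reach it}."""
    result = {}
    for bus, a, b in LINKS:
        if block == a:
            result.setdefault(b, []).append(bus)
        elif block == b:
            result.setdefault(a, []).append(bus)
    return {other: sorted(buses) for other, buses in result.items()}


def shortest_path(src, dst):
    """Breadth-first search over blocks; returns a list of blocks or None."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in sorted(neighbours(path[-1])):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(neighbours("DSP")["FPGA"])      # buses linking the DSP and the FPGA
print(shortest_path("IO_A", "IO_B"))  # front-end IO to back-end IO
```

In this model the multi-core DSP and the FPGA are joined by two buses (DSP-PL and AIP_DSP), which reflects the tight coupling the description emphasizes.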
The invention has the following beneficial effects:
(1) the system integrates heterogeneous components such as the multi-core DSP, the multi-core MPU, the GPU and the FPGA, each of which has its own strengths, so that the resulting SOC device has a balanced and comprehensive structure and a wide application range;
(2) the FPGA and the multi-core DSP are tightly coupled and share a high-speed IO bus, which suits the current trend of digital interfaces moving toward the front end and allows high-speed data to be received and processed. The data is sent to the FPGA or the multi-core DSP for high-bandwidth processing, and after the data rate has been reduced it is sent to the multi-core DSP and the multi-core MPU for further processing and analysis;
(3) a plurality of high-speed on-chip buses are arranged between the FPGA and the multi-core DSP, so that the data processing task can be switched between the FPGA and the multi-core DSP for multiple times, the heterogeneous characteristics of processing cores are fully utilized, and the flexibility of data processing is improved;
(4) the FPGA provides programmable hardware acceleration for the multi-core DSP and the multi-core MPU, meanwhile, the FPGA integrates a large number of hardware IP resources, and the connection of the hardware IP resources and the multi-core DSP and the multi-core MPU is programmable, so that the resource sharing and the system integration of the multi-core DSP and the multi-core MPU are facilitated, the BOM cost of the system is reduced, the size of the PCB is reduced, and the whole system can be further modified, optimized and upgraded after being shaped;
(5) the hardware programmable range of the SOC chip is not limited to an FPGA area, and partial programmable characteristics of a processor system consisting of a multi-core DSP, a multi-core MPU, a GPU, interconnections and other components are provided, so that the processor system architecture fixed by conventional hardware has certain flexibility and can be adjusted and optimized according to application.
Drawings
FIG. 1 is a schematic diagram of a hardware programmable heterogeneous multi-core system-on-a-chip structure provided by the present invention.
Detailed Description
The hardware programmable heterogeneous multi-core system on a chip according to the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments. Advantages and features of the present invention will become apparent from the following description and from the claims. Note that the drawings are in a very simplified form and are not to precise scale; they serve only to describe the embodiments of the present invention conveniently and clearly.
Example one
The invention provides a hardware programmable heterogeneous multi-core system on a chip, which comprises a multi-core DSP 1, a multi-core MPU 2, a GPU 3, DSP accelerator firmware 4, MPU accelerator firmware 5, an FPGA 6, an FPGA configuration peripheral module 7, an IO peripheral A 8, an IO peripheral B 9 and an on-chip high-speed bus interconnection network. A programmable DSP accelerator 16 and a programmable MPU accelerator 17 can be implemented in the FPGA 6. The on-chip high-speed bus interconnection network comprises an FE bus 10, a DSP-PL bus 11, a PS bus 12, a BE bus 13, an AIP_DSP bus 14 and an AIP_MPU bus 15, and all functional modules are connected through it.
With reference to FIG. 1, the IO peripheral A 8 is connected to the multi-core DSP 1 and the FPGA 6 through the FE bus 10 and transmits data to be processed to them. The multi-core DSP 1 and the FPGA 6 switch signal and data processing tasks through the DSP-PL bus 11, and are used for high-throughput data preprocessing, intensive data processing and bottom-layer algorithm operation. The multi-core DSP 1 and the multi-core MPU 2 are both connected to the PS bus 12; the DSP accelerator firmware 4 and the MPU accelerator firmware 5 are connected to the multi-core DSP 1 and the multi-core MPU 2 respectively; the FPGA configuration peripheral module 7 is connected to the PS bus 12; and the FPGA 6 is connected to the IO peripheral B 9 through the BE bus 13. The FPGA 6 provides programmable hardware acceleration for the multi-core DSP 1 and the multi-core MPU 2 through the AIP_DSP bus 14 and the AIP_MPU bus 15 respectively: the programmable DSP accelerator 16 is connected to the multi-core DSP 1 through the AIP_DSP bus 14, and the programmable MPU accelerator 17 is connected to the multi-core MPU 2 through the AIP_MPU bus 15. The GPU 3 is connected to the multi-core MPU 2. The AIP_DSP bus 14 and the AIP_MPU bus 15 are high-speed buses supporting Cache coherence, connected to the L2 Cache of the multi-core DSP 1 and the L1 Cache of the multi-core MPU 2 respectively, and the PS bus 12 is a high-speed bus between the multi-core DSP 1 and the multi-core MPU 2. The FE bus 10, the DSP-PL bus 11, the PS bus 12 and the BE bus 13 are connected by topological interconnection, which gives the interconnection of the whole system its hierarchy and flexibility.
The multi-core DSP 1 and the multi-core MPU 2 share the FPGA configuration peripheral module 7, so either can perform static configuration and dynamic reconfiguration of the FPGA 6 through the module as required. Static configuration at power-on is more conveniently managed by the multi-core MPU 2, while dynamic reconfiguration is performed more often by the multi-core DSP 1. To support high-speed operation, the multi-core DSP 1 is equipped with on-chip SRAM and a multi-level Cache to accelerate instruction and data access; the L1 and L2 Cache/SRAM of the multi-core DSP 1 can each be configured as Cache or as on-chip SRAM. The multi-core DSP 1 and the multi-core MPU 2 are also connected to the FPGA 6 through a shared memory.
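The division of labour around the shared configuration module (power-on static configuration by the MPU, later dynamic reconfiguration mostly by the DSP) can be sketched in software. This is a toy model under stated assumptions, not the disclosed hardware: the class name, the lock-based arbitration, the region names and the image names are all hypothetical.

```python
import threading


class FpgaConfigModule:
    """Toy model of the shared FPGA configuration peripheral module."""

    def __init__(self):
        self._lock = threading.Lock()  # assumed arbitration: one requester at a time
        self.regions = {}              # region name -> loaded image name
        self.history = []              # (requester, kind, region, image)

    def static_configure(self, requester, images):
        """Load a full set of region images, for example at power-on."""
        with self._lock:
            self.regions = dict(images)
            for region, image in images.items():
                self.history.append((requester, "static", region, image))

    def reconfigure(self, requester, region, image):
        """Swap one region at run time, leaving the other regions untouched."""
        with self._lock:
            if region not in self.regions:
                raise KeyError(f"unknown region: {region}")
            self.regions[region] = image
            self.history.append((requester, "dynamic", region, image))


# Hypothetical region and image names, chosen only for illustration.
cfg = FpgaConfigModule()
cfg.static_configure("MPU", {"dsp_accel": "fir_filter", "mpu_accel": "crypto"})
cfg.reconfigure("DSP", "dsp_accel", "fft_1024")
print(cfg.regions)
```

The lock stands in for whatever arbitration the shared module would need when both processors can issue configuration requests; the source does not describe that mechanism.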
Viewed from the top level, the system can be divided into a processor system area 18, an FPGA area and an IO peripheral area. The processor system area 18 includes the multi-core DSP 1, the multi-core MPU 2, the GPU 3, the DSP accelerator firmware 4, the MPU accelerator firmware 5, the FPGA configuration peripheral module 7, part of the on-chip high-speed bus interconnection network, and the like. The system has two major service areas: one is formed by the multi-core DSP 1 and the FPGA 6 and implements services such as high-throughput data preprocessing, intensive data operations and bottom-layer algorithms; the other is formed by the multi-core MPU 2, the GPU 3 and the programmable MPU accelerator 17 and implements services such as the user interface, high-level algorithms, network transmission and application execution. The FPGA 6 also implements the hardware acceleration functions required by the multi-core DSP 1 and the multi-core MPU 2. The hardware acceleration functions implemented by the FPGA 6 are field programmable and strike a compromise between flexibility and computing performance, although their performance falls somewhat short of accelerator firmware implemented as an ASIC.
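As a rough illustration of the two service areas, the mapping from service type to area can be written as a lookup table. The service names and block lists follow the paragraph above; the area keys (`dsp_fpga`, `mpu_gpu`) and the function name are hypothetical labels introduced for this sketch.

```python
# Service lists follow the description; the area keys are illustrative labels.
SERVICE_AREAS = {
    "dsp_fpga": {
        "blocks": ("multi-core DSP 1", "FPGA 6"),
        "services": {
            "high-throughput data preprocessing",
            "intensive data operation",
            "bottom-layer algorithm",
        },
    },
    "mpu_gpu": {
        "blocks": ("multi-core MPU 2", "GPU 3", "programmable MPU accelerator 17"),
        "services": {
            "user interface",
            "high-level algorithm",
            "network transmission",
            "application program operation",
        },
    },
}


def area_for(service):
    """Return the key of the service area that handles `service`."""
    for name, area in SERVICE_AREAS.items():
        if service in area["services"]:
            return name
    raise KeyError(f"no service area handles: {service}")


print(area_for("network transmission"))
print(area_for("intensive data operation"))
```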
The multi-core MPU 2 can be accelerated not only by the MPU accelerator firmware 5 and the programmable MPU accelerator 17; the multi-core DSP 1, the FPGA 6 and the programmable DSP accelerator 16, as a whole or in part, can also serve as an acceleration module for the multi-core MPU 2. In that case the acceleration module is programmable in both hardware and software and is more flexible in practical applications, and the data to be processed comes mostly from interfaces of the multi-core MPU 2 such as the high-speed network interface, SATA and PCI. A shared memory module is arranged between the multi-core DSP 1 and the multi-core MPU 2, through which they are connected to an external memory controller located in the FPGA area. The architecture also implements functions and modules such as inter-core communication and EDMA. At the connections to the memory interface and the accelerator interfaces, the on-chip high-speed bus implements Cache coherence, power management, bandwidth management, FIFO buffering and other functions as required, to meet the needs of high-performance, high-quality applications. In addition, power management and clock management functions are implemented, with the power management function implemented by the bus; these measures ensure the low power consumption of the SOC chip.
The bus connection and shared resources among the multi-core DSP, the multi-core MPU, the GPU, the IO peripheral (including the IO peripheral A and the IO peripheral B) and the memory can be properly adjusted and optimized through programmable resources according to the actual application requirements, namely the programmable range of the system on chip is not limited in an FPGA area, so that the conventional processor system architecture with fixed hardware has certain flexibility. Besides providing a programmable gate array, the FPGA also becomes a hardware resource pool of the whole chip of the SOC. In the FPGA part, besides a plurality of coarse-grained and fine-grained reconfigurable logic units and programmable interconnections, various IP modules such as a macro module, an on-chip memory, an interface controller and the like are integrated, and the hardware resources are connected with a processor system through a programmable cross bus or an on-chip network, so that the parallel data operation and hardware acceleration of large data volume are realized, the SOC resource configuration, the interface control realization and the like are also realized, the interconnection length is reduced, the integration flexibility of the SOC system is improved, the BOM cost and the PCB size of the whole machine are reduced, and compared with the situation that the resources are arranged on a chip, the optimization modification and the upgrade of the whole machine system after the shaping are facilitated, and the expandability and the upgradability of the whole machine system are improved.
A multi-path high-speed on-chip bus is configured between the multi-core DSP and the FPGA, so that data operation tasks can be switched between the multi-core DSP and the FPGA for multiple times, software implementation and hardware implementation of the algorithm can be performed in a flowing manner, or the software implementation and the hardware implementation are mutually nested, and the algorithm implementation is more flexible, adaptive and intelligent.
The operation of the hardware programmable heterogeneous multi-core system on chip is described below. Data to be processed, such as raw front-end data, is sent to the FPGA 6 and the multi-core DSP 1 through the high-speed IO peripheral A 8 and the FE bus 10. High-throughput data preprocessing and intensive data operations are carried out in the FPGA 6, and once the data rate has been reduced the data is sent to the multi-core DSP 1 for further processing. When the multi-core DSP 1 needs intensive parallel data operations during processing, the task can be switched to the FPGA 6 and the result returned to the multi-core DSP 1 afterwards. Conversely, when part of an algorithm running in the FPGA 6 needs functions and algorithms implemented in the multi-core DSP 1, the task can be switched from the FPGA 6 to the multi-core DSP 1 for execution and then switched back to the FPGA 6. The multi-channel DSP-PL bus 11 is provided to make this flexible data processing possible.
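The back-and-forth task switching described above can be mimicked with a short software pipeline. This is only a sketch: the four stage functions are arbitrary stand-ins chosen for the example (the source names no specific algorithms), and each change of engine between consecutive stages is counted as one hand-off of the kind the DSP-PL bus 11 is said to carry.

```python
def fpga_decimate(samples, factor=4):
    """Stand-in for FPGA preprocessing: average each full group of `factor` samples."""
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, len(samples) - factor + 1, factor)
    ]


def dsp_remove_mean(samples):
    """Stand-in for a DSP stage: subtract the mean value."""
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]


def fpga_square(samples):
    """Stand-in for a parallel element-wise operation handed back to the FPGA."""
    return [s * s for s in samples]


def dsp_total(samples):
    """Stand-in for a final DSP reduction."""
    return sum(samples)


# Each stage is tagged with the engine assumed to run it.
PIPELINE = [
    ("FPGA", fpga_decimate),
    ("DSP", dsp_remove_mean),
    ("FPGA", fpga_square),
    ("DSP", dsp_total),
]


def run(data):
    """Run all stages in order and count engine-to-engine hand-offs."""
    switches = 0
    previous = None
    for engine, stage in PIPELINE:
        if previous is not None and engine != previous:
            switches += 1
        data = stage(data)
        previous = engine
    return data, switches


result, switches = run(list(range(16)))
print(result, switches)
```

The first stage reduces sixteen input samples to four, which corresponds to the data-rate reduction the text places in the FPGA before the DSP takes over.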
The above description is only for the purpose of describing the preferred embodiments of the present invention, and is not intended to limit the scope of the present invention, and any variations and modifications made by those skilled in the art based on the above disclosure are within the scope of the appended claims.

Claims (8)

1. A hardware programmable heterogeneous multi-core system-on-a-chip, comprising:
a multi-core DSP (1), a multi-core MPU (2), a GPU (3), an FPGA (6), an FPGA configuration peripheral module (7), an IO peripheral A (8) and an IO peripheral B (9); wherein:
the IO peripheral A (8) is connected with the multi-core DSP (1) and the FPGA (6) through an FE bus (10) and transmits data to be processed into the multi-core DSP (1) and the FPGA (6);
the multi-core DSP (1) and the FPGA (6) realize the switching of signal and data processing tasks through a DSP-PL bus (11), and the multi-core DSP (1) and the FPGA (6) are used for data preprocessing with high throughput rate, intensive data processing and bottom layer algorithm operation;
the multi-core DSP (1) and the multi-core MPU (2) are both connected with a PS bus (12); the FPGA configuration peripheral module (7) is connected with the PS bus (12); the FPGA (6) is connected with the IO peripheral B (9) through a BE bus (13);
the FPGA (6) respectively realizes programmable hardware acceleration on the multi-core DSP (1) and the multi-core MPU (2) through an AIP_DSP bus (14) and an AIP_MPU bus (15); the GPU (3) is connected with the multi-core MPU (2), the multi-core MPU (2) realizes system control, user interface and high-level algorithm operation, and the GPU (3) is used for graphic acceleration.
2. The hardware programmable heterogeneous multi-core system on a chip of claim 1, wherein the FPGA (6) includes a programmable DSP accelerator (16) and a programmable MPU accelerator (17) therein, providing programmable hardware acceleration to the multi-core DSP (1) and the multi-core MPU (2); the programmable DSP accelerator (16) is connected with the multi-core DSP (1) through an AIP_DSP bus (14), and the programmable MPU accelerator (17) is connected with the multi-core MPU (2) through an AIP_MPU bus (15).
3. The hardware programmable heterogeneous multi-core system on a chip of claim 2, wherein the AIP_DSP bus (14) and the AIP_MPU bus (15) are high-speed buses supporting Cache coherence and are respectively connected to L2 Cache memories of the multi-core DSP (1) and L1 Cache memories of the multi-core MPU (2), and the PS bus (12) is a high-speed bus between the multi-core DSP (1) and the multi-core MPU (2).
4. The hardware programmable heterogeneous multi-core system on a chip of claim 1, wherein the multi-core DSP (1) and the multi-core MPU (2) share the FPGA configuration peripheral module (7), the FPGA configuration peripheral module (7) being configured for static configuration and dynamic reconfiguration of the FPGA (6).
5. The hardware programmable heterogeneous multi-core system on a chip of claim 1, wherein the multi-core DSP (1) configures an on-chip SRAM, a multi-level Cache to enable accelerated instruction and data access, L1, L2 Cache/SRAM of the multi-core DSP (1) being configurable as a Cache, or an on-chip SRAM.
6. The hardware programmable heterogeneous multi-core system on a chip of claim 1, wherein the FE bus (10), the DSP-PL bus (11), the PS bus (12) and the BE bus (13) are connected by a topological interconnect, providing hierarchy and flexibility of the whole system interconnect.
7. The hardware programmable heterogeneous multi-core system on a chip of claim 1, wherein the FPGA (6) further comprises SRAM, hardware IP, software definable IP, macro module, on-chip memory, and interface controller, connected to the processor system via a programmable crossbar bus or on-chip network to implement parallel data operations, hardware acceleration, and SOC resource configuration.
8. The hardware programmable heterogeneous multi-core system on a chip of claim 1, further comprising DSP accelerator firmware (4) and MPU accelerator firmware (5) respectively connected to the multi-core DSP (1) and the multi-core MPU (2).
CN202010333344.XA 2020-04-24 2020-04-24 Hardware programmable heterogeneous multi-core system on chip Active CN111506540B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010333344.XA CN111506540B (en) 2020-04-24 2020-04-24 Hardware programmable heterogeneous multi-core system on chip

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010333344.XA CN111506540B (en) 2020-04-24 2020-04-24 Hardware programmable heterogeneous multi-core system on chip

Publications (2)

Publication Number Publication Date
CN111506540A true CN111506540A (en) 2020-08-07
CN111506540B CN111506540B (en) 2021-11-30

Family

ID=71873030

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010333344.XA Active CN111506540B (en) 2020-04-24 2020-04-24 Hardware programmable heterogeneous multi-core system on chip

Country Status (1)

Country Link
CN (1) CN111506540B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112463723A (en) * 2020-12-17 2021-03-09 王志平 Method for realizing microkernel array
CN112463718A (en) * 2020-11-17 2021-03-09 中国计量大学 Signal recognition processing device
CN112506851A (en) * 2020-12-02 2021-03-16 广东电网有限责任公司佛山供电局 SOC chip architecture construction method for solving multi-core access conflict
CN114647610A (en) * 2022-02-17 2022-06-21 北京百度网讯科技有限公司 Voice chip implementation method, voice chip and related equipment
WO2024002172A1 (en) * 2022-06-29 2024-01-04 上海寒武纪信息科技有限公司 System on chip, instruction system, compilation system, and related product

Citations (9)

Publication number Priority date Publication date Assignee Title
CN102073481A (en) * 2011-01-14 2011-05-25 上海交通大学 Multi-kernel DSP reconfigurable special integrated circuit system
CN103516630A (en) * 2012-06-28 2014-01-15 成都鼎桥通信技术有限公司 Normalization data processing board and integrated equipment inside BBU machine frame
CN104021042A (en) * 2014-06-18 2014-09-03 哈尔滨工业大学 Heterogeneous multi-core processor based on ARM, DSP and FPGA and task scheduling method
CN104685516A (en) * 2012-08-17 2015-06-03 高通技术公司 Apparatus and methods for spiking neuron network learning
CN105279133A (en) * 2015-10-20 2016-01-27 电子科技大学 VPX parallel DSP signal processing board card based on SoC online reconstruction
WO2016077393A1 (en) * 2014-11-12 2016-05-19 Xilinx, Inc. Heterogeneous multiprocessor program compilation targeting programmable integrated circuits
CN106886690A (en) * 2017-01-25 2017-06-23 人和未来生物科技(长沙)有限公司 Heterogeneous platform for gene data computation and interpretation
CN106897581A (en) * 2017-01-25 2017-06-27 人和未来生物科技(长沙)有限公司 Reconfigurable heterogeneous platform for gene data interpretation
CN107562530A (en) * 2016-06-30 2018-01-09 无锡十月中宸科技有限公司 Server-based hybrid variable computing system

Patent Citations (9)

Publication number Priority date Publication date Assignee Title
CN102073481A (en) * 2011-01-14 2011-05-25 上海交通大学 Multi-kernel DSP reconfigurable special integrated circuit system
CN103516630A (en) * 2012-06-28 2014-01-15 成都鼎桥通信技术有限公司 Normalization data processing board and integrated equipment inside BBU machine frame
CN104685516A (en) * 2012-08-17 2015-06-03 高通技术公司 Apparatus and methods for spiking neuron network learning
CN104021042A (en) * 2014-06-18 2014-09-03 哈尔滨工业大学 Heterogeneous multi-core processor based on ARM, DSP and FPGA and task scheduling method
WO2016077393A1 (en) * 2014-11-12 2016-05-19 Xilinx, Inc. Heterogeneous multiprocessor program compilation targeting programmable integrated circuits
CN105279133A (en) * 2015-10-20 2016-01-27 电子科技大学 VPX parallel DSP signal processing board card based on SoC online reconstruction
CN107562530A (en) * 2016-06-30 2018-01-09 无锡十月中宸科技有限公司 Server-based hybrid variable computing system
CN106886690A (en) * 2017-01-25 2017-06-23 人和未来生物科技(长沙)有限公司 Heterogeneous platform for gene data computation and interpretation
CN106897581A (en) * 2017-01-25 2017-06-27 人和未来生物科技(长沙)有限公司 Reconfigurable heterogeneous platform for gene data interpretation

Cited By (8)

Publication number Priority date Publication date Assignee Title
CN112463718A (en) * 2020-11-17 2021-03-09 中国计量大学 Signal recognition processing device
CN112463718B (en) * 2020-11-17 2022-05-20 中国计量大学 Signal recognition processing device
CN112506851A (en) * 2020-12-02 2021-03-16 广东电网有限责任公司佛山供电局 SOC chip architecture construction method for solving multi-core access conflict
CN112506851B (en) * 2020-12-02 2022-02-11 广东电网有限责任公司佛山供电局 SOC chip architecture construction method for solving multi-core access conflict
CN112463723A (en) * 2020-12-17 2021-03-09 王志平 Method for realizing microkernel array
CN114647610A (en) * 2022-02-17 2022-06-21 北京百度网讯科技有限公司 Voice chip implementation method, voice chip and related equipment
CN114647610B (en) * 2022-02-17 2022-11-29 北京百度网讯科技有限公司 Voice chip implementation method, voice chip and related equipment
WO2024002172A1 (en) * 2022-06-29 2024-01-04 上海寒武纪信息科技有限公司 System on chip, instruction system, compilation system, and related product

Also Published As

Publication number Publication date
CN111506540B (en) 2021-11-30

Similar Documents

Publication Publication Date Title
CN111506540B (en) Hardware programmable heterogeneous multi-core system on chip
KR100986006B1 (en) Microprocessor subsystem
US20170220499A1 (en) Massively parallel computer, accelerated computing clusters, and two-dimensional router and interconnection network for field programmable gate arrays, and applications
US11789895B2 (en) On-chip heterogeneous AI processor with distributed tasks queues allowing for parallel task execution
US20210073170A1 (en) Configurable heterogeneous ai processor
CN110347635B (en) Heterogeneous multi-core microprocessor based on multilayer bus
CN104820657A (en) Inter-core communication method and parallel programming model based on embedded heterogeneous multi-core processor
CN107341053A (en) The programmed method of heterogeneous polynuclear programmable system and its memory configurations and computing unit
CN105279133A (en) VPX parallel DSP signal processing board card based on SoC online reconstruction
WO2020078470A1 (en) Network-on-chip data processing method and device
CN116028418B (en) GPDSP-based extensible multi-core processor, acceleration card and computer
CN111581152A (en) Reconfigurable hardware acceleration SOC chip system
CN109918335A (en) One kind being based on 8 road DSM IA frame serverPC system of CPU+FPGA and processing method
Zhan et al. A design of versatile image processing platform based on the dual multi-core DSP and FPGA
CN116757132A (en) Heterogeneous multi-core FPGA circuit architecture, construction method and data transmission method
CN102761578B (en) Cluster computing system
WO2023015656A1 (en) Embedded-oriented configurable manycore processor
Teimouri et al. Improving scalability of CMPs with dense ACCs coverage
US10629161B2 (en) Automatic multi-clock circuit generation
Rettkowski et al. Application-specific processing using high-level synthesis for networks-on-chip
Li et al. FPGA overlays: hardware-based computing for the masses
WO2022088171A1 (en) Neural processing unit synchronization systems and methods
CN217606354U (en) Reconfigurable edge calculation module
Huang et al. AIOC: An All-in-One-Card Hardware Design for Financial Market Trading System
Al-Ali et al. Supercomputer networks in the datacenter: Benchmarking the evolution of communication granularity from macroscale down to nanoscale

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant