CN108121685A - Embedded multi-core CPU firmware operation method - Google Patents

Embedded multi-core CPU firmware operation method

Info

Publication number
CN108121685A
Authority
CN
China
Prior art keywords
cpu
tcm
operation method
dtcm
itcm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710667129.1A
Other languages
Chinese (zh)
Inventor
杨建利
蔡震
张涛
周洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hung Qin (beijing) Technology Co Ltd
Original Assignee
Hung Qin (beijing) Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hung Qin (beijing) Technology Co Ltd filed Critical Hung Qin (beijing) Technology Co Ltd
Priority to CN201710667129.1A priority Critical patent/CN108121685A/en
Publication of CN108121685A publication Critical patent/CN108121685A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/177 Initialisation or configuration control

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)

Abstract

The present invention provides an embedded multi-core CPU firmware operation method. In a multi-core SMP CPU architecture, memory access is managed by a TCM management device, which can divide the SRAM attached to ITCM and DTCM into multiple ports so that multiple CPUs can access multiple ITCM/DTCM addresses in parallel. When two or more CPUs access the same ITCM/DTCM address, the access is managed by the arbitration logic of the ITCM/DTCM sub-port of the TCM management device. The method can distribute firmware tasks to the CPUs flexibly and in real time, which not only reduces the risk of having to repartition the firmware but also greatly improves the final performance of the embedded system.

Description

Embedded multi-core CPU firmware operation method
Technical field
The present invention relates to a processor operation method, and more particularly to an embedded multi-core CPU firmware operation method.
Background
Current multi-core CPU architectures can be divided into two types: AMP (heterogeneous) and SMP (homogeneous). In the former the CPUs differ from one another; in the latter the CPUs are identical. From a firmware development perspective, an SMP architecture generally needs only one firmware image, whereas an AMP architecture needs at least two separately compiled images and usually involves different build environments as well, which adds overhead to firmware development, debugging, and performance optimization. The present invention therefore addresses only SMP chip architectures.
A typical SMP architecture imposes certain hardware requirements. It needs dedicated hardware control logic for inter-core communication and synchronization; for example, the Snoop Control Unit (SCU) in the ARM architecture maintains cache coherency across cores. The hardware must also provide a CPU ID register inside each CPU core that software can read; for example, the Linux kernel reads the CPU ID register during startup through the smp_processor_id() function.
The licensing fee for a multi-core SMP CPU is generally much higher than for a single-core CPU, and execution efficiency does not scale linearly with the number of cores as expected.
Summary of the invention
The present invention provides an embedded multi-core CPU firmware operation method that solves the problem of multi-core SMP CPU operating efficiency failing to reach the expected linear scaling. The technical solution is as follows:
An embedded multi-core CPU firmware operation method in which, in a multi-core SMP CPU architecture, management is performed by a TCM management device. The TCM management device can divide the SRAM attached to ITCM and DTCM into multiple ports so that multiple CPUs can access multiple ITCM/DTCM addresses in parallel. When two or more CPUs access the same ITCM/DTCM address, the access is managed by the arbitration logic of the ITCM/DTCM sub-port of the TCM management device.
In the hardware architecture, multiple CPUs are attached to the same bus, each CPU has its own fixed CPU_ID number, and the firmware code is stored in on-chip SRAM or TCM, or in external DRAM.
In the hardware architecture, each main functional function is allocated CPU hardware resources by a weight management function. Before each main functional function is compiled, its CPU resource consumption is measured and its weight is set accordingly. After the firmware starts executing, the weight management function assigns each function to a CPU in real time so that the load on every CPU is the same, which safeguards the overall processing performance of the chip.
The weight management function defines its own weight management algorithm according to the actual application, and the algorithm assigns each function to the corresponding CPU through a funcX_cpu_id variable.
When the arbitration logic uses absolute priority, priority decreases as the CPU number increases. After the TCM management device receives a read/write request, it checks the CPUs in CPU-number order until it reaches CPUN. If the request comes from CPUN, arbitration is granted to CPUN and the TCM0 port carries the corresponding input and output signals of CPUN; once the request has been handled, the logic jumps straight back to the idle state to respond to the next request. If the request does not come from CPUN, the logic enters the idle state.
When the arbitration logic uses polling priority, one round runs from the judgment state of the first CPU to the judgment state of the last CPU, in CPU-number order. In the current state the corresponding CPU has the highest priority, and within each round every CPU holds the highest priority the same number of times.
The embedded multi-core CPU firmware operation method can distribute firmware tasks to the CPUs flexibly and in real time, which not only reduces the risk of having to repartition the firmware but also greatly improves the final performance of the embedded system.
Brief description of the drawings
Fig. 1 is a schematic diagram of the hardware architecture of the present invention;
Fig. 2 is a schematic diagram of the firmware structure of the present invention;
Fig. 3 is a flow chart of how the weight management algorithm assigns functions to the corresponding CPUs according to the application;
Fig. 4 is a schematic diagram of the TCM management device used by the present invention;
Fig. 5 is a schematic diagram of management by the arbitration logic of an ITCM/DTCM sub-port of the TCM management device;
Fig. 6 is a schematic diagram of the absolute priority arbitration logic;
Fig. 7 is a schematic diagram of the polling priority arbitration logic.
Detailed description
In an embedded multi-core system, the CPU cores sometimes need to cooperate to reach a higher overall performance target. How firmware tasks are distributed among the cores and executed by them is critical to the final performance of the embedded system. The present invention proposes an embedded multi-core CPU firmware operation method that distributes firmware tasks to the CPUs flexibly and in real time, which not only reduces the risk of having to repartition the firmware but also greatly improves the final performance of the embedded system.
The embedded multi-core CPU firmware operation method proposed by the present invention achieves efficient concurrent cooperation among cores without SCU-like hardware control logic. The hardware structure is shown in Fig. 1: multiple CPUs are attached to the same bus, each CPU has its own fixed CPU_ID number, and the firmware code can be stored in on-chip SRAM or TCM (tightly coupled memory), or in external DRAM.
As shown in Fig. 2, in the firmware architecture each main functional function is allocated CPU hardware resources by the weight management function. Before each main functional function is compiled, developers measure its CPU resource consumption and set its weight accordingly. After the firmware starts executing, the weight management function assigns each function to a CPU in real time, keeping the load on the CPUs as equal as possible to safeguard the overall processing performance of the chip.
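One way to picture how such assignments might be consumed is sketched below in Python, used here only as a modelling language. It assumes, and the text does not state this, that every CPU runs the same firmware image, reads its own CPU_ID, and executes only the functions whose funcX_cpu_id value matches. The names `run_on_cpu`, `FUNCTIONS`, `FUNC_CPU_ID`, and the placeholder function bodies are hypothetical.

```python
# Hypothetical model: one shared firmware image; each CPU filters by its CPU_ID.

def func_a():
    return "A done"

def func_b():
    return "B done"

def func_c():
    return "C done"

# Main functional functions contained in the shared image (placeholder bodies).
FUNCTIONS = {"FuncA": func_a, "FuncB": func_b, "FuncC": func_c}

# Stand-in for the funcX_cpu_id variables written by the weight management
# function; the values here are made up for illustration.
FUNC_CPU_ID = {"FuncA": 0, "FuncB": 1, "FuncC": 0}

def run_on_cpu(cpu_id):
    """Run every function assigned to cpu_id and return the names executed."""
    executed = []
    for name, func in FUNCTIONS.items():
        if FUNC_CPU_ID[name] == cpu_id:
            func()
            executed.append(name)
    return executed
```

In this model a CPU with no matching entries simply executes nothing, so changing the assignment only requires changing the table, not the functions themselves.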
The weight management function defines its own weight management algorithm according to the actual application, so the algorithm may differ from one application to another. It assigns each function to the corresponding CPU through a funcX_cpu_id variable. The flow chart is shown in Fig. 3. Assuming there are N CPUs, the algorithm comprises the following steps:
(1) The weight management function first checks CPU 0. If CPU 0 has not exceeded its load limit, it checks whether CPU 0 would exceed the limit after its load is updated with the new function. If it still would not, the function is assigned to CPU 0 through the funcX_cpu_id variable, and the next function is processed;
(2) If CPU 0 has already exceeded its limit, or would exceed it after the load update, the weight management function moves on to CPU 1, checks whether CPU 1 exceeds its limit, and proceeds as in step (1);
(3) The CPUs are checked in turn in this way; if the last CPU, CPU N, also exceeds its limit, a warning is raised.
For ease of description, the present invention is illustrated with 2 CPUs executing 5 main functional functions. Assume the 5 function weights are as follows:
FuncA weight: 2
FuncB weight: 3
FuncC weight: 5
FuncD weight: 4
FuncE weight: 6
Taking a total function weight limit of 10 per CPU as an example, after the steps above are performed the weights on CPU0 and CPU1 are essentially balanced: for example, CPU0 executes functions A + B + C and CPU1 executes functions D + E.
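Steps (1) to (3) can be sketched as a first-fit loop. The following Python model is an illustration under stated assumptions, not the patent's implementation: "exceeded" is taken to mean that a CPU's accumulated weight is greater than the per-CPU limit, functions are processed in the order listed, and the warning in step (3) is modelled as an exception. The names `assign_functions`, `limit`, and `func_cpu_id` are hypothetical.

```python
def assign_functions(weights, n_cpus, limit):
    """First-fit assignment of weighted functions to CPUs.

    weights: dict mapping function name -> weight, in processing order.
    Returns (func_cpu_id, load), where func_cpu_id maps each function name
    to a CPU number and load[i] is the total weight placed on CPU i.
    """
    load = [0] * n_cpus
    func_cpu_id = {}
    for name, weight in weights.items():
        for cpu in range(n_cpus):
            # A single test covers both checks in step (1): with non-negative
            # weights, a CPU already over the limit also fails this test.
            if load[cpu] + weight <= limit:
                load[cpu] += weight
                func_cpu_id[name] = cpu
                break
        else:
            # Step (3): every CPU would exceed its limit (assumed behaviour).
            raise RuntimeError(f"no CPU has capacity left for {name}")
    return func_cpu_id, load


weights = {"FuncA": 2, "FuncB": 3, "FuncC": 5, "FuncD": 4, "FuncE": 6}
func_cpu_id, load = assign_functions(weights, n_cpus=2, limit=10)
```

With the five weights above, two CPUs, and a limit of 10, this sketch assigns FuncA, FuncB, and FuncC to CPU 0 and FuncD and FuncE to CPU 1, for a load of 10 on each, which matches the split described in the text.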
The execution efficiency of the invention differs somewhat across hardware resources (CPU cache, DRAM, SRAM, TCM). The present invention therefore proposes a TCM management device to maximize the execution efficiency of the weight management algorithm. The TCM management device is shown in Fig. 4:
The TCM management device has multiple inputs ITCM_i/DTCM_i (i is not less than N, where N is the number of CPUs), which are connected in sequence to the N CPUs through TCM interfaces. It has 2i outputs, namely i ITCM outputs and i DTCM outputs, connected to the ITCM SRAM and the DTCM SRAM respectively.
The TCM management device can divide the SRAM attached to ITCM (instruction tightly coupled memory) and DTCM (data tightly coupled memory) into multiple ports, so that multiple CPUs (processors) can access multiple ITCM/DTCM addresses in parallel, improving processing efficiency. When two or more CPUs access the same ITCM/DTCM address, the access is managed by the arbitration logic of the ITCM/DTCM sub-port of the TCM management device. Fig. 5 shows this, taking ITCM0 as an example:
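The split between parallel access and arbitrated access can be modelled in a few lines of Python. This is a sketch under assumptions the text does not spell out: each port is assumed to serve one contiguous bank of `bank_size` bytes, and a conflict is assumed to arise whenever two or more CPUs target the same port in the same cycle. The function names and the bank size are hypothetical.

```python
from collections import defaultdict

def group_requests_by_port(accesses, bank_size):
    """Map each requested address to a port and list the CPUs targeting it.

    accesses: dict mapping CPU number -> requested TCM address.
    Returns a dict mapping port number -> list of CPU numbers (ascending).
    """
    ports = defaultdict(list)
    for cpu, address in sorted(accesses.items()):
        ports[address // bank_size].append(cpu)
    return dict(ports)

def ports_needing_arbitration(ports):
    """Keep only the ports that more than one CPU is trying to use."""
    return {port: cpus for port, cpus in ports.items() if len(cpus) > 1}


# CPU 0 and CPU 2 hit the first bank; CPU 1 hits the second bank.
ports = group_requests_by_port({0: 0x0000, 1: 0x1004, 2: 0x0010}, bank_size=0x1000)
conflicts = ports_needing_arbitration(ports)
```

In this example CPU 1 proceeds without waiting because it is alone on its port, while the first port has two requesters and would hand the decision to its arbitration logic.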
The arbitration logic can use absolute priority or polling priority, depending on the application.
The absolute priority arbitration logic is shown in Fig. 6:
In the absolute priority arbitration logic, the priority order is CPU0 > CPU1 > … > CPUN. After the TCM management device receives a read/write request, it handles it in the following order:
1) First check whether the request comes from CPU0. If so, grant arbitration to CPU0, with the TCM0 port carrying the corresponding input and output signals of CPU0; once the request has been handled, jump straight back to the idle state to respond to the next request. If the request does not come from CPU0, proceed to the next check.
2) Check whether the request comes from CPU1. If so, grant arbitration to CPU1, with the TCM0 port carrying the corresponding input and output signals of CPU1; once the request has been handled, jump straight back to the idle state to respond to the next request. If the request does not come from CPU1, proceed to the next check.
3) The same logic applies to the remaining CPUs, up to the check for CPUN. If the request comes from CPUN, grant arbitration to CPUN, with the TCM0 port carrying the corresponding input and output signals of CPUN; once the request has been handled, jump straight back to the idle state to respond to the next request. If the request does not come from CPUN, enter the idle state.
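The chain of checks in steps 1) to 3) amounts to a fixed-priority selection. A minimal Python model, assuming the pending requests on one port are given as a set of CPU numbers (the function name and this representation are hypothetical), could look like this:

```python
def absolute_priority_grant(requests, n_cpus):
    """Return the CPU granted the port, or None if nobody is requesting.

    requests: set of CPU numbers with a pending request on this port.
    CPUs are checked in ascending order, so CPU 0 always wins a conflict.
    """
    for cpu in range(n_cpus):
        if cpu in requests:
            return cpu  # grant, then the arbiter returns to idle
    return None  # no request matched: stay in the idle state
```

Because the scan always restarts from CPU 0, a lower-numbered CPU that requests continuously can hold off higher-numbered CPUs indefinitely, which is the usual trade-off of a fixed-priority scheme.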
The polling priority arbitration logic is shown in Fig. 7:
In the polling priority arbitration logic, one round runs from the CPU0 judgment state to the CPUN judgment state. In the current state the corresponding CPU has the highest priority, and within each round every CPU holds the highest priority the same number of times. After the TCM management device receives a read/write request, it handles it in the following order:
1) First enter the CPU0 judgment state. If the current request comes from CPU0, grant arbitration to CPU0, with the TCM0 port carrying the corresponding input and output signals of CPU0; once the request has been handled, enter the CPU1 judgment state. If the request does not come from CPU0, enter the CPU1 judgment state directly.
2) In the CPU1 judgment state, check whether the current request comes from CPU1. If so, grant arbitration to CPU1, with the TCM0 port carrying the corresponding input and output signals of CPU1; once the request has been handled, enter the next judgment state. If the request does not come from CPU1, enter the next judgment state directly.
3) The remaining judgment states work the same way, up to the CPUN state. Check whether the request comes from CPUN. If so, grant arbitration to CPUN, with the TCM0 port carrying the corresponding input and output signals of CPUN; once the request has been handled, enter the idle state. If the request does not come from CPUN, enter the idle state directly.
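The judgment states in steps 1) to 3) can be modelled as a small state machine that visits one CPU per step. The Python sketch below is a simplification under stated assumptions: the idle state between rounds is folded into the wrap-around from the last CPU back to CPU 0, and pending requests are given as a set of CPU numbers. The class and method names are hypothetical.

```python
class PollingArbiter:
    """Visits the CPU judgment states in order, one state per step."""

    def __init__(self, n_cpus):
        self.n_cpus = n_cpus
        self.state = 0  # CPU whose judgment state is currently active

    def step(self, requests):
        """Examine the current CPU, advance to the next state, return the grant.

        requests: set of CPU numbers with a pending request on this port.
        Returns the granted CPU number, or None if that CPU is not requesting.
        """
        cpu = self.state
        self.state = (self.state + 1) % self.n_cpus  # wrap = start a new round
        return cpu if cpu in requests else None


arbiter = PollingArbiter(n_cpus=3)
grants = [arbiter.step({0, 2}) for _ in range(6)]
```

With CPU 0 and CPU 2 both requesting continuously on a three-CPU port, successive steps grant 0, none, 2, 0, none, 2, so CPU 2 is served once per round, whereas an absolute priority arbiter would grant CPU 0 every time.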
The present invention can reduce multi-core SMP licensing costs and makes firmware development flexible: CPU resources can be repartitioned simply by modifying the weight management function. The TCM (tightly coupled memory) management module can execute instruction fetches and data reads and writes for multiple CPUs in parallel, and its overall efficiency is higher than that of an SMP SCU.

Claims (6)

1. An embedded multi-core CPU firmware operation method in which, in a multi-core SMP CPU architecture, management is performed by a TCM management device; the TCM management device can divide the SRAM attached to ITCM and DTCM into multiple ports, ensuring that multiple CPUs can access multiple ITCM/DTCM addresses in parallel; and when two or more CPUs access the same ITCM/DTCM address, the access is managed by the arbitration logic of the ITCM/DTCM sub-port of the TCM management device.
2. The embedded multi-core CPU firmware operation method according to claim 1, characterized in that: in the hardware architecture, multiple CPUs are attached to the same bus, each CPU has its own fixed CPU_ID number, and the firmware code is stored in on-chip SRAM or TCM, or in external DRAM.
3. The embedded multi-core CPU firmware operation method according to claim 1, characterized in that: in the hardware architecture, each main functional function is allocated CPU hardware resources by a weight management function; before each main functional function is compiled, its CPU resource consumption is measured and its weight is set accordingly; and after the firmware starts executing, the weight management function assigns each function to a CPU in real time so that the load on every CPU is the same, to safeguard the overall processing performance of the chip.
4. The embedded multi-core CPU firmware operation method according to claim 3, characterized in that: the weight management function defines its own weight management algorithm according to the actual application, and the weight management algorithm assigns each function to the corresponding CPU through a funcX_cpu_id variable.
5. The embedded multi-core CPU firmware operation method according to claim 1, characterized in that: when the arbitration logic uses absolute priority, priority decreases as the CPU number increases; after the TCM management device receives a read/write request, it checks the CPUs in CPU-number order until it reaches CPUN; if the request comes from CPUN, arbitration is granted to CPUN and the TCM0 port carries the corresponding input and output signals of CPUN, and once the request has been handled the logic jumps straight back to the idle state to respond to the next request; if the request does not come from CPUN, the logic enters the idle state.
6. The embedded multi-core CPU firmware operation method according to claim 1, characterized in that: when the arbitration logic uses polling priority, one round runs from the judgment state of the first CPU to the judgment state of the last CPU, in CPU-number order; in the current state the corresponding CPU has the highest priority, and within each round every CPU holds the highest priority the same number of times.
CN201710667129.1A 2017-08-07 2017-08-07 A kind of embedded multi-core cpu firmware operation method Pending CN108121685A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710667129.1A CN108121685A (en) 2017-08-07 2017-08-07 A kind of embedded multi-core cpu firmware operation method

Publications (1)

Publication Number Publication Date
CN108121685A true CN108121685A (en) 2018-06-05

Family

ID=62228167

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710667129.1A Pending CN108121685A (en) 2017-08-07 2017-08-07 A kind of embedded multi-core cpu firmware operation method

Country Status (1)

Country Link
CN (1) CN108121685A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004078713A (en) * 2002-08-21 2004-03-11 Nec Computertechno Ltd Crossbar switch arbitration controlling method, and crossbar switch arbitration controlling system
CN101164051A (en) * 2005-03-01 2008-04-16 高通股份有限公司 Bus access arbitration scheme
CN101667165A (en) * 2009-09-28 2010-03-10 中国电力科学研究院 Bus sharing method and device for distributed multi-master CPUs
CN104536916A (en) * 2014-12-18 2015-04-22 华为技术有限公司 Arbitration method for multi-core system and multi-core system
CN104699641A (en) * 2015-03-20 2015-06-10 浪潮集团有限公司 EDMA (enhanced direct memory access) controller concurrent control method in multinuclear DSP (digital signal processor) system
CN106569727A (en) * 2015-10-08 2017-04-19 福州瑞芯微电子股份有限公司 Shared parallel data reading-writing apparatus of multi memories among multi controllers, and reading-writing method of the same

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李兰英等: "《Nios II嵌入式软核SOPC设计原理及应用》", 30 November 2016 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109086086A (en) * 2018-08-06 2018-12-25 深圳忆联信息系统有限公司 A kind of starting method and device for the multi-core CPU that non-space is shared
CN109086086B (en) * 2018-08-06 2021-06-08 深圳忆联信息系统有限公司 Starting method and device of non-space-sharing multi-core CPU
CN109800032A (en) * 2019-01-31 2019-05-24 深圳忆联信息系统有限公司 BOOTROM multicore loading method and device
CN109800032B (en) * 2019-01-31 2022-03-25 深圳忆联信息系统有限公司 BOOTROM multi-core loading method and device
CN111831226A (en) * 2020-07-07 2020-10-27 山东华芯半导体有限公司 Method for accelerating processing of autonomously output NVME protocol command
CN111831226B (en) * 2020-07-07 2023-09-29 山东华芯半导体有限公司 Autonomous output NVME protocol command acceleration processing method
CN112114754A (en) * 2020-09-25 2020-12-22 青岛信芯微电子科技股份有限公司 System-on-chip SOC for processing backlight data and terminal equipment
CN112114754B (en) * 2020-09-25 2023-10-27 青岛信芯微电子科技股份有限公司 System-on-chip (SOC) for processing backlight data and terminal equipment
CN115185858A (en) * 2022-09-09 2022-10-14 北京特纳飞电子技术有限公司 Processing method and device for address mapping table and storage equipment


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180605
