CN102446087B - Instruction prefetching method and device - Google Patents


Info

Publication number
CN102446087B
Authority
CN
China
Prior art keywords
instruction
prefetch
request
fetch
prefetch request
Prior art date
Application number
CN 201010508876
Other languages
Chinese (zh)
Other versions
CN102446087A (en
Inventor
李宏亮
谢向辉
任秀江
郑方
吕晖
钱磊
Original Assignee
无锡江南计算技术研究所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 无锡江南计算技术研究所
Priority to CN 201010508876
Publication of CN102446087A
Application granted
Publication of CN102446087B


Abstract

An instruction prefetching method and prefetching device. The instruction prefetching device provides an instruction prefetching service to processor cores and comprises a fetch control unit and an instruction cache unit. The fetch control unit is configured to receive a prefetch request provided by a processor core; based on the prefetch request, to search the instruction cache unit for the instruction corresponding to the prefetch request, or to direct the instruction cache unit to obtain that instruction from off-chip main memory; and, based on the prefetch request, to direct the instruction cache unit to provide the instruction corresponding to the prefetch request to the processor core. The instruction cache unit is configured to store instructions and, in response to directions from the fetch control unit, to obtain the instruction corresponding to the prefetch request from off-chip main memory and to provide it to the processor core. The instruction prefetching method and prefetching device of the invention implement instruction prefetching for a multi-core processor in a relatively simple manner, simplify the management logic of hardware instruction storage, and improve the processing efficiency of the processor.

Description

Instruction prefetching method and prefetching device

Technical field

[0001] The present invention relates to the field of computer technology, and more particularly to an instruction prefetching method and prefetching device for a multi-core processor.

Background

[0002] A multi-core processor integrates multiple processor cores and related functional components on a single processor chip, forming a processor structure that contains multiple processor cores. Compared with earlier single-core processors, the data processing capability of a multi-core processor is greatly improved because it integrates multiple processor cores.

[0003] When a multi-core processor executes a program, the limited-capacity local instruction buffer of each processor core cannot hold a large program, so the instructions that are needed must be loaded continually from off-chip main memory into the local instruction buffer. However, the access speed of off-chip main memory can hardly keep up with the processing speed of the processor cores, and in actual operation multiple processor cores usually compete for access to off-chip main memory. This makes instruction fetching more difficult, so the data processing capability of the multi-core processor cannot be fully exploited.

[0004] To address the problem that processor capability cannot be fully exploited because of instruction fetching, Chinese patent application No. 01816274.6 provides a method and apparatus that use an auxiliary processor to prefetch instructions for a main processor: an auxiliary processor external to the main processor prefetches instructions so that the processing capability of the processor is fully used. However, that method can provide an instruction prefetching service to only one processor and is not suited to a multi-core processor structure. In addition, the auxiliary processor used for prefetching instructions must execute a simplified version of the program, which incurs a large hardware overhead.

Summary of the invention

[0005] The problem solved by the present invention is to provide an instruction prefetching method and prefetching device that simplify the management logic of hardware instruction storage and improve the processing efficiency of the processor.

[0006] To solve the above problem, the present invention provides an instruction prefetching device for providing an instruction prefetching service to processor cores, comprising a fetch control unit and an instruction cache unit, wherein:

[0007] The fetch control unit is configured to receive a prefetch request provided by a processor core; based on the prefetch request, to search the instruction cache unit for the instruction corresponding to the prefetch request, or to direct the instruction cache unit to obtain that instruction from off-chip main memory; and, based on the prefetch request, to direct the instruction cache unit to provide the instruction corresponding to the prefetch request to the processor core;

[0008] The instruction cache unit is configured to store instructions and, in response to directions from the fetch control unit, to obtain the instruction corresponding to the prefetch request from off-chip main memory and to provide the instruction corresponding to the prefetch request to the processor core.

[0009] Optionally, the fetch control unit is further configured to merge prefetch requests provided by different processor cores and to perform the instruction prefetch operation based on the merged prefetch request.

[0010] Optionally, the fetch control unit comprises a prefetch request merging unit, an instruction engine, a fetch buffer, a memory access buffer, a fill buffer, and a transfer unit, wherein:

[0011] The prefetch request merging unit is configured to receive prefetch requests provided by processor cores, merge prefetch requests that have the same fetch target, and provide the merged prefetch request to the fetch buffer;

[0012] The instruction engine is configured to obtain a prefetch request from the fetch buffer and, based on the prefetch request, to search the instruction cache unit for the corresponding instruction; if the prefetch request hits, the prefetch request is written to the fill buffer; if the prefetch request misses and does not conflict with the prefetch requests already stored in the fill buffer, the missed prefetch request is provided to the memory access buffer;

[0013] The fetch buffer, the memory access buffer, and the fill buffer are configured to temporarily store prefetch requests;

[0014] The transfer unit is configured to obtain a prefetch request from the fill buffer, obtain instructions from the instruction cache unit, and, based on the prefetch request, provide the instruction corresponding to the prefetch request to the processor core;

[0015] The off-chip main memory obtains a prefetch request from the memory access buffer and, based on the prefetch request, provides the corresponding instruction to the instruction cache unit and provides the prefetch request to the fill buffer.

[0016] Correspondingly, the present invention also provides an instruction prefetching method, comprising:

[0017] obtaining a prefetch request provided by a processor core;

[0018] based on the prefetch request, obtaining the instruction corresponding to the prefetch request and storing it in the on-chip instruction cache of the processor;

[0019] providing the instruction in the on-chip instruction cache that corresponds to the prefetch request to the processor core that corresponds to the prefetch request.

[0020] Compared with the prior art, the present invention has the following advantages:

[0021] 1. Instruction prefetching for a multi-core processor is implemented in a relatively simple manner, without using an auxiliary processor to execute a program, which reduces hardware overhead, simplifies the management logic of hardware instruction storage, and improves the processing efficiency of the processor;

[0022] 2. Prefetch requests provided by different processor cores can be merged before the instruction prefetch operation is performed, so that instructions are loaded simultaneously into the processor cores that need the same instructions, which further improves the efficiency of instruction prefetching.

Brief description of the drawings

[0023] FIG. 1 shows a first embodiment of the instruction prefetching device of the present invention;

[0024] FIG. 2 shows a second embodiment of the instruction prefetching device of the present invention;

[0025] FIG. 3 shows one implementation of the prefetch request merging unit in the instruction prefetching device of the present invention;

[0026] FIG. 4 shows a cascade structure of prefetch request merging units;

[0027] FIG. 5 shows the flow of one embodiment of the instruction prefetching method of the present invention.

Detailed description

[0028] To make the above objects, features, and advantages of the present invention clearer and easier to understand, specific embodiments of the invention are described in detail below with reference to the accompanying drawings.

[0029] Many specific details are set forth in the following description to provide a thorough understanding of the present invention. However, the invention can also be implemented in ways other than those described here, and it is therefore not limited by the specific embodiments disclosed below.

[0030] As stated in the Background section, the prior-art method of using an auxiliary processor to prefetch instructions for a main processor can provide an instruction prefetching service to only one processor and is not suited to a multi-core processor structure. In addition, the auxiliary processor used for prefetching instructions must execute a simplified version of the program, which incurs a large hardware overhead.

[0031] To address these problems, the inventors provide an instruction prefetching method and prefetching device for a processor that simplify the management logic of hardware instruction storage and effectively improve the processing efficiency of the processor. Depending on the specific embodiment, the method and device can be used with either a single-core processor or a multi-core processor. For ease of explanation, the following embodiments all use a multi-core processor as the example, but this should not limit the scope of the invention.

[0032] First embodiment

[0033] FIG. 1 shows a first embodiment of the instruction prefetching device of the present invention.

[0034] As shown in FIG. 1, the instruction prefetching device comprises a fetch control unit 101 and an instruction cache unit 103. The instruction prefetching device works with off-chip main memory 105 and multiple processor cores 107 to provide an instruction prefetching service to the processor cores 107, wherein:

[0035] The fetch control unit 101 is configured to receive a prefetch request provided by a processor core 107; based on the prefetch request, to search the instruction cache unit 103 for the instruction corresponding to the prefetch request, or to direct the instruction cache unit 103 to obtain that instruction from off-chip main memory 105; and, based on the prefetch request, to direct the instruction cache unit 103 to provide the instruction corresponding to the prefetch request to the local instruction buffer 111 of the processor core 107.

[0036] The instruction cache unit 103 is configured to store instructions and, in response to directions from the fetch control unit 101, to obtain the instruction corresponding to the prefetch request from off-chip main memory 105 and to provide the instruction corresponding to the prefetch request to the processor core 107.

[0037] The processor core 107 contains a microprocessing unit 109 and a local instruction buffer 111. The local instruction buffer 111 stores the instructions provided by the instruction cache unit 103 and provides them to the corresponding microprocessing unit 109; the microprocessing unit 109 calls the instructions stored in the local instruction buffer 111 and executes them.

[0038] In a specific embodiment, the instruction prefetching device is integrated in the processor and is used to prefetch instructions and provide them to one or more processor cores of the processor.

[0039] To implement instruction prefetching, the prefetch request should contain prefetch instruction information, core demand information, and an instruction fill address, among other information. The prefetch instruction information identifies the instructions to be prefetched; the core demand information specifies which processor cores the prefetched instructions need to be placed in; and the instruction fill address is the specific address in the local instruction buffer of the processor core at which the prefetched instructions are stored. Preferably, different processor cores use the same instruction fill address, so that the instructions corresponding to the prefetch instruction information are stored at the same instruction fill address in each of the at least one processor core.
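The three kinds of information named above can be pictured as a small record. The following Python sketch is illustrative only: the class name, the field names, and the choice of a bit mask for the core demand information are assumptions made for this example and do not come from the patent text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PrefetchRequest:
    # All field names are hypothetical; the text names only the three kinds of information.
    instr_ids: Tuple[int, ...]  # prefetch instruction information: which instructions to prefetch
    core_mask: int              # core demand information: bit i set means core i needs them
    fill_addr: int              # instruction fill address in each target core's local buffer


def target_cores(req: PrefetchRequest, num_cores: int) -> List[int]:
    """Return the indices of the cores that the request asks to be filled."""
    return [i for i in range(num_cores) if (req.core_mask >> i) & 1]


# Example: cores 0 and 1 both want two instructions placed at the same fill address.
req = PrefetchRequest(instr_ids=(0x1000, 0x1004), core_mask=0b0011, fill_addr=0x40)
print(target_cores(req, 4))
```

Using one shared `fill_addr` for every target core mirrors the stated preference that different cores use the same instruction fill address.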

[0040] The prefetch request described above applies to a multi-core processor. Depending on the specific embodiment, the processor may contain only one processor core. In that case, issuing the prefetch request and receiving the prefetched instructions involve only one processor core, so the prefetch request does not need to contain core demand information.

[0041] In a specific embodiment, the prefetch instruction information may contain the identifiers of one or more instructions to be prefetched; according to the prefetch instruction information, the instruction cache unit provides the one or more corresponding instructions to the processor cores indicated by the core demand information. In particular, different processor cores may prefetch the same instructions or different instructions.

[0042] Next, the operation of the first embodiment of the instruction prefetching device is described. In this embodiment, the processor that integrates the instruction prefetching device contains multiple processor cores, which work together to complete processing tasks such as data computation.

[0043] When the processor executes a processing task, the instructions executed by a processor core are usually obtained directly from that core's local instruction buffer. Because the storage capacity of the local instruction buffer is limited, the processor core sends a prefetch request to the fetch control unit in advance, before the instructions in the local instruction buffer have all been processed, that is, before the currently executing program has finished. In a specific embodiment, the prefetch request can be added to the program to be executed in the form of a prefetch request instruction. When the program currently executed by the processor runs into different branches, the processor core can issue different prefetch requests based on the different prefetch request instructions in the current program.

[0044] A prefetch request issued by a processor core can obtain instructions for the issuing core alone, or it can prefetch instructions for other processor cores as well. Instructions prefetched together for multiple processor cores can be used to run subsequent instructions synchronously.

[0045] After obtaining a prefetch request, the fetch control unit searches the instruction cache unit for the instruction corresponding to the prefetch instruction information in the request. If the instruction cache unit does not hold the corresponding instruction, the fetch control unit, based on the prefetch request, directs the instruction cache unit to obtain the instruction corresponding to the prefetch instruction information from off-chip main memory and store it. The instructions stored in off-chip main memory are produced by software compilation rather than by dynamic execution in hardware, which greatly reduces hardware overhead.
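The search-then-fetch behaviour described in this paragraph can be sketched in a few lines of Python. This is a simplified software model under stated assumptions, not the hardware design: the instruction cache unit and off-chip main memory are modelled as plain dictionaries, and the function name `ensure_cached` is hypothetical.

```python
def ensure_cached(instr_cache, main_memory, instr_id):
    """Make sure instr_id is in the instruction cache.

    Returns True if the lookup hit, False if the instruction had to be
    copied in from (modelled) off-chip main memory first.
    """
    if instr_id in instr_cache:
        return True
    # Miss: the cache is directed to obtain the instruction and store it.
    instr_cache[instr_id] = main_memory[instr_id]
    return False


# Hypothetical contents: main memory holds the compiled program,
# the cache initially holds only one of its instructions.
main_memory = {0x1000: "add", 0x1004: "mul"}
instr_cache = {0x1000: "add"}

hit_first = ensure_cached(instr_cache, main_memory, 0x1000)   # already cached
hit_second = ensure_cached(instr_cache, main_memory, 0x1004)  # fetched on miss
```

After both calls the cache holds both instructions, so the later step of providing them to a core's local instruction buffer can always read from the cache.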

[0046] Once it is determined that the instruction cache unit holds the instruction corresponding to the prefetch instruction information, the fetch control unit uses the prefetch request to determine which processor cores need the prefetched instruction and the specific storage address of the prefetched instruction in the local instruction buffer of each such core. It then directs the instruction cache unit to provide the stored instruction corresponding to the prefetch instruction information to the local instruction buffer of the corresponding processor core, for the microprocessing unit to call.

[0047] In practice, when the instruction cache unit provides a stored instruction to the local instruction buffer of a processor core, it must request a write operation from the write port of that local instruction buffer. If the request is granted, the instruction is written into the local instruction buffer of the processor core based on the prefetch request; if the request is not granted, the write operation is requested again in the next cycle, and so on until the request is granted and the instruction is written.
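The request-and-retry behaviour on the write port can be modelled as a loop over cycles. The sketch below is only an illustration under assumptions: the per-cycle grant signal is represented as a list of booleans, the local instruction buffer as a dictionary keyed by address, and the function name `write_with_retry` is hypothetical.

```python
def write_with_retry(port_grants, local_buffer, fill_addr, instructions):
    """Request the write port once per cycle until the request is granted.

    port_grants: iterable of booleans, one per cycle (True = request granted).
    Returns the cycle index in which the write finally happened.
    """
    for cycle, granted in enumerate(port_grants):
        if granted:
            # Granted: write the instructions starting at the fill address.
            for offset, instr in enumerate(instructions):
                local_buffer[fill_addr + offset] = instr
            return cycle
        # Not granted: fall through and request again in the next cycle.
    raise RuntimeError("write port was never granted")


local_buffer = {}
cycle = write_with_retry([False, False, True], local_buffer, 0x40, ["add", "mul"])
```

In this example the port is busy for two cycles, so the write lands in cycle 2 and the buffer then holds the two instructions at addresses 0x40 and 0x41.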

[0048] In addition, depending on the needs of the application, a prefetch request merging function can optionally be integrated in the fetch control unit. With this function, after receiving prefetch requests from different processor cores, the fetch control unit can use the prefetch instruction information and the core demand information to merge prefetch requests that have the same prefetch instruction information into a new prefetch request. After merging, the fetch control unit performs the corresponding instruction prefetch operation based on the merged prefetch request.

[0049] The prefetch request merging function is illustrated by the following example.

[0050] The prefetch request contains prefetch instruction information, core demand information, and other information. The core demand information can take the form of a core index array, each element of which corresponds to one processor core in the processor. For example, suppose the processor contains core A, core B, core C, and core D, and that core A and core B provide prefetch request 1 and prefetch request 2, respectively, to the instruction prefetching device. Prefetch request 1 and prefetch request 2 correspond to the same instruction, but prefetch request 1 prefetches the instruction to core A and core B, so the elements for core A and core B in its core index array are valid, while prefetch request 2 prefetches the instruction to core C, so the element for core C in its core index array is valid. The fetch control unit can merge these two prefetch requests into one new prefetch request: it only needs to set the elements for cores A, B, and C in the core index array to valid in order to prefetch the instruction to cores A, B, and C.
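The example with cores A to D can be reproduced with a short Python sketch. It assumes, purely for illustration, that a prefetch request is a pair of an instruction identifier and a core index array of booleans; the names `CORES`, `merge`, and `"instr_X"` are hypothetical.

```python
CORES = ["A", "B", "C", "D"]


def merge(requests):
    """Merge requests for the same instruction by OR-ing their core index arrays."""
    merged = {}
    for instr_id, core_flags in requests:
        current = merged.get(instr_id, [False] * len(CORES))
        merged[instr_id] = [a or b for a, b in zip(current, core_flags)]
    return merged


# Prefetch request 1 (from core A): deliver instr_X to cores A and B.
request1 = ("instr_X", [True, True, False, False])
# Prefetch request 2 (from core B): deliver instr_X to core C.
request2 = ("instr_X", [False, False, True, False])

merged = merge([request1, request2])
print(merged)  # one request for instr_X with A, B, C valid and D not valid
```

Because both requests name the same instruction, a single merged request remains, and the instruction needs to be looked up and delivered only once for all three cores.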

[0051] The merging function is provided in the instruction prefetching device because, while the processor is running, different processor cores run different programs, so the prefetch requests they issue also differ. However, different processor cores may need the same instruction. In that case, handling the prefetch requests separately would clearly reduce the efficiency of instruction prefetching. By merging prefetch requests that prefetch the same instruction, the processor cores that need the same instruction can load it simultaneously, which avoids loading the same instruction multiple times and improves the efficiency of instruction prefetching.

[0052] Second embodiment

[0053] Referring to FIG. 2, a second embodiment of the instruction prefetching device of the present invention is shown. Compared with the first embodiment, the structure of the second embodiment is refined further. In addition, depending on the application, the instruction prefetching device can optionally integrate the prefetch request merging function; this embodiment is described using a device that integrates the merging function as the example, but this should not limit the scope of the invention.

[0054] The instruction prefetching device comprises a prefetch request merging unit 201, an instruction engine 203, an instruction cache unit 205, a fetch buffer 207, a memory access buffer 209, a fill buffer 211, and a transfer unit 213. The instruction prefetching device works with off-chip main memory 217 and multiple processor cores 215 to provide an instruction prefetching service to the processor cores 215, wherein:

[0055] The prefetch request merging unit 201 is configured to receive prefetch requests provided by processor cores, merge prefetch requests that have the same fetch target, and provide the merged prefetch request to the fetch buffer 207;

[0056] The instruction engine 203 is configured to obtain a prefetch request from the fetch buffer 207 and, based on the prefetch request, to search the instruction cache unit 205 for the corresponding instruction; if the prefetch request hits, the prefetch request is written to the fill buffer 211; if the prefetch request misses and does not conflict with the prefetch requests already stored in the fill buffer 211, the missed prefetch request is provided to the memory access buffer 209;

[0057] The off-chip main memory 217 is configured to obtain a prefetch request from the memory access buffer 209 and, based on the prefetch request, to provide the corresponding instruction to the instruction cache unit 205 and to provide the prefetch request to the fill buffer 211;

[0058] The transfer unit 213 is configured to obtain a prefetch request from the fill buffer 211, obtain instructions from the instruction cache unit 205, and, based on the prefetch request, provide the instruction corresponding to the prefetch request to the processor core 215.

[0059] The fetch buffer 207, the memory access buffer 209, and the fill buffer 211 are mainly used to temporarily store prefetch requests. In particular, they can be implemented as FIFO buffers, which keeps the processing order of prefetch requests under effective control.

[0060] The prefetch request merging unit 201, instruction engine 203, fetch buffer 207, memory access buffer 209, fill buffer 211, and transfer unit 213 together correspond to the fetch control unit of the first embodiment.

[0061] Next, the operation of this embodiment is described. Aspects that work the same way as in the first embodiment are not repeated.

[0062] A processor core executes a program, forms a prefetch request based on a prefetch instruction in the currently executing program, and sends the prefetch request to the prefetch request merging unit of the instruction prefetching device. In particular, each processor core can issue only one prefetch request at a time; it can issue the next prefetch request only after the instructions for the previous prefetch request have been returned.

[0063] After obtaining a prefetch request, the prefetch request merging unit determines whether it can be merged with other prefetch requests. If so, it merges the prefetch requests that have the same fetch target and provides the merged prefetch request to the fetch buffer; if not, it provides the prefetch request directly to the fetch buffer.

[0064] The instruction engine then obtains a prefetch request from the fetch buffer and, based on it, searches the instruction cache unit for the corresponding instruction. If the prefetch request hits, it is written to the fill buffer. If the prefetch request misses and does not conflict with the prefetch requests already stored in the fill buffer, the missed prefetch request is provided to the memory access buffer. If the prefetch request conflicts with a prefetch request already stored in the fill buffer, the instruction engine waits until the stored prefetch requests in the fill buffer have been processed and then provides the missed prefetch request to the memory access buffer.
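The three outcomes handled by the instruction engine (hit, miss without conflict, miss with conflict) can be summarised in a small routing function. This Python sketch is a loose software model: the buffers are `collections.deque` FIFOs, requests are dictionaries with hypothetical keys, and, because this passage does not define what counts as a conflict, the conflict test is a caller-supplied predicate. The `same_fill_addr` predicate used in the example is an assumption chosen only to make the sketch concrete.

```python
from collections import deque


def route(request, instr_cache, fill_buffer, mem_access_buffer, conflicts):
    """Route one prefetch request; return 'fill', 'mem_access', or 'wait'."""
    if request["instr_id"] in instr_cache:
        fill_buffer.append(request)        # hit: go straight to the fill buffer
        return "fill"
    if any(conflicts(request, pending) for pending in fill_buffer):
        return "wait"                      # miss with conflict: retry after the fill buffer drains
    mem_access_buffer.append(request)      # miss without conflict: queue for off-chip main memory
    return "mem_access"


def same_fill_addr(a, b):
    # Assumed conflict rule (not from the source): both requests target the same fill address.
    return a["fill_addr"] == b["fill_addr"]


instr_cache = {"i0"}
fill_buffer, mem_access_buffer = deque(), deque()

r_hit = route({"instr_id": "i0", "fill_addr": 0x40}, instr_cache, fill_buffer, mem_access_buffer, same_fill_addr)
r_wait = route({"instr_id": "i1", "fill_addr": 0x40}, instr_cache, fill_buffer, mem_access_buffer, same_fill_addr)
r_miss = route({"instr_id": "i2", "fill_addr": 0x80}, instr_cache, fill_buffer, mem_access_buffer, same_fill_addr)
```

A request that returns `'wait'` is not queued anywhere in this model; the caller would present it again once the fill buffer has been processed.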

[0065] A hit means that the instruction cache unit already holds the corresponding instruction, so there is no need to obtain it from off-chip main memory. For a prefetch request that was placed in the memory access buffer because of a miss, off-chip main memory obtains the prefetch request from the memory access buffer and, based on it, provides the corresponding instruction to the instruction cache unit and at the same time provides the prefetch request to the fill buffer. As a result, the instruction cache unit holds the instruction corresponding to the prefetch request, and the fill buffer holds the prefetch request.

[0066] Next, the transfer unit obtains the prefetch request from the fill buffer and the corresponding instructions from the instruction cache unit and, based on the request, supplies those instructions to the local instruction buffer of the processor core, which completes the instruction prefetch operation.
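The hit, miss, and conflict handling of paragraphs [0064] and [0065] can be sketched in software as follows. This is only an illustrative model, not the patented circuit. The request fields `slot` and `target`, the dictionary-based cache, and in particular the definition of "conflict" (a missed request that would overwrite a cache slot still needed by a request waiting in the fill buffer) are assumptions, since the description does not define a conflict precisely.

```python
from collections import deque

def engine_step(request, icache, fill_buffer, mem_buffer):
    """Dispatch one prefetch request; returns "hit", "miss" or "wait".

    `icache` is a hypothetical dict mapping a cache slot to the
    instruction block it currently holds.
    """
    slot, target = request["slot"], request["target"]
    if icache.get(slot) == target:
        fill_buffer.append(request)      # hit: ready for the transfer unit
        return "hit"
    # Assumption: a conflict means a pending fill-buffer request still
    # needs the slot that this missed request would overwrite.
    if any(r["slot"] == slot for r in fill_buffer):
        return "wait"                    # retry after the pending one drains
    mem_buffer.append(request)           # miss: main memory services it
    return "miss"

def memory_step(icache, fill_buffer, mem_buffer):
    """Off-chip main memory services the oldest missed request."""
    request = mem_buffer.popleft()
    icache[request["slot"]] = request["target"]  # instructions reach the cache
    fill_buffer.append(request)                  # request moves to the fill buffer
    return request

icache = {0: "blk0"}
fill_buffer, mem_buffer = deque(), deque()
requests = [
    {"slot": 0, "target": "blk0"},  # already cached
    {"slot": 1, "target": "blk1"},  # not cached, slot not in use
    {"slot": 0, "target": "blk2"},  # would evict blk0 while it is pending
]
results = [engine_step(r, icache, fill_buffer, mem_buffer) for r in requests]
memory_step(icache, fill_buffer, mem_buffer)
```

After the run, the fill buffer holds the requests for `blk0` and `blk1`, in that order, which is the state from which the transfer unit of paragraph [0066] would start.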

[0067] As can be seen, the prefetch request merging unit lets the instruction prefetching device avoid loading the same instructions repeatedly, which improves the efficiency of instruction prefetching. The specific structure of the prefetch request merging unit is described next.

[0068] Referring to FIG. 3, a specific embodiment of the prefetch request merging unit is shown. The prefetch request merging unit includes a round-robin arbitration unit and an instruction comparison unit, wherein:

[0069] the round-robin arbitration unit is configured to receive the prefetch requests provided by the processor cores and to select one prefetch request as the master prefetch request, the other prefetch requests serving as slave prefetch requests;

[0070] the instruction comparison unit is configured to receive and compare the master prefetch request and the slave prefetch requests, to merge a master prefetch request and slave prefetch requests that contain the same instruction demand information into a new prefetch request, and to return to the processor cores a receipt response indicating that their prefetch requests have been handled.

[0071] In practice, each processor core of the processor connected to the instruction prefetching device is connected to the prefetch request merging unit through its own dedicated hard-wired link. Over these links, the round-robin arbitration unit receives the prefetch requests provided by the processor cores and, following a fair round-robin policy, selects one prefetch request as the master prefetch request for each merge, the other prefetch requests serving as slave prefetch requests.

[0072] Once the master and slave prefetch requests have been determined, the instruction comparison unit compares them. If the master prefetch request and a slave prefetch request contain the same instruction demand information, they are merged into a new prefetch request, and a receipt response is returned to the processor cores. The receipt response tells a processor core that the prefetch request it issued has been handled, so that the core can withdraw that request and the same request is not processed more than once. If the master prefetch request and a slave prefetch request do not contain the same instruction demand information, they are not merged and are passed directly to the subsequent unit.
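As a rough software analogue of one merge round, the sketch below picks a master in round-robin order, merges every pending request with the same instruction demand information, and removes the merged requests to model the receipt response. The `PrefetchRequest` class, the string encoding of the demand information, and the `merge_round` function are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass

@dataclass
class PrefetchRequest:
    core: int     # issuing processor core
    demand: str   # instruction demand information (hypothetical encoding)

def merge_round(pending, start, num_cores):
    """Run one merge round over `pending` (core id -> PrefetchRequest).

    The master is the first pending core at or after `start` in cyclic
    order. Returns (demand, merged core ids, start of the next round).
    """
    order = [(start + i) % num_cores for i in range(num_cores)]
    master_core = next(c for c in order if c in pending)
    master = pending[master_core]
    # Compare every request with the master; the master matches itself.
    merged = sorted(c for c, r in pending.items() if r.demand == master.demand)
    for c in merged:
        del pending[c]  # receipt response: the core withdraws its request
    return master.demand, merged, (master_core + 1) % num_cores

pending = {
    0: PrefetchRequest(0, "A"),
    1: PrefetchRequest(1, "B"),
    2: PrefetchRequest(2, "A"),
    3: PrefetchRequest(3, "B"),
}
first = merge_round(pending, start=0, num_cores=4)
second = merge_round(pending, start=first[2], num_cores=4)
```

Advancing `start` past the previous master is one simple way to keep the selection fair; the description only requires that the policy be a fair round robin.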

[0073] Depending on the delay of the circuit implementation, the prefetch request merge may be performed in a single cycle or over several cycles. Note that a merge that takes several cycles cannot be pipelined.

[0074] Depending on the application, the prefetch request merging unit may also adopt a multi-level cascade structure; FIG. 4 shows one such cascade structure. Depending on the specific application, the prefetch request merging unit contains two or more cascaded levels. FIG. 4 shows a two-level cascade as an example, which should not be taken as limiting the scope.

[0075] As shown in FIG. 4, different first-level prefetch request merging subunits receive the prefetch requests provided by different processor cores and merge them. The merged prefetch requests are then passed to the second-level prefetch request merging subunit, which merges them again. After these two levels of merging, the prefetch request merging unit supplies the merged prefetch request to the fetch buffer. The cascade structure is used because the size of the processor is not fixed. A large processor contains many processor cores, and with a single-level structure the prefetch requests would have to be compared too many times and the logic depth of the circuit would be too great, which limits the operating frequency of the chip. A multi-level cascade structure reduces the delay of the prefetch request comparison circuit, which optimizes the merge logic and raises the operating frequency of the circuit.
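The effect of the two-level cascade can be illustrated with a small functional sketch. It models only the result of merging, not the master/slave arbitration or the circuit timing. Representing a request as a `(target, cores)` pair and grouping cores four at a time are assumptions made for the example.

```python
def merge(requests):
    """Merge requests with the same target; a request is (target, cores)."""
    merged = {}
    for target, cores in requests:
        merged.setdefault(target, set()).update(cores)
    return [(target, frozenset(cores)) for target, cores in merged.items()]

def cascade(requests, group_size):
    """Two-level merge: per-group subunits first, then one final subunit."""
    first_level = []
    for i in range(0, len(requests), group_size):
        # Each first-level subunit only sees the requests of its own group.
        first_level.extend(merge(requests[i:i + group_size]))
    return merge(first_level)

# Eight cores, each issuing one request for block A, B or C.
targets = ["A", "B", "A", "A", "B", "A", "C", "A"]
requests = [(t, {core}) for core, t in enumerate(targets)]
flat = dict(merge(requests))
two_level = dict(cascade(requests, group_size=4))
```

Both paths produce the same merged requests, while each first-level subunit in the cascade compares only the requests from its own group of cores.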

[0076] The instruction prefetching device of the present invention transfers instructions from off-chip main memory to the processor cores in advance, which improves the processing efficiency of the processor. In particular, a single prefetch operation can supply instructions to multiple processor cores. This reduces the number of instruction prefetch operations and the number of instruction exchanges between the off-chip main memory or the instruction cache unit and the processor cores, which improves the efficiency of instruction management.

[0077] FIG. 5 shows the flow of one embodiment of the instruction prefetching method of the present invention.

[0078] As shown in FIG. 5, the flow of one embodiment of the instruction prefetching method of the present invention includes:

[0079] Step S502: obtain the prefetch request provided by a processor core.

[0080] In a specific embodiment, the prefetch request may be added to the program to be executed in the form of a prefetch request instruction. Prefetch request instructions in different branches or at different positions of the program correspond to prefetch operations for different instructions. When the program currently executed by the processor runs into a different branch, the processor core issues a different prefetch request according to the prefetch request instruction in that part of the program.

[0081] In particular, the processor core issues the prefetch request asynchronously. Issuing the request therefore does not interfere with the arithmetic operations of the micro-processing unit in the processor core, nor does it stop the pipeline from running. As can be seen, the instruction prefetching method of the present invention lets the program manage instruction prefetching itself, which improves the portability of instruction prefetching and greatly broadens its scope of application.

[0082] Depending on the embodiment, prefetch requests from different processor cores that need the same instructions may be merged. By merging prefetch requests for the same instructions, the processor cores that need those instructions can be loaded with them at the same time, which avoids loading the same instructions repeatedly and improves the efficiency of instruction prefetching.

[0083] Step S504: based on the prefetch request, obtain the instructions corresponding to the prefetch request and store them in the on-chip instruction cache of the processor.

[0084] To implement instruction prefetching, the prefetch request should contain prefetch instruction information, core demand information, and an instruction fill address. The prefetch instruction information identifies the instructions to be prefetched. The core demand information specifies which processor cores the prefetched instructions are to be stored in. The instruction fill address is the address in the local instruction buffer of the processor core at which the prefetched instructions are stored.

[0085] In practice, the prefetch request instruction embedded in the program may use the following format: Preblk Ra Rb, where Preblk marks a prefetch request instruction, the parameter Ra carries the prefetch instruction information, and the parameter Rb carries the core demand information. The instruction fill address may be provided by the processor core according to the current storage state of the local instruction buffer.
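To make the three request fields concrete, the sketch below turns a textual `Preblk Ra Rb` instruction into a request record. The description does not specify how Ra and Rb are encoded, so the choices here are assumptions: Ra is taken to hold the start address of the instruction block, Rb is taken to be a bitmask with one bit per core, and the function `parse_preblk` and the register names are hypothetical.

```python
def parse_preblk(text, regs, fill_addr):
    """Build a prefetch request record from 'Preblk Ra Rb'.

    `regs` maps register names to values; `fill_addr` is chosen by the
    issuing core from the state of its local instruction buffer.
    """
    opcode, ra, rb = text.split()
    if opcode != "Preblk":
        raise ValueError("not a prefetch request instruction")
    mask = regs[rb]
    return {
        "instr_info": regs[ra],  # prefetch instruction information
        # Core demand information, decoded from the assumed bitmask.
        "cores": [i for i in range(mask.bit_length()) if (mask >> i) & 1],
        "fill_addr": fill_addr,  # instruction fill address
    }

regs = {"r1": 0x4000, "r2": 0b1010}
request = parse_preblk("Preblk r1 r2", regs, fill_addr=0x20)
```

With the mask `0b1010`, the request names cores 1 and 3 as the destinations of the prefetched instructions.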

[0086] Based on the prefetch instruction information in the prefetch request, the on-chip instruction cache of the processor is searched for the instructions corresponding to that information. The on-chip instruction cache does not include the local instruction buffers in the processor cores; the instruction cache unit shown in FIG. 1 is an example. If the on-chip instruction cache already holds the corresponding instructions, the flow proceeds to the subsequent instruction write. If it does not, the instructions corresponding to the prefetch instruction information are fetched from off-chip main memory and stored in the on-chip instruction cache.

[0087] Communication between a processor core and the on-chip instruction cache is faster than communication between the off-chip main memory and the on-chip instruction cache. Moving instructions from off-chip main memory into the on-chip instruction cache therefore greatly increases the speed of instruction prefetching.

[0088] Step S506: supply the instructions in the on-chip instruction cache that correspond to the prefetch request to the processor cores that correspond to the prefetch request.

[0089] In a specific embodiment, the core demand information contains the identifier of at least one processor core. Based on the core demand information in the prefetch request, the prefetched instructions are supplied to the local instruction buffers of one or more processor cores. Preferably, the different processor cores correspond to the same instruction fill address, and the instructions corresponding to the prefetch instruction information are stored at that same instruction fill address in each of the at least one processor core.

[0090] The prefetch instruction information may contain the identifiers of one or more instructions to be prefetched. Based on the prefetch instruction information, the one or more corresponding instructions are supplied to the processor cores indicated by the core demand information. In particular, different processor cores may prefetch the same instructions or different instructions.
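Step S506 with a shared fill address can be pictured as follows. This is a minimal sketch under stated assumptions: each local instruction buffer is modeled as a dictionary from address to instruction word, consecutive instructions occupy consecutive addresses, and the field names `instr_ids`, `cores`, and `fill_addr` are hypothetical.

```python
def deliver(request, icache, local_buffers):
    """Copy the prefetched instructions into every requesting core.

    All cores named in the request receive the instructions at the same
    instruction fill address, as in the preferred embodiment.
    """
    words = [icache[i] for i in request["instr_ids"]]
    for core in request["cores"]:
        for offset, word in enumerate(words):
            local_buffers[core][request["fill_addr"] + offset] = word

icache = {"i0": 0x11, "i1": 0x22}          # on-chip instruction cache
local_buffers = [dict() for _ in range(4)]  # one buffer per core
request = {"instr_ids": ["i0", "i1"], "cores": [1, 3], "fill_addr": 8}
deliver(request, icache, local_buffers)
```

A single request fills cores 1 and 3 identically and leaves cores 0 and 2 untouched, which is the saving that merging requests for the same instructions is meant to provide.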

[0091] In practice, a write operation must be requested on the write port of the local instruction buffer that stores instructions in the processor core. If the request is granted, the instructions are written into the local instruction buffer of the processor core based on the prefetch request. If the request is not granted, the write request is repeated in the following cycle until it is granted, which completes the writing of the instructions.
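The request-and-retry behavior on the write port can be sketched as a simple loop over cycles. The list `port_free_by_cycle`, which states for each simulated cycle whether the write port would be granted, and the function name `write_with_retry` are assumptions for illustration; real hardware would retry indefinitely rather than stop at the end of a list.

```python
def write_with_retry(port_free_by_cycle, buffer, addr, word):
    """Request the write port every cycle until the request is granted.

    Returns the cycle in which the write happened, or None if the port
    was never granted within the simulated window.
    """
    for cycle, free in enumerate(port_free_by_cycle):
        if free:
            buffer[addr] = word  # grant: perform the write
            return cycle
        # Not granted: fall through and request again next cycle.
    return None

local_buffer = {}
granted_at = write_with_retry([False, False, True], local_buffer, 0x20, 0x11)
```

Here the port is busy for two cycles, so the write completes in cycle 2.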

[0092] The instruction prefetching method and device of the present invention implement instruction prefetching for a multi-core processor in a comparatively simple way, simplify the management logic of hardware instruction storage, and improve the processing efficiency of the processor. The prefetch request is added to the program to be executed in the form of an instruction, so that the program manages instruction prefetching; this improves the portability of instruction prefetching and greatly broadens its scope of application. In addition, depending on the instructions to be prefetched, the prefetch requests provided by different processor cores can be merged before the instruction prefetch operation is performed, so that the processor cores that need the same instructions are loaded at the same time, which further improves the efficiency of instruction prefetching.

[0093] It should be understood that the examples and embodiments herein are merely illustrative, and that those skilled in the art may make various modifications and corrections without departing from the spirit and scope of the invention as defined by this application and the appended claims.

Claims (7)

1. An instruction prefetching device for providing an instruction prefetch service to processor cores, characterized by comprising a fetch control unit and an instruction cache unit, wherein:
the fetch control unit is configured to receive the prefetch requests provided by the processor cores and, based on a prefetch request, to search the instruction cache unit for the instructions corresponding to the prefetch request or to instruct the instruction cache unit to obtain the instructions corresponding to the prefetch request from off-chip main memory; to instruct the instruction cache unit, based on the prefetch request, to supply the instructions corresponding to the prefetch request to the processor cores; and to merge prefetch requests that are provided by different processor cores and prefetch the same instructions, and to perform the instruction prefetch operation based on the merged prefetch request;
the instruction cache unit is configured to store instructions and, in response to the instructions of the fetch control unit, to obtain the instructions corresponding to the prefetch request from off-chip main memory and to supply the instructions corresponding to the prefetch request to the processor cores;
the fetch control unit comprises a prefetch request merging unit, an instruction engine, a fetch buffer, a memory-access buffer, a fill buffer, and a transfer unit, wherein:
the prefetch request merging unit is configured to receive the prefetch requests provided by the processor cores, to merge prefetch requests that have the same fetch target, and to supply the merged prefetch request to the fetch buffer;
the instruction engine is configured to obtain the merged prefetch request from the fetch buffer and, based on it, to search the instruction cache unit for the instructions corresponding to the merged prefetch request; if the merged prefetch request hits, to write the merged prefetch request to the fill buffer; and if the merged prefetch request misses and does not conflict with a merged prefetch request already stored in the fill buffer, to supply the missed merged prefetch request to the memory-access buffer;
the fetch buffer, the memory-access buffer, and the fill buffer are configured to temporarily hold merged prefetch requests;
the transfer unit is configured to obtain the merged prefetch request from the fill buffer, to obtain instructions from the instruction cache unit, and, based on the merged prefetch request, to supply the instructions corresponding to the merged prefetch request to the processor cores;
the off-chip main memory obtains the merged prefetch request from the memory-access buffer and, based on it, supplies the instructions corresponding to the merged prefetch request to the instruction cache unit and supplies the merged prefetch request to the fill buffer.
2. The instruction prefetching device according to claim 1, characterized in that the prefetch request merging unit comprises a round-robin arbitration unit and an instruction comparison unit, wherein the round-robin arbitration unit is configured to receive the prefetch requests provided by the processor cores and to select one prefetch request as the master prefetch request, the other prefetch requests serving as slave prefetch requests; and the instruction comparison unit is configured to receive and compare the master prefetch request and the slave prefetch requests, to merge a master prefetch request and slave prefetch requests that contain the same instruction demand information into a new prefetch request, and to return to the processor cores a receipt response indicating that their prefetch requests have been handled.
3. The instruction prefetching device according to claim 2, characterized in that the prefetch request merging unit adopts a cascade structure.
4. The instruction prefetching device according to claim 1, characterized in that the instruction prefetching device is integrated in a multi-core processor and is configured to prefetch instructions and supply them to a plurality of processor cores of the multi-core processor.
5. The instruction prefetching device according to claim 4, characterized in that the prefetch request contains prefetch instruction information, core demand information, and an instruction fill address.
6. The instruction prefetching device according to claim 5, characterized in that the core demand information contains the identifier of at least one processor core, and the instruction cache unit stores the instructions corresponding to the prefetch instruction information at the same instruction fill address in each of the at least one processor core.
7. The instruction prefetching device according to claim 5, characterized in that the prefetch instruction information contains the identifiers of one or more instructions to be prefetched, and the instruction cache unit, according to the prefetch instruction information, supplies the one or more instructions corresponding to the prefetch instruction information to the processor cores indicated by the core demand information.
CN 201010508876 2010-10-12 2010-10-12 Instruction prefetching method and device CN102446087B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201010508876 CN102446087B (en) 2010-10-12 2010-10-12 Instruction prefetching method and device

Publications (2)

Publication Number Publication Date
CN102446087A CN102446087A (en) 2012-05-09
CN102446087B true CN102446087B (en) 2014-02-26

Family

ID=46008609

Country Status (1)

Country Link
CN (1) CN102446087B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105701416B (en) * 2016-01-11 2019-04-05 华为技术有限公司 Forced access control method, device and physical host
CN109219805A (en) * 2017-05-08 2019-01-15 华为技术有限公司 A kind of multiple nucleus system memory pool access method, relevant apparatus, system and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1818855A (en) 2005-02-09 2006-08-16 国际商业机器公司 Method and apparatus for performing data prefetch in a multiprocessor system
CN101855614A (en) 2007-07-18 2010-10-06 先进微装置公司 Multiple-core processor with hierarchical microcode store
EP1442374B1 (en) 2001-10-22 2011-07-27 Oracle America, Inc. Multi-core multi-thread processor

Legal Events

Date Code Title Description
C06 Publication
C10 Request of examination as to substance
C14 Granted