CN102446087A - Instruction prefetching method and device - Google Patents


Info

Publication number
CN102446087A
CN102446087A CN2010105088769A CN201010508876A CN102446087A CN 102446087 A CN102446087 A CN 102446087A CN 2010105088769 A CN2010105088769 A CN 2010105088769A CN 201010508876 A CN201010508876 A CN 201010508876A CN 102446087 A CN102446087 A CN 102446087A
Authority
CN
China
Prior art keywords
instruction
prefetch request
prefetch
buffer memory
request
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2010105088769A
Other languages
Chinese (zh)
Other versions
CN102446087B (en)
Inventor
李宏亮
谢向辉
任秀江
郑方
吕晖
钱磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Jiangnan Computing Technology Institute
Original Assignee
Wuxi Jiangnan Computing Technology Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxi Jiangnan Computing Technology Institute
Priority to CN201010508876.9A
Publication of CN102446087A
Application granted
Publication of CN102446087B
Legal status: Active
Anticipated expiration

Landscapes

  • Advance Control (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention relates to an instruction prefetching method and device. The instruction prefetching device is used for providing an instruction prefetching service for a processor core and comprises an instruction prefetching control unit and an instruction caching unit, wherein the instruction prefetching control unit is used for receiving a prefetching request provided by the processor core, searching an instruction corresponding to the prefetching request in the instruction caching unit on the basis of the prefetching request or indicating that the instruction caching unit acquires the instruction corresponding to the prefetching request from an off-chip main memory and indicating that the instruction caching unit provides the instruction corresponding to the prefetching request for the processor core on the basis of the prefetching request; and the instruction caching unit is used for storing the instruction, responding to the instruction of the instruction prefetching control unit, acquiring the instruction corresponding to the prefetching request from the off-chip main memory and providing the instruction corresponding to the prefetching request for the processor core. According to the instruction prefetching method and device disclosed by the invention, instruction prefetching of a multi-core processor is realized in a simpler mode, management logic of hardware instruction storage is simplified and process efficiency of the processor is improved.

Description

Instruction prefetch method and prefetching device
Technical field
The present invention relates to the field of computer technology and, more specifically, to an instruction prefetch method and an instruction prefetch device for a multi-core processor.
Background technology
A multi-core processor integrates a plurality of processor cores and related functional components on a single processor chip, forming a processor structure that contains multiple processor cores. Compared with earlier single-core processors, the data processing capability of a multi-core processor is greatly improved because multiple processor cores are integrated.
While a multi-core processor executes a program, the limited capacity of each processor core's local instruction cache cannot hold a large-scale program, so the instructions to be used must continually be loaded from the off-chip main memory into the local instruction cache. However, the access speed of the off-chip main memory can hardly keep up with the processing speed of the processor cores, and in actual operation multiple processor cores usually contend for access to the off-chip main memory. This makes instruction fetching even more difficult, so that the data processing capability of the multi-core processor cannot be fully exploited.
To address the problem that instruction fetching prevents the processing capability of a processor from being fully exploited, Chinese patent application No. 01816274.6 provides a method and apparatus that uses an auxiliary processor to prefetch instructions for a main processor unit: an auxiliary processor external to the main processor unit prefetches instructions so that the processing capability of the processor can be fully exploited. However, that method provides the instruction prefetch service for only one processor and is not suitable for the structure of a multi-core processor. Moreover, the auxiliary processor used for prefetching instructions needs to execute a simplified version of the program, so the hardware overhead is relatively large.
Summary of the invention
The problem addressed by the present invention is to provide an instruction prefetch method and an instruction prefetch device that simplify the management logic of hardware instruction storage and improve the processing efficiency of the processor.
To solve the above problem, the present invention provides an instruction prefetch device for providing an instruction prefetch service to a processor core, comprising an instruction prefetch control unit and an instruction cache unit, wherein:
The instruction prefetch control unit is configured to receive a prefetch request provided by the processor core; based on the prefetch request, to search the instruction cache unit for the instruction corresponding to the prefetch request, or to direct the instruction cache unit to obtain that instruction from the off-chip main memory; and, based on the prefetch request, to direct the instruction cache unit to provide the instruction corresponding to the prefetch request to the processor core;
The instruction cache unit is configured to store instructions and, in response to directions from the instruction prefetch control unit, to obtain the instruction corresponding to the prefetch request from the off-chip main memory and to provide the instruction corresponding to the prefetch request to the processor core.
Optionally, the instruction prefetch control unit is further configured to merge prefetch requests provided by different processor cores and to carry out the instruction prefetch operation based on the merged prefetch request.
Optionally, the instruction prefetch control unit comprises a prefetch request merge unit, an instruction engine, a fetch buffer, a memory access buffer, a fill buffer and a transmission unit, wherein:
The prefetch request merge unit is configured to receive the prefetch requests provided by the processor cores, merge prefetch requests that have the same fetch target, and provide the merged prefetch request to the fetch buffer;
The instruction engine is configured to obtain a prefetch request from the fetch buffer and, based on the prefetch request, search the instruction cache unit for the corresponding instruction; if the prefetch request hits, the prefetch request is written to the fill buffer; if the prefetch request misses and does not conflict with any prefetch request already stored in the fill buffer, the missed prefetch request is provided to the memory access buffer;
The fetch buffer, the memory access buffer and the fill buffer are configured to temporarily store prefetch requests;
The transmission unit is configured to obtain a prefetch request from the fill buffer, obtain the instruction from the instruction cache unit, and, based on the prefetch request, provide the instruction corresponding to the prefetch request to the processor core;
The off-chip main memory obtains a prefetch request from the memory access buffer and, based on the prefetch request, provides the corresponding instruction to the instruction cache unit and provides the prefetch request to the fill buffer.
Correspondingly, the present invention also provides an instruction prefetch method, comprising:
obtaining a prefetch request provided by a processor core;
based on the prefetch request, obtaining the instruction corresponding to the prefetch request and storing it in the on-chip instruction cache of the processor;
providing the instruction in the on-chip instruction cache that corresponds to the prefetch request to the processor core corresponding to the prefetch request.
Compared with the prior art, the present invention has the following advantages:
1. Instruction prefetching for a multi-core processor is realized in a comparatively simple way, without an auxiliary processor executing the program, which reduces hardware overhead, simplifies the management logic of hardware instruction storage, and improves the processing efficiency of the processor;
2. Prefetch requests provided by different processor cores can be merged before the instruction prefetch operation is carried out, so that instructions are loaded simultaneously into the processor cores that need the same instructions, which further improves the efficiency of instruction prefetching.
Description of drawings
Fig. 1 shows a first embodiment of the instruction prefetch device of the present invention;
Fig. 2 shows a second embodiment of the instruction prefetch device of the present invention;
Fig. 3 shows one embodiment of the prefetch request merge unit in the instruction prefetch device of the present invention;
Fig. 4 shows a cascade structure of the prefetch request merge unit;
Fig. 5 shows the flow of one embodiment of the instruction prefetch method of the present invention.
Embodiment
To make the above objects, features and advantages of the present invention easier to understand, specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Many specific details are set forth in the following description to provide a thorough understanding of the present invention. However, the present invention can also be implemented in ways other than those described here, so it is not limited by the specific embodiments disclosed below.
As stated in the background section, the prior-art method of using an auxiliary processor to prefetch instructions for a main processor unit provides the instruction prefetch service for only one processor and is not suitable for the structure of a multi-core processor. In addition, the auxiliary processor used for prefetching instructions needs to execute a simplified version of the program, so the hardware overhead is relatively large.
In view of the above problems, the inventors provide an instruction prefetch method and an instruction prefetch device for a processor that simplify the management logic of hardware instruction storage and effectively improve the processing efficiency of the processor. Depending on the specific embodiment, the instruction prefetch method and device of the present invention can be used for single-core processors as well as multi-core processors. For ease of explanation, the following embodiments all take a multi-core processor as an example, which should not limit the scope of the invention.
First embodiment
Fig. 1 shows first embodiment of instruction prefetch device of the present invention.
As shown in Fig. 1, the instruction prefetch device comprises an instruction prefetch control unit 101 and an instruction cache unit 103. The instruction prefetch device works together with an off-chip main memory 105 and a plurality of processor cores 107 to provide an instruction prefetch service to the processor cores 107, wherein:
The instruction prefetch control unit 101 is configured to receive a prefetch request provided by a processor core 107; based on the prefetch request, to search the instruction cache unit 103 for the instruction corresponding to the prefetch request, or to direct the instruction cache unit 103 to obtain that instruction from the off-chip main memory 105; and, based on the prefetch request, to direct the instruction cache unit 103 to provide the instruction corresponding to the prefetch request to the local instruction cache 111 of the processor core 107.
The instruction cache unit 103 is configured to store instructions and, in response to directions from the instruction prefetch control unit 101, to obtain the instruction corresponding to the prefetch request from the off-chip main memory 105 and to provide the instruction corresponding to the prefetch request to the processor core 107.
Each processor core 107 includes a microprocessing unit 109 and a local instruction cache 111. The local instruction cache 111 stores the instructions provided by the instruction cache unit 103 and supplies them to the corresponding microprocessing unit 109; the microprocessing unit 109 fetches the instructions stored in the local instruction cache 111 and executes them.
In a specific embodiment, the instruction prefetch device is integrated in the processor and prefetches instructions for one or more processor cores of that processor.
To realize instruction prefetching, the prefetch request should include prefetched instruction information, core demand information and an instruction fill address. The prefetched instruction information identifies the instruction to be prefetched; the core demand information specifies which processor core or cores the prefetched instruction is to be stored in; and the instruction fill address specifies the address in the local instruction cache of the processor core at which the prefetched instruction is to be stored. Preferably, different processor cores correspond to the same instruction fill address, so that the instruction corresponding to the prefetched instruction information is stored at the same fill address in each of the at least one processor core.
The prefetch request described above is intended for a multi-core processor. Depending on the specific embodiment, the processor may contain only one processor core; in that case, sending a prefetch request and receiving the prefetched instruction involve only that one core, so the prefetch request need not include core demand information.
In a specific embodiment, the prefetched instruction information may comprise identifiers of one or more instructions to be prefetched; according to the prefetched instruction information, the instruction cache unit provides the corresponding one or more instructions to the processor core or cores indicated by the core demand information. In particular, different processor cores may prefetch the same instruction or different instructions.
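The three kinds of information carried by a prefetch request can be pictured as a small record. The following Python sketch is only an illustration: the field names (`instruction_ids`, `target_cores`, `fill_address`) and the helper `make_request` are hypothetical and do not come from the patent text, which names the information but not its encoding.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple


@dataclass(frozen=True)
class PrefetchRequest:
    # All field names are hypothetical; the text only names the information.
    instruction_ids: Tuple[int, ...]  # prefetched instruction information
    target_cores: FrozenSet[int]      # core demand information (empty for a single-core processor)
    fill_address: int                 # instruction fill address in the local instruction cache


def make_request(instruction_ids: Iterable[int],
                 target_cores: Iterable[int],
                 fill_address: int) -> PrefetchRequest:
    """Build a request from plain Python collections."""
    return PrefetchRequest(tuple(instruction_ids), frozenset(target_cores), fill_address)


# A request that prefetches two instructions into cores 0 and 1 at the same fill address.
req = make_request([0x40, 0x41], {0, 1}, 0x100)
```

Using one `fill_address` for all target cores mirrors the preferred arrangement in which different cores share the same instruction fill address.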
Next, the operation of the first embodiment of the instruction prefetch device is described. In this embodiment, the processor in which the instruction prefetch device is integrated includes a plurality of processor cores, which work together to complete processing tasks such as data computation.
When the processor carries out a processing task, the instructions executed by a processor core are usually obtained directly from that core's local instruction cache. Because the storage capacity of the local instruction cache is limited, the processor core sends a prefetch request to the instruction prefetch control unit in advance, before the instructions in the local instruction cache have all been processed, that is, before the currently executing program finishes. In a specific embodiment, the prefetch request may take the form of a prefetch request instruction added to the program to be executed. When the currently executing program reaches different branches, the processor core can issue different prefetch requests based on the different prefetch request instructions in the program.
A prefetch request sent by a given processor core may obtain instructions for that core alone or may prefetch instructions for other processor cores at the same time. Instructions prefetched together for multiple processor cores can be used to synchronize the execution of subsequent instructions.
After obtaining a prefetch request, the instruction prefetch control unit searches the instruction cache unit for the instruction corresponding to the prefetched instruction information. If the instruction cache unit does not hold the corresponding instruction, the instruction prefetch control unit directs the instruction cache unit, based on the prefetch request, to obtain the instruction from the off-chip main memory and store it. The instructions stored in the off-chip main memory are prepared by software compilation rather than dynamically by hardware, which significantly reduces hardware overhead.
Once the instruction cache unit is confirmed to hold the instruction corresponding to the prefetched instruction information, the instruction prefetch control unit determines, based on the prefetch request, which processor core needs the prefetched instruction and the specific storage address of the prefetched instruction in that core's local instruction cache. It then directs the instruction cache unit to provide the stored instruction to the local instruction cache of the corresponding processor core, where the microprocessing unit can fetch it.
In practice, when the instruction cache unit provides a stored instruction to the local instruction cache of a processor core, it must request a write operation from the write port of that local instruction cache. If the request is granted, the instruction is written into the local instruction cache of the processor core based on the prefetch request; if the request is not granted, the write operation is requested again in the following cycle, and so on until the request is granted and the instruction is written.
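The write-port handshake described above amounts to retrying the write request every cycle until it is granted. The sketch below simulates that behaviour under stated assumptions: the function `fill_local_cache`, the `port_free` callback and the `max_cycles` safeguard are all hypothetical and exist only so the simulation terminates; the patent text does not specify how the grant is signalled.

```python
def fill_local_cache(local_cache, fill_address, instructions, port_free, max_cycles=1000):
    """Request the write port once per cycle; write when the request is granted.

    local_cache  -- dict modelling a core's local instruction cache (address -> instruction)
    port_free    -- hypothetical callback: port_free(cycle) is True when the request is granted
    Returns the cycle number on which the write took place.
    """
    for cycle in range(max_cycles):
        if port_free(cycle):
            # Request granted: write the instructions starting at the fill address.
            for offset, instruction in enumerate(instructions):
                local_cache[fill_address + offset] = instruction
            return cycle
        # Request not granted: ask again in the next cycle.
    raise TimeoutError("write port was never granted")


# The port is busy during cycles 0 and 1 and becomes free in cycle 2.
cache = {}
granted_cycle = fill_local_cache(cache, 0x10, ["add", "sub"], lambda cycle: cycle >= 2)
```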
In addition, depending on the needs of the application, a prefetch request merging function may optionally be integrated in the instruction prefetch control unit. With this function, after receiving prefetch requests from different processor cores, the instruction prefetch control unit can use the prefetched instruction information and the core demand information to merge prefetch requests whose prefetched instruction information is identical into a new prefetch request. After merging, the instruction prefetch control unit carries out the corresponding instruction prefetch operation based on the merged prefetch request.
The prefetch request merging function is illustrated by the following example.
The prefetch request includes prefetched instruction information and core demand information. The core demand information may take the form of a core number array in which each element corresponds to one processor core of the processor. For example, suppose the processor includes core A, core B, core C and core D, and core A and core B send prefetch request 1 and prefetch request 2, respectively, to the instruction prefetch device. Prefetch request 1 and prefetch request 2 correspond to the same instruction, but prefetch request 1 prefetches the instruction to core A and core B, so the elements for core A and core B in its core number array are valid, whereas prefetch request 2 prefetches the instruction to core C, so the element for core C in its core number array is valid. Given prefetch request 1 and prefetch request 2, the instruction prefetch control unit can merge them into one new prefetch request: it only needs to set the elements for cores A, B and C in the core number array to valid in order to prefetch the instruction to cores A, B and C.
The prefetch request merging function is provided in the instruction prefetch device because, while the processor is running, different processor cores run different programs and therefore send different prefetch requests, yet different processor cores may still need the same instruction. If such prefetch requests were handled separately, the efficiency of instruction prefetching would drop. By merging the prefetch requests that prefetch the same instruction, the processor cores that need the same instruction can load it simultaneously, which avoids loading the same instruction repeatedly and improves the efficiency of instruction prefetching.
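The core A to core D example can be reproduced with a few lines of Python. This is a minimal sketch, assuming the core number array is a list of booleans ordered A, B, C, D and that "the same instruction" is represented by an equal `instruction_id`; both the class and the `merge` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PrefetchRequest:
    instruction_id: int      # stands in for the prefetched instruction information
    core_valid: List[bool]   # core number array, one element per core: A, B, C, D


def merge(first: PrefetchRequest, second: PrefetchRequest) -> Optional[PrefetchRequest]:
    """Merge two requests for the same instruction; return None if they differ."""
    if first.instruction_id != second.instruction_id:
        return None
    # An element of the merged array is valid if it is valid in either request.
    combined = [a or b for a, b in zip(first.core_valid, second.core_valid)]
    return PrefetchRequest(first.instruction_id, combined)


# Request 1 targets cores A and B; request 2 targets core C; both want instruction 7.
request_1 = PrefetchRequest(7, [True, True, False, False])
request_2 = PrefetchRequest(7, [False, False, True, False])
merged = merge(request_1, request_2)
```

After the merge, a single prefetch operation serves cores A, B and C, while core D is left untouched.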
Second embodiment
Fig. 2 shows a second embodiment of the instruction prefetch device of the present invention. Compared with the first embodiment, the structure of the second embodiment is described in further detail. In addition, depending on the application, the instruction prefetch device may optionally integrate the prefetch request merging function; this embodiment is described with the merging function integrated, which should not limit the scope of the invention.
The instruction prefetch device comprises a prefetch request merge unit 201, an instruction engine 203, an instruction cache unit 205, a fetch buffer 207, a memory access buffer 209, a fill buffer 211 and a transmission unit 213. The instruction prefetch device works together with an off-chip main memory 217 and a plurality of processor cores 215 to provide an instruction prefetch service to the processor cores 215, wherein:
The prefetch request merge unit 201 is configured to receive the prefetch requests provided by the processor cores, merge prefetch requests that have the same fetch target, and provide the merged prefetch request to the fetch buffer 207;
The instruction engine 203 is configured to obtain a prefetch request from the fetch buffer 207 and, based on the prefetch request, search the instruction cache unit 205 for the corresponding instruction; if the prefetch request hits, the prefetch request is written to the fill buffer 211; if the prefetch request misses and does not conflict with any prefetch request already stored in the fill buffer 211, the missed prefetch request is provided to the memory access buffer 209;
The off-chip main memory 217 obtains a prefetch request from the memory access buffer 209 and, based on the prefetch request, provides the corresponding instruction to the instruction cache unit 205 and provides the prefetch request to the fill buffer 211;
The transmission unit 213 is configured to obtain a prefetch request from the fill buffer 211, obtain the instruction from the instruction cache unit 205, and, based on the prefetch request, provide the instruction corresponding to the prefetch request to the processor core 215.
The fetch buffer 207, the memory access buffer 209 and the fill buffer 211 are mainly used to temporarily store prefetch requests. In particular, they may use a first-in, first-out (FIFO) buffer structure, which effectively controls the order in which prefetch requests are processed.
The prefetch request merge unit 201, the instruction engine 203, the fetch buffer 207, the memory access buffer 209, the fill buffer 211 and the transmission unit 213 together correspond to the instruction prefetch control unit of the first embodiment.
Next, the operation of this embodiment is described. Aspects that work the same way as in the first embodiment are not repeated.
A processor core executes a program, forms a prefetch request based on a prefetch instruction in the currently executing program, and sends the prefetch request to the prefetch request merge unit of the instruction prefetch device. In particular, each processor core can send only one prefetch request at a time; only after the instruction corresponding to the previous prefetch request has been returned can the processor core send the next prefetch request.
After obtaining a prefetch request, the prefetch request merge unit determines whether it can be merged with other prefetch requests. If so, the prefetch requests with the same fetch target are merged and the merged prefetch request is provided to the fetch buffer; if not, the prefetch request is provided directly to the fetch buffer.
The instruction engine then obtains a prefetch request from the fetch buffer and, based on the prefetch request, searches the instruction cache unit for the corresponding instruction. If the prefetch request hits, it is written to the fill buffer. If the prefetch request misses and does not conflict with any prefetch request already stored in the fill buffer, the missed prefetch request is provided to the memory access buffer. If the prefetch request conflicts with a prefetch request already stored in the fill buffer, the instruction engine waits until the stored prefetch request has been processed and then provides the missed prefetch request to the memory access buffer.
A hit means that the instruction cache unit already holds the corresponding instruction, so the instruction need not be obtained from the off-chip main memory. For a missed prefetch request temporarily stored in the memory access buffer, the off-chip main memory obtains the prefetch request from the memory access buffer and, based on the prefetch request, provides the corresponding instruction to the instruction cache unit while providing the prefetch request to the fill buffer. In this way, the instruction cache unit now holds the instruction corresponding to the prefetch request, and the fill buffer has received the prefetch request.
The transmission unit then obtains the prefetch request from the fill buffer and the corresponding instruction from the instruction cache unit and, based on the prefetch request, provides the instruction to the local instruction cache of the processor core, completing the instruction prefetch operation.
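The hit, miss and conflict paths through the three FIFO buffers can be traced with a small software model. The sketch below is an illustration only, not the hardware design: requests are plain dictionaries with hypothetical keys, and, because the text does not define what a conflict is, the `conflicts` function simply assumes that two requests conflict when they map to the same instruction cache slot.

```python
from collections import deque


def conflicts(request, pending):
    # Assumption: a conflict means both requests map to the same instruction cache slot.
    return request["slot"] == pending["slot"]


def engine_step(fetch_buf, mem_buf, fill_buf, icache):
    """Instruction engine: route the request at the head of the fetch buffer."""
    if not fetch_buf:
        return "idle"
    request = fetch_buf[0]
    if request["instr"] in icache:
        fill_buf.append(fetch_buf.popleft())   # hit: go straight to the fill buffer
        return "hit"
    if any(conflicts(request, pending) for pending in fill_buf):
        return "stall"                         # miss with conflict: wait and retry later
    mem_buf.append(fetch_buf.popleft())        # miss without conflict: go to memory access
    return "miss"


def memory_step(mem_buf, fill_buf, icache, main_memory):
    """Off-chip main memory: load the instruction, then pass the request on."""
    if mem_buf:
        request = mem_buf.popleft()
        icache[request["instr"]] = main_memory[request["instr"]]
        fill_buf.append(request)


def transmit_step(fill_buf, icache, local_caches):
    """Transmission unit: copy the instruction into every target core."""
    if fill_buf:
        request = fill_buf.popleft()
        for core in request["cores"]:
            local_caches[core][request["fill_addr"]] = icache[request["instr"]]


main_memory = {"i1": "add r1, r2", "i2": "mul r3, r4"}
icache = {"i1": "add r1, r2"}                  # i1 is already cached, i2 is not
local_caches = {0: {}, 1: {}}
fetch_buf = deque([
    {"instr": "i1", "slot": 0, "cores": [0], "fill_addr": 8},
    {"instr": "i2", "slot": 1, "cores": [0, 1], "fill_addr": 16},
])
mem_buf, fill_buf = deque(), deque()

results = [engine_step(fetch_buf, mem_buf, fill_buf, icache) for _ in range(2)]
memory_step(mem_buf, fill_buf, icache, main_memory)
while fill_buf:
    transmit_step(fill_buf, icache, local_caches)
```

In this run the first request hits and the second misses, is loaded from the modelled main memory, and is then delivered to both target cores.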
As can be seen, the prefetch request merge unit allows the instruction prefetch device to avoid loading the same instruction repeatedly, improving the efficiency of instruction prefetching. The structure of the prefetch request merge unit is described next.
Fig. 3 shows one embodiment of the prefetch request merge unit. The prefetch request merge unit comprises a round-robin arbitration unit and an instruction comparison unit, wherein:
The round-robin arbitration unit is configured to receive the prefetch requests provided by the processor cores and to select one prefetch request as the master prefetch request, with the other prefetch requests serving as slave prefetch requests;
The instruction comparison unit is configured to receive and compare the master prefetch request and the slave prefetch requests, to merge the master prefetch request and the slave prefetch requests that contain the same instruction demand information into a new prefetch request, and at the same time to return to the processor cores a reception response indicating that the prefetch request has been handled.
In practice, each processor core of the processor connected to the instruction prefetch device is connected to the prefetch request merge unit through its own dedicated hardware line. Over these lines, the round-robin arbitration unit receives the prefetch requests provided by the processor cores and, using a fair round-robin policy, selects one prefetch request as the master prefetch request for each merge, with the other prefetch requests serving as slave prefetch requests.
After the master and slave prefetch requests have been determined, the instruction comparison unit compares the master prefetch request with the slave prefetch requests. If the master prefetch request and a slave prefetch request contain the same instruction demand information, they are merged into a new prefetch request, and a reception response indicating that the prefetch request has been handled is returned to the processor core. The reception response tells the processor core that the prefetch request it sent has been handled, so that the handled prefetch request can be cancelled and the same prefetch request is not handled more than once. If the master prefetch request and the slave prefetch requests do not contain the same instruction demand information, they are not merged, and the master prefetch request is provided directly to the subsequent unit.
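One merge round of the arbitration and comparison units can be sketched as follows. This is a simplified software model under stated assumptions: `merge_round`, its arguments and its return values are hypothetical names, the pending requests are held in a dictionary keyed by core index, and the list of acknowledged cores stands in for the reception responses.

```python
def merge_round(pending, last_master, num_cores):
    """Run one merge round.

    pending     -- dict: core index -> (instruction id, set of target cores)
    last_master -- core index chosen as master in the previous round
    Returns (merged request or None, master core index, acknowledged cores).
    """
    # Round-robin arbitration: start searching just after the previous master.
    master = None
    for step in range(1, num_cores + 1):
        candidate = (last_master + step) % num_cores
        if candidate in pending:
            master = candidate
            break
    if master is None:
        return None, last_master, []

    instruction, targets = pending[master]
    targets = set(targets)
    acknowledged = [master]
    # Instruction comparison: fold in every slave request for the same instruction.
    for core in sorted(pending):
        if core != master and pending[core][0] == instruction:
            targets |= pending[core][1]
            acknowledged.append(core)
    # The reception response lets each acknowledged core withdraw its request.
    for core in acknowledged:
        del pending[core]
    return (instruction, targets), master, acknowledged


pending = {0: (7, {0}), 1: (9, {1}), 2: (7, {2})}
first = merge_round(pending, last_master=3, num_cores=4)
second = merge_round(pending, last_master=first[1], num_cores=4)
```

In the first round core 0 becomes master and absorbs the matching request from core 2; in the second round the arbitration moves on to core 1, whose request passes through unmerged.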
The prefetch merge processing may be carried out in one cycle or in multiple cycles, depending on the delay of the circuit implementation. It should be noted that when the merge processing takes multiple cycles, the merging process is not carried out in a pipelined manner.
Depending on the application, the prefetch request merging unit may also adopt a multi-stage cascade structure; Fig. 4 shows one such cascade structure. Depending on the specific application, the prefetch request merging unit comprises a cascade of two or more stages; as an example, Fig. 4 shows a two-stage cascade, which should not be taken as limiting the scope.
As shown in Fig. 4, different first-stage prefetch request merging subunits each receive the prefetch requests provided by different processor cores and merge them accordingly. The merged prefetch requests are then passed to the second-stage prefetch request merging subunit, which merges them again. After these two stages of merging, the prefetch request merging unit provides the merged prefetch requests to the fetch buffer. The cascaded prefetch request merging unit is provided because the number of processor cores is not fixed. A larger processor contains more processor cores; with a single-stage structure, the prefetch requests would have to be compared too many times and the logic depth of the circuit would be too great, which limits the operating frequency of the chip. A multi-stage cascade structure therefore reduces the delay of the prefetch request comparison circuit, thereby optimizing the merging logic for prefetch requests and raising the operating frequency of the circuit.
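A rough way to see why cascading helps is to model each merging subunit in software and count how many requests the widest subunit has to compare. The sketch below is an assumption-laden illustration: requests are modeled as `(instr_info, cores)` pairs, the function names are hypothetical, and `widest_compare` is a crude proxy for comparator width rather than a timing model of any real circuit.

```python
def merge_stage(requests):
    """Merge requests that name the same instructions.

    Each request is an (instr_info, cores) pair, where cores is a
    frozenset of core IDs.
    """
    merged = {}
    for instr_info, cores in requests:
        merged[instr_info] = merged.get(instr_info, frozenset()) | cores
    return list(merged.items())


def two_stage_merge(core_groups):
    """First-stage subunits merge within a group; the second stage merges across groups."""
    first_stage_out = []
    for group in core_groups:
        first_stage_out.extend(merge_stage(group))
    return merge_stage(first_stage_out)


def widest_compare(num_cores, group_size=None):
    """Largest number of requests any single merging subunit must compare.

    Assumes every first-stage subunit forwards one merged request per round.
    """
    if group_size is None:                      # single-stage structure
        return num_cores
    num_groups = -(-num_cores // group_size)    # ceiling division
    return max(group_size, num_groups)
```

Under these assumptions, a single-stage unit for 64 cores compares 64 requests at once, while a two-stage cascade with groups of 8 never compares more than 8 in any one subunit.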
The instruction prefetch device of the present invention transfers instructions in advance from off-chip main memory to the processor cores, which improves the processing efficiency of the processor. In particular, a single prefetch operation can provide instructions to multiple processor cores; this reduces the number of instruction prefetch operations, reduces the number of instruction transfers between the off-chip main memory or the instruction cache unit and the processor cores, and improves the efficiency of instruction management.
Fig. 5 shows the flow of an embodiment of the instruction prefetch method of the present invention.
As shown in Fig. 5, the flow of an embodiment of the instruction prefetch method of the present invention comprises:
Step S502 is performed: obtain the prefetch request provided by a processor core.
In a specific embodiment, the prefetch request may take the form of a prefetch request instruction that is added in advance to the program to be executed. Prefetch request instructions in different branches or at different positions of the program correspond to prefetch operations for different instructions. When the program currently executed by the processor runs into a given branch, the processor core provides the corresponding prefetch request based on the prefetch request instruction found there.
In particular, the processor core provides prefetch requests asynchronously. Providing a prefetch request therefore does not affect the arithmetic operations carried out by the micro-processing unit in the processor core, nor does it keep its pipeline from continuing to run. As can be seen, the instruction prefetch method of the present invention lets the program itself manage instruction prefetching, which improves the portability of instruction prefetching and greatly broadens its range of application.
Depending on the embodiment, prefetch requests provided by different processor cores that require the same instructions may be merged. By merging prefetch requests for the same instructions, the different processor cores that need those instructions can be loaded with them at the same time, which avoids loading the same instructions repeatedly and improves the efficiency of instruction prefetching.
Step S504 is performed: based on the prefetch request, obtain the instruction corresponding to the prefetch request and store it in the on-chip instruction cache of the processor.
To implement instruction prefetching, the prefetch request should contain prefetched instruction information, core demand information, and an instruction fill address. The prefetched instruction information identifies the instructions to be prefetched; the core demand information specifies which processor cores the prefetched instructions are to be stored in; and the instruction fill address gives the specific address in the local instruction buffer of the processor core at which the prefetched instructions are to be stored.
In practice, the prefetch request instruction embedded in the program may use the following instruction format: Preblk Ra Rb, where Preblk identifies the prefetch request instruction and Ra and Rb are its two parameters. Ra represents the prefetched instruction information and Rb represents the core demand information. The instruction fill address may be supplied by the processor core based on the current storage state of its local instruction buffer.
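To make the three fields of a prefetch request concrete, the following Python sketch models what a core might assemble when it executes `Preblk Ra Rb`. The text does not specify how Ra and Rb are encoded, so the encoding here is purely an assumption for illustration: Ra is treated as an identifier of the instructions to prefetch, and Rb as a bitmask with one bit per core. The names `issue_preblk` and `target_cores` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PrefetchRequest:
    instr_info: int   # from Ra: which instructions to prefetch (assumed encoding)
    core_mask: int    # from Rb: bit c set means core c needs them (assumed encoding)
    fill_addr: int    # address in the local instruction buffer, chosen by the core


def issue_preblk(ra, rb, free_fill_addr):
    """Build the request a core would send when it executes `Preblk Ra Rb`.

    free_fill_addr stands in for the fill address the core derives from
    the current state of its local instruction buffer.
    """
    return PrefetchRequest(instr_info=ra, core_mask=rb, fill_addr=free_fill_addr)


def target_cores(request, num_cores):
    """List the cores named by the core demand information."""
    return [c for c in range(num_cores) if (request.core_mask >> c) & 1]
```

For example, `issue_preblk(0x4000, 0b0101, 0x80)` would describe a request whose instructions are destined for cores 0 and 2, to be written at local address `0x80`.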
Based on the prefetched instruction information in the prefetch request, the instruction corresponding to that information is looked up in the on-chip instruction cache of the processor (the on-chip instruction cache does not include the local instruction buffers inside the processor cores; it is, for example, the instruction cache unit shown in Fig. 1). If the corresponding instruction is already stored in the on-chip instruction cache, the subsequent instruction write proceeds; if it is not, the instruction corresponding to the prefetched instruction information is obtained from off-chip main memory and stored in the on-chip instruction cache.
Communication between the off-chip main memory and the on-chip instruction cache is relatively slow, whereas communication between the processor cores and the on-chip instruction cache is faster. Transferring instructions from off-chip main memory into the on-chip instruction cache therefore greatly increases the speed of instruction prefetching.
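The hit and miss handling of step S504 can be sketched as follows. This is a minimal software model under stated assumptions: off-chip main memory is stood in for by a Python dictionary, the cache has unlimited capacity with no replacement policy, and the class and method names are hypothetical.

```python
class OnChipInstructionCache:
    """Toy model of the shared on-chip instruction cache."""

    def __init__(self, main_memory):
        # main_memory: dict mapping instr_info -> instruction block,
        # standing in for the slow off-chip main memory.
        self.main_memory = main_memory
        self.lines = {}
        self.off_chip_fetches = 0

    def ensure_present(self, instr_info):
        """Return the block for instr_info, loading it from main memory on a miss."""
        if instr_info not in self.lines:              # miss
            self.lines[instr_info] = self.main_memory[instr_info]
            self.off_chip_fetches += 1
        return self.lines[instr_info]                 # hit, or just filled
```

In this model a second request for the same instructions is served from the on-chip cache, so `off_chip_fetches` stays at one however many cores later ask for that block.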
Step S506 is performed: provide the instruction in the on-chip instruction cache that corresponds to the prefetch request to the processor core corresponding to the prefetch request.
In a specific embodiment, the core demand information contains the identifier of at least one processor core; based on the core demand information in the prefetch request, the prefetched instructions are provided to the local instruction buffers of one or more processor cores. Preferably, different processor cores correspond to the same instruction fill address, and the instruction corresponding to the prefetched instruction information is stored at that same instruction fill address in each of the at least one processor core.
The prefetched instruction information may contain the identifiers of one or more instructions to be prefetched; based on the prefetched instruction information, the one or more corresponding instructions are provided to the processor cores indicated by the core demand information. In particular, different processor cores may prefetch the same instructions or different instructions.
In practice, a write operation must be requested from the write port of the local instruction buffer that stores instructions in the processor core. If the write request is granted, the instruction is written into the local instruction buffer of the processor core based on the prefetch request; if the request is not granted, the write operation is requested again in the next cycle, until the request is granted and the instruction write completes.
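The request-and-retry behaviour at the write port can be modeled with a short loop. The sketch below is illustrative only: the local instruction buffer is a dictionary keyed by fill address, the per-cycle grant signal is supplied as a sequence of booleans, and the function name `write_with_retry` is hypothetical.

```python
def write_with_retry(local_buffer, fill_addr, block, grants_by_cycle):
    """Request the write port once per cycle until the request is granted.

    grants_by_cycle: iterable of booleans, one per cycle; True means the
    write request was granted in that cycle.
    Returns the cycle in which the write happened, or None if the port
    was never granted within the cycles supplied.
    """
    for cycle, granted in enumerate(grants_by_cycle):
        if granted:
            local_buffer[fill_addr] = block   # write the prefetched instructions
            return cycle
        # Not granted: fall through and request again in the next cycle.
    return None
```

With grants of `[False, False, True]`, the write lands in cycle 2; the core's own use of the port in cycles 0 and 1 is not disturbed.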
The instruction prefetch method and prefetch device of the present invention implement instruction prefetching for a multi-core processor in a comparatively simple way, simplify the management logic of hardware instruction storage, and improve the processing efficiency of the processor. The prefetch request is added, in the form of an instruction, to the program to be executed, so that instruction prefetching is managed by the program; this improves the portability of instruction prefetching and greatly broadens its range of application. In addition, depending on which instructions need to be prefetched, the prefetch requests provided by different processor cores may be merged before the instruction prefetch operation is carried out, so that the instructions are loaded simultaneously into the processor cores that need the same instructions, which further improves the efficiency of instruction prefetching.
It should be understood that the examples and embodiments herein are merely exemplary; those skilled in the art may make various modifications and corrections without departing from the spirit and scope of the invention as defined by this application and the appended claims.

Claims (18)

1. An instruction prefetch device for providing an instruction prefetch service to a processor core, characterized in that
it comprises an instruction prefetch control unit and an instruction cache unit, wherein:
the instruction prefetch control unit is configured to receive a prefetch request provided by the processor core; based on the prefetch request, to look up the instruction corresponding to the prefetch request in the instruction cache unit, or to direct the instruction cache unit to obtain the instruction corresponding to the prefetch request from off-chip main memory; and, based on the prefetch request, to direct the instruction cache unit to provide the instruction corresponding to the prefetch request to the processor core;
the instruction cache unit is configured to store instructions and, in response to the directions of the instruction prefetch control unit, to obtain the instruction corresponding to the prefetch request from off-chip main memory and to provide the instruction corresponding to the prefetch request to the processor core.
2. The instruction prefetch device as claimed in claim 1, characterized in that the instruction prefetch control unit comprises an instruction engine, a fetch buffer, a memory access buffer, a fill buffer, and a sending unit, wherein:
the fetch buffer is configured to receive and temporarily store the prefetch requests provided by the processor core;
the instruction engine is configured to obtain a prefetch request from the fetch buffer and, based on the prefetch request, to look up the instruction corresponding to the prefetch request in the instruction cache unit; if the prefetch request hits, to write the prefetch request into the fill buffer; and if the prefetch request misses and does not conflict with any prefetch request already stored in the fill buffer, to provide the missed prefetch request to the memory access buffer;
the memory access buffer and the fill buffer are configured to temporarily store prefetch requests;
the sending unit is configured to obtain a prefetch request from the fill buffer, to obtain the instruction from the instruction cache unit, and, based on the prefetch request, to provide the instruction corresponding to the prefetch request to the processor core;
the off-chip main memory obtains the prefetch request from the memory access buffer and, based on the prefetch request, provides the instruction corresponding to the prefetch request to the instruction cache unit and provides the prefetch request to the fill buffer.
3. The instruction prefetch device as claimed in claim 1, characterized in that the instruction prefetch control unit is further configured to merge the prefetch requests provided by different processor cores and to carry out the instruction prefetch operation based on the merged prefetch request.
4. The instruction prefetch device as claimed in claim 3, characterized in that the instruction prefetch control unit comprises a prefetch request merging unit, an instruction engine, a fetch buffer, a memory access buffer, a fill buffer, and a sending unit, wherein:
the prefetch request merging unit is configured to receive the prefetch requests provided by the processor cores, to merge prefetch requests that have the same fetch target, and to provide the merged prefetch request to the fetch buffer;
the instruction engine is configured to obtain a prefetch request from the fetch buffer and, based on the prefetch request, to look up the instruction corresponding to the prefetch request in the instruction cache unit; if the prefetch request hits, to write the prefetch request into the fill buffer; and if the prefetch request misses and does not conflict with any prefetch request already stored in the fill buffer, to provide the missed prefetch request to the memory access buffer;
the fetch buffer, the memory access buffer, and the fill buffer are configured to temporarily store prefetch requests;
the sending unit is configured to obtain a prefetch request from the fill buffer, to obtain the instruction from the instruction cache unit, and, based on the prefetch request, to provide the instruction corresponding to the prefetch request to the processor core;
the off-chip main memory obtains the prefetch request from the memory access buffer and, based on the prefetch request, provides the instruction corresponding to the prefetch request to the instruction cache unit and provides the prefetch request to the fill buffer.
5. The instruction prefetch device as claimed in claim 4, characterized in that the prefetch request merging unit comprises a round-robin arbitration unit and an instruction comparison unit, wherein:
the round-robin arbitration unit is configured to receive the prefetch requests provided by the processor cores and to select one prefetch request as the master prefetch request, the other prefetch requests serving as slave prefetch requests;
the instruction comparison unit is configured to receive and compare the master prefetch request and the slave prefetch requests, to merge a master prefetch request and slave prefetch requests that contain the same instruction demand information into a new prefetch request, and at the same time to return to the processor core a receive response indicating that the prefetch request has been handled.
6. The instruction prefetch device as claimed in claim 5, characterized in that the prefetch request merging unit adopts a cascade structure.
7. The instruction prefetch device as claimed in claim 1, characterized in that the instruction prefetch device is integrated in a multi-core processor and is used to prefetch instructions and provide them to a plurality of processor cores of the multi-core processor.
8. The instruction prefetch device as claimed in claim 7, characterized in that the prefetch request contains prefetched instruction information, core demand information, and an instruction fill address.
9. The instruction prefetch device as claimed in claim 8, characterized in that the core demand information contains the identifier of at least one processor core, and the instruction cache unit stores the instruction corresponding to the prefetched instruction information at the same instruction fill address in each of the at least one processor core.
10. The instruction prefetch device as claimed in claim 8, characterized in that the prefetched instruction information contains the identifiers of one or more instructions to be prefetched; based on the prefetched instruction information, the instruction cache unit provides the one or more instructions corresponding to the prefetched instruction information to the processor cores indicated by the core demand information.
11. An instruction prefetch method, characterized in that it comprises:
obtaining a prefetch request provided by a processor core;
based on the prefetch request, obtaining the instruction corresponding to the prefetch request and storing it in the on-chip instruction cache of the processor;
providing the instruction in the on-chip instruction cache that corresponds to the prefetch request to the processor core corresponding to the prefetch request.
12. The instruction prefetch method as claimed in claim 11, characterized in that obtaining, based on the prefetch request, the instruction corresponding to the prefetch request and storing it in the on-chip instruction cache of the processor comprises: looking up the instruction corresponding to the prefetch request in the on-chip instruction cache of the processor and, if the instruction corresponding to the prefetch request is not stored in the on-chip instruction cache, obtaining the instruction corresponding to the prefetch request from off-chip main memory and storing it in the on-chip instruction cache.
13. The instruction prefetch method as claimed in claim 11, characterized in that the prefetch request is added, in the form of a prefetch request instruction, to the executable program currently being executed, and the processor core provides different prefetch requests based on the different prefetch request instructions in that program.
14. The instruction prefetch method as claimed in claim 11, characterized in that the prefetch request contains prefetched instruction information, core demand information, and an instruction fill address, and providing the instruction in the on-chip instruction cache that corresponds to the prefetch request to the processor core corresponding to the prefetch request further comprises: the core demand information contains the identifier of at least one processor core, and, based on the core demand information in the prefetch request, the prefetched instructions are provided to the local instruction buffers of one or more processor cores.
15. The instruction prefetch method as claimed in claim 14, characterized in that it further comprises: different processor cores correspond to the same instruction fill address, and the instruction corresponding to the prefetched instruction information is stored at that same instruction fill address in each of the at least one processor core.
16. The instruction prefetch method as claimed in claim 14, characterized in that it further comprises: the prefetched instruction information contains the identifiers of one or more instructions to be prefetched, and, based on the prefetched instruction information, the one or more instructions corresponding to the prefetched instruction information are provided to the processor cores indicated by the core demand information.
17. The instruction prefetch method as claimed in claim 11, characterized in that providing the instruction in the on-chip instruction cache that corresponds to the prefetch request to the processor core corresponding to the prefetch request further comprises: based on the prefetch request, requesting a write operation from the local instruction buffer in the processor core; if the request is granted, writing the instruction corresponding to the prefetch request into the local instruction buffer; and if the request is not granted, requesting the write operation again in the next cycle.
18. The instruction prefetch method as claimed in claim 11, characterized in that, after obtaining the prefetch request provided by the processor core, it further comprises: merging the prefetch requests provided by different processor cores that require the same instructions.
CN201010508876.9A 2010-10-12 2010-10-12 Instruction prefetching method and device Active CN102446087B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010508876.9A CN102446087B (en) 2010-10-12 2010-10-12 Instruction prefetching method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010508876.9A CN102446087B (en) 2010-10-12 2010-10-12 Instruction prefetching method and device

Publications (2)

Publication Number Publication Date
CN102446087A true CN102446087A (en) 2012-05-09
CN102446087B CN102446087B (en) 2014-02-26

Family

ID=46008609

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010508876.9A Active CN102446087B (en) 2010-10-12 2010-10-12 Instruction prefetching method and device

Country Status (1)

Country Link
CN (1) CN102446087B (en)

Also Published As

Publication number Publication date
CN102446087B (en) 2014-02-26

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant