CN1698031A - Method of prefetching data/instructions related to externally triggered events - Google Patents
Method of prefetching data/instructions related to externally triggered events
- Publication number
- CN1698031A CN1698031A CNA038012367A CN03801236A CN1698031A CN 1698031 A CN1698031 A CN 1698031A CN A038012367 A CNA038012367 A CN A038012367A CN 03801236 A CN03801236 A CN 03801236A CN 1698031 A CN1698031 A CN 1698031A
- Authority
- CN
- China
- Prior art keywords
- data
- processor
- commands
- scheduler
- cache memory
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3802—Instruction prefetching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3824—Operand accessing
- G06F9/383—Operand prefetching
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
- Advance Control (AREA)
- Multi Processors (AREA)
Abstract
Method of prefetching data/instructions related to externally triggered events in a system including an infrastructure (18) having an input interface (20) for receiving data/instructions to be handled by the infrastructure and an output interface (22) for transmitting data after they have been handled, a memory (14) for storing data/instructions when they are received by the input interface, a processor (10) for processing at least some data/instructions, the processor having a cache wherein the data/instructions are stored before being processed, and an external source (26) for assigning sequential tasks to the processor. The method comprises the following steps, which are performed while the processor is performing a previous task: determining the location in the memory of the data/instructions to be processed by the processor, indicating to the cache the addresses of these memory locations, fetching the contents of the memory locations and writing them into the cache, and assigning the task of processing the data/instructions to the processor.
Description
Technical field
The present invention relates generally to systems in which a processor can be interrupted to handle data/instructions unrelated to its previous task, and relates in particular to a method of prefetching the data/instructions associated with events triggered by an external source, such as a scheduler in a network processor.
Background art
The efficiency of modern microprocessor and microcontroller cores depends heavily on the efficiency of the cache, because the instruction cycle time is much shorter than the memory access time. A cache exploits the locality of memory accesses, that is, the fact that a memory access is likely to fall near a previous access.
A cache includes a mechanism, the cache controller, for loading new content into selected regions (cache lines); to make room for this, the cache evicts old entries. Caches whose controller supports prefetch instructions can be driven by software (for example, the Data Cache Block Touch instruction available on all PowerPC-compliant devices). There have also been proposals for cache controllers that recognize regular access patterns, such as linear strides or linked data structures. Unfortunately, existing approaches do not cover externally triggered events, where the memory contents needed have no relation to the preceding processing. In such cases, the only entity with knowledge of the required memory contents is the source of the event, such as an interrupt source, a scheduler, or another processor assigning a task.
In a system where an external source, such as the scheduler in a network processor, can interrupt the processor to handle data unrelated to the data previously processed, the processor generates a cache miss. This means that the processor stalls until the data it needs has been loaded from memory into the cache, which wastes considerable time. With current memory technology and a processor clock speed of 400 MHz, each cache miss costs about 36 processor clock cycles, which corresponds to about 40 instructions. And since current technology trends show processor instruction rates growing faster than memory latency improves, the number of instructions lost per cache miss keeps increasing.
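These figures are mutually consistent; as a quick check (the 90 ns memory latency and the issue rate of slightly more than one instruction per cycle are inferred from the numbers above, not stated in the text):

$$
t_{\text{miss}} = \frac{36\ \text{cycles}}{400\ \text{MHz}} = 90\ \text{ns},
\qquad
N_{\text{lost}} \approx 36\ \text{cycles} \times 1.1\ \tfrac{\text{instructions}}{\text{cycle}} \approx 40\ \text{instructions}.
$$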
Summary of the invention
It is therefore a principal object of the present invention to provide a method of prefetching the data/instructions associated with an external trigger, so as to avoid cache misses on data whose addresses can easily be determined.
The present invention therefore relates to a method of prefetching the data/instructions associated with an external trigger in a system comprising: an infrastructure having an input interface for receiving the data/instructions to be handled by the infrastructure and an output interface for transmitting the data after they have been processed; a memory for storing the data/instructions when they are received by the input interface; a processor for processing at least some of the data/instructions, the processor having a cache in which the data/instructions are stored before being processed; and an external source for assigning sequential tasks to the processor. The method comprises the following steps, which are performed while the processor is carrying out a previous task: determining the location in the memory of the data/instructions to be processed by the processor; indicating the addresses of these memory locations to the cache; fetching the contents of the memory locations and writing them into the cache; and assigning to the processor the task of processing the data/instructions.
Description of drawings
The above and other objects, features and advantages of the invention will be better understood by reading the following more detailed description in conjunction with the accompanying drawings, in which:

Fig. 1 is a block diagram of a network processing system in which the method according to the invention is implemented; and

Fig. 2 is a flowchart representing the steps of the method according to the invention.
Detailed description
Such a system comprises a processor core 10, for example a PowerPC processor core equipped with a data/instruction cache. The system is built around a high-performance bus 12, such as a processor local bus (PLB), which provides the connection to an external memory 14 (for example an SDRAM) containing the data and instructions, through the intermediary of a memory controller 16. The memory controller 16 makes the bus structure independent of the particular memory, for example by generating the necessary timing and refresh signals.
Packets received on the input interface 20 of the infrastructure 18 fall into two categories. Some packets do not need to be processed and are sent out directly on the network through the output interface 22. Other packets need to be processed by the processor 10. A lookup and classification unit 24 determines whether a packet needs processing and what kind of processing is to be performed. To process a data packet, the processor 10 needs several pieces of information. In particular, it needs access to the packet header and to additional information generated within the infrastructure 18. For example, the infrastructure may have several ports on which packets can arrive, and the processor needs to know on which port the packet arrived.
In any case, the scheduler 26 knows the next task that the processor 10 will handle. The selected task determines which data will be accessed. In the case of the network processor described here, the relation between a task (an entry in a queue) and the addresses accessed first, namely the packet header and the additional information, is very simple. The translation from a queue entry to a set of addresses is performed by an address calculation.
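The patent gives no code, but this translation is simple enough to sketch. The following minimal C illustration assumes fixed-size packet buffers and a parallel array of per-packet information records; every name, base address and size in it (packet_index, PKT_BUF_BASE, and so on) is invented for the example.

```c
#include <stdint.h>

/* Hypothetical queue entry produced by the lookup and classification unit. */
typedef struct {
    uint32_t packet_index;   /* slot of the packet in the buffer pool   */
    uint32_t priority;       /* priority assigned at classification     */
} queue_entry_t;

/* Assumed layout: fixed-size packet buffers plus a parallel array of
 * per-packet additional information (classifier result, input port...). */
#define PKT_BUF_BASE   0x10000000u
#define PKT_BUF_SIZE   0x0800u       /* 2 KiB per packet buffer        */
#define PKT_INFO_BASE  0x14000000u
#define PKT_INFO_SIZE  0x0040u       /* 64 B of additional information */

/* Translate a queue entry into the two addresses accessed first:
 * the packet header and the additional information (step 36 below). */
static void queue_entry_to_addresses(const queue_entry_t *e,
                                     uint32_t *header_addr,
                                     uint32_t *info_addr)
{
    *header_addr = PKT_BUF_BASE  + e->packet_index * PKT_BUF_SIZE;
    *info_addr   = PKT_INFO_BASE + e->packet_index * PKT_INFO_SIZE;
}
```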
When the processor 10 handles the data of a new packet and accesses, for example, the packet header, it normally produces a cache miss if the present invention is not used. The processor then stalls until the needed data has been loaded from the external memory 14 into the cache which, as mentioned above, wastes a great deal of time. The cache prefetch according to the invention avoids cache misses on data accesses that are certain to occur and whose addresses can easily be determined.
For the prefetch to work, the needed data must be loaded into the cache before the access takes place. This behavior is initiated by the scheduler, which uses a direct connection 28 to the cache. After determining the locations in memory where the packet header and the additional information are stored, the scheduler issues the addresses to be fetched into the cache, and the cache controller fetches the data from memory into the cache. Once this write has completed, the scheduler either interrupts the processor and assigns the new packet for processing, when the new task has a higher priority than the previous one, or waits for the previous task to finish before dispatching the new packet.
The method according to the invention is represented by the flowchart of Fig. 2. First, the infrastructure waits to receive new data, in this example a data packet (step 30). The packet header is used for classification, and the header, together with the additional information obtained from the lookup and classification processing, is stored in the external memory (step 32). The lookup and classification unit determines whether the packet needs to be processed by software and determines its priority (step 34). If the packet does not need to be processed, the processing loop returns to waiting for new data (step 30).
When a data packet needs to be processed, the scheduler calculates the addresses in memory of the data that the processor will access. In this example, these are the address of the packet header and the addresses of additional information such as the classifier result and the input port (step 36). These addresses are then sent to the data cache controller of the processor (step 38). The data cache controller writes the corresponding data into the data cache (step 40). This is done through memory accesses interleaved with those of the packet processing currently under way.
At this stage, the processing depends on whether the packet that has just arrived has a higher priority than the previous one (step 42). If so, the scheduler interrupts the previous task currently being executed by the processor (step 44) and assigns the new packet for processing; the processor begins processing and finds the relevant data already in the cache (step 46). If the new packet does not have a higher priority than the previous one, the processor must complete the previous processing (step 46) before processing the new packet (step 48).
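Putting the steps of Fig. 2 together, the scheduler-side flow can be sketched as below. This is an illustration only: every type and function (receive_packet, classify_and_store, prefetch_into_dcache and the rest) is an invented stand-in for the patent's functional blocks, not a real API, and compute_prefetch_addresses plays the role of the address calculation sketched earlier.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct packet packet_t;          /* opaque packet handle        */
typedef struct {
    bool     needs_sw_processing;        /* outcome of step 34          */
    uint32_t priority;
    uint32_t packet_index;
} class_result_t;

/* Invented prototypes for the hardware/firmware primitives. */
packet_t      *receive_packet(void);                            /* step 30     */
class_result_t classify_and_store(packet_t *p);                 /* steps 32-34 */
void           compute_prefetch_addresses(uint32_t idx,
                                          uint32_t *header,
                                          uint32_t *info);      /* step 36     */
void           prefetch_into_dcache(uint32_t addr);             /* steps 38-40 */
void           wait_for_prefetch_complete(void);                /* see below   */
uint32_t       current_task_priority(void);
void           interrupt_and_dispatch(packet_t *p);             /* steps 44-46 */
void           dispatch_after_current_task(packet_t *p);        /* steps 46-48 */

void scheduler_loop(void)
{
    for (;;) {
        packet_t *pkt = receive_packet();                       /* step 30     */
        class_result_t cls = classify_and_store(pkt);           /* steps 32-34 */
        if (!cls.needs_sw_processing)
            continue;                    /* forwarded directly; wait again     */

        uint32_t header_addr, info_addr;
        compute_prefetch_addresses(cls.packet_index,
                                   &header_addr, &info_addr);   /* step 36     */
        prefetch_into_dcache(header_addr);                      /* steps 38-40 */
        prefetch_into_dcache(info_addr);

        if (cls.priority > current_task_priority()) {           /* step 42     */
            wait_for_prefetch_complete();  /* must finish before interrupting  */
            interrupt_and_dispatch(pkt);                        /* steps 44-46 */
        } else {
            dispatch_after_current_task(pkt);                   /* steps 46-48 */
        }
    }
}
```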
Note that when the new packet has a higher priority, the scheduler must wait for the data cache fetch to complete before interrupting the processor. To this end, the scheduler can observe the activity on the bus and wait until all the accesses it issued have completed. Alternatively, the scheduler can wait for a fixed amount of time, or a direct feedback from the cache controller to the scheduler can be used.
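These three strategies can be sketched as alternative implementations of the wait_for_prefetch_complete() step used in the loop above; the bus monitor, status reads and cycle bound are all assumed names, not real hardware interfaces.

```c
#include <stdint.h>
#include <stdbool.h>

/* Invented status primitives. */
bool bus_observed_fill(void);           /* snoops the PLB for returning data  */
void delay_cycles(uint32_t n);          /* busy-wait for n processor cycles   */
bool cache_controller_done(void);       /* feedback over direct connection 28 */

#define PREFETCH_WORST_CASE_CYCLES 64u  /* illustrative upper bound           */

/* (a) Observe the bus until all issued accesses have completed. */
void wait_by_snooping(uint32_t pending_accesses)
{
    while (pending_accesses > 0)
        if (bus_observed_fill())
            pending_accesses--;
}

/* (b) Wait a fixed, conservatively chosen amount of time. */
void wait_fixed_time(void)
{
    delay_cycles(PREFETCH_WORST_CASE_CYCLES);
}

/* (c) Poll a completion flag raised by the cache controller. */
void wait_on_feedback(void)
{
    while (!cache_controller_done())
        ;                               /* spin until the fetch is complete   */
}
```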
It must also be noted that when, as described above, the processing of a first packet is interrupted in order to process a second, higher-priority packet, the data of the two packets should occupy disjoint parts of the cache. Otherwise, the prefetched data could be evicted before it is accessed. This can be achieved in the processor through the mapping from virtual addresses to physical addresses, since the cache is usually indexed with virtual addresses.
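As an illustration of the control the virtual mapping provides, here is a minimal C sketch of a disjointness check for a virtually indexed cache; the geometry (32-byte lines, 512 sets) is illustrative and not taken from the patent.

```c
#include <stdbool.h>
#include <stdint.h>

#define LINE_SIZE  32u                  /* bytes per cache line (assumed)   */
#define NUM_SETS   512u                 /* number of cache sets (assumed)   */

/* Set index of a virtual address in a virtually indexed cache. */
static uint32_t cache_set_of(uint32_t vaddr)
{
    return (vaddr / LINE_SIZE) % NUM_SETS;
}

/* True if two buffers of len bytes touch no common cache set, so that
 * prefetching one packet's data cannot evict the other's before use. */
static bool cache_disjoint(uint32_t va, uint32_t vb, uint32_t len)
{
    uint32_t lines = (len + LINE_SIZE - 1) / LINE_SIZE;
    for (uint32_t i = 0; i < lines; i++)
        for (uint32_t j = 0; j < lines; j++)
            if (cache_set_of(va + i * LINE_SIZE) ==
                cache_set_of(vb + j * LINE_SIZE))
                return false;
    return true;
}
```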
Although the method of the invention has been described in a network processor environment, it will be clear to those skilled in the art that the method can be used in any system in which accesses by the processor to certain data are certain to occur and the addresses of those data can easily be determined. In every such case, an external event is associated with data to be processed later. Suppose, for example, a robot that uses a camera for navigation, with new images arriving at regular intervals: the arrival of an image is the event, and the image data itself is the associated data to be prefetched.
It must be noted that with a standard microprocessor the address bus can be used as the external source, since it is already observed for cache coherency purposes. In this case, only one external pin is needed to signal a prefetch request.
Claims (8)
1. the method for the data/commands that is associated with external trigger of in a system, looking ahead, described system comprises: foundation structure (18), and it has the input interface (20) that is used to receive the data/commands that will be handled by described foundation structure, the output interface (22) that is used for sending them after data are processed; Storer (14) is used for storage data/commands when data/commands is received by described input interface; Processor (10) is used to handle at least some described data/commands, and described processor has cache memory, and wherein data/commands was stored before processed; External source (26) is used for to described processor distribution sequence task;
Described method is characterised in that, it comprises the following steps, these steps are performed when carrying out previous task when processor:
Determining will be by the position of data/commands in described storer of described processor processing;
Indicate the address of described memory location to described cache memory;
Obtain the content of described memory location and they are write in the described cache memory;
Task to described processor distribution processing said data/instruction.
2. The method of claim 1, wherein said processor (10) is a network processor and the data to be processed are in the headers of data packets received by said infrastructure (18).
3. The method of claim 2, wherein said external source is a scheduler (26) directly connected to the cache of said processor (10), said scheduler determining the location in said memory (14) of the data/instructions to be processed and directly indicating said addresses to said cache.
4. The method of claim 3, wherein said scheduler (26) determines the location of said data/instructions in said memory (14) by calculating said addresses.
5. The method of any one of claims 2-4, wherein the step of assigning the task of processing said prefetched data/instructions comprises interrupting the processing of a previous packet and beginning the processing of a new packet having a higher priority than said previous packet.
6. The method of any one of claims 3-5, wherein said cache is associated with a cache controller responsible for fetching the contents of the memory locations whose addresses are determined by said scheduler (26) and writing them into said cache.
7. The method of claim 5, wherein said processor (10) and said scheduler (26) use a processor local bus (PLB), said scheduler interrupting said processor after determining that the data cache fetch is complete, the completion of the data cache fetch being observed exactly by monitoring said bus as the data are returned from said memory (14).
8. A system comprising means adapted to carry out the steps of the method according to any one of claims 1 to 7.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP02368022 | 2002-03-05 | ||
EP02368022.6 | 2002-03-05 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1698031A true CN1698031A (en) | 2005-11-16 |
CN100345103C CN100345103C (en) | 2007-10-24 |
Family
ID=27771964
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNB038012367A Expired - Fee Related CN100345103C (en) | 2002-03-05 | 2003-02-27 | Method of prefetching data/instructions related to externally triggered events |
Country Status (8)
Country | Link |
---|---|
JP (1) | JP2005519389A (en) |
KR (1) | KR20040101231A (en) |
CN (1) | CN100345103C (en) |
AU (1) | AU2003221510A1 (en) |
BR (1) | BR0308268A (en) |
CA (1) | CA2478007A1 (en) |
MX (1) | MXPA04008502A (en) |
WO (1) | WO2003075154A2 (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4837247B2 (en) * | 2003-09-24 | 2011-12-14 | パナソニック株式会社 | Processor |
US8224937B2 (en) * | 2004-03-04 | 2012-07-17 | International Business Machines Corporation | Event ownership assigner with failover for multiple event server system |
JP2008523490A (en) | 2004-12-10 | 2008-07-03 | エヌエックスピー ビー ヴィ | Data processing system and method for cache replacement |
US7721071B2 (en) * | 2006-02-28 | 2010-05-18 | Mips Technologies, Inc. | System and method for propagating operand availability prediction bits with instructions through a pipeline in an out-of-order processor |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5619663A (en) * | 1994-09-16 | 1997-04-08 | Philips Electronics North America Corp. | Computer instruction prefetch system |
US5854911A (en) * | 1996-07-01 | 1998-12-29 | Sun Microsystems, Inc. | Data buffer prefetch apparatus and method |
US5761506A (en) * | 1996-09-20 | 1998-06-02 | Bay Networks, Inc. | Method and apparatus for handling cache misses in a computer system |
US6092149A (en) * | 1997-05-28 | 2000-07-18 | Western Digital Corporation | Disk drive cache system using a dynamic priority sequential stream of data segments continuously adapted according to prefetched sequential random, and repeating types of accesses |
US6625654B1 (en) * | 1999-12-28 | 2003-09-23 | Intel Corporation | Thread signaling in multi-threaded network processor |
-
2003
- 2003-02-27 AU AU2003221510A patent/AU2003221510A1/en not_active Abandoned
- 2003-02-27 MX MXPA04008502A patent/MXPA04008502A/en unknown
- 2003-02-27 JP JP2003573543A patent/JP2005519389A/en active Pending
- 2003-02-27 BR BR0308268-7A patent/BR0308268A/en not_active IP Right Cessation
- 2003-02-27 CN CNB038012367A patent/CN100345103C/en not_active Expired - Fee Related
- 2003-02-27 KR KR10-2004-7012736A patent/KR20040101231A/en not_active Application Discontinuation
- 2003-02-27 CA CA002478007A patent/CA2478007A1/en not_active Abandoned
- 2003-02-27 WO PCT/EP2003/002923 patent/WO2003075154A2/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
WO2003075154A3 (en) | 2004-09-02 |
BR0308268A (en) | 2005-01-04 |
CN100345103C (en) | 2007-10-24 |
JP2005519389A (en) | 2005-06-30 |
AU2003221510A1 (en) | 2003-09-16 |
AU2003221510A8 (en) | 2003-09-16 |
WO2003075154A2 (en) | 2003-09-12 |
KR20040101231A (en) | 2004-12-02 |
CA2478007A1 (en) | 2003-09-12 |
MXPA04008502A (en) | 2004-12-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6779084B2 (en) | Enqueue operations for multi-buffer packets | |
US20160062894A1 (en) | System and Method for Performing Message Driven Prefetching at the Network Interface | |
US8850125B2 (en) | System and method to provide non-coherent access to a coherent memory system | |
US7337275B2 (en) | Free list and ring data structure management | |
US9727469B2 (en) | Performance-driven cache line memory access | |
US7073030B2 (en) | Method and apparatus providing non level one information caching using prefetch to increase a hit ratio | |
US7234004B2 (en) | Method, apparatus and program product for low latency I/O adapter queuing in a computer system | |
US20040024971A1 (en) | Method and apparatus for write cache flush and fill mechanisms | |
US8352712B2 (en) | Method and system for specualtively sending processor-issued store operations to a store queue with full signal asserted | |
CN1382276A (en) | Prioritized bus request scheduling mechanism for processing devices | |
US8578069B2 (en) | Prefetching for a shared direct memory access (DMA) engine | |
US7370152B2 (en) | Memory controller with prefetching capability | |
CN101013402A (en) | Methods and systems for processing multiple translation cache misses | |
US6567901B1 (en) | Read around speculative load | |
US8095617B2 (en) | Caching data in a cluster computing system which avoids false-sharing conflicts | |
US20040059854A1 (en) | Dynamic priority external transaction system | |
US7334056B2 (en) | Scalable architecture for context execution | |
CN100345103C (en) | Method of prefetching data/instructions related to externally triggered events | |
JP2009521054A (en) | Dynamic cache management apparatus and method | |
US20050044321A1 (en) | Method and system for multiprocess cache management | |
US8719542B2 (en) | Data transfer apparatus, data transfer method and processor | |
CN101059787A (en) | Entity area description element pre-access method of direct EMS memory for process unit access | |
US20240330213A1 (en) | Variable buffer size descriptor fetching for a multi-queue direct memory access system | |
US12099456B2 (en) | Command processing circuitry maintaining a linked list defining entries for one or more command queues and executing synchronization commands at the queue head of the one or more command queues in list order based on completion criteria of the synchronization command at the head of a given command queue | |
CN118276990A (en) | Multithreading task management system and method, network processor and chip thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C17 | Cessation of patent right | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20071024 |