CN102073596B - Method for managing reconfigurable on-chip unified memory aiming at instructions - Google Patents

Method for managing reconfigurable on-chip unified memory aiming at instructions


Publication number
CN102073596B
Authority
CN
China
Prior art keywords
spm
cache
program
memory
instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2011100073102A
Other languages
Chinese (zh)
Other versions
CN102073596A (en
Inventor
凌明
王欢
梅晨
翟婷婷
张阳
武建平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN2011100073102A
Publication of CN102073596A
Application granted
Publication of CN102073596B
Legal status: Expired - Fee Related

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention discloses a method for managing a reconfigurable on-chip unified memory for instructions by means of a virtual memory mechanism. With this method, the parameters of the Cache part and the SPM (Scratch-Pad Memory) part of the reconfigurable unified memory can be adjusted dynamically while the program runs, to match the memory architecture requirements of different phases of execution. The method analyzes memory access behavior in different phases of program execution, obtains phase-change behavior graphs of the instruction part and abstracts them mathematically, and uses integer nonlinear programming (INLP), with an energy consumption objective function and a performance objective function, to obtain the reconfigurable memory configuration for each program phase and to select the program instruction parts to be optimized. Through the virtual memory management mechanism, code segments that conflict severely or are accessed frequently in the Cache are mapped into the SPM part as far as possible. This reduces the external memory access energy caused by repeatedly refilling the Cache, reduces the extra energy consumed by the comparison logic in the Cache, and improves system performance.

Description

Method for managing a reconfigurable on-chip unified memory for instructions
Technical field
The present invention relates to a reconfigurable on-chip unified memory, and in particular to a method that uses a virtual memory mechanism to manage this memory dynamically. Specifically, it provides the circuit of the memory and a method for managing it dynamically.
Background
With the development of microelectronics, embedded computing platforms based on SoC (System-on-a-Chip) have become increasingly mature. However, because the gap between processor speed and external memory speed keeps widening, the SoC memory subsystem has become the bottleneck for system performance, power consumption, and cost. How to optimize the architecture and management strategy of the memory subsystem has therefore long been a focus of embedded systems research.
Cache and SPM (Scratch-Pad Memory) are the most common conventional on-chip memories. A Cache is managed by hardware, is transparent to software in most cases, and automatically loads recently accessed instructions and data into on-chip memory. However, its high power consumption, large area, and unpredictable program execution time have limited its wide use in embedded systems. In particular, because of set associativity, different program contents that map to the same Cache line may, depending on the access pattern, replace one another repeatedly, which increases performance and energy overhead; this is known as Cache thrashing. An SPM, by contrast, is a high-speed on-chip memory, usually implemented in SRAM, and is an important architectural design consideration in modern embedded systems. The SPM lies within the address space that the processor can access directly. Because a conventional SPM controller contains no logic to assist in managing data, all SPM contents must be managed explicitly by software, which, compared with a Cache that is transparent to the programmer, increases the complexity of program management. Because it carries no overhead from management logic, an SPM is simpler to implement in hardware than a conventional Cache, consumes less energy per access, occupies less chip area, and has a predictable access time. In summary, Cache and SPM each have their own advantages and are complementary. Studying a reconfigurable on-chip unified memory in which Cache and SPM are configured and managed together can therefore exploit the advantages of both, minimizing system energy consumption and improving system performance.
Most research on embedded on-chip memory analyzes architectures that contain only a Cache or only an SPM, and so cannot exploit the complementary characteristics of the two. Directly applying an SPM-only or Cache-only optimization algorithm to a reconfigurable on-chip unified memory does not achieve the best overall power and performance: the gain obtained on one kind of memory may be offset by overhead on the other, or may even introduce additional performance and energy overhead. For example, an SPM optimization algorithm moves a section of main memory into the SPM and so gains performance and energy. The copy code itself, however, may pollute the instruction Cache and defeat the Cache optimization algorithm, causing extra Cache misses that offset the gain from the SPM.
On a Cache miss, external memory must actually be accessed and the new content loaded into a Cache line; this overhead is large and is called the Cache miss penalty. Because of set associativity, contents mapped to the same Cache line replace one another repeatedly, causing a large number of memory accesses, so that system performance drops sharply and energy consumption rises sharply; this is a Cache conflict. Increasing the Cache capacity or associativity can reduce Cache conflicts, but it adds chip area and increases the time and energy of each Cache access, and in some highly associative Caches many storage blocks sit idle, wasting valuable on-chip storage. Some studies point out that Cache conflicts are a major cause of performance and energy bottlenecks, and therefore place program segments that are prone to Cache conflicts in the SPM to gain performance and energy. Placing conflict-prone pages in the SPM not only lowers energy consumption and improves performance by reducing Cache conflicts, but also yields additional savings from the difference in per-access energy between SPM and Cache. These studies, however, are all based on a static circuit design: the associativity of the Cache and the size of the SPM cannot change during program execution. Research shows that different applications, and even different phases of the same program, have different memory access characteristics, and a fixed memory architecture cannot adapt to these changes.
Because changes to SPM contents must be made explicitly by software, research on dynamic SPM management generally relies on instrumentation: copy instructions are inserted manually before and after the kernel loops to be optimized in order to swap program contents in and out. Inserting new instructions into the program image depends on analysis of the source code, and in a Cache-SPM coexistence architecture the new instructions are likely to change Cache behavior, for example by producing more conflicts.
Current research on the instruction part in Cache-SPM coexistence architectures generally requires intrusive analysis of the program: code must be inserted into or modified in the user program to swap contents in and out dynamically during execution. Research on reconfigurable structures mostly concerns reconfigurable Caches, tentatively changing Cache parameters at run time to minimize energy consumption, but it cannot improve program performance. To date, no research has addressed a method that uses virtual memory management to dynamically manage a reconfigurable on-chip unified memory for the instruction part of a program.
Summary of the invention
Technical problem: The object of the invention is to overcome the deficiencies of existing on-chip memory subsystems. It adopts a reconfigurable on-chip unified memory and proposes a method that uses a virtual memory mechanism to manage the reconfigurable memory dynamically. The parameters of the Cache part and the SPM part are configured dynamically according to the phases of program execution, and instruction pages that cause Cache conflicts and instruction pages that are frequently accessed are mapped into the SPM part. This reduces the extra memory accesses caused by conflicts and the extra energy consumed by the Cache comparison logic, and ultimately lowers system energy consumption and speeds up the microprocessor.
Technical solution: The method of the invention uses a virtual memory mechanism to dynamically manage a reconfigurable on-chip unified memory. By tracing the processor's instruction fetches during application execution, it traces the behavior of the Cache part of the reconfigurable memory and obtains the instruction execution characteristics and the temporal and spatial distribution of instruction hits and misses in the Cache. From these it builds a phase-change behavior graph of the instruction Cache for each phase and abstracts it mathematically. Using integer nonlinear programming with an energy objective function and a performance objective function, it selects the reconfigurable memory parameter configuration and the allocation of each instruction page for which total system energy consumption is optimal. During execution, a program phase-change detector raises a phase-change interrupt; the structure of the Cache part and the SPM (Scratch-Pad Memory) part of the reconfigurable on-chip unified memory is reconfigured for each phase; and, by modifying page table entries and configuring the direct memory access function of the reconfigurable on-chip unified memory controller, suitable instruction pages are mapped into the SPM. This eliminates the extra memory accesses caused by instruction Cache conflicts and the extra comparison-logic energy caused by frequent Cache accesses.
Because a program exhibits different instruction execution characteristics in different phases, program execution is divided into phases. After the phase-change behavior graph of the Cache in the reconfigurable memory has been obtained for each phase, the locality of the instructions in the current phase is exploited: ways of the Cache part with low utilization are reconfigured as SPM storage, the instruction address ranges that most frequently cause instruction Cache conflicts or are most frequently accessed during a period of time are remapped into the SPM storage area, and they are mapped back to main memory when the benefit becomes small.
During program execution, the reconfigurable on-chip unified memory can, by configuring the current configuration register of its controller, turn off the Tag bank of a given way of the Cache part and reconfigure the corresponding Data bank for use as SPM, or turn on the Tag bank corresponding to a given SPM bank and reconfigure it for use as Cache. In this way the associativity of the Cache and the capacity of the SPM can be adjusted dynamically. The memory controller also contains a configuration register group dedicated to recording the reconfiguration information of each program phase and an SPM region register group that records the SPM region mappings. Their functions are:
1) The configuration register group records the configuration of the Cache part and the SPM part of the reconfigurable memory for each program phase. When the program phase-change detector detects a change of phase, the interrupt handler loads the configuration required by the new phase from this register group into the current configuration register, completing the dynamic configuration of the reconfigurable memory;
2) The SPM region register group records the physical addresses of the instruction pages that need to be swapped into the SPM storage area in each program phase; these are used to configure the direct memory access controller to move instruction pages from main memory into the SPM storage area. This register group also stores the page table entries in effect before the swap-in, which are used to restore the mapping when a virtual page is swapped out of the SPM storage area;
During program execution, the program phase-change detector collects statistics on the instruction execution characteristics and, according to the detection mode and threshold set in its configuration register, raises a phase-change interrupt when the program phase changes. The interrupt handler can then configure the reconfigurable on-chip unified memory to meet the memory architecture requirements of each program phase.
The reconfigurable on-chip unified memory comprises a Cache part and an SPM part, whose parameters can be adjusted dynamically during program execution: the associativity of the Cache part and the capacity of the SPM part.
The phase-change detector measures in real time the characteristics of instruction execution while the program runs, uses changes in these characteristics to determine program phase changes, records the phase number, and raises an interrupt signal to the processor.
The instruction execution characteristics and the temporal and spatial distribution of instruction hits and misses in the Cache are obtained. Exploiting the phase behavior of program execution, the address ranges that most frequently cause Cache conflicts or are most frequently accessed during a period of time are remapped into the SPM, and they are mapped back to main memory when the benefit becomes small.
During program execution, the reconfigurable on-chip unified memory controller uses its internal direct memory access controller to swap parts of the program's instructions into the SPM storage area dynamically and efficiently. This exploits the burst capability of the on-chip AHB high-speed bus and avoids the secondary Cache pollution that copying through the processor would cause.
The reconfigurable on-chip unified memory controller contains a group of registers dedicated to recording the reconfigurable memory configuration and the SPM storage area address mappings for each program phase:
1) When the program phase-change detection circuit detects a change of phase, the interrupt handler loads the configuration required by the new phase from this register group into the current configuration register, completing the dynamic configuration of the reconfigurable memory;
2) This register group records the physical addresses of the instruction pages that need to be swapped into the SPM storage area in each program phase, which are used to configure the direct memory access controller to move instruction pages from main memory into the SPM storage area;
3) When a virtual page is remapped into the SPM storage area, this register group records its corresponding main memory address; when the virtual page is swapped out of the SPM storage area, this address is used to restore the page table entry that was in effect before the swap-in.
Beneficial effects: The invention makes full use of the phase behavior of program execution and introduces the concept of the phase-change behavior graph. By analyzing this graph, it dynamically configures the parameters of the Cache part and the SPM part of the reconfigurable on-chip unified memory to match the memory access characteristics of each phase of execution, minimizing system energy consumption and improving system performance to some extent. Using virtual memory management conveniently avoids the intrusive modification of program code layout that conventional SPM optimization requires. Conventional techniques mostly insert copy instructions into the program to move the sections to be optimized into the SPM dynamically; with virtual memory management, the actual physical addresses are decoupled from the virtual addresses assigned to the program at compile time. The virtual address space thus remains contiguous for the program before and after optimization, while in the actual hardware the instruction segments that cause Cache conflicts or are frequently accessed are remapped into the SPM part. This reduces the number of Cache accesses and Cache conflicts and ultimately yields gains in performance and energy. In addition, managing the program through the virtual memory mechanism allows non-intrusive analysis and optimization: no explicit SPM copy code needs to be added to the user program, and contents are swapped in and out by configuring the DMA and modifying the page table in the phase-change interrupt handler. The invention combines virtual memory management with a reconfigurable on-chip unified memory and obtains larger performance and energy gains than Cache-only or SPM-only optimization.
Description of the drawings
Fig. 1 is a system block diagram of dynamic management of the reconfigurable on-chip unified memory using a virtual memory mechanism;
Fig. 2 is a schematic diagram of the modified TLB page table entry;
Fig. 3 is a schematic diagram of the reconfigurable on-chip unified memory;
Fig. 4 is a schematic diagram of the phase-change behavior graph;
Fig. 5 is a system flowchart of the method for dynamically managing the reconfigurable on-chip unified memory using a virtual memory mechanism.
Embodiments
The method of the invention can be implemented in the following steps:
(1) Establishing the virtual memory management mechanism
The virtual memory management mechanism can, by modifying page table entries, form an address space that is physically separate but logically contiguous, so that the addresses of some program pages can be mapped into the SPM part of the reconfigurable memory. Compared with conventional dynamic SPM optimization, using virtual memory to change the address mappings allows optimization that is completely non-intrusive to both the program source code and the binary image produced by compilation. To suit the dynamic management of Cache and SPM and to improve utilization of the SPM part, the invention modifies the original MMU hardware: the decoding logic of the TLB is changed to add support for 512-byte and 256-byte virtual pages. A conventional TLB supports a minimum virtual page size of 1 KB, whereas a Cache is organized in lines of only 32 to 64 bytes, and the address ranges that suffer instruction Cache conflicts or frequent access during a period of execution are mostly smaller than the minimum page size a conventional TLB supports. To refine the optimization granularity and improve SPM utilization, the invention uses a reserved bit in the conventional page table entry and modifies the Tag memory and comparator circuit of the TLB to support 256-byte and 512-byte virtual pages.
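The remapping idea in step (1) can be illustrated with a minimal sketch. It assumes a one-level page table held in a dictionary; the base addresses and the helper names translate and remap_to_spm are hypothetical and are not taken from the patent. The point is only that the virtual address seen by the program does not change while the physical location moves to the SPM.

```python
PAGE_SIZE = 256  # bytes; the smallest page size the modified TLB supports per the text

# Hypothetical physical base addresses, chosen only for illustration.
MAIN_MEMORY_BASE = 0x3000_0000
SPM_BASE = 0x1000_0000

def translate(page_table, vaddr):
    """Translate a virtual address with a one-level {vpn: physical page base} table."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    return page_table[vpn] + offset

def remap_to_spm(page_table, vpn, spm_slot):
    """Point one virtual page at an SPM slot; return the old entry for a later restore."""
    old_entry = page_table[vpn]
    page_table[vpn] = SPM_BASE + spm_slot * PAGE_SIZE
    return old_entry

# Four virtual pages initially backed by main memory.
page_table = {vpn: MAIN_MEMORY_BASE + vpn * PAGE_SIZE for vpn in range(4)}
vaddr = 2 * PAGE_SIZE + 0x10          # an instruction address inside virtual page 2

before = translate(page_table, vaddr)  # resolves into main memory
saved = remap_to_spm(page_table, vpn=2, spm_slot=0)
after = translate(page_table, vaddr)   # same virtual address, now resolves into the SPM
```

The saved entry plays the role the text assigns to the SPM region registers: it is what would be written back when the page is swapped out again.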
(2) Building the phase-change behavior graph
The invention optimizes the reconfigurable memory dynamically by analyzing the access behavior of its Cache part. Because Cache behavior shows clear program phases, the invention introduces the concept of the "phase-change behavior graph" to analyze Cache behavior in time and space. The graph is a mathematical abstraction of the trace information of the Cache part of the reconfigurable memory: a weighted graph that quantitatively describes the replacement relationships and access behavior among different program instruction segments mapped to the same Cache line. Because the invention manages the instruction part of the program through virtual memory, the granularity at which the program is divided is the MMU page size. Cache behavior is abstracted page by page and modeled mathematically to describe the weight distribution among pages in different time slots, and integer nonlinear programming then yields the configuration of the reconfigurable memory and the mapping of each page for which overall energy and performance gains are optimal. This identifies, for each phase, the pages most worth optimizing; when the program changes phase, the memory is reconfigured and these pages are swapped into the SPM part dynamically.
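The patent does not spell out how the graph is constructed, so the following is only a sketch of one plausible reading. It assumes a direct-mapped Cache with hypothetical parameters, takes node weights to be fetch counts per page, and takes edge weights to be the number of times a line of one page is replaced by a line of another page. The function name build_behavior_graph is illustrative.

```python
from collections import Counter

LINE_SIZE = 32     # bytes per Cache line (the text cites 32 to 64 bytes)
NUM_LINES = 8      # hypothetical direct-mapped Cache with 8 lines
PAGE_SIZE = 256    # bytes per virtual page

def build_behavior_graph(trace):
    """Return (fetches per page, replacements between pages) for a fetch-address trace."""
    resident = {}              # Cache index -> line number currently stored there
    node_weight = Counter()    # page -> number of fetches
    edge_weight = Counter()    # (replaced page, replacing page) -> number of replacements
    for addr in trace:
        line = addr // LINE_SIZE
        index = line % NUM_LINES
        page = addr // PAGE_SIZE
        node_weight[page] += 1
        old = resident.get(index)
        if old is not None and old != line:
            old_page = old * LINE_SIZE // PAGE_SIZE
            edge_weight[(old_page, page)] += 1
        resident[index] = line
    return node_weight, edge_weight

# Addresses 0x000 and 0x100 share Cache index 0 and keep replacing each other.
trace = [0x000, 0x100, 0x000, 0x100, 0x020]
nodes, edges = build_behavior_graph(trace)
```

In this toy trace, pages 0 and 1 end up connected by heavy edges, which marks them as candidates for relocation to the SPM.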
(3) Program phase-change analysis
This work uses program phase changes to manage the reconfigurable memory dynamically. Program execution can usually be divided into phases; within each phase the behavioral characteristics of the program, such as its demands on the memory structure and the number of instructions executed per cycle, are essentially constant. The invention uses the phase-change detector to measure in real time the number of instructions the processor executes per cycle and raises a hardware interrupt when the program changes phase. The processor core receives the interrupt request from the interrupt handling module, the system enters interrupt mode, the structure of the reconfigurable memory is adjusted, and the SPM storage area is remapped.
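A software model of such a detector might look like the sketch below. It assumes the detector compares each instructions-per-cycle (IPC) sample against the IPC recorded at the start of the current phase and signals a change when the difference exceeds a threshold; the actual detection mode and threshold semantics are not specified in the text, so this rule and the name detect_phase_changes are assumptions.

```python
def detect_phase_changes(ipc_samples, threshold):
    """Return (sample index, phase number) for every detected phase change.

    A change is reported when the IPC differs by more than `threshold`
    from the IPC recorded at the start of the current phase.
    """
    changes = []
    phase = 0
    reference = ipc_samples[0]
    for i, ipc in enumerate(ipc_samples[1:], start=1):
        if abs(ipc - reference) > threshold:
            phase += 1            # corresponds to recording a new phase number
            reference = ipc       # the new phase gets its own reference IPC
            changes.append((i, phase))
    return changes

samples = [1.0, 1.05, 0.98, 0.50, 0.52, 0.49, 1.20, 1.18]
changes = detect_phase_changes(samples, threshold=0.25)
```

Each entry in changes stands for one phase-change interrupt in the hardware described above.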
(4) Dynamic management using the reconfigurable memory controller
During program execution, when the phase-change detection module detects a change of program phase, the processor core, in exception mode, configures the reconfigurable memory controller to reconfigure the memory, modify the page table entries, and swap contents into the SPM, adapting to the memory access pattern of the new phase.
In the phase-change interrupt, the memory is reconfigured by configuring the reconfigurable memory controller. First, the phase-change record register in the phase-change detection module is consulted to find where the configuration for the current phase is stored. Second, the configuration is loaded into the current configuration register of the reconfigurable memory controller to adjust the parameters of the Cache part and the SPM part. Third, the page table entries of the instruction pages to be mapped into the SPM part in this phase are updated. Fourth, the DMA registers are configured to move those instruction pages from main memory into the SPM part. Fifth, the reconfigurable memory is enabled and the processor resumes normal program execution.
The reconfigurable memory controller of the invention involves the following register groups. First, the current configuration register, which configures each bank of the reconfigurable memory as Cache or SPM. Second, the context configuration register group, in which each register holds the memory configuration for one program phase and is loaded into the current configuration register when the program phase changes. Third, the SPM region register group, which records the SPM mappings for each program phase and is read to modify page table entries when pages are swapped into or out of the SPM part. Fourth, the DMA transfer control registers: DMA is configured to swap main memory contents into the SPM storage area dynamically. Compared with the conventional approach of swapping SPM contents in and out with LDR/STR instructions, DMA makes much better use of the burst capability of the SDRAM main memory and the on-chip AHB high-speed bus, reducing the transfer cost and the interrupt latency.
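The five steps of the phase-change interrupt can be walked through with a small software model. This is a sketch under stated assumptions: the Controller class, its field names, and the addresses are hypothetical stand-ins for the hardware registers, and the DMA transfer is only logged rather than performed.

```python
from dataclasses import dataclass, field

@dataclass
class Controller:
    """Hypothetical software model of the reconfigurable memory controller."""
    context_configs: dict            # phase number -> tuple of C1..C4 bits
    spm_regions: dict                # phase number -> list of (vpn, main memory address)
    current_config: tuple = (0, 0, 0, 0)
    enabled: bool = True
    dma_log: list = field(default_factory=list)

def phase_interrupt_handler(ctrl, phase, page_table, spm_base, page_size):
    ctrl.enabled = False
    config = ctrl.context_configs[phase]            # step 1: find this phase's configuration
    ctrl.current_config = config                    # step 2: load the current configuration register
    for slot, (vpn, src) in enumerate(ctrl.spm_regions[phase]):
        dst = spm_base + slot * page_size
        page_table[vpn] = dst                       # step 3: update the page table entry
        ctrl.dma_log.append((src, dst, page_size))  # step 4: program the DMA transfer
    ctrl.enabled = True                             # step 5: re-enable and resume execution

ctrl = Controller(
    context_configs={1: (0, 0, 1, 1)},
    spm_regions={1: [(5, 0x3000_0500), (9, 0x3000_0900)]},
)
page_table = {5: 0x3000_0500, 9: 0x3000_0900}
phase_interrupt_handler(ctrl, phase=1, page_table=page_table,
                        spm_base=0x1000_0000, page_size=256)
```

After the handler returns, virtual pages 5 and 9 resolve to consecutive SPM slots and two of the four ways are marked as SPM.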
The invention is described in further detail below with reference to the drawings and embodiments.
Fig. 1 shows the system block diagram, comprising the processor core, the phase-change detector, the memory management unit (MMU), the instruction router, the reconfigurable on-chip unified memory, the reconfigurable memory controller, a dedicated direct memory access (DMA) controller, the bus, the interrupt controller, the clock module, the external memory interface, and the off-chip SDRAM main memory. The parts that must be added to the original architecture are the phase-change detector, the reconfigurable on-chip unified memory, and the reconfigurable memory controller.
The processor core issues the virtual address of an instruction fetch. After the memory management unit (MMU) converts it to a physical address, the instruction router sends the physical address to the Cache part or the SPM part of the reconfigurable memory, or to off-chip memory, according to the flag bits in the translation lookaside buffer (TLB). The phase-change detector monitors the CPU's instruction fetches in real time and raises an interrupt signal when it detects a phase change; the reconfigurable memory controller and the interrupt controller respond, and the interrupt handler configures the reconfigurable memory controller. The reconfigurable memory controller contains the current configuration register, a group of context configuration registers, and the SPM region registers. Using the information in the SPM region registers, the controller configures the source address, destination address, and transfer length of the DMA controller, and the DMA controller replaces the contents of the SPM storage area with program contents from the off-chip SDRAM main memory through the high-speed AHB bus and the external memory interface.
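The routing decision can be pictured with a short sketch. The text only says that routing depends on TLB flag bits, so the concrete rule here (an SPM address window plus a cacheable flag) and the name route are assumptions made for illustration.

```python
def route(paddr, cacheable, spm_base, spm_size):
    """Pick the target of an instruction fetch from its physical address and TLB flag.

    Assumed rule: addresses inside the SPM window go to the SPM; other
    addresses go through the Cache if marked cacheable, else off-chip.
    """
    if spm_base <= paddr < spm_base + spm_size:
        return "SPM"
    return "Cache" if cacheable else "off-chip"

SPM_BASE, SPM_SIZE = 0x1000_0000, 8 * 1024   # hypothetical SPM window

targets = [
    route(0x1000_0040, cacheable=False, spm_base=SPM_BASE, spm_size=SPM_SIZE),
    route(0x3000_0040, cacheable=True, spm_base=SPM_BASE, spm_size=SPM_SIZE),
    route(0x3000_0040, cacheable=False, spm_base=SPM_BASE, spm_size=SPM_SIZE),
]
```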
Figure 2 shows the modification made to the instruction TLB page table entry to support 512-byte and 256-byte virtual pages. A conventional MMU supports a minimum page size of only 1 KB, and in the dynamic allocation of heterogeneous on-chip instruction memory resources based on the virtual memory mechanism, the minimum SPM management granularity is the MMU page size. If larger pages are used, the SPM area cannot be used efficiently for program code that is relatively scattered. The invention therefore modifies bit 2 of the second-level page table entry in the ARMv5TEJ standard PTE format: because instructions do not need buffering, the former B bit is reused as a Size extension bit, and the Tag memory and comparator circuit of the TLB are modified to support 256-byte and 512-byte virtual pages. The original address translation circuit must be adjusted and the TLB structure modified to add this support, so that the on-chip memory area can be fully used when the instruction SPM is managed dynamically. The TLB mainly consists of the following parts: a Tag storage array, two SRAM storage arrays, address decoding circuitry, Hit logic, read/write control logic, and input/output driver circuits. A virtual address normally consists of a page number and an offset. In operation, the CPU issues a 32-bit virtual address, and the high-order page number bits of the virtual address are compared with the virtual page numbers held in the Tag array. Because finer-grained pages are supported, the page number is correspondingly longer: the invention supports Tag comparison of up to 24 bits, that is, a minimum page size of 256 bytes. For 512-byte virtual pages, only the upper 23 bits of the Tag are used. The TLB also supports 22-, 20-, 16- and 12-bit Tag comparison, corresponding to the translation of tiny pages, small pages, large pages and sections respectively.
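The relationship between page size and Tag width described above is simple arithmetic: with a 32-bit virtual address, the Tag width is 32 minus log2 of the page size. The following is a minimal illustrative sketch, not part of the patent; the function names are hypothetical, and the 1 KB / 4 KB / 64 KB / 1 MB sizes for tiny pages, small pages, large pages and sections follow the usual ARM conventions.

```python
def tag_bits(page_size_bytes: int, va_bits: int = 32) -> int:
    """Number of virtual-address bits that form the page number (TLB tag)."""
    if page_size_bytes <= 0 or page_size_bytes & (page_size_bytes - 1):
        raise ValueError("page size must be a power of two")
    offset_bits = page_size_bytes.bit_length() - 1  # log2 of the page size
    return va_bits - offset_bits


def split_address(vaddr: int, page_size_bytes: int):
    """Split a virtual address into (page number, offset within the page)."""
    offset_bits = page_size_bytes.bit_length() - 1
    return vaddr >> offset_bits, vaddr & (page_size_bytes - 1)


# Page sizes named in the text; labels are for illustration only.
PAGE_SIZES = {
    "256 B page": 256,
    "512 B page": 512,
    "tiny page (1 KB)": 1024,
    "small page (4 KB)": 4 * 1024,
    "large page (64 KB)": 64 * 1024,
    "section (1 MB)": 1024 * 1024,
}

for name, size in PAGE_SIZES.items():
    print(f"{name}: {tag_bits(size)}-bit tag")
```

The printed widths (24, 23, 22, 20, 16 and 12 bits) match the Tag comparison widths listed in the description.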
Figure 3 is a structural diagram of the reconfigurable memory. It comprises a reconfigurable memory controller, tag storage arrays, data storage arrays, a dedicated DMA, and so on. The memory bank part is based on a 4-way set-associative Cache structure; the main difference is that the tag storage arrays and data storage arrays can be controlled by the reconfigurable memory controller. The controller contains a current configuration information register, current_cs_reg, in which bits C1-C4 each control one way's tag storage array and its corresponding data storage array. When C_i is 1, tag_i is turned off and data_i serves as SPM storage; when C_i is 0, tag_i is turned on and data_i serves as Cache storage. The controller also contains a group of SPM region registers, which store the mapping between the SPM part and main memory for each program phase. In addition, a group of context configuration information registers is provided so that, when the program changes phase, the phase context can be switched quickly: the reconfigurable memory completes the reconfiguration of its banks in the shortest possible time and uses the dedicated DMA to map the SPM region quickly. As the diagram shows, when a way is configured as SPM, the extra power consumed by the tag comparison logic is saved, and the data part changes from being non-addressable by software to being software-addressable.
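How the C1-C4 bits partition the four ways can be sketched as follows. This is only an illustration under stated assumptions: the bit layout (bit i-1 of the register value holds C_i), the 4 KB way size and all names other than current_cs_reg are hypothetical and are not given in the patent text.

```python
from dataclasses import dataclass

NUM_WAYS = 4  # the text describes a 4-way set-associative bank structure


@dataclass
class MemoryConfig:
    cache_ways: list     # way numbers (1-based) still acting as Cache
    spm_ways: list       # way numbers (1-based) reconfigured as SPM
    associativity: int   # remaining Cache associativity
    spm_bytes: int       # total SPM capacity


def decode_current_cs_reg(reg: int, way_bytes: int = 4096) -> MemoryConfig:
    """Decode C1..C4; assumes bit i-1 of `reg` is C_i and each way is `way_bytes`."""
    spm = [i + 1 for i in range(NUM_WAYS) if ((reg >> i) & 1) == 1]
    cache = [i + 1 for i in range(NUM_WAYS) if ((reg >> i) & 1) == 0]
    return MemoryConfig(cache, spm, len(cache), len(spm) * way_bytes)


# Example: C1 = 1, C2 = 0, C3 = 1, C4 = 0
print(decode_current_cs_reg(0b0101))
```

With two of the four bits set, the sketch reports a 2-way Cache and two ways' worth of SPM, which mirrors the trade-off between Cache associativity and SPM capacity described above.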
Fig. 4 is a schematic diagram of the phase behavior graph. Program execution exhibits fairly distinct phases. Based on these phase characteristics, the phase behavior graph divides the whole execution of the program into several phases, obtains a separate memory access behavior graph within each phase, and derives from it the optimal configuration of the reconfigurable memory for each program phase. Using the virtual memory management mechanism, a dynamic allocation algorithm relocates to the SPM region those pages that cause Cache conflicts in each time slot and those pages that are accessed frequently. Dynamic optimization based on program phase characteristics makes good use of the limited on-chip storage resources and yields larger performance and energy gains than a fixed storage structure.
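As a rough illustration of the per-phase memory access information that such a graph is built from, the sketch below counts, for each phase, how often each instruction page is fetched and how often it misses in the Cache. The trace format (phase number, fetch address, hit flag) and the function name are assumptions made for this example; the patent does not specify them.

```python
from collections import Counter, defaultdict


def phase_page_profile(trace, page_size=256):
    """Summarize a fetch trace per program phase.

    `trace` is an iterable of (phase_id, fetch_address, hit) tuples
    (a hypothetical format). Returns two dicts mapping
    phase_id -> Counter(page_number -> count): accesses and misses.
    """
    accesses = defaultdict(Counter)
    misses = defaultdict(Counter)
    for phase, addr, hit in trace:
        page = addr // page_size
        accesses[phase][page] += 1
        if not hit:
            misses[phase][page] += 1
    return accesses, misses


trace = [
    (0, 0x1000, False), (0, 0x1004, True), (0, 0x2000, False),
    (1, 0x2000, True), (1, 0x2004, True), (1, 0x1000, False),
]
accesses, misses = phase_page_profile(trace)
print(dict(accesses[0]), dict(misses[0]))
```

Pages with many accesses or many misses in a given phase are the natural candidates for relocation to the SPM region in that phase.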
Figure 5 is the system flowchart of the method for dynamically managing the reconfigurable on-chip unified memory using the virtual memory mechanism.
In the program analysis stage, the first step is to configure all banks of the reconfigurable memory as Cache and to build the program phase behavior graph from the trace information collected for the Cache part. Analysis based on the phase behavior graph is non-intrusive to the program. The second step is mathematical abstraction: the phase behavior graph is modeled mathematically to describe how each instruction page is accessed during program execution and how the pages relate to one another; the influence of each candidate node's state on the energy consumption function is then described quantitatively by analyzing how the weight distribution of the instruction pages changes across program phases; finally, integer nonlinear programming yields the state of each node at which the overall energy gain is optimal. The third step uses the analysis results of the second step to obtain the optimal memory configuration required in each program phase, which determines the reconfiguration information of the reconfigurable memory for each phase. The fourth step uses the storage organization after reconfiguration, that is, the parameters of the Cache part and the SPM part, to determine which instruction pages must be mapped to the SPM part in each program phase and where they are placed within the SPM part, which gives the values of the SPM region register group. After these steps, the memory configuration information for each phase of program execution and the region mapping of the SPM storage area are known.
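The fourth step can be pictured with the following sketch. Note that it is not the integer nonlinear programming formulation named in the description: it substitutes a much simpler greedy stand-in that picks the highest-weight pages that fit into the SPM capacity of a phase and assigns each a slot. The weights, names and return format are all hypothetical.

```python
def select_spm_pages(page_weights, spm_bytes, page_size=256):
    """Greedy stand-in for the page selection of one program phase.

    `page_weights` maps instruction page number -> weight (for example,
    an estimated energy gain if the page resides in SPM). Returns a dict
    mapping each chosen page number to its byte offset in the SPM region,
    which plays the role of the SPM region register values.
    """
    slots = spm_bytes // page_size
    # Highest weight first; ties broken by page number for determinism.
    ranked = sorted(page_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    chosen = [page for page, weight in ranked[:slots] if weight > 0]
    return {page: slot * page_size for slot, page in enumerate(chosen)}


weights = {0x10: 50, 0x11: 5, 0x20: 80, 0x21: 0}
print(select_spm_pages(weights, spm_bytes=512))
```

A real implementation following the description would instead optimize over the joint choice of Cache associativity, SPM capacity and page placement against the energy and performance objective functions.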
In the program execution stage, the values of the configuration information registers and the SPM region registers are first loaded into the reconfigurable memory controller. When the program phase changes, the processor core receives an interrupt request from the interrupt controller and the system enters interrupt mode. In interrupt mode, the configuration information in the context configuration information register is loaded into the current configuration information register, which completes the reconfiguration of the reconfigurable memory, the modification of the page table entries, and the swapping in and out of the SPM contents, so that the memory suits the access pattern of the current program phase. The interrupt handling proceeds as follows. First, after entering interrupt mode and saving the relevant context, the Cache part of the reconfigurable memory and the MMU are disabled, because the memory must be reconfigured and the page table modified. Second, the phase counter register is read to obtain the current program phase number. Third, the region register for the current phase is read, and the page table entries of the instruction pages that must be mapped to SPM are modified. Fourth, the configuration information in the context configuration information register is loaded into the current configuration information register, and the reconfigurable memory controller configures the memory structure according to the current configuration information register. Fifth, the dedicated DMA is configured: the main memory address in the mapping region register is loaded into the DMA source address register, the physical address of the corresponding page in the SPM storage area is loaded into the DMA destination address register, and the DMA is then enabled to move the instruction pages that must be mapped to SPM into the SPM part. Sixth, after the DMA transfer finishes, the Cache and MMU are re-enabled, the context saved before the interrupt is restored, the interrupt handler exits, and the processor core resumes the program that was interrupted.
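The six interrupt-handling steps can be traced in a small software model. This is a simulation sketch under stated assumptions, not driver code: the class and field names are hypothetical, the page table is reduced to a dictionary from virtual page to SPM physical address, and the DMA is modeled as a log of (source, destination) transfers.

```python
from dataclasses import dataclass, field


@dataclass
class PhaseContext:
    cs_reg: int    # way configuration (C1..C4 bits) for this phase
    regions: list  # (virtual_page, main_memory_addr, spm_addr) tuples


@dataclass
class System:
    contexts: dict                 # phase number -> PhaseContext
    phase_counter: int = 0         # written by the phase-change detector
    current_cs_reg: int = 0
    cache_enabled: bool = True
    mmu_enabled: bool = True
    page_table: dict = field(default_factory=dict)
    dma_log: list = field(default_factory=list)


def handle_phase_interrupt(system: System) -> None:
    # Step 1: (context save omitted) disable the Cache part and the MMU.
    system.cache_enabled = False
    system.mmu_enabled = False
    # Step 2: read the phase counter to get the current phase number.
    ctx = system.contexts[system.phase_counter]
    # Step 3: retarget the page table entries of the pages mapped to SPM.
    for vpage, _main_addr, spm_addr in ctx.regions:
        system.page_table[vpage] = spm_addr
    # Step 4: load the phase's configuration into the current register.
    system.current_cs_reg = ctx.cs_reg
    # Step 5: program the DMA source/destination and move each page.
    for _vpage, main_addr, spm_addr in ctx.regions:
        system.dma_log.append((main_addr, spm_addr))
    # Step 6: re-enable Cache and MMU (context restore omitted).
    system.cache_enabled = True
    system.mmu_enabled = True


system = System(contexts={1: PhaseContext(0b0011, [(0x40, 0x8000_4000, 0x0000)])})
system.phase_counter = 1
handle_phase_interrupt(system)
print(bin(system.current_cs_reg), system.page_table, system.dma_log)
```

The ordering matters in the real design: the Cache and MMU stay disabled while the page table and memory structure are inconsistent, and are re-enabled only after the DMA transfer has completed.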

Claims (5)

1. A method for dynamically managing a reconfigurable on-chip unified memory using a virtual memory mechanism, characterized in that: during execution of an application program, the instructions fetched by the processor and the behavior of the cache memory (Cache) part of the reconfigurable memory are traced, to obtain the instruction execution characteristics and the temporal and spatial distribution of instruction hits and misses in the Cache; a phase behavior graph is then built for the instruction Cache in each phase and abstracted mathematically, and integer nonlinear programming, based respectively on a power consumption objective function and a performance objective function, is used to select the reconfigurable memory parameter configuration and the distribution of the instruction pages at which the total system energy consumption is optimal; during program execution, a program phase-change detector generates a phase-change interrupt, the Cache part and the scratchpad memory (SPM) part of the on-chip unified memory are structurally reconfigured in each phase, and, by modifying page table entries and configuring the direct memory access and the reconfigurable on-chip unified memory controller, suitable instruction pages are mapped into the SPM, eliminating the extra memory accesses caused by instruction Cache conflicts and the extra energy consumed by the comparison logic on frequent Cache accesses.
2. The method for dynamically managing a reconfigurable on-chip unified memory using a virtual memory mechanism according to claim 1, characterized in that: the reconfigurable on-chip unified memory comprises a Cache part and an SPM part, and the following parameters of these two parts can be adjusted dynamically while the program runs: the associativity of the Cache part and the capacity of the SPM part.
3. The method for dynamically managing a reconfigurable on-chip unified memory using a virtual memory mechanism according to claim 1, characterized in that: the phase-change detector measures in real time the characteristics of the instructions executed by the processor while the program runs, uses changes in these characteristics to detect a program phase change, records the phase sequence number, and sends an interrupt signal to the processor.
4. The method for dynamically managing a reconfigurable on-chip unified memory using a virtual memory mechanism according to claim 1, characterized in that: the reconfigurable on-chip unified memory controller uses its internal direct memory access controller to load portions of the program instructions into the SPM storage area dynamically and efficiently during program execution, exploiting the Burst feature of the on-chip AHB high-speed bus and avoiding the secondary Cache pollution that a transfer through the processor would cause.
5. The method for dynamically managing a reconfigurable on-chip unified memory using a virtual memory mechanism according to claim 1, characterized in that: the reconfigurable on-chip unified memory controller is provided with a group of region registers dedicated to recording the reconfigurable memory configuration information and the SPM storage area address mapping for each program phase:
1) when the program phase-change detection circuit detects that the program phase has changed, this group of registers supplies the configuration information required by the new program phase, which the interrupt handler loads into the current configuration information register to complete the dynamic configuration of the reconfigurable memory;
2) this group of registers records the physical addresses of the instruction pages that must be loaded into the SPM storage area in each program phase, which are used to configure the direct memory access controller to move the instruction pages from main memory into the SPM storage area;
3) when a virtual page is remapped into the SPM storage area, this group of registers records its corresponding main memory address, which is used to restore the page table entry to its state before loading when the virtual page is swapped out of the SPM storage area.
CN2011100073102A 2011-01-14 2011-01-14 Method for managing reconfigurable on-chip unified memory aiming at instructions Expired - Fee Related CN102073596B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011100073102A CN102073596B (en) 2011-01-14 2011-01-14 Method for managing reconfigurable on-chip unified memory aiming at instructions

Publications (2)

Publication Number Publication Date
CN102073596A CN102073596A (en) 2011-05-25
CN102073596B true CN102073596B (en) 2012-07-25

Family

ID=44032142

Country Status (1)

Country Link
CN (1) CN102073596B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103207838B (en) * 2012-01-17 2016-03-30 展讯通信(上海)有限公司 Improve the method for chip performance
US9239786B2 (en) 2012-01-18 2016-01-19 Samsung Electronics Co., Ltd. Reconfigurable storage device
US10114750B2 (en) 2012-01-23 2018-10-30 Qualcomm Incorporated Preventing the displacement of high temporal locality of reference data fill buffers
CN102662861B (en) * 2012-03-22 2014-12-10 北京北大众志微系统科技有限责任公司 Software-aided inserting strategy control method for last-level cache
US9558006B2 (en) * 2012-12-20 2017-01-31 Intel Corporation Continuous automatic tuning of code regions
CN103345429B (en) * 2013-06-19 2018-03-30 中国科学院计算技术研究所 High concurrent memory access accelerated method, accelerator and CPU based on RAM on piece
CN103593324B (en) * 2013-11-12 2017-06-13 上海新储集成电路有限公司 A kind of quick startup low-power consumption computer on-chip system with self-learning function
CN103942181B (en) * 2014-03-31 2017-06-06 清华大学 Method, device for generating the configuration information of dynamic reconfigurable processor
CN106708747A (en) * 2015-11-17 2017-05-24 深圳市中兴微电子技术有限公司 Memory switching method and device
US11016771B2 (en) * 2019-05-22 2021-05-25 Chengdu Haiguang Integrated Circuit Design Co., Ltd. Processor and instruction operation method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1045307A2 (en) * 1999-04-16 2000-10-18 Infineon Technologies North America Corp. Dynamic reconfiguration of a micro-controller cache memory
CN101739358A (en) * 2009-12-21 2010-06-16 东南大学 Method for dynamically allocating on-chip heterogeneous memory resources by utilizing virtual memory mechanism
CN101763316A (en) * 2009-12-25 2010-06-30 东南大学 Method for dynamically distributing isomerism storage resources on instruction parcel based on virtual memory mechanism
CN201540564U (en) * 2009-12-21 2010-08-04 东南大学 Dynamic distribution circuit for distributing on-chip heterogenous storage resources by utilizing virtual memory mechanism

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
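Zhang Yang et al. Implementing an SPM-based dynamic energy consumption optimization mechanism using the idea of virtual memory management. Computer Knowledge and Technology (《电脑知识与技术》). 2009, Vol. 5, No. 24. *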

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20120725

Termination date: 20150114

EXPY Termination of patent right or utility model