CN103218304A - On-chip and off-chip distribution method for embedded memory data - Google Patents

On-chip and off-chip distribution method for embedded memory data

Info

Publication number
CN103218304A
Authority
CN
China
Prior art keywords
data
data object
tcg
chip
memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2013101146843A
Other languages
Chinese (zh)
Other versions
CN103218304B (en)
Inventor
姚英彪
陈越佳
王璇
曾宪彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University
Priority to CN201310114684.3A
Publication of CN103218304A
Application granted
Publication of CN103218304B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The invention relates to an on-chip and off-chip distribution method for embedded memory data. As a key part of an embedded system, the on-chip memory directly affects the overall performance of the system. The method first proposes a TCG model as a new criterion for measuring how likely a data object is to cause Cache misses, comprehensively considering the main contributing factors: data object size, lifetime, access count, temporal locality and spatial locality. It then proposes an SPM (scratch-pad memory)/Cache data distribution method that assigns the data objects most likely to cause conflicts (those with large TCG values) to the SPM. Finally, it proposes a fixed Cache data layout method that maps the remaining data objects with large TCG values to different Cache sets to avoid conflicts. The method lets the on-chip memory hardware and the software running on it match each other better and shortens the time the program spends accessing the memory system, thereby improving the overall performance of the system.

Description

On-chip and off-chip distribution method for embedded memory data
Technical field
The invention belongs to the technical field of embedded memory, and in particular relates to an on-chip and off-chip distribution method for embedded memory data. The invention can achieve the best attainable performance of a specific application on a specific memory configuration, and is particularly suitable for performance optimization of multimedia applications on a hybrid scratch-pad memory/cache on-chip memory structure.
Background art
Because of differences in manufacturing process and circuit logic structure, processor execution units have always been faster than memory reads and writes, and with the development of semiconductor process technology the performance penalty caused by these diverging growth rates keeps increasing. An important technique for resolving the speed mismatch between the processor and external memory is a hierarchical memory system, in which a small but fast memory is integrated on-chip to improve the memory-access performance of the system.
As an important part of an embedded system, the on-chip memory structure directly affects key parameters of the system such as performance, power consumption and cost. There are two types of on-chip memory: cache (Cache) and scratch-pad memory (SPM, Scratch-Pad Memory). Per unit of storage, an SPM costs less area and power than a Cache, so a hybrid SPM/Cache on-chip memory structure is gradually becoming a trend in embedded systems. However, the small capacity and application-specific nature of the SPM make the effective use of on-chip memory resources a key issue in embedded system design.
Existing research on software data storage optimization focuses mainly on how to increase the Cache hit rate or how to increase the number of SPM accesses; research on optimizing data memory accesses for a hybrid on-chip memory structure that combines Cache and SPM is lacking.
On-chip/off-chip data distribution is an embedded-system storage optimization technique: it yields an on-chip/off-chip allocation strategy that determines which data are accessed through the SPM (referred to as on-chip) and which data are accessed through the Cache (referred to as off-chip). By optimizing the distribution of data between the SPM and the Cache, it can deliver the best attainable performance for a specific application, and it has become a focus of embedded-system storage optimization research.
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art by providing an on-chip and off-chip distribution method for embedded memory data that can achieve the best attainable performance of a specific application program on a specific memory configuration.
To solve the above technical problem, the technical solution adopted by the invention comprises the following steps:
Step 1. Extract information about the specific application program using compiler and simulator tools;
Step 2. Build a TCG model from this information;
Step 3. Use the proposed data distribution method to assign the data objects with large TCG values to the SPM;
Step 4. Use the proposed data layout method to map the data objects with large TCG values to different Cache sets to avoid conflicts.
The information about the specific application program in step 1 comprises the size, lifetime, access count, temporal locality and spatial locality of each data object; the temporal locality is represented by a temporal relationship graph (TRG, Temporal Relationship Graph); the spatial locality is represented by the maximum consecutive access count.
The TCG model of step 2 incorporates the size, lifetime, access count, temporal locality and spatial locality factors of the data objects extracted in step 1, and its model formula is as follows:
TCG = (access count × lifetime × TRG value) / (maximum consecutive access count × object size).
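Purely for illustration, the formula above can be sketched as a small Python helper; the record type DataObject and its field names (size, lifetime, accesses, trg, max_consecutive) are hypothetical names introduced here, not identifiers from the patent, and the units (bytes for size, profiled counts for the rest) are assumptions.

```python
from dataclasses import dataclass

@dataclass
class DataObject:
    name: str
    size: int             # object size in bytes (assumed unit)
    lifetime: int         # length of the object's live range reported by profiling
    accesses: int         # total number of accesses to the object
    trg: float            # weight of the object in the temporal relationship graph (TRG)
    max_consecutive: int  # maximum number of consecutive accesses (spatial locality)

def tcg_value(obj: DataObject) -> float:
    """TCG = (access count * lifetime * TRG value) / (max consecutive accesses * object size)."""
    return (obj.accesses * obj.lifetime * obj.trg) / (obj.max_consecutive * obj.size)
```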
The data distribution method of step 3 specifically comprises the following steps (an illustrative code sketch follows the list):
3-1. Sort all data objects in descending order of TCG value, initially assign them all to off-chip memory, and take them as the objects to be allocated;
3-2. Among all objects still to be allocated, select, in descending order, the first data object whose size is less than or equal to the remaining capacity of the scratch-pad memory, and assign this object to the on-chip scratch-pad memory;
3-3. Repeat step 3-2 until the size of every object still to be allocated exceeds the remaining scratch-pad memory capacity, then finish.
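The sketch below follows steps 3-1 to 3-3 under the same assumptions, reusing the hypothetical DataObject and tcg_value names from above; it also assumes object sizes and the SPM capacity are expressed in the same unit.

```python
def allocate_spm(objects: list[DataObject], spm_capacity: int):
    """Greedy SPM/Cache split: every object starts off-chip (step 3-1); the
    highest-TCG object that still fits in the remaining SPM space is moved
    on-chip (step 3-2), until no remaining object fits (step 3-3)."""
    off_chip = sorted(objects, key=tcg_value, reverse=True)             # step 3-1
    spm, remaining = [], spm_capacity
    while True:
        fit = next((o for o in off_chip if o.size <= remaining), None)  # step 3-2
        if fit is None:                                                 # step 3-3
            break
        off_chip.remove(fit)
        spm.append(fit)
        remaining -= fit.size
    return spm, off_chip
```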
The data layout method of step 4 comprises the following steps (an illustrative code sketch follows the list):
4-1. For a data object among the remaining objects to be allocated, compute the number of cache sets it needs, using the following formula:
Number of sets = data object size / cache set size;
4-2. Assign the current cache set number to the data object, then increment the current set number by one and decrement the object's required number of sets by one;
4-3. Repeat step 4-2 until the object's required number of sets reaches zero;
4-4. Repeat steps 4-1, 4-2 and 4-3 until all remaining objects to be allocated have been assigned.
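A sketch of steps 4-1 to 4-4 under the same assumptions follows; here cache_set_size stands for the number of bytes that map to one cache set (line size times associativity), the set count is rounded up so that every object gets at least one set, and the modulo wrap when the running set counter passes the last set is an added assumption, since the steps do not say what happens at that point.

```python
import math

def fixed_cache_layout(off_chip_objects: list[DataObject],
                       cache_set_size: int, num_sets: int) -> dict[str, list[int]]:
    """Give each off-chip object a run of consecutive cache set numbers so that
    objects with large TCG values land in different sets."""
    layout: dict[str, list[int]] = {}
    current_set = 0
    for obj in off_chip_objects:
        needed = math.ceil(obj.size / cache_set_size)   # step 4-1 (rounded up: assumption)
        assigned = []
        while needed > 0:                               # steps 4-2 and 4-3
            assigned.append(current_set % num_sets)     # modulo wrap: assumption
            current_set += 1
            needed -= 1
        layout[obj.name] = assigned                     # step 4-4: move on to the next object
    return layout
```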
 
The beneficial effects of the invention are as follows:
The method of the invention uses the TCG model to model the application program information, taking into account both the effective use of the SPM and the sensible placement of data objects in off-chip memory. It optimizes the data allocation between the SPM and the Cache, reduces the time the program spends on data memory accesses and the energy those accesses consume, and achieves the best attainable performance of a specific application program on a specific memory configuration.
Description of drawings
Fig. 1 is the flow chart of the method of the invention;
Fig. 2 is the structure diagram of the TCG model proposed by the method of the invention;
Fig. 3 is the flow chart of the SPM/Cache data distribution method of the invention;
Fig. 4 is the flow chart of the fixed Cache data layout method of the invention.
Embodiment
The invention is described below with reference to an embodiment and the accompanying drawings.
As shown in Fig. 1, in the present embodiment the information about the specific application program is first extracted using compiler and simulator tools: 1. the GCC-2.7.1-MIPS compiler with the -O3 optimization option is used to statically compile the application program and obtain MIPS assembly code; 2. the MIPS simulator is used to configure the on-chip memory, including capacity, access latency and organization (replacement policy, write policy, write-miss policy and associativity), the performance statistics tool is enabled, and the data-memory access performance of the program is simulated. Next, the TCG model is built from this information; then the SPM/Cache data distribution method is used to assign the data objects with large TCG values to the SPM; finally, the fixed Cache data layout method is used to map the data objects with large TCG values to different Cache sets to avoid conflicts.
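The patent does not specify how the statistics gathered by the compiler and the simulator are stored; purely as an illustration, the sketch below assumes a hypothetical CSV export with one row per data object (columns name, size, lifetime, accesses, trg, max_consecutive) and turns it into the DataObject records used in the sketches above.

```python
import csv

def load_profile(csv_path: str) -> list[DataObject]:
    """Read per-object statistics from a hypothetical CSV produced by the
    profiling run and build the DataObject records fed to the TCG model."""
    objects = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            objects.append(DataObject(
                name=row["name"],
                size=int(row["size"]),
                lifetime=int(row["lifetime"]),
                accesses=int(row["accesses"]),
                trg=float(row["trg"]),
                max_consecutive=int(row["max_consecutive"]),
            ))
    return objects
```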
The information about the specific application program comprises the size, lifetime, access count, temporal locality and spatial locality of each data object; the temporal locality is represented by a temporal relationship graph (TRG, Temporal Relationship Graph); the spatial locality is represented by the maximum consecutive access count.
As shown in Fig. 2, the TCG model incorporates the size, lifetime, access count, temporal locality and spatial locality factors of the data objects extracted in step 1, and its model formula is as follows:
TCG = (access count × lifetime × TRG value) / (maximum consecutive access count × object size).
For the temporal relationship graph (TRG, Temporal Relationship Graph) and the TRG value, see the reference by N. Gloy, T. Blockwell and M.D. Zorn, "Procedure placement using temporal ordering information".
As shown in Fig. 3, the purpose of the SPM/Cache data allocation in the present embodiment is to assign the data objects most likely to cause conflicts to the SPM; it comprises the following steps:
Step 1. Sort all data objects in descending order of TCG value, initially assign them all to off-chip memory, and take them as the objects to be allocated;
Step 2. Among all objects still to be allocated, select, in descending order, the first data object whose size is less than or equal to the remaining capacity of the scratch-pad memory, and assign this object to the on-chip scratch-pad memory SPM;
Step 3. Repeat Step 2 until the size of every object still to be allocated exceeds the remaining scratch-pad memory capacity, then finish.
As shown in Fig. 4, the fixed Cache data layout in the present embodiment has two goals: 1. to reduce the number of Cache misses; 2. to reduce the off-chip memory space used (i.e. to reduce the holes left in off-chip memory once the layout is done). It comprises the following steps (an end-to-end sketch combining the allocation and the layout follows the list):
Step 4. For each of the remaining i objects to be allocated, compute the number j of cache sets the object needs, using the following formula:
Number of sets j = data object size / cache set size;
Step 5. Assign the current cache set number setNO to the data object, then increment the current set number setNO by one and decrement the object's required number of sets j by one;
Step 6. Repeat Step 5 until the object's required number of sets j reaches zero;
Step 7. Repeat Steps 4, 5 and 6 until all of the remaining i objects to be allocated have been assigned.
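Putting the sketches together, a toy end-to-end run might look as follows; the three objects, the 1 KB SPM and the 64-byte-per-set, 64-set cache geometry are made-up numbers used only to show how the pieces compose, not figures from the patent.

```python
objects = [
    DataObject("frame_buf", size=2048, lifetime=900, accesses=5000, trg=0.9, max_consecutive=8),
    DataObject("coeff_tab", size=256,  lifetime=800, accesses=4000, trg=0.8, max_consecutive=2),
    DataObject("scratch",   size=512,  lifetime=300, accesses=1200, trg=0.4, max_consecutive=16),
]

spm, off_chip = allocate_spm(objects, spm_capacity=1024)              # SPM/Cache split
layout = fixed_cache_layout(off_chip, cache_set_size=64, num_sets=64) # fixed Cache layout

print("on-chip (SPM):", [o.name for o in spm])
print("off-chip set assignment:", layout)
```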

Claims (1)

1. An on-chip and off-chip distribution method for embedded memory data, characterized by comprising the following steps:
Step 1. Extract information about the specific application program using compiler and simulator tools;
Step 2. Build a TCG model from this information;
Step 3. Use the proposed data distribution method to assign the data objects with large TCG values to the SPM;
Step 4. Use the proposed data layout method to map the data objects with large TCG values to different Cache sets to avoid conflicts;
The information about the specific application program in step 1 comprises the size, lifetime, access count, temporal locality and spatial locality of each data object; the temporal locality is represented by a temporal relationship graph TRG; the spatial locality is represented by the maximum consecutive access count;
The TCG model of step 2 incorporates the size, lifetime, access count, temporal locality and spatial locality factors of the data objects extracted in step 1, and its model formula is as follows:
TCG = (access count × lifetime × TRG value) / (maximum consecutive access count × object size);
The data distribution method of step 3 specifically comprises the following steps:
3-1. Sort all data objects in descending order of TCG value, initially assign them all to off-chip memory, and take them as the objects to be allocated;
3-2. Among all objects still to be allocated, select, in descending order, the first data object whose size is less than or equal to the remaining capacity of the scratch-pad memory, and assign this object to the on-chip scratch-pad memory;
3-3. Repeat step 3-2 until the size of every object still to be allocated exceeds the remaining scratch-pad memory capacity, then finish;
The data layout method of step 4 comprises the following steps:
4-1. For a data object among the remaining objects to be allocated, compute the number of cache sets it needs, using the following formula:
Number of sets = data object size / cache set size;
4-2. Assign the current cache set number to the data object, then increment the current set number by one and decrement the object's required number of sets by one;
4-3. Repeat step 4-2 until the object's required number of sets reaches zero;
4-4. Repeat steps 4-1, 4-2 and 4-3 until all remaining objects to be allocated have been assigned.
CN201310114684.3A 2013-04-03 2013-04-03 On-chip and off-chip distribution method for embedded memory data Expired - Fee Related CN103218304B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310114684.3A CN103218304B (en) 2013-04-03 2013-04-03 On-chip and off-chip distribution method for embedded memory data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310114684.3A CN103218304B (en) 2013-04-03 2013-04-03 On-chip and off-chip distribution method for embedded memory data

Publications (2)

Publication Number Publication Date
CN103218304A true CN103218304A (en) 2013-07-24
CN103218304B CN103218304B (en) 2016-07-20

Family

ID=48816120

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310114684.3A Expired - Fee Related CN103218304B (en) 2013-04-03 2013-04-03 On-chip and off-chip distribution method for embedded memory data

Country Status (1)

Country Link
CN (1) CN103218304B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103559148A (en) * 2013-11-15 2014-02-05 山东大学 On-chip scratch-pad memory (SPM) management method facing multitasking embedded system
CN103793339A (en) * 2014-01-13 2014-05-14 杭州电子科技大学 Memory access stack distance based data Cache performance exploring method
CN105204940A (en) * 2014-05-28 2015-12-30 中兴通讯股份有限公司 Memory allocation method and device
CN106940682A (en) * 2017-03-07 2017-07-11 武汉科技大学 A kind of embedded system optimization method based on programmable storage on piece
CN116097222A (en) * 2020-05-18 2023-05-09 华为技术有限公司 Memory arrangement optimization method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101763316A (en) * 2009-12-25 2010-06-30 东南大学 Method for dynamically distributing isomerism storage resources on instruction parcel based on virtual memory mechanism
CN101901192A (en) * 2010-07-27 2010-12-01 杭州电子科技大学 On-chip and off-chip data object static assignment method
US20110219193A1 (en) * 2007-11-06 2011-09-08 Il Hyun Park Processor and memory control method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110219193A1 (en) * 2007-11-06 2011-09-08 Il Hyun Park Processor and memory control method
CN101763316A (en) * 2009-12-25 2010-06-30 东南大学 Method for dynamically distributing isomerism storage resources on instruction parcel based on virtual memory mechanism
CN101901192A (en) * 2010-07-27 2010-12-01 杭州电子科技大学 On-chip and off-chip data object static assignment method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
袁名举: "Research on low-power techniques based on ScratchPad Memory", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103559148A (en) * 2013-11-15 2014-02-05 山东大学 On-chip scratch-pad memory (SPM) management method facing multitasking embedded system
CN103559148B (en) * 2013-11-15 2016-03-23 山东大学 On-chip scratch-pad memory management method for a multitasking embedded operating system
CN103793339A (en) * 2014-01-13 2014-05-14 杭州电子科技大学 Memory access stack distance based data Cache performance exploring method
CN103793339B (en) * 2014-01-13 2016-08-24 杭州电子科技大学 Data Cache performance exploration method based on memory-access stack distance
CN105204940A (en) * 2014-05-28 2015-12-30 中兴通讯股份有限公司 Memory allocation method and device
CN106940682A (en) * 2017-03-07 2017-07-11 武汉科技大学 A kind of embedded system optimization method based on programmable storage on piece
CN116097222A (en) * 2020-05-18 2023-05-09 华为技术有限公司 Memory arrangement optimization method and device

Also Published As

Publication number Publication date
CN103218304B (en) 2016-07-20

Similar Documents

Publication Publication Date Title
Hu et al. Data allocation optimization for hybrid scratch pad memory with SRAM and nonvolatile memory
Salkhordeh et al. An operating system level data migration scheme in hybrid DRAM-NVM memory architecture
CN104081315B Methods, devices and systems including thread merging for efficiency and energy conservation
Zomaya et al. Energy-efficient distributed computing systems
Capra et al. Measuring application software energy efficiency
CN103218304B On-chip and off-chip distribution method for embedded memory data
CN103150265B Fine-grained data distribution method for heterogeneous on-chip memory in an embedded system
US20180024928A1 (en) Modified query execution plans in hybrid memory systems for in-memory databases
CN104115093A (en) Method, apparatus, and system for energy efficiency and energy conservation including power and performance balancing between multiple processing elements
Li et al. MAC: Migration-aware compilation for STT-RAM based hybrid cache in embedded systems
CN103559148B On-chip scratch-pad memory management method for a multitasking embedded operating system
CN104572500B Microprocessor and method of managing its performance and power consumption
Li et al. Compiler-assisted preferred caching for embedded systems with STT-RAM based hybrid cache
Hu et al. Management and optimization for nonvolatile memory-based hybrid scratchpad memory on multicore embedded processors
Kline Jr et al. Greenchip: A tool for evaluating holistic sustainability of modern computing systems
Kannan et al. A software solution for dynamic stack management on scratch pad memory
Liu et al. A space-efficient fair cache scheme based on machine learning for nvme ssds
CN101901192B (en) On-chip and off-chip data object static assignment method
Köhler et al. Carbon-Aware Memory Placement
Hu et al. Optimizing data allocation and memory configuration for non-volatile memory based hybrid SPM on embedded CMPs
Tian et al. Optimal task allocation on non-volatile memory based hybrid main memory
CN104182280B (en) Low-energy RM real-time task scheduling method for hybrid main memory embedded system
Ramesh et al. Energy management in embedded systems: Towards a taxonomy
Li et al. Energy optimization of branch-aware data variable allocation on hybrid SRAM+ NVM SPM for CPS
Poursafaei et al. NPAM: NVM-aware page allocation for multi-core embedded systems

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160720

Termination date: 20170403