CN1996268A - Method for implementing on-chip command cache - Google Patents

Method for implementing on-chip command cache

Info

Publication number
CN1996268A
Authority
CN
China
Prior art keywords
cache
lru
output
storehouse
instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA2006101697244A
Other languages
Chinese (zh)
Other versions
CN100428200C (en)
Inventor
车德亮
黄玮
权海洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Times Minxin Technology Co., Ltd.
China Aerospace Modern Electronic Company 772nd Institute
Original Assignee
Mxtronics Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mxtronics Corp
Priority to CNB2006101697244A
Publication of CN1996268A
Application granted
Publication of CN100428200C
Expired - Fee Related
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The instruction cache consists of a cache control register, cache segment start address registers, match flag (P) bits, cache segment word memory, and an LRU replacement stack. The structure is simple, occupies little area, and consumes little power. The workflow built on this structure is simple and easy to coordinate with other functional units, which improves the reliability of the instruction cache. A hardware replacement stack selects the number of the segment to be replaced, which avoids software decisions and increases the operating speed of the instruction cache.

Description

Method for implementing an on-chip instruction cache
Technical field
The invention belongs to the field of computer technology and relates to the design and manufacture of high-performance signal processors, in particular to a hardware implementation method for an on-chip instruction cache control system.
Background art
Ever since the computer was invented, there has been a mismatch between computing speed and I/O speed, and caching is one of the techniques for resolving it. A traditional cache control system is usually implemented in software: the operating system of a PC, for example, is responsible for controlling and managing the cache, and the storage state, data scheduling, replacement policy, exception handling and so on of the whole cache are all handled by software routines in the operating system. With advances in microelectronics manufacturing, a computer system can be implemented on a single chip and the cache is integrated on chip as well, so controlling and managing the on-chip cache effectively is an important part of high-performance processor chip design.
At present, instruction caches in China and abroad are mainly implemented by combining hardware and software, relying on large computer hardware and a software operating system. Controlling the instruction cache in this way has the drawbacks of high power consumption, long instruction fetch time, and poor reliability.
Summary of the invention
Technical problem solved by the invention: to overcome the deficiencies of the prior art by providing a method for implementing an on-chip instruction cache that increases the operating speed of the instruction cache while keeping the structure simple, the area small, and the power consumption low, so that it is easy to implement on chip and highly reliable.
Technical solution of the invention: a method for implementing an on-chip instruction cache, characterized in that the instruction cache consists of a cache control register, cache segment start address registers, match flag P bits, cache segment word memory, and an LRU replacement stack. The cache control register controls or indicates the state of the cache; each cache segment start address register stores the segment address part of an instruction address; each match flag P bit marks whether a word within a segment matches; the cache segment word memory stores instructions; and the LRU replacement stack records the order in which cache segments are replaced. When an instruction word is requested from external memory, there are two possible outcomes: a cache hit or a cache miss. On a cache hit, the instruction is read from the cache, the segment number is pushed onto the top of the LRU replacement stack, and the corresponding P flag is set. On a cache miss there are two cases. In the first case, a cache segment start address register matches the instruction address but the corresponding match flag P bit is not set; the following operations are then carried out simultaneously: the instruction is read from memory and copied into the cache, the segment number is pushed onto the top of the LRU replacement stack, and the corresponding match flag P bit is set. In the second case, no cache segment start address register matches the instruction address; the following operations are then carried out simultaneously: the number of the segment to be replaced is selected from the LRU replacement stack, all match flag P bits in that segment are cleared, the bits of the instruction address corresponding to the register width are loaded into the cache segment start address register of the replaced segment, the instruction is fetched and copied into the cache with its P flag set, and the number of the replaced segment is pushed onto the top of the LRU replacement stack.
Compared with the prior art, the invention has the following advantages:
(1) The on-chip instruction cache structure of the invention is simple, small in area, and low in power consumption, and is easy to implement on chip.
(2) The workflow designed around the structure of the on-chip instruction cache is simple, which improves the reliability of the instruction cache.
(3) A hardware replacement stack selects the number of the segment to be replaced, which avoids the software decisions of the prior art and increases the operating speed of the instruction cache.
Brief description of the drawings
Fig. 1 is a schematic diagram of the structure of the on-chip instruction cache of the invention;
Fig. 2 shows the workflow of the on-chip instruction cache of the invention;
Fig. 3 is a structural diagram of the LRU replacement stack of the invention.
Detailed description of the embodiments
As shown in Fig. 1, in the SMDSP signal processor the instruction cache holds 64 words of 32 bits each and is divided into two segments of 32 words; a 19-bit segment start address register is associated with each segment. Every word in the cache has a corresponding match flag P bit: P = 1 indicates that the word in the cache is valid; otherwise the word is invalid.
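The geometry above implies how an instruction address is divided between the segment start address register and the word position inside a segment. The following is a minimal Python sketch under one assumption that the text does not state explicitly: the five low-order address bits (log2 of 32) select the word within a segment, and the next 19 bits are compared against the segment start address register. The function name `split_address` and the constant names are hypothetical.

```python
from typing import Tuple

SEGMENT_COUNT = 2        # two segments (from the text)
WORDS_PER_SEGMENT = 32   # 32 words per segment (from the text)
SSA_BITS = 19            # width of a segment start address register (from the text)
WORD_INDEX_BITS = 5      # assumption: log2(32) low-order bits select the word


def split_address(addr: int) -> Tuple[int, int]:
    """Split an instruction address into (segment start address, word index)."""
    word_index = addr & (WORDS_PER_SEGMENT - 1)
    segment_address = (addr >> WORD_INDEX_BITS) & ((1 << SSA_BITS) - 1)
    return segment_address, word_index
```

Under this assumption the two fields together cover a 24-bit instruction address, and the total capacity is `SEGMENT_COUNT * WORDS_PER_SEGMENT`, that is, 64 words.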
The instruction cache consists of a cache control register, cache segment start address registers, match flag P bits, cache segment word memory, and an LRU replacement stack. The cache control register controls or indicates the state of the cache; each cache segment start address register stores the segment address part of an instruction address; each match flag P bit marks whether a word within a segment matches; the cache segment word memory stores instructions; and the LRU replacement stack records the order in which cache segments are replaced.
As shown in Fig. 2, when an instruction word is requested from external memory, there are two possible outcomes: a cache hit or a cache miss. On a cache hit, the instruction is read from the cache, the segment number is pushed onto the top of the LRU replacement stack, and the corresponding P flag is set. On a cache miss there are two cases. In the first case, a cache segment start address register matches the instruction address but the corresponding match flag P bit is not set; the following operations are then carried out simultaneously: the instruction is read from memory and copied into the cache, the segment number is pushed onto the top of the LRU replacement stack, and the corresponding match flag P bit is set. In the second case, no cache segment start address register matches the instruction address; the following operations are then carried out simultaneously: the number of the segment to be replaced is selected from the LRU replacement stack, all match flag P bits in that segment are cleared, the bits of the instruction address corresponding to the register width are loaded into the cache segment start address register of the replaced segment, the instruction is fetched and copied into the cache with its P flag set, and the number of the replaced segment is pushed onto the top of the LRU replacement stack.
The LRU replacement stack determines which of the two segments is the least recently used. Each time a segment is accessed, its segment number is removed from the LRU stack and pushed onto the top of the LRU replacement stack. The number at the top of the stack is therefore the most recently used segment number, and the number at the bottom is the least recently used segment number. On reset, the top of the LRU stack is initialized to 0 and the bottom to 1.
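The behavior of the two-entry replacement stack can be illustrated with a small software model. This is only a behavioral sketch in Python, not the hardware circuit of the patent; the class name `LruStack` and the method name `touch` are hypothetical, while `mru_sn` and `lru_sn` mirror the MRU_SN and LRU_SN signal names used for Fig. 3.

```python
class LruStack:
    """Two-entry LRU replacement stack; index 0 is the top of the stack."""

    def __init__(self):
        self.reset()

    def reset(self):
        # From the text: on reset the top holds 0 and the bottom holds 1.
        self.entries = [0, 1]

    def touch(self, segment):
        # Remove the accessed segment number and push it back on top.
        self.entries.remove(segment)
        self.entries.insert(0, segment)

    @property
    def mru_sn(self):
        # Most recently used segment number (top of the stack).
        return self.entries[0]

    @property
    def lru_sn(self):
        # Least recently used segment number (bottom of the stack).
        return self.entries[-1]
```

With only two segments the two outputs are always complementary, so accessing segment 1 after reset makes segment 0 the replacement candidate, and accessing segment 0 again swaps them back.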
As shown in Fig. 3, the LRU replacement stack of the invention is built as follows. The enable bit CE (cache enable) and the read signal R (Read) pass through AND gate AND_1; the output of AND_1 and the reset signal RESET pass through NOR gate NOR_2, whose output drives the control terminal of transmission gate T1. The output of NOR_2 is also connected, through inverter INV_1, to the complementary control terminal of T1. Clock CLK1 passes through T1 to one input of AND gate And_2. The output of OR gate or_1, whose inputs are Reset and SSA0 (segment 0), and the output of NOR gate Nor_1, whose inputs are Reset and SSA1 (segment 1), are connected to NAND gate Nand_1, whose output is connected to AND gate And_2. The output of And_2 is connected to inverter INV_2, whose output is connected to inverter INV_3 and transmission gate T3. The output of INV_3 is MRU_SN (the most recently used segment number) and is also connected to transmission gate T2. The output of T2 is connected to inverter INV_2. The output of T3 is LRU_SN (the least recently used segment number). The control terminals of T2 and T3 are both connected to CLK2, and their complementary control signal is CLK2 passed through inverter INV_1.
The cache control register has three bits. (1) Cache clear bit CC: when CC = 1, all entries in the cache are invalidated. This bit is set to 0 after the cache is written, and it is 0 at reset. (2) Cache enable bit CE: when CE = 1, the cache is enabled and is used according to the LRU algorithm; when CE = 0, the cache is disabled, which prevents it from being updated or modified. This bit is 0 at reset. (3) Cache freeze bit CF: when CF = 1, the operation of the cache and of the LRU replacement stack is frozen. If CE = 1 and CF = 1, reads from the cache are allowed but the cache may not be modified. The cache may be cleared (CC = 1) whether CF is 1 or 0. The CF bit is set to 0 at reset.
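How the three control bits combine can be summarized in a short Python sketch. The bit positions chosen for `CC`, `CE`, and `CF` below are assumptions made only for illustration, since the text does not give the register layout, and the function name `cache_permissions` is hypothetical. The sketch also assumes that a disabled cache (CE = 0) is not read at all, which the text implies but does not state outright.

```python
# Assumed bit positions; the patent text does not specify the layout.
CC = 0b001  # cache clear
CE = 0b010  # cache enable
CF = 0b100  # cache freeze

RESET_VALUE = 0  # from the text: all three bits are 0 at reset


def cache_permissions(ctrl):
    """Return what the cache may do for a given control register value."""
    enabled = bool(ctrl & CE)
    frozen = bool(ctrl & CF)
    return {
        # Clearing is allowed whether CF is 0 or 1.
        "clear_all": bool(ctrl & CC),
        # Assumption: instruction words are served from the cache only when enabled.
        "may_read": enabled,
        # Updates to the cache and the LRU stack need CE = 1 and CF = 0.
        "may_update": enabled and not frozen,
    }
```

For example, `cache_permissions(CE | CF)` describes the frozen state from the text: reads are allowed, updates are not.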
When an instruction is requested from external memory, the instruction cache follows the workflow shown in Fig. 2. There are two possible outcomes: a cache hit or a cache miss. On a cache hit, the instruction is read from the cache, the segment number is pushed onto the top of the LRU replacement stack, and the corresponding P flag is set. On a cache miss there are two cases. In the first case, a cache segment start address register matches the instruction address but the corresponding match flag P bit is not set; three operations are then carried out simultaneously: the instruction is read from memory and copied into the cache; the segment number is pushed onto the top of the LRU replacement stack; and the corresponding match flag P bit is set. In the second case, no cache segment start address register matches the instruction address; four operations are then carried out simultaneously: the segment to be replaced is selected from the LRU replacement stack and the 32 match flag P bits in that segment are cleared; the 19 high-order bits of the requested instruction address are loaded into the cache segment start address register of the segment indicated by the LRU_SN signal of the LRU replacement stack; the instruction is fetched and copied into the cache with its P flag set; and the number of the replaced segment is pushed onto the top of the LRU replacement stack.
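The three outcomes of this workflow (hit, miss with a matching segment, and miss with no matching segment) can be traced with a behavioral model. The Python sketch below is an illustration under stated assumptions, not the patented hardware: the class name `InstructionCache`, the dictionary used as external memory, and the outcome labels are hypothetical; the five low-order address bits are assumed to index the word within a segment; the segment start address registers are assumed to start empty; and the operations that the hardware performs simultaneously are executed one after another here.

```python
WORDS_PER_SEGMENT = 32
WORD_INDEX_BITS = 5  # assumption: log2(32) low-order bits select the word


class InstructionCache:
    def __init__(self, memory):
        self.memory = memory      # hypothetical external memory: address -> word
        self.ssa = [None, None]   # segment start address registers (assumed empty)
        self.p = [[False] * WORDS_PER_SEGMENT for _ in range(2)]  # P flags
        self.words = [[0] * WORDS_PER_SEGMENT for _ in range(2)]  # word memory
        self.lru = [0, 1]         # index 0 is the top, index -1 the bottom

    def _touch(self, seg):
        # Move the segment number to the top of the LRU replacement stack.
        self.lru.remove(seg)
        self.lru.insert(0, seg)

    def fetch(self, addr):
        seg_addr = addr >> WORD_INDEX_BITS
        idx = addr & (WORDS_PER_SEGMENT - 1)
        if seg_addr in self.ssa:
            seg = self.ssa.index(seg_addr)
            if self.p[seg][idx]:
                outcome = "hit"
            else:
                # First miss case: register matches, P flag not set.
                outcome = "word miss"
                self.words[seg][idx] = self.memory[addr]
                self.p[seg][idx] = True
        else:
            # Second miss case: replace the least recently used segment.
            outcome = "segment miss"
            seg = self.lru[-1]
            self.p[seg] = [False] * WORDS_PER_SEGMENT
            self.ssa[seg] = seg_addr
            self.words[seg][idx] = self.memory[addr]
            self.p[seg][idx] = True
        self._touch(seg)
        return self.words[seg][idx], outcome
```

A short trace: with an empty cache, fetching address `0x40` is a segment miss that fills the bottom-of-stack segment, fetching `0x40` again is a hit, and fetching `0x41` finds the matching segment register but an unset P flag, so only that one word is loaded.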

Claims (3)

1. A method for implementing an on-chip instruction cache, characterized in that: the instruction cache consists of a cache control register, cache segment start address registers, match flag P bits, cache segment word memory, and an LRU replacement stack; the cache control register controls or indicates the state of the cache; each cache segment start address register stores the segment address part of an instruction address; each match flag P bit marks whether a word within a segment matches; the cache segment word memory stores instructions; and the LRU replacement stack records the order in which cache segments are replaced; when an instruction word is requested from external memory, there are two possible outcomes, a cache hit or a cache miss; on a cache hit, the instruction is read from the cache, the segment number is pushed onto the top of the LRU replacement stack, and the corresponding P flag is set; on a cache miss there are two cases: in the first case, a cache segment start address register matches the instruction address but the corresponding match flag P bit is not set, and the following operations are carried out simultaneously: the instruction is read from memory and copied into the cache, the segment number is pushed onto the top of the LRU replacement stack, and the corresponding match flag P bit is set; in the second case, no cache segment start address register matches the instruction address, and the following operations are carried out simultaneously: the number of the segment to be replaced is selected from the LRU replacement stack, all match flag P bits in that segment are cleared, the bits of the instruction address corresponding to the register width are loaded into the cache segment start address register of the replaced segment, the instruction is fetched and copied into the cache with its P flag set, and the number of the replaced segment is pushed onto the top of the LRU replacement stack.
2. The method for implementing an on-chip instruction cache according to claim 1, characterized in that the LRU replacement stack is built as follows: the cache enable bit CE and the read signal R pass through AND gate AND_1; the output of AND_1 and the reset signal RESET pass through NOR gate NOR_2, whose output drives the control terminal of transmission gate T1; the output of NOR_2 is connected through inverter INV_1 to the complementary control terminal of T1; CLK1 passes through T1 to one input of AND gate And_2; the output of OR gate or_1, whose inputs are Reset and SSA0 (segment start address register 0), and the output of NOR gate Nor_1, whose inputs are Reset and SSA1 (segment start address register 1), are connected to NAND gate Nand_1, whose output is connected to AND gate And_2; the output of And_2 is connected to inverter INV_2, whose output is connected to inverter INV_3 and transmission gate T3; the output of INV_3 is the most recently used segment number MRU_SN and is also connected to transmission gate T2; the output of T2 is connected to inverter INV_2; the output of T3 is the least recently used segment number LRU_SN; the control terminals of T2 and T3 are both connected to clock CLK2, and their complementary control signal is CLK2 passed through inverter INV_1.
3. The method for implementing an on-chip instruction cache according to claim 1, characterized in that the cache control register has three bits: (1) a cache clear bit CC: when CC = 1, all entries in the cache are invalidated; this bit is set to 0 after the cache is written, and it is 0 at reset; (2) a cache enable bit CE: when CE = 1, the cache is enabled and is used according to the LRU algorithm; when CE = 0, the cache is disabled, which prevents it from being updated or modified; this bit is 0 at reset; (3) a cache freeze bit CF: when CF = 1, the operation of the cache and of the LRU replacement stack is frozen; if CE = 1 and CF = 1, reads from the cache are allowed but the cache may not be modified; the cache may be cleared (CC = 1) whether CF is 1 or 0; the CF bit is set to 0 at reset.
CNB2006101697244A 2006-12-28 2006-12-28 Method for implementing on-chip command cache Expired - Fee Related CN100428200C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2006101697244A CN100428200C (en) 2006-12-28 2006-12-28 Method for implementing on-chip command cache

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2006101697244A CN100428200C (en) 2006-12-28 2006-12-28 Method for implementing on-chip command cache

Publications (2)

Publication Number Publication Date
CN1996268A (en) 2007-07-11
CN100428200C (en) 2008-10-22

Family

ID=38251365

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2006101697244A Expired - Fee Related CN100428200C (en) 2006-12-28 2006-12-28 Method for implementing on-chip command cache

Country Status (1)

Country Link
CN (1) CN100428200C (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102624331A (en) * 2012-04-01 2012-08-01 钜泉光电科技(上海)股份有限公司 Temperature-compensation circuit and temperature-compensation method of real-time clock
CN104699574A (en) * 2013-12-09 2015-06-10 华为技术有限公司 Method, device and system for establishing Cache check points of processor
CN107451071A (en) * 2017-08-04 2017-12-08 郑州云海信息技术有限公司 A kind of caching replacement method and system
CN107729263A (en) * 2017-09-18 2018-02-23 暨南大学 The replacement policy of the modified lru algorithm of a kind of tree in high speed Cache

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08339331A (en) * 1995-06-12 1996-12-24 Alps Lsi Technol Kk Cache memory
US5809528A (en) * 1996-12-24 1998-09-15 International Business Machines Corporation Method and circuit for a least recently used replacement mechanism and invalidated address handling in a fully associative many-way cache memory
CN1164706A (en) * 1997-04-25 1997-11-12 清华大学 Method for designing JAVA processor by using virtual register structure
US6643742B1 (en) * 2000-03-20 2003-11-04 Intel Corporation Method and system for efficient cache memory updating with a least recently used (LRU) protocol
US6745291B1 (en) * 2000-08-08 2004-06-01 Unisys Corporation High speed LRU line replacement system for cache memories
US7139877B2 (en) * 2003-01-16 2006-11-21 Ip-First, Llc Microprocessor and apparatus for performing speculative load operation from a stack memory cache
CN1280736C (en) * 2004-09-17 2006-10-18 中国人民解放军国防科学技术大学 Dummy least recently used uniform replacement method of cache controller

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102624331A (en) * 2012-04-01 2012-08-01 钜泉光电科技(上海)股份有限公司 Temperature-compensation circuit and temperature-compensation method of real-time clock
CN104699574A (en) * 2013-12-09 2015-06-10 华为技术有限公司 Method, device and system for establishing Cache check points of processor
CN104699574B (en) * 2013-12-09 2018-04-20 华为技术有限公司 A kind of method, apparatus and system for establishing processor Cache checkpoints
CN107451071A (en) * 2017-08-04 2017-12-08 郑州云海信息技术有限公司 A kind of caching replacement method and system
CN107729263A (en) * 2017-09-18 2018-02-23 暨南大学 The replacement policy of the modified lru algorithm of a kind of tree in high speed Cache
CN107729263B (en) * 2017-09-18 2020-02-07 暨南大学 Replacement strategy of tree-structured improved LRU algorithm in high-speed Cache

Also Published As

Publication number Publication date
CN100428200C (en) 2008-10-22

Similar Documents

Publication Publication Date Title
CN101694613B (en) Unaligned memory access prediction
US8904112B2 (en) Method and apparatus for saving power by efficiently disabling ways for a set-associative cache
CN104252425B (en) The management method and processor of a kind of instruction buffer
CN100573477C (en) The system and method that group in the cache memory of managing locks is replaced
US6356990B1 (en) Set-associative cache memory having a built-in set prediction array
Yang et al. NV-Tree: A consistent and workload-adaptive tree structure for non-volatile memory
CN103597455A (en) Efficient tag storage for large data caches
Li et al. Exploiting set-level write non-uniformity for energy-efficient NVM-based hybrid cache
CN104834483B (en) A kind of implementation method for lifting embedded MCU performance
US20200201776A1 (en) Using a second content-addressable memory to manage memory burst accesses in memory sub-systems
CN100399299C (en) Memory data processing method of cache failure processor
CN101178690B (en) Low-power consumption high performance high speed scratch memory
CN106569960A (en) Last stage cache management method for mixed main store
Quan et al. Prediction table based management policy for STT-RAM and SRAM hybrid cache
CN100428200C (en) Method for implementing on-chip command cache
US6240489B1 (en) Method for implementing a pseudo least recent used (LRU) mechanism in a four-way cache memory within a data processing system
Kim et al. Low-energy data cache using sign compression and cache line bisection
Kim et al. Energy-efficient exclusive last-level hybrid caches consisting of SRAM and STT-RAM
US7010649B2 (en) Performance of a cache by including a tag that stores an indication of a previously requested address by the processor not stored in the cache
CN111736900B (en) Parallel double-channel cache design method and device
CN102360339A (en) Method for improving utilization efficiency of TLB (translation lookaside buffer)
Komalan et al. Feasibility exploration of NVM based I-cache through MSHR enhancements
CN101158926B (en) Apparatus and method for saving power in a trace cache
Tsao et al. Boosting NVDIMM performance with a lightweight caching algorithm
CN106383926A (en) Instruction prefetching method based on Cortex-M series processor and circuit

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: BEIJING TIMES MINXIN TECHNOLOGY CO., LTD.; CHINA

Free format text: FORMER OWNER: BEIJING TIMES MINXIN TECHNOLOGY CO., LTD.

Effective date: 20081017

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20081017

Address after: Beijing city Fengtai District Donggaodi four camp gate road 2 ZIP Code: 100076

Co-patentee after: China Aerospace Modern Electronic Company 772nd Institute

Patentee after: Beijing Times Minxin Technology Co., Ltd.

Address before: Beijing city Fengtai District Donggaodi four camp gate road 2 ZIP Code: 100076

Patentee before: BeiJing Times Minxin Technology Co., Ltd.

EE01 Entry into force of recordation of patent licensing contract

Assignee: Space Klc Holdings Ltd

Assignor: BeiJing Times Minxin Technology Co., Ltd.|China Aerospace Modern Electronic Company 772nd Institute

Contract fulfillment period: 2009.8.18 to 2017.8.18

Contract record no.: 2009990001271

Denomination of invention: Method for implementing on-chip command cache

Granted publication date: 20081022

License type: Exclusive license

Record date: 20091116

LIC Patent licence contract for exploitation submitted for record

Free format text: EXCLUSIVE LICENSE; TIME LIMIT OF IMPLEMENTING CONTRACT: 2009.8.18 TO 2017.8.18; CHANGE OF CONTRACT

Name of requester: AEROSPACE CAPITAL HOLDING CO. LTD.

Effective date: 20091116

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20081022

Termination date: 20181228
