CN112148366A - FLASH acceleration method for reducing power consumption and improving performance of chip


Info

Publication number
CN112148366A
CN112148366A (application CN202010958159.XA)
Authority
CN
China
Prior art keywords
data
flash
module
fetching
prefetching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010958159.XA
Other languages
Chinese (zh)
Inventor
时颖
舒海军
吴明勋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Huahong Integrated Circuit Co Ltd
Beijing CEC Huada Electronic Design Co Ltd
Original Assignee
Shanghai Huahong Integrated Circuit Co Ltd
Beijing CEC Huada Electronic Design Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Huahong Integrated Circuit Co Ltd, Beijing CEC Huada Electronic Design Co Ltd filed Critical Shanghai Huahong Integrated Circuit Co Ltd
Priority to CN202010958159.XA priority Critical patent/CN112148366A/en
Publication of CN112148366A publication Critical patent/CN112148366A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/22Microcontrol or microprogram arrangements
    • G06F9/28Enhancement of operational speed, e.g. by using several microcontrol devices operating in parallel
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3802Instruction prefetching
    • G06F9/3814Implementation provisions of instruction buffers, e.g. prefetch buffer; banks

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention provides a Flash acceleration controller that combines a data-prefetch cache, high-frequency instruction storage, and a configurable replacement algorithm. The acceleration scheme is selected by software configuration or switched adaptively at run time, improving system performance across the different scenarios encountered in embedded applications. The main innovations are: 1) Flash prefetch operation: for sequential accesses, prefetching is used to accelerate reads; for non-sequential accesses, an instruction cache and a least recently used (LRU) replacement policy reduce the cost of prefetch misses; 2) high-frequency instruction caching: its enable is controlled independently, reducing interference with instruction fetches and improving performance; 3) independent acceleration of instruction fetches and data fetches, each controlled separately through a software control flow.

Description

FLASH acceleration method for reducing power consumption and improving performance of chip
Technical Field
The invention belongs to the technical field of low-power chips, and in particular relates to a FLASH acceleration method for reducing chip power consumption and improving performance.
Background
Flash is increasingly used as the nonvolatile memory for instructions and data in low-power, low-cost embedded SoC (System on Chip) designs. In such a chip, the processor handles control, the operating-system platform, and general signal processing, while Flash stores the instructions and data; the processor must access Flash to obtain them and complete its tasks. Most of the on-chip Flash space usually holds instructions, and the processor accesses instructions most frequently. Whereas processor performance can be raised through instruction-level parallelism, superscalar design, and large register files, Flash read performance can be improved only by a few means, such as process improvements. As processors get faster, the Flash instruction-fetch speed therefore gradually becomes the system bottleneck: the overall performance of an embedded SoC is directly limited by how fast instructions can be fetched from Flash. Studying how to raise the Flash read speed is thus of real significance for improving overall system performance.
Disclosure of Invention
The technical problem the invention solves is to improve on the prior art by providing a Flash acceleration controller that uses a prefetch cache, high-frequency instruction storage, and a replacement-algorithm-based acceleration scheme. The scheme is selected by software configuration or switched adaptively at run time, improving system performance for the different scenarios encountered in embedded applications.
To solve the above problem, the FLASH acceleration method of the invention for reducing power consumption and improving performance comprises five steps:
Step 1: The high-frequency instructions are identified, either by software profiling before the data is written into the flash or by reading the flash data in advance, and are stored in the high-frequency instruction cache region.
Step 2: The acceleration cache controller stores prefetched flash data in the prefetch data module according to a least recently used (LRU) algorithm. The prefetch data module is divided into two parts: a sequential prefetch module and an LRU replacement module. Whenever the prefetch data module has an empty slot, the cache controller automatically fetches data from the flash and stores it in the free position.
Step 3: When the CPU accesses flash data, it preferentially reads from the prefetch data module. The sequential prefetch module and the LRU replacement module are accessed with equal priority.
Step 4: When the data accessed by the CPU is not in the prefetch data module, the current jump instruction is stored and the data is re-read from the flash. Because the data must be re-read from the flash and flash access is slow, wait cycles are inserted into the CPU access.
Step 5: The data read from the flash is stored into the prefetch data module according to the least recently used algorithm: it is first stored in the sequential prefetch module, and data in the LRU replacement module is then replaced by data from the sequential prefetch module according to the replacement policy. Steps 2-5 are repeated.
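The LRU bookkeeping in steps 2 and 5 can be sketched in software. This is a minimal behavioral model, not the patented hardware; the `FlashLRU` name, the capacity, and the one-word-per-address granularity are illustrative assumptions:

```python
from collections import OrderedDict

class FlashLRU:
    """Behavioral model of the LRU replacement module: holds a fixed number
    of flash lines and evicts the least recently used one when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> data, oldest first

    def lookup(self, addr):
        """Return cached data and refresh its recency, or None on a miss."""
        if addr not in self.lines:
            return None
        self.lines.move_to_end(addr)  # mark as most recently used
        return self.lines[addr]

    def fill(self, addr, data):
        """Insert a line, evicting the least recently used entry if full."""
        if addr in self.lines:
            self.lines.move_to_end(addr)
        elif len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict the oldest entry
        self.lines[addr] = data

cache = FlashLRU(capacity=2)
cache.fill(0x00, "insn0")
cache.fill(0x04, "insn1")
cache.lookup(0x00)          # 0x00 becomes most recently used
cache.fill(0x08, "insn2")   # evicts 0x04, the least recently used
assert cache.lookup(0x04) is None
assert cache.lookup(0x00) == "insn0"
```

The model shows why a jump back to a recently executed region (a loop, a hot handler) tends to hit: recency, not insertion order, decides which line is evicted.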
Drawings
FIG. 1 is a block diagram of a flash acceleration method for reducing power consumption and improving performance;
FIG. 2 is a general flash read timing diagram;
FIG. 3 is a flash accelerated read timing diagram.
Detailed Description
The invention will be described in further detail with reference to the following detailed description and accompanying drawings:
The block diagram of the flash acceleration method for reducing power consumption and improving performance is shown in FIG. 1; it comprises the flash array, a data-flow control module, the acceleration cache controller, a flash control module, a bus interface, and an erase module.
Timing diagrams of the flash acceleration method are shown in FIG. 2 and FIG. 3: FIG. 2 shows an ordinary flash read, with the delay of a direct access, and FIG. 3 shows the accelerated flash read.
The method is realized by the following five steps:
Step 1: The high-frequency instructions are identified, either by software profiling before the data is written into the flash or by reading the flash data in advance, and are stored in the high-frequency instruction cache region.
Step 2: The acceleration cache controller stores prefetched flash data in the prefetch data module according to a least recently used (LRU) algorithm. The prefetch data module is divided into two parts: a sequential prefetch module and an LRU replacement module. Whenever the prefetch data module has an empty slot, the cache controller automatically fetches data from the flash and stores it in the free position.
Step 3: When the CPU accesses flash data, it preferentially reads from the prefetch data module. The sequential prefetch module and the LRU replacement module are accessed with equal priority.
Step 4: When the data accessed by the CPU is not in the prefetch data module, the current jump instruction is stored and the data is re-read from the flash. Because the data must be re-read from the flash and flash access is slow, wait cycles are inserted into the CPU access.
Step 5: The data read from the flash is stored into the prefetch data module according to the least recently used algorithm: it is first stored in the sequential prefetch module, and data in the LRU replacement module is then replaced by data from the sequential prefetch module according to the replacement policy. Steps 2-5 are repeated.
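The five steps above amount to the following read flow: a hit in the prefetch data module returns immediately, while a miss re-reads the flash, inserts wait cycles, and refills the buffer. Below is a behavioral sketch under assumed parameters (a 1-cycle hit, 4 wait cycles on a miss, a prefetch depth of two 4-byte words; all names and numbers are illustrative, not taken from the patent):

```python
# Fake flash contents: one word every 4 bytes.
FLASH = {addr: f"word@{addr:#x}" for addr in range(0, 64, 4)}
HIT_CYCLES, MISS_WAIT_CYCLES = 1, 4   # assumed timings

prefetch_buf = {}   # models the prefetch data module (both parts merged)

def sequential_prefetch(addr, depth=2):
    """Step 2: when the buffer has free slots, pull the next words from flash."""
    for a in range(addr, addr + 4 * depth, 4):
        if a in FLASH:
            prefetch_buf[a] = FLASH[a]

def cpu_read(addr):
    """Steps 3-5: read from the prefetch buffer first; on a miss,
    re-read the flash with wait cycles and refill the buffer."""
    if addr in prefetch_buf:                 # step 3: hit, no stall
        return prefetch_buf[addr], HIT_CYCLES
    data = FLASH[addr]                       # step 4: miss, re-read flash
    sequential_prefetch(addr)                # step 5: refill the buffer
    return data, HIT_CYCLES + MISS_WAIT_CYCLES

data, cycles = cpu_read(0x10)    # cold miss: wait cycles inserted
assert cycles == 5
data, cycles = cpu_read(0x14)    # sequential hit: already prefetched
assert cycles == 1
```

The two assertions mirror FIG. 2 versus FIG. 3: the first access pays the full flash latency, and subsequent sequential fetches are served from the buffer at bus speed.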

Claims (6)

1. A FLASH acceleration method for reducing chip power consumption and improving performance, characterized in that it comprises the following steps:
step 1, an acceleration cache controller stores instructions used at high frequency in the flash in a high-frequency instruction cache region;
step 2, the acceleration cache controller stores prefetched flash data in a prefetch data module according to a least recently used (LRU) algorithm;
step 3, when the CPU accesses flash data, it preferentially reads from the prefetch data module;
step 4, when the data accessed by the CPU is not in the prefetch data module, the current jump instruction is stored and the data is re-read from the flash;
step 5, the data read from the flash is stored into the prefetch data module according to the least recently used algorithm, and steps 2-5 are repeated.
2. The FLASH acceleration method for reducing chip power consumption and improving performance according to claim 1, wherein in step 1 the acceleration cache controller stores the instructions used at high frequency in the flash in the high-frequency instruction cache region as follows:
the high-frequency instructions are identified by software profiling before the data is written into the flash, or by reading the flash data in advance, and are stored in the high-frequency instruction cache region.
3. The FLASH acceleration method for reducing chip power consumption and improving performance according to claim 1, wherein in step 2 the acceleration cache controller stores the prefetched flash data in the prefetch data module according to the least recently used (LRU) algorithm as follows:
the prefetch data module is divided into two parts, a sequential prefetch module and an LRU replacement module; whenever the prefetch data module has an empty slot, the cache controller automatically fetches data from the flash and stores it in the free position.
4. The FLASH acceleration method for reducing chip power consumption and improving performance according to claims 1 and 3, wherein in step 3, when the CPU accesses flash data it preferentially reads from the prefetch data module:
the sequential prefetch module and the LRU replacement module are accessed with equal priority.
5. The FLASH acceleration method for reducing chip power consumption and improving performance according to claim 1, wherein in step 4, when the data accessed by the CPU is not in the prefetch data module, the current jump instruction is stored and the data is re-read from the flash:
wait cycles are inserted into the CPU access because the data must be re-read from the flash and flash access is slow.
6. The FLASH acceleration method for reducing chip power consumption and improving performance according to claims 1 and 3, wherein in step 5, the data read from the flash is stored into the prefetch data module according to the least recently used algorithm and steps 2-5 are repeated:
the data read from the flash is first stored in the sequential prefetch module, while data in the LRU replacement module is replaced by data from the sequential prefetch module according to the replacement policy.
CN202010958159.XA 2020-09-14 2020-09-14 FLASH acceleration method for reducing power consumption and improving performance of chip Pending CN112148366A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010958159.XA CN112148366A (en) 2020-09-14 2020-09-14 FLASH acceleration method for reducing power consumption and improving performance of chip

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010958159.XA CN112148366A (en) 2020-09-14 2020-09-14 FLASH acceleration method for reducing power consumption and improving performance of chip

Publications (1)

Publication Number Publication Date
CN112148366A 2020-12-29

Family

ID=73889655

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010958159.XA Pending CN112148366A (en) 2020-09-14 2020-09-14 FLASH acceleration method for reducing power consumption and improving performance of chip

Country Status (1)

Country Link
CN (1) CN112148366A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114281570A (en) * 2021-12-23 2022-04-05 合肥市芯海电子科技有限公司 Embedded control circuit, control method, device and chip
CN116896606A (en) * 2022-12-31 2023-10-17 苏州精源创智能科技有限公司 Method for compressing and reading pictures in embedded application scene

Citations (5)

Publication number Priority date Publication date Assignee Title
CN102566978A (en) * 2010-09-30 2012-07-11 Nxp股份有限公司 Memory accelerator buffer replacement method and system
CN105373338A (en) * 2014-08-20 2016-03-02 深圳市中兴微电子技术有限公司 Control method and controller for FLASH
WO2017211240A1 (en) * 2016-06-07 2017-12-14 华为技术有限公司 Processor chip and method for prefetching instruction cache
CN108874685A (en) * 2018-06-21 2018-11-23 郑州云海信息技术有限公司 The data processing method and solid state hard disk of solid state hard disk
CN110888600A (en) * 2019-11-13 2020-03-17 西安交通大学 Buffer area management method for NAND flash memory

Patent Citations (6)

Publication number Priority date Publication date Assignee Title
CN102566978A (en) * 2010-09-30 2012-07-11 Nxp股份有限公司 Memory accelerator buffer replacement method and system
CN105373338A (en) * 2014-08-20 2016-03-02 深圳市中兴微电子技术有限公司 Control method and controller for FLASH
WO2017211240A1 (en) * 2016-06-07 2017-12-14 华为技术有限公司 Processor chip and method for prefetching instruction cache
CN107479860A (en) * 2016-06-07 2017-12-15 华为技术有限公司 A kind of forecasting method of processor chips and instruction buffer
CN108874685A (en) * 2018-06-21 2018-11-23 郑州云海信息技术有限公司 The data processing method and solid state hard disk of solid state hard disk
CN110888600A (en) * 2019-11-13 2020-03-17 西安交通大学 Buffer area management method for NAND flash memory

Non-Patent Citations (3)

Title
Jiang Jinsong et al.: "Design of an On-chip Flash Acceleration Controller Based on Prefetching and Caching Principles", Computer Engineering & Science, vol. 38, no. 12, pp. 2381-2391 *
Zhong Rui; Fang Wenkai: "Cache Management for Embedded Systems", Computer Knowledge and Technology, no. 12
Lu Zheng; Fan Changjun; Jiang Yunfei: "Optimization of a Flash Translation Algorithm Based on OpenSSD", Computer Systems & Applications, no. 05

Cited By (5)

Publication number Priority date Publication date Assignee Title
CN114281570A (en) * 2021-12-23 2022-04-05 合肥市芯海电子科技有限公司 Embedded control circuit, control method, device and chip
WO2023116093A1 (en) * 2021-12-23 2023-06-29 合肥市芯海电子科技有限公司 Embedded control circuit, control method and apparatus, and chip
CN114281570B (en) * 2021-12-23 2024-05-03 合肥市芯海电子科技有限公司 Embedded control circuit, control method, device and chip
CN116896606A (en) * 2022-12-31 2023-10-17 苏州精源创智能科技有限公司 Method for compressing and reading pictures in embedded application scene
CN116896606B (en) * 2022-12-31 2024-02-06 苏州精源创智能科技有限公司 Method for compressing and reading pictures in embedded application scene

Similar Documents

Publication Publication Date Title
US9396117B2 (en) Instruction cache power reduction
US6978350B2 (en) Methods and apparatus for improving throughput of cache-based embedded processors
JP3739491B2 (en) Harmonized software control of Harvard architecture cache memory using prefetch instructions
US6564313B1 (en) System and method for efficient instruction prefetching based on loop periods
WO2001004763A1 (en) Buffering system bus for external-memory accesses
KR102594288B1 (en) Processing pipeline having first and second processing modes with different performance or energy consumption characteristics
CN109461113B (en) Data structure-oriented graphics processor data prefetching method and device
US6282706B1 (en) Cache optimization for programming loops
CN112148366A (en) FLASH acceleration method for reducing power consumption and improving performance of chip
US20090177842A1 (en) Data processing system and method for prefetching data and/or instructions
WO2017222801A1 (en) Pre-fetch mechanism for compressed memory lines in a processor-based system
WO2023129386A1 (en) Leveraging processing-in-memory (pim) resources to expedite non-pim instructions executed on a host
CN111639042A (en) Method and device for processing consistency of prefetched buffer data
CN116149554A (en) RISC-V and extended instruction based data storage processing system and method thereof
US11449428B2 (en) Enhanced read-ahead capability for storage devices
US9645825B2 (en) Instruction cache with access locking
KR101376884B1 (en) Apparatus for controlling program command prefetch and method thereof
CN105786758A (en) Processor device with data caching function and data read-write method of processor device
JP2008015668A (en) Task management device
JP5116275B2 (en) Arithmetic processing apparatus, information processing apparatus, and control method for arithmetic processing apparatus
JP2007193433A (en) Information processor
CN116700794A (en) Method and system for acquiring instruction to be executed
CN111475203B (en) Instruction reading method for processor and corresponding processor
CN116627335A (en) Low-power eFlash reading acceleration system
JP2008052518A (en) Cpu system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20201229
