CN112148366A - FLASH acceleration method for reducing power consumption and improving performance of chip - Google Patents
- Publication number
- CN112148366A (application CN202010958159.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- flash
- module
- fetching
- prefetching
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/22—Microcontrol or microprogram arrangements
- G06F9/28—Enhancement of operational speed, e.g. by using several microcontrol devices operating in parallel
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3802—Instruction prefetching
- G06F9/3814—Implementation provisions of instruction buffers, e.g. prefetch buffer; banks
Abstract
The invention provides a Flash acceleration controller that combines a data prefetch cache, high-frequency instruction storage, and a controllable replacement algorithm. The acceleration scheme is selected through software configuration or switched adaptively at run time, improving system performance under the different scenario requirements of embedded applications. The main innovations are: 1) Flash prefetch operation: for sequential accesses, prefetching is used to accelerate reads; for non-sequential accesses, an instruction cache with a least-recently-used (LRU) replacement policy reduces the cost of prefetch misses; 2) high-frequency instruction caching: its enable can be controlled independently, reducing interference with instruction fetch and improving performance; 3) independent acceleration of instruction fetch and data fetch, each controlled separately through software.
Description
Technical Field
The invention belongs to the technical field of low-power chips, and relates in particular to a FLASH acceleration method that reduces chip power consumption and improves performance.
Background
Flash is increasingly used as the non-volatile memory for instructions and data in low-power, low-cost embedded SoC (System on Chip) designs. In an embedded chip, the processor handles control, the operating-system platform, and general signal processing, while Flash stores the instructions and data; the processor must access Flash to obtain them and complete the corresponding tasks. Most of the on-chip Flash space typically holds instructions, which the processor accesses frequently. Whereas processor performance can be raised through instruction-level parallelism, superscalar design, and large register files, Flash read speed can be improved by only a few means, such as process improvements. As processor performance grows, the Flash instruction-fetch speed therefore gradually becomes a system bottleneck: the overall performance of the embedded SoC is directly constrained by it. Research into raising the Flash read speed is thus of real significance for overall system performance.
Disclosure of Invention
The invention aims to improve on the prior art by providing a Flash acceleration controller that adopts a prefetch cache, high-frequency instruction storage, and a replacement-algorithm-based acceleration scheme. The scheme is selected through software configuration or switched adaptively at run time, improving system performance under the different scenario requirements of embedded applications.
To solve the above problems, the FLASH acceleration method of the invention for reducing power consumption and improving performance comprises five steps:
Step 1: high-frequency instructions are identified, either by software profiling before the data are written into flash or by pre-reading the flash data, and are stored in the high-frequency instruction cache region.
Step 2: the acceleration cache controller stores prefetched flash data in the prefetch data module according to a least-recently-used (LRU) algorithm. The prefetch data module has two parts: a sequential prefetch module and an LRU replacement module. Whenever the prefetch data module has a free slot, the cache controller automatically fetches data from flash and stores it in the idle position.
Step 3: when the CPU accesses flash data, it reads preferentially from the prefetch data module. The sequential prefetch module and the LRU replacement module are accessed with no priority distinction.
Step 4: when the data the CPU accesses is not in the prefetch data module, the current jump instruction is stored and the data is re-read from flash. Because the data must be read again from the slow flash, wait cycles are inserted into the CPU access.
Step 5: data read from flash is stored into the prefetch data module according to the LRU algorithm: it is first placed in the sequential prefetch module, and data in the LRU replacement module is replaced by data from the sequential prefetch module according to the replacement algorithm. Steps 2-5 then repeat.
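As a rough illustration of steps 2-5, the two-part prefetch data module can be modeled in software. The sketch below is a minimal, hypothetical model (the buffer sizes, the 3-cycle flash latency, and the flash contents are assumptions for illustration, not values from the patent): new lines enter the sequential prefetch part, lines it evicts spill into the LRU replacement part, and a miss inserts wait cycles before refilling.

```python
from collections import OrderedDict

FLASH_WAIT_CYCLES = 3  # assumed latency of a raw flash read (illustrative)

class PrefetchBuffer:
    """Toy model of the two-part prefetch data module (steps 2-5)."""

    def __init__(self, flash, seq_size=4, lru_size=4):
        self.flash = flash            # address -> data, stands in for the flash array
        self.seq = OrderedDict()      # sequential prefetch part (oldest entry first)
        self.lru = OrderedDict()      # LRU replacement part (least recent first)
        self.seq_size, self.lru_size = seq_size, lru_size
        self.wait_cycles = 0          # wait states inserted on misses (step 4)

    def _fill(self, addr):
        # Step 5: new data enters the sequential part; whatever it evicts
        # spills into the LRU part, which then drops its least recent entry.
        if len(self.seq) == self.seq_size:
            old_addr, old_data = self.seq.popitem(last=False)
            self.lru[old_addr] = old_data
            if len(self.lru) > self.lru_size:
                self.lru.popitem(last=False)
        self.seq[addr] = self.flash[addr]

    def read(self, addr):
        # Step 3: both parts are checked with no priority distinction.
        if addr in self.seq:
            return self.seq[addr]
        if addr in self.lru:
            self.lru.move_to_end(addr)         # mark as most recently used
            return self.lru[addr]
        self.wait_cycles += FLASH_WAIT_CYCLES  # step 4: miss, wait and refill
        self._fill(addr)
        return self.seq[addr]
```

In this model, re-reading recently used addresses hits the buffer and pays no wait cycles, which is the acceleration the five steps describe.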
Drawings
FIG. 1 is a block diagram of a flash acceleration method for reducing power consumption and improving performance;
FIG. 2 is a general flash read timing diagram;
FIG. 3 is a flash accelerated read timing diagram.
Detailed Description
The invention will be described in further detail with reference to the following detailed description and accompanying drawings:
the block diagram of the flash acceleration method for reducing power consumption and improving performance is shown in fig. 1; it comprises the flash, a data flow control module, an acceleration cache controller, a flash control module, a bus interface, and an erasing module.
Timing diagrams of the flash acceleration method are shown in fig. 2 and fig. 3: fig. 2 is the timing diagram of an ordinary (delayed) flash read, and fig. 3 is the timing diagram of the accelerated flash read.
The method is realized by the following five steps:
step 1: the high-frequency instruction is counted by software before data is written into flash or flash data is read in advance and stored in a high-frequency instruction cache region.
Step 2: and storing the prefetched flash data in a prefetching data module by the accelerated cache controller according to a least recently used algorithm LRU (least-recently used). The prefetch data module is divided into two parts: a sequential prefetch module, an LRU replacement module. When the cache controller prefetching data module has a null value, the cache controller prefetching data module can automatically fetch data from the flash and store the data in an idle position of the prefetching data module.
And step 3: and when the CPU accesses the flash data, the CPU preferentially reads the flash data from the pre-fetching data module. There is no priority distinction for the CPU access order prefetch module and the LRU replacement module. .
And 4, step 4: and when the data accessed by the CPU is not in the data pre-fetching module, storing the current jump instruction and simultaneously re-reading the data from the flash. Since data needs to be read again from the flash and the flash access speed is slow, a waiting period is inserted for the CPU access.
And 5: and storing the data read from the flash into a data pre-fetching module according to the least recently used algorithm, storing the data read from the flash in a sequential pre-fetching module, and replacing the data of the LRU replacement module with the data of the sequential pre-fetching module according to a storage algorithm. And (5) circulating the steps 2-5.
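The difference between the ordinary read timing of fig. 2 and the accelerated timing of fig. 3 can be approximated by a simple cycle count. The model below rests on invented parameters (a 3-wait-state flash and a 4-word prefetch line are illustrative assumptions, not values from the patent): without acceleration every access pays the full flash latency, while with acceleration only the first word of each prefetched line does.

```python
def read_cycles(addresses, accelerated, flash_wait=3, line_words=4):
    """Count bus cycles for an access stream.

    accelerated=False mirrors fig. 2: every read pays the full flash
    latency. accelerated=True mirrors fig. 3: a wide prefetch line is
    fetched once, and later words of the same line cost a single cycle.
    flash_wait and line_words are illustrative assumptions."""
    cycles, fetched_lines = 0, set()
    for addr in addresses:
        line = addr // line_words
        if not accelerated:
            cycles += 1 + flash_wait       # every access waits on flash
        elif line in fetched_lines:
            cycles += 1                    # served from the prefetch buffer
        else:
            cycles += 1 + flash_wait       # first access of a new line
            fetched_lines.add(line)
    return cycles

stream = list(range(16))                   # straight-line instruction fetch
plain, fast = read_cycles(stream, False), read_cycles(stream, True)
```

Under these assumed parameters, a 16-word straight-line fetch costs 64 cycles without acceleration and 28 with it; the savings in real hardware depend on the actual flash latency and prefetch line width.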
Claims (6)
1. A FLASH acceleration method for reducing chip power consumption and improving performance, characterized by comprising the following steps:
step 1, an acceleration cache controller stores instructions used at high frequency in flash in a high-frequency instruction cache region;
step 2, the acceleration cache controller stores prefetched flash data in a prefetch data module according to a least-recently-used (LRU) algorithm;
step 3, when the CPU accesses flash data, it reads preferentially from the prefetch data module buffer;
step 4, when the data accessed by the CPU is not in the prefetch data module, the current jump instruction is stored and the data is simultaneously re-read from flash;
step 5, the data read from flash is stored into the prefetch data module according to the LRU algorithm, and steps 2-5 repeat.
2. The FLASH acceleration method for reducing chip power consumption and improving performance according to claim 1, wherein in step 1 the acceleration cache controller stores instructions used at high frequency in flash in the high-frequency instruction cache region:
the high-frequency instructions are identified by software profiling before the data are written into flash, or by pre-reading the flash data, and are stored in the high-frequency instruction cache region.
3. The FLASH acceleration method for reducing chip power consumption and improving performance according to claim 1, wherein in step 2 the acceleration cache controller stores the prefetched flash data in the prefetch data module according to a least-recently-used (LRU) algorithm:
the prefetch data module has two parts, a sequential prefetch module and an LRU replacement module; whenever the prefetch data module has a free slot, the cache controller automatically fetches data from flash and stores it in the idle position.
4. The FLASH acceleration method for reducing chip power consumption and improving performance according to claims 1 and 3, wherein in step 3, when the CPU accesses flash data, it reads preferentially from the prefetch data module:
the sequential prefetch module and the LRU replacement module are accessed with no priority distinction.
5. The FLASH acceleration method for reducing chip power consumption and improving performance according to claim 1, wherein in step 4, when the data accessed by the CPU is not in the prefetch data module, the current jump instruction is stored and the data is simultaneously re-read from flash:
wait cycles are inserted into the CPU access because the data must be read again from the slow flash.
6. The FLASH acceleration method for reducing chip power consumption and improving performance according to claims 1 and 3, wherein in step 5 the data read from flash is stored into the prefetch data module according to the LRU algorithm and steps 2-5 repeat:
the data read from flash is first placed in the sequential prefetch module, and data in the LRU replacement module is simultaneously replaced by data from the sequential prefetch module according to the replacement algorithm.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010958159.XA CN112148366A (en) | 2020-09-14 | 2020-09-14 | FLASH acceleration method for reducing power consumption and improving performance of chip |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010958159.XA CN112148366A (en) | 2020-09-14 | 2020-09-14 | FLASH acceleration method for reducing power consumption and improving performance of chip |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112148366A true CN112148366A (en) | 2020-12-29 |
Family
ID=73889655
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010958159.XA Pending CN112148366A (en) | 2020-09-14 | 2020-09-14 | FLASH acceleration method for reducing power consumption and improving performance of chip |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112148366A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114281570A (en) * | 2021-12-23 | 2022-04-05 | 合肥市芯海电子科技有限公司 | Embedded control circuit, control method, device and chip |
CN116896606A (en) * | 2022-12-31 | 2023-10-17 | 苏州精源创智能科技有限公司 | Method for compressing and reading pictures in embedded application scene |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102566978A (en) * | 2010-09-30 | 2012-07-11 | Nxp股份有限公司 | Memory accelerator buffer replacement method and system |
CN105373338A (en) * | 2014-08-20 | 2016-03-02 | 深圳市中兴微电子技术有限公司 | Control method and controller for FLASH |
WO2017211240A1 (en) * | 2016-06-07 | 2017-12-14 | 华为技术有限公司 | Processor chip and method for prefetching instruction cache |
CN108874685A (en) * | 2018-06-21 | 2018-11-23 | 郑州云海信息技术有限公司 | The data processing method and solid state hard disk of solid state hard disk |
CN110888600A (en) * | 2019-11-13 | 2020-03-17 | 西安交通大学 | Buffer area management method for NAND flash memory |
- 2020
- 2020-09-14 CN CN202010958159.XA patent/CN112148366A/en active Pending
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102566978A (en) * | 2010-09-30 | 2012-07-11 | Nxp股份有限公司 | Memory accelerator buffer replacement method and system |
CN105373338A (en) * | 2014-08-20 | 2016-03-02 | 深圳市中兴微电子技术有限公司 | Control method and controller for FLASH |
WO2017211240A1 (en) * | 2016-06-07 | 2017-12-14 | 华为技术有限公司 | Processor chip and method for prefetching instruction cache |
CN107479860A (en) * | 2016-06-07 | 2017-12-15 | 华为技术有限公司 | A kind of forecasting method of processor chips and instruction buffer |
CN108874685A (en) * | 2018-06-21 | 2018-11-23 | 郑州云海信息技术有限公司 | The data processing method and solid state hard disk of solid state hard disk |
CN110888600A (en) * | 2019-11-13 | 2020-03-17 | 西安交通大学 | Buffer area management method for NAND flash memory |
Non-Patent Citations (3)
Title |
---|
Jiang Jinsong et al.: "Design of an on-chip Flash acceleration controller based on prefetching and caching principles", Computer Engineering & Science, vol. 38, no. 12, pages 2381-2391 *
Zhong Rui; Fang Wenkai: "Cache management in embedded systems", Computer Knowledge and Technology, no. 12 *
Lu Zheng; Fan Changjun; Jiang Yunfei: "Optimization of flash translation algorithms based on OpenSSD", Computer Systems & Applications, no. 05 *
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114281570A (en) * | 2021-12-23 | 2022-04-05 | 合肥市芯海电子科技有限公司 | Embedded control circuit, control method, device and chip |
WO2023116093A1 (en) * | 2021-12-23 | 2023-06-29 | 合肥市芯海电子科技有限公司 | Embedded control circuit, control method and apparatus, and chip |
CN114281570B (en) * | 2021-12-23 | 2024-05-03 | 合肥市芯海电子科技有限公司 | Embedded control circuit, control method, device and chip |
CN116896606A (en) * | 2022-12-31 | 2023-10-17 | 苏州精源创智能科技有限公司 | Method for compressing and reading pictures in embedded application scene |
CN116896606B (en) * | 2022-12-31 | 2024-02-06 | 苏州精源创智能科技有限公司 | Method for compressing and reading pictures in embedded application scene |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9396117B2 (en) | Instruction cache power reduction | |
US6978350B2 (en) | Methods and apparatus for improving throughput of cache-based embedded processors | |
JP3739491B2 (en) | Harmonized software control of Harvard architecture cache memory using prefetch instructions | |
US6564313B1 (en) | System and method for efficient instruction prefetching based on loop periods | |
EP1110151A1 (en) | Buffering system bus for external-memory accesses | |
KR102594288B1 (en) | Processing pipeline having first and second processing modes with different performance or energy consumption characteristics | |
CN109461113B (en) | Data structure-oriented graphics processor data prefetching method and device | |
US6282706B1 (en) | Cache optimization for programming loops | |
CN112148366A (en) | FLASH acceleration method for reducing power consumption and improving performance of chip | |
US11921634B2 (en) | Leveraging processing-in-memory (PIM) resources to expedite non-PIM instructions executed on a host | |
US20090177842A1 (en) | Data processing system and method for prefetching data and/or instructions | |
CN111639042A (en) | Method and device for processing consistency of prefetched buffer data | |
CN116149554A (en) | RISC-V and extended instruction based data storage processing system and method thereof | |
US11449428B2 (en) | Enhanced read-ahead capability for storage devices | |
US9645825B2 (en) | Instruction cache with access locking | |
JP2008015668A (en) | Task management device | |
JP5116275B2 (en) | Arithmetic processing apparatus, information processing apparatus, and control method for arithmetic processing apparatus | |
JP2007193433A (en) | Information processor | |
CN116700794A (en) | Method and system for acquiring instruction to be executed | |
CN111475203B (en) | Instruction reading method for processor and corresponding processor | |
CN116627335A (en) | Low-power eFlash reading acceleration system | |
JP2008052518A (en) | Cpu system | |
JPH0651982A (en) | Arithmetic processing unit | |
CN105843360A (en) | Apparatus and method for reducing power consumption of instruction cache | |
US20120151150A1 (en) | Cache Line Fetching and Fetch Ahead Control Using Post Modification Information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20201229 |