CN109344095B - Flash memory thermal data identification method - Google Patents

Publication number
CN109344095B
Authority
CN
China
Prior art keywords
data
logical address
flash memory
hot
pages
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811085164.3A
Other languages
Chinese (zh)
Other versions
CN109344095A (en)
Inventor
李虎
罗胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Demingli Electronics Co Ltd
Original Assignee
Shenzhen Demingli Electronics Co Ltd
Application filed by Shenzhen Demingli Electronics Co Ltd
Priority to CN201811085164.3A
Publication of CN109344095A
Application granted
Publication of CN109344095B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10 Address translation
    • G06F12/1027 Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
    • G06F12/1045 Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] associated with a data cache
    • G06F12/1054 Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] associated with a data cache the data cache being concurrently physically addressed
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10 Address translation
    • G06F12/1009 Address translation using page tables, e.g. page table structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10 Address translation
    • G06F12/1027 Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
    • G06F12/1045 Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] associated with a data cache
    • G06F12/1063 Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] associated with a data cache the data cache being concurrently virtually addressed

Abstract

An embodiment of the application provides a flash memory hot data identification method comprising the following steps. S100: searching the flash memory for physical data blocks in which the ratio of the number of valid data pages to the total number of pages of the physical data block is greater than a preset value. S110: extracting the logical addresses of the invalid data pages in all the physical data blocks found in step S100. S120: when a logical address extracted in S110 has not already been determined to be a hot data logical address, setting it as a hot data logical address. The method can identify hot data inside cold data blocks, refines the classification produced by a write-count threshold, and thereby further improves garbage collection efficiency.

Description

Flash memory thermal data identification method
Technical Field
The embodiments of the application relate to the technical field of stored data processing, and in particular provide a flash memory hot data identification method.
Background
Compared with a traditional hard disk, flash memory offers higher read and write speeds and lower power consumption, and it is being used more and more widely as manufacturing processes improve and costs fall.
Because a flash memory page cannot be overwritten in place once it has been written, updated data is written elsewhere, and the stale pages must later be reclaimed by consolidating the remaining valid data into additional flash memory blocks. This process is called flash garbage collection.
Cold data is data that is rarely updated after it is written to flash memory. Once consolidated it is unlikely to be updated again, so garbage collection yields the greatest benefit on it. Hot data, by contrast, is rewritten frequently after it is written to flash memory; it has to be collected again shortly after each collection, which reduces collection efficiency.
In the prior art, cold and hot data are identified by counting the number of writes to each logical address: hot data is written many times and cold data few times. However, file systems differ from one system to another (for example FAT, FAT32, NTFS and EXT4), so a judgment based on write counts is only relative and may cause hot data to be classified as cold data.
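The write-count approach described above can be illustrated with a minimal Python sketch. The class name, the threshold value and the sample addresses are all hypothetical, since the text does not specify them; the point is only to show how an address that is rewritten a few times can stay below the threshold and be treated as cold.

```python
from collections import Counter

# Hypothetical threshold; the source text does not give a concrete value.
HOT_WRITE_THRESHOLD = 4


class WriteCountClassifier:
    """Prior-art style classifier: an address is hot once its
    write count exceeds a fixed threshold."""

    def __init__(self, threshold=HOT_WRITE_THRESHOLD):
        self.threshold = threshold
        self.write_counts = Counter()

    def record_write(self, lpa):
        # Count one more write to this logical page address (LPA).
        self.write_counts[lpa] += 1

    def is_hot(self, lpa):
        return self.write_counts[lpa] > self.threshold


clf = WriteCountClassifier()
for _ in range(10):
    clf.record_write(0x10)  # rewritten often: exceeds the threshold
for _ in range(2):
    clf.record_write(0x20)  # rewritten occasionally: stays below it

print(clf.is_hot(0x10), clf.is_hot(0x20))
```

Address 0x20 is updated after its first write, yet the classifier still reports it as cold. This is the kind of relative judgment the passage above criticizes.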
Disclosure of Invention
In view of the above, an object of the present invention is to provide a flash memory hot data identification method that identifies hot data inside cold data blocks, refines the classification obtained from a write-count threshold, and improves garbage collection efficiency.
To solve the above technical problem, a flash memory hot data identification method is provided, which includes the following steps:
S100: searching the flash memory for physical data blocks in which the ratio of the number of valid data pages to the total number of pages of the physical data block is greater than a preset value;
S110: extracting the logical addresses of the invalid data pages in all the physical data blocks found in step S100;
S120: when a logical address of an invalid data page extracted in S110 has not been determined to be a hot data logical address, setting that logical address as a hot data logical address.
The preset value is four fifths, three quarters or one half.
S120 includes a step of adding the logical address of the extracted invalid data page to a hot data logical address list.
The flash memory hot data identification method provided by the application has the following technical effects. Cold and hot data are re-screened at a later stage by examining the ratio of the number of valid data pages to the total number of pages in each physical data block. This overcomes the mixing of cold and hot data caused by misjudgments based on write counts and further optimizes the classification method. In particular, as the system keeps running, the method combined with the write-count threshold identifies cold and hot data accurately, further improves garbage collection efficiency, and reduces the pressure on garbage collection.
Drawings
To illustrate the embodiments of the present application and the prior art more clearly, the drawings used in their description are briefly introduced below. The drawings show only some embodiments of the present application, and those skilled in the art can derive other drawings from them.
FIG. 1 is a schematic diagram of the physical structure of a flash memory according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a physical data block of the flash memory according to an embodiment of the present application;
FIG. 3 is a flow chart of a method for identifying hot data in a flash memory (identifying hot data within cold data) according to an embodiment of the present application;
FIG. 4 is a diagram showing the step of adding the logical address of an extracted invalid data page to a hot data logical address list in an embodiment of the present application.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the following description, suffixes such as "module", "component", or "unit" used to denote elements serve only to facilitate the explanation of the present invention and have no specific meaning in themselves. Thus, "module", "component" and "unit" may be used interchangeably.
A cold data block is a data block whose data is rarely updated; usually most of its data is valid and only a little is invalid. A hot data block, by contrast, is a data block whose data is frequently updated; usually most of its data is invalid and only a little is valid.
After intensive study, the inventors of the present application found that judging cold and hot data by counting the number of writes to each logical address raises the problem of choosing a judgment standard: hot data is simply whatever has been written more than a threshold number of times. Because file systems differ from one system to another, this method can only make a coarse distinction. In essence, the purpose of distinguishing cold from hot data is to ensure that large amounts of data that need no updating are stored together and that frequently updated data is stored together. In particular, hot data whose logical address has a low write count is written into cold data blocks, interspersed among a large amount of cold data. Such hot data cannot be identified by write count alone, and the classification method needs further optimization.
As shown in FIG. 1 and FIG. 2, the flash memory has physical data blocks BLK 0, BLK 1, BLK 2, BLK N, … BLK M (BLK stands for Block). The data in page N+2 of the physical data block BLK N was written into a cold data block because its write count was below the threshold for hot data, and was later updated at another physical address. BLK N is a cold data block: most of its data is valid and very little is invalid. In FIG. 2 only one page, page N+2, holds invalid data. As is well known, when the same logical address is written many times (more than the hot data threshold), it is determined to be a hot data address; only the most recently written page for that address is valid and the earlier pages are invalidated by the updates, so a block holding such data can be reclaimed after moving only its few valid pages. By contrast, once the invalid data of page N+2 appears in BLK N, 255 valid data pages have to be moved in order to reclaim that one physical data block. In other words, moving 255 valid data pages gains only one usable page, so collection efficiency is very low.
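The cost described in this paragraph can be written out as simple arithmetic. The sketch below assumes a block of 256 pages, a figure inferred from the 255 valid pages plus one invalid page mentioned in the text; the function name is hypothetical.

```python
def gc_cost(total_pages, valid_pages):
    """Return (pages_moved, pages_gained) for reclaiming one block.

    Every valid page must be copied out before the block is erased,
    and only the formerly invalid pages become newly usable space.
    """
    if not 0 <= valid_pages <= total_pages:
        raise ValueError("valid_pages must be between 0 and total_pages")
    return valid_pages, total_pages - valid_pages


# Assumed block size of 256 pages (255 valid + 1 invalid, as in the text).
moved, gained = gc_cost(256, 255)
print(f"moved {moved} pages to gain {gained} page(s)")

# For comparison, a block with a single valid page left.
moved_hot, gained_hot = gc_cost(256, 1)
print(f"moved {moved_hot} page(s) to gain {gained_hot} pages")
```

Under these assumptions the nearly full block costs 255 page copies for one page of space, while the nearly empty block costs one copy for 255 pages of space.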
To address the low collection efficiency caused by hot data with a low logical address write count being interspersed among cold data in cold data blocks, the following flash memory hot data identification method is provided. As shown in FIG. 3, the method includes the following steps:
S100: searching the flash memory for physical data blocks in which the ratio of the valid page count (hereinafter abbreviated as VPC) to the total number of pages of the physical data block is greater than a preset value;
S110: extracting the logical addresses of the invalid data pages in all the physical data blocks found in step S100;
S120: when a logical address of an invalid data page extracted in S110 has not been determined to be a hot data logical address, setting that logical address as a hot data logical address.
The preset value in S100 is, for example, four fifths, three quarters or one half.
Steps S110 and S120 may be performed on the physical data blocks one by one, in descending order of the ratio of VPC to total number of pages, until all invalid data pages of the found physical data blocks have been traversed.
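Steps S100 to S120 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the controller firmware: the `Block` model, the function name `identify_hot_addresses`, the 256-page block size and the sample addresses are all hypothetical, and the preset value of three quarters is one of the example values given in the text.

```python
from dataclasses import dataclass, field
from fractions import Fraction


@dataclass
class Block:
    """Hypothetical in-memory model of one physical data block."""
    block_id: int
    total_pages: int
    # Logical addresses recorded for the block's invalid pages.
    invalid_lpas: list = field(default_factory=list)

    @property
    def vpc(self):
        # Valid page count = all pages minus the invalid ones.
        return self.total_pages - len(self.invalid_lpas)

    @property
    def valid_ratio(self):
        return Fraction(self.vpc, self.total_pages)


def identify_hot_addresses(blocks, hot_lpas, preset=Fraction(3, 4)):
    """Apply S100-S120; return the addresses newly marked as hot."""
    # S100: keep blocks whose valid-page ratio exceeds the preset value.
    candidates = [b for b in blocks if b.valid_ratio > preset]
    # Visit candidates in descending order of valid-page ratio.
    candidates.sort(key=lambda b: b.valid_ratio, reverse=True)
    newly_hot = []
    for block in candidates:
        # S110: extract the logical addresses of the invalid pages.
        for lpa in block.invalid_lpas:
            # S120: mark the address hot unless it already is.
            if lpa not in hot_lpas:
                hot_lpas.add(lpa)
                newly_hot.append(lpa)
    return newly_hot


blocks = [
    Block(0, 256, invalid_lpas=list(range(1000, 1200))),  # mostly invalid
    Block(1, 256, invalid_lpas=[42]),                      # one stale page
    Block(2, 256, invalid_lpas=[7, 42]),                   # two stale pages
]
hot = {7}  # address already classified as hot by the write-count threshold
found = identify_hot_addresses(blocks, hot)
print(found, sorted(hot))
```

Block 0 is skipped because its valid-page ratio is below the preset value, address 7 is skipped because it is already hot, and address 42 is the only one newly marked. A set is used for the hot addresses here purely for fast membership checks.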
As shown in FIG. 1, the flash memory has physical data blocks BLK 0, BLK 1, BLK 2, BLK N, … BLK M (BLK stands for Block). The physical data block BLK N is a cold data block: most of its pages hold valid data (valid pages) and only one, page N+2, holds invalid data (an invalid page).
Combining this flash memory hot data identification method with the write-count threshold classification further optimizes the physical data blocks that have already been classified: the blocks are searched and screened, and hot data addresses that were misjudged as cold are found and corrected.
In one specific application, the VPCs of all physical data blocks in the flash memory are first sorted, since the VPC of a cold data block is usually larger. Next, the cold data blocks whose ratio of VPC to total number of pages is greater than the preset value are selected; these are the physical data blocks in which misjudged hot data has been updated. Then the logical addresses of the invalid data pages in all the physical data blocks found in the previous step are read; a logical address is generally stored in the redundant space of its physical page. Each logical address is then checked to see whether it has already been determined to be a hot data logical address, and if not, it is added to the hot data logical address list. The previous step is repeated until the logical addresses of the invalid data pages of all cold data blocks whose VPC to total page ratio exceeds the preset value have been set as hot data logical addresses.
In the step of checking whether a logical address has already been determined to be a hot data logical address and, if not, adding it to the hot data logical address list, the list resides in the RAM of the controller, as shown in FIG. 4. By way of example, the diagram shows a stored hot data logical address record LPA X. Once a new hot data logical address (LPA X+1) is found by scanning, it is appended to the list, and data subsequently written to that logical address goes into a hot data block. The classification thus becomes more accurate and collection efficiency can be further improved.
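A RAM-resident hot address list and the routing of later writes might look like the minimal Python sketch below. The class `HotAddressList`, the function `choose_target`, the block labels and the value chosen for X are hypothetical illustrations; the text itself only states that the list lives in controller RAM and that data for listed addresses is written to a hot data block.

```python
class HotAddressList:
    """Hypothetical RAM-resident list of hot logical page addresses."""

    def __init__(self, initial=()):
        self._order = []    # entries in the order they were appended
        self._seen = set()  # fast membership test
        for lpa in initial:
            self.add(lpa)

    def add(self, lpa):
        """Append lpa if absent; return True when it was newly added."""
        if lpa in self._seen:
            return False
        self._seen.add(lpa)
        self._order.append(lpa)
        return True

    def __contains__(self, lpa):
        return lpa in self._seen

    def entries(self):
        return list(self._order)


def choose_target(lpa, hot_list):
    """Route a write to the hot or the cold data block (labels assumed)."""
    return "hot_block" if lpa in hot_list else "cold_block"


X = 100                         # stand-in for the recorded address LPA X
hot_list = HotAddressList([X])  # list already holds LPA X
added = hot_list.add(X + 1)     # scan finds a new hot address
again = hot_list.add(X + 1)     # a repeated find changes nothing
print(hot_list.entries(), choose_target(X + 1, hot_list))
```

Keeping both an ordered list and a set is a design choice made here so that appends preserve order while lookups on the write path stay cheap; a real controller would size and store this structure according to its own RAM budget.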
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (2)

1. A flash memory thermal data identification method is characterized by comprising the following steps:
S100: searching the flash memory for physical data blocks in which the ratio of the number of valid data pages to the total number of pages of the physical data block is greater than a preset value;
S110: extracting the logical addresses of the invalid data pages in all the physical data blocks found in step S100;
S120: when the logical address of an invalid data page extracted in S110 has not been determined to be a hot data logical address, setting the logical address of the extracted invalid data page as a hot data logical address;
wherein the preset value is four fifths, or three quarters, or one half.
2. The method of claim 1, wherein S120 comprises the step of adding the logical address of the extracted invalid data page to a hot data logical address list.
CN201811085164.3A 2018-09-18 2018-09-18 Flash memory thermal data identification method Active CN109344095B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811085164.3A CN109344095B (en) 2018-09-18 2018-09-18 Flash memory thermal data identification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811085164.3A CN109344095B (en) 2018-09-18 2018-09-18 Flash memory thermal data identification method

Publications (2)

Publication Number Publication Date
CN109344095A CN109344095A (en) 2019-02-15
CN109344095B CN109344095B (en) 2021-05-04

Family

ID=65305467

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811085164.3A Active CN109344095B (en) 2018-09-18 2018-09-18 Flash memory thermal data identification method

Country Status (1)

Country Link
CN (1) CN109344095B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103514106A (en) * 2012-06-20 2014-01-15 北京神州泰岳软件股份有限公司 Method for caching data
CN105677242A (en) * 2015-12-31 2016-06-15 杭州华为数字技术有限公司 Hot and cold data separation method and device
US9384138B2 (en) * 2014-05-09 2016-07-05 Avago Technologies General Ip (Singapore) Pte. Ltd. Temporal tracking of cache data
CN106598878A (en) * 2016-12-27 2017-04-26 湖南国科微电子股份有限公司 Method for separating cold data and hot data of solid state disk

Also Published As

Publication number Publication date
CN109344095A (en) 2019-02-15

Similar Documents

Publication Publication Date Title
CN100478946C (en) Method and apparatus for file system snapshot persistence
US9348799B2 (en) Forming a master page for an electronic document
CN104021161A (en) Cluster storage method and device
US10776345B2 (en) Efficiently updating a secondary index associated with a log-structured merge-tree database
CN109902090B (en) Method and device for acquiring field name
CN106486167B (en) Improve the method and system that flash memory is removed
US10628487B2 (en) Method for hash collision detection based on the sorting unit of the bucket
CN107092566B (en) Data storage device and data maintenance method thereof
CN105589894B (en) Document index establishing method and device and document retrieval method and device
CN110888837B (en) Object storage small file merging method and device
CN107515931B (en) Repeated data detection method based on clustering
CN104750791A (en) Image retrieval method and device
CN104899114A (en) Continuous time data protection method on solid state drive
CN102959548A (en) Data storage method, search method and device
CN111142794A (en) Method, device and equipment for classified storage of data and storage medium
US20110107056A1 (en) Method for determining data correlation and a data processing method for a memory
CN109344095B (en) Flash memory thermal data identification method
CN112905496A (en) Garbage recycling method and device, readable storage medium and electronic equipment
CN104182479A (en) Method and device for processing information
CN103714121A (en) Index record management method and device
CN112597070B (en) Object recovery method and device
CN114327252A (en) Data reduction in block-based storage systems using content-based block alignment
CN114661859A (en) Sensitive keyword retrieval method, device, medium and product based on system index
CN109388579A (en) A kind of flash memory cold data recognition methods
CN112579763A (en) Document pushing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 701, 7th floor, Smart Valley Innovation Park, 1010 Bulong Road, Minzhi Street, Longhua District, Shenzhen City, Guangdong Province

Applicant after: Shenzhen deminli Technology Co., Ltd

Address before: 518000, 701, building 7, wisdom Valley Innovation Park, people's street, Longhua District, Shenzhen, Guangdong

Applicant before: SHENZHEN DEMINGLI ELECTRONICS Co.,Ltd.

GR01 Patent grant
CP02 Change in the address of a patent holder

Address after: 2501, 2401, block a, building 1, Shenzhen new generation industrial park, 136 Zhongkang Road, Meidu community, Meilin street, Futian District, Shenzhen, Guangdong 518000

Patentee after: Shenzhen deminli Technology Co.,Ltd.

Address before: 518000 room 701, 7 / F, wisdom Valley Innovation Park, 1010 Bulong Road, Minzhi street, Longhua District, Shenzhen City, Guangdong Province

Patentee before: Shenzhen deminli Technology Co.,Ltd.
