CN110795396A - Cold and hot data distinguishing method and system and storage medium thereof - Google Patents
Cold and hot data distinguishing method and system and storage medium thereof
- Publication number: CN110795396A
- Application number: CN201911019883.XA
- Authority: CN (China)
- Prior art keywords: data, cold, hot, extension, mark
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/14—Details of searching files based on file metadata
- G06F16/148—File search processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Library & Information Science (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a cold and hot data distinguishing method and system, belonging to the technical field of memory data storage. The method comprises the following steps: creating a cold data area, a warm data area and a hot data area for storing data; receiving data; judging the data length of the data; when the data length is smaller than a first limit value, treating the data as hot data and storing it in the hot data area; when the data length is larger than the first limit value and smaller than a second limit value, treating the data as warm data and storing it in the warm data area; and when the data length is larger than the second limit value, treating the data as cold data and storing it in the cold data area. For accurately locating cold data written at medium frequency across multiple data streams, the address-continuity method gives a better result.
Description
Technical Field
The invention relates to the technical field of memory data storage, and in particular to a cold and hot data distinguishing method and system.
Background
Cold data is state data from a long time ago, such as user-profile data; common examples include bank records, tax records, medical files, and film and television data. Cold data is offline data that does not require real-time access; it is kept as a backup for disaster recovery or must be retained for a period of time to comply with legal regulations.
Warm data is non-instantaneous state and behavior data. Put simply, mixing hot and cold data together yields warm data. For example, if a user has recently become particularly interested in a certain type of topic (hot data), in sharp contrast to the user's past behavior (cold data), this indicates that the user is in the growth period of a new user (warm data), and the operator can consider a corresponding strategy to raise activity and promote conversion.
Hot data refers to instantaneous location state, transactions and browsing behavior, such as the current geographic location or the mobile phone application that is active at a particular time; it characterizes "what is being done at what location". In addition, some information recorded in real time, such as the operations a user has just performed after opening certain software or a website, can be accumulated through a third-party platform, and developers can also accumulate it from user behavior.
If cold and hot data are mixed together, then during garbage collection reclaiming the cold data also forces the hot data stored with it to be reclaimed, which causes write amplification and shortens the service life of the storage device.
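The effect of mixing can be illustrated with a small count of relocated pages. The following is a minimal sketch under stated assumptions, not a model of any particular storage device: the block layout, the page names and the helper `gc_relocations` are all hypothetical. It assumes that erasing a block requires first copying out every page in it that is still valid, and those extra copies are what write amplification refers to here.

```python
def gc_relocations(blocks, invalidated):
    """Count the still-valid pages that must be copied elsewhere when every
    block containing at least one invalidated page is erased."""
    moved = 0
    for block in blocks:
        if any(page in invalidated for page in block):
            moved += sum(1 for page in block if page not in invalidated)
    return moved


# Hypothetical workload: four hot pages (soon overwritten) and four cold pages.
hot_pages = ["h0", "h1", "h2", "h3"]
cold_pages = ["c0", "c1", "c2", "c3"]
invalidated = set(hot_pages)  # the hot pages have been rewritten elsewhere

mixed_layout = [["h0", "c0", "h1", "c1"], ["h2", "c2", "h3", "c3"]]
separated_layout = [hot_pages, cold_pages]

mixed_cost = gc_relocations(mixed_layout, invalidated)          # 4 extra page copies
separated_cost = gc_relocations(separated_layout, invalidated)  # 0 extra page copies
```

In the mixed layout each erased block still holds two valid cold pages, so four pages are rewritten; in the separated layout the hot block is entirely invalid and can be erased without copying anything.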
Disclosure of Invention
The invention aims to provide a cold and hot data distinguishing method which has the advantage of being convenient for distinguishing cold and hot data.
The above object of the present invention is achieved by the following technical solutions:
a cold and hot data distinguishing method, the method comprising: judging at a file system layer; receiving data, wherein the data comprises an extension; acquiring the extension of the data; judging whether the extension exists in a preset cold data extension library or a preset hot data extension library; if the extension exists in the cold data extension library, adding a cold mark to the data; if the extension exists in the hot data extension library, adding a hot mark to the data; if the extension exists in neither the cold data extension library nor the hot data extension library, adding a warm mark to the data; if the file system layer cannot obtain the extension, judging at an algorithm layer; judging the data length of the data; when the data length is smaller than a first limit value, adding a hot mark to the data; when the data length is larger than the first limit value and smaller than a second limit value, adding a warm mark to the data; and when the data length is larger than the second limit value, adding a cold mark to the data.
Further, the method comprises: a table is maintained that records the address range of the hot data.
Further, the method comprises: the information of the data is transferred through the reserved fields in the protocol.
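The source does not name the protocol or the reserved field, so the following is only a sketch of the general idea: a temperature hint travels in an otherwise unused byte of a command header. The 16-byte header layout, the opcode value, the two-bit encoding and the helpers `pack_write` and `unpack_mark` are all hypothetical and do not correspond to any real storage protocol.

```python
import struct

# Hypothetical 16-byte write-command header (little-endian, no padding):
# opcode (1 byte), reserved (1 byte), unused (2 bytes), length (4 bytes), lba (8 bytes)
HEADER = struct.Struct("<BBHIQ")

OPCODE_WRITE = 0x01  # hypothetical opcode value

# Assumed two-bit encoding placed in the reserved byte; 0b00 means "no hint".
MARKS = {"hot": 0b01, "warm": 0b10, "cold": 0b11}
NAMES = {value: name for name, value in MARKS.items()}


def pack_write(lba, length, mark=None):
    """Build a header, carrying the temperature mark in the reserved byte."""
    reserved = MARKS.get(mark, 0)
    return HEADER.pack(OPCODE_WRITE, reserved, 0, length, lba)


def unpack_mark(header):
    """Recover the temperature mark on the receiving side, or None if absent."""
    _opcode, reserved, _unused, _length, _lba = HEADER.unpack(header)
    return NAMES.get(reserved & 0b11)
```

A receiver that does not understand the hint can ignore the reserved byte, which is the usual reason for carrying optional information in a reserved field.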
The second purpose of the present invention is to provide a cold and hot data distinguishing system, which has the advantages of convenient data partitioning and reduced write amplification during data recovery.
The second purpose of the invention is realized by the following technical scheme:
a cold and hot data distinguishing system, the system comprising: a file system layer, used for receiving data, wherein the data comprises an extension; acquiring the extension of the data; judging whether the extension exists in a preset cold data extension library or a preset hot data extension library; if the extension exists in the cold data extension library, adding a cold mark to the data; if the extension exists in the hot data extension library, adding a hot mark to the data; if the extension exists in neither the cold data extension library nor the hot data extension library, adding a warm mark to the data; and an algorithm layer, used for judging the data length of the data; when the data length is smaller than a first limit value, adding a hot mark to the data; when the data length is larger than the first limit value and smaller than a second limit value, adding a warm mark to the data; and when the data length is larger than the second limit value, adding a cold mark to the data.
Further, the file system layer is further configured to maintain a table for recording the address range of the hot data.
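The source does not describe how the hot-data address table is organized. One possible sketch, assuming logical block addresses and half-open ranges, keeps a sorted list of non-overlapping intervals and merges neighbours on insertion; the class name `HotRangeTable` and its methods are hypothetical.

```python
import bisect


class HotRangeTable:
    """Sorted, non-overlapping half-open [start, end) ranges of hot addresses."""

    def __init__(self):
        self._ranges = []

    def add(self, start, length):
        """Record a hot write, merging with overlapping or adjacent ranges."""
        end = start + length
        kept = []
        for s, e in self._ranges:
            if e < start or s > end:
                kept.append((s, e))  # disjoint and not adjacent: keep as-is
            else:
                start, end = min(s, start), max(e, end)  # absorb into new range
        kept.append((start, end))
        kept.sort()
        self._ranges = kept

    def is_hot(self, lba):
        """Return True if the address falls inside a recorded hot range."""
        # Index of the last range whose start is <= lba.
        i = bisect.bisect_right(self._ranges, (lba, float("inf"))) - 1
        return i >= 0 and self._ranges[i][0] <= lba < self._ranges[i][1]

    def ranges(self):
        return list(self._ranges)


table = HotRangeTable()
table.add(100, 8)   # covers addresses 100..107
table.add(108, 8)   # adjacent, so it merges into 100..115
table.add(500, 4)   # separate range 500..503
```

Merging adjacent writes keeps the table small when hot data is written sequentially, at the cost of a linear scan per insertion in this simple form.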
The invention also aims to provide a computer readable storage medium which has the advantages of conveniently partitioning data and reducing write amplification during data recovery.
The third purpose of the invention is realized by the following technical scheme:
a computer-readable storage medium, in which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the above-mentioned cold-hot data discrimination method.
In conclusion, the invention has the following beneficial effects:
the data are divided into cold data, warm data and hot data by judging the extension and the data length of the data, so that cold and hot data can be stored separately when storage operations are carried out according to the temperature of the data.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The technical solutions of the embodiments of the present invention will be described below with reference to the accompanying drawings.
In the prior art, hot data and cold data are distinguished by address access frequency, and this address-frequency method is more effective for hot data; for accurately locating cold data written at medium frequency across multiple data streams, the address-continuity method gives a better result.
In specific scenarios (such as a dashboard camera), a stream that is written continuously among multiple data streams can be better identified by the continuity of the written data, which avoids mixing cold and hot data, increases the continuity of logical addresses, and reduces fragmentation in the storage system. In systems where trim is unavailable or cannot be guaranteed, this can reduce write amplification.
As shown in FIG. 1, the cold and hot data distinguishing method provided by the present application includes:
judging at a file system layer;
receiving data, wherein the data comprises an extension;
acquiring the extension of the data;
judging whether the extension exists in a preset cold data extension library or a preset hot data extension library;
if the extension exists in the cold data extension library, adding a cold mark to the data;
if the extension exists in the hot data extension library, adding a hot mark to the data;
if the extension exists in neither the cold data extension library nor the hot data extension library, adding a warm mark to the data;
if the file system layer cannot obtain the extension, judging at an algorithm layer;
judging the data length of the data;
when the data length is smaller than a first limit value, adding a hot mark to the data;
when the data length is larger than the first limit value and smaller than a second limit value, adding a warm mark to the data;
and when the data length is larger than the second limit value, adding a cold mark to the data.
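The steps above can be sketched as a two-stage classifier: an extension lookup first, and a length check only when no extension is available. This is a minimal illustration under stated assumptions, not the patented implementation: the extension sets, the two limit values and all function names are hypothetical, and treating a length exactly equal to a limit as warm is an assumption, since the source does not specify the boundary cases.

```python
import os
from enum import Enum
from typing import Optional


class Mark(Enum):
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"


# Hypothetical libraries: multimedia types as cold, system-state types as hot.
COLD_EXTENSIONS = {".mp4", ".jpg", ".mp3"}
HOT_EXTENSIONS = {".log", ".tmp", ".journal"}

# Hypothetical limit values, in bytes.
FIRST_LIMIT = 4 * 1024
SECOND_LIMIT = 1024 * 1024


def mark_by_extension(name: Optional[str]) -> Optional[Mark]:
    """File system layer: return a mark, or None if no extension is available."""
    if not name:
        return None
    ext = os.path.splitext(name)[1].lower()
    if not ext:
        return None
    if ext in COLD_EXTENSIONS:
        return Mark.COLD
    if ext in HOT_EXTENSIONS:
        return Mark.HOT
    return Mark.WARM  # extension known, but in neither library


def mark_by_length(length: int) -> Mark:
    """Algorithm layer: classify by data length alone."""
    if length < FIRST_LIMIT:
        return Mark.HOT
    if length > SECOND_LIMIT:
        return Mark.COLD
    return Mark.WARM  # includes lengths equal to either limit (assumption)


def classify(name: Optional[str], length: int) -> Mark:
    """Try the file system layer first; fall back to the algorithm layer."""
    mark = mark_by_extension(name)
    return mark if mark is not None else mark_by_length(length)
```

For example, `classify("trip.MP4", 10)` yields the cold mark despite the short length, because the extension decides whenever it is available, while `classify(None, 100)` falls through to the length check and yields the hot mark.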
In an embodiment of the invention, the extensions stored in the cold data extension library are multimedia data extension types, and the extensions stored in the hot data extension library are extension types of various system-state files. The data length is judged as follows: the algorithm layer maintains a table with a data length of K, and written data shorter than K is judged to be hot data; it also maintains a table with a data length of N, and written data longer than N is judged to be cold data. N and K are integers greater than 0. Data judged to be neither cold data nor hot data is warm data.
With this cold and hot data distinguishing method, data are classified as cold, warm or hot by judging their extension and data length, so that cold and hot data can be stored separately when storage operations are carried out according to the temperature of the data. Storing cold and hot data separately reduces write amplification and prolongs storage life.
The invention also provides a cold and hot data distinguishing system, which comprises:
a file system layer, used for receiving data, wherein the data comprises an extension; acquiring the extension of the data; judging whether the extension exists in a preset cold data extension library or a preset hot data extension library; if the extension exists in the cold data extension library, adding a cold mark to the data; if the extension exists in the hot data extension library, adding a hot mark to the data; if the extension exists in neither the cold data extension library nor the hot data extension library, adding a warm mark to the data;
the file system layer is also used for maintaining a table that records the address range of the hot data;
an algorithm layer, used for judging the data length of the data; when the data length is smaller than a first limit value, adding a hot mark to the data; when the data length is larger than the first limit value and smaller than a second limit value, adding a warm mark to the data; and when the data length is larger than the second limit value, adding a cold mark to the data.
For the specific limitations of the cold and hot data distinguishing system, refer to the limitations of the cold and hot data distinguishing method above, which are not repeated here. The file system layer and the algorithm layer in the cold and hot data distinguishing system can be implemented wholly or partially in software, hardware, or a combination thereof. The file system layer and the algorithm layer can be embedded in hardware form in, or be independent of, a processor in the computer equipment, or can be stored in software form in a memory of the computer equipment, so that the processor can call and execute the operations corresponding to the file system layer and the algorithm layer.
The present invention also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:
judging at a file system layer;
receiving data, wherein the data comprises an extension;
acquiring the extension of the data;
judging whether the extension exists in a preset cold data extension library or a preset hot data extension library;
if the extension exists in the cold data extension library, adding a cold mark to the data;
if the extension exists in the hot data extension library, adding a hot mark to the data;
if the extension exists in neither the cold data extension library nor the hot data extension library, adding a warm mark to the data;
if the file system layer cannot obtain the extension, judging at an algorithm layer;
judging the data length of the data;
when the data length is smaller than a first limit value, adding a hot mark to the data;
when the data length is larger than the first limit value and smaller than a second limit value, adding a warm mark to the data;
and when the data length is larger than the second limit value, adding a cold mark to the data.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.
Claims (6)
1. A cold and hot data distinguishing method, characterized in that the method comprises:
judging at a file system layer;
receiving data, wherein the data comprises an extension;
acquiring the extension of the data;
judging whether the extension exists in a preset cold data extension library or a preset hot data extension library;
if the extension exists in the cold data extension library, adding a cold mark to the data;
if the extension exists in the hot data extension library, adding a hot mark to the data;
if the extension exists in neither the cold data extension library nor the hot data extension library, adding a warm mark to the data;
if the file system layer cannot obtain the extension, judging at an algorithm layer;
judging the data length of the data;
when the data length is smaller than a first limit value, adding a hot mark to the data;
when the data length is larger than the first limit value and smaller than a second limit value, adding a warm mark to the data;
and when the data length is larger than the second limit value, adding a cold mark to the data.
2. The cold and hot data distinguishing method according to claim 1, wherein the method further comprises: maintaining a table that records the address range of the hot data.
3. The cold and hot data distinguishing method according to claim 1, wherein the method further comprises: transferring the information of the data through reserved fields in the protocol.
4. A cold and hot data distinguishing system, said system comprising:
a file system layer, used for receiving data, wherein the data comprises an extension; acquiring the extension of the data; judging whether the extension exists in a preset cold data extension library or a preset hot data extension library; if the extension exists in the cold data extension library, adding a cold mark to the data; if the extension exists in the hot data extension library, adding a hot mark to the data; if the extension exists in neither the cold data extension library nor the hot data extension library, adding a warm mark to the data;
an algorithm layer, used for judging the data length of the data; when the data length is smaller than a first limit value, adding a hot mark to the data; when the data length is larger than the first limit value and smaller than a second limit value, adding a warm mark to the data; and when the data length is larger than the second limit value, adding a cold mark to the data.
5. The system as claimed in claim 4, wherein the file system layer is further configured to maintain a table for recording address ranges of hot data.
6. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, carries out the steps of the hot and cold data distinguishing method according to any one of claims 1 to 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911019883.XA CN110795396A (en) | 2019-10-24 | 2019-10-24 | Cold and hot data distinguishing method and system and storage medium thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110795396A (en) | 2020-02-14 |
Family
ID=69441349
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911019883.XA Pending CN110795396A (en) | 2019-10-24 | 2019-10-24 | Cold and hot data distinguishing method and system and storage medium thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110795396A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112286935A (en) * | 2020-10-30 | 2021-01-29 | 上海淇玥信息技术有限公司 | Scheduling method and device based on scheduling platform and electronic equipment |
CN114185487A (en) * | 2021-11-25 | 2022-03-15 | 深圳市德明利技术股份有限公司 | Cold and hot data identification method and device and computer equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106406753A (en) * | 2016-08-30 | 2017-02-15 | 深圳芯邦科技股份有限公司 | Data storage method and data storage device |
CN109783019A (en) * | 2018-12-28 | 2019-05-21 | 上海威固信息技术股份有限公司 | A kind of data intelligence memory management method and device |
CN110069218A (en) * | 2019-04-22 | 2019-07-30 | 珠海全志科技股份有限公司 | Cold and hot data separation method, device, computer equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10949551B2 (en) | Policy aware unified file system | |
CN111309650B (en) | Cache control method, device, storage medium and equipment | |
US20140136575A1 (en) | Log-structured garbage collection | |
CN108459913B (en) | Data parallel processing method and device and server | |
US7774313B1 (en) | Policy enforcement in continuous data protection backup systems | |
CN110795396A (en) | Cold and hot data distinguishing method and system and storage medium thereof | |
CN111880734A (en) | Data processing method, system, electronic equipment and storage medium | |
CN112632375B (en) | Session information processing method, server and storage medium | |
CN109656487B (en) | Data processing method, device, equipment and storage medium | |
CN108984754B (en) | Client information updating method and device, computer equipment and storage medium | |
CN112307263A (en) | File storage method, device, equipment and medium | |
CN113157600A (en) | Space allocation method of shingled hard disk, file storage system and server | |
CN110806840A (en) | Flash memory card data storage method based on multiple data streams, flash memory card and equipment | |
CN110287129B (en) | L2P table updating and writing management method and device based on solid state disk | |
US11829377B2 (en) | Efficient storage method for time series data | |
CN112817962B (en) | Data storage method and device based on object storage and computer equipment | |
CN115470155A (en) | L2P table caching method and device supporting solid state disk multi-scene multiplexing | |
CN106406771A (en) | Log recording method and log recorder | |
CN115858419A (en) | Metadata management method, device, equipment, server and readable storage medium | |
CN113485965A (en) | Method, device and server for dynamically cleaning log files based on file sizes | |
CN110471623B (en) | Hard disk file writing method, device, computer equipment and storage medium | |
CN114637946A (en) | Resource data processing method and device and electronic equipment | |
CN114372188A (en) | Authority control method, device, equipment and storage medium | |
CN107918654B (en) | File decompression method and device and electronic equipment | |
CN105630694A (en) | Method and device for controlling memory release |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200214 |