CN110795396A - Cold and hot data distinguishing method and system and storage medium thereof - Google Patents

Info

Publication number
CN110795396A
CN110795396A CN201911019883.XA CN201911019883A CN110795396A CN 110795396 A CN110795396 A CN 110795396A CN 201911019883 A CN201911019883 A CN 201911019883A CN 110795396 A CN110795396 A CN 110795396A
Authority
CN
China
Prior art keywords
data
cold
hot
extension
mark
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911019883.XA
Other languages
Chinese (zh)
Inventor
许铭霖
吴大畏
李晓强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHENZHEN SILICONGO SEMICONDUCTOR CO Ltd
Original Assignee
SHENZHEN SILICONGO SEMICONDUCTOR CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHENZHEN SILICONGO SEMICONDUCTOR CO Ltd
Priority to CN201911019883.XA
Publication of CN110795396A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a cold and hot data distinguishing method and a system thereof, belonging to the technical field of data storage in memory. The method comprises the following steps: creating a cold data area, a warm data area and a hot data area for storing data; receiving data; judging the data length of the data; when the data length is judged to be smaller than a first limit value, treating the data as hot data and storing it in the hot data area; when the data length is judged to be larger than the first limit value and smaller than a second limit value, treating the data as warm data and storing it in the warm data area; and when the data length is judged to be larger than the second limit value, treating the data as cold data and storing it in the cold data area. For accurately locating cold data that is written at medium frequency across multiple data streams, the address-continuity method is more effective.
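The flow in the abstract can be sketched as a small router that picks one of three areas by data length. This is only an illustration: the limit values, the `new_areas` and `route` names, and the use of in-memory lists as "areas" are assumptions, and the source does not say how a length exactly equal to a limit is handled (it is treated as warm here).

```python
# Minimal sketch: route data into hot/warm/cold areas by length.
# FIRST_LIMIT and SECOND_LIMIT are hypothetical values, not from the source.
FIRST_LIMIT = 4 * 1024       # assumed first limit, in bytes
SECOND_LIMIT = 128 * 1024    # assumed second limit, in bytes


def new_areas():
    """Create the three storage areas (modelled here as plain lists)."""
    return {"hot": [], "warm": [], "cold": []}


def route(areas, data):
    """Store data in the area chosen by its length and return the area name."""
    length = len(data)
    if length < FIRST_LIMIT:
        name = "hot"
    elif length > SECOND_LIMIT:
        name = "cold"
    else:
        # FIRST_LIMIT <= length <= SECOND_LIMIT; treating the two boundary
        # values as warm is an assumption of this sketch.
        name = "warm"
    areas[name].append(data)
    return name


areas = new_areas()
for payload in (b"x" * 512, b"x" * (64 * 1024), b"x" * (256 * 1024)):
    print(len(payload), route(areas, payload))
```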

Description

Cold and hot data distinguishing method and system and storage medium thereof
Technical Field
The invention relates to the technical field of data storage of a memory, in particular to a cold and hot data distinguishing method and a system thereof.
Background
Cold data is state data from long ago, such as user-profile data; common examples include bank vouchers, tax records, medical files, and film and television material. Cold data is offline data that does not need real-time access; it is kept as a backup for disaster recovery, or must be retained for a period of time to comply with legal regulations.
Warm data is non-instantaneous state and behavior data. Put simply, mixing hot and cold data together yields warm data. For example, a user has recently become particularly interested in a certain type of topic (hot data), in sharp contrast to the user's past behavior (cold data). This indicates that the user is in the growth period of a new user (warm data), and the operator can consider a corresponding strategy to raise activity and promote conversion.
Hot data refers to instantaneous location state, transactions and browsing behavior. Examples such as the current geographic location, or the mobile phone application that is active at a particular time, characterize "what is being done, and where". In addition, some information recorded in real time, such as the operations a user has just performed after opening certain software or a website, can be accumulated through a third-party platform, and developers can also accumulate it from user behavior.
During garbage collection, if cold and hot data are mixed together, reclaiming the cold data also reclaims the hot data, which causes write amplification and shortens the service life of the storage device.
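The effect of mixing can be made concrete with a toy calculation. The sketch below uses the usual definition of write amplification (pages written to flash divided by pages written by the host) and a deliberately simplified model in which garbage collection copies every still-valid page out of a victim block before erasing it. The block size, page counts and function names are illustrative assumptions, not figures from the source.

```python
def gc_relocations(victim_blocks):
    """Count valid pages that must be copied before the victim blocks are erased.

    Each block is a list of booleans: True means the page is still valid.
    """
    return sum(sum(1 for valid in block if valid) for block in victim_blocks)


def write_amplification(host_pages, relocated_pages):
    """Pages physically written to flash divided by pages the host wrote."""
    return (host_pages + relocated_pages) / host_pages


# Assumed scenario: 64 hot pages are overwritten by the host, so their old
# copies become invalid and the blocks holding them are picked as GC victims.
HOST_PAGES = 64

# Mixed layout: the 64 stale hot pages share two 64-page blocks with 64
# cold pages that are still valid and therefore have to be copied.
mixed_victims = [[True] * 32 + [False] * 32 for _ in range(2)]

# Separated layout: the stale hot pages fill one block on their own, so the
# victim holds no valid data and the cold block is never touched.
separated_victims = [[False] * 64]

print(write_amplification(HOST_PAGES, gc_relocations(mixed_victims)))      # 2.0
print(write_amplification(HOST_PAGES, gc_relocations(separated_victims)))  # 1.0
```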
Disclosure of Invention
The invention aims to provide a cold and hot data distinguishing method that makes it easy to distinguish cold data from hot data.
The above object of the present invention is achieved by the following technical solutions:
a cold-hot data distinguishing method, the method comprising: judging at a file system layer; receiving data, wherein the data comprises an extension; acquiring the extension of the data; judging whether the extension exists in a preset cold data extension library or a preset hot data extension library; if the extension exists in the cold data extension library, adding a cold mark to the data; if the extension exists in the hot data extension library, adding a hot mark to the data; if the extension exists in neither the cold data extension library nor the hot data extension library, adding a warm mark to the data; if the file system layer cannot obtain the extension, judging at an algorithm layer; judging the data length of the data; when the data length is judged to be smaller than a first limit value, adding a hot mark to the data; when the data length is judged to be larger than the first limit value and smaller than a second limit value, adding a warm mark to the data; and when the data length is judged to be larger than the second limit value, adding a cold mark to the data.
Further, the method comprises: a table is maintained that records the address range of the hot data.
Further, the method comprises: the information of the data is transferred through the reserved fields in the protocol.
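The source does not name the protocol or the reserved field, so the following is only a sketch of the general idea: the temperature mark is packed into otherwise unused bits of a command header and recovered on the receiving side. The header layout (`opcode`, `reserved`, `lba`, `length`), the 2-bit encoding and all names are hypothetical.

```python
import struct
from enum import IntEnum


class Temp(IntEnum):
    """Hypothetical 2-bit encoding of the temperature mark."""
    WARM = 0
    HOT = 1
    COLD = 2


# Hypothetical little-endian write-command header:
# opcode (1 byte), reserved (1 byte), start LBA (8 bytes), length (4 bytes).
HEADER = struct.Struct("<BBQI")
TEMP_MASK = 0b11  # the mark occupies the low two bits of the reserved byte


def pack_write(opcode, lba, length, temp):
    """Build a header whose reserved byte carries the temperature mark."""
    reserved = int(temp) & TEMP_MASK
    return HEADER.pack(opcode, reserved, lba, length)


def unpack_temp(header_bytes):
    """Read the temperature mark back out of the reserved byte."""
    _opcode, reserved, _lba, _length = HEADER.unpack(header_bytes)
    return Temp(reserved & TEMP_MASK)


header = pack_write(0x01, lba=4096, length=8, temp=Temp.COLD)
print(len(header), unpack_temp(header).name)  # 14 COLD
```

A receiver that does not understand the mark can ignore the reserved byte, which is the usual reason for carrying hints of this kind in reserved fields.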
The second purpose of the present invention is to provide a cold and hot data distinguishing system that makes it easy to partition data and reduces write amplification during data reclamation.
The second purpose of the invention is realized by the following technical scheme:
a cold-hot data distinguishing system, the system comprising: a file system layer, used for receiving data, wherein the data comprises an extension; acquiring the extension of the data; judging whether the extension exists in a preset cold data extension library or a preset hot data extension library; if the extension exists in the cold data extension library, adding a cold mark to the data; if the extension exists in the hot data extension library, adding a hot mark to the data; if the extension exists in neither the cold data extension library nor the hot data extension library, adding a warm mark to the data; and an algorithm layer, used for judging the data length of the data; when the data length is judged to be smaller than a first limit value, adding a hot mark to the data; when the data length is judged to be larger than the first limit value and smaller than a second limit value, adding a warm mark to the data; and when the data length is judged to be larger than the second limit value, adding a cold mark to the data.
Further, the file system layer is further configured to maintain a table for recording the address range of the hot data.
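The source does not describe the layout of this table. One plausible form, sketched below under that caveat, is a sorted list of non-overlapping half-open address ranges that are merged on insertion and searched by bisection. The class name, method names and the choice of `[start, end)` ranges are assumptions made for illustration.

```python
import bisect


class HotRangeTable:
    """Sorted, non-overlapping [start, end) address ranges marked as hot."""

    def __init__(self):
        self._starts = []
        self._ends = []

    def add(self, start, end):
        """Record [start, end) as hot, merging overlapping or adjacent ranges."""
        if start >= end:
            raise ValueError("empty range")
        # First stored range that ends at or after the new start.
        i = bisect.bisect_left(self._ends, start)
        # One past the last stored range that starts at or before the new end.
        j = bisect.bisect_right(self._starts, end)
        if i < j:
            # Ranges i..j-1 touch or overlap the new one: absorb them.
            start = min(start, self._starts[i])
            end = max(end, self._ends[j - 1])
        self._starts[i:j] = [start]
        self._ends[i:j] = [end]

    def contains(self, address):
        """Return True if the address falls inside a recorded hot range."""
        i = bisect.bisect_right(self._starts, address) - 1
        return i >= 0 and address < self._ends[i]

    def ranges(self):
        return list(zip(self._starts, self._ends))


table = HotRangeTable()
table.add(0, 10)
table.add(20, 30)
table.add(10, 20)      # bridges the two ranges above
print(table.ranges())  # [(0, 30)]
```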
The third purpose of the invention is to provide a computer-readable storage medium that makes it easy to partition data and reduces write amplification during data reclamation.
The third purpose of the invention is realized by the following technical scheme:
a computer-readable storage medium, in which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the above-mentioned cold-hot data discrimination method.
In conclusion, the invention has the following beneficial effects:
the data is divided into cold data, warm data and hot data by judging its extension and its data length, so that cold data and hot data can be stored separately when storage operations are carried out according to the temperature of the data.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The technical solutions of the embodiments of the present invention will be described below with reference to the accompanying drawings.
In the prior art, hot data is distinguished from cold data by address access frequency, an approach that is more effective for hot data; for accurately locating cold data that is written at medium frequency across multiple data streams, the address-continuity method is more effective.
For specific scenarios (such as a dashboard camera), the continuity of the written data makes it easier to identify the streams that are written continuously among multiple data streams. This avoids mixing cold and hot data, increases the continuity of logical addresses, and reduces fragmentation in the storage system. In systems where trim is unavailable or cannot be guaranteed, it can also reduce write amplification.
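The source does not spell out how address continuity is checked. One possible reading, shown here purely as a sketch and not as the patent's algorithm, is to remember where each recent write ended and to treat a write that starts exactly there as a continuation of the same stream. A stream whose continuous run grows past a threshold is then a candidate for cold handling. The class name, the `max_streams` and `cold_run` parameters and their values are assumptions.

```python
from collections import OrderedDict


class ContinuityTracker:
    """Track continuous write runs across several interleaved streams."""

    def __init__(self, max_streams=8, cold_run=64):
        self.max_streams = max_streams  # assumed cap on tracked runs
        self.cold_run = cold_run        # assumed run length (in sectors) treated as continuous
        # Maps the next expected address of a run to the sectors written so far.
        self._tails = OrderedDict()

    def observe(self, lba, count):
        """Record a write and return True once it belongs to a long continuous run."""
        run = self._tails.pop(lba, 0) + count
        self._tails[lba + count] = run
        self._tails.move_to_end(lba + count)
        if len(self._tails) > self.max_streams:
            self._tails.popitem(last=False)  # forget the run that was extended longest ago
        return run >= self.cold_run


tracker = ContinuityTracker(max_streams=4, cold_run=32)
for i in range(4):
    a = tracker.observe(i * 8, 8)           # stream A, sequential
    b = tracker.observe(1000 + i * 8, 8)    # stream B, sequential
    r = tracker.observe(5000 + i * 100, 1)  # scattered small writes
    print(i, a, b, r)
```

In this run, streams A and B are flagged on the fourth round, when each has accumulated 32 contiguous sectors, while the scattered writes never are.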
As shown in fig. 1, a cold-hot data distinguishing method provided by the present application includes:
judging at a file system layer;
receiving data, wherein the data comprises an extension;
acquiring the extension of the data;
judging whether the extension exists in a preset cold data extension library or a preset hot data extension library;
if the extension exists in the cold data extension library, adding a cold mark to the data;
if the extension exists in the hot data extension library, adding a hot mark to the data;
if the extension exists in neither the cold data extension library nor the hot data extension library, adding a warm mark to the data;
if the file system layer cannot obtain the extension, judging at an algorithm layer;
judging the data length of the data;
when the data length is judged to be smaller than a first limit value, adding a hot mark to the data;
when the data length is judged to be larger than the first limit value and smaller than a second limit value, adding a warm mark to the data;
and when the data length is judged to be larger than the second limit value, adding a cold mark to the data.
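The steps above can be sketched as a two-stage classifier: an extension lookup at the file system layer, with a fall-back to a length check at the algorithm layer when no extension is available. The library contents, the limit values and the function names are hypothetical, and treating a length exactly equal to a limit as warm is an assumption, since the steps leave that case open.

```python
import os

# Hypothetical library contents and limits; the source gives no concrete values.
COLD_EXTENSIONS = {".mp4", ".mkv", ".jpg", ".mp3"}
HOT_EXTENSIONS = {".log", ".tmp", ".journal"}
FIRST_LIMIT = 4 * 1024       # bytes
SECOND_LIMIT = 128 * 1024    # bytes


def classify_by_extension(filename):
    """File system layer: return a mark, or None if no extension can be obtained."""
    if not filename:
        return None
    extension = os.path.splitext(filename)[1].lower()
    if not extension:
        return None
    if extension in COLD_EXTENSIONS:
        return "cold"
    if extension in HOT_EXTENSIONS:
        return "hot"
    return "warm"  # extension is known but is in neither library


def classify_by_length(length):
    """Algorithm layer: mark the data by its length alone."""
    if length < FIRST_LIMIT:
        return "hot"
    if length > SECOND_LIMIT:
        return "cold"
    return "warm"  # boundary values treated as warm (assumption)


def classify(filename, length):
    """Try the file system layer first, then fall back to the algorithm layer."""
    mark = classify_by_extension(filename)
    if mark is None:
        mark = classify_by_length(length)
    return mark


print(classify("trip_001.MP4", 10))   # cold
print(classify("report.docx", 10))    # warm
print(classify(None, 512))            # hot
```

Note that in this sketch the extension result takes priority: a tiny `.mp4` file is still marked cold, and the length check runs only when the extension is unavailable.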
In an embodiment of the invention, the extensions stored in the cold data extension library are extension types of multimedia data, and the extensions stored in the hot data extension library are extension types of various kinds of system state data. The data length is evaluated as follows: a table with data length K is maintained in the algorithm layer, and the data is judged to be hot data when the written data length is less than K; a table with data length N is maintained, and the data is judged to be cold data when the written data length is greater than N. N and K are integers greater than 0. Data that is judged to be neither cold data nor hot data is warm data.
According to the cold and hot data distinguishing method, data is classified as cold data, warm data or hot data by judging its extension and its data length, so that cold and hot data can be stored separately when storage operations are carried out according to the temperature of the data. Storing cold and hot data separately reduces write amplification and prolongs the life of the storage device.
The invention also provides a cold and hot data distinguishing system, which comprises:
a file system layer, used for receiving data, wherein the data comprises an extension; acquiring the extension of the data; judging whether the extension exists in a preset cold data extension library or a preset hot data extension library; if the extension exists in the cold data extension library, adding a cold mark to the data; if the extension exists in the hot data extension library, adding a hot mark to the data; if the extension exists in neither the cold data extension library nor the hot data extension library, adding a warm mark to the data;
the file system layer is also used for maintaining a table that records the address range of the hot data;
an algorithm layer, used for judging the data length of the data; when the data length is judged to be smaller than a first limit value, adding a hot mark to the data; when the data length is judged to be larger than the first limit value and smaller than a second limit value, adding a warm mark to the data; and when the data length is judged to be larger than the second limit value, adding a cold mark to the data.
For the specific definition of the cold and hot data distinguishing system, refer to the definition of the cold and hot data distinguishing method above, which is not repeated here. The file system layer and the algorithm layer in the system can be implemented wholly or partly in software, hardware, or a combination of the two. They can be embedded in, or be independent of, a processor in the computer device in hardware form, or be stored in a memory of the computer device in software form, so that the processor can call them and execute the corresponding operations.
The present invention also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:
judging at a file system layer;
receiving data, wherein the data comprises an extension;
acquiring the extension of the data;
judging whether the extension exists in a preset cold data extension library or a preset hot data extension library;
if the extension exists in the cold data extension library, adding a cold mark to the data;
if the extension exists in the hot data extension library, adding a hot mark to the data;
if the extension exists in neither the cold data extension library nor the hot data extension library, adding a warm mark to the data;
if the file system layer cannot obtain the extension, judging at an algorithm layer;
judging the data length of the data;
when the data length is judged to be smaller than a first limit value, adding a hot mark to the data;
when the data length is judged to be larger than the first limit value and smaller than a second limit value, adding a warm mark to the data;
and when the data length is judged to be larger than the second limit value, adding a cold mark to the data.
It will be understood by those skilled in the art that all or part of the processes of the methods in the embodiments described above can be implemented by a computer program instructing the relevant hardware. The program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (6)

1. A cold-hot data differentiation method, characterized in that the method comprises:
judging at a file system layer;
receiving data, wherein the data comprises an extension;
acquiring the extension of the data;
judging whether the extension exists in a preset cold data extension library or a preset hot data extension library;
if the extension exists in the cold data extension library, adding a cold mark to the data;
if the extension exists in the hot data extension library, adding a hot mark to the data;
if the extension exists in neither the cold data extension library nor the hot data extension library, adding a warm mark to the data;
if the file system layer cannot obtain the extension, judging at an algorithm layer;
judging the data length of the data;
when the data length is judged to be smaller than a first limit value, adding a hot mark to the data;
when the data length is judged to be larger than the first limit value and smaller than a second limit value, adding a warm mark to the data;
and when the data length is judged to be larger than the second limit value, adding a cold mark to the data.
2. The cold-hot data distinguishing method according to claim 1, wherein the method further comprises: maintaining a table that records the address range of the hot data.
3. The cold-hot data distinguishing method according to claim 1, wherein the method further comprises: transferring the information of the data through reserved fields in the protocol.
4. A cold-hot data differentiation system, said system comprising:
a file system layer, used for receiving data, wherein the data comprises an extension; acquiring the extension of the data; judging whether the extension exists in a preset cold data extension library or a preset hot data extension library; if the extension exists in the cold data extension library, adding a cold mark to the data; if the extension exists in the hot data extension library, adding a hot mark to the data; if the extension exists in neither the cold data extension library nor the hot data extension library, adding a warm mark to the data;
an algorithm layer, used for judging the data length of the data; when the data length is judged to be smaller than a first limit value, adding a hot mark to the data; when the data length is judged to be larger than the first limit value and smaller than a second limit value, adding a warm mark to the data; and when the data length is judged to be larger than the second limit value, adding a cold mark to the data.
5. The cold-hot data distinguishing system according to claim 4, wherein the file system layer is further configured to maintain a table that records the address range of the hot data.
6. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, carries out the steps of the hot and cold data distinguishing method according to any one of claims 1 to 3.
CN201911019883.XA 2019-10-24 2019-10-24 Cold and hot data distinguishing method and system and storage medium thereof Pending CN110795396A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911019883.XA CN110795396A (en) 2019-10-24 2019-10-24 Cold and hot data distinguishing method and system and storage medium thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911019883.XA CN110795396A (en) 2019-10-24 2019-10-24 Cold and hot data distinguishing method and system and storage medium thereof

Publications (1)

Publication Number Publication Date
CN110795396A true CN110795396A (en) 2020-02-14

Family

ID=69441349

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911019883.XA Pending CN110795396A (en) 2019-10-24 2019-10-24 Cold and hot data distinguishing method and system and storage medium thereof

Country Status (1)

Country Link
CN (1) CN110795396A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112286935A (en) * 2020-10-30 2021-01-29 上海淇玥信息技术有限公司 Scheduling method and device based on scheduling platform and electronic equipment
CN114185487A (en) * 2021-11-25 2022-03-15 深圳市德明利技术股份有限公司 Cold and hot data identification method and device and computer equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106406753A (en) * 2016-08-30 2017-02-15 深圳芯邦科技股份有限公司 Data storage method and data storage device
CN109783019A (en) * 2018-12-28 2019-05-21 上海威固信息技术股份有限公司 A kind of data intelligence memory management method and device
CN110069218A (en) * 2019-04-22 2019-07-30 珠海全志科技股份有限公司 Cold and hot data separation method, device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
US10949551B2 (en) Policy aware unified file system
CN111309650B (en) Cache control method, device, storage medium and equipment
US20140136575A1 (en) Log-structured garbage collection
CN108459913B (en) Data parallel processing method and device and server
US7774313B1 (en) Policy enforcement in continuous data protection backup systems
CN110795396A (en) Cold and hot data distinguishing method and system and storage medium thereof
CN111880734A (en) Data processing method, system, electronic equipment and storage medium
CN112632375B (en) Session information processing method, server and storage medium
CN109656487B (en) Data processing method, device, equipment and storage medium
CN108984754B (en) Client information updating method and device, computer equipment and storage medium
CN112307263A (en) File storage method, device, equipment and medium
CN113157600A (en) Space allocation method of shingled hard disk, file storage system and server
CN110806840A (en) Flash memory card data storage method based on multiple data streams, flash memory card and equipment
CN110287129B (en) L2P table updating and writing management method and device based on solid state disk
US11829377B2 (en) Efficient storage method for time series data
CN112817962B (en) Data storage method and device based on object storage and computer equipment
CN115470155A (en) L2P table caching method and device supporting solid state disk multi-scene multiplexing
CN106406771A (en) Log recording method and log recorder
CN115858419A (en) Metadata management method, device, equipment, server and readable storage medium
CN113485965A (en) Method, device and server for dynamically cleaning log files based on file sizes
CN110471623B (en) Hard disk file writing method, device, computer equipment and storage medium
CN114637946A (en) Resource data processing method and device and electronic equipment
CN114372188A (en) Authority control method, device, equipment and storage medium
CN107918654B (en) File decompression method and device and electronic equipment
CN105630694A (en) Method and device for controlling memory release

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200214)