CN116204138A - Efficient storage system and method based on hierarchical storage - Google Patents

Publication number
CN116204138A
Authority
CN
China
Prior art keywords
data
temperature
module
storage
cold
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310494361.5A
Other languages
Chinese (zh)
Other versions
CN116204138B (en)
Inventor
代幻成
杨晓华
杨尧
张蔚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Sanlitong Technology Co ltd
Original Assignee
Chengdu Sanlitong Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Sanlitong Technology Co ltd filed Critical Chengdu Sanlitong Technology Co ltd
Priority to CN202310494361.5A priority Critical patent/CN116204138B/en
Publication of CN116204138A publication Critical patent/CN116204138A/en
Application granted granted Critical
Publication of CN116204138B publication Critical patent/CN116204138B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647Migration mechanisms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to the technical field of data storage and provides an efficient storage system and method based on hierarchical storage. The system comprises a cold and hot data classification module, a data migration module, a terminal storage module, and a cloud storage module and/or an edge storage module. The cold and hot data classification module determines the temperature of the data and divides the data into hot data and cold data according to that temperature. The data migration module migrates the hot data to the cloud storage module and/or the edge storage module, and migrates the cold data to the terminal storage module. By dividing the data into hot data and cold data and storing each at the most suitable location in the cloud-edge database, the invention helps the cloud-edge time series database avoid the storage cost incurred when all three parties store all of the data, and greatly reduces the cost of data transmission among the parties during collaborative query processing.

Description

Efficient storage system and method based on hierarchical storage
Technical Field
The invention relates to the technical field of data storage, in particular to a high-efficiency storage system and method based on hierarchical storage.
Background
Data storage requirements continue to grow exponentially, and the need for efficient storage systems becomes increasingly important. This trend is not only due to the rise of big data, cloud computing, internet of things (IoT) and artificial intelligence, but also because the way enterprises store and analyze data is changing. As the need for enterprises to collect and store large amounts of data becomes more prevalent, the required storage capacity, data access speed, and cost effectiveness become more important. For example, in artificial intelligence algorithms, a large amount of training data will occupy a large amount of memory space, and fast reading and accessing of such data is critical to the effectiveness of the algorithm. Furthermore, the increasing demand for real-time data processing and analysis has led to a need for high performance storage systems that can quickly access and process data, including both central data centers and edge-side devices.
With the development of cloud edge computing, there are many smaller, distributed data centers deployed on edge devices, with data stored on local nodes, which requires efficient storage management and hierarchical data storage policies. For example, time series databases are often used to process sensor-generated data in IoT, and the storage capacity of such data is very large. Therefore, implementing time series data storage based on edge devices must consider a strategy of hierarchical data storage, which not only requires different hierarchies according to the frequency and importance of data, but also requires fast data retrieval and filtering, and efficient data storage management.
Existing hierarchical storage strategies are not suitable for edge devices, because the prior art can only describe the relative temperature between data items and cannot reflect how hot or cold the data actually is. As a result, mis-scheduling can occur during frequent scheduling.
Disclosure of Invention
In order to solve the above-mentioned prior art problems, the present invention provides a high-efficiency storage system based on hierarchical storage, including:
a cold and hot data classification module;
a data migration module;
a terminal storage module;
a cloud storage module and/or an edge storage module;
the cold and hot data classification module determines the temperature of the data and divides the data into hot data and cold data according to the temperature of the data;
the data migration module migrates hot data in the data to the cloud storage module and/or the edge storage module, and migrates cold data in the data to the terminal storage module.
Optionally, the cold and hot data classification module includes:
a temperature determination unit;
the temperature determining unit acquires a temperature attenuation item and a temperature rise item of data, establishes a temperature model according to the temperature attenuation item and the temperature rise item, and determines the temperature of the data by using the temperature model.
Optionally, the expression of the temperature attenuation term is specifically:
[Formula image: Figure SMS_1]
where [Figure SMS_2] is the attenuation temperature of the data at time [Figure SMS_3], [Figure SMS_4] is the temperature of the data at time [Figure SMS_5], and [Figure SMS_6] is a custom cooling rate.
Optionally, the expression of the temperature rise term is specifically:
[Formula image: Figure SMS_7]
[Formula image: Figure SMS_8]
[Formula image: Figure SMS_9]
where [Figure SMS_12] is the elevated temperature of the data, [Figure SMS_15] is the introduced heat source temperature, [Figure SMS_18] is the data volume density, [Figure SMS_11] is an intermediate variable, [Figure SMS_14] is the heat radiation coefficient, [Figure SMS_17] is the access interval, [Figure SMS_19] is the data accessible at [Figure SMS_10] after an access, [Figure SMS_13] is a custom specific heat capacity, and [Figure SMS_16] is the data length.
Optionally, the expression of the temperature model is specifically:
[Formula image: Figure SMS_20]
where [Figure SMS_21] is the temperature of the data at time [Figure SMS_22], [Figure SMS_23] is the decay temperature of the data, [Figure SMS_24] is the elevated temperature of the data, and [Figure SMS_25] is a discrete function with the access times as nodes. It takes the value 1 or 0: 1 at a time when the data is accessed and 0 at a time when it is not. When the data is accessed [Figure SMS_26] times within a time period, its value is [Figure SMS_27].
Optionally, the system further comprises:
a workload prediction module;
the workload prediction module determines the predicted temperature of the data in a target time period according to the temperature of the data.
Optionally, the workload prediction module includes:
a training data extraction unit;
a labeling data extraction unit;
a temperature prediction unit;
the training data extraction unit extracts data of a sample time period stored in time series to obtain training data;
the labeling data extraction unit extracts the temperature of the data in the sample time period, and labels the data using the temperature to obtain labeling data;
the temperature prediction unit establishes a temperature prediction model, and trains the temperature prediction model using the training data and the labeling data to obtain the predicted temperature of the data in the target time period.
Optionally, the cold-hot data classification module classifies the data into hot data and cold data according to a predicted temperature of the data.
Optionally, the system further comprises: a data digest creation module; the data digest creation module includes:
a supercooled data determining unit;
a data digest creation unit;
the supercooled data determining unit determines supercooled data in the cold data according to the temperature of the cold data;
the data digest creation module creates a data digest for the supercooled data when the supercooled data is migrated to the terminal storage module.
In order to solve the above-mentioned problems in the prior art, the present invention further provides a hierarchical storage-based efficient storage method, for use in the hierarchical storage-based efficient storage system as described above, the method comprising:
S1: the cold and hot data classification module determines the temperature of the data and divides the data into hot data and cold data according to the temperature of the data;
S2: the data migration module migrates hot data in the data to the cloud storage module and/or the edge storage module, and migrates cold data in the data to the terminal storage module.
The beneficial effects of the invention are as follows. The invention provides an efficient storage system and method based on hierarchical storage. The system comprises a cold and hot data classification module, a data migration module, a terminal storage module, and a cloud storage module and/or an edge storage module. The cold and hot data classification module determines the temperature of the data and divides the data into hot data and cold data according to that temperature. The data migration module migrates the hot data to the cloud storage module and/or the edge storage module, and migrates the cold data to the terminal storage module. By dividing the data into hot data and cold data and storing each at the most suitable location in the cloud-edge database, the invention helps the cloud-edge time series database avoid the storage cost incurred when all three parties store all of the data, and greatly reduces the cost of data transmission among the parties during collaborative query processing.
Drawings
Fig. 1 is a schematic structural diagram of a hierarchical storage-based efficient storage system according to the present invention.
Fig. 2 is a schematic flow chart of a high-efficiency storage method based on hierarchical storage.
Reference numerals:
10-cold and hot data classification module; 20-data migration module; 30-terminal storage module; 40-cloud storage module; 50-edge storage module.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a hierarchical storage-based efficient storage system according to the present embodiment.
In this embodiment, an efficient storage system based on hierarchical storage includes: a cold and hot data classification module 10; a data migration module 20; a terminal storage module 30; cloud storage module 40 and/or edge storage module 50.
It should be noted that, the cold and hot data classification module 10 determines the temperature of the data, and classifies the data into hot data and cold data according to the temperature of the data; the data migration module 20 migrates hot data in the data to the cloud storage module 40 and/or the edge storage module 50, and migrates cold data in the data to the terminal storage module 30.
The cold and hot data classification module 10 is used for establishing a temperature model, determining the temperature of data, and dividing the data into cold data and hot data. The data migration module 20 is configured to migrate data between different storage devices. The cloud storage module 40, the edge storage module 50 and the terminal storage module 30 are different types of storage devices. In a cloud-edge computing system, the cloud storage module 40 and the edge storage module 50 are used for computation, so they generally store frequently called hot data used in computation. The terminal storage module 30 generally serves as a functional device, so it generally stores rarely called cold data that is not used in computation.
Therefore, in this embodiment, by dividing the data into cold data and hot data, storing the cold data in the terminal storage module 30, and storing the hot data in the cloud storage module 40 and/or the edge storage module 50, the size of the transmitted data can be reduced, the storage space occupied on the terminal device can be minimized, and fast data calls for cloud-edge computing can be achieved.
In a preferred embodiment, the cold and hot data classification module 10 includes a temperature determining unit.
In the prior art, cold and hot data classification is generally performed based on the relative temperature between data, that is, by counting the access volume or access frequency per unit time. For example, when data A is accessed 10000 times and data B is accessed 1000 times, data A is considered to be hot data.
However, such a judgment method based on relative temperature does not capture the influence of the access time interval, and may therefore divide cold and hot data incorrectly. For example, data B is accessed only 1000 times per unit time, but its access interval is very short: it is accessed 1000 times in the first unit time and 1000 times in the second unit time, with no gap in between. Data A is accessed 10000 times per unit time, but its access interval is longer: it is accessed 10000 times in the first unit time and 10000 times in the third unit time, with a gap of one unit time in between. Judged by access frequency alone, data A is hotter; viewed over the whole period, however, data B is accessed in every unit time. Classifying cold and hot data by access frequency alone may therefore be inaccurate.
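The example above can be made concrete with a short Python sketch. The per-unit-time access counts are the figures given in the paragraph; the helper names `total_accesses` and `idle_units` are illustrative assumptions and are not part of the described system.

```python
# Access counts per unit time, taken from the example in the text.
# Data A: 10000 accesses in unit times 1 and 3, none in unit time 2.
# Data B: 1000 accesses in every unit time.
access_counts = {
    "A": [10000, 0, 10000],
    "B": [1000, 1000, 1000],
}


def total_accesses(counts):
    """Frequency-only view: total number of accesses in the window."""
    return sum(counts)


def idle_units(counts):
    """Interval-aware view: number of unit times with no access at all."""
    return sum(1 for c in counts if c == 0)


# Ranking by frequency alone puts A first.
by_frequency = sorted(
    access_counts,
    key=lambda name: total_accesses(access_counts[name]),
    reverse=True,
)

# Counting idle unit times shows that only B is accessed without interruption.
idle = {name: idle_units(counts) for name, counts in access_counts.items()}
```

A frequency-only ranking and an interval-aware view disagree about which item is accessed more steadily, which is the gap the absolute-temperature approach is meant to close.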
Thus, the present embodiment provides a cold and hot data judging method based on absolute temperature, that is, a temperature determining unit acquires a temperature attenuation term and a temperature rise term of data, establishes a temperature model according to the temperature attenuation term and the temperature rise term, packages the data temperature with original data, and determines the temperature of the data by using the temperature model.
By acquiring the time series recorded when the data is stored, the temperature determining unit introduces timeliness and frequency indicators of data access, namely the data write time and/or the data access frequency, and represents these characteristics using the timestamps generated by the time series. Specifically, this embodiment obtains the temperature attenuation term using Newton's law of cooling.
By acquiring the time series recorded when the data is stored, the temperature determining unit also characterizes the relationship between the access interval and the data temperature. Specifically, this embodiment obtains the temperature rise term using the law of heat radiation.
In a preferred embodiment, the expression of the temperature attenuation term is specifically:
[Formula image: Figure SMS_28]
where [Figure SMS_29] is the attenuation temperature of the data at time [Figure SMS_30], [Figure SMS_31] is the temperature of the data at time [Figure SMS_32], and [Figure SMS_33] is a custom cooling rate.
Newton's law of cooling is as follows:
[Formula image: Figure SMS_34]
where [Figure SMS_35] is the reference temperature, [Figure SMS_36] is the current temperature of the data, and [Figure SMS_37] is the proportionality coefficient between the rate of temperature change and the temperature difference. Solving the differential equation gives:
[Formula image: Figure SMS_38]
The temperature decay is then modeled as:
[Formula image: Figure SMS_39]
The modeling above yields a Newton's-law-of-cooling model for cold and hot data, i.e., the temperature attenuation term.
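The formula images for this derivation are not available in this text, so the following Python sketch uses the standard textbook form of Newton's law of cooling, dT/dt = -k (T - H), whose solution is T(t) = H + (T0 - H) e^(-k t). The function name, the parameter names, and the default reference temperature of 0 are assumptions made for illustration; the patent's own attenuation expression may differ in detail.

```python
import math


def decayed_temperature(initial_temp, elapsed, cooling_rate, reference_temp=0.0):
    """Closed-form solution of Newton's law of cooling, dT/dt = -k * (T - H).

    initial_temp   -- temperature at the start of the interval (T0)
    elapsed        -- time since T0 was recorded, must be >= 0
    cooling_rate   -- k, the custom cooling rate (assumed constant)
    reference_temp -- H, the reference temperature (assumed to be 0 here)
    """
    if elapsed < 0:
        raise ValueError("elapsed must be non-negative")
    return reference_temp + (initial_temp - reference_temp) * math.exp(
        -cooling_rate * elapsed
    )
```

With `cooling_rate = math.log(2)`, the temperature above the reference halves every unit of time, which gives a convenient way to choose the cooling rate from a desired half-life.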
In a preferred embodiment, the expression of the temperature rise term is specifically:
[Formula image: Figure SMS_40]
[Formula image: Figure SMS_41]
[Formula image: Figure SMS_42]
where [Figure SMS_44] is the elevated temperature of the data, [Figure SMS_47] is the introduced heat source temperature, [Figure SMS_51] is the data volume density, [Figure SMS_45] is an intermediate variable, [Figure SMS_48] is the heat radiation coefficient, [Figure SMS_50] is the access interval, [Figure SMS_52] is the data accessible at [Figure SMS_43] after an access, [Figure SMS_46] is a custom specific heat capacity, and [Figure SMS_49] is the data length.
It should be noted that this embodiment describes the relationship between the access interval and the data temperature using the law of heat radiation. The single-area heat radiation equation is as follows:
[Formula image: Figure SMS_53]
where [Figure SMS_54] is the data accessible at [Figure SMS_55] after an access, [Figure SMS_56] is the introduced heat source temperature, [Figure SMS_57] is the access interval, and [Figure SMS_58] is the emissivity. The process of heat radiation transfer is then:
[Formula image: Figure SMS_59]
The actual elevated temperature of the data is:
[Formula image: Figure SMS_60]
where [Figure SMS_61] is the data volume, [Figure SMS_62] is the number of accesses, [Figure SMS_63] is the current time, [Figure SMS_64] is the heat radiation transmissibility, [Figure SMS_65] is a custom specific heat capacity, [Figure SMS_66] is the data length, and [Figure SMS_67] is the data volume density. From this it follows that:
[Formula image: Figure SMS_68]
where [Figure SMS_69], [Figure SMS_70], and [Figure SMS_71].
The actual elevated temperature is thus expressed with distance as the variable.
The modeling above yields a law-of-heat-radiation model for cold and hot data, i.e., the temperature rise term.
In a preferred embodiment, the calculation of the data temperature includes both the decay and the rise of the data temperature. The expression of the temperature model is thus obtained, specifically:
[Formula image: Figure SMS_72]
where [Figure SMS_73] is the temperature of the data at time [Figure SMS_74], [Figure SMS_75] is the decay temperature of the data, [Figure SMS_76] is the elevated temperature of the data, and [Figure SMS_77] is a discrete function with the access times as nodes. It takes the value 1 or 0: 1 at a time when the data is accessed and 0 at a time when it is not. When the data is accessed [Figure SMS_78] times within a time period, its value is [Figure SMS_79].
In the present embodiment, the temperature of the data at time [Figure SMS_80] is calculated and compared with a threshold to divide the data into cold data and hot data. Newton's law of cooling and the law of heat radiation are introduced to obtain a temperature attenuation term and a temperature rise term, so that the heat of the data decreases over time according to physical law and the relationship between the data access interval and the temperature is taken into account. Judging by the absolute temperature of the data avoids the defects of the traditional relative-temperature judgment, brings low-frequency, short-interval access into the temperature judgment, and improves the accuracy of cold and hot data classification.
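The combination of a decay term, a rise term applied only at access times, and a threshold can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the decay is exponential with a zero reference temperature, and the rise per access is a constant `heat_per_access` that stands in for the heat radiation expression, which is not recoverable from this text. The class, field, and function names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class TemperatureTracker:
    cooling_rate: float       # decay constant (assumed)
    heat_per_access: float    # stand-in for the rise term (assumed constant)
    temperature: float = 0.0  # temperature at the time of the last update
    last_update: float = 0.0  # timestamp of the last update

    def temperature_at(self, now: float) -> float:
        """Temperature after decaying from the last update until `now`."""
        elapsed = now - self.last_update
        return self.temperature * math.exp(-self.cooling_rate * elapsed)

    def record_access(self, now: float) -> float:
        """Apply the decay up to `now`, then add the rise for this access."""
        self.temperature = self.temperature_at(now) + self.heat_per_access
        self.last_update = now
        return self.temperature


def classify(temperature: float, threshold: float) -> str:
    """Label data as hot when its temperature reaches the threshold."""
    return "hot" if temperature >= threshold else "cold"
```

Because the decay is applied between accesses, two items with the same access count end up at different temperatures when their access intervals differ, which is the behaviour the paragraph above describes.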
In a preferred embodiment, the system further comprises: a workload prediction module.
The workload prediction module determines a predicted temperature of the data in the target time period according to the temperature of the data.
In practical applications, most time series data is accessed in batches by time period. The efficiency of the hierarchical storage scheduling of the time series database can be improved if the probability of data being accessed in the future time can be predicted and the rise and fall of the temperature of the data in the future time can be captured.
In this embodiment, the corresponding training data and labeling data are extracted by using the time stamps generated by the time sequence when the data is stored, so as to realize the prediction of the temperature of the data at the future time.
Further, the workload prediction module includes a training data extraction unit, a labeling data extraction unit and a temperature prediction unit.
The training data extraction unit extracts data of a sample time period stored in time series to obtain training data; the labeling data extraction unit extracts the temperature of the data in the sample time period, and labels the data using the temperature to obtain labeling data; the temperature prediction unit establishes a temperature prediction model, and trains the temperature prediction model using the training data and the labeling data to obtain the predicted temperature of the data in the target time period.
In the present embodiment, the data of the sample period stored in time series and the temperature corresponding to the data are extracted by the training data extraction unit and the labeling data extraction unit, respectively.
Specifically, the expression for training data extraction and labeling data extraction is:
[Formula image: Figure SMS_81]
where [Figure SMS_82] is the labeling data and [Figure SMS_83] is the training data.
The temperature prediction model is then trained, and its output is the predicted temperature of the data. In a preferred embodiment, the prediction model is an LSTM model.
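The extraction expression itself is not recoverable from this text, so the sketch below shows one common way to prepare training and labeling data for a sequence model such as an LSTM: a sliding window over a temperature history. It is a simplification that uses past temperatures as the training input; the function name `make_windows` and the parameters `window` and `horizon` are assumptions, and the patent's actual extraction may differ.

```python
def make_windows(temperatures, window, horizon=1):
    """Split a temperature series into (training sample, label) pairs.

    Each training sample is `window` consecutive temperatures; its label is
    the temperature `horizon` steps after the end of that window.
    """
    if window < 1 or horizon < 1:
        raise ValueError("window and horizon must be positive")
    samples, labels = [], []
    # The range is empty when the series is too short for a single pair.
    for start in range(len(temperatures) - window - horizon + 1):
        end = start + window
        samples.append(list(temperatures[start:end]))
        labels.append(temperatures[end + horizon - 1])
    return samples, labels
```

The resulting lists can be converted to arrays and passed to whichever sequence model is used; the model itself is left out here to keep the sketch free of external dependencies.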
Therefore, the cold and hot data classification module 10 may divide the data into hot data and cold data according to the predicted temperature of the data. Machine learning or deep learning is used to predict the temperature change of future time series data and thereby assist the hot/cold division of the data, i.e., data scheduling. Data that will be hot in the future is migrated to the cloud storage device and/or the edge storage device in advance, which improves the efficiency and accuracy of data calls in the system.
After the cold and hot data classification module 10 divides the data into hot data and cold data, the data migration module 20 stores the hot data in the edge device and the cloud device in a layered manner in advance, so as to reduce the size of the transmitted data. The cold data is stored on the terminal device. At the same time, the workload prediction module helps the system predict the data temperature for the next time period. If the predicted temperature is high, the data is migrated to the edge and cloud devices in advance. If the predicted temperature falls, indicating that the data will be very cold in the future, the data remains stored on the terminal device.
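The scheduling rule described above can be sketched as a small decision function. The tier labels `"cloud_edge"` and `"terminal"`, the function names, and the use of a single `hot_threshold` for both the current and the predicted temperature are assumptions for illustration; the text does not specify how the two temperatures are combined.

```python
def plan_placement(current_temp, predicted_temp, hot_threshold):
    """Return the target tier for one data block.

    "cloud_edge" -- store on the cloud and/or edge storage module
    "terminal"   -- store on the terminal storage module
    """
    if current_temp >= hot_threshold:
        return "cloud_edge"  # already hot
    if predicted_temp >= hot_threshold:
        return "cloud_edge"  # cold now but predicted to become hot: migrate early
    return "terminal"        # cold now and predicted to stay cold


def plan_migrations(blocks, hot_threshold):
    """Map block id -> target tier, for blocks given as id -> (current, predicted)."""
    return {
        block_id: plan_placement(current, predicted, hot_threshold)
        for block_id, (current, predicted) in blocks.items()
    }
```

In this sketch a block that is hot now stays on the cloud or edge tier even if it is predicted to cool; a stricter policy could demote such blocks early, at the risk of extra transfers if the prediction is wrong.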
In a preferred embodiment, the system further comprises a data digest creation module. The data digest creation module includes a supercooled data determining unit and a data digest creation unit.
The supercooled data determining unit determines supercooled data in the cold data according to the temperature of the cold data; the data digest creation module creates a data digest for the supercooled data when the supercooled data is migrated to the terminal storage module 30.
In this embodiment, the supercooled data is determined according to the temperature of the cold data, for example by setting a supercooling threshold. A data digest is then created for the supercooled data, so that the storage space occupied on the terminal device can be minimized.
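A minimal Python sketch of the two steps follows: selecting supercooled data with a threshold, then creating a digest for it. The text does not say what the digest contains, so the summary statistics used here (count, minimum, maximum, mean) are an assumption, as are the function names and the `supercooling_threshold` parameter.

```python
def split_supercooled(cold_blocks, supercooling_threshold):
    """Partition cold blocks (id -> temperature) into supercooled and ordinary ids."""
    supercooled = [b for b, t in cold_blocks.items() if t < supercooling_threshold]
    ordinary = [b for b, t in cold_blocks.items() if t >= supercooling_threshold]
    return supercooled, ordinary


def create_digest(values):
    """Summarise a non-empty numeric time series as count, min, max and mean."""
    if not values:
        raise ValueError("cannot summarise an empty series")
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }
```

Whether the digest is stored alongside the raw supercooled data or in place of it is a design decision that this text leaves open.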
The invention provides an efficient storage system based on hierarchical storage, which uses a scheduling algorithm based on workload prediction to perform hierarchical storage management on a cloud-edge time series database. It helps the cloud-edge time series database avoid the storage cost incurred when all three parties store all of the data, and greatly reduces the cost of data transmission among the parties during collaborative query processing.
The embodiment also provides a high-efficiency storage method based on hierarchical storage, and referring to fig. 2, fig. 2 is a flow chart of the high-efficiency storage method based on hierarchical storage in this embodiment.
In this embodiment, a hierarchical storage-based efficient storage method is used in a hierarchical storage-based efficient storage system according to the foregoing embodiment, where the method includes:
S1: the cold and hot data classification module determines the temperature of the data and divides the data into hot data and cold data according to the temperature of the data;
S2: the data migration module migrates hot data in the data to the cloud storage module and/or the edge storage module, and migrates cold data in the data to the terminal storage module.
The specific implementation of the efficient storage method based on hierarchical storage is basically the same as that of the embodiments of the efficient storage system based on hierarchical storage described above, and is not repeated here.
In describing embodiments of the present invention, it should be understood that the terms "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "center", "top", "bottom", "inner", "outer", "inside", "outside", etc. indicate orientations or positional relationships based on the drawings. They are used merely for convenience in describing the present invention and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation or be configured and operated in a specific orientation; they should therefore not be construed as limiting the present invention. Here, "inside" refers to an interior or enclosed area or space, and "peripheral" refers to the area surrounding a particular component or region.
In the description of embodiments of the present invention, the terms "first", "second", "third" and "fourth" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined by "first", "second", "third" or "fourth" may explicitly or implicitly include one or more such features. In the description of the present invention, unless otherwise indicated, "a plurality" means two or more.
In describing embodiments of the present invention, it should be noted that the terms "mounted," "connected," and "assembled" are to be construed broadly, as they may be fixedly connected, detachably connected, or integrally connected, unless otherwise specifically indicated and defined; can be directly connected or indirectly connected through an intermediate medium, and can be communication between two elements. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art.
In the description of embodiments of the invention, a particular feature, structure, material, or characteristic may be combined in any suitable manner in one or more embodiments or examples.
In describing embodiments of the present invention, it will be understood that the terms "-" and "~" denote a range between two values, and that the range includes the endpoints. For example, "A-B" means a range greater than or equal to A and less than or equal to B. "A~B" means a range greater than or equal to A and less than or equal to B.
In the description of embodiments of the present invention, the term "and/or" is merely an association relationship describing an association object, meaning that three relationships may exist, e.g., a and/or B, may represent: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the front and rear associated objects are an "or" relationship.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (10)

1. An efficient storage system based on tiered storage, comprising:
a cold and hot data classification module;
a data migration module;
a terminal storage module;
a cloud storage module and/or an edge storage module;
wherein the cold and hot data classification module determines the temperature of the data and divides the data into hot data and cold data according to the temperature of the data; and
the data migration module migrates the hot data to the cloud storage module and/or the edge storage module, and migrates the cold data to the terminal storage module.
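The division and migration recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the threshold value, the `DataItem` type, and the use of dictionaries as stand-ins for the storage modules are all assumptions made for illustration, since the claim does not fix any of them.

```python
from dataclasses import dataclass

HOT_THRESHOLD = 50.0  # hypothetical cut-off; the claims do not fix a value


@dataclass
class DataItem:
    key: str
    temperature: float  # assumed to be supplied by the classification module


def classify(item: DataItem, threshold: float = HOT_THRESHOLD) -> str:
    """Label an item 'hot' or 'cold' from its temperature."""
    return "hot" if item.temperature >= threshold else "cold"


def migrate(items, cloud_or_edge: dict, terminal: dict,
            threshold: float = HOT_THRESHOLD) -> None:
    """Route hot items to the cloud/edge tier and cold items to the terminal tier."""
    for item in items:
        target = cloud_or_edge if classify(item, threshold) == "hot" else terminal
        target[item.key] = item


# Dictionaries stand in for the storage modules (an assumption of this sketch).
cloud, terminal = {}, {}
migrate([DataItem("a", 80.0), DataItem("b", 12.5)], cloud, terminal)
print(sorted(cloud), sorted(terminal))
```

Keeping the threshold as a parameter reflects that the claim only requires a division "according to the temperature of the data" and leaves the cut-off open.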
2. The tiered storage-based efficient storage system of claim 1 wherein the cold and hot data classification module comprises:
a temperature determination unit;
the temperature determining unit acquires a temperature attenuation item and a temperature rise item of data, establishes a temperature model according to the temperature attenuation item and the temperature rise item, and determines the temperature of the data by using the temperature model.
3. The efficient storage system based on hierarchical storage according to claim 2, wherein the expression of the temperature decay term is specifically:
[Figure QLYQS_1]
wherein [Figure QLYQS_2] is the attenuation temperature of the data at time [Figure QLYQS_3], [Figure QLYQS_4] is the temperature of the data at time [Figure QLYQS_5], and [Figure QLYQS_6] is a custom cooling rate.
4. The efficient storage system based on hierarchical storage according to claim 3, wherein the expression of the temperature rise term is specifically:
[Figure QLYQS_8];
[Figure QLYQS_14];
[Figure QLYQS_17]
wherein [Figure QLYQS_9] is the elevated temperature of the data, [Figure QLYQS_12] is the introduced heat source temperature, [Figure QLYQS_16] is the data volume density, [Figure QLYQS_19] is an intermediate variable, [Figure QLYQS_7] is the heat radiation coefficient, [Figure QLYQS_11] is the access interval, [Figure QLYQS_15] is the [Figure QLYQS_18] accessible data after the access, [Figure QLYQS_10] is a custom specific heat capacity, and [Figure QLYQS_13] is the data length.
5. The efficient storage system based on hierarchical storage according to claim 4, wherein the expression of the temperature model is specifically:
[Figure QLYQS_22]
wherein [Figure QLYQS_24] is the temperature of the data at time [Figure QLYQS_26]; [Figure QLYQS_21] is the decay temperature of the data; [Figure QLYQS_23] is the elevated temperature of the data; and [Figure QLYQS_25] is a discrete function whose nodes are the access times and whose value is 1 or 0: 1 at a time when the data is accessed and 0 at a time when it is not. When the data is accessed [Figure QLYQS_27] times within a time period, the value of the function is [Figure QLYQS_20].
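The formula images for the decay term, the rise term, and the temperature model are not reproduced in this text, so the following Python sketch only illustrates the general shape described in claims 2 to 5: a decayed temperature plus a rise term gated by a discrete access function. The exponential cooling law, the constant rise per access, and every function and parameter name below are assumptions, not the claimed expressions.

```python
import math


def decay_temperature(prev_temp: float, elapsed: float, cooling_rate: float) -> float:
    """Assumed exponential cooling toward zero over `elapsed` time units."""
    return prev_temp * math.exp(-cooling_rate * elapsed)


def update_temperature(prev_temp: float, elapsed: float, cooling_rate: float,
                       accesses: int, rise_per_access: float) -> float:
    """Decayed temperature plus an access-gated rise term.

    `accesses` plays the role of the discrete function: 0 when the data was
    not accessed in the step, otherwise the access count for the step
    (treating the value as a count is an assumption of this sketch).
    """
    decayed = decay_temperature(prev_temp, elapsed, cooling_rate)
    return decayed + accesses * rise_per_access


temp = 10.0
for accesses in [0, 0, 2, 0]:  # hypothetical access trace, one entry per time step
    temp = update_temperature(temp, elapsed=1.0, cooling_rate=0.5,
                              accesses=accesses, rise_per_access=3.0)
    print(round(temp, 3))
```

In this sketch the temperature falls during idle steps and jumps when accesses occur, which is the qualitative behavior the claims attribute to the attenuation and rise terms.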
6. The tiered storage based efficient storage system of claim 1 further comprising:
a workload prediction module;
the workload prediction module determines the predicted temperature of the data in the target time period according to the temperature of the data.
7. The tiered storage-based efficient storage system of claim 6 wherein the workload prediction module comprises:
a training data extraction unit;
a labeling data extraction unit;
a temperature prediction unit;
the training data extraction unit extracts data of a sample time period stored in time sequence to obtain training data;
the labeling data extraction unit extracts the temperature of the data in the sample time period, and labels the data by utilizing the temperature to obtain labeling data;
the temperature prediction unit establishes a temperature prediction model, and trains the temperature prediction model by using training data and labeling data to obtain the predicted temperature of the data in the target time period.
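The three units of claim 7 can be pictured with a small, dependency-free Python sketch. The claim does not say which prediction model is used, so the one-step-ahead least-squares line below is purely a stand-in; `make_samples`, `fit_line`, and the `history` values are hypothetical, and using the temperature series as both input and label is a simplification.

```python
def make_samples(series, lag=1):
    """Pair each value (training input) with the value `lag` steps later (label)."""
    return [(series[i], series[i + lag]) for i in range(len(series) - lag)]


def fit_line(samples):
    """Ordinary least squares for y = a * x + b."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


# Hypothetical temperatures of one data item, one value per sample period.
history = [40.0, 20.0, 10.0, 5.0, 2.5]
a, b = fit_line(make_samples(history))

# Predicted temperature for the next (target) period.
predicted = a * history[-1] + b
print(round(predicted, 3))
```

The predicted temperature could then be passed to the cold and hot data classification module, as claim 8 describes.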
8. The tiered storage-based efficient storage system of claim 7 wherein the cold and hot data classification module classifies data into hot data and cold data based on a predicted temperature of the data.
9. The tiered storage based efficient storage system of claim 1 further comprising: a data digest creation module; wherein the data digest creation module comprises:
a supercooling data determining unit;
a data digest creation unit;
the supercooling data determining unit determines supercooled data in the cold data according to the temperature of the cold data;
and the data digest creation module creates a data digest for the supercooled data when the supercooled data is migrated to the terminal storage module.
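As a rough illustration of claim 9, the Python sketch below flags supercooled data with a second, lower threshold and builds a small descriptive record when such data is migrated to the terminal tier. The claim does not define the contents of the record, so the fields chosen here (size, SHA-256 hash, short preview), the threshold value, and all names are assumptions.

```python
import hashlib

SUPERCOOL_THRESHOLD = 5.0  # hypothetical; assumed to lie below the hot/cold cut-off


def is_supercooled(temperature: float, threshold: float = SUPERCOOL_THRESHOLD) -> bool:
    """Return True when cold data is cold enough to count as supercooled."""
    return temperature < threshold


def create_digest(key: str, payload: bytes, preview_len: int = 16) -> dict:
    """Build a small record describing `payload` without holding it in full."""
    return {
        "key": key,
        "size": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),
        "preview": payload[:preview_len],
    }


def migrate_cold(key: str, payload: bytes, temperature: float,
                 terminal: dict, digests: dict) -> None:
    """Store cold data on the terminal tier; add a record if it is supercooled."""
    terminal[key] = payload
    if is_supercooled(temperature):
        digests[key] = create_digest(key, payload)


terminal_store, digest_index = {}, {}
migrate_cold("log-1", b"x" * 100, 1.0, terminal_store, digest_index)
migrate_cold("log-2", b"abc", 20.0, terminal_store, digest_index)
print(sorted(terminal_store), sorted(digest_index))
```

A record of this kind would let a lookup answer basic questions about rarely accessed data without reading the full payload back from the terminal storage module.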
10. A hierarchical storage based efficient storage method for a hierarchical storage based efficient storage system as claimed in any one of claims 1-9, the method comprising:
S1: the cold and hot data classification module determines the temperature of the data and divides the data into hot data and cold data according to the temperature of the data;
S2: the data migration module migrates the hot data to the cloud storage module and/or the edge storage module, and migrates the cold data to the terminal storage module.
CN202310494361.5A 2023-05-05 2023-05-05 Efficient storage system and method based on hierarchical storage Active CN116204138B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310494361.5A CN116204138B (en) 2023-05-05 2023-05-05 Efficient storage system and method based on hierarchical storage

Publications (2)

Publication Number Publication Date
CN116204138A true CN116204138A (en) 2023-06-02
CN116204138B CN116204138B (en) 2023-07-07

Family

ID=86513353

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310494361.5A Active CN116204138B (en) 2023-05-05 2023-05-05 Efficient storage system and method based on hierarchical storage

Country Status (1)

Country Link
CN (1) CN116204138B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111124295A (en) * 2019-12-11 2020-05-08 成都信息工程大学 Agricultural data storage processing system and method based on ternary influence factor
CN112948398A (en) * 2021-04-29 2021-06-11 电子科技大学 Hierarchical storage system and method for cold and hot data
WO2022089218A1 (en) * 2020-10-28 2022-05-05 苏州奇流信息科技有限公司 Machine learning model training method and apparatus, and prediction system
CN114817425A (en) * 2022-06-28 2022-07-29 成都交大大数据科技有限公司 Method, device and equipment for classifying cold and hot data and readable storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
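Du Zhenwei: "Load-aware distributed hybrid storage system based on object storage", China Master's Theses Full-text Database, Information Science and Technology, no. 2019, pages 137 - 32 *
Du Peng: "Building a three-tier file storage management system for HLS protocol streaming media files based on a distributed architecture", Information & Computer (Theory Edition), no. 07, pages 64 - 66 *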

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117076523A (en) * 2023-10-13 2023-11-17 北京云成金融信息服务有限公司 Local data time sequence storage method
CN117076523B (en) * 2023-10-13 2024-02-09 华能资本服务有限公司 Local data time sequence storage method
CN117273131A (en) * 2023-11-22 2023-12-22 四川三合力通科技发展集团有限公司 Cross-node data relationship discovery system and method
CN117273131B (en) * 2023-11-22 2024-02-13 四川三合力通科技发展集团有限公司 Cross-node data relationship discovery system and method

Also Published As

Publication number Publication date
CN116204138B (en) 2023-07-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant