CN105653591B - A kind of industrial real-time data classification storage and moving method - Google Patents

A kind of industrial real-time data classification storage and moving method

Info

Publication number
CN105653591B
Authority
CN
China
Prior art keywords
data
migration
value
data object
storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510969294.3A
Other languages
Chinese (zh)
Other versions
CN105653591A (en
Inventor
徐星
陈鹏
叶莹
王天林
宋丽娜
庄严
周玄昊
俞翔
韩冰
王挺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZHEJIANG SUPCON RESEARCH CO LTD
Original Assignee
ZHEJIANG SUPCON RESEARCH Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZHEJIANG SUPCON RESEARCH Co Ltd
Priority to CN201510969294.3A
Publication of CN105653591A
Application granted
Publication of CN105653591B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/214 Database migration support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a method and a system for hierarchical storage and migration of industrial real-time data. The method comprises: periodically monitoring the hierarchical storage system and triggering a data migration calculation once the storage capacity utilization of the upper-tier storage device reaches a preset first threshold; during migration, first evaluating the value of each data object in the storage device and ranking the data objects by value; and then, according to the ranking, setting a migration strategy, selecting the data objects to be migrated, forming a migration queue and migrating them. The method and system set the migration strategy from the difference in storage device priority and the value ranking of the current data objects. The value calculation considers the time factor that affects data object migration efficiency, the number of accessing users, the data objects associated with the object, the access contrast between different storage devices, and the size of the data object itself, which improves the accuracy of data value determination.

Description

Industrial real-time data hierarchical storage and migration method
Technical Field
The invention relates to the technical field of data access and storage processing, in particular to a hierarchical storage and migration method for industrial real-time data.
Background
With the expansion of industrial systems and the continuing development of automation and information technology, the massive data volumes of industrial automation systems have sharply increased concurrent access to distributed file systems, and the growing file read/write pressure makes file I/O a system bottleneck that must be addressed. Meanwhile, many process-control applications place strict real-time requirements on data. Because storage devices differ in performance and cost, and data access exhibits temporal and spatial locality, hierarchical storage is needed: frequently accessed data should reside on high-performance devices, while data that has not been read or written recently is placed on low-performance devices. In addition, data access heat changes over time, often periodically; a considerable proportion of the data in a mass storage system is static, and high-performance storage capacity is limited. Data therefore has to be migrated between the tiers of a hierarchical storage system.
With the rapid development of solid state disks (SSDs) and their adoption in many fields, combining solid state disks with other media for multi-level storage has become a focus of current and future storage research. Compared with traditional hard disks, solid state disks have pronounced advantages and disadvantages; they can improve system performance and energy consumption and can serve as the flash tier of a multi-level storage system. However, because of their high price, performance, cost and energy consumption must be weighed against one another.
The earliest tiered storage applications were used mainly in archive and backup environments where access was infrequent. However, the performance gaps between devices vary; if device pairs with large and small performance gaps use the same trigger condition, system scalability suffers. One existing technique realizes unified management of files across multi-level storage devices by providing a metadata module, a metadata server module and a target data server module. The metadata server module contains a system management and file migration decision module; migration candidate files are acquired manually and divided into an upgrade queue and a downgrade queue, and a migration scheduling controller sends migration instructions to carry out the migration. The source data server and the target data server are each provided with a data service module and a migration execution module. The main defect of this technique is that it has no systematic, specific method for determining the file migration trigger point; the proportion of files to migrate is set manually, so the factors that influence data value cannot be evaluated comprehensively and accurately.
Mass-data hierarchical storage technology places data on different devices according to the performance of the storage devices and migrates data at appropriate times. However, existing hierarchical storage methods do not fully exploit the available metrics in their tiering and migration policies (that is, in their data value determination methods). Because the overall performance of a hierarchical storage system is determined directly by its data placement and data migration policies, a more complete migration policy and hierarchical data placement method is urgently needed.
Disclosure of Invention
The invention aims to provide a method for hierarchical storage and migration of industrial real-time data, in order to solve the problem that existing mass-data hierarchical storage and migration techniques do not evaluate data value thoroughly, which degrades data storage and migration performance.
To achieve the above object, the present invention provides a method for hierarchical storage and migration of industrial real-time data, which comprises two parts, hierarchical storage of data and hierarchical migration of data, wherein the hierarchical storage of data comprises the following steps:
I: evaluating the value of the data;
II: placing or migrating the data into an appropriate tier according to its value;
the hierarchical migration of data comprises the following steps:
S1: periodically monitoring the hierarchical storage system; when the storage capacity utilization of the high-priority storage device reaches a preset first threshold, triggering the data migration calculation and proceeding to step S2;
S2: evaluating the value of each data object in the storage device to obtain the value of each data object, and ranking the data objects by value;
S3: according to a preset proportion, the second threshold, selecting the lowest-ranked data objects stored in the high-priority storage device to form a migration queue, and migrating the data objects in the migration queue to the low-priority storage device;
S4: according to a third threshold, comparing the addresses of the lowest-ranked data objects remaining in the high-priority storage device after step S3 with the data object addresses currently held in the cache; if a data object's address is already stored in the cache, migrating that data object to the low-priority storage device, otherwise storing its address in the cache, and so on for each such data object; letting $N_b$ be the number of data object addresses in the cache after all comparisons, if $N_b \le N_h$, stopping the migration operation; if $N_b > N_h$, sorting the data object addresses stored in the cache by the values of the corresponding data objects from largest to smallest and removing the addresses with the largest values one by one until $N_h$ addresses remain, whereupon the migration operation stops, where $N_h$ is a preset upper limit on the number of data object addresses in the cache;
S5: finding the data object with the maximum value in the current cache, forming a migration queue, ordered by value from high to low, of all data objects in the low-priority storage device whose value is greater than that maximum, and migrating them to the high-priority storage device.
Preferably, in S2, a sliding window method is used to calculate a weighted average of the values computed at each evaluation within the sliding window, specifically:
the width of the window is set to N, and the values of the data object computed at the N most recent evaluations within the window are $V_1, V_2, \ldots, V_N$; the current value of the data object is the weighted average of these N values.
Preferably, in S2, the value of a data object is calculated according to the following formula:
$V = w_1 T + w_2 C + w_3 M + w_4 \mathit{CT} + w_5 / S$
where T is the time factor, C is the number-of-accessing-users factor, M is the value factor of the data objects associated with the data object, CT is the contrast factor of the different storage devices, S is the size factor of the data object, and $w_1$, $w_2$, $w_3$, $w_4$ and $w_5$ are the weights of the corresponding factors.
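The value formula can be illustrated with a minimal Python sketch. The `Factors` container, the function name `object_value` and the numeric inputs are hypothetical and chosen only so the arithmetic is easy to follow; the source does not give concrete weights or factor values.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Factors:
    """Hypothetical container for the five factors of one data object."""
    T: float   # time factor
    C: float   # number-of-accessing-users factor
    M: float   # value factor of the associated data objects
    CT: float  # contrast factor of the storage devices
    S: float   # size factor of the data object (must be positive)


def object_value(f: Factors, w: Sequence[float]) -> float:
    """Evaluate V = w1*T + w2*C + w3*M + w4*CT + w5/S."""
    if len(w) != 5:
        raise ValueError("exactly five weights are required")
    if f.S <= 0:
        raise ValueError("size factor S must be positive")
    w1, w2, w3, w4, w5 = w
    return w1 * f.T + w2 * f.C + w3 * f.M + w4 * f.CT + w5 / f.S


# Illustrative numbers only (assumed, not taken from the source).
example = Factors(T=2.0, C=3.0, M=1.5, CT=4.0, S=8.0)
weights = (0.5, 1.0, 2.0, 0.25, 4.0)
value = object_value(example, weights)
```

Because S enters as $w_5 / S$, a larger object contributes less to its own value, so with a positive $w_5$ large objects are more likely to be ranked low and moved off the high-priority device.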
Preferably, the time factor T is obtained as follows:
acquire all the moments at which the data object has been accessed since its creation: $t_1, t_2, \ldots, t_n$, where n is a positive integer;
calculate the lengths of the intervals between successive accesses, $T_1, T_2, \ldots, T_{n-1}$:
$T_i = t_{i+1} - t_i, \quad i = 1, 2, \ldots, n-1$
and calculate T:
$T = \sum_{i=1}^{n-1} \alpha_i T_i$
where $\alpha_i$ ($i = 1, 2, \ldots, n-1$) is a preset set of weights satisfying $\sum_{i=1}^{n-1} \alpha_i = 1$ and $\alpha_1 \le \alpha_2 \le \ldots \le \alpha_{n-1}$.
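A minimal Python sketch of the time factor follows. It assumes that T is the weighted average of the access intervals, $T = \sum \alpha_i T_i$, with weights that sum to 1; the linearly increasing default weights and the function names are illustrative assumptions, not values taken from the source.

```python
def access_intervals(times):
    """Return T_i = t_{i+1} - t_i for consecutive access times."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


def linear_weights(count):
    """Assumed default: non-decreasing weights 1, 2, ..., count, normalised to sum to 1."""
    total = count * (count + 1) / 2
    return [(i + 1) / total for i in range(count)]


def time_factor(times, alphas=None):
    """Weighted average of the access intervals; later intervals weigh more."""
    intervals = access_intervals(times)
    if not intervals:
        raise ValueError("at least two access times are required")
    if alphas is None:
        alphas = linear_weights(len(intervals))
    if len(alphas) != len(intervals):
        raise ValueError("one weight per interval is required")
    if any(a > b for a, b in zip(alphas, alphas[1:])):
        raise ValueError("weights must be non-decreasing")
    return sum(a * t for a, t in zip(alphas, intervals))


# Access times 0, 10, 14, 16 give intervals 10, 4, 2.
t_default = time_factor([0, 10, 14, 16])
t_custom = time_factor([0, 10, 14, 16], alphas=[0.25, 0.25, 0.5])
```

With the default weights 1/6, 2/6 and 3/6, the most recent interval dominates, which matches the stated intent that recent access behavior should count more.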
Preferably, for any data object, its associated data objects are defined as follows:
let the time length threshold be $T_{th}$; if data object $obj_1$ is accessed at any time $t_0$ and data object $obj_2$ is also accessed within the interval from $t_0$ to $t_0 + T_{th}$, then $obj_1$ and $obj_2$ are considered associated.
Preferably, the value factor Q of the data objects associated with data object $obj_1$ (the factor M in the value formula) is obtained as follows:
find the set $\Phi(obj_1)$ of data objects associated with $obj_1$;
find the values of all data objects in $\Phi(obj_1)$;
sum the values of all data objects associated with $obj_1$:
$Q = \sum_{obj \in \Phi(obj_1)} V_{obj}$
where $V_{obj}$ is the recorded value of data object obj.
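The association rule and the summed value of the associated objects can be sketched as follows. The access log format (a list of `(time, object_id)` pairs), the inclusive interval bounds and the function names are assumptions made for illustration.

```python
def associated_objects(log, target, t_th):
    """Objects accessed within t_th after any access to `target`.

    `log` is an assumed list of (time, object_id) pairs.
    The interval [t0, t0 + t_th] is treated as inclusive (an assumption).
    """
    target_times = [t for t, obj in log if obj == target]
    related = set()
    for t, obj in log:
        if obj == target:
            continue
        if any(t0 <= t <= t0 + t_th for t0 in target_times):
            related.add(obj)
    return related


def related_value(log, target, t_th, values):
    """Sum the recorded values of every object associated with `target`."""
    return sum(values[obj] for obj in associated_objects(log, target, t_th))


log = [(0, "a"), (2, "b"), (9, "c"), (20, "a"), (21, "c")]
values = {"a": 1.0, "b": 2.0, "c": 3.5}
related = associated_objects(log, "a", t_th=5)
q = related_value(log, "a", t_th=5, values=values)
```

Here "b" is associated with "a" through the access at time 2, and "c" through the access at time 21; the access to "c" at time 9 falls outside both windows.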
Preferably, the interval from the creation of the data object to the current time is divided into m periods, and the contrast factor CT of the different storage devices is calculated from the following quantities:
$FW_i$ and $FR_i$ are the write and read frequencies of the data object during the i-th period; $\beta_i$ is the weight of the i-th period, with $\beta_1 < \beta_2 < \ldots < \beta_m$; $\delta_r$ is the read contrast between the two storage devices, and $\delta_w$ is the write contrast between them.
Preferably, for different storage devices A and B, the read contrast $\delta_r$ and the write contrast $\delta_w$ are determined from the following quantities:
$R_A$ and $R_B$ are the sustained read speeds on the two devices A and B of different performance, and $W_A$ and $W_B$ are the corresponding sustained write speeds.
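The formulas for the contrasts and for CT are not reproduced in this text, so the following Python sketch rests on two explicit assumptions: each contrast is taken as the ratio of the sustained speeds of the two devices ($\delta_r = R_A / R_B$, $\delta_w = W_A / W_B$), and CT is taken as the period-weighted sum $\sum_i \beta_i (\delta_r FR_i + \delta_w FW_i)$. Both forms, and all names and numbers below, are hypothetical and only show how the listed quantities could fit together.

```python
def contrast(speed_a, speed_b):
    """Assumed form: ratio of sustained speeds of device A to device B."""
    if speed_b <= 0:
        raise ValueError("speeds must be positive")
    return speed_a / speed_b


def contrast_factor(read_freq, write_freq, betas, delta_r, delta_w):
    """Assumed form: CT = sum_i beta_i * (delta_r * FR_i + delta_w * FW_i)."""
    if not (len(read_freq) == len(write_freq) == len(betas)):
        raise ValueError("one read frequency, write frequency and weight per period")
    if any(b1 >= b2 for b1, b2 in zip(betas, betas[1:])):
        raise ValueError("period weights must be strictly increasing")
    return sum(
        beta * (delta_r * fr + delta_w * fw)
        for beta, fr, fw in zip(betas, read_freq, write_freq)
    )


# Illustrative speeds in MB/s for a fast device A and a slow device B.
delta_r = contrast(500.0, 125.0)
delta_w = contrast(300.0, 100.0)
ct = contrast_factor(
    read_freq=[1.0, 2.0],   # FR_1, FR_2
    write_freq=[1.0, 0.0],  # FW_1, FW_2
    betas=[0.25, 0.75],     # the more recent period weighs more
    delta_r=delta_r,
    delta_w=delta_w,
)
```

Under these assumptions, an object that is read often in recent periods gains the most from sitting on the faster device, which is reflected in a larger CT.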
Preferably, the first threshold is 80%, the second threshold is 10%, and the third threshold is 10%.
The invention also provides a hierarchical storage and migration system for industrial real-time data, which comprises:
a hierarchical storage system, which comprises a plurality of storage devices with different priorities for storing data objects, and further comprises a cache, wherein the cache stores the addresses of lower-value data objects in the high-priority storage device, the cached addresses change dynamically as migration proceeds, and, when data is migrated from a higher to a lower tier, a selected data object whose address is already stored in the cache is migrated to the low-priority storage device;
a value judgment manager, which acquires the data objects in the hierarchical storage system in real time and calculates their values;
a data placement plan manager, which obtains the values from the value judgment manager, selects the data objects to be migrated according to the value results to form a migration queue, and produces a data placement plan and a migration strategy;
a migration engine controller, which obtains the data placement plan and the migration strategy, sends migration commands to an application server agent, and registers the application server agent;
an application server agent, which registers with the migration engine controller during initialization and receives the migration commands and forwards them to the corresponding data migration/back-migration module;
and data migration/back-migration modules, each arranged between two storage devices of different priorities, which perform data migration or back-migration according to the migration command and feed the migration result back to the migration engine controller.
Preferably, the migration engine controller includes:
the data monitoring module is used for monitoring and recording the updating condition and the value change of the data object, feeding back the updating condition to the value judgment manager, and monitoring the I/O access condition of the system;
and the data management module is used for regularly inquiring the data placement plan manager so as to update data information, send the migration command and receive the migration result.
The industrial real-time data hierarchical storage and migration system and method provided by the invention realize the following technical effects:
(1) The invention adopts a hierarchical data storage architecture for industrial automation systems and provides a complete and reasonable physical and logical structure for migration.
(2) The invention adopts a data value judgment method aimed at the storage requirements of massive industrial real-time data. The method uses a value index function and introduces a set of weight parameters to quantify the influencing factors, including time, the number of accessing users, the degree of association with other data, the value of the associated data, the I/O access contrast between different storage devices, and the size of the data object. It also uses a sliding window method to judge data value dynamically and thoroughly, thereby improving the accuracy of data value judgment.
(3) The invention adopts a dynamic data migration strategy: a cache region is added to the traditional data migration mechanism to serve as a pending region for some of the lower-value data objects on the high-performance device before they are migrated. If a data object's value is still low at the second value judgment, a migration event is triggered. The maximum value among the data objects in the cache region also serves as the upward migration threshold for data objects on the low-performance device. With this mechanism, the value of data objects on the high-performance device is evaluated dynamically, and repeated migration of data objects back and forth between the high-performance and low-performance devices is effectively suppressed.
Drawings
FIG. 1 is a schematic diagram of an industrial real-time data hierarchical storage and migration system architecture according to the present invention;
FIG. 2 is a flowchart of a method for hierarchical storage and migration of industrial real-time data according to the present invention.
Detailed Description
To better illustrate the present invention, a preferred embodiment is described in detail with reference to the accompanying drawings, in which:
The industrial real-time data hierarchical storage and migration system provided by the invention is applied to general-purpose industrial data storage systems in current use. Such a storage system stores massive industrial real-time data, and the data can be transmitted to storage over a SAN network or an IP network.
Specifically, as shown in FIG. 1, the system for hierarchical storage and migration of industrial real-time data provided in this embodiment includes a hierarchical storage system 10, a value judgment manager 20, a data placement plan manager 30, a migration engine controller 40, an application server agent 50, and a data migration/back-migration module 60. The hierarchical storage system 10 includes a plurality of storage devices with different priorities; in this embodiment it includes a primary device 11, a secondary device 12, and a tertiary device 13, where the primary device 11 has higher priority than the secondary device 12, and the secondary device 12 has higher priority than the tertiary device 13. The data migration/back-migration module 60 includes a first migration/back-migration module 61 located between the primary device 11 and the secondary device 12, and a second migration/back-migration module 62 located between the secondary device 12 and the tertiary device 13.
The hierarchical storage system 10 further includes a cache, which in this embodiment is a high-performance cache such as cache memory. The cache stores the addresses of lower-value data objects in the high-priority storage device. The cached addresses change dynamically as migration proceeds, and the corresponding data objects are in a pending state. When data is migrated from a higher to a lower tier, a selected data object whose address is already stored in the cache is migrated to the low-priority storage device.
In this embodiment, a high-priority storage device is a high-performance storage device, for example a solid state disk. A low-priority storage device is a storage device whose performance is low relative to the high-priority storage device, for example a SAS hard disk or a SATA hard disk. A person skilled in the art can freely choose the types of the high-priority and low-priority storage devices according to the data storage requirements, provided that the storage devices of different priorities differ in read/write performance and that this difference affects the storage efficiency of different data objects. The method and system of the present invention can thus be applied wherever optimization based on such differences between storage devices and data objects is required.
When the industrial real-time data hierarchical storage and migration system operates, the value judgment manager 20 acquires the data objects in the hierarchical storage system in real time, calculates their values, and sends the value information to the data placement plan manager 30. After obtaining the value information and analyzing and weighing the value results, the data placement plan manager 30 selects the data objects to be migrated to form a migration queue, and produces a data placement plan and a migration strategy for the migration engine controller 40. After acquiring the data placement plan and the migration strategy, the migration engine controller 40 sends migration commands to the application server agent 50 accordingly; it also registers the application server agent 50 at initialization. The application server agent 50 receives the migration commands from the migration engine controller 40 and forwards them to the corresponding data migration/back-migration module. The data migration/back-migration module 60 performs data migration or back-migration according to the migration command addressed to it and feeds the migration result back to the migration engine controller 40.
The migration engine controller 40 specifically includes:
the data monitoring module is used for monitoring and recording the updating condition and the value change of the data object, feeding back the updating condition to the value judgment manager, and monitoring the I/O access condition of the system;
and the data management module is used for regularly inquiring the data placement plan manager so as to update data information, send the migration command and receive the migration result.
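The command path described above (agent registration at initialization, command forwarding, and result feedback) can be sketched in Python as follows. The class names mirror the components of the embodiment, but every method name, the dictionary-based command format and the module name `tier1-tier2` are hypothetical; the sketch moves no data and only records results.

```python
class MigrationModule:
    """Stands in for a data migration/back-migration module between two tiers."""

    def __init__(self, name):
        self.name = name

    def execute(self, command):
        # A real module would move the data; this sketch only reports a result.
        return {
            "module": self.name,
            "object": command["object"],
            "direction": command["direction"],
            "status": "done",
        }


class MigrationEngineController:
    """Registers agents, sends migration commands and collects results."""

    def __init__(self):
        self.agents = []
        self.results = []

    def register(self, agent):
        self.agents.append(agent)

    def dispatch(self, plan):
        for command in plan:
            for agent in self.agents:
                if command["module"] in agent.modules:
                    self.results.append(agent.forward(command))
                    break
        return self.results


class ApplicationServerAgent:
    """Registers with the controller at initialization and forwards commands."""

    def __init__(self, controller, modules):
        self.modules = {module.name: module for module in modules}
        controller.register(self)

    def forward(self, command):
        return self.modules[command["module"]].execute(command)


controller = MigrationEngineController()
agent = ApplicationServerAgent(controller, [MigrationModule("tier1-tier2")])
plan = [{"module": "tier1-tier2", "object": "obj-17", "direction": "down"}]
results = controller.dispatch(plan)
```

Routing every command through the agent keeps the controller unaware of how a particular module moves data, which is consistent with placing one migration/back-migration module between each pair of adjacent tiers.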
The invention provides an industrial real-time data hierarchical storage and migration method, which comprises two parts of data hierarchical storage and data hierarchical migration, wherein the data hierarchical storage comprises the following steps:
I: evaluating the value of the data;
II: placing or migrating the data into an appropriate tier according to its value.
Specifically, a person skilled in the art stores the data hierarchically according to the data value obtained in step I and a general hierarchical storage architecture in current use. For example, the data is placed into the appropriate tier, such as the primary device 11, the secondary device 12, or the tertiary device 13, according to how well its value matches storage devices of different priorities. This storage mode balances the performance and economy of the storage system while taking full account of the value attribute of the data object.
As shown in fig. 2, the data migration method includes the following steps, and the data migration occurs between two storage devices with different performances. Without loss of generality, the following takes as an example a migration process between the primary device 11 and the secondary device 12, where the primary device 11 is a high-priority storage device:
S1: periodically monitoring the hierarchical storage system; when the storage capacity utilization of the high-priority storage device reaches a preset first threshold, triggering the data migration calculation and proceeding to step S2;
S2: evaluating the value of each data object in the storage device to obtain the value of each data object, and ranking the data objects by value;
S3: according to a preset proportion, the second threshold, selecting the lowest-ranked data objects stored in the high-priority storage device (the primary device 11) to form a migration queue, and migrating the data objects in the migration queue to the low-priority storage device;
S4: according to a third threshold, comparing the addresses of the lowest-ranked data objects remaining in the high-priority storage device after step S3 with the data object addresses currently held in the cache; if a data object's address is already stored in the cache, migrating that data object to the low-priority storage device, otherwise storing its address in the cache, and so on for each such data object; letting $N_b$ be the number of data object addresses in the cache after all comparisons, if $N_b \le N_h$, stopping the migration operation; if $N_b > N_h$, sorting the data object addresses stored in the cache by the values of the corresponding data objects from largest to smallest and removing the addresses with the largest values one by one until $N_h$ addresses remain, whereupon the migration operation stops, where $N_h$ is a preset upper limit on the number of data object addresses in the cache;
S5: finding the data object with the maximum value in the current cache, forming a migration queue, ordered by value from high to low, of all data objects in the low-performance device whose value is greater than that maximum, and migrating them to the high-performance device.
That is, at each migration, the data objects in the high-priority storage device are ranked by value from high to low. Starting from the lowest-value end of the ranking, the data objects that together account for the second-threshold percentage of the device's current data volume are migrated directly to the low-priority storage device. The remaining data objects are then examined: starting again from the lowest-value end of the remaining ranking, the data objects that account for the third-threshold percentage of the device's current data volume are considered. The criterion is whether the address of each such data object is already present in the cache, that is, whether the object was placed in the cache during a previous migration. If so, the object has fallen within the third-threshold range in at least two migrations, which indicates that its value is low, and it is moved from the high-priority storage device to the low-priority storage device. For a data object that is within the third-threshold range but has no address in the cache, its address is stored in the cache as a reference for subsequent migrations. The data object with the maximum value among those whose addresses are cached serves as the reference for migrating data objects stored in the low-priority storage device upward.
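One round of steps S3 to S5 can be sketched in Python as below. Several simplifications are assumptions rather than statements of the source: the second and third thresholds are applied to the object count instead of the stored data volume, the address of an object is dropped from the cache once that object is demoted, and cached addresses that no longer refer to an object on the high-priority device are ignored. The names `DataObject` and `plan_round` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DataObject:
    address: str
    value: float


def plan_round(high, low, pending, second=0.10, third=0.10, n_h=2):
    """Return (demote, promote, cache) for one migration round.

    high, low: objects on the high- and low-priority devices.
    pending:   addresses left in the cache by earlier rounds.
    """
    ranked = sorted(high, key=lambda o: o.value)           # lowest value first
    by_addr = {o.address: o for o in high}
    n_direct = math.ceil(second * len(ranked))
    n_review = math.ceil(third * len(ranked))

    demote = list(ranked[:n_direct])                       # S3: direct demotion
    cache = {a for a in pending if a in by_addr}
    for obj in ranked[n_direct:n_direct + n_review]:       # S4: pending check
        if obj.address in cache:
            demote.append(obj)                             # low value twice
        else:
            cache.add(obj.address)
    cache -= {o.address for o in demote}

    # S4: keep at most n_h addresses, dropping those with the largest values.
    cache = set(sorted(cache, key=lambda a: by_addr[a].value)[:n_h])

    promote = []                                           # S5: upward migration
    if cache:
        ceiling = max(by_addr[a].value for a in cache)
        promote = sorted((o for o in low if o.value > ceiling),
                         key=lambda o: o.value, reverse=True)
    return demote, promote, cache


high = [DataObject(f"h{i}", float(i)) for i in range(10)]
low = [DataObject("l0", 1.5), DataObject("l1", 5.0), DataObject("l2", 7.0)]
demote, promote, cache = plan_round(high, low, pending={"h1"}, third=0.20)
```

In this example `h0` is demoted directly, `h1` is demoted because its address was already pending, `h2` becomes the only pending address, and `l2` and `l1` are promoted because their values exceed the value of `h2`. Requiring an object to appear in the pending range twice before demotion is what damps repeated movement between tiers.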
Further, in S2, a sliding window method is used to calculate a weighted average of the values computed at each evaluation within the sliding window, specifically:
the width of the window is set to N, and the values of the data object computed at the N most recent evaluations within the window are $V_1, V_2, \ldots, V_N$; their weighted average is the current value of the data object, $V_c$.
$V_c$ is the final value of the data object and is passed to the data placement plan manager and the subsequent data migration/back-migration module for use during dynamic data migration.
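The sliding-window averaging can be sketched in Python as follows. The per-position weights are not specified in this text, so the weights used here, the normalization by the sum of the weights, and the handling of a window that is not yet full are all assumptions; `SlidingValue` is a hypothetical name.

```python
from collections import deque


class SlidingValue:
    """Weighted average over the N most recent value evaluations."""

    def __init__(self, width, weights=None):
        if width < 1:
            raise ValueError("window width must be at least 1")
        # Assumed default: equal weights, i.e. a plain moving average.
        self.weights = list(weights) if weights is not None else [1.0] * width
        if len(self.weights) != width:
            raise ValueError("one weight per window position is required")
        self.window = deque(maxlen=width)  # oldest value is dropped automatically

    def update(self, value):
        """Add the newest evaluation and return the current value V_c."""
        self.window.append(value)
        # While the window is filling, use the weights of the newest positions.
        active = self.weights[-len(self.window):]
        total = sum(w * v for w, v in zip(active, self.window))
        return total / sum(active)


tracker = SlidingValue(width=3, weights=[1, 2, 3])
history = [tracker.update(v) for v in (6.0, 3.0, 9.0, 0.0)]
```

After the fourth update the window holds 3.0, 9.0 and 0.0, so the oldest evaluation (6.0) no longer influences $V_c$; averaging over the window smooths short-lived spikes in a single evaluation.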
In S2, the value of a data object is calculated according to the following formula:
$V = w_1 T + w_2 C + w_3 M + w_4 \mathit{CT} + w_5 / S$
where T is the time factor, C is the number-of-accessing-users factor, M is the value factor of the data objects associated with the data object, CT is the contrast factor of the different storage devices, S is the size factor of the data object, and $w_1$, $w_2$, $w_3$, $w_4$ and $w_5$ are the weights of the corresponding factors. Specifically:
1) The time factor T is obtained as follows:
first, acquire all the moments at which the data object has been accessed since its creation: $t_1, t_2, \ldots, t_n$, where n is a positive integer;
then, calculate the lengths of the intervals between successive accesses, $T_1, T_2, \ldots, T_{n-1}$:
$T_i = t_{i+1} - t_i, \quad i = 1, 2, \ldots, n-1$
finally, calculate the time factor T:
$T = \sum_{i=1}^{n-1} \alpha_i T_i$
where $\alpha_i$ ($i = 1, 2, \ldots, n-1$) is a preset set of weights satisfying $\sum_{i=1}^{n-1} \alpha_i = 1$ and $\alpha_1 \le \alpha_2 \le \ldots \le \alpha_{n-1}$. Because the access-time characteristics of the data may change between its creation and the current time, the most recent intervals enter the average with larger weights, so that the resulting T better reflects the current situation.
2) Analyzing the association factor between data objects requires finding all data objects associated with a given data object. For any data object, its associated data objects are defined as follows:
let the time length threshold be $T_{th}$; if data object $obj_1$ is accessed at any time $t_0$ and data object $obj_2$ is also accessed within the interval from $t_0$ to $t_0 + T_{th}$, then $obj_1$ and $obj_2$ are considered associated.
The value factor Q of the data objects associated with data object $obj_1$ (the factor M in the value formula) is then obtained as follows:
first, find the set $\Phi(obj_1)$ of data objects associated with $obj_1$;
second, find the values of all data objects in $\Phi(obj_1)$;
finally, sum the values of all data objects associated with $obj_1$:
$Q = \sum_{obj \in \Phi(obj_1)} V_{obj}$
where $V_{obj}$ is the value of data object obj as recorded in the migration engine controller.
3) The number-of-accessing-users factor C can be obtained directly from the migration engine controller.
4) The I/O access contrast calculation method of different storage devices is as follows, because the read-write speed is different even if the same device is used, the read-write contrast needs to be considered separately:
let the read contrast be δ for different memory devices A and BrWrite contrast of deltawThen, there are:
wherein R isA、RBThe speed of continuous reading data on two devices with different performances of A and B is respectively represented, WA、WBSpeed at which data is written for the corresponding duration.
The I/O access contrast of different storage devices is also related to the I/O access frequency, and the recent access frequency closer to the current time is weighted more heavily into the I/O access contrast calculation. Therefore, in the present embodiment, the data object is divided into m segments from the time interval established to the current time interval, and the contrast factors CT of different storage devices are calculated according to the following formula:
wherein FWiAnd FRiIndicating the read and write frequency of the data object during the ith period of time, βiRepresents a weighted weight of the ith period of time, and β1<β2<...<βm,δrFor read contrast, delta, between two different memory deviceswIs the write contrast between two different storage devices.
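The formulas for the contrasts are not reproduced in this text, so the following Python sketch rests on two loudly stated assumptions: each contrast is taken as the ratio of the sustained speed on device A to that on device B, and CT is taken as the weighted sum $\sum_i \beta_i (FR_i \delta_r + FW_i \delta_w)$. Both forms are merely consistent with the symbol definitions above and are not confirmed by the source; all function and parameter names are hypothetical.

```python
from typing import Sequence


def contrast(speed_a: float, speed_b: float) -> float:
    """Assumed form: ratio of sustained speed on device A to device B."""
    if speed_b <= 0:
        raise ValueError("speed on device B must be positive")
    return speed_a / speed_b


def contrast_factor(read_freq: Sequence[float],
                    write_freq: Sequence[float],
                    betas: Sequence[float],
                    delta_r: float,
                    delta_w: float) -> float:
    """Assumed form: CT = sum_i beta_i * (FR_i * delta_r + FW_i * delta_w)."""
    if not (len(read_freq) == len(write_freq) == len(betas)):
        raise ValueError("need one read frequency, write frequency and weight per period")
    # The text requires beta_1 < beta_2 < ... < beta_m (recent periods weigh more).
    if any(a >= b for a, b in zip(betas, betas[1:])):
        raise ValueError("betas must be strictly increasing")
    return sum(b * (fr * delta_r + fw * delta_w)
               for b, fr, fw in zip(betas, read_freq, write_freq))
```

Under these assumptions, a read-heavy object on a device pair with a large read contrast gets a large CT, which raises its value and keeps it on the faster device.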
5) The value of the data object size factor S may be obtained directly from the migration engine controller.
Preferably, the first threshold value in this embodiment is 80%, the second threshold value is 10%, and the third threshold value is 10%. Of course, in other hierarchical storage systems, the magnitudes of the first threshold, the second threshold, and the third threshold may be set to other suitable values as needed.
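How the three thresholds interact during a high-to-low migration pass can be sketched in Python as below. This is a minimal illustration under several assumptions, not the patented implementation: the `DataObject` type and function names are hypothetical, "current capacity" is read as the space currently occupied on the high-priority device, the cache is modelled as a dictionary from address to value, and an address is dropped from the cache once its object is queued for migration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DataObject:
    address: str   # hypothetical identifier of the stored object
    size: float    # space the object occupies
    value: float   # value already computed for the object


def lowest_value_slice(objects: List[DataObject], budget: float) -> List[DataObject]:
    """Take objects from the lowest-value end while their total size fits the budget."""
    picked: List[DataObject] = []
    used = 0.0
    for obj in sorted(objects, key=lambda o: o.value):
        if used + obj.size > budget:
            break
        picked.append(obj)
        used += obj.size
    return picked


def plan_demotion(objects: List[DataObject], capacity: float,
                  cache: Dict[str, float], first: float = 0.80,
                  second: float = 0.10, third: float = 0.10,
                  cache_limit: int = 100) -> List[DataObject]:
    """Return the objects to move to the low-priority device; updates `cache` in place."""
    used = sum(o.size for o in objects)
    if used / capacity < first:            # first threshold not reached: no migration
        return []
    queue = lowest_value_slice(objects, second * used)      # second threshold
    queued = {o.address for o in queue}
    remaining = [o for o in objects if o.address not in queued]
    for obj in lowest_value_slice(remaining, third * used):  # third threshold
        if obj.address in cache:           # seen as low-value before: migrate now
            queue.append(obj)
            del cache[obj.address]         # assumption: stale address is dropped
        else:                              # first sighting: only remember the address
            cache[obj.address] = obj.value
    while len(cache) > cache_limit:        # keep at most cache_limit addresses,
        del cache[max(cache, key=cache.get)]   # evicting the highest-value ones first
    return queue
```

The cache acts as a grace period: an object in the borderline band is demoted only if it was already in that band on an earlier pass, which limits thrashing between devices.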
The above is only one embodiment of the present invention, and the scope of protection of the present invention is not limited to it; any modification or substitution that a person skilled in the art could readily make within the technical scope disclosed by the present invention falls within that scope. The scope of protection of the present invention is therefore defined by the appended claims.

Claims (9)

1. A method for hierarchical storage and migration of industrial real-time data, characterized by comprising two parts, namely hierarchical storage of data and hierarchical migration of data, wherein the hierarchical storage of data comprises the following steps:
I: evaluating the value of the data;
II: placing or migrating the data into the appropriate level of the hierarchy according to its value;
the hierarchical migration of data comprises the following steps:
S1: monitoring the hierarchical storage system periodically, and when the storage capacity utilization of the high-priority storage device reaches a preset first threshold, triggering the data migration calculation and proceeding to step S2;
S2: evaluating each data object in the storage device to obtain its value, and sorting the data objects by value;
the value being calculated by a sliding window method, as a weighted average of the values calculated at each time within the sliding window, specifically:
given a window of width N, let the values of the data object calculated at the N most recent times within the window be $V_1, V_2, \ldots, V_N$; the value of the current data object is then calculated as:
$V = \sum_{i=1}^{N} \lambda_i V_i$
where $\lambda_i$ is a weighting coefficient;
or,
the value is calculated according to the following formula: $V = w_1 T + w_2 C + w_3 M + w_4 CT + w_5 / S$
where T is the time factor, C is the access-user-count factor, M is the value factor of the data objects related to the data object, CT is the contrast factor of the different storage devices, S is the size factor of the data object, and $w_1$, $w_2$, $w_3$, $w_4$ and $w_5$ are the weights of the corresponding factors;
S3: selecting the lower-value data objects stored in the high-priority storage device, in a proportion equal to a preset second threshold, to form a migration queue, and migrating the data objects in the migration queue to the low-priority storage device, wherein the lower-value data objects are those at the lowest-value end of the ordering that together occupy a percentage of the current capacity of the storage device equal to the preset second threshold;
S4: after step S3, selecting from the remaining data objects stored in the high-priority storage device the lower-value data objects, in a proportion equal to a preset third threshold, and comparing their addresses with the data object addresses currently held in the cache, wherein the lower-value data objects are those at the lowest-value end of the remaining ordering that together occupy a percentage of the current capacity of the storage device equal to the preset third threshold; if the address of a data object is already stored in the cache, the data object is migrated to the low-priority storage device; otherwise, the address of the data object is stored in the cache; and so on for each selected data object; letting $N_b$ be the number of data object addresses in the cache after all comparisons, if $N_b \le N_h$ the migration operation stops; if $N_b > N_h$, the data object addresses stored in the cache are sorted in descending order of the values of the corresponding data objects, and the addresses with the largest values are removed in turn until $N_h$ addresses remain, whereupon the migration operation stops, wherein $N_h$ is a preset upper limit on the number of data object addresses in the cache;
S5: finding the data object with the maximum value in the current cache, forming a migration queue from all data objects in the low-priority storage device whose value is greater than that maximum, ordered from highest to lowest value, and migrating them to the high-priority storage device.
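For illustration only, the two value calculations recited in step S2 might be sketched in Python as follows. The function names are hypothetical, and normalising the windowed average by the sum of the weights is an assumption of this sketch (it changes nothing when the weights already sum to 1).

```python
from typing import Sequence


def composite_value(t: float, c: float, m: float, ct: float, s: float,
                    w: Sequence[float]) -> float:
    """V = w1*T + w2*C + w3*M + w4*CT + w5/S."""
    if len(w) != 5:
        raise ValueError("need exactly five weights")
    if s <= 0:
        raise ValueError("size factor must be positive")
    w1, w2, w3, w4, w5 = w
    # Size enters inversely: larger objects contribute less to the value.
    return w1 * t + w2 * c + w3 * m + w4 * ct + w5 / s


def windowed_value(history: Sequence[float], lambdas: Sequence[float]) -> float:
    """Weighted average of the N most recent values, N = len(lambdas)."""
    n = len(lambdas)
    if n == 0 or len(history) < n:
        raise ValueError("history must hold at least one value per weight")
    window = history[-n:]              # V_1 .. V_N, oldest first
    # Assumption: divide by the weight sum so the result stays an average.
    return sum(l * v for l, v in zip(lambdas, window)) / sum(lambdas)
```

A caller could, for example, append each new `composite_value` result to a per-object history and rank objects by `windowed_value` to damp short-lived spikes.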
2. The method for hierarchical storage and migration of industrial real-time data according to claim 1, wherein the time factor T is obtained by:
acquiring all the times at which the data object has been accessed since its creation: $t_1, t_2, \ldots, t_n$, where n is a positive integer;
calculating the lengths $T_1, T_2, \ldots, T_{n-1}$ of the intervals between successive accesses, where:
$T_i = t_{i+1} - t_i, \quad i = 1, 2, \ldots, n-1$
and calculating T:
$T = \sum_{i=1}^{n-1} \alpha_i T_i$
where $\alpha_i$ ($i = 1, 2, \ldots, n-1$) is a set of predetermined weights satisfying $\sum_{i=1}^{n-1} \alpha_i = 1$ and $\alpha_1 \le \alpha_2 \le \ldots \le \alpha_{n-1}$.
3. The method for hierarchical storage and migration of industrial real-time data according to claim 1, wherein for any data object, the related data object is defined as follows:
letting the time-length threshold be $T_{th}$, if data object $obj_1$ is accessed at any time $t_0$, and data object $obj_2$ is also accessed within the interval from $t_0$ to $t_0 + T_{th}$, then $obj_1$ and $obj_2$ are considered related.
4. The method for hierarchical storage and migration of industrial real-time data according to claim 1 or 3, wherein the value factor M of the data objects related to data object $obj_1$ is obtained as follows:
finding the set $\Phi(obj_1)$ of data objects related to $obj_1$;
finding the values of all data objects in $\Phi(obj_1)$;
summing the values of all data objects related to $obj_1$:
$M = \sum_{obj \in \Phi(obj_1)} V_{obj}$
where $V_{obj}$ is the recorded value of data object obj.
5. The method for hierarchical storage and migration of industrial real-time data according to claim 1, wherein the time interval from the creation of the data object to the current time is divided into m segments, and the contrast factor CT of the different storage devices is then calculated according to the following formula:
where $FW_i$ and $FR_i$ denote the write and read frequencies of the data object during the i-th period, $\beta_i$ is the weight of the i-th period, with $\beta_1 < \beta_2 < \ldots < \beta_m$, $\delta_r$ is the read contrast between the two storage devices, and $\delta_w$ is the write contrast between them.
6. The method for hierarchical storage and migration of industrial real-time data according to claim 1 or 5, wherein, for two different storage devices A and B, the read contrast is $\delta_r$ and the write contrast is $\delta_w$, such that:
where $R_A$ and $R_B$ denote the sustained read speeds of the two differently performing devices A and B, and $W_A$ and $W_B$ denote the corresponding sustained write speeds.
7. The method for hierarchical storage and migration of industrial real-time data according to claim 1, wherein the first threshold value is 80%, the second threshold value is 10%, and the third threshold value is 10%.
8. An industrial real-time data hierarchical storage and migration system, comprising:
a hierarchical storage system comprising a plurality of storage devices with different priorities for storing data objects, and a cache, wherein the cache is used to store the addresses of lower-value data objects in the high-priority storage device, the lower-value data objects being those at the lowest-value end of the ordering that together occupy a percentage of the current capacity of the storage device equal to a preset threshold; the lower-value data objects change dynamically as migration proceeds, and when a high-to-low data migration occurs, any selected data object whose address is already stored in the cache is migrated to the low-priority storage device;
a value judgment manager for acquiring the data objects in the hierarchical storage system in real time and calculating their values, specifically:
by a sliding window method, calculating a weighted average of the values calculated at each time within the sliding window, specifically:
given a window of width N, let the values of the data object calculated at the N most recent times within the window be $V_1, V_2, \ldots, V_N$; the value of the current data object is then calculated as:
$V = \sum_{i=1}^{N} \lambda_i V_i$
where $\lambda_i$ is a weighting coefficient;
or,
the value is calculated according to the following formula:
$V = w_1 T + w_2 C + w_3 M + w_4 CT + w_5 / S$
where T is the time factor, C is the access-user-count factor, M is the value factor of the data objects related to the data object, CT is the contrast factor of the different storage devices, S is the size factor of the data object, and $w_1$, $w_2$, $w_3$, $w_4$ and $w_5$ are the weights of the corresponding factors;
a data placement plan manager for obtaining the values from the value judgment manager, selecting the data objects to be migrated according to the values to form a migration queue, and forming a data placement plan and a migration strategy;
a migration engine controller for obtaining the data placement plan and the migration strategy, sending migration commands to an application server agent, and registering the application server agent;
an application server agent for registering with the migration engine controller during initialization, and for receiving the migration commands and forwarding them to the corresponding data migration/back-migration module;
and data migration/back-migration modules, each arranged between two storage devices of different priorities, for performing data migration or back-migration according to the migration command and feeding the migration result back to the migration engine controller.
9. The industrial real-time data hierarchical storage and migration system according to claim 8, wherein the migration engine controller includes:
the data monitoring module is used for monitoring and recording the updating condition and the value change of the data object, feeding back the updating condition to the value judgment manager, and monitoring the I/O access condition of the system;
and the data management module is used for regularly inquiring the data placement plan manager so as to update data information, send the migration command and receive the migration result.
CN201510969294.3A 2015-12-22 2015-12-22 A kind of industrial real-time data classification storage and moving method Active CN105653591B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510969294.3A CN105653591B (en) 2015-12-22 2015-12-22 A kind of industrial real-time data classification storage and moving method

Publications (2)

Publication Number Publication Date
CN105653591A CN105653591A (en) 2016-06-08
CN105653591B true CN105653591B (en) 2019-02-05

Family

ID=56477595

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510969294.3A Active CN105653591B (en) 2015-12-22 2015-12-22 A kind of industrial real-time data classification storage and moving method

Country Status (1)

Country Link
CN (1) CN105653591B (en)

Also Published As

Publication number Publication date
CN105653591A (en) 2016-06-08

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20190904

Address after: 316000 No. 3 307-6, No. 10, West Xingzhou Avenue, Yancang Street, Dinghai District, Zhoushan City, Zhejiang Province

Patentee after: ZHEJIANG ZHOUSHAN TO CONTROL INTELLIGENT EQUIPMENT TECHNOLOGY Co.,Ltd.

Address before: 310053 Hangzhou Province, Binjiang District Province, No. six and No. 309 Road, the center of science and Technology Park (high tech Zone) ()

Patentee before: ZHEJIANG SUPCON RESEARCH Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20240531

Address after: No. 30053, zhongkong Science Park, Hangzhou, Zhejiang Province

Patentee after: ZHEJIANG SUPCON RESEARCH Co.,Ltd.

Country or region after: China

Address before: 316000 307-6, third floor, No. 10, west section of Xingzhou Avenue, Yancang street, Dinghai District, Zhoushan City, Zhejiang Province

Patentee before: ZHEJIANG ZHOUSHAN TO CONTROL INTELLIGENT EQUIPMENT TECHNOLOGY Co.,Ltd.

Country or region before: China
