CN113760176A - Data storage method and device - Google Patents

Info

Publication number
CN113760176A
Authority
CN
China
Prior art keywords
data
cold
hot
preset
migrated
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011402506.7A
Other languages
Chinese (zh)
Inventor
罗金
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN202011402506.7A
Publication of CN113760176A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608 Saving storage space on storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms

Abstract

The invention discloses a data storage method and device, and relates to the technical field of computers. One embodiment of the method comprises: monitoring used data in a preset statistical period, and acquiring use detail information of the data in the statistical period; when the use detail information of any cold data in the used data accords with the cold data migration condition, determining the cold data as cold data to be migrated; when the use detail information of any hot data in the used data meets the hot data migration condition and the storage amount of the current second storage space is larger than the storage amount threshold value, determining the hot data as hot data to be migrated; and migrating the cold data to be migrated to the second storage space, and migrating the hot data to be migrated to the first storage space. The embodiment can dynamically adjust the cold and hot labels of the data according to the use condition of the cold data and the hot data and the current storage amount of the hot data storage space so as to execute migration.

Description

Data storage method and device
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a data storage method and apparatus.
Background
Current data storage systems generally adopt a cold and hot data isolation storage scheme: when data is stored, it is divided into cold data and hot data according to a preset rule (for example, data whose update time is more than one year old is classified as cold data), and the cold data and the hot data are stored in their respective storage spaces. This achieves centralized storage of cold data and improves the access efficiency and calculation efficiency of hot data. However, such a scheme can only distinguish cold and hot data rigidly according to the preset rule. When the use frequency of cold data suddenly increases at a certain time point, the use frequency of hot data suddenly drops, or the storage amount of hot data suddenly grows, unexpected conditions such as low cold data query efficiency and insufficient hot data storage space can occur, so that business requirements cannot be efficiently supported and adapted to.
Disclosure of Invention
In view of this, embodiments of the present invention provide a data storage method and apparatus that can dynamically adjust the cold and hot labels of data according to the usage of cold data and hot data and the current storage amount of the hot data storage space, and perform migration accordingly. This addresses problems in the prior art, such as low cold data query efficiency and insufficient hot data storage space, that arise because data cannot move between cold storage and hot storage.
To achieve the above object, according to one aspect of the present invention, a data storage method is provided.
The data storage method of the embodiment of the invention comprises the following steps: monitoring used data in a preset statistical period to acquire use detail information of the data in the statistical period, wherein the data comprises cold data stored in a first storage space and hot data stored in a second storage space, and the use detail information of each piece of data comprises the number of times the data was used and the time of each use in the statistical period; when the use detail information of any cold data in the used data meets a preset cold data migration condition, determining the cold data as cold data to be migrated; when the use detail information of any hot data in the used data meets a preset hot data migration condition and the current storage amount of the second storage space is larger than a preset storage amount threshold, determining the hot data as hot data to be migrated; and migrating the cold data to be migrated to the second storage space, and migrating the hot data to be migrated to the first storage space.
Optionally, monitoring the used data in the preset statistical period includes: monitoring the query requests and their return results and/or the update requests and their return results in the statistical period, and determining the unique identifier of the queried and/or updated data; the unique identifier of a piece of data is formed from the library name, the table name and the primary key corresponding to the data.
Optionally, the usage detail information of any cold data in the used data meeting the preset cold data migration condition includes: the number of uses of the cold data being greater than a preset usage count threshold.
Optionally, the usage detail information of any hot data in the used data meeting the preset hot data migration condition includes: the time elapsed from the most recent usage time of the hot data to the current time being longer than a preset duration.
Optionally, the usage detail information of any data further includes a migration weight of the data, the migration weight being the sum of a first weight and a second weight; the first weight is the ratio of the usage period of the data to the average usage period of the used data, the second weight is the ratio of the number of uses of the data to the average number of uses of the used data, and the usage period of the data is the length of time between its earliest usage time and its latest usage time within the statistical period.
Optionally, the usage detail information of any cold data in the used data meeting the preset cold data migration condition includes: the migration weight of the cold data being greater than a preset first weight threshold; and the usage detail information of any hot data in the used data meeting the preset hot data migration condition includes: the migration weight of the hot data being smaller than a preset second weight threshold.
To achieve the above object, according to another aspect of the present invention, there is provided a data storage device.
The data storage device of the embodiment of the invention can comprise: a bypass monitoring unit for monitoring the used data in a preset statistical period; a cold and hot marking unit for acquiring the use detail information of the data in the statistical period, wherein the data comprises cold data stored in a first storage space and hot data stored in a second storage space, and the use detail information of each piece of data comprises the number of times the data was used and the time of each use in the statistical period; an arbitration unit for: when the use detail information of any cold data in the used data meets a preset cold data migration condition, determining the cold data as cold data to be migrated; and when the use detail information of any hot data in the used data meets a preset hot data migration condition and the current storage amount of the second storage space is larger than a preset storage amount threshold, determining the hot data as hot data to be migrated; and a data migration unit for migrating the cold data to be migrated to the second storage space and migrating the hot data to be migrated to the first storage space.
Optionally, the bypass monitoring unit is further configured to: monitor the query requests and their return results and/or the update requests and their return results in the statistical period, and determine the unique identifier of the queried and/or updated data; the unique identifier of a piece of data is formed from the library name, the table name and the primary key corresponding to the data. The arbitration unit is further configured to: when the number of uses of any cold data in the used data is larger than a preset usage count threshold, determine the cold data as cold data to be migrated; and when the time elapsed from the most recent usage time of any hot data in the used data to the current time is longer than a preset duration and the current storage amount of the second storage space is larger than the preset storage amount threshold, determine the hot data as hot data to be migrated; or, when the migration weight of any cold data in the used data is greater than a preset first weight threshold, determine the cold data as cold data to be migrated; and when the migration weight of any hot data in the used data is smaller than a preset second weight threshold and the current storage amount of the second storage space is larger than the storage amount threshold, determine the hot data as hot data to be migrated. Here the usage detail information of any data further comprises a migration weight of the data, the migration weight being the sum of a first weight and a second weight; the first weight is the ratio of the usage period of the data to the average usage period of the used data, the second weight is the ratio of the number of uses of the data to the average number of uses of the used data, and the usage period of the data is the length of time between its earliest usage time and its latest usage time within the statistical period.
To achieve the above object, according to still another aspect of the present invention, there is provided an electronic apparatus.
An electronic device of the present invention includes: one or more processors; and the storage device is used for storing one or more programs, and when the one or more programs are executed by the one or more processors, the one or more processors realize the data storage method provided by the invention.
To achieve the above object, according to still another aspect of the present invention, there is provided a computer-readable storage medium.
A computer-readable storage medium of the present invention has stored thereon a computer program which, when executed by a processor, implements the data storage method provided by the present invention.
According to the technical scheme of the invention, the embodiment of the invention has the following advantages or beneficial effects. The used data in a statistical period is monitored to obtain the use detail information of the data in that period. When the use detail information of cold data meets a preset cold data migration condition, the cold data is determined as cold data to be migrated; when the use detail information of hot data meets a preset hot data migration condition and the current storage amount of the second storage space is larger than the storage amount threshold, the hot data is determined as hot data to be migrated. Finally, the cold data to be migrated is migrated to the second storage space, which stores hot data, and the hot data to be migrated is migrated to the first storage space, which stores cold data. With this arrangement, cold and hot data migration can be completed dynamically according to the use detail information of the data in the statistical period, the hot data storage space is safeguarded, and the cold data query efficiency is improved. In addition, the embodiment of the invention adopts two strategies to determine the cold and hot data to be migrated. In the first, cold data whose number of uses is larger than a usage count threshold is determined as cold data to be migrated, and, when the current storage amount of the second storage space is larger than the storage amount threshold, hot data whose most recent use time is more than a preset duration before the current time is determined as hot data to be migrated. In the second, the cold and hot data to be migrated are determined using migration weights calculated from the usage period and the number of uses of the data, so that cold and hot data are distinguished comprehensively and accurately.
Further effects of the above-mentioned non-conventional alternatives will be described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of the main steps of a data storage method according to an embodiment of the present invention;
FIG. 2 is a system architecture diagram of a data storage method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating the steps performed by the bypass monitoring unit according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the steps performed by the hot and cold marking unit in an embodiment of the present invention;
FIG. 5 is a diagram illustrating the steps performed by the arbitration unit according to an embodiment of the present invention;
FIG. 6 is a flow chart illustrating data migration according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of the components of a data storage device in an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of an electronic device for implementing the data storage method in the embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It should be noted that the embodiments of the present invention and the technical features of the embodiments may be combined with each other without conflict.
Fig. 1 is a schematic diagram of main steps of a data storage method according to an embodiment of the present invention.
As shown in fig. 1, the data storage method according to the embodiment of the present invention may be specifically executed according to the following steps:
Step S101: monitoring the used data in a preset statistical period, and acquiring the use detail information of the data in the statistical period.
In the embodiment of the present invention, the used data may be defined according to actual needs, for example, the queried data may be defined as the used data, and the queried and updated data may also be defined as the used data. In practical applications, this step may be performed by a pre-programmed bypass monitoring unit, and the bypass monitoring unit may monitor the query request and the return result thereof, and/or the update request and the return result thereof in the statistical period, so as to determine the unique identifier of the queried and/or updated data. Preferably, the unique identifier of the data can be formed according to the library name, the table name and the primary key corresponding to the data.
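As a minimal sketch of how such a unique identifier could be formed, the following Python function joins the three parts named above. The function name `make_unique_id`, the `:` separator and the sample values are assumptions made for illustration; the source only states that the identifier is formed from the library name, the table name and the primary key.

```python
def make_unique_id(library: str, table: str, primary_key: object) -> str:
    """Join library name, table name and primary key into one identifier."""
    # Assumption: a ":" separator. The source does not specify how the
    # three parts are combined, only that all three are used.
    return f"{library}:{table}:{primary_key}"


# Hypothetical example values.
print(make_unique_id("order_db", "orders", 10086))
```

Any separator works as long as it cannot occur inside the library or table names, since the identifier is later used as the aggregation key.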
In particular, the used data may include cold data stored in the first storage space and hot data stored in the second storage space. Cold and hot data can initially be distinguished according to a time rule or routing rule of the prior art; for example, data whose update time is more than 1 year old is automatically classified as cold data, and order data requiring after-sales service is classified as hot data. The first storage space may be HBase, which is suitable for centralized storage of data, and the second storage space may be ES, MySQL, Redis, or the like, which support query requirements well (HBase, ES, MySQL and Redis are all databases).
Fig. 2 is a schematic diagram of a system architecture of a data storage method according to an embodiment of the present invention, and Fig. 3 is a schematic diagram of the steps performed by the bypass monitoring unit according to an embodiment of the present invention. As shown in Figs. 2 and 3, the bypass monitoring unit captures the requests sent to the back end and the results returned by the back end without the application layer being aware of it, thereby providing the cold and hot marking unit with the input it needs to identify how cold or hot the data is. The cold and hot marking unit records the use detail information of the data in the statistical period according to rules configured by the user. The arbitration unit confirms how cold or hot the data is according to thresholds configured by the user, and thereby determines the cold data to be migrated and the hot data to be migrated. The data migration unit completes the data migration according to the data to be migrated confirmed by the arbitration unit.
In an embodiment, the bypass monitoring unit monitors only query requests to the back end by default; after configuration by the user, it can also monitor update requests. After obtaining a query request and/or an update request, the bypass monitoring unit records the input parameters of the request and identifies the corresponding return result, determines the unique identifier of the data targeted by the request from the input parameters and the return result, and forms a data stream from the unique identifiers of multiple pieces of data. This data stream is used for the processing in subsequent steps.
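The monitoring behaviour described above can be sketched as a small generator. This is only an illustration under stated assumptions: the `CapturedRequest` record, its fields, the `monitor_updates` flag and the identifier format are hypothetical names, and real traffic capture on the path to the back end is outside the scope of the sketch.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class CapturedRequest:
    """Hypothetical record of one captured request and its return result."""
    kind: str              # "query" or "update"
    library: str
    table: str
    returned_keys: tuple   # primary keys found in the back end's return result
    timestamp: float       # request time


def data_stream(requests: Iterable[CapturedRequest],
                monitor_updates: bool = False) -> Iterator[Tuple[str, float]]:
    """Yield (unique identifier, request time) pairs for monitored requests."""
    for req in requests:
        if req.kind == "update" and not monitor_updates:
            continue  # by default only query requests are monitored
        if req.kind not in ("query", "update"):
            continue  # ignore anything that is neither a query nor an update
        for key in req.returned_keys:
            # Assumption: identifier format "library:table:primary_key".
            yield f"{req.library}:{req.table}:{key}", req.timestamp


captured = [
    CapturedRequest("query", "order_db", "orders", (1, 2), 100.0),
    CapturedRequest("update", "order_db", "orders", (3,), 101.0),
]
print(list(data_stream(captured)))
print(list(data_stream(captured, monitor_updates=True)))
```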
In practical application, after the data stream provided by the bypass monitoring unit is obtained, the pre-programmed cold and hot marking unit records the use detail information of each piece of data in the statistical period. In general, the use detail information of each piece of data may include the number of times the data was used in the statistical period and its use times (i.e., the time of each use, which is the request time of the corresponding query request or update request).
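The recording step can be sketched as a simple aggregation keyed by unique identifier. The `UsageDetail` class, the `aggregate` function and the half-open period boundaries are assumptions made for illustration, not the patent's implementation.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class UsageDetail:
    """Hypothetical container for one item's use detail information."""
    count: int = 0                                     # number of uses
    times: List[float] = field(default_factory=list)  # time of each use


def aggregate(stream: Iterable[Tuple[str, float]],
              period_start: float,
              period_end: float) -> Dict[str, UsageDetail]:
    """Group (unique identifier, request time) pairs by identifier."""
    details: Dict[str, UsageDetail] = defaultdict(UsageDetail)
    for uid, ts in stream:
        # Assumption: the statistical period is the half-open interval
        # [period_start, period_end); uses outside it are ignored.
        if period_start <= ts < period_end:
            detail = details[uid]
            detail.count += 1
            detail.times.append(ts)
    return dict(details)


stream = [("a", 1.0), ("a", 5.0), ("b", 2.0), ("a", 99.0)]
result = aggregate(stream, 0.0, 10.0)
print(result["a"].count, result["a"].times)
```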
Preferably, the hot and cold marking unit obtains the usage detail information of each piece of data by aggregating on the unique identifier of the data. In some embodiments, the usage detail information of any data may further include a migration weight of the data, which indicates how active the data is. Specifically, the migration weight is the sum of a first weight and a second weight; the first weight is the ratio of the usage period of the data to the average usage period of the used data, the second weight is the ratio of the number of uses of the data to the average number of uses of the used data, and the usage period of the data is the length of time between its earliest usage time and its latest usage time within the statistical period. The migration weight therefore judges the activity degree of the data in the statistical period from two aspects: the span of time over which the data is used (usage period) and how often it is used (number of uses).
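The migration weight defined above can be worked through with a short Python sketch. The function name `migration_weights` and the handling of a zero average usage period are assumptions; the source does not say what happens when every item is used only once.

```python
from typing import Dict, List


def migration_weights(use_times: Dict[str, List[float]]) -> Dict[str, float]:
    """Return first weight + second weight for each unique identifier.

    use_times maps a unique identifier to the non-empty list of times at
    which the data was used within the statistical period.
    """
    if not use_times:
        return {}
    # Usage period: span between the earliest and latest use in the period.
    periods = {uid: max(ts) - min(ts) for uid, ts in use_times.items()}
    counts = {uid: len(ts) for uid, ts in use_times.items()}
    avg_period = sum(periods.values()) / len(periods)
    avg_count = sum(counts.values()) / len(counts)
    weights = {}
    for uid in use_times:
        # Assumption: if the average usage period is 0 (every item used at a
        # single instant), the first weight is taken as 0 to avoid dividing
        # by zero.
        first = periods[uid] / avg_period if avg_period > 0 else 0.0
        second = counts[uid] / avg_count
        weights[uid] = first + second
    return weights


weights = migration_weights({"a": [0.0, 10.0, 20.0], "b": [5.0]})
print(weights)  # {'a': 3.5, 'b': 0.5}
```

In this example item `a` has a usage period of 20 and 3 uses, and item `b` has a usage period of 0 and 1 use, so the averages are 10 and 2. The weights are 20/10 + 3/2 = 3.5 for `a` and 0/10 + 1/2 = 0.5 for `b`.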
Fig. 4 is a schematic diagram of an execution step of the hot and cold marking unit in the embodiment of the present invention, and as shown in fig. 4, the hot and cold marking unit performs statistics of usage detail information using a configured statistical period or a default statistical period, and a unique identifier of data is used as an aggregation dimension during the statistics.
Step S102: when the use detail information of any cold data in the used data meets a preset cold data migration condition, determining the cold data as cold data to be migrated; and when the use detail information of any hot data in the used data meets a preset hot data migration condition and the storage amount of the current second storage space is greater than a preset storage amount threshold value, determining the hot data as the hot data to be migrated.
In an embodiment of the present invention, this step may be performed using a pre-programmed arbitration unit. Specifically, after the cold and hot marking unit acquires the use detail information of the used data in the statistical period, the arbitration unit determines the cold data to be migrated and the hot data to be migrated according to the use detail information and the preset cold data migration condition and hot data migration condition. In practical applications, the arbitration unit may perform the discrimination between the cold data to be migrated and the hot data to be migrated according to the following two logics.
In the first logic, the arbitration unit determines cold data whose number of uses is greater than a preset usage count threshold as cold data to be migrated; and, when the current storage amount of the second storage space is greater than a preset storage amount threshold (for example, 70% of the maximum storage amount), determines hot data whose most recent use time is more than a preset duration before the current time as hot data to be migrated. In this logic, the number of uses of cold data is taken directly as its activity degree, the length of time for which hot data has gone unused (the time elapsed since its most recent use) is taken as its activity degree, and the judgment is made in combination with the current storage amount of the second storage space.
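A minimal sketch of this first logic follows. The function name, the dictionary layout with `count` and `last_used` keys, and the parameter names are hypothetical; the 70% default mirrors the example threshold given above.

```python
from typing import Dict, List, Tuple


def select_by_first_logic(cold: Dict[str, dict],
                          hot: Dict[str, dict],
                          now: float,
                          count_threshold: int,
                          idle_threshold: float,
                          hot_used: float,
                          hot_capacity: float,
                          fill_ratio: float = 0.7) -> Tuple[List[str], List[str]]:
    """Return (cold data to migrate, hot data to migrate) as identifier lists."""
    # Cold data used more often than the threshold should move to hot storage.
    cold_to_migrate = [uid for uid, d in cold.items()
                       if d["count"] > count_threshold]
    hot_to_migrate: List[str] = []
    # Hot data is only demoted when hot storage is filling up.
    if hot_used > fill_ratio * hot_capacity:
        hot_to_migrate = [uid for uid, d in hot.items()
                          if now - d["last_used"] > idle_threshold]
    return cold_to_migrate, hot_to_migrate


cold = {"c1": {"count": 12, "last_used": 990.0},
        "c2": {"count": 2, "last_used": 400.0}}
hot = {"h1": {"count": 1, "last_used": 100.0},
       "h2": {"count": 8, "last_used": 995.0}}
print(select_by_first_logic(cold, hot, now=1000.0, count_threshold=10,
                            idle_threshold=500.0, hot_used=80.0,
                            hot_capacity=100.0))
```

Note that the storage check gates only the hot-to-cold direction, so cold data that becomes active is promoted regardless of how full the second storage space is.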
In the second logic, the arbitration unit directly takes the migration weight of each data as its activity degree, that is, determines cold data with a migration weight greater than a preset first weight threshold as cold data to be migrated, and determines hot data with a migration weight less than a preset second weight threshold as hot data to be migrated if the storage amount of the current second storage space is greater than the storage amount threshold.
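The second logic can be sketched in the same way, this time driven by migration weights. The function and parameter names are assumptions, and the weights are taken as already computed for every used item.

```python
from typing import Dict, List, Set, Tuple


def select_by_weight(weights: Dict[str, float],
                     cold_ids: Set[str],
                     first_threshold: float,
                     second_threshold: float,
                     hot_storage_over_threshold: bool) -> Tuple[List[str], List[str]]:
    """Return (cold data to migrate, hot data to migrate) as identifier lists.

    Identifiers in cold_ids are currently cold; all others are treated as hot.
    """
    cold_to_migrate = [uid for uid, w in weights.items()
                       if uid in cold_ids and w > first_threshold]
    hot_to_migrate: List[str] = []
    # Hot data is only demoted when the second storage space holds more
    # than the storage amount threshold.
    if hot_storage_over_threshold:
        hot_to_migrate = [uid for uid, w in weights.items()
                          if uid not in cold_ids and w < second_threshold]
    return cold_to_migrate, hot_to_migrate


weights = {"c1": 3.5, "c2": 0.5, "h1": 0.4, "h2": 2.0}
print(select_by_weight(weights, {"c1", "c2"}, first_threshold=2.0,
                       second_threshold=1.0, hot_storage_over_threshold=True))
```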
Through the two judgment logics, the cold and hot data can be accurately distinguished from multiple angles. Fig. 5 is a schematic diagram illustrating the steps performed by the arbitration unit according to the embodiment of the present invention, and the above determination process of the arbitration unit can be seen in fig. 5.
Step S103: migrating the cold data to be migrated to the second storage space, and migrating the hot data to be migrated to the first storage space.
In the embodiment of the present invention, this step may be performed using a pre-programmed data migration unit. Specifically, the data migration unit may write the cold data to be migrated and the hot data to be migrated into a preset message queue in descending order of activity degree, and use a preset migration message common processing component to migrate the cold data to be migrated to the second storage space and the hot data to be migrated to the first storage space. Fig. 6 is a schematic workflow diagram of the migration message common processing component in an embodiment of the present invention. As shown in Fig. 6, after receiving a message from the message queue, the migration message common processing component writes the corresponding data into the target library and deletes the corresponding data from the source library, thereby implementing data migration.
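The migration flow described above can be sketched with an in-memory queue. This is an illustration only: plain dictionaries stand in for the first and second storage spaces, a `deque` stands in for the message queue, and the names `enqueue`, `process` and the direction labels are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

# One message: (unique identifier, activity degree, direction).
Message = Tuple[str, float, str]


def enqueue(candidates: List[Message]) -> Deque[Message]:
    """Order the messages by activity degree, largest first."""
    return deque(sorted(candidates, key=lambda m: m[1], reverse=True))


def process(queue: Deque[Message],
            cold_store: Dict[str, object],
            hot_store: Dict[str, object]) -> None:
    """Consume messages: write to the target store, then delete from the source."""
    while queue:
        uid, _activity, direction = queue.popleft()
        if direction == "cold_to_hot":
            source, target = cold_store, hot_store
        else:
            source, target = hot_store, cold_store
        if uid not in source:
            continue  # already migrated or removed; nothing to do
        target[uid] = source[uid]  # write to the target library first
        del source[uid]            # then delete from the source library


cold_store = {"a": "row-a"}
hot_store = {"b": "row-b"}
queue = enqueue([("b", 0.2, "hot_to_cold"), ("a", 3.5, "cold_to_hot")])
process(queue, cold_store, hot_store)
print(cold_store, hot_store)
```

Writing to the target before deleting from the source means an interruption between the two steps leaves a duplicate rather than losing the row, which is the safer failure mode for this sketch.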
In the technical scheme of the embodiment of the invention, the used data in the statistical period is monitored to obtain the use detail information of the data in that period. When the use detail information of cold data meets the preset cold data migration condition, the cold data is determined as cold data to be migrated; when the use detail information of hot data meets the preset hot data migration condition and the current storage amount of the second storage space is larger than the storage amount threshold, the hot data is determined as hot data to be migrated. Finally, the cold data to be migrated is migrated to the second storage space, which stores hot data, and the hot data to be migrated is migrated to the first storage space, which stores cold data. With this arrangement, cold and hot data migration can be completed dynamically according to the use detail information of the data in the statistical period, the hot data storage space is safeguarded, and the cold data query efficiency is improved. In addition, the embodiment of the invention adopts two strategies to determine the cold and hot data to be migrated. In the first, cold data whose number of uses is larger than a usage count threshold is determined as cold data to be migrated, and, when the current storage amount of the second storage space is larger than the storage amount threshold, hot data whose most recent use time is more than a preset duration before the current time is determined as hot data to be migrated. In the second, the cold and hot data to be migrated are determined using migration weights calculated from the usage period and the number of uses of the data, so that cold and hot data are distinguished comprehensively and accurately.
It should be noted that, for convenience of description, the foregoing method embodiments are described as a series of acts, but those skilled in the art will appreciate that the present invention is not limited by the order of acts described, and that some steps may in fact be performed in other orders or concurrently. Moreover, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments, and that the acts and modules involved are not necessarily required by the invention.
To facilitate a better implementation of the above-described aspects of embodiments of the present invention, the following also provides relevant means for implementing the above-described aspects.
Referring to fig. 7, a data storage device 700 according to an embodiment of the present invention may include: a bypass monitoring unit 701, a hot and cold marking unit 702, an arbitration unit 703, and a data migration unit 704.
The bypass monitoring unit 701 may be configured to monitor the used data in a preset statistical period. The hot and cold marking unit 702 may be configured to acquire the use detail information of the data in the statistical period, wherein the data comprises cold data stored in a first storage space and hot data stored in a second storage space, and the use detail information of each piece of data comprises the number of times the data was used and the time of each use in the statistical period. The arbitration unit 703 may be configured to: when the use detail information of any cold data in the used data meets a preset cold data migration condition, determine the cold data as cold data to be migrated; and when the use detail information of any hot data in the used data meets a preset hot data migration condition and the current storage amount of the second storage space is larger than a preset storage amount threshold, determine the hot data as hot data to be migrated. The data migration unit 704 may be configured to migrate the cold data to be migrated to the second storage space and migrate the hot data to be migrated to the first storage space.
In an embodiment of the present invention, the bypass monitoring unit 701 may be further configured to: monitoring the query request and the return result thereof and/or the update request and the return result thereof in the statistical period, and determining the unique identifier of the queried and/or updated data; the unique identification of the data is formed according to the corresponding library name, the table name and the primary key of the data.
The arbitration unit 703 may be further configured to: when the number of uses of any cold data in the used data is larger than a preset usage count threshold, determine the cold data as cold data to be migrated; and when the time elapsed from the most recent use time of any hot data in the used data to the current time is longer than a preset duration and the current storage amount of the second storage space is larger than the preset storage amount threshold, determine the hot data as hot data to be migrated; or, when the migration weight of any cold data in the used data is greater than a preset first weight threshold, determine the cold data as cold data to be migrated; and when the migration weight of any hot data in the used data is smaller than a preset second weight threshold and the current storage amount of the second storage space is larger than the storage amount threshold, determine the hot data as hot data to be migrated.
Here, the usage detail information of any data further comprises a migration weight of the data, and the migration weight is the sum of a first weight and a second weight. The first weight is the ratio of the usage period of the data to the average usage period of the used data; the second weight is the ratio of the number of uses of the data to the average number of uses of the used data; and the usage period of the data is the length of time between its earliest use time and its latest use time within the statistical period.
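The migration weight defined above can be computed as in the following sketch. It assumes that the use times of each piece of data within one statistical period are available as a non-empty list of timestamps; the function name, the input layout, and the handling of a zero average usage period are assumptions made for illustration, not part of the embodiment.

```python
from typing import Dict, List


def migration_weights(use_times: Dict[str, List[float]]) -> Dict[str, float]:
    """Return the migration weight (first weight + second weight) per identifier.

    use_times maps each unique identifier to its use timestamps in the period.
    """
    if not use_times:
        return {}
    # Usage period: time between the earliest and latest use in the period.
    periods = {uid: max(ts) - min(ts) for uid, ts in use_times.items()}
    counts = {uid: len(ts) for uid, ts in use_times.items()}
    avg_period = sum(periods.values()) / len(periods)
    avg_count = sum(counts.values()) / len(counts)
    weights: Dict[str, float] = {}
    for uid in use_times:
        # Assumption: if every item was used at a single instant, the average
        # period is zero and the first weight is treated as 0.
        first = periods[uid] / avg_period if avg_period > 0 else 0.0
        second = counts[uid] / avg_count
        weights[uid] = first + second
    return weights


# Hypothetical sample: "a" is used three times over 20 s, "b" once.
weights = migration_weights({"a": [0.0, 10.0, 20.0], "b": [5.0]})
print(weights)  # {'a': 3.5, 'b': 0.5}
```

In this sample the average usage period is 10 and the average number of uses is 2, so "a" scores 20/10 + 3/2 = 3.5 and "b" scores 0/10 + 1/2 = 0.5. Under the second strategy, a cold item would be marked for migration when its weight exceeds the first weight threshold, and a hot item when its weight falls below the second weight threshold while the second storage space is above its storage amount threshold.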
In the technical solution of the embodiment of the invention, the data used in the statistical period is monitored to obtain the use detail information of the data in the statistical period. When the use detail information of cold data meets the preset cold data migration condition, the cold data is determined as cold data to be migrated; when the usage detail information of hot data meets the preset hot data migration condition and the current storage amount of the second storage space is greater than the storage amount threshold, the hot data is determined as hot data to be migrated. Finally, the cold data to be migrated is migrated to the second storage space, which stores hot data, and the hot data to be migrated is migrated to the first storage space, which stores cold data. With this arrangement, cold and hot data migration can be completed dynamically according to the use detail information of the data in the statistical period, so that storage space for hot data is ensured and the query efficiency for cold data is improved. In addition, the embodiment of the invention adopts two strategies to determine the cold and hot data to be migrated. In the first, cold data whose number of uses is greater than the use count threshold is determined as cold data to be migrated, and, provided the current storage amount of the second storage space is greater than the storage amount threshold, hot data whose latest use time is more than a preset duration before the current time is determined as hot data to be migrated. In the second, the cold and hot data to be migrated are determined using migration weights calculated from the usage period and the number of uses of the data, so that cold and hot data are distinguished comprehensively and accurately.
The invention also provides the electronic equipment. The electronic device of the embodiment of the invention comprises: one or more processors; and the storage device is used for storing one or more programs, and when the one or more programs are executed by the one or more processors, the one or more processors realize the data storage method provided by the invention.
Referring now to FIG. 8, shown is a block diagram of a computer system 800 suitable for use in implementing an electronic device of an embodiment of the present invention. The electronic device shown in fig. 8 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 8, the computer system 800 includes a Central Processing Unit (CPU) 801 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 802 or a program loaded from a storage section 808 into a Random Access Memory (RAM) 803. The RAM 803 also stores various programs and data necessary for the operation of the computer system 800. The CPU 801, ROM 802, and RAM 803 are connected to each other via a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
The following components are connected to the I/O interface 805: an input section 806 including a keyboard, a mouse, and the like; an output section 807 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 808 including a hard disk and the like; and a communication section 809 including a network interface card such as a LAN card, a modem, or the like. The communication section 809 performs communication processing via a network such as the internet. A drive 810 is also connected to the I/O interface 805 as necessary. A removable medium 811 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 810 as necessary, so that a computer program read out therefrom is installed into the storage section 808 as necessary.
In particular, the processes described in the main step diagrams above may be implemented as computer software programs, according to embodiments of the present disclosure. For example, embodiments of the invention include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the main step diagram. In the above-described embodiment, the computer program can be downloaded and installed from a network via the communication section 809 and/or installed from the removable medium 811. The computer program, when executed by the central processing unit 801, performs the above-described functions defined in the system of the present invention.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present invention may be implemented by software or hardware. The described units may also be provided in a processor, which may be described as: a processor including a bypass monitoring unit, a hot and cold marking unit, an arbitration unit, and a data migration unit. The names of the units do not, in some cases, constitute a limitation on the units themselves; for example, the bypass monitoring unit may also be described as a "unit providing data used in a statistical period to the hot and cold marking unit".
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to perform steps comprising: monitoring used data in a preset statistical period to acquire use detail information of the data in the statistical period; wherein the data comprises cold data stored in a first storage space and hot data stored in a second storage space, and the use detail information of each piece of data comprises the number of uses and the use time of the data in the statistical period; when the use detail information of any cold data in the used data meets a preset cold data migration condition, determining the cold data as cold data to be migrated; when the use detail information of any hot data in the used data meets a preset hot data migration condition and the current storage amount of the second storage space is greater than a preset storage amount threshold, determining the hot data as hot data to be migrated; and migrating the cold data to be migrated to the second storage space, and migrating the hot data to be migrated to the first storage space.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method of storing data, comprising:
monitoring used data in a preset statistical period to acquire use detail information of the data in the statistical period; wherein the data comprises cold data stored in a first storage space and hot data stored in a second storage space, and the use detail information of each piece of data comprises the number of uses and the use time of the data in the statistical period;
when the use detail information of any cold data in the used data meets a preset cold data migration condition, determining the cold data as cold data to be migrated; when the use detail information of any hot data in the used data meets a preset hot data migration condition and the storage amount of the current second storage space is larger than a preset storage amount threshold value, determining the hot data as hot data to be migrated;
and migrating the cold data to be migrated to the second storage space, and migrating the hot data to be migrated to the first storage space.
2. The method according to claim 1, wherein the monitoring of the data used in the preset statistical period comprises:
monitoring the query request and the return result thereof and/or the update request and the return result thereof in the statistical period, and determining the unique identifier of the queried and/or updated data; the unique identification of the data is formed according to the corresponding library name, the table name and the primary key of the data.
3. The method according to claim 1, wherein the usage detail information of any cold data in the used data meeting a preset cold data migration condition comprises:
the number of uses of the cold data being greater than a preset use count threshold.
4. The method according to claim 3, wherein the usage detail information of any hot data in the used data meeting a preset hot data migration condition comprises:
the time elapsed from the latest use time of the hot data to the current time being greater than a preset duration.
5. The method according to claim 1, wherein the usage detail information of any data further includes a migration weight of the data, the migration weight being a sum of a first weight and a second weight; wherein
the first weight is the ratio of the usage period of the data to the average usage period of the used data, the second weight is the ratio of the number of uses of the data to the average number of uses of the used data, and the usage period of the data is the length of time between its earliest use time and its latest use time within the statistical period.
6. The method of claim 5, wherein
the use detail information of any cold data in the used data meeting a preset cold data migration condition comprises: the migration weight of the cold data being greater than a preset first weight threshold; and
the use detail information of any hot data in the used data meeting a preset hot data migration condition comprises: the migration weight of the hot data being smaller than a preset second weight threshold.
7. A data storage device, comprising:
the bypass monitoring unit is used for monitoring the used data in a preset statistical period;
the cold and hot marking unit is used for acquiring the use detail information of the data in the statistical period; wherein the data comprises cold data stored in a first storage space and hot data stored in a second storage space, and the use detail information of each piece of data comprises the number of uses and the use time of the data in the statistical period;
an arbitration unit to: when the use detail information of any cold data in the used data meets a preset cold data migration condition, determining the cold data as cold data to be migrated; when the use detail information of any hot data in the used data meets a preset hot data migration condition and the storage amount of the current second storage space is larger than a preset storage amount threshold value, determining the hot data as hot data to be migrated;
and the data migration unit is used for migrating the cold data to be migrated to the second storage space and migrating the hot data to be migrated to the first storage space.
8. The apparatus of claim 7, wherein the bypass monitoring unit is further configured to: monitoring the query request and the return result thereof and/or the update request and the return result thereof in the statistical period, and determining the unique identifier of the queried and/or updated data; the unique identification of the data is formed according to the library name, the table name and the primary key corresponding to the data;
the arbitration unit is further configured to: when the number of uses of any cold data in the used data is greater than a preset use count threshold, determining the cold data as cold data to be migrated; when the time elapsed from the latest use time of any hot data in the used data to the current time is greater than a preset duration and the current storage amount of the second storage space is greater than the preset storage amount threshold, determining the hot data as hot data to be migrated; or,
when the migration weight of any cold data in the used data is greater than a preset first weight threshold, determining the cold data as cold data to be migrated; when the migration weight of any hot data in the used data is smaller than a preset second weight threshold and the current storage amount of the second storage space is greater than the storage amount threshold, determining the hot data as hot data to be migrated; wherein
the usage detail information of any data further includes a migration weight of the data, the migration weight being the sum of a first weight and a second weight; the first weight is the ratio of the usage period of the data to the average usage period of the used data, the second weight is the ratio of the number of uses of the data to the average number of uses of the used data, and the usage period of the data is the length of time between its earliest use time and its latest use time within the statistical period.
9. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-6.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-6.
CN202011402506.7A 2020-12-02 2020-12-02 Data storage method and device Pending CN113760176A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011402506.7A CN113760176A (en) 2020-12-02 2020-12-02 Data storage method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011402506.7A CN113760176A (en) 2020-12-02 2020-12-02 Data storage method and device

Publications (1)

Publication Number Publication Date
CN113760176A true CN113760176A (en) 2021-12-07

Family

ID=78786152

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011402506.7A Pending CN113760176A (en) 2020-12-02 2020-12-02 Data storage method and device

Country Status (1)

Country Link
CN (1) CN113760176A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114924696A (en) * 2022-07-18 2022-08-19 上海有孚数迅科技有限公司 Method, apparatus, medium, and program product for storage management

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140181400A1 (en) * 2011-12-31 2014-06-26 Huawei Technologies Co., Ltd. Method for A Storage Device Processing Data and Storage Device
CN108363553A (en) * 2018-01-31 2018-08-03 北京兰云科技有限公司 A kind of data processing method, apparatus and system
WO2020087927A1 (en) * 2018-10-31 2020-05-07 华为技术有限公司 Method and device for memory data migration
CN110673785A (en) * 2019-08-23 2020-01-10 上海科技发展有限公司 Data storage method and system
CN110531938A (en) * 2019-09-02 2019-12-03 广东紫晶信息存储技术股份有限公司 A kind of cold and hot data migration method and system based on various dimensions
CN110989937A (en) * 2019-12-06 2020-04-10 浪潮电子信息产业股份有限公司 Data storage method, device and equipment and computer readable storage medium
CN111427844A (en) * 2020-04-15 2020-07-17 成都信息工程大学 Data migration system and method for file hierarchical storage
CN111562889A (en) * 2020-05-14 2020-08-21 杭州海康威视系统技术有限公司 Data processing method, device, system and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination