CN114860663A - Data storage method, device, equipment and computer readable storage medium - Google Patents

Info

Publication number
CN114860663A
Authority
CN
China
Prior art keywords
access
attribute
storage
target storage
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210589221.1A
Other languages
Chinese (zh)
Inventor
邹杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chuangyou Digital Technology Guangdong Co Ltd
Original Assignee
Chuangyou Digital Technology Guangdong Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chuangyou Digital Technology Guangdong Co Ltd filed Critical Chuangyou Digital Technology Guangdong Co Ltd
Priority to CN202210589221.1A priority Critical patent/CN114860663A/en
Publication of CN114860663A publication Critical patent/CN114860663A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • G06F16/13 File access structures, e.g. distributed indices
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G06F16/182 Distributed file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a data storage method, apparatus, device and computer-readable storage medium that can be applied to a file storage system with Hadoop as its core. They can overcome the shortcomings of the storage modes provided by conventional Hadoop, provide high availability of data in the file storage system, avoid heavy consumption of data storage space, and preserve the read-write speed of the data. The storage method comprises the following steps: determining access attributes of a target storage file in a file storage system, wherein the access attributes comprise: a cold attribute and a hot attribute; generating, for the target storage file, a target storage strategy matched to the access attribute; querying a historical storage strategy of the target storage file in the file storage system; and executing a storage operation on the target storage file according to the result of comparing the target storage strategy with the historical storage strategy.

Description

Data storage method, device, equipment and computer readable storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a data storage method, an apparatus, a device, and a computer-readable storage medium.
Background
With the development of technology, many big data components with Hadoop as their core have emerged in the big data field, such as Hive, Spark, Presto, and Impala.
Hadoop is a distributed system infrastructure. With Hadoop, a user can develop distributed programs without knowing the underlying details of the distributed system, and can make full use of a cluster for high-speed computation and storage.
In Hadoop, to achieve high availability of cluster storage, the HDFS file storage system provides two storage modes: block replication (multiple copies of each file block) and erasure coding. Block replication improves data availability, but it also incurs 200% extra storage space overhead and consumes extra bandwidth when data is written. Erasure coding improves storage space utilization and data write speed considerably compared with block replication, but it does not improve data read speed. Therefore, in a file storage system with Hadoop as its core, it is difficult to achieve high availability of data, avoid heavy consumption of data storage space, and preserve the read-write speed of data all at once.
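The 200% figure follows directly from keeping three copies of every block. The short Python sketch below works through that arithmetic and, for comparison, does the same for a Reed-Solomon layout with 6 data units and 3 parity units. The RS(6,3) layout is an illustrative assumption and is not taken from the application; the function name `extra_overhead` is likewise hypothetical.

```python
def extra_overhead(data_units: int, redundancy_units: int) -> float:
    """Extra storage consumed, as a fraction of the original data size."""
    return redundancy_units / data_units


# 3-way block replication: 1 original block plus 2 additional copies.
replication = extra_overhead(data_units=1, redundancy_units=2)

# Assumed Reed-Solomon (6, 3) erasure coding: 6 data units plus 3 parity units.
erasure = extra_overhead(data_units=6, redundancy_units=3)

print(f"3-way replication: {replication:.0%} extra storage")
print(f"RS(6,3) erasure coding: {erasure:.0%} extra storage")
```

Under these assumptions the replicated file costs 200% extra space and the erasure-coded file 50%, which is the trade-off the background paragraph describes.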
Disclosure of Invention
The application provides a data storage method, a data storage device, data storage equipment and a computer readable storage medium, which can be applied to a file storage system with Hadoop as a core, can overcome the defects of a storage mode provided by the conventional Hadoop, meet the high availability of data in the file storage system, avoid the large consumption of data storage space and ensure the read-write speed of the data.
In view of the above, a first aspect of the present application provides a data storage method, including:
determining access attributes of a target storage file in a file storage system, wherein the access attributes comprise: a cold attribute and a hot attribute;
generating a target storage strategy which is matched with the access attribute for the target storage file;
querying a historical storage strategy of the target storage file in the file storage system;
and executing storage operation on the target storage file according to the comparison result of the target storage strategy and the historical storage strategy.
Optionally, the determining the access attribute of the target storage file in the file storage system specifically includes:
acquiring a target storage file and a log file in a file storage system;
filtering out access data corresponding to the target storage file from the log file;
and determining the access attribute of the target storage file according to the access data.
Optionally, the cold attributes comprise: a precooling attribute; the hot attributes comprise: a potential hotspot attribute;
the determining, according to the access data, an access attribute of the target storage file specifically includes:
calculating access growth data according to the access data;
acquiring reference access data and reference access growth data corresponding to the precooling attribute and the potential hotspot attribute respectively;
comparing the access data with reference access data corresponding to the precooling attribute and the potential hotspot attribute respectively to obtain a first comparison result;
comparing the access growth data with reference access growth data corresponding to the precooling attribute and the potential hotspot attribute respectively to obtain a second comparison result;
and determining that the access attribute of the target storage file is a cold attribute or a hot attribute by combining the first comparison result and the second comparison result.
Optionally, the determining, by combining the first comparison result and the second comparison result, that the access attribute of the target storage file is a cold attribute or a hot attribute specifically includes:
when the first comparison result is that the access data is larger than 0 and smaller than reference access data corresponding to a precooling attribute, and the second comparison result is that the access growth data is smaller than reference access growth data corresponding to a precooling attribute, determining the precooling attribute as the access attribute of the target storage file;
and when the first comparison result is that the access data is larger than 0 and smaller than the reference access data corresponding to the potential hotspot attribute, and the second comparison result is that the access growth data is larger than the reference access growth data corresponding to the potential hotspot attribute, determining the potential hotspot attribute as the access attribute of the target storage file.
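The two rules above can be sketched in Python as follows. This is a minimal illustration, not the applicant's implementation: the access data is collapsed into a single number, the reference values are hypothetical placeholders (the application leaves them configurable), and the names `Reference` and `classify_by_growth` are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Reference:
    access: float  # reference access data for the attribute
    growth: float  # reference access growth data for the attribute


# Hypothetical reference values; the application does not fix them.
PRECOOLING = Reference(access=10, growth=0.0)
POTENTIAL_HOTSPOT = Reference(access=50, growth=0.5)


def classify_by_growth(access: float, growth: float) -> Optional[str]:
    """Combine the first (access) and second (growth) comparison results."""
    # Precooling: some access, below the reference, and growing slower than the reference.
    if 0 < access < PRECOOLING.access and growth < PRECOOLING.growth:
        return "precooling"
    # Potential hotspot: access still below the reference, but growing faster than it.
    if 0 < access < POTENTIAL_HOTSPOT.access and growth > POTENTIAL_HOTSPOT.growth:
        return "potential hotspot"
    return None  # neither rule applies; other attributes must be checked


print(classify_by_growth(access=5, growth=-0.2))
print(classify_by_growth(access=20, growth=1.0))
```

Returning `None` when neither rule matches is a design choice of this sketch; it leaves room for the remaining attributes to be evaluated by separate rules.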
Optionally, the access data comprises: a number of visitors and a number of visits;
the calculating access growth data according to the access data specifically includes:
calculating a visitor-count growth rate in the access growth data based on the number of visitors;
and calculating a visit-count growth rate in the access growth data based on the number of visits.
Optionally, the cold attributes comprise: a freezing attribute; the hot attributes comprise: a current hotspot attribute, a thawing attribute and a local hotspot attribute;
the determining, according to the access data, an access attribute of the target storage file specifically includes:
when the access data of the last N days is 0, determining the freezing attribute as the access attribute of the target storage file, wherein N is a natural number;
when the access data is larger than the reference access data corresponding to the current hotspot attribute, determining the current hotspot attribute as the access attribute of the target storage file;
when the access data of the last N days is not 0 and the access data of the N days before that is 0, determining the thawing attribute as the access attribute of the target storage file;
and when the number of visitors in the access data is larger than 0 and smaller than the reference number of visitors corresponding to the local hotspot attribute, and the number of visits in the access data is larger than the reference number of visits corresponding to the local hotspot attribute, determining the local hotspot attribute as the access attribute of the target storage file.
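A minimal Python sketch of these four rules is given below. Several points are assumptions made only for illustration: the window length `N`, all reference values, the use of the visit count as "the access data" in the current hotspot rule, and the order in which the rules are tested (the application does not state a precedence). The function name `classify_by_window` is hypothetical.

```python
from typing import Optional, Sequence

# Hypothetical settings; the application leaves N and the references configurable.
N = 7
CURRENT_HOTSPOT_VISITS = 1000
LOCAL_HOTSPOT_VISITORS = 5
LOCAL_HOTSPOT_VISITS = 100


def classify_by_window(daily_visits: Sequence[int], visitors: int) -> Optional[str]:
    """daily_visits: at least 2*N daily visit counts, oldest first.
    visitors: number of distinct users in the last N days."""
    recent = sum(daily_visits[-N:])           # the last N days
    previous = sum(daily_visits[-2 * N:-N])   # the N days before that
    if recent == 0:
        return "freezing"
    if recent > CURRENT_HOTSPOT_VISITS:
        return "current hotspot"
    if previous == 0:                         # recent != 0 is already known here
        return "thawing"
    if 0 < visitors < LOCAL_HOTSPOT_VISITORS and recent > LOCAL_HOTSPOT_VISITS:
        return "local hotspot"
    return None


print(classify_by_window([0] * 7 + [1] * 7, visitors=3))
```

With these placeholder values, a file that had no visits for a week and then one visit per day for the following week is classified as thawing.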
Optionally, the generating a target storage policy adapted to the access attribute for the target storage file specifically includes:
when the access attribute is a cold attribute, determining an object storage policy as a target storage policy of the target storage file;
and when the access attribute is a hot attribute, determining a copy storage policy as a target storage policy of the target storage file.
Optionally, the copy storage policy includes: high copy and normal copy;
when the access attribute is a hot attribute, determining a copy storage policy as a target storage policy of the target storage file, specifically including:
when the access attribute is the current hotspot attribute or the potential hotspot attribute among the hot attributes, determining high copy as the target storage strategy of the target storage file;
and when the access attribute is the thawing attribute or the local hotspot attribute among the hot attributes, determining normal copy as the target storage strategy of the target storage file.
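The mapping from access attribute to target storage policy can be sketched as a simple lookup. The attribute and policy names below are plain strings chosen for this example, and `target_policy` is a hypothetical helper; the grouping itself follows the text above.

```python
# Grouping of attributes as described in the text; names are illustrative strings.
COLD_ATTRIBUTES = {"precooling", "freezing"}
HIGH_COPY_ATTRIBUTES = {"current hotspot", "potential hotspot"}
NORMAL_COPY_ATTRIBUTES = {"thawing", "local hotspot"}


def target_policy(attribute: str) -> str:
    """Return the target storage policy adapted to an access attribute."""
    if attribute in COLD_ATTRIBUTES:
        return "object storage"
    if attribute in HIGH_COPY_ATTRIBUTES:
        return "high copy"
    if attribute in NORMAL_COPY_ATTRIBUTES:
        return "normal copy"
    raise ValueError(f"unknown access attribute: {attribute!r}")


for name in ("freezing", "potential hotspot", "thawing"):
    print(name, "->", target_policy(name))
```

Raising an error for an unknown attribute is a choice of this sketch, so that a misspelled attribute is not silently stored under a default policy.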
Optionally, the executing, according to the comparison result between the target storage policy and the historical storage policy, a storage operation on the target storage file specifically includes:
comparing the target storage strategy with the historical storage strategy to obtain a comparison result;
when the comparison result is that the target storage strategy is inconsistent with the historical storage strategy, updating the historical storage strategy into the target storage strategy, and executing storage operation on the target storage file according to the target storage strategy;
and when the comparison result shows that the target storage strategy is consistent with the historical storage strategy, continuing to execute storage operation on the target storage file according to the historical storage strategy.
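The comparison step can be sketched as follows. This is only an illustration under stated assumptions: the historical policies are kept in an in-memory dictionary keyed by file name, and `apply_policy` stands in for whatever mechanism actually re-stores the file; both, like the function name `store`, are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def store(file_name: str,
          target: str,
          history: Dict[str, str],
          apply_policy: Callable[[str, str], None]) -> bool:
    """Compare target and historical policy; return True if the policy was updated."""
    historical = history.get(file_name)   # query the historical storage policy
    if historical != target:
        history[file_name] = target       # update the historical policy to the target
        apply_policy(file_name, target)   # store the file under the target policy
        return True
    # Consistent: the file simply stays under its historical policy.
    return False


# Usage: record which policy changes were actually applied.
applied: List[Tuple[str, str]] = []
history = {"/data/a": "normal copy"}
first = store("/data/a", "high copy", history, lambda f, p: applied.append((f, p)))
second = store("/data/a", "high copy", history, lambda f, p: applied.append((f, p)))
print(first, second, applied)
```

Only the first call changes anything, so a file whose access attribute is stable is not rewritten on every evaluation.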
A second aspect of the present application provides a data storage device comprising:
a determining unit, configured to determine an access attribute of a target storage file in a file storage system, where the access attribute includes: a cold attribute and a hot attribute;
the generating unit is used for generating a target storage strategy which is adaptive to the access attribute for the target storage file;
the query unit is used for querying a historical storage strategy of the target storage file in the file storage system;
and the storage unit is used for executing storage operation on the target storage file according to the comparison result of the target storage strategy and the historical storage strategy.
A third aspect of the present application provides a data storage device comprising a processor and a memory;
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to execute any of the data storage methods of the first aspect according to instructions in the program code.
A fourth aspect of the present application provides a computer-readable storage medium for storing program code for performing the data storage method of any one of the first aspects.
According to the technical scheme, the method has the following advantages:
The applicant studied the storage modes of the existing HDFS file storage system and found that the root cause of wasted storage resources and high storage cost is that the same storage strategy is applied to all storage files indiscriminately, namely, every storage file is kept as 3 copies. This improves data availability, but it also incurs a 200% storage space overhead and therefore a higher storage cost.
In the data storage method provided by the application, to avoid heavy consumption of data storage space in the file storage system, the access attribute of the target storage file in the file storage system is determined first, where the access attributes comprise a cold attribute and a hot attribute. A target storage strategy matched to the access attribute is then generated for the target storage file. Because the target storage strategy is generated only after the access attribute has been determined to be a cold attribute or a hot attribute, the target storage strategy changes whenever the access attribute changes. After the currently adapted target storage strategy has been formulated for the target storage file, the historical storage strategy of the target storage file is queried in the file storage system and compared with the target storage strategy, and a storage operation is executed on the target storage file according to the comparison result. Since the comparison result varies with the target storage strategy, the storage operation executed on the target storage file varies accordingly. In other words, the same target storage file is matched with different storage strategies and storage operations in the file storage system when its access attribute differs. This helps ensure high availability of the data of the target storage file at any time during system service, and can reduce the extra bandwidth consumed when accessing the target storage file while maintaining the data read-write speed.
Drawings
In order to more clearly illustrate the technical method in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings can be obtained by those skilled in the art without inventive labor.
Fig. 1 is a schematic flowchart of a first embodiment of a data storage method in an embodiment of the present application;
Fig. 2 is a schematic flowchart of a second embodiment of a data storage method in an embodiment of the present application;
Fig. 3 is a schematic structural diagram of an embodiment of a data storage device according to an embodiment of the present application;
Fig. 4 is a block diagram of an implementation of a data storage device according to an embodiment of the present application.
Detailed Description
The application designs a data storage method, a data storage device, data storage equipment and a computer readable storage medium, which can be applied to a file storage system with Hadoop as a core, can overcome the defects of a storage mode provided by the conventional Hadoop, meet the high availability of data in the file storage system, avoid the large consumption of data storage space and ensure the read-write speed of the data.
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.
For easy understanding, please refer to fig. 1, in which fig. 1 is a schematic flowchart of a first embodiment of a data storage method according to an embodiment of the present application.
As shown in fig. 1, a data storage method in this embodiment specifically includes:
Step 101, determining access attributes of a target storage file in a file storage system, wherein the access attributes comprise: a cold attribute and a hot attribute.
In this embodiment, the storage operation of the target storage file is performed based on the access attribute of the file. Thus, when the access attribute of the target storage file is changed, the storage operation of the target storage file in the file storage system may also be changed. Therefore, in the present embodiment, the access attribute of the target storage file in the file storage system is preferentially determined.
It is understood that the access attribute is determined according to the access behavior of users toward the target storage file, so the access attribute represents how heavily users access the target storage file: heavier access indicates more frequent access behavior toward the target storage file, and vice versa. In this embodiment, the access attribute is defined as a cold attribute or a hot attribute in order to distinguish different access heat levels. When the target storage file has a hot attribute, users access the target storage file frequently, and the target storage file is frequently read and written; when the target storage file has a cold attribute, users access the target storage file less, and the target storage file is read and written less often.
It should be understood that the target storage file may be any storage file in the file storage system, which is not limited and described in this embodiment.
Step 102, generating a target storage strategy matched with the access attribute for the target storage file.
Different access attributes may correspond to different target storage policies. Therefore, after the access attribute corresponding to the target storage file has been determined, a target storage policy adapted to the access attribute may be generated for the target storage file. For example: when the access attribute of the target storage file is a cold attribute, a target storage strategy matched with the cold attribute is generated for the file; when the access attribute of the target storage file is a hot attribute, a target storage strategy matched with the hot attribute is generated for the file. Further, when the access attribute changes, the target storage policy may also change. For example: when the access attribute changes from a historical cold attribute to a current hot attribute, the target storage policy also changes from the policy adapted to the cold attribute to the policy adapted to the hot attribute.
Step 103, querying a historical storage strategy of the target storage file in the file storage system.
The storage operation of the target storage file in the file storage system is performed based on the result of comparing the currently adapted target storage strategy with the previously adapted historical storage strategy. Therefore, after the currently adapted target storage strategy has been formulated for the target storage file, the historical storage strategy of the target storage file in the file storage system is queried.
It will be appreciated that in an alternative embodiment, the query to the history storage policy may be: the file storage system records a history storage policy in the target storage file, and performs query of the history storage policy by using file information (such as a file name or a file number) of the target storage file as a query index during query.
Step 104, executing a storage operation on the target storage file according to the comparison result of the target storage strategy and the historical storage strategy.
The storage operation of the target storage file is an operation executed according to a comparison result of the current target storage policy and the historical storage policy. The comparison result may vary with different target storage policies, so that the storage operation performed on the target storage file may also vary with the target storage policy, and the target storage policy may also vary with different access attributes, that is, under the condition that the same target storage file has different access attributes, the target storage policy and the storage operation that are correspondingly matched in the file storage system may also differ.
It can be understood that the comparison between the target storage policy and the historical storage policy can be implemented in various ways, which is not limited and described in this embodiment.
In the data storage method in this embodiment, to avoid heavy consumption of data storage space in the file storage system, the access attribute of the target storage file in the file storage system is determined first, where the access attributes comprise a cold attribute and a hot attribute. A target storage strategy matched to the access attribute is then generated for the target storage file. Because the target storage strategy is generated only after the access attribute has been determined to be a cold attribute or a hot attribute, the target storage strategy changes whenever the access attribute changes. After the currently adapted target storage strategy has been formulated for the target storage file, the historical storage strategy of the target storage file is queried in the file storage system and compared with the target storage strategy, and a storage operation is executed on the target storage file according to the comparison result. Since the comparison result varies with the target storage strategy, the storage operation executed on the target storage file varies accordingly. In other words, the same target storage file is matched with different storage strategies and storage operations in the file storage system when its access attribute differs. This helps ensure high availability of the data of the target storage file at any time during system service, and can reduce the extra bandwidth consumed when accessing the target storage file while maintaining the data read-write speed.
The foregoing is a first embodiment of a data storage method provided in the embodiments of the present application, and the following is a second embodiment of the data storage method provided in the embodiments of the present application.
Referring to fig. 2, fig. 2 is a flowchart illustrating a second embodiment of a data storage method according to an embodiment of the present application.
As shown in fig. 2, a data storage method in this embodiment specifically includes:
Step 201, acquiring a target storage file and a log file in a file storage system.
In this embodiment, the access data of the target storage file can be obtained from the log file, so that the target storage file and the log file in the file storage system need to be obtained first. In a file storage system, a log file is a file type independent of a storage file, and log data recorded therein includes some access data, bug information, download data, storage data, record data of a system for a certain processing operation that has been completed, and the like. It is understood that the log file can be updated by day, and the content can be stored by day.
The log file can be obtained in real time, that is, fetched from the file storage system at the moment the storage operation of the target storage file needs to be configured. The log file can also be obtained in advance, that is, fetched from the file storage system beforehand and used directly when the storage operation of the target storage file is configured. When a more accurate storage operation is needed, the former mode can be selected, so that the acquired log file is the latest one and best reflects the current state, and the resulting storage operation is likewise best suited to the current state. When a shorter computation time and higher computation efficiency are required, the latter mode can be selected; to further regulate the acquisition of the log file, timed acquisition can be configured, for example at 00:00 and 06:00 every day.
Step 202, filtering out access data corresponding to the target storage file from the log file.
In this embodiment, the access attribute is determined according to an access behavior of the user to the target storage file, and the access behavior of the user to the target storage file is embodied in the form of access data and recorded in the log file, so that the access data corresponding to the target storage file needs to be filtered from the log file.
The data recorded in the log file is diversified, the access data is only one of the data, the access data recorded in the log file is the access data of all storage files in the file storage system, and the access data corresponding to the target storage file needs to be screened from all the access data.
In one embodiment, the filtering of the access data corresponding to the target storage file from the log file may be: all log data corresponding to the target storage file are filtered from the log file, and then access data are filtered from all log data, and the corresponding implementation steps can include:
Step 2021, using the file information of the target storage file as an index, filtering out all log data corresponding to the target storage file from the log file.
Step 2022, using the access request as an index, querying the access data corresponding to the target storage file from all of that log data.
In another embodiment, the filtering of the access data corresponding to the target storage file from the log file may be: filtering all access data from the log file, and then filtering the access data corresponding to the target storage file from all the access data, wherein the corresponding implementation steps may include:
Step 2021, using the access request as an index, querying all the access data from the log data.
Step 2022, using the file information of the target storage file as an index, filtering out the access data corresponding to the target storage file from all the access data.
It is understood that the access request may be identified by an access request name, an access request command, or the like; any identifier that distinguishes access requests or access behavior may be used, and this embodiment places no limitation on it. The file information of the target storage file may be a file name or a file number, which is likewise not limited in this embodiment.
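The two filtering orders described above can be sketched in Python. The log record layout (dictionaries with `file`, `request` and `user` keys), the set of request names treated as accesses, and the function names are all hypothetical; the application does not specify a log format.

```python
from typing import Dict, List

LogRecord = Dict[str, str]

# Hypothetical request names that count as an access.
ACCESS_REQUESTS = {"open", "read"}


def filter_file_then_access(log: List[LogRecord], file_name: str) -> List[LogRecord]:
    """First variant: select the file's log data, then keep only access requests."""
    by_file = [r for r in log if r["file"] == file_name]
    return [r for r in by_file if r["request"] in ACCESS_REQUESTS]


def filter_access_then_file(log: List[LogRecord], file_name: str) -> List[LogRecord]:
    """Second variant: select all access requests, then keep only the file's records."""
    all_access = [r for r in log if r["request"] in ACCESS_REQUESTS]
    return [r for r in all_access if r["file"] == file_name]


log = [
    {"file": "/data/a", "request": "open", "user": "u1"},
    {"file": "/data/a", "request": "delete", "user": "u2"},
    {"file": "/data/b", "request": "open", "user": "u1"},
    {"file": "/data/a", "request": "read", "user": "u3"},
]
access_a = filter_file_then_access(log, "/data/a")
print(len(access_a), access_a == filter_access_then_file(log, "/data/a"))
```

Both orders yield the same records; which one is cheaper depends on whether the log holds more records for other files or more non-access records.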
Specifically, the access data may include a number of visitors and a number of visits. The number of visitors is the number of distinct users who accessed the target storage file; the number of visits is the total number of accesses to the target storage file, regardless of which user made them. When 1 user accesses the target storage file 3 times, the number of visitors in the access data is 1 and the number of visits is 3; when 3 users each access the target storage file once, the number of visitors in the access data is 3 and the number of visits is 3.
For ease of understanding, this embodiment describes the implementation of the steps with a specific example: the number of visitors corresponding to target storage file a is 30, and the number of visits is 100.
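The distinction between the two counts can be made concrete with a short Python sketch. It assumes the access data has already been reduced to a sequence of user identifiers, one per access; the function name `summarize_access` is hypothetical.

```python
from typing import Iterable, Tuple


def summarize_access(users: Iterable[str]) -> Tuple[int, int]:
    """Return (number of visitors, number of visits) from one user id per access."""
    accesses = list(users)
    return len(set(accesses)), len(accesses)


# 1 user accessing the file 3 times: 1 visitor, 3 visits.
print(summarize_access(["u1", "u1", "u1"]))
# 3 users accessing the file once each: 3 visitors, 3 visits.
print(summarize_access(["u1", "u2", "u3"]))
```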
Step 203, according to the access data, determining the access attribute of the target storage file.
It is understood that a plurality of attributes are preset in the file storage system, for example: a current hotspot attribute, a potential hotspot attribute, a thawing attribute, a local hotspot attribute, a precooling attribute, and a freezing attribute. In this embodiment, the access attribute corresponding to the target storage file may be determined, according to the access data, from among the current hotspot attribute, the potential hotspot attribute, the thawing attribute, the local hotspot attribute, the precooling attribute, and the freezing attribute.
The current hotspot attribute means that many users currently access the target storage file frequently. The potential hotspot attribute means that, although only a small number of users currently access the target storage file, access has grown substantially compared with historical access. The thawing attribute means that no user accessed the target storage file during the previous period, but users have accessed it during the most recent period. The local hotspot attribute means that only a few users access the target storage file, but they do so frequently. The precooling attribute means that almost no user accesses the target storage file, either historically or currently. The freezing attribute means that no user has accessed the target storage file during the most recent period. The access behavior characteristics of the current hotspot attribute, the potential hotspot attribute, the thawing attribute, the local hotspot attribute, the precooling attribute and the freezing attribute are shown in Table 1 below:
TABLE 1
[Table 1 is provided as an image in the original publication (Figures BDA0003666844440000101 and BDA0003666844440000111).]
It can be understood that the access behavior characteristics in table 1 are specifically expressed by the reference access data and the reference access growth data corresponding to each attribute, and specific numerical values of the reference access data and the reference access growth data corresponding to each attribute may be set as needed, which is not specifically limited in this embodiment.
In a specific embodiment, whether the access attribute of the target storage file is a cold attribute or a hot attribute may be determined by combining the access data of the target storage file with the access growth data corresponding to that access data. In this embodiment, the cold attributes may include the pre-cooling attribute, and the hot attributes may include the potential hotspot attribute. Determining the access attribute of the target storage file according to the access data specifically comprises the following steps:
Step 2031, calculating access growth data according to the access data.
Specifically, in one optional implementation, the access data comprises the number of visitors and the number of visits, and the access growth data may include a visitor growth rate and a visit growth rate. In this case, calculating the access growth data according to the access data specifically includes:
calculating the visitor growth rate in the access growth data based on the number of visitors;
calculating the visit growth rate in the access growth data based on the number of visits.
It can be understood that the visitor growth rate indicates the rate at which the number of visitors grows. The number of visitors and the number of visits acquired in step 202 are the current access data of the target storage file; the historical number of visitors and number of visits of the target storage file also need to be acquired, and the historical number of visitors is then compared with the current number of visitors to obtain the visitor growth rate. The visit growth rate can be obtained in the same way.
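A minimal sketch of step 2031 follows. The source only states that historical and current values are compared; the relative-change formula used here, the handling of a zero historical value, and the function names are assumptions for illustration.

```python
def growth_rate(current, historical):
    """Relative change from the historical value to the current value.

    Assumed formula: (current - historical) / historical.
    """
    if historical == 0:
        # Assumption: with no historical accesses, any current access
        # counts as unbounded growth; no access at all counts as no growth.
        return float("inf") if current > 0 else 0.0
    return (current - historical) / historical


def access_growth(current, historical):
    """current / historical: (number_of_visitors, number_of_visits) pairs.

    Returns (visitor_growth_rate, visit_growth_rate).
    """
    return (
        growth_rate(current[0], historical[0]),
        growth_rate(current[1], historical[1]),
    )
```

For example, with hypothetical historical values of 20 visitors and 50 visits and current values of 30 visitors and 100 visits, the sketch gives a visitor growth rate of 0.5 and a visit growth rate of 1.0.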
Step 2032, obtaining reference access data and reference access growth data corresponding to the pre-cooling attribute and the potential hotspot attribute respectively.
In this embodiment, the access attribute of the target storage file is determined by comparing the access data and access growth data of the target storage file with the reference access data and reference access growth data corresponding to the pre-cooling attribute and the potential hotspot attribute. Therefore, after the access data and access growth data of the target storage file are obtained, the reference access data and reference access growth data corresponding to the pre-cooling attribute and the potential hotspot attribute are obtained.
It is to be understood that, when the access data is the number of visitors and the number of visits, the reference access data may be a reference number of visitors and a reference number of visits. When the access growth data is the visitor growth rate and the visit growth rate, the reference access growth data may be a reference visitor growth rate and a reference visit growth rate.
Step 2033, comparing the access data with reference access data corresponding to the pre-cooling attribute and the potential hotspot attribute respectively to obtain a first comparison result.
Specifically, the access data is compared with the reference access data corresponding to the pre-cooling attribute and with the reference access data corresponding to the potential hotspot attribute. When the access data is the number of visitors and the number of visits, and the reference access data is the reference number of visitors and the reference number of visits, the number of visitors is compared with the reference number of visitors, and the number of visits is compared with the reference number of visits. That is, the number of visitors is compared with the reference numbers of visitors corresponding to the pre-cooling attribute and the potential hotspot attribute respectively, and the number of visits is compared with the reference numbers of visits corresponding to the pre-cooling attribute and the potential hotspot attribute respectively.
Step 2034, comparing the access growth data with reference access growth data corresponding to the pre-cooling attribute and the potential hotspot attribute respectively to obtain a second comparison result.
Specifically, the access growth data is compared with the reference access growth data corresponding to the pre-cooling attribute and with the reference access growth data corresponding to the potential hotspot attribute. When the access growth data is the visitor growth rate and the visit growth rate, and the reference access growth data is the reference visitor growth rate and the reference visit growth rate, the visitor growth rate is compared with the reference visitor growth rate, and the visit growth rate is compared with the reference visit growth rate. That is, the visitor growth rate is compared with the reference visitor growth rates corresponding to the pre-cooling attribute and the potential hotspot attribute respectively, and the visit growth rate is compared with the reference visit growth rates corresponding to the pre-cooling attribute and the potential hotspot attribute respectively.
Step 2035, combining the first comparison result and the second comparison result to determine whether the access attribute of the target storage file is a cold attribute or a hot attribute.
Specifically, after the first comparison result corresponding to the access data and the second comparison result corresponding to the access growth data are obtained, the two results may be combined to determine whether the access attribute of the target storage file is a cold attribute or a hot attribute.
In an optional implementation manner, determining, by combining the first comparison result and the second comparison result, that the access attribute of the target storage file is a cold attribute or a hot attribute specifically includes:
when the first comparison result is that the access data is larger than 0 and smaller than the reference access data corresponding to the precooling attribute, and the second comparison result is that the access growth data is smaller than the reference access growth data corresponding to the precooling attribute, determining the precooling attribute as the access attribute of the target storage file;
and when the first comparison result is that the access data is larger than 0 and smaller than the reference access data corresponding to the potential hotspot attribute, and the second comparison result is that the access growth data is larger than the reference access growth data corresponding to the potential hotspot attribute, determining the potential hotspot attribute as the access attribute of the target storage file.
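The two rules above can be sketched as follows, under stated assumptions. The `Metrics` type, the attribute labels, and the layout of `refs` are hypothetical, and the reference values are left to the caller because the source does not fix them. Each comparison is applied to both the visitor metric and the visit metric, and the rules are checked in the order listed.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    visitors: float  # number of visitors, or visitor growth rate
    visits: float    # number of visits, or visit growth rate


def all_less(a, b):
    return a.visitors < b.visitors and a.visits < b.visits


def all_greater(a, b):
    return a.visitors > b.visitors and a.visits > b.visits


def classify_by_growth(access, growth, refs):
    """refs maps an attribute label to (reference_access, reference_growth)."""
    positive = access.visitors > 0 and access.visits > 0

    ref_access, ref_growth = refs["pre_cooling"]
    if positive and all_less(access, ref_access) and all_less(growth, ref_growth):
        return "pre_cooling"  # cold attribute

    ref_access, ref_growth = refs["potential_hotspot"]
    if positive and all_less(access, ref_access) and all_greater(growth, ref_growth):
        return "potential_hotspot"  # hot attribute

    return None  # neither rule applies
```

A file that matches neither rule returns `None` in this sketch; how such a file is handled is not stated in this passage.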
In a specific embodiment, whether the access attribute of the target storage file is a cold attribute or a hot attribute may be determined using the access data of the target storage file alone. In this embodiment, the cold attributes may include the freezing attribute, and the hot attributes may include the current hotspot attribute, the unfreezing attribute, and the local hotspot attribute.
Determining the access attribute of the target storage file according to the access data may specifically include:
when the access data of the last N days is 0, determining the freezing attribute as the access attribute of the target storage file, wherein N is a natural number;
when the access data is larger than the reference access data corresponding to the current hotspot attribute, determining the current hotspot attribute as the access attribute of the target storage file;
when the access data of the last N days is not 0 and the access data of the preceding N days is 0, determining the unfreezing attribute as the access attribute of the target storage file;
and when the number of visitors in the access data is greater than 0 and less than the reference number of visitors corresponding to the local hotspot attribute, and the number of visits in the access data is greater than the reference number of visits corresponding to the local hotspot attribute, determining the local hotspot attribute as the access attribute of the target storage file.
It is understood that the last N days and the preceding N days each refer to a time period consisting of N days. The last N days is the N-day period counting back from the current day, and the preceding N days is the N-day period immediately before the last N days. For example, when the current day is the 15th and N is 3, the last N days are the 15th, 14th, and 13th; the preceding N days are the 12th, 11th, and 10th.
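The two windows can be computed with a short sketch. The function name `n_day_windows` is hypothetical, and the current day is assumed to be included in the most recent window, as in the example with the 15th.

```python
from datetime import date, timedelta


def n_day_windows(today, n):
    """Return (last_n_days, preceding_n_days) as lists of dates, newest first.

    The most recent window includes the current day; the preceding window is
    the N-day period immediately before it.
    """
    last = [today - timedelta(days=i) for i in range(n)]
    preceding = [today - timedelta(days=i) for i in range(n, 2 * n)]
    return last, preceding
```

Calling `n_day_windows(date(2022, 5, 15), 3)` (the month and year are arbitrary) gives the 15th, 14th, and 13th for the first window and the 12th, 11th, and 10th for the second.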
It is to be understood that, when access data are compared above, a relationship such as less than, greater than, or equal to means that both the number of visitors and the number of visits satisfy that relationship. For example, access data greater than 0 means that the number of visitors and the number of visits are both greater than 0; access data smaller than the reference access data corresponding to the pre-cooling attribute means that the number of visitors is less than the reference number of visitors corresponding to the pre-cooling attribute, and the number of visits is less than the reference number of visits corresponding to the pre-cooling attribute.
Similarly, when access growth data are compared, a relationship such as less than, greater than, or equal to means that both the visitor growth rate and the visit growth rate satisfy that relationship. For example, access growth data smaller than the reference access growth data corresponding to the pre-cooling attribute means that the visitor growth rate is less than the reference visitor growth rate corresponding to the pre-cooling attribute, and the visit growth rate is less than the reference visit growth rate corresponding to the pre-cooling attribute.
Likewise, where the above description refers to the access data being 0 or not 0, it means that the number of visitors and the number of visits in the access data are both 0 or both not 0.
For example, when the number of visitors of target storage file A is 30 and the number of visits is 100, the number of visitors 30 is greater than the reference number of visitors 25 corresponding to the current hotspot attribute, and the number of visits 100 is greater than the reference number of visits 80 corresponding to the current hotspot attribute; in this case, the current hotspot attribute may be used as the access attribute of the target storage file.
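The four rules that use access data alone can be sketched as below. The function name, the attribute labels, the `refs` layout, and the local hotspot thresholds in the usage note are assumptions; the reference values 25 and 80 for the current hotspot attribute come from the example above. The source does not state which rule wins when several match, so this sketch simply checks them in the order they are listed, and it treats the access data as totals over the last N days.

```python
def classify_by_access(last, preceding, refs):
    """last / preceding: (number_of_visitors, number_of_visits) totals for the
    last N days and the preceding N days.
    refs maps an attribute label to (reference_visitors, reference_visits).
    """
    visitors, visits = last

    if visitors == 0 and visits == 0:
        return "freezing"  # cold attribute

    ref_visitors, ref_visits = refs["current_hotspot"]
    if visitors > ref_visitors and visits > ref_visits:
        return "current_hotspot"

    if visitors > 0 and visits > 0 and preceding == (0, 0):
        return "unfreezing"

    ref_visitors, ref_visits = refs["local_hotspot"]
    if 0 < visitors < ref_visitors and visits > ref_visits:
        return "local_hotspot"

    return None  # no rule in this group applies
```

With `refs = {"current_hotspot": (25, 80), "local_hotspot": (5, 50)}`, where the second pair is purely illustrative, target storage file A with 30 visitors and 100 visits is classified as `"current_hotspot"`.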
Step 204, when the access attribute is a cold attribute, determining the object storage policy as the target storage policy of the target storage file.
In this embodiment, different access attributes may correspond to different storage policies. When the access attribute of the target storage file is a cold attribute, the target storage file has rarely been accessed by users in the recent period. For a target storage file with a cold attribute, because user demand for access is low, the target storage policy of the target storage file may be set to the object storage policy in order to save storage space and reduce storage cost; that is, the target storage file may be stored using Object Storage Service (OSS), which has a low storage cost.
Step 205, when the access attribute is a hot attribute, determining the copy storage policy as the target storage policy of the target storage file.
When the access attribute of the target storage file is a hot attribute, the target storage file is expected to be accessed by users relatively frequently in the recent period. For a target storage file with a hot attribute, because user demand for access is high, the target storage policy of the target storage file may be set to a copy storage policy in order to improve access efficiency, meet users' requirements for file read/write speed, and ensure high availability of the data of the target storage file; for example, the target storage file may be stored using Block replication storage.
The copy storage policy in this embodiment includes high copy and normal copy. High copy uses more copies than normal copy; the more copies there are, the higher the read performance and efficiency of the data, but also the higher the storage cost.
In one embodiment, when the access attribute is a hot attribute, determining the copy storage policy as the target storage policy of the target storage file includes:
when the access attribute is the current hotspot attribute or the potential hotspot attribute among the hot attributes, determining high copy as the target storage policy of the target storage file;
and when the access attribute is the unfreezing attribute or the local hotspot attribute among the hot attributes, determining normal copy as the target storage policy of the target storage file.
It can be understood from the foregoing definitions of the access attributes that the current hotspot attribute and the potential hotspot attribute have a higher access heat than the unfreezing attribute and the local hotspot attribute. Therefore, when the access attribute is the current hotspot attribute or the potential hotspot attribute, the target storage file is accessed frequently, and to ensure its read performance and efficiency, high copy, which offers higher read performance and efficiency, is determined as the target storage policy of the target storage file. When the access attribute is the unfreezing attribute or the local hotspot attribute, the target storage file is not accessed especially frequently. Weighing storage cost, file read/write performance, and read/write efficiency, normal copy, which has a relatively low storage cost and relatively balanced storage performance and efficiency while still meeting users' access needs, is determined as the target storage policy of the target storage file.
In summary, the target storage policies corresponding to different access attributes are shown in table 2 below:
TABLE 2
Access attribute | Target storage policy
Current hotspot attribute | High copy
Potential hotspot attribute | High copy
Unfreezing attribute | Normal copy
Local hotspot attribute | Normal copy
Pre-cooling attribute | Object storage
Freezing attribute | Object storage
In one specific example, when the access attribute of target storage file A is the current hotspot attribute, the corresponding target storage policy is high copy.
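The mapping in Table 2 can be expressed as a simple lookup. The string labels and the function name below are illustrative choices, not identifiers from the source.

```python
# Access attribute -> target storage policy, following Table 2.
TARGET_POLICY = {
    "current_hotspot": "high_copy",
    "potential_hotspot": "high_copy",
    "unfreezing": "normal_copy",
    "local_hotspot": "normal_copy",
    "pre_cooling": "object_storage",
    "freezing": "object_storage",
}

COLD_ATTRIBUTES = {"pre_cooling", "freezing"}


def target_storage_policy(access_attribute):
    """Return the target storage policy for a given access attribute label."""
    return TARGET_POLICY[access_attribute]
```

For target storage file A, `target_storage_policy("current_hotspot")` returns `"high_copy"`.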
Step 206, querying the historical storage policy of the target storage file in the file storage system.
It is to be understood that the description of step 206 is the same as the description of step 103 in the first embodiment, and reference may be specifically made to the above description, which is not repeated in this embodiment.
Step 207, comparing the target storage policy with the historical storage policy to obtain a comparison result.
After the historical storage policy of the target storage file and the currently adapted target storage policy are obtained, the two are compared for consistency, yielding a comparison result that indicates whether they are consistent.
For example, if the target storage policy of target storage file A is high copy and the historical storage policy is normal copy, the two are clearly inconsistent, and the comparison result is that the target storage policy is inconsistent with the historical storage policy.
Step 208, when the comparison result is that the target storage policy is inconsistent with the historical storage policy, updating the historical storage policy to the target storage policy, and performing a storage operation on the target storage file according to the target storage policy.
When the comparison result shows that the target storage policy is inconsistent with the historical storage policy, the historical storage policy needs to be changed to the currently adapted target storage policy, and the storage operation is performed on the target storage file according to the target storage policy. This ensures that the storage policy of the target storage file in the file storage system suits users' current access needs for the target storage file, and ensures high availability of the data of the target storage file in the file storage system.
It is understood that the storage operation may be adding copies of the stored file, reducing copies of the stored file, or changing the storage policy of the stored file. When the target storage policy of the target storage file is high copy and the historical storage policy is normal copy, the storage operation performed on the target storage file is to add copies. When the target storage policy is normal copy and the historical storage policy is high copy, the storage operation is to reduce copies. When the target storage policy is normal copy or high copy and the historical storage policy is object storage, the storage operation is to change from the object storage policy to normal copy or high copy. When the target storage policy is the object storage policy and the historical storage policy is high copy or normal copy, the storage operation is to change from high copy or normal copy to the object storage policy.
For example, when the target storage policy of target storage file A is high copy and the historical storage policy is normal copy, the corresponding storage operation is to add copies.
Step 209, when the comparison result is that the target storage policy is consistent with the historical storage policy, continuing to perform the storage operation on the target storage file according to the historical storage policy.
When the comparison result shows that the target storage policy is consistent with the historical storage policy, the historical storage policy of the target storage file still suits current user needs. No extra computing resources need to be allocated to update the historical storage policy, and the storage operation can continue to be performed on the target storage file according to the historical storage policy.
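Steps 207 to 209, together with the operation types described for step 208, can be sketched as follows. The policy labels, the operation labels, and the in-memory `policy_store` dictionary are assumptions for illustration; the source does not say how the historical storage policy is persisted.

```python
COPY_POLICIES = {"normal_copy", "high_copy"}


def storage_operation(target, historical):
    """Choose the storage operation from the target and historical policies."""
    if target == historical:
        return "keep"  # step 209: continue under the historical policy
    if target in COPY_POLICIES and historical in COPY_POLICIES:
        # Both are copy policies, so only the number of copies changes.
        return "add_copies" if target == "high_copy" else "reduce_copies"
    # One side is object storage: switch between object and copy storage.
    return "change_policy"


def reconcile(policy_store, path, target):
    """Compare policies, update the stored policy if needed, return the operation.

    policy_store: dict mapping file path -> historical policy (hypothetical).
    """
    operation = storage_operation(target, policy_store[path])
    if operation != "keep":
        policy_store[path] = target  # step 208: historical policy is updated
    return operation
```

For target storage file A, with a historical policy of normal copy and a target policy of high copy, `reconcile` returns `"add_copies"` and records high copy as the new stored policy.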
In this embodiment, whether the access attribute of the target storage file is a cold attribute or a hot attribute is determined according to how users access the target storage file. A target storage file with a cold attribute is stored using the low-cost object storage policy, which greatly reduces storage cost. A target storage file with a hot attribute is stored using a copy storage policy such as Block replication storage, and the number of stored copies is increased for hot-attribute files with high access heat. This improves the data access performance and access efficiency of the target storage file, shortens data query time, optimizes storage space more effectively, ensures high availability of data, and improves the user experience.
In the data storage method in this embodiment, to avoid heavy consumption of data storage space in the file storage system, the access attribute of the target storage file in the file storage system is determined first, where the access attribute includes a cold attribute and a hot attribute. A target storage policy adapted to the access attribute is then generated for the target storage file. Because the target storage policy is generated after the access attribute is determined to be a cold attribute or a hot attribute, the target storage policy changes when the access attribute changes. After the currently adapted target storage policy is formulated, the historical storage policy of the target storage file in the file storage system is queried, the target storage policy is compared with the historical storage policy, and a storage operation is performed on the target storage file according to the comparison result. Because the comparison result varies with the target storage policy, the storage operation performed on the target storage file varies accordingly. That is, for the same target storage file, the storage policy and storage operation adapted in the file storage system differ when its access attribute differs. This helps ensure high availability of the data of the target storage file at any time during system service, and reduces extra bandwidth consumption when accessing the target storage file while maintaining data read and write speed.
The foregoing is the second embodiment of the data storage method provided in the embodiments of the present application.
Referring to fig. 3, fig. 3 is a schematic structural diagram of an embodiment of a data storage device according to an embodiment of the present disclosure.
The data storage device in the embodiment includes:
a determining unit, configured to determine an access attribute of a target storage file in a file storage system, where the access attribute includes a cold attribute and a hot attribute;
the generating unit is used for generating a target storage strategy which is matched with the access attribute for the target storage file;
the query unit is used for querying a historical storage strategy of the target storage file in the file storage system;
and the storage unit is used for executing storage operation on the target storage file according to the comparison result of the target storage strategy and the historical storage strategy.
Optionally, the determining unit specifically includes:
the first determining subunit is used for acquiring a target storage file and a log file in the file storage system;
the filtering subunit is used for filtering out the access data corresponding to the target storage file from the log file;
and the second determining subunit is used for determining the access attribute of the target storage file according to the access data.
Optionally, the cold attributes include the pre-cooling attribute; the hot attributes include the potential hotspot attribute; the second determining subunit specifically includes:
the calculation subunit is used for calculating the access growth data according to the access data;
the acquisition subunit is used for acquiring reference access data and reference access growth data corresponding to the precooling attribute and the potential hotspot attribute respectively;
the first comparison subunit is configured to compare the access data with reference access data corresponding to the pre-cooling attribute and the potential hotspot attribute, respectively, to obtain a first comparison result;
the second comparison subunit is configured to compare the access growth data with reference access growth data corresponding to the precooling attribute and the potential hotspot attribute respectively to obtain a second comparison result;
and the first determining subunit is used for determining the access attribute of the target storage file to be a cold attribute or a hot attribute by combining the first comparison result and the second comparison result.
Optionally, determining, by combining the first comparison result and the second comparison result, that the access attribute of the target storage file is a cold attribute or a hot attribute, specifically including:
when the first comparison result is that the access data is larger than 0 and smaller than the reference access data corresponding to the precooling attribute, and the second comparison result is that the access growth data is smaller than the reference access growth data corresponding to the precooling attribute, determining the precooling attribute as the access attribute of the target storage file;
and when the first comparison result is that the access data is larger than 0 and smaller than the reference access data corresponding to the potential hotspot attribute, and the second comparison result is that the access growth data is larger than the reference access growth data corresponding to the potential hotspot attribute, determining the potential hotspot attribute as the access attribute of the target storage file.
Optionally, the access data comprises the number of visitors and the number of visits;
calculating access growth data according to the access data specifically comprises:
calculating the visitor growth rate in the access growth data based on the number of visitors;
calculating the visit growth rate in the access growth data based on the number of visits.
Optionally, the cold attributes include the freezing attribute; the hot attributes include the current hotspot attribute, the unfreezing attribute, and the local hotspot attribute; the second determining subunit specifically includes:
the second determining subunit is used for determining the freezing attribute as the access attribute of the target storage file when the access data of the last N days is 0, wherein N is a natural number;
the third determining subunit is configured to determine, when the access data is greater than the reference access data corresponding to the current hotspot attribute, the current hotspot attribute as an access attribute of the target storage file;
the fourth determining subunit is used for determining the unfreezing attribute as the access attribute of the target storage file when the access data of the last N days is not 0 and the access data of the preceding N days is 0;
and the fifth determining subunit is used for determining the local hotspot attribute as the access attribute of the target storage file when the number of visitors in the access data is greater than 0 and less than the reference number of visitors corresponding to the local hotspot attribute, and the number of visits in the access data is greater than the reference number of visits corresponding to the local hotspot attribute.
Specifically, the generating unit includes:
the first generation subunit is used for determining the object storage policy as the target storage policy of the target storage file when the access attribute is a cold attribute;
and the second generation subunit is used for determining the copy storage policy as the target storage policy of the target storage file when the access attribute is a hot attribute.
Optionally, the copy storage policy includes: high copy and normal copy;
the second generating subunit specifically includes:
the first generation subunit is used for determining high copy as the target storage policy of the target storage file when the access attribute is the current hotspot attribute or the potential hotspot attribute among the hot attributes;
and the second generation subunit is used for determining normal copy as the target storage policy of the target storage file when the access attribute is the unfreezing attribute or the local hotspot attribute among the hot attributes.
Optionally, the storage unit specifically includes:
the comparison subunit is used for comparing the target storage strategy with the historical storage strategy to obtain a comparison result;
the first storage subunit is used for updating the historical storage strategy into a target storage strategy and executing storage operation on the target storage file according to the target storage strategy when the comparison result shows that the target storage strategy is inconsistent with the historical storage strategy;
and the second storage subunit is used for continuing to execute storage operation on the target storage file according to the historical storage strategy when the comparison result shows that the target storage strategy is consistent with the historical storage strategy.
It is understood that fig. 4 is a block diagram of a specific implementation of a data storage device in this embodiment.
The data storage device includes a management server, a controller, a log acquisition module, a log storage module, a log analysis module, an analysis result module, a strategy generation module, and a strategy execution module. The workflow of the device is described as follows:
1. The user connects to the HDFS management server (namely the NameNode) using a client and initiates an access request for the target storage file.
2. The NameNode responds to the user request and simultaneously saves the access behavior of the user in a log file.
3. The log acquisition module extracts the log file in real time, writes the log file into the log storage module, and reports the write result to the controller after the log file is successfully written.
4. After receiving the log write result, the controller instructs the log analysis module to execute a log analysis task. The log analysis module obtains the log file from the log storage module, filters out the users' access data for the target storage file, counts the number of visitors and the number of visits, writes the statistical result into the analysis result module, and reports the write result to the controller.
5. After receiving the write result for the analysis result, the controller instructs the strategy generation module to execute a strategy generation task. The strategy generation module obtains the analysis results of the last N days from the analysis result module, analyzes them along four dimensions (the number of visitors, the number of visits, the visitor growth rate and the visit growth rate), determines the access attribute and the target storage strategy of the target storage file, and reports the generated target storage strategy to the controller.
6. After receiving the target storage strategy, the controller instructs the strategy execution module to execute the strategy at a preset time point. The strategy execution module obtains the target storage strategy and the historical storage strategy from the strategy generation module and compares them to obtain a comparison result.
7. The storage operation is executed through the NameNode, and the change result is reported to the controller after the change is completed.
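The log analysis in step 4 can be illustrated with a minimal Python sketch. It assumes a simplified log record made of whitespace-separated key=value tokens; the field names `ugi`, `cmd` and `src`, the function name `count_access` and the sample records are hypothetical choices for illustration and are not specified by this embodiment.

```python
def count_access(log_lines, target_path):
    """Return (number_of_visitors, number_of_visits) for target_path.

    Assumption: each log line consists of whitespace-separated key=value
    tokens, with ugi = user, cmd = operation, src = file path.
    """
    visitors = set()
    visits = 0
    for line in log_lines:
        # Tokens without "=" are ignored; split only on the first "=".
        fields = dict(tok.split("=", 1) for tok in line.split() if "=" in tok)
        # Filter: keep only read accesses to the target storage file.
        if fields.get("cmd") == "open" and fields.get("src") == target_path:
            visitors.add(fields.get("ugi"))
            visits += 1
    return len(visitors), visits


# Hypothetical sample records, for illustration only.
sample_log = [
    "ugi=alice cmd=open src=/data/a.csv",
    "ugi=bob cmd=open src=/data/a.csv",
    "ugi=alice cmd=open src=/data/a.csv",
    "ugi=alice cmd=open src=/data/b.csv",
    "ugi=carol cmd=delete src=/data/a.csv",
]
print(count_access(sample_log, "/data/a.csv"))  # (2, 3)
```

In this sketch the number of visitors is the count of distinct users and the number of visits is the count of matching records; which operations count as an access is an assumption.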
In the data storage device of this embodiment, to avoid heavy consumption of data storage space in the file storage system, the access attribute of the target storage file is determined first, where the access attribute includes a cold attribute and a hot attribute. A target storage strategy adapted to the access attribute is then generated for the target storage file; because the strategy is generated only after the access attribute has been determined to be a cold attribute or a hot attribute, the target storage strategy changes whenever the access attribute changes. After the currently adapted target storage strategy is formulated, the historical storage strategy of the target storage file in the file storage system is queried and compared with the target storage strategy, and a storage operation is executed on the target storage file according to the comparison result. Since the comparison result varies with the target storage strategy, the storage operation executed on the target storage file varies as well. In other words, for the same target storage file, different access attributes lead to different storage strategies and storage operations in the file storage system. This helps ensure high availability of the data of the target storage file at any time during system service, and reduces the extra bandwidth consumed when accessing the target storage file while maintaining the data read/write speed.
An embodiment of the present application further provides a data storage device, which includes a processor and a memory: the memory is used for storing program code and transmitting the program code to the processor; the processor is used for executing the data storage method of the above embodiments according to instructions in the program code.
An embodiment of the present application further provides a computer-readable storage medium for storing program code, the program code being used for executing the data storage method of the foregoing embodiments.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The terms "first," "second," "third," "fourth," and the like in the description of the application and the above-described figures, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that, in the present application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate: only A exists, only B exists, or both A and B exist, where A and B may be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "At least one of the following" or a similar expression refers to any combination of the listed items, including any combination of single or plural items. For example, at least one of a, b, or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c may each be single or plural.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (12)

1. A method of storing data, comprising:
determining an access attribute of a target storage file in a file storage system, wherein the access attribute comprises: a cold attribute and a hot attribute;
generating a target storage strategy which is matched with the access attribute for the target storage file;
querying a historical storage strategy of the target storage file in the file storage system;
and executing storage operation on the target storage file according to the comparison result of the target storage strategy and the historical storage strategy.
2. The data storage method of claim 1, wherein the determining the access attribute of the target storage file in the file storage system specifically comprises:
acquiring a target storage file and a log file in a file storage system;
filtering out access data corresponding to the target storage file from the log file;
and determining the access attribute of the target storage file according to the access data.
3. The data storage method of claim 2, wherein the cold attribute comprises: a precooling attribute; and the hot attribute comprises: a potential hotspot attribute;
the determining, according to the access data, an access attribute of the target storage file specifically includes:
calculating access growth data according to the access data;
acquiring reference access data and reference access growth data corresponding to the precooling attribute and the potential hotspot attribute respectively;
comparing the access data with reference access data corresponding to the precooling attribute and the potential hotspot attribute respectively to obtain a first comparison result;
comparing the access growth data with reference access growth data corresponding to the precooling attribute and the potential hotspot attribute respectively to obtain a second comparison result;
and determining that the access attribute of the target storage file is a cold attribute or a hot attribute by combining the first comparison result and the second comparison result.
4. The data storage method according to claim 3, wherein the determining, by combining the first comparison result and the second comparison result, that the access attribute of the target storage file is a cold attribute or a hot attribute specifically includes:
when the first comparison result is that the access data is larger than 0 and smaller than reference access data corresponding to a precooling attribute, and the second comparison result is that the access growth data is smaller than reference access growth data corresponding to a precooling attribute, determining the precooling attribute as the access attribute of the target storage file;
and when the first comparison result is that the access data is larger than 0 and smaller than the reference access data corresponding to the potential hotspot attribute, and the second comparison result is that the access growth data is larger than the reference access growth data corresponding to the potential hotspot attribute, determining the potential hotspot attribute as the access attribute of the target storage file.
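As an illustration of the comparison described in claims 3 and 4, the following Python sketch treats the access data and the access growth data as single numbers. The function name `classify_trend`, the parameter names, the attribute labels and the order in which the two conditions are checked are assumptions made for the example; the claims do not prescribe them.

```python
from typing import Optional


def classify_trend(
    access: float,
    growth: float,
    precool_ref: float,
    precool_growth_ref: float,
    hotspot_ref: float,
    hotspot_growth_ref: float,
) -> Optional[str]:
    """Return "precooling", "potential_hotspot", or None if neither applies."""
    # First and second comparison results for the precooling attribute:
    # some access, but below the reference, and growth below the reference.
    if 0 < access < precool_ref and growth < precool_growth_ref:
        return "precooling"
    # First and second comparison results for the potential hotspot attribute:
    # access still below the reference, but growth above the reference.
    if 0 < access < hotspot_ref and growth > hotspot_growth_ref:
        return "potential_hotspot"
    return None


# Hypothetical reference values, for illustration only.
print(classify_trend(5, -0.5, 10, -0.2, 50, 0.5))   # precooling
print(classify_trend(20, 0.8, 10, -0.2, 50, 0.5))   # potential_hotspot
```

Checking the precooling condition first is an arbitrary choice in this sketch; with suitable reference values the two conditions do not overlap.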
5. The data storage method of claim 3, wherein the access data comprises: a number of visitors and a number of visits;
the calculating access growth data according to the access data specifically includes:
calculating a visitor growth rate in the access growth data based on the number of visitors;
and calculating a visit growth rate in the access growth data based on the number of visits.
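The growth rates named in claim 5 can be sketched as follows. The claim does not give a formula, so this example assumes a simple period-over-period relative change, (current - previous) / previous; the function name `growth_rate` and the handling of a zero baseline are likewise assumptions.

```python
from typing import Optional


def growth_rate(current: int, previous: int) -> Optional[float]:
    """Relative change from the previous period to the current period.

    Assumption: growth is measured as (current - previous) / previous.
    Returns None when the previous value is 0, because the ratio is undefined.
    """
    if previous == 0:
        return None
    return (current - previous) / previous


# Hypothetical counts for two consecutive periods.
visitor_growth = growth_rate(30, 20)  # number of visitors: 20 -> 30
visit_growth = growth_rate(12, 16)    # number of visits:   16 -> 12
print(visitor_growth, visit_growth)   # 0.5 -0.25
```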
6. The data storage method of claim 2, wherein the cold attribute comprises: a freezing attribute; and the hot attribute comprises: a current hotspot attribute, a thawing attribute and a local hotspot attribute;
the determining, according to the access data, an access attribute of the target storage file specifically includes:
when the access data of the last N days is 0, determining the freezing attribute as the access attribute of the target storage file, wherein N is a natural number;
when the access data is larger than the reference access data corresponding to the current hotspot attribute, determining the current hotspot attribute as the access attribute of the target storage file;
when the access data of the last N days is not 0 and the access data of the N days before that is 0, determining the thawing attribute as the access attribute of the target storage file;
and when the number of visitors in the access data is larger than 0 and smaller than the reference number of visitors corresponding to the local hotspot attribute, and the number of visits in the access data is larger than the reference number of visits corresponding to the local hotspot attribute, determining the local hotspot attribute as the access attribute of the target storage file.
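The four conditions of claim 6 can be illustrated with a short Python sketch. It assumes that "access data" means the pair (number of visitors, number of visits) aggregated over an N-day window, that "larger than the reference access data" means both values exceed their references, and that the conditions are evaluated in the order listed in the claim. The names `WindowStats` and `classify_level` and the attribute labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WindowStats:
    """Aggregated access data for one N-day window (assumed structure)."""
    visitors: int
    visits: int


def classify_level(
    last: WindowStats,       # the last N days
    previous: WindowStats,   # the N days before that
    hot_visitors_ref: int,
    hot_visits_ref: int,
    local_visitors_ref: int,
    local_visits_ref: int,
) -> Optional[str]:
    # No access at all in the last N days.
    if last.visitors == 0 and last.visits == 0:
        return "freezing"
    # Access data above the current hotspot references (both values, by assumption).
    if last.visitors > hot_visitors_ref and last.visits > hot_visits_ref:
        return "current_hotspot"
    # Accessed in the last N days but not in the N days before that.
    if previous.visitors == 0 and previous.visits == 0:
        return "thawing"
    # Few visitors but many visits.
    if 0 < last.visitors < local_visitors_ref and last.visits > local_visits_ref:
        return "local_hotspot"
    return None


# Hypothetical reference values, for illustration only.
refs = dict(hot_visitors_ref=100, hot_visits_ref=1000,
            local_visitors_ref=5, local_visits_ref=500)
print(classify_level(WindowStats(2, 600), WindowStats(5, 50), **refs))  # local_hotspot
```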
7. The data storage method according to any one of claims 1 to 6, wherein the generating a target storage policy adapted to the access attribute for the target storage file specifically includes:
when the access attribute is a cold attribute, determining an object storage policy as a target storage policy of the target storage file;
and when the access attribute is a hot attribute, determining a copy storage policy as a target storage policy of the target storage file.
8. The data storage method of claim 7, wherein the copy storage policy comprises: high copy and normal copy;
when the access attribute is a hot attribute, determining a copy storage policy as a target storage policy of the target storage file, specifically including:
when the access attribute is a current hotspot attribute or a potential hotspot attribute among the hot attributes, determining the high copy as the target storage policy of the target storage file;
and when the access attribute is a thawing attribute or a local hotspot attribute among the hot attributes, determining the normal copy as the target storage policy of the target storage file.
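The mapping from access attribute to storage policy in claims 7 and 8 amounts to a small lookup, sketched below in Python. The string labels for attributes and policies and the function name `target_policy` are hypothetical; only the grouping of attributes follows the claims.

```python
# Attribute groups as described in claims 3, 6, 7 and 8 (labels are illustrative).
COLD_ATTRIBUTES = {"precooling", "freezing"}
HIGH_COPY_ATTRIBUTES = {"current_hotspot", "potential_hotspot"}
NORMAL_COPY_ATTRIBUTES = {"thawing", "local_hotspot"}


def target_policy(access_attribute: str) -> str:
    """Map an access attribute to a target storage policy label."""
    if access_attribute in COLD_ATTRIBUTES:
        return "object_storage"   # cold attribute -> object storage policy
    if access_attribute in HIGH_COPY_ATTRIBUTES:
        return "high_copy"        # copy storage policy, high copy
    if access_attribute in NORMAL_COPY_ATTRIBUTES:
        return "normal_copy"      # copy storage policy, normal copy
    raise ValueError(f"unknown access attribute: {access_attribute!r}")


print(target_policy("freezing"))           # object_storage
print(target_policy("potential_hotspot"))  # high_copy
print(target_policy("local_hotspot"))      # normal_copy
```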
9. The data storage method according to any one of claims 1 to 6, wherein the performing a storage operation on the target storage file according to the comparison result between the target storage policy and the historical storage policy specifically includes:
comparing the target storage strategy with the historical storage strategy to obtain a comparison result;
when the comparison result is that the target storage strategy is inconsistent with the historical storage strategy, updating the historical storage strategy into the target storage strategy, and executing storage operation on the target storage file according to the target storage strategy;
and when the comparison result shows that the target storage strategy is consistent with the historical storage strategy, continuing to execute storage operation on the target storage file according to the historical storage strategy.
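The comparison step of claim 9 can be sketched as follows. This is a minimal illustration that assumes the historical storage policies are kept in an in-memory dictionary keyed by file path and that the actual storage operation is supplied as a callback; `apply_policy`, `history` and `execute` are hypothetical names.

```python
from typing import Callable, Dict


def apply_policy(
    path: str,
    target: str,
    history: Dict[str, str],
    execute: Callable[[str, str], None],
) -> bool:
    """Compare target and historical policy; return True if the policy changed."""
    if history.get(path) != target:
        # Inconsistent: update the historical policy and act on the target policy.
        history[path] = target
        execute(path, target)
        return True
    # Consistent: storage continues under the historical policy, nothing to change.
    return False


# Hypothetical usage: record the operations instead of touching a real file system.
operations = []
history = {"/data/a.csv": "normal_copy"}
changed = apply_policy("/data/a.csv", "high_copy", history,
                       lambda p, t: operations.append((p, t)))
print(changed, history["/data/a.csv"], operations)
```

Running the same call a second time returns False and triggers no further operation, which corresponds to the consistent case.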
10. A data storage device, comprising:
a determining unit, configured to determine an access attribute of a target storage file in a file storage system, wherein the access attribute comprises: a cold attribute and a hot attribute;
a generating unit, configured to generate, for the target storage file, a target storage strategy adapted to the access attribute;
a query unit, configured to query a historical storage strategy of the target storage file in the file storage system;
and a storage unit, configured to execute a storage operation on the target storage file according to the comparison result of the target storage strategy and the historical storage strategy.
11. A data storage device, wherein the storage device comprises a processor and a memory;
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform the data storage method of any one of claims 1 to 9 according to instructions in the program code.
12. A computer-readable storage medium for storing program code for performing the data storage method of any one of claims 1 to 9.
CN202210589221.1A 2022-05-27 2022-05-27 Data storage method, device, equipment and computer readable storage medium Pending CN114860663A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210589221.1A CN114860663A (en) 2022-05-27 2022-05-27 Data storage method, device, equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN114860663A true CN114860663A (en) 2022-08-05

Family

ID=82641903

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210589221.1A Pending CN114860663A (en) 2022-05-27 2022-05-27 Data storage method, device, equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN114860663A (en)

Similar Documents

Publication Publication Date Title
US9355112B1 (en) Optimizing compression based on data activity
CN105205014B (en) A kind of date storage method and device
CN102667772B (en) File level hierarchical storage management system, method, and apparatus
US8521986B2 (en) Allocating storage memory based on future file size or use estimates
US20090307329A1 (en) Adaptive file placement in a distributed file system
CN107888687B (en) Proxy client storage acceleration method and system based on distributed storage system
JP2019204473A (en) Method for writing plurality of small files of 2 mb or smaller to hdfs having data merge module and hbase cash module on the basis of hadoop
CN111159176A (en) Method and system for storing and reading mass stream data
CN103812934B (en) Remote sensing data publishing method based on cloud storage system
US7895247B2 (en) Tracking space usage in a database
Herodotou AutoCache: Employing machine learning to automate caching in distributed file systems
US10789234B2 (en) Method and apparatus for storing data
CN107506466A (en) A kind of small documents storage method and system
CN108304142A (en) A kind of data managing method and device
CN103841168B (en) Data trnascription update method and meta data server
CN116339643B (en) Formatting method, formatting device, formatting equipment and formatting medium for disk array
US20110093688A1 (en) Configuration management apparatus, configuration management program, and configuration management method
CN111913913A (en) Access request processing method and device
CN109189696B (en) SSD (solid State disk) caching system and caching method
CN114625695A (en) Data processing method and device
CN114860663A (en) Data storage method, device, equipment and computer readable storage medium
CN105610921A (en) Erasure code filing method based on data cache in cluster
CN115858510A (en) Method for evaluating data temperature and performing dynamic storage management and storage medium
CN116820323A (en) Data storage method, device, electronic equipment and computer readable storage medium
CN113835613B (en) File reading method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination