CN116185968A - File storage method, system, electronic equipment, storage medium and product - Google Patents

Publication number
CN116185968A
Authority
CN
China
Prior art keywords
file
storage
files
requested
storage device
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310134395.3A
Other languages
Chinese (zh)
Inventor
王丹枫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN202310134395.3A
Publication of CN116185968A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/17: Details of further file system functions
    • G06F16/172: Caching, prefetching or hoarding of files
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/11: File system administration, e.g. details of archiving or snapshots
    • G06F16/119: Details of migration of file systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/18: File system types
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a file storage method, a system, electronic equipment, a storage medium and a product, and relates to the technical field of data storage, wherein the method comprises the following steps: acquiring file request conditions in a current time period and file recommendation information in a next time period; determining a plurality of files to be requested in the next time period and respective file request probabilities according to file request conditions and file recommendation information; determining respective file storage strategies of the plurality of files to be requested according to respective file request probabilities of the plurality of files to be requested; before the next time period comes, storing the files to be requested according to respective file storage strategies of the files to be requested. According to the method, the file is stored in the more suitable storage equipment, so that the file can be read from the storage equipment with high access speed as much as possible, the file access is smoother, the file access experience of a user is ensured, and the service capability of the storage equipment is improved.

Description

File storage method, system, electronic equipment, storage medium and product
Technical Field
The present invention relates to the field of data storage technologies, and in particular, to a file storage method, a system, an electronic device, a storage medium, and a product.
Background
With the development of the internet, the file resources that need to be stored are becoming more and more abundant. In particular, as streaming media (such as short video) accounts for a growing share of the internet, the size of a single streaming media file (such as a high-definition video file) is also growing. To keep access to these ever-growing resource files smooth enough for internet users, the throughput of each storage device (such as a mechanical hard disk, a solid state disk, and memory) must keep improving, as must the response speed and service capacity of the edge cache devices closest to the users. In addition, while the service capacity of each storage device is improved, the cost of storage resources must be kept under control.
Therefore, it is necessary to develop a file storage method, system, electronic device, storage medium and product, so that the overall service capacity of each storage device can be improved and smooth file access of users can be ensured on the premise of keeping the existing storage architecture and capacity.
Disclosure of Invention
In view of the foregoing, embodiments of the present invention provide a file storage method, system, electronic device, storage medium, and article to overcome or at least partially solve the foregoing problems.
In a first aspect of an embodiment of the present invention, a method for storing a file is provided, including:
acquiring file request conditions in a current time period and file recommendation information in a next time period;
determining a plurality of files to be requested in the next time period and the file request probability of each of the plurality of files to be requested according to the file request condition and the file recommendation information;
determining respective file storage strategies of the plurality of files to be requested according to respective file request probabilities of the plurality of files to be requested;
and before the next time period comes, storing the files to be requested according to the file storage strategies of the files to be requested.
Optionally, determining the file storage policy of each of the plurality of files to be requested according to the file request probability of each of the plurality of files to be requested, including:
obtaining storage device information, wherein the storage device information comprises category information of available storage devices and storage condition information of each storage device, and the storage condition information at least comprises: the storage capacity of the storage device, the stored file information, the size of the remaining storage space and the file reading speed;
determining the storage device matched with each of the plurality of files to be requested according to the file request probability of each of the plurality of files to be requested and the storage device information; wherein, with the storage devices ordered by file reading speed from high to low, for any two storage devices, the file request probability of a file to be requested that is matched with the storage device having the higher file reading speed is higher than the file request probability of a file to be requested that is matched with the storage device having the lower file reading speed.
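The matching rule above can be sketched as a greedy assignment. This is only an illustration under stated assumptions, not the method of the embodiment: the `Device` class, the `match_files_to_devices` function, and the use of file sizes and free space in bytes are hypothetical, and ties in probability make the ordering non-strict.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    read_speed: float  # higher means faster (arbitrary units, assumed)
    free_space: int    # remaining storage space (assumed unit: bytes)


def match_files_to_devices(files, devices):
    """files maps a file name to (request probability, size).

    Returns a dict mapping each placed file name to a device name.
    """
    ordered_devices = sorted(devices, key=lambda d: d.read_speed, reverse=True)
    ordered_files = sorted(files.items(), key=lambda kv: kv[1][0], reverse=True)
    free = {d.name: d.free_space for d in ordered_devices}
    assignment = {}
    tier = 0
    for name, (_prob, size) in ordered_files:
        # Step down to slower devices until one has room. The tier index never
        # moves back up, so probabilities are non-increasing from fast to slow.
        while tier < len(ordered_devices) and free[ordered_devices[tier].name] < size:
            tier += 1
        if tier == len(ordered_devices):
            break  # nothing left with room; remaining files stay unassigned
        device_name = ordered_devices[tier].name
        assignment[name] = device_name
        free[device_name] -= size
    return assignment
```

Because the tier index only moves toward slower devices, a file matched with a faster device never has a lower request probability than a file matched with a slower one.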
Optionally, determining a plurality of files to be requested in the next time period and respective file request probabilities of the plurality of files to be requested according to the file request condition and the file recommendation information includes:
determining a plurality of files to be requested in the next time period according to the file request condition and the file recommendation information; the file request condition indicates the number of times that the file is requested in the current time period, and the file recommendation information indicates the file information pushed in the next time period;
determining semantic relations between a plurality of files to be requested in the next time period and a plurality of files requested in the current time period;
and determining the file request probability of each of the plurality of files to be requested according to the semantic relations.
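The text does not specify how the semantic relations are turned into probabilities. As one hypothetical stand-in, the sketch below assumes each file has an embedding vector and scores a candidate by its count-weighted cosine similarity to the files requested in the current time period. The function names, the embedding inputs, and the scoring formula are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def request_probabilities(candidates, requested, counts):
    """candidates / requested: file name -> embedding vector (assumed inputs).
    counts: file name -> number of requests in the current time period;
    every key of counts must also be a key of requested.

    Returns file name -> score in [0, 1], used here as a request probability.
    """
    total = sum(counts.values())
    probs = {}
    for name, vec in candidates.items():
        if total == 0:
            probs[name] = 0.0
            continue
        # Negative similarities are clipped to 0 so the score stays in [0, 1].
        score = sum(counts[r] * max(0.0, cosine(vec, requested[r])) for r in counts)
        probs[name] = score / total
    return probs
```

A trained model, as described later in the detailed description, would replace this heuristic; the sketch only shows the shape of the inputs and outputs.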
Optionally, storing the plurality of files to be requested according to a file storage policy of each of the plurality of files to be requested, including:
and before the next time period comes, storing the files to be requested according to the file storage strategies of the files to be requested.
Optionally, storing the plurality of files to be requested according to respective file storage policies of the plurality of files to be requested when the plurality of files to be requested are local files stored in any one of the plurality of storage devices, including:
according to the file storage strategy of each of the plurality of files to be requested, controlling the plurality of storage devices to perform local file migration according to the following steps:
controlling the plurality of storage devices to migrate stored files to be requested whose file request probability is between a first threshold and a second threshold to a first storage device, wherein the first storage device is the storage device with the slowest file reading speed among the plurality of storage devices, and the first threshold is higher than the second threshold;
controlling the plurality of storage devices to delete files to be requested whose file request probability is lower than the second threshold;
and according to the respective residual storage space sizes of the storage devices except the first storage device, migrating at least part of files to be requested in the first storage device to the other storage devices.
Optionally, each of the other storage devices includes at least a second storage device and a third storage device, and a file reading speed of the third storage device is higher than a file reading speed of the second storage device; according to the respective residual storage space sizes of the storage devices except the first storage device, migrating at least part of files to be requested in the first storage device to each other storage device, including:
controlling the first storage device to migrate, from among the stored files to be requested, the plurality of files to be requested with the highest request probability to the third storage device according to the size of the remaining storage space of the third storage device;
and after the file migration to the third storage device is completed, controlling the first storage device to migrate, from among the remaining stored files to be requested, the plurality of files to be requested with the highest request probability to the second storage device according to the size of the remaining storage space of the second storage device.
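The local migration steps above can be sketched as follows. This is a minimal illustration under several assumptions that the text does not state: the threshold boundaries are treated as inclusive, files above the first threshold are left where they are, the slowest device is assumed to have room for every demoted file, and promotion stops at the first file that does not fit. The function and parameter names are hypothetical.

```python
def migrate_local_files(placement, probs, sizes, free, speed_order, t1, t2):
    """placement: file -> device name; probs: file -> request probability;
    sizes: file -> size; free: device name -> remaining space;
    speed_order: device names ordered from slowest to fastest; t1 > t2.

    Mutates and returns (placement, free).
    """
    slowest = speed_order[0]
    for f in list(placement):
        p, dev = probs[f], placement[f]
        if p < t2:
            # Delete files whose probability is below the second threshold.
            free[dev] += sizes[f]
            del placement[f]
        elif p <= t1 and dev != slowest:
            # Move files between the two thresholds to the slowest device
            # (assumed to have enough room).
            free[dev] += sizes[f]
            free[slowest] -= sizes[f]
            placement[f] = slowest
    # Promote from the slowest device, filling the fastest target first.
    for target in reversed(speed_order[1:]):
        on_slowest = sorted(
            (f for f, d in placement.items() if d == slowest),
            key=lambda f: probs[f],
            reverse=True,
        )
        for f in on_slowest:
            if sizes[f] > free[target]:
                break  # assumed: stop at the first file that does not fit
            free[target] -= sizes[f]
            free[slowest] += sizes[f]
            placement[f] = target
    return placement, free
```

With three devices, `speed_order` would list the first, second, and third storage devices in that order, so the third (fastest) device is filled before the second.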
Optionally, when the plurality of files to be requested include a non-local file that is not stored in any of the plurality of storage devices, determining the respective file storage policies of the plurality of files to be requested according to their respective file request probabilities includes:
determining a file storage strategy of the non-local file according to the size relation between the file request probability of the non-local file and the first threshold value and/or the second threshold value;
further comprises:
and downloading and storing the non-local file according to the file storage strategy of the non-local file.
Optionally, acquiring the file request condition in the current time period includes:
acquiring file request condition data of a plurality of terminals in the current time period, and obtaining the file request condition in the current time period after statistics.
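The statistics step can be illustrated with a simple count aggregation. The input shape (one list of requested file identifiers per terminal) and the function name are assumptions made for this sketch.

```python
from collections import Counter


def aggregate_requests(terminal_logs):
    """terminal_logs: iterable of lists, one per terminal, each holding the
    file identifiers that terminal requested in the current time period.

    Returns a dict mapping file identifier -> total number of requests.
    """
    total = Counter()
    for log in terminal_logs:
        total.update(log)  # counts each occurrence in the list
    return dict(total)
```

For example, `aggregate_requests([["a", "b", "a"], ["b"]])` yields two requests each for `a` and `b`.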
A second aspect of the present embodiment provides a file storage system, where the file storage system includes a request data platform, an intelligent file distribution module, and a storage end;
the request data platform is used for acquiring the file request condition in the current time period sent by the storage end, and acquiring the file recommendation information for the next time period sent by a third-party file recommendation system; and sending the file request condition and the file recommendation information to the intelligent file distribution module;
the intelligent file distribution module is used for determining a plurality of files to be requested in the next time period and the file request probability of each of the files to be requested according to the file request condition and the file recommendation information; determining respective file storage strategies of the plurality of files to be requested according to respective file request probabilities of the plurality of files to be requested; and sending the file storage strategies to the storage end;
the storage end is configured to store the plurality of files to be requested according to respective file storage policies of the plurality of files to be requested before the next time period arrives.
A third aspect of the present embodiment provides a file storage device, including:
the acquisition module is used for acquiring file request conditions in the current time period and file recommendation information in the next time period;
the probability determining module is used for determining a plurality of files to be requested in the next time period and the respective file request probabilities of the plurality of files to be requested according to the file request condition and the file recommendation information;
the storage strategy determining module is used for determining the file storage strategy of each of the plurality of files to be requested according to the file request probability of each of the plurality of files to be requested;
And the storage module is used for storing the plurality of files to be requested according to the respective file storage strategies of the plurality of files to be requested before the next time period arrives.
Optionally, the storage policy determining module includes:
a first storage policy determining submodule, configured to obtain storage device information, where the storage device information includes category information of available storage devices, and storage condition information of each storage device, and the storage condition information includes at least: the storage capacity of the storage device, the stored file information, the size of the remaining storage space and the file reading speed;
the second storage strategy determining submodule is used for determining the storage device matched with each of the plurality of files to be requested according to the file request probability of each of the plurality of files to be requested and the storage device information; wherein, with the storage devices ordered by file reading speed from high to low, for any two storage devices, the file request probability of a file to be requested that is matched with the storage device having the higher file reading speed is higher than the file request probability of a file to be requested that is matched with the storage device having the lower file reading speed.
Optionally, the probability determining module includes:
the first probability determination submodule is used for determining a plurality of files to be requested in the next time period according to the file request condition and the file recommendation information; the file request condition indicates the number of times that the file is requested in the current time period, and the file recommendation information indicates the file information pushed in the next time period;
a second probability determination submodule, configured to determine semantic relationships between a plurality of files to be requested in the next time period and a plurality of files requested in the current time period;
and the third probability determination submodule is used for determining the file request probability of each of the plurality of files to be requested according to the semantic relation.
Optionally, the storage module includes:
and the first storage sub-module is used for storing the plurality of files to be requested according to the respective file storage strategies of the plurality of files to be requested before the next time period arrives.
Optionally, in the case that the plurality of files to be requested are local files stored in any one of the plurality of storage devices, the storage module further includes:
And the migration submodule is used for controlling the plurality of storage devices to carry out local file migration according to the file storage strategy of each of the plurality of files to be requested, and comprises the following steps:
the first migration unit is used for controlling the plurality of storage devices to migrate stored files to be requested whose file request probability is between a first threshold and a second threshold to the first storage device, wherein the first storage device is the storage device with the slowest file reading speed among the plurality of storage devices, and the first threshold is higher than the second threshold;
the second migration unit is used for controlling the plurality of storage devices to delete files to be requested whose file request probability is lower than the second threshold;
and the third migration unit is used for migrating at least part of files to be requested in the first storage device to each other storage device according to the respective residual storage space sizes of the storage devices except the first storage device.
Optionally, the other storage devices at least include a second storage device and a third storage device, where a file reading speed of the third storage device is higher than a file reading speed of the second storage device; the third migration unit includes:
The first migration subunit is used for controlling the first storage device to migrate a plurality of files to be requested with highest request probability to the third storage device in each stored file to be requested according to the size of the remaining storage space of the third storage device;
and the second migration subunit is used for controlling the first storage device to migrate, from among the remaining stored files to be requested, the plurality of files to be requested with the highest request probability to the second storage device according to the size of the remaining storage space of the second storage device.
Optionally, in a case that the plurality of files to be requested include a non-local file that is not stored in any one of the plurality of storage devices, the storage policy determining module includes:
the third storage strategy sub-module is used for determining the file storage strategy of the non-local file according to the size relation between the file request probability of the non-local file and the first threshold value and/or the second threshold value;
further comprises:
and the downloading sub-module is used for downloading and storing the non-local file according to the file storage strategy of the non-local file.
Optionally, the acquiring module includes:
And the statistics sub-module is used for acquiring file request condition data of a plurality of terminals in the current time period and obtaining the file request condition in the current time period after statistics.
The fourth aspect of the embodiment of the invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory, wherein the processor executes the computer program to realize the steps in the file storage method according to the first aspect of the embodiment of the invention.
The fifth aspect of the embodiment of the present invention further provides a computer readable storage medium, on which a computer program/instruction is stored, which when executed by a processor, implements the steps in the file storage method according to the first aspect of the embodiment of the present invention.
A sixth aspect of the embodiments of the present invention also provides a computer program product which, when run on an electronic device, causes a processor to carry out the steps of the file storage method according to the first aspect of the embodiments of the present invention.
The embodiment of the invention provides a file storage method, a system, electronic equipment, a storage medium and a product, wherein the method comprises the following steps: acquiring file request conditions in a current time period and file recommendation information in a next time period; determining a plurality of files to be requested in the next time period and respective file request probabilities of the plurality of files to be requested according to file request conditions and file recommendation information; determining respective file storage strategies of a plurality of files to be requested according to the file request probability; before the next time period comes, a plurality of files to be requested are stored according to the file storage strategy. According to the method and the device, different storage strategies are set by calculating the probability that the file in the next time period is requested, so that the file is stored in the more suitable storage equipment through the storage strategies, and therefore, the file is read from the storage equipment with high access speed as much as possible in the next time period, so that the file access is smoother, the file access experience of a user is ensured, and the service capability of the storage equipment is improved. In addition, according to the embodiment of the invention, the file is stored in the proper storage device according to the file request probability, so that the file access speed is improved, the original storage architecture and capacity are not required to be changed, and the method can be widely applied to large-scale clusters, cloud machine rooms or edge storage devices and is not limited by the original architecture.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart illustrating steps of a method for storing files according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a file storage policy according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a file distribution refreshing process according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a file storage system according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a file storage device according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings in the embodiments of the present invention. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
An embodiment of the present invention provides a file storage method. Referring to FIG. 1, which is a flowchart of the steps of the file storage method provided in the embodiment of the present invention, the method includes:
step S101, acquiring file request conditions in a current time period and file recommendation information in a next time period;
specifically, the file request condition indicates the number of times that the file is requested in the current time period, and specifically may further include the number of times that the file is respectively requested in various storage devices in the current time period, where the various storage devices at least include a memory, a solid state disk and a mechanical hard disk.
According to basic computer principles, the file reading speed of memory is more than 20 times that of a solid state disk and more than 1000 times that of a mechanical hard disk. Therefore, a file stored in memory is read fastest and accessed most smoothly. In practice, however, for cost reasons the capacity of memory is far smaller than that of a solid state disk, which in turn is far smaller than that of a mechanical hard disk. Because memory capacity is limited, not all files can be stored in memory simply for the sake of access speed. Moreover, the file throughput capability, or service capability, of each storage device depends on its local hit rate. When a file is accessed, the memory is queried first; if the file is not in memory, the solid state disk is searched; if it is not on the solid state disk either, the mechanical hard disk is searched, and so the search proceeds downward level by level.
The hit rate of a storage device is the ratio of the number of times a target file is found in the storage device to the number of times a search is performed in that device; each time a search determines that the target file is not in the storage device, the hit rate of that device decreases accordingly. In addition, the length of the current time period can be set according to the specific application. For example, it may be set to 1 minute or 5 minutes, or set as a fixed period so that data is acquired periodically.
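The level-by-level search and the hit rate defined above can be sketched together. The `Tier` class and `find_file` function are hypothetical names; real devices would of course not hold their contents in an in-memory set.

```python
class Tier:
    """One storage device in the lookup chain, with hit-rate counters."""

    def __init__(self, name, files):
        self.name = name
        self.files = set(files)
        self.lookups = 0  # number of searches performed in this device
        self.hits = 0     # number of searches that found the target file

    def hit_rate(self):
        return self.hits / self.lookups if self.lookups else 0.0


def find_file(file_id, tiers):
    """Search tiers in order (fastest first); return the name of the tier
    holding the file, or None if no tier has it."""
    for tier in tiers:
        tier.lookups += 1
        if file_id in tier.files:
            tier.hits += 1
            return tier.name
    return None
```

Every miss in a tier adds a lookup without a hit, which is why storing likely-requested files in the tiers searched first keeps their hit rates high.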
The file recommendation information is sent by a third-party platform or system and mainly comprises information on the files to be pushed to the device or the user in the next time period, including the file name, type, file size, file identification, and the like. In particular, the third-party system may be a video pushing system, and the file recommendation information may list the video files that the video pushing system will push to the device in the next time period. The next time period may be set according to the actual application, for example to 1 minute or 5 minutes, and may have the same length as the current time period described above. In one embodiment, the time period may be set as a fixed period, so that the current time period is the current period and the next time period is the next period, and the file request condition and the file recommendation information are acquired periodically.
Step S102, determining a plurality of files to be requested in the next time period and respective file request probabilities of the plurality of files to be requested according to the file request condition and the file recommendation information;
specifically, according to the file request situation, files which are frequently read in a certain historical time period can be known, and according to file recommendation information, files which may be needed in a next time period can be known. After the file request condition and the file recommendation information are acquired, a plurality of files to be requested in the next time period and the file request probability of each file to be requested can be determined by analyzing and calculating the data. The file request probability indicates a probability that each file is requested in the next time period, that is, a probability that the user will read the file in the next time period. If the probability of requesting the file is high, it means that the file is likely to be read in the next time period.
In this embodiment, one or more of machine learning, word vector, and deep learning algorithms may be used to analyze the file request condition and the file recommendation information as input data and produce an output result, namely the file request probability in the next time period. Specifically, taking a request probability prediction neural network model as an example, the model is trained on a labeled sample data set to obtain a trained neural network model. The file request condition and the file recommendation information are then fed to the trained model as input data, which yields a file request probability for each file. Illustratively, the output result may be expressed as: {file 1: 0.55, file 2: 0.7, ..., file N: 0.33}. In this embodiment, a training sample consists of the file request condition in a certain time period and the file recommendation information for that time period, and the sample is labeled with the number of file requests in the adjacent next time period; the training sample data set obtained in this way is used to train the neural network model. It should be noted that the time periods of all training samples in the training data set must have the same duration, and the training sample data must be device data from the same geographic area, so as to ensure the stability of the training result and the reliability of the neural network model.
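The construction of labeled training samples described above can be sketched as a pairing of adjacent periods. The dictionary keys `request_counts` and `recommended` and the function name are assumptions; the sketch covers only sample construction, not the model or its training.

```python
def build_training_samples(periods):
    """periods: chronological list of equal-length periods, each a dict with
    'request_counts' (file -> number of requests in that period) and
    'recommended' (list of files in that period's recommendation information).

    Returns a list of (features, label) pairs, where the label is the
    request-count dict of the adjacent next period.
    """
    samples = []
    for current, following in zip(periods, periods[1:]):
        features = {
            "request_counts": current["request_counts"],
            "recommended": current["recommended"],
        }
        label = following["request_counts"]
        samples.append((features, label))
    return samples
```

The last period has no following period and therefore contributes only a label, so N periods yield N - 1 samples.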
Step S103, determining respective file storage strategies of the plurality of files to be requested according to respective file request probabilities of the plurality of files to be requested;
After the respective file request probabilities of the plurality of files to be requested are obtained, each file to be requested can be matched with a suitable storage device according to its request probability, thereby determining the file storage strategy. The file storage strategy indicates the storage device matched with the file and the storage operation to perform, such as migrating, downloading, or deleting the file.
Illustratively, storage policy 1 is "move file a from storage device a to storage device B for storage", storage policy 2 is "download file B directly to storage device a", and storage policy 3 is "delete file C from storage device B".
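One possible in-memory representation of such storage policies is sketched below. The class names, field names and the `describe` helper are hypothetical and are introduced only to make the three example policies concrete.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    MIGRATE = "migrate"
    DOWNLOAD = "download"
    DELETE = "delete"


@dataclass(frozen=True)
class StoragePolicy:
    """One policy entry: which file, which operation, which devices."""
    file_id: str
    action: Action
    source_device: Optional[str] = None  # unused for DOWNLOAD
    target_device: Optional[str] = None  # unused for DELETE

    def describe(self) -> str:
        if self.action is Action.MIGRATE:
            return f"move {self.file_id} from {self.source_device} to {self.target_device}"
        if self.action is Action.DOWNLOAD:
            return f"download {self.file_id} to {self.target_device}"
        return f"delete {self.file_id} from {self.source_device}"


# The three example policies from the text above.
policies = [
    StoragePolicy("file A", Action.MIGRATE, source_device="storage device A",
                  target_device="storage device B"),
    StoragePolicy("file B", Action.DOWNLOAD, target_device="storage device A"),
    StoragePolicy("file C", Action.DELETE, source_device="storage device B"),
]
```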
Referring to fig. 2, which shows a schematic diagram of a file storage policy, the storage devices are specifically divided into a memory, a solid state disk and a mechanical hard disk. Files to be requested with a high request probability are stored in the memory and the solid state disk, and files to be requested with a low request probability are stored in the mechanical hard disk. When a file is accessed, the target file is searched for in the memory first, so that the file reading speed and the hit rate of the storage devices are ensured.
Because different storage devices have different file reading speeds, the speed at which a file can be accessed depends on the storage device in which it is stored. This embodiment therefore generates the storage strategy according to the file request probability, so that files are allocated to suitable storage devices: files with a high request probability are matched with storage devices with a high file reading speed, and files with a low request probability are matched with storage devices with a low file reading speed. In this way, most of the files that are likely to be accessed are stored in the storage devices with a high file reading speed, which improves the overall file reading speed.
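The benefit of matching request probability to reading speed can be illustrated with a small probability-weighted cost calculation. The relative read costs below (1, 10 and 100 units) are invented for illustration and do not come from this embodiment; the probabilities reuse the example output given earlier.

```python
from typing import Dict


def expected_read_cost(
    placement: Dict[str, str],
    probabilities: Dict[str, float],
    read_cost: Dict[str, float],
) -> float:
    """Sum of (request probability x read cost of the device holding the file)."""
    return sum(probabilities[f] * read_cost[device] for f, device in placement.items())


probabilities = {"File 1": 0.55, "File 2": 0.7, "File N": 0.33}
# Hypothetical relative read costs; lower means faster.
read_cost = {"memory": 1.0, "solid state disk": 10.0, "mechanical hard disk": 100.0}

# Highest probability on the fastest device.
matched = {"File 2": "memory", "File 1": "solid state disk", "File N": "mechanical hard disk"}
# Same devices, opposite ordering.
reversed_order = {"File 2": "mechanical hard disk", "File 1": "solid state disk", "File N": "memory"}

matched_cost = expected_read_cost(matched, probabilities, read_cost)
reversed_cost = expected_read_cost(reversed_order, probabilities, read_cost)
```

Under these assumed costs, the matched placement has the lower probability-weighted read cost, which is the effect the storage strategy aims for.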
Step S104, before the next time period comes, storing the files to be requested according to the respective file storage policies of the files to be requested.
After receiving the corresponding file storage policy, each storage device can store the file according to the policy. Specifically, the operations of file migration, downloading, deleting and the like can be performed.
In this embodiment, the storage policy of each file is determined by predicting the file request probability of the next time period, so that each storage device can store files according to the corresponding storage policy before the next time period arrives, thereby refreshing the file distribution of the whole storage system ahead of the next time period and achieving the effect of file caching. The length of the time period is set according to the practical application and may be, for example, 1 minute or 5 minutes. When the time periods are of fixed length, the process can be expressed as follows: the file request probabilities for the (n+1)-th period are calculated during the n-th period, and the storage devices complete file storage according to the corresponding file storage strategies before the (n+1)-th period arrives; the file request probabilities for the (n+2)-th period are then calculated during the (n+1)-th period, and the storage devices complete file storage according to the corresponding file storage strategies before the (n+2)-th period arrives. The steps are thus executed once in each time period to refresh the file distribution in the storage devices, so that the file distribution is continuously updated and flexibly follows actual conditions.
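The per-period refresh cycle can be sketched as a simple loop. The function names are hypothetical, and `share_of_requests` is a deliberately trivial stand-in for the prediction step, not the model of this embodiment.

```python
from typing import Callable, Dict, List, Tuple


def run_refresh_cycles(
    stats_per_period: List[Dict[str, int]],
    predict: Callable[[Dict[str, int]], Dict[str, float]],
    apply_policy: Callable[[int, Dict[str, float]], None],
) -> None:
    """For each period n, predict for period n + 1 and apply before it starts."""
    for n, stats in enumerate(stats_per_period, start=1):
        probabilities = predict(stats)      # computed during period n
        apply_policy(n + 1, probabilities)  # must finish before period n + 1 begins


def share_of_requests(stats: Dict[str, int]) -> Dict[str, float]:
    """Toy predictor: each file's share of the requests seen in the period."""
    total = sum(stats.values()) or 1
    return {file_id: count / total for file_id, count in stats.items()}


applied: List[Tuple[int, Dict[str, float]]] = []
run_refresh_cycles(
    [{"file A": 3, "file B": 1}, {"file B": 4}],
    share_of_requests,
    lambda period, probs: applied.append((period, probs)),
)
```

In a real deployment the loop would be driven by a timer of the chosen period length rather than by a pre-collected list.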
Referring to fig. 3, which shows a schematic flow chart of file distribution refreshing, the heterogeneous storage and cache device set includes individual storage devices (such as device A, device B and device C). First, the device set reports file request and hit data to the big data collection system; the big data collection system integrates the collected information and, once statistics are complete, sends the result to the file intelligent distribution system, which completes the part of step S101 in the above embodiment that obtains the file request condition in the current time period. As shown in fig. 3, the video recommendation system, acting as a third-party system, sends the recommended video set of the next period to the file intelligent distribution system, which completes the part of step S101 that obtains the file recommendation information for the next time period. Then, the file intelligent distribution system performs file distribution prediction according to the received information and generates a file list containing the file distribution of the next period, that is, the file storage policies (corresponding to steps S102 and S103 in the above embodiment). Finally, the file intelligent distribution system issues the file list (the file distribution of the next period) to the storage and cache device set, specifically to device A, device B and device C, so that each device performs operations such as downloading, migrating and deleting files according to the file distribution and completes file distribution (corresponding to step S104 in the above embodiment). The file distribution refreshing process shown in fig. 3 (i.e., the file storage method provided in this embodiment) refreshes the file distribution of the storage device set or system: the file request probability of each file in the next time period is obtained from the collected information through analysis and calculation, a file list (the file storage policies) is generated according to these request probabilities, and each file is stored in a suitable storage device, so that file access is smoother and the user's file access experience is ensured.
In addition, for different storage devices (such as the heterogeneous storage and cache devices shown in fig. 3), a network connection channel can be established between each storage device and the file intelligent distribution system. Specifically, the network connection is established as an HTTP short connection or a TCP long connection. When step S101 is performed, the file request condition of each storage device in the current time period may be acquired through the network connection channel established in advance. Correspondingly, when step S104 in the foregoing embodiment is performed (corresponding to the file intelligent distribution system in fig. 3 sending the file list to the storage and cache device set), the file storage policy may be sent to the corresponding storage device through the network connection channel, so that the storage device can complete file storage according to the file storage policy. Transmitting the required information over the pre-established network connection channel increases the data transmission speed and further improves the file storage efficiency.
Currently, a general-purpose server operating system (such as Linux) often performs local storage refreshing and replacement during network file transmission according to a traditional algorithm such as the least recently used (LRU) algorithm or the least frequently used (LFU) algorithm. Such an algorithm caches files by moving files that have been accessed recently or frequently from a storage device with a lower access speed (such as a mechanical hard disk) to a storage device with a higher access speed (such as a solid state disk or a memory). By way of example, if historical access data shows that video A was viewed 10 times, more than other files, video A is stored in the memory, which has the faster file reading speed; if video B was viewed only once, fewer times than other files, video B is cached in the solid state disk, which has a lower file reading speed. However, this file storage method matches files to storage devices only by their historical access frequency, so that previously accessed files tend to be stored in the memory, and it does not consider which files the user is more likely to read next. As a result, the files that are actually needed may be stored in the solid state disk or the mechanical hard disk with slower reading speeds, which lowers the user's file access speed and affects the experience.
According to the file request condition of the current time period and the file recommendation information of the next time period, the file request probability of each of the multiple files to be requested in the next time period is obtained through analysis and calculation, so that different storage strategies can be set according to the request probability of each file, and the multiple files to be requested are respectively stored in the proper storage equipment. For example, the file with high request probability can be stored in the storage device with high file reading speed, and the file with low request probability can be stored in the storage device with low file reading speed, so that the file can be read from the storage device with high access speed as much as possible, the file access is smoother, and the file access experience of a user is ensured. In addition, according to the file request probability of the file in the next time period, the embodiment of the invention stores the file in the corresponding storage device respectively to improve the file access speed without changing the original storage architecture and capacity, so that the method can be widely applied to large-scale clusters, cloud machine rooms or edge storage devices and is not limited by the original storage architecture.
In one embodiment, step S103, determining, according to the file request probabilities of the files to be requested, the file storage policies of the files to be requested, includes:
Step S103-1, obtaining storage device information, wherein the storage device information comprises the type information of available storage devices and the storage condition information of each storage device, and the storage condition information at least comprises: the storage capacity of the storage device, the stored file information, the size of the remaining storage space and the file reading speed;
In practical applications, storage device information needs to be acquired when a file storage policy is generated. The storage device information includes the type information of the available storage devices, namely the types of storage devices owned by the terminal, such as a memory, a solid state disk and a mechanical hard disk, as well as the storage condition information of each storage device. The storage condition information specifically includes the storage capacity of the storage device, the stored file information, the size of the remaining storage space, the file reading speed and the like.
Step S103-2, determining storage devices matched with the files to be requested according to the file request probability and the storage device information of the files to be requested; according to the order of the file reading speeds of the storage devices from high to low, the file request probability of the file to be requested, which is matched with the storage device with higher file reading speed, in any two storage devices is higher than the file request probability of the file to be requested, which is matched with the storage device with lower file reading speed.
In this embodiment, after determining the file request probability of each file in the next time period, each file to be requested may be matched to a corresponding storage device according to the file request probability of each file to be requested and the storage device information, so as to determine a storage policy, notify the corresponding storage device of the predicted access file content in the next time period in advance, so that the storage device may store the file according to the storage policy, update the file distribution in the storage device, and promote the hit rate of the corresponding storage device.
With the storage devices ordered by file reading speed from high to low, for any two storage devices, the file request probability of a file to be requested that is matched with the storage device with the higher file reading speed is higher than the file request probability of a file to be requested that is matched with the storage device with the lower file reading speed. For example, the file reading speed of the memory is higher than that of the solid state disk, and the file reading speed of the solid state disk is higher than that of the mechanical hard disk. Accordingly, when the file storage strategy is generated, the file request probability of a file to be requested stored in the memory is higher than that of a file to be requested stored in the solid state disk, and the file request probability of a file to be requested stored in the solid state disk is higher than that of a file to be requested stored in the mechanical hard disk.
In one embodiment, the files with the highest file request probability may be stored in the memory, the files with a medium file request probability may be stored in the solid state disk, and the files with the lowest file request probability may be stored in the mechanical hard disk. When a file is read, the target file is searched for in the memory first; because the files with the highest request probability are stored in the memory, the hit rate of the memory is ensured and its service capacity and file throughput are improved. The hit rates of the solid state disk and the mechanical hard disk are likewise ensured, so that the overall service capacity is improved along with the file reading speed. The file throughput, or service capacity, of an individual storage device depends on the local hit rate of that storage device.
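A minimal sketch of one way to satisfy this ordering constraint under capacity limits is given below: files are sorted by request probability and devices by reading speed, and the fastest device with room is filled first without ever returning to a faster device. The function name, the tuple layout for devices and the unit file sizes are assumptions of the sketch.

```python
from typing import Dict, List, Tuple

# (device name, file reading speed, remaining storage space)
Device = Tuple[str, float, int]


def match_files_to_devices(
    probabilities: Dict[str, float],
    sizes: Dict[str, int],
    devices: List[Device],
) -> Dict[str, str]:
    """Greedy matching: higher probability never lands on a slower device
    than a lower-probability file does."""
    ordered_devices = sorted(devices, key=lambda d: d[1], reverse=True)
    free = [d[2] for d in ordered_devices]
    ordered_files = sorted(probabilities, key=lambda f: probabilities[f], reverse=True)

    placement: Dict[str, str] = {}
    index = 0  # current device; only ever moves towards slower devices
    for file_id in ordered_files:
        while index < len(ordered_devices) and free[index] < sizes[file_id]:
            index += 1
        if index == len(ordered_devices):
            break  # no remaining device can hold this file
        free[index] -= sizes[file_id]
        placement[file_id] = ordered_devices[index][0]
    return placement


placement = match_files_to_devices(
    {"a": 0.9, "b": 0.7, "c": 0.5, "d": 0.2},
    {"a": 1, "b": 1, "c": 1, "d": 1},
    [("mechanical hard disk", 1.0, 10), ("memory", 100.0, 1), ("solid state disk", 10.0, 2)],
)
```

Never moving back to a faster device can leave a little space unused there, which is the price this sketch pays for keeping the ordering strict.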
In one embodiment, step S102 determines a plurality of files to be requested in the next time period and respective file request probabilities of the plurality of files to be requested according to the file request condition and the file recommendation information, including:
step S102-1, determining a plurality of files to be requested in the next time period according to the file request condition and the file recommendation information; the file request condition indicates the number of times that the file is requested in the current time period, and the file recommendation information indicates the file information pushed in the next time period.
Step S102-2, determining semantic relations between a plurality of files to be requested in the next time period and a plurality of files requested in the current time period.
Step S102-3, determining the file request probability of each of the plurality of files to be requested according to the semantic relation.
In this embodiment, one or more of machine learning, word vector and deep learning algorithms may be used to perform analysis and calculation with the file request condition and the file recommendation information as input data. Specifically, the plurality of files to be requested that may be requested in the next time period are determined first, and the semantic relations between files are then acquired, including the semantic relations between the plurality of files to be requested and the plurality of files requested in the current time period, so that the respective file request probabilities of the plurality of files to be requested are obtained through analysis and calculation according to the semantic relations. The semantic relation between files represents the logical association between the files and is predetermined according to the content of the files. For streaming media files in particular, there are obvious semantic relations between files: for example, after a certain video is accessed, another video is recommended to the user, or, for an episodic video, the next episode is pushed. For example, if the files requested in the current time period include file A, which is episode 10 of a video, and file B to be requested is episode 11 of the same video, the semantic relation between file B and file A is determined to be that of consecutive episodes, and analysis of this semantic relation indicates that the file request probability of file B is relatively high. This embodiment can calculate the file request probability of each file to be requested by taking the obtained semantic relations between files, the file request condition and the file recommendation information as input data and applying one or more of machine learning, word vector and deep learning algorithms. In this embodiment, the algorithm employed is not limited.
In this embodiment, the semantic relations between the files to be requested and the files requested in the current time period are obtained, the correlation between the files is analyzed, and the files needed by the system in the next time period are predicted by combining these inputs, so that the determined file request probabilities are more accurate.
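Since the algorithm is not limited, the following is only a toy heuristic that shows how a consecutive-episode relation could raise a candidate's score. The scoring rule, the weights `relation_weight` and `recommendation_bonus`, and all names are assumptions of the sketch and stand in for whatever learned model is actually used.

```python
from typing import Dict, Iterable, Set


def score_candidates(
    candidates: Iterable[str],
    current_requests: Dict[str, int],
    previous_episode: Dict[str, str],
    recommended: Set[str],
    relation_weight: float = 0.5,       # hypothetical weight
    recommendation_bonus: float = 0.2,  # hypothetical bonus
) -> Dict[str, float]:
    """Combine current demand, an episode relation and the recommendation list."""
    total = sum(current_requests.values()) or 1
    scores: Dict[str, float] = {}
    for file_id in candidates:
        # Base: the file's own share of requests in the current period.
        score = current_requests.get(file_id, 0) / total
        # Semantic relation: demand for the preceding episode carries over.
        predecessor = previous_episode.get(file_id)
        if predecessor is not None:
            score += relation_weight * current_requests.get(predecessor, 0) / total
        # Recommendation information for the next period.
        if file_id in recommended:
            score += recommendation_bonus
        scores[file_id] = min(score, 1.0)
    return scores


scores = score_candidates(
    candidates=["video episode 11", "other video", "unrelated video"],
    current_requests={"video episode 10": 8, "other video": 2},
    previous_episode={"video episode 11": "video episode 10"},
    recommended={"video episode 11"},
)
```

Here episode 11 has never been requested, yet it scores highest because episode 10 is popular in the current period and episode 11 is on the recommendation list.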
In one embodiment, in the case that the plurality of files to be requested are all local files stored in any one of the plurality of storage devices, step S104 stores the plurality of files to be requested according to respective file storage policies of the plurality of files to be requested, including:
according to the file storage strategy of each of the plurality of files to be requested, controlling the plurality of storage devices to perform local file migration according to the following steps:
step S104-1, controlling the plurality of storage devices, and transferring the file to be requested, of which the stored file request probability is between a first threshold value and a second threshold value, to a first storage device, wherein the first storage device is the storage device with the slowest file reading speed in the plurality of storage devices, and the first threshold value is higher than the second threshold value;
step S104-2, controlling the plurality of storage devices, and deleting the files to be requested, of which the file request probability is lower than the second threshold value;
and step S104-3, migrating at least part of files to be requested in the first storage device to each other storage device according to the respective residual storage space sizes of the other storage devices except the first storage device.
In this embodiment, when a file to be requested is already stored in one of the storage devices, that is, when the file to be requested is a local file already stored in the system, the file is migrated, kept in its original storage device or deleted according to the storage policy once the policy has been determined. Specifically, files whose file request probability lies between the first threshold and the second threshold may be migrated to the first storage device; that is, files with a relatively low file request probability are migrated from the other storage devices to the storage device with the slowest file reading speed. On the one hand, this keeps the request probability of the files remaining in the other storage devices high and improves the hit rate of those storage devices; on the other hand, moving files with a low request probability out of the other storage devices saves their storage space. The first threshold and the second threshold can be set manually and are not limited here. For example, files with file request probabilities in the range of 10% to 20% may be moved from the memory and the solid state disk to the mechanical hard disk. In this embodiment, files to be requested whose probability is lower than the second threshold are deleted from every storage device, so that files with a very low file request probability are removed outright and the storage space of the storage devices is further saved.
Then, according to the size of the remaining storage space of each storage device other than the first storage device, at least part of the files to be requested that are stored in the first storage device are migrated to the other storage devices, thereby migrating files with a high file request probability from the storage device with a low reading speed to the storage devices with a high reading speed.
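Steps S104-1 and S104-2 can be sketched as follows; the promotion of step S104-3 is left out of this sketch. The function name and data layout are hypothetical, the thresholds reuse the 10% and 20% example above, and treating a probability exactly equal to the second threshold as "between the thresholds" is an assumption.

```python
from typing import Dict, Set


def demote_and_delete(
    contents: Dict[str, Set[str]],
    probabilities: Dict[str, float],
    slowest_device: str,
    first_threshold: float,
    second_threshold: float,
) -> Dict[str, Set[str]]:
    """Return a new device -> files mapping after steps S104-1 and S104-2."""
    result = {device: set(files) for device, files in contents.items()}
    for device, files in result.items():
        for file_id in sorted(files):  # sorted() copies, so the set can be mutated
            probability = probabilities[file_id]
            if probability < second_threshold:
                files.discard(file_id)               # S104-2: delete
            elif probability < first_threshold and device != slowest_device:
                files.discard(file_id)               # S104-1: demote
                result[slowest_device].add(file_id)
    return result


after = demote_and_delete(
    contents={
        "memory": {"a", "b", "c"},
        "solid state disk": {"d"},
        "mechanical hard disk": {"e", "f"},
    },
    probabilities={"a": 0.8, "b": 0.15, "c": 0.05, "d": 0.12, "e": 0.5, "f": 0.01},
    slowest_device="mechanical hard disk",
    first_threshold=0.2,
    second_threshold=0.1,
)
```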
In one embodiment, the other storage devices include at least a second storage device and a third storage device, the third storage device having a file read speed that is higher than the file read speed of the second storage device; step S104-3, according to the respective remaining storage space sizes of the storage devices except the first storage device, migration of at least part of the files to be requested in the first storage device to the other storage devices includes:
step S104-3a, controlling the first storage device to migrate a plurality of files to be requested with highest request probability to the third storage device in each stored file to be requested according to the size of the remaining storage space of the third storage device;
And step S104-3b, after the file migration to the third storage device is completed, controlling the first storage device to migrate, from among its remaining stored files to be requested, the plurality of files to be requested with the highest request probability to the second storage device according to the size of the remaining storage space of the second storage device.
In this embodiment, the storage devices differ in file reading speed and capacity. When storing files, on the one hand the capacity of each storage device needs to be considered, since all the files cannot simply be stored in the storage device with the highest file reading speed; on the other hand the file request probability of each file needs to be considered, since the files with the highest request probability should be stored in the storage device with the highest file reading speed as far as possible, so as to ensure smooth access for the user. In this embodiment, the third storage device, which has the faster file reading speed, is considered first, and the files with the highest file request probabilities in the first storage device are migrated to the third storage device. By way of example, when the third storage device can store 10 more files, the 10 files with the highest request probability in the first storage device are migrated to the third storage device. Then, the size of the remaining storage space of the second storage device, which has the next highest file reading speed, is determined, and the files with the highest file request probability among the files to be requested that remain in the first storage device are migrated to the second storage device. By way of example, if the second storage device can store 10 more files, the 10 files with the highest file request probability among those remaining in the first storage device are migrated to the second storage device. In practical applications, the third storage device, the second storage device and the first storage device may correspond to the memory, the solid state disk and the mechanical hard disk, respectively.
Therefore, files with high file request probability in the mechanical hard disk can be migrated to the memory and the solid state disk, so that respective hit rates of the memory and the solid state disk are ensured, and the overall service capacity is improved.
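Steps S104-3a and S104-3b can be sketched as a two-stage selection. Following the "10 files" example above, remaining space is counted in whole files, which is a simplification; the function name and argument layout are hypothetical.

```python
from typing import Dict, List, Tuple


def promote_from_first_device(
    first_device_files: List[str],
    probabilities: Dict[str, float],
    free_slots: List[Tuple[str, int]],
) -> Tuple[Dict[str, List[str]], List[str]]:
    """free_slots lists (device, number of files it can still hold),
    ordered from the fastest device to the slowest.

    Returns the files to move to each device and the files left behind.
    """
    remaining = sorted(first_device_files, key=lambda f: probabilities[f], reverse=True)
    moves: Dict[str, List[str]] = {}
    for device, slots in free_slots:
        moves[device] = remaining[:slots]  # highest remaining probabilities first
        remaining = remaining[slots:]
    return moves, remaining


moves, left_behind = promote_from_first_device(
    first_device_files=["f3", "f1", "f5", "f2", "f4"],
    probabilities={"f1": 0.9, "f2": 0.8, "f3": 0.7, "f4": 0.6, "f5": 0.5},
    free_slots=[("memory", 2), ("solid state disk", 2)],  # third device, then second
)
```

Processing the fastest device first is what makes the second device receive the next-highest probabilities rather than the same files again.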
In one embodiment, a higher third request probability threshold value can be set, and files with the file request probability higher than the third request probability threshold value in the first storage device are migrated to the memory and the solid state disk, so that the high-probability files are ensured to be stored in the storage device with high reading speed in a concentrated mode, and the user is ensured to access the files smoothly enough.
In one embodiment, in a case where the plurality of files to be requested include a non-local file that is not stored in any one of the plurality of storage devices, step S103, determining, according to the respective file request probabilities of the plurality of files to be requested, a respective file storage policy of the plurality of files to be requested includes:
step S103-3, determining a file storage strategy of the non-local file according to the size relation between the file request probability of the non-local file and the first threshold value and/or the second threshold value;
further comprises:
step S103-4, downloading and storing the non-local file according to the file storage strategy of the non-local file.
In this embodiment, if the files to be requested include a non-local file, that is, a file that is not stored in any storage device, a file storage policy is generated according to the file request probability of the non-local file. Specifically, if the file request probability of the non-local file is lower than the second threshold, the file is not stored. If the file request probability of the non-local file is lower than the first threshold but not lower than the second threshold, the file is matched to the first storage device, and the first storage device, which has the slowest file reading speed, downloads and stores the file. If the file request probability of the non-local file is higher than the third request probability threshold, the file is matched to the storage device with the highest file reading speed, and that storage device downloads and stores the file. The non-local file is thus matched to a suitable storage device according to its file request probability and the relation between that probability and the first threshold and/or the second threshold; a file with a higher request probability is matched to a storage device with a high file reading speed, so that the required files are stored in fast storage devices as far as possible, which ensures the file reading speed and improves the hit rate of the storage devices.
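The threshold comparison for non-local files can be sketched as below. The returned strings and the threshold values in the usage lines are illustrative; the text does not state what happens between the first threshold and the third request probability threshold, so the sketch reports that case as undecided rather than inventing a rule.

```python
def non_local_file_policy(
    probability: float,
    first_threshold: float,
    second_threshold: float,
    third_threshold: float,
) -> str:
    """Decide how to handle a file that is not stored on any local device."""
    if probability < second_threshold:
        return "do not store"
    if probability < first_threshold:
        return "download to slowest device"
    if probability > third_threshold:
        return "download to fastest device"
    # Not specified in the text; left to e.g. capacity-based matching.
    return "undecided"


# Hypothetical thresholds: first = 0.2, second = 0.1, third = 0.8.
decisions = [non_local_file_policy(p, 0.2, 0.1, 0.8) for p in (0.05, 0.15, 0.5, 0.9)]
```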
In one embodiment, step S101, obtaining a file request case in the current time period includes:
step S101-1, file request condition data of a plurality of terminals in the current time period are obtained, and the file request condition in the current time period is obtained after statistics.
The file request condition indicates the number of times each file is requested in the current time period, and may further include the number of times the file is requested in each kind of storage device in the current time period, where the storage devices at least include a memory, a solid state disk and a mechanical hard disk. In this embodiment, the obtained file request condition may be that of one terminal or one user, or may be the sum of the file request conditions of a plurality of terminals. Specifically, file request condition data of a plurality of terminals are acquired; the data include the number of times each file in the terminal is requested in the current time period and the number of times the file is requested in each storage device, for example: in the current time period, file A is requested 5 times in the memory. After the file request condition data of the plurality of terminals are obtained, they are statistically processed to obtain the file request condition, which includes the total number of times each file is requested and the number of times the file is requested in the different storage devices. For example: {File A: {memory hit: 1 time, solid state disk hit: 5 times, mechanical hard disk hit: 10 times, not in the local machine: 10 times}}. The term "not in the local machine" means that the file is a non-local file that is not stored in any storage device and needs to be downloaded when requested. According to the file storage method of this embodiment, file request condition data of a plurality of terminals can be obtained, so that how files are requested in the current time period can be analyzed more accurately from multi-party data.
Therefore, the file request probability can be calculated and determined by acquiring the data of the plurality of terminals and utilizing the data, so that the obtained file request probability is more accurate, and the storage strategy is determined for the target terminal according to the file request probability, so that each storage device of the target terminal can store the file according to the corresponding storage strategy.
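The statistical step of S101-1 can be sketched as a per-file sum of hit counters across terminals. The report layout and the key names (`memory`, `solid_state_disk`, `mechanical_hard_disk`, `not_local`) are assumptions of the sketch; the two reports are chosen so that the totals match the File A example above.

```python
from collections import Counter, defaultdict
from typing import Dict, List

# One terminal's report: file -> {where the request was served: count}
Report = Dict[str, Dict[str, int]]


def aggregate_reports(reports: List[Report]) -> Dict[str, Dict[str, int]]:
    """Sum the per-device hit counts of every file over all terminals."""
    totals: Dict[str, Counter] = defaultdict(Counter)
    for report in reports:
        for file_id, hits in report.items():
            totals[file_id].update(hits)  # Counter.update adds counts
    return {file_id: dict(counter) for file_id, counter in totals.items()}


request_condition = aggregate_reports([
    {"File A": {"memory": 1, "solid_state_disk": 2}},
    {"File A": {"solid_state_disk": 3, "mechanical_hard_disk": 10, "not_local": 10}},
])
total_requests_for_a = sum(request_condition["File A"].values())
```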
In one embodiment, data of different device types are processed separately. Different device types, such as a large-scale cluster server, a cloud machine room or an individual server, have different storage capacities, so file storage cannot be performed according to the same standard. For example, the file request condition of a cluster server is obtained, and the file request probability of each file to be requested is determined according to this file request condition and the file recommendation information; a storage strategy is then determined according to the file request probabilities and the storage condition of the cluster server, so that each storage device in the cluster server can store files according to the storage strategy. Since the storage capacity and file reading capability of a cluster server differ from those of an individual server, the two are analyzed separately, the criteria for determining the storage policy differ to some extent, and the settings of the first threshold, the second threshold and the third request probability threshold also differ. For example, the storage capacity of a cluster server may be higher than that of an individual server, so the third request probability threshold of the cluster server may be set slightly lower than that of the individual server, allowing more files to be requested to be stored in the storage device with the fastest file reading speed.
In addition to processing data of different device types separately, data of different geographical areas also need to be processed separately. For example, the file request condition data of devices in the Beijing area and the file request condition data of devices in the Nanjing area are processed separately, the file request probabilities are determined for each area, and corresponding storage strategies are generated. This is because, for example, if an epidemic occurs in Beijing, the number of requests in the Beijing area for files carrying the tag "epidemic" increases accordingly. The file request probabilities determined from the file request conditions of different geographical areas therefore differ, which leads to different final file storage policies. By distinguishing between device types and between geographical areas, this embodiment determines the file request probabilities more accurately and obtains a more reasonable file storage strategy, which improves the file storage effect and the file access speed.
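Separate processing by device type and geographical area amounts to partitioning the reported data before any probability is computed. The record fields `region` and `device_type` are hypothetical names chosen for this sketch.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Record = Dict[str, object]


def partition_reports(records: List[Record]) -> Dict[Tuple[str, str], List[Record]]:
    """Group request records by (region, device type) so that each group
    can be analyzed with its own model and thresholds."""
    groups: Dict[Tuple[str, str], List[Record]] = defaultdict(list)
    for record in records:
        key = (str(record["region"]), str(record["device_type"]))
        groups[key].append(record)
    return dict(groups)


groups = partition_reports([
    {"region": "Beijing", "device_type": "cluster server", "file": "File A", "requests": 12},
    {"region": "Nanjing", "device_type": "cluster server", "file": "File A", "requests": 3},
    {"region": "Beijing", "device_type": "individual server", "file": "File B", "requests": 1},
    {"region": "Beijing", "device_type": "cluster server", "file": "File B", "requests": 4},
])
```

Each group would then go through the prediction and policy steps independently, with its own first, second and third thresholds.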
The present embodiment also provides a file storage system. Referring to fig. 4, which shows a schematic structural diagram of the file storage system, the system includes: a data request platform, a file intelligent distribution module and a storage end;
The data request platform is used for acquiring the file request condition in the current time period sent by the storage end, and acquiring the file recommendation information for the next time period sent by a third-party file recommendation system; and for sending the file request condition and the file recommendation information to the file intelligent distribution module;
the file intelligent distribution module is used for determining a plurality of files to be requested in the next time period and the file request probability of each of the files to be requested according to the file request condition and the file recommendation information; determining respective file storage strategies of the plurality of files to be requested according to respective file request probabilities of the plurality of files to be requested; sending the file storage strategy to the storage end;
the storage end is configured to store the plurality of files to be requested according to respective file storage policies of the plurality of files to be requested before the next time period arrives.
In this embodiment, the data request platform may obtain file request conditions from different storage ends, such as a large-scale cluster server, a cloud server, or a separate server. For each storage end, a network connection channel between the storage end and the file intelligent distribution module can be established, specifically as an HTTP short connection or a TCP long connection. After the storage strategy is generated, the file intelligent distribution module sends it to the storage end through the network connection channel, so that the storage end can store files according to the storage strategy and refresh the distribution of stored files in advance, thereby improving the overall service capability.
The embodiment also provides a file storage device. Referring to fig. 5, which shows a schematic structural diagram of the file storage device, the device includes:
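The source does not specify how the storage strategy is encoded on the network connection channel. Purely as an assumption, the sketch below serializes a strategy as a JSON payload that could be carried over either kind of connection; the field names `period_start` and `assignments` and both function names are hypothetical.

```python
import json


def encode_policy(period_start: str, assignments: dict) -> bytes:
    """Serialize a storage strategy for transmission to a storage end.

    assignments maps a file identifier to the name of its target storage device.
    """
    payload = {"period_start": period_start, "assignments": assignments}
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def decode_policy(raw: bytes) -> dict:
    """Parse a storage strategy received by a storage end."""
    return json.loads(raw.decode("utf-8"))
```

A storage end receiving such a payload before the next time period begins would have what it needs to rearrange its files in advance.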
the acquisition module is used for acquiring file request conditions in the current time period and file recommendation information in the next time period;
the probability determining module is used for determining a plurality of files to be requested in the next time period and the respective file request probabilities of the plurality of files to be requested according to the file request condition and the file recommendation information;
the storage strategy determining module is used for determining the file storage strategy of each of the plurality of files to be requested according to the file request probability of each of the plurality of files to be requested;
and the storage module is used for storing the plurality of files to be requested according to the respective file storage strategies of the plurality of files to be requested before the next time period arrives.
In one embodiment, the storage policy determination module includes:
a first storage policy determining submodule, configured to obtain storage device information, where the storage device information includes category information of available storage devices, and storage condition information of each storage device, and the storage condition information includes at least: the storage capacity of the storage device, the stored file information, the size of the remaining storage space and the file reading speed;
The second storage strategy determining submodule is used for determining the storage device matched with each of the plurality of files to be requested according to the file request probability of each of the plurality of files to be requested and the storage device information; wherein, for any two storage devices, the file request probability of the files to be requested matched with the storage device having the higher file reading speed is higher than the file request probability of the files to be requested matched with the storage device having the lower file reading speed.
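One way to satisfy the ordering constraint above is a greedy assignment: sort files by request probability, sort devices by file reading speed, and fill devices from fastest to slowest without ever moving back to a faster device. This is a minimal sketch under assumed data shapes (`Device`, and files given as `(file_id, probability, size)` tuples), not the matching procedure of the embodiment; files with equal probabilities may land on different devices, so the ordering holds in the non-strict sense.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    read_speed: float  # higher means faster reads
    free_space: int    # remaining storage space, in arbitrary size units


def match_files_to_devices(files, devices):
    """Assign files to devices so faster devices hold more probable files.

    files: list of (file_id, probability, size) tuples.
    Returns {file_id: device_name}; files that fit nowhere are left out.
    """
    placement = {}
    remaining = {d.name: d.free_space for d in devices}
    by_speed = sorted(devices, key=lambda d: d.read_speed, reverse=True)
    tier = 0  # index of the device currently being filled; never decreases
    for file_id, _prob, size in sorted(files, key=lambda f: f[1], reverse=True):
        # Step down to slower devices until the file fits.
        while tier < len(by_speed) and remaining[by_speed[tier].name] < size:
            tier += 1
        if tier == len(by_speed):
            break  # no device has room for this or any less probable file
        placement[file_id] = by_speed[tier].name
        remaining[by_speed[tier].name] -= size
    return placement
```

Because the tier index only moves toward slower devices, a less probable file is never placed on a faster device than a more probable one, at the cost of sometimes leaving a little space unused on faster devices.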
In one embodiment, the probability determination module includes:
the first probability determination submodule is used for determining a plurality of files to be requested in the next time period according to the file request condition and the file recommendation information; the file request condition indicates the number of times that the file is requested in the current time period, and the file recommendation information indicates the file information pushed in the next time period;
a second probability determination submodule, configured to determine semantic relationships between a plurality of files to be requested in the next time period and a plurality of files requested in the current time period;
And the third probability determination submodule is used for determining the file request probability of each of the plurality of files to be requested according to the semantic relation.
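This passage does not state how the semantic relation is turned into a probability. As one hypothetical instance, the sketch below treats each file as a set of tags, measures the semantic relation by Jaccard similarity, and weights each similarity by the share of requests the current file received. The function names, the tag-set representation and the weighting formula are all assumptions made for illustration.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two tag sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def request_probabilities(candidate_tags, current_tags, current_counts):
    """Estimate a request probability for each candidate file.

    candidate_tags: {file_id: set of tags} for files to be requested.
    current_tags:   {file_id: set of tags} for files requested in the current period.
    current_counts: {file_id: number of requests in the current period}.
    """
    total = sum(current_counts.values())
    if total == 0:
        return {fid: 0.0 for fid in candidate_tags}
    probs = {}
    for fid, tags in candidate_tags.items():
        # Each current file contributes its request share times its similarity.
        probs[fid] = sum(
            (count / total) * jaccard(tags, current_tags[cur])
            for cur, count in current_counts.items()
        )
    return probs
```

Under this assumed formula, a recommended file whose tags match those of a heavily requested current file receives a score close to that file's share of all requests.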
In one embodiment, the storage module includes:
and the first storage sub-module is used for storing the plurality of files to be requested according to the respective file storage strategies of the plurality of files to be requested before the next time period arrives.
Optionally, in the case that the plurality of files to be requested are local files stored in any one of the plurality of storage devices, the storage module further includes:
and the migration submodule is used for controlling the plurality of storage devices to carry out local file migration according to the file storage strategy of each of the plurality of files to be requested, and comprises the following steps:
the first migration unit is used for controlling the plurality of storage devices, migrating the files to be requested, of which the stored file request probability is between a first threshold value and a second threshold value, into the first storage device, wherein the first storage device is the storage device with the slowest file reading speed in the plurality of storage devices, and the first threshold value is higher than the second threshold value;
The second migration unit is used for controlling the plurality of storage devices and deleting files to be requested, wherein the file request probability of the files to be requested is lower than the second threshold value;
and the third migration unit is used for migrating at least part of files to be requested in the first storage device to each other storage device according to the respective residual storage space sizes of the storage devices except the first storage device.
In one embodiment, the other storage devices include at least a second storage device and a third storage device, the third storage device having a file read speed that is higher than the file read speed of the second storage device; the third migration unit includes:
the first migration subunit is used for controlling the first storage device to migrate, from among the files to be requested stored therein, the several files with the highest file request probability to the third storage device according to the size of the remaining storage space of the third storage device;
and the second migration subunit is used for controlling the first storage device to migrate, from among the remaining files to be requested stored therein, the several files with the highest file request probability to the second storage device according to the size of the remaining storage space of the second storage device.
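The migration units and subunits above can be read as a three-step plan: delete files below the second threshold, gather files between the two thresholds on the slowest (first) storage device, then promote the most probable files from the first storage device to the third and then the second storage device as space allows. The sketch below follows that reading. The device labels, the inclusive treatment of threshold boundaries, leaving files above the first threshold where they are, and measuring free space once up front are all simplifying assumptions.

```python
def plan_local_migration(files, first_threshold, second_threshold, free_space):
    """Plan deletions and moves for locally stored files.

    files: {file_id: (probability, size, device_name)} with device names
           "first" (slowest), "second", "third" (fastest).
    free_space: {"second": units, "third": units} of remaining space.
    Returns (deleted_ids, {file_id: final_device_name}).
    """
    deleted = sorted(fid for fid, (p, _, _) in files.items() if p < second_threshold)
    location, sizes, probs = {}, {}, {}
    for fid, (p, size, device) in files.items():
        if p < second_threshold:
            continue  # scheduled for deletion
        probs[fid], sizes[fid] = p, size
        # Files between the thresholds are gathered on the slowest device.
        location[fid] = "first" if p <= first_threshold else device
    on_first = sorted(
        (f for f, d in location.items() if d == "first"),
        key=lambda f: probs[f],
        reverse=True,
    )
    # Promote the most probable files, fastest target first.
    for target in ("third", "second"):
        room = free_space[target]
        while on_first and sizes[on_first[0]] <= room:
            fid = on_first.pop(0)
            location[fid] = target
            room -= sizes[fid]
    return deleted, location
```

Promotion to a target stops at the first file that does not fit, which keeps the more probable files on the faster device; the files that remain are then offered to the second storage device.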
In one embodiment, in a case where the plurality of files to be requested include a non-local file that is not stored in any of the plurality of storage devices, the storage policy determination module includes:
the third storage strategy sub-module is used for determining the file storage strategy of the non-local file according to the size relation between the file request probability of the non-local file and the first threshold value and/or the second threshold value;
further comprises:
and the downloading sub-module is used for downloading and storing the non-local file according to the file storage strategy of the non-local file.
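The passage only says that the strategy for a non-local file depends on how its file request probability compares with the first and/or second threshold. One hypothetical mapping, mirroring the handling of local files, is sketched below; the three outcome labels and the boundary handling are assumptions, not taken from the source.

```python
def non_local_policy(probability: float, first_threshold: float, second_threshold: float) -> str:
    """Choose a download decision for a file not yet stored on any device."""
    if probability < second_threshold:
        return "skip"                  # too unlikely to be requested; do not download
    if probability <= first_threshold:
        return "download_to_slowest"   # keep available, but on the slowest device
    return "download_to_faster"        # likely to be requested; place on a faster device
```

The downloading sub-module would then fetch the file only when the returned decision is not "skip".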
In one embodiment, the acquisition module includes:
and the statistics sub-module is used for acquiring file request condition data of a plurality of terminals in the current time period and obtaining the file request condition in the current time period after statistics.
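The statistics step can be sketched as summing per-terminal request counts into a single count per file. The input shape (one mapping of file identifier to request count per terminal) and the function name `aggregate_request_counts` are assumptions.

```python
from collections import Counter


def aggregate_request_counts(per_terminal_counts):
    """Sum per-terminal request counts into one Counter for the time period.

    per_terminal_counts: iterable of {file_id: request_count} mappings.
    """
    total = Counter()
    for counts in per_terminal_counts:
        total.update(counts)  # Counter.update adds counts rather than replacing them
    return total
```

The resulting counter is the file request condition for the current time period, that is, the number of times each file was requested.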
The embodiment of the invention also provides an electronic device. Referring to fig. 6, which is a schematic diagram of the electronic device according to the embodiment of the invention, the electronic device 100 includes a memory 110 and a processor 120 that are communicatively connected through a bus. The memory 110 stores a computer program that can run on the processor 120, thereby implementing the steps in the file storage method disclosed by the embodiment of the invention.
The embodiments of the present invention also provide a computer readable storage medium having stored thereon a computer program/instruction which, when executed by a processor, performs steps in a file storage method as disclosed in the embodiments of the present invention.
Embodiments of the present invention also provide a computer program product which, when run on an electronic device, causes a processor to perform the steps in a file storage method as disclosed in the embodiments of the present invention.
In this specification, the embodiments are described in a progressive manner, each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus, electronic devices, and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article or terminal device comprising the element.
The file storage method, system, electronic device, storage medium and product provided by the present invention have been described in detail above. Specific examples are used herein to illustrate the principles and embodiments of the present invention, and the above description of the embodiments is only intended to help understand the method and core idea of the present invention. Meanwhile, those skilled in the art may, in accordance with the ideas of the present invention, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the present invention.

Claims (11)

1. A method of storing a file, the method comprising:
acquiring file request conditions in a current time period and file recommendation information in a next time period;
determining a plurality of files to be requested in the next time period and the file request probability of each of the plurality of files to be requested according to the file request condition and the file recommendation information;
determining respective file storage strategies of the plurality of files to be requested according to respective file request probabilities of the plurality of files to be requested;
and before the next time period comes, storing the files to be requested according to the file storage strategies of the files to be requested.
2. The file storage method according to claim 1, wherein determining the file storage policy of each of the plurality of files to be requested according to the file request probability of each of the plurality of files to be requested, comprises:
obtaining storage device information, wherein the storage device information comprises category information of available storage devices and storage condition information of each storage device, and the storage condition information at least comprises: the storage capacity of the storage device, the stored file information, the size of the remaining storage space and the file reading speed;
determining the storage device matched with each of the plurality of files to be requested according to the file request probability of each of the plurality of files to be requested and the storage device information; wherein, for any two storage devices, the file request probability of the files to be requested matched with the storage device having the higher file reading speed is higher than the file request probability of the files to be requested matched with the storage device having the lower file reading speed.
3. The file storage method according to claim 1, wherein determining a plurality of files to be requested in the next period of time and respective file request probabilities of the plurality of files to be requested according to the file request condition and the file recommendation information includes:
determining a plurality of files to be requested in the next time period according to the file request condition and the file recommendation information; the file request condition indicates the number of times that the file is requested in the current time period, and the file recommendation information indicates the file information pushed in the next time period;
determining semantic relations between a plurality of files to be requested in the next time period and a plurality of files requested in the current time period;
And determining the file request probability of each of the plurality of files to be requested according to the semantic relation.
4. The file storage method according to claim 2, wherein, in the case where the plurality of files to be requested are local files stored in any one of the plurality of storage devices, storing the plurality of files to be requested according to respective file storage policies of the plurality of files to be requested includes:
according to the file storage strategy of each of the plurality of files to be requested, controlling the plurality of storage devices to perform local file migration according to the following steps:
controlling the plurality of storage devices, and transferring the file to be requested, of which the stored file request probability is between a first threshold value and a second threshold value, to a first storage device, wherein the first storage device is the storage device with the slowest file reading speed in the plurality of storage devices, and the first threshold value is higher than the second threshold value;
controlling the storage devices to delete files to be requested, wherein the file request probability of the files to be requested is lower than the second threshold value;
and according to the respective residual storage space sizes of other storage devices except the first storage device, migrating at least part of files to be requested in the first storage device to each other storage device.
5. The file storage method according to claim 4, wherein the other storage device includes at least a second storage device and a third storage device, a file reading speed of the third storage device being higher than a file reading speed of the second storage device; according to the respective residual storage space sizes of the storage devices except the first storage device, migrating at least part of files to be requested in the first storage device to each other storage device, including:
controlling the first storage device to migrate a plurality of files to be requested with highest file request probability to the third storage device in each stored file to be requested according to the size of the remaining storage space of the third storage device;
and after the file migration of the third storage device is completed, controlling the first storage device to migrate a plurality of files to be requested with highest file request probability to the second storage device according to the size of the remaining storage space of the second storage device.
6. The file storage method according to claim 4, wherein, in the case where the plurality of files to be requested include a non-local file that is not stored in any one of the plurality of storage devices, determining the file storage policy of each of the plurality of files to be requested according to the file request probability of each of the plurality of files to be requested, includes:
Determining a file storage strategy of the non-local file according to the size relation between the file request probability of the non-local file and the first threshold value and/or the second threshold value;
further comprises:
and downloading and storing the non-local file according to the file storage strategy of the non-local file.
7. The file storage method according to any one of claims 1 to 6, wherein acquiring the file request condition in the current time period includes:
acquiring file request condition data of a plurality of terminals in the current time period, and obtaining the file request condition in the current time period after statistics.
8. A file storage system, characterized by comprising a data request platform, a file intelligent distribution module and a storage end;
the data request platform is used for acquiring the file request condition in the current time period sent by the storage end, and acquiring the file recommendation information for the next time period sent by a third-party file recommendation system; and for sending the file request condition and the file recommendation information to the file intelligent distribution module;
the file intelligent distribution module is used for determining a plurality of files to be requested in the next time period and the file request probability of each of the files to be requested according to the file request condition and the file recommendation information; determining respective file storage strategies of the plurality of files to be requested according to respective file request probabilities of the plurality of files to be requested; sending the file storage strategy to the storage end;
The storage end is used for storing the plurality of files to be requested according to respective file storage strategies of the plurality of files to be requested.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory, characterized in that the processor executes the computer program to implement the steps in the file storage method of any of claims 1 to 7.
10. A computer readable storage medium having stored thereon a computer program/instruction which when executed by a processor performs the steps in the file storage method of any of claims 1 to 7.
11. A computer program product, characterized in that the computer program product, when run on an electronic device, causes a processor to carry out the steps in the file storage method according to any of claims 1 to 7.
CN202310134395.3A 2023-02-09 2023-02-09 File storage method, system, electronic equipment, storage medium and product Pending CN116185968A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310134395.3A CN116185968A (en) 2023-02-09 2023-02-09 File storage method, system, electronic equipment, storage medium and product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310134395.3A CN116185968A (en) 2023-02-09 2023-02-09 File storage method, system, electronic equipment, storage medium and product

Publications (1)

Publication Number Publication Date
CN116185968A true CN116185968A (en) 2023-05-30

Family

ID=86447296

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310134395.3A Pending CN116185968A (en) 2023-02-09 2023-02-09 File storage method, system, electronic equipment, storage medium and product

Country Status (1)

Country Link
CN (1) CN116185968A (en)

Similar Documents

Publication Publication Date Title
Goian et al. Popularity-based video caching techniques for cache-enabled networks: A survey
US20220210242A1 (en) Predictive caching
US8799973B2 (en) Methods and apparatus for selecting and pushing customized electronic media content
CN108629029B (en) Data processing method and device applied to data warehouse
CN109982104B (en) Motion-aware video prefetching and cache replacement decision method in motion edge calculation
CN111049903B (en) Edge network load distribution algorithm based on application perception prediction
US20210151056A1 (en) Network data aligning
US20040132467A1 (en) Retrieving media items to a mobile device
CN111526246A (en) Caching method, electronic device and computer-readable storage medium
US8954556B2 (en) Utility-based model for caching programs in a content delivery network
CN111935025B (en) Control method, device, equipment and medium for TCP transmission performance
KR20220014325A (en) Communication server device, method and communication system for recommending one or more points of interest for transportation related services to a user.
Malektaji et al. Deep reinforcement learning-based content migration for edge content delivery networks with vehicular nodes
CN109246240A (en) A kind of mobile network's content pre-cache method merging CCN
CN111770152A (en) Edge data management method, medium, edge server and system
CN116185968A (en) File storage method, system, electronic equipment, storage medium and product
CN112749296A (en) Video recommendation method and device, server and storage medium
WO2022062777A1 (en) Data management method, data management apparatus, and storage medium
CN114207570A (en) Techniques for identifying segments of an information space by active adaptation to an environmental context
Wang Data-Driven Online Network Optimization Through Reinforcement Learning
CN110933119B (en) Method and equipment for updating cache content
US20230004565A1 (en) A cache updating system and a method thereof
WO2021159304A1 (en) Method and apparatus for controlling content caching in edge computing system
Kumar et al. Improve Client performance in Client Server Mobile Computing System using Cache Replacement Technique
Akbar et al. A Time Pattern-Based Intelligent Cache Optimization Policy on Korea Advanced Research Network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination