CN111400318B - Method and device for generating scheduling policy of data storage

Info

Publication number: CN111400318B
Application number: CN202010157249.9A
Authority: CN
Inventors: 李博睿; 韩晴; 张光磊; 于中汉; 李高杨; 李炬盼; 李政东; 夏金超
Original assignee: Beijing E Hualu Information Technology Co Ltd
Current assignee: Beijing E Hualu Information Technology Co Ltd
Priority date: 2020-03-09
Filing date: 2020-03-09
Publication date: 2023-09-15
Anticipated expiration: 2040-03-09
Also published as: CN111400318A

Abstract

The application provides a method and a device for generating a scheduling policy for data storage. The method comprises: acquiring data to be operated, and extracting structured information of the data to be operated; acquiring a storage distribution summary and a call record of the currently stored data; and generating a scheduling policy for data storage according to the structured information of the data to be operated, the storage distribution summary and the call record. Implementing the application solves the problem in the related art that the data scheduling policy cannot be optimized automatically, which makes the system take too long to call data. Storage can be personalized to the usage habits of the user, which reduces storage scheduling time. The policy generation network is trained on the actual operation history of the user, so the time consumed when the user schedules files consecutively is optimized, and because policy generation is learned from the user's operation history, the policy adapts well to the user's habits.

Description

Method and device for generating scheduling policy of data storage

Technical Field

The present application relates to the field of data storage technologies, and in particular, to a method, an apparatus, and a system for generating a data storage policy.

Background

With the development of technology, more and more data needs to be stored while storage capacity remains limited, which has given rise to hot/cold tiered storage strategies. A tiered storage strategy coordinates the distribution of data between expensive high-speed storage and low-speed storage, with the aim of improving overall system performance and reducing overall storage cost. Since the concept of hierarchical file storage was proposed, existing data storage technology has mainly taken the following approaches: a threshold interval for the data access count is set manually, and data falling outside the threshold is handed to an automatic tiering discrimination system; high-access data is migrated from a low-bandwidth machine to a high-bandwidth machine; a threshold interval for the data access count is set manually, and data outside the threshold is moved locally to an adjacent storage tier; or data heat is determined based on the storage medium coverage frequency.

Existing hot/cold tiered storage strategies can be divided into two types: one uses information life cycle management to judge the value of data, which is similar to the FIFO idea; the other formulates a manual policy based on the time distribution characteristics of data accesses. Both types rely on manually designed rules and cannot optimize themselves. Although they can cover most user needs, when a user switches work content it easily happens that the required data is not in hot storage and must be loaded from cold storage. For example, under current strategies, when a game user finishes one game and starts the next, the next game still has to be loaded from the hard disk instead of having been loaded into memory in advance. As a result, the call time of the stored data is longer, the user waits longer for data to load, and the user experience suffers.

Disclosure of Invention

Therefore, the technical problem to be solved by the application is to overcome the defect of long calling time of stored data when a user switches working contents in the prior art, thereby providing a method and a device for generating a scheduling strategy of data storage.

According to a first aspect, an embodiment of the present application discloses a method for generating a scheduling policy for data storage, including: acquiring data to be operated, and extracting structural information of the data to be operated; acquiring a storage distribution abstract and a calling record of the current stored data; and generating a scheduling strategy of data storage according to the structured information of the data to be operated, the storage distribution abstract and the calling record.

With reference to the first aspect, in a first implementation manner of the first aspect, the structured information includes general structured information and independent structured information; the extracting the structured information of the data to be operated comprises the following steps: according to the data to be operated, matching the corresponding general structured information in a preset file attribute information base; and acquiring the independent structural information through a preset first algorithm according to the data to be operated.

With reference to the first aspect, in a second implementation manner of the first aspect, the method further includes: and updating the scheduling policy of the current data storage according to the generated scheduling policy of the data storage.

With reference to the first aspect, in a third implementation manner of the first aspect, the method further includes: determining, according to the generated scheduling policy for data storage, the tier of the hot/cold tiered storage medium to which the data to be operated belongs; and adjusting the distribution of the data to be operated in the storage medium according to that tier.

With reference to the third implementation manner of the first aspect, in a fourth implementation manner of the first aspect, the method includes: and updating the storage distribution abstract of the current stored data according to the adjusted distribution of the data to be operated in the storage medium.

With reference to the fourth implementation manner of the first aspect, in a fifth implementation manner of the first aspect, the updating the storage distribution summary of the currently stored data according to the adjusted distribution of the data to be operated in the storage medium specifically includes: determining a first descriptor of a plurality of stored data according to the storage distribution abstract of the current stored data; determining a second descriptor of the data to be operated according to the distribution of the adjusted data to be operated in the storage medium; according to first descriptors of a plurality of stored data, respectively determining historical call times of the plurality of stored data; respectively determining Hamming distances between the stored data and data to be operated according to the first descriptors and the second descriptors of the stored data; and when the Hamming distance is smaller than a preset distance and the historical call times are highest, determining stored data which are linked with the data to be operated, and updating the storage distribution abstract of the current stored data.

According to a second aspect, an embodiment of the present application discloses a device for generating a scheduling policy for data storage, including: the extraction module is used for acquiring data to be operated and extracting structural information of the data to be operated; the acquisition module is used for acquiring a storage distribution abstract and a call record of the current stored data; and the generation module is used for generating a scheduling strategy of data storage according to the structured information of the data to be operated, the storage distribution abstract and the call record.

According to a third aspect, an embodiment of the present application discloses a system for generating a scheduling policy for data storage, including: at least one control device for executing the steps of the method for generating a scheduling policy for data storage as described in the first aspect or any implementation manner of the first aspect, determining the scheduling policy for data storage according to stored data and data to be operated on.

According to a fourth aspect, an embodiment of the present application discloses a computer readable storage medium, on which a computer program is stored, which when being executed by a processor, implements the steps of the method for generating a scheduling policy for data storage as described in the first aspect or any implementation manner of the first aspect.

The technical scheme of the application has the following advantages:

1. The application provides a method and a device for generating a scheduling policy for data storage. The scheduling policy for data storage, namely the hot/cold tiered storage policy, is learned through a first preset algorithm, that is, the policy is generated by training a model. Given the historical storage access sequence of the user over a target time period and the current file storage distribution as input, a policy is generated that moves particular files to particular locations. Specifically, data to be operated is acquired, and structured information of the data to be operated is extracted; a storage distribution summary and a call record of the currently stored data are acquired; and a scheduling policy for data storage is generated according to the structured information of the data to be operated, the storage distribution summary and the call record. Implementing the application solves the problem in the related art that the data scheduling policy cannot be optimized automatically, which makes the user take too long to call data. Storage can be personalized to the usage habits of the user, which reduces storage scheduling time. The policy generation network is trained on the actual operation history of the user, so the time consumed when the user schedules files consecutively is optimized, and because policy generation is learned from the user's operation history, the policy adapts better to user preferences.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present application, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flowchart of a specific example of a method for generating a scheduling policy for data storage in embodiment 1 of the present application;

FIG. 2 is a flow chart of extracting structured information in a method for generating a scheduling policy of data storage in embodiment 1 of the present application;

FIG. 3 is a flow chart of updating a storage policy in a method for generating a scheduling policy of data storage according to embodiment 1 of the present application;

FIG. 4 is a flow chart of adjusting data distribution in a method for generating a scheduling policy of data storage in embodiment 1 of the present application;

FIG. 5 is a flowchart of updating a storage medium summary in a method for generating a scheduling policy for data storage according to embodiment 1 of the present application;

FIG. 6 is a schematic diagram of a storage file in a method for generating a scheduling policy for data storage according to embodiment 1 of the present application;

FIG. 7 is a flowchart of a specific example of updating a storage medium summary in a method for generating a scheduling policy for data storage according to embodiment 1 of the present application;

FIG. 8 is a flowchart of a specific example of a device for generating a scheduling policy for data storage in embodiment 2 of the present application;

FIG. 9 is a block diagram illustrating a control device in a system for generating a scheduling policy for data storage according to embodiment 3 of the present application;

FIG. 10 is a block diagram of a controller in a system for generating a scheduling policy for data storage according to embodiment 3 of the present application.

Detailed Description

The technical solutions of the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort fall within the scope of the application.

In the description of the present application, it should be noted that the directions or positional relationships indicated by the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. are based on the directions or positional relationships shown in the drawings, are merely for convenience of describing the present application and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present application. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.

In addition, the technical features of the different embodiments of the present application described below may be combined with each other as long as they do not collide with each other.

Example 1

The embodiment of the application provides a method for generating a scheduling policy for data storage. The method applies to scenarios in which data must be re-scheduled and stored, such as a user browsing web pages or switching between software applications. As shown in fig. 1, the method comprises the following steps:

Step S11: acquiring data to be operated, and extracting structured information of the data to be operated. In this embodiment, the data to be operated may be data to be stored or data to be called. Specifically, the data to be stored may be data that comes from storage without hot/cold tiering and needs to be placed in cold storage or hot storage; the data to be called may be, for example, data that the user needs when switching from the current software to the next software in a series of operations.

The process of obtaining the data to be operated may be to obtain the data from the storage medium after receiving the instruction of the user, or may be to directly receive the data sent by the user; when data is input, extracting structural information in the data to be operated, wherein the structural information comprises general structural information and independent structural information; the generic structured information may be creation time, modification time, size, format, etc. of a file storing the data; the structured information also includes independent structured information in different formats, for example, when the stored data file is audio information, the independent structured information may be frame rate information, tone information, loudness information, etc.

Illustratively, the extracted structured information may serve as a descriptor of the stored data file and may be stored together with the file as part of it.

Step S12: acquiring a storage distribution abstract and a calling record of the current stored data; in this embodiment, the storage distribution summary of the currently stored data is actually the distribution of the data that has been stored in the current storage system; the call record may be the number of times that the current data to be operated is used by the user in the target time period, and reflects the heat of the current data to be operated to a certain extent.

Step S13: generating a scheduling policy for data storage according to the structured information of the data to be operated, the storage distribution summary and the call record. In this embodiment, the scheduling policy is obtained from the multiple items of structured information extracted from the data to be operated, the storage distribution summary of the currently stored data, and the historical call count within the target time period. Specifically, the data to be operated and the summary of a storage system are taken as input, where the storage system may comprise multiple tiers, and a hot/cold storage scheduling policy algorithm outputs the tier to which the data to be operated is assigned according to its heat.
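As a rough illustration of the call record used in steps S12 and S13, the following sketch counts the calls that fall inside a target time period and maps that count to a storage tier. The threshold rule is a hand-written stand-in for the learned policy described in this embodiment, and the names `count_calls`, `choose_tier` and `hot_threshold` are hypothetical.

```python
from datetime import datetime, timedelta


def count_calls(call_times, window_start, window_end):
    """Call record: number of calls that fall inside the target time period."""
    return sum(1 for t in call_times if window_start <= t <= window_end)


def choose_tier(call_count, hot_threshold=3):
    """Stand-in for the learned policy: frequently called data goes to hot storage.

    hot_threshold is an assumed illustrative value, not taken from the patent.
    """
    return "hot" if call_count >= hot_threshold else "cold"


now = datetime(2020, 3, 9, 12, 0)
# Calls made 1, 2, 5, 30 and 50 hours before "now".
calls = [now - timedelta(hours=h) for h in (1, 2, 5, 30, 50)]
recent = count_calls(calls, now - timedelta(hours=24), now)
print(recent, choose_tier(recent))
```

With a 24-hour window only the three most recent calls are counted, so the data would be treated as hot under the assumed threshold.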

Illustratively, the scheduling policy for data storage is learned and generated by the hot/cold storage scheduling policy algorithm. Specifically, the problem can be formulated as a reinforcement learning problem, and the hot/cold tiered storage policy is generated through model training. The input state of the policy generation model is the historical storage access sequence of the user in the target time period together with the distribution of the currently stored data in the storage system; the output scheduling policy may be to move the data to be operated to a particular location and tier. In practice, the hot storage may be disk storage and the cold storage may be optical disc storage.

In the method provided by the application, the scheduling policy for data storage, namely the hot/cold tiered storage policy, is learned through a first preset algorithm, that is, the policy is generated by training a model. Given the historical storage access sequence of the user over a target time period and the current file storage distribution as input, a policy is generated that moves particular files to particular locations. Specifically, data to be operated is acquired, and structured information of the data to be operated is extracted; a storage distribution summary and a call record of the currently stored data are acquired; and a scheduling policy for data storage is generated according to the structured information of the data to be operated, the storage distribution summary and the call record. Implementing the application solves the problem in the related art that the data scheduling policy cannot be optimized automatically, which makes the user wait too long for data to load. Storage can be personalized to the usage habits of the user, which reduces storage scheduling time. The policy generation network is trained on the actual operation history of the user, so the time the system spends scheduling and storing data during consecutive user operations is optimized, and because policy generation is learned from the user's operation history, the policy adapts better to user preferences.

In a specific embodiment, the structured information includes the general structured information and the independent structured information, and extracting the structured information of the data to be operated in step S11 may, as shown in fig. 2, specifically include the following steps:

Step S111: matching, according to the data to be operated, the corresponding general structured information in a preset file attribute information base. In this embodiment, the storage system automatically identifies and extracts the general structured information, that is, the attribute information of the stored data file.

Step S112: and acquiring independent structural information through a preset first algorithm according to the data to be operated. In this embodiment, when the independent structured information is extracted, for example, when the stored data is audio information, the frame rate information, tone information, loudness information, and the like in the stored data file may be extracted by a preset first algorithm, that is, a streaming media algorithm; information can also be extracted through a preset model such as a neural network.

With the method provided by the application, extracting the structured information from stored data specifically comprises matching the corresponding general structured information in a preset file attribute information base and acquiring the independent structured information through a preset first algorithm. Implementing these steps allows the structured information in a stored data file to be extracted effectively and accurately, improves the efficiency of data storage, and optimizes the time consumed when the user schedules files consecutively.
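Steps S111 and S112 can be sketched roughly as follows. This is only an illustration under stated assumptions: file system metadata from `os.stat` stands in for the preset file attribute information base, and `fake_audio_extractor` is a hypothetical placeholder for the preset first algorithm, returning fixed values rather than decoding real audio.

```python
import os
import tempfile


def general_structured_info(path):
    """Generic attributes shared by all files: times, size and format."""
    st = os.stat(path)
    return {
        "created": st.st_ctime,  # creation time on Windows, metadata-change time on Unix
        "modified": st.st_mtime,
        "size": st.st_size,
        "format": os.path.splitext(path)[1].lstrip(".").lower(),
    }


def independent_structured_info(path, extractors):
    """Format-specific attributes, produced by a per-format extractor if one exists."""
    fmt = os.path.splitext(path)[1].lstrip(".").lower()
    extractor = extractors.get(fmt)
    return extractor(path) if extractor else {}


def fake_audio_extractor(path):
    # Placeholder values; a real extractor would decode the audio stream.
    return {"frame_rate": 44100, "loudness_db": -14.0}


# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as f:
    f.write(b"\x00" * 128)
    sample_path = f.name

# The combined dictionary plays the role of the file descriptor.
descriptor = {
    **general_structured_info(sample_path),
    **independent_structured_info(sample_path, {"wav": fake_audio_extractor}),
}
os.remove(sample_path)
print(descriptor["format"], descriptor["size"], descriptor["frame_rate"])
```

Keeping the per-format extractors in a dictionary means a new file format only needs a new entry, while files with no registered extractor still receive the general attributes.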

In one embodiment, as shown in fig. 3, the method further comprises:

step S14: and updating the scheduling policy of the current data storage according to the generated scheduling policy of the data storage.

By way of example, the scheduling policy for data storage may be generated by a data storage policy generation model, which may be a reinforcement learning model. The input of the model may be the historical data of the user over a target time period, for example the sequence of data files stored and the sequence of stored data files retrieved, together with the distribution of stored data in the current storage system. The model generates a scheduling policy for data storage through a preset algorithm and updates the current data storage policy. Specifically, when learning begins the data storage policy may be random placement or a first-in first-out queue, and it is then continuously replaced by the policies generated during model training. In practice, model learning and training may be based on the historical sequence of the user storing and reading files in the target time period: the user's operations on the data are simulated, the time consumed is calculated, and the reward signal of the data storage policy generation model is generated from the time taken to call the data. The reward signal guides the update of the model. For example, the reward may be negative, so that the longer the time consumed, the larger its magnitude and the smaller the reward; this drives down the time consumed when the user reads files.
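The negative reward described above can be illustrated with a minimal sketch that replays a historical access sequence against a given placement of files. The per-tier latencies in `TIER_LATENCY` and the function names are assumptions made for illustration; the patent does not specify how access time is measured.

```python
# Assumed per-access latencies in seconds for each tier (illustrative values only).
TIER_LATENCY = {"hot": 0.01, "cold": 2.0}


def simulate_access_time(access_sequence, placement):
    """Replay a historical access sequence and sum the time spent calling data."""
    return sum(TIER_LATENCY[placement[name]] for name in access_sequence)


def reward(access_sequence, placement):
    """Negative reward: the longer the simulated calls take, the smaller the reward."""
    return -simulate_access_time(access_sequence, placement)


history = ["game_a", "game_b", "game_a", "game_b"]
initial_placement = {"game_a": "hot", "game_b": "cold"}
improved_placement = {"game_a": "hot", "game_b": "hot"}
print(reward(history, initial_placement), reward(history, improved_placement))
```

A placement that keeps both frequently alternated files in hot storage receives the higher (less negative) reward, which is the signal a reinforcement learning model would use to prefer it.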

In a specific embodiment, as shown in fig. 4, the method further includes:

Step S151: determining, according to the generated scheduling policy for data storage, the tier of the hot/cold tiered storage medium to which the data to be operated belongs;

Step S152: adjusting the distribution of the data to be operated in the storage medium according to the tier. In this embodiment, the data to be scheduled may be moved from hot storage to cold storage or from cold storage to hot storage, so that the data held in the storage system better fits the usage habits of the user.
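A minimal sketch of steps S151 and S152, assuming the current distribution is held as a mapping from tier name to a set of file names. The function name `adjust_distribution` and the file names are hypothetical.

```python
def adjust_distribution(distribution, file_name, target_tier):
    """Move one file to the tier chosen by the scheduling policy.

    distribution maps tier name -> set of file names; returns True if the file moved.
    """
    for tier, files in distribution.items():
        if file_name in files:
            if tier == target_tier:
                return False  # already in the right tier, nothing to do
            files.remove(file_name)
            break
    # A file not yet present in any tier is simply added to the target tier.
    distribution.setdefault(target_tier, set()).add(file_name)
    return True


distribution = {"hot": {"report.docx"}, "cold": {"archive.zip", "game_b.bin"}}
moved = adjust_distribution(distribution, "game_b.bin", "hot")
print(moved, sorted(distribution["hot"]))
```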

Illustratively, as shown in FIG. 5, the method further comprises:

step S153: and updating the storage distribution abstract of the current stored data according to the adjusted distribution of the data to be operated in the storage medium. In this embodiment, the storage system dynamically maintains the multi-level storage system digest, that is, updates the storage distribution digest accordingly based on the adjusted distribution of the storage data files. Specifically, as shown in fig. 6, the storage distribution summary may be in the form of a relationship graph formed by file descriptors, the nodes may represent storage data files, the attributes of the storage data files may include temperature, file names and file types, and the edges in the relationship graph may be file relationships generated by the file descriptors and are undirected edges. The file descriptor is in fact the structured information of the data to be manipulated.
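The relationship graph described above could be represented along the following lines. This is a sketch under assumptions: the class names `FileNode` and `StorageSummary` are hypothetical, and temperature is simplified to a string label.

```python
from dataclasses import dataclass, field


@dataclass
class FileNode:
    name: str
    file_type: str
    temperature: str  # "hot" or "cold"


@dataclass
class StorageSummary:
    nodes: dict = field(default_factory=dict)  # file name -> FileNode
    edges: set = field(default_factory=set)    # each edge is frozenset({name_a, name_b})

    def add_file(self, node):
        self.nodes[node.name] = node

    def link(self, a, b):
        """Add an undirected edge; frozenset makes (a, b) and (b, a) identical."""
        if a != b and a in self.nodes and b in self.nodes:
            self.edges.add(frozenset((a, b)))

    def neighbors(self, name):
        return sorted(n for e in self.edges if name in e for n in e if n != name)


summary = StorageSummary()
summary.add_file(FileNode("song_a.wav", "audio", "hot"))
summary.add_file(FileNode("song_b.wav", "audio", "cold"))
summary.link("song_a.wav", "song_b.wav")
summary.link("song_b.wav", "song_a.wav")  # same undirected edge, stored once
print(len(summary.edges), summary.neighbors("song_a.wav"))
```

Storing each edge as a `frozenset` is one simple way to make edges undirected, since the order of the two endpoints no longer matters.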

In a specific embodiment, updating the storage distribution summary of the currently stored data according to the adjusted distribution of the data to be operated in the storage medium in step S153 may, as shown in fig. 7, specifically include the following steps:

step S1531: determining a first descriptor of the plurality of stored data according to the stored distribution abstract of the current stored data;

step S1532: determining a second descriptor of the data to be operated according to the adjusted distribution of the data to be operated in the storage medium;

step S1533: according to the first descriptors of the stored data, respectively determining historical call times of the stored data;

step S1534: respectively determining Hamming distances between the stored data and the data to be operated according to the first descriptors and the second descriptors of the stored data;

Step S1535: when the Hamming distance is smaller than the preset distance and the historical call count is the highest, determining the stored data to be linked with the data to be operated, and updating the storage distribution summary of the currently stored data.

The method provided by the application updates the storage distribution summary of the currently stored data according to the adjusted distribution of the data to be operated in the storage medium. Specifically, after the data to be operated has been stored or called, the relationship between its descriptor and the descriptors of the stored data files is calculated to determine which edges link to the node of the input file. As the rule for establishing an edge, the Hamming distance between the independent structured information in the file descriptors in the graph serves as the first criterion, and the historical call count of the file serves as the second criterion. Edges are established preferentially with nodes that have a higher historical call count and a smaller Hamming distance. An upper limit on the degree of a node, that is, on the number of undirected edges connected to it, can be defined by a manually set rule. Implementing the application allows the location information of stored data files in the storage system to be updated in time, so that the user can call the data quickly the next time it is used. That is, after an online update is completed, file location migration actions are generated under the new policy in response to user operations, and the storage summary is updated in real time.
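The edge rule of steps S1531 to S1535 can be sketched as follows, assuming each descriptor has been encoded as a fixed-length bit string held in an integer. The patent does not say how descriptors are encoded, so this encoding, the function names and the sample values are all hypothetical.

```python
def hamming_distance(a, b):
    """Number of differing bits between two equal-length integer bit descriptors."""
    return bin(a ^ b).count("1")


def pick_link(new_descriptor, stored, max_distance):
    """Return the stored file to link with, or None.

    stored maps file name -> (descriptor_bits, historical_call_count).
    First criterion: Hamming distance below max_distance.
    Second criterion: highest historical call count among the candidates.
    """
    candidates = [
        (calls, name)
        for name, (bits, calls) in stored.items()
        if hamming_distance(new_descriptor, bits) < max_distance
    ]
    return max(candidates)[1] if candidates else None


stored = {
    "song_a.wav": (0b10110010, 12),
    "song_b.wav": (0b10110011, 40),
    "notes.txt": (0b01001100, 90),
}
print(pick_link(0b10110110, stored, max_distance=3))
```

In this example `notes.txt` has the highest call count but is excluded by the distance criterion, so the link goes to the most-called file among those whose descriptors are close enough.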

For example, during a target time period, whenever the user accesses the storage system, the storage system continuously records the sequence of data the user stores and calls. The hot/cold storage scheduling policy is generated by a reinforcement learning model and is updated online periodically. When the policy is updated, a policy for storing a given file in a given storage tier, namely the hot/cold storage scheduling policy, is generated from the historical storage operation sequence and the storage summary. Based on the current state of the storage system, the hot/cold storage scheduling policy algorithm can also calculate the time needed to execute each sequence fragment of files accessed by the user, which gives the time consumed in calling the data. The feedback function may be inversely proportional to the time consumed, which drives the hot/cold storage scheduling policy algorithm to optimize and produce a data storage policy that consumes less time. For optimization, a combination of optimizers may be used: the Adam algorithm in the initial stage of training and the SGD optimization algorithm in the final stage.
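The idea of using Adam early in training and SGD late in training can be shown on a toy problem. The sketch below minimizes the one-dimensional loss f(x) = (x - 3)^2, which merely stands in for the policy model's training objective; the learning rates, step counts and switch point are assumed values chosen for illustration.

```python
import math


def grad(x):
    """Gradient of the toy loss f(x) = (x - 3)^2, standing in for the policy loss."""
    return 2.0 * (x - 3.0)


def train(x=0.0, steps=200, switch_at=100, adam_lr=0.1, sgd_lr=0.05):
    m = v = 0.0
    beta1, beta2, eps = 0.9, 0.999, 1e-8
    for t in range(1, steps + 1):
        g = grad(x)
        if t <= switch_at:
            # Early phase: Adam update with bias-corrected moment estimates.
            m = beta1 * m + (1 - beta1) * g
            v = beta2 * v + (1 - beta2) * g * g
            m_hat = m / (1 - beta1 ** t)
            v_hat = v / (1 - beta2 ** t)
            x -= adam_lr * m_hat / (math.sqrt(v_hat) + eps)
        else:
            # Late phase: plain SGD step.
            x -= sgd_lr * g
    return x


print(train())
```

The adaptive step size of Adam moves the parameter quickly toward the minimum at the start, and the later SGD steps shrink the remaining error by a constant factor per step on this toy loss.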

Example 2

An embodiment of the present application provides a device for generating a scheduling policy for data storage, as shown in fig. 8, including:

the extraction module is used for acquiring data to be operated and extracting structural information of the data to be operated; for details, see the description of step S11 in the above method embodiment.

The acquisition module is used for acquiring a storage distribution abstract and a call record of the current stored data; for details, see the description of step S12 in the above method embodiment.

The generating module is configured to generate a scheduling policy for data storage according to the structured information of the data to be operated, the storage distribution abstract and the call record, and the detailed implementation content can be referred to the related description of step S13 in the above method embodiment.

In the device provided by the application, the scheduling policy for data storage, namely the hot/cold tiered storage policy, is learned through a first preset algorithm, that is, the policy is generated by training a model. Given the historical storage access sequence of the user over a target time period and the current file storage distribution as input, a policy is generated that moves particular files to particular locations. Specifically, the extraction module acquires the data to be operated and extracts its structured information; the acquisition module acquires the storage distribution summary and the call record of the currently stored data; and the generation module generates the scheduling policy for data storage according to the structured information of the data to be operated, the storage distribution summary and the call record. Implementing the application solves the problem in the related art that the data scheduling policy cannot be optimized automatically, which makes the user wait too long for stored data. Storage can be personalized to the usage habits of the user, which reduces data scheduling time. The policy generation network is trained on the real operation history of the user while the user actually operates, so the time the system spends scheduling data during consecutive user operations is optimized. Through the self-learning and self-iteration of the data scheduling policy, policy generation is learned from the user's operation history, so the policy adapts better to user preferences and improves the scheduling efficiency of files in specific scenarios.

Example 3

An embodiment of the present application provides a system for generating a scheduling policy of a data store, where the system includes at least one control device 81, where the control device 81 is configured to execute the steps of the method for generating a scheduling policy of a data store according to any one of the foregoing embodiments.

As shown in fig. 9, the control device 81 includes:

First communication module 811: used for transmitting data, namely receiving and sending the data to be operated, the storage distribution abstract of the currently stored data, and the historical call count information of the current data. The first communication module may be a Bluetooth module or a Wi-Fi module, which communicates through a preset wireless communication protocol.

First controller 812: connected to the first communication module 811 and, as shown in fig. 10, including: at least one processor 91; and a memory 92 communicatively coupled to the at least one processor 91. The memory 92 stores instructions executable by the at least one processor 91, and upon receiving the data information, the at least one processor 91 executes the method for generating a scheduling policy of data storage shown in fig. 1. Fig. 10 takes one processor as an example, with the processor 91 and the memory 92 connected through the bus 90. In this embodiment, the first communication module may be a wireless communication module, for example a Bluetooth module or a Wi-Fi module, or may be a wired communication module. The transmission between the first controller 812 and the first communication module 811 is wireless.

The memory 92, as a non-transitory computer readable storage medium, can store non-transitory software programs, non-transitory computer executable programs, and modules, such as the program instructions/modules corresponding to the method for generating a scheduling policy of data storage in an embodiment of the present application. By running the non-transitory software programs, instructions, and modules stored in the memory 92, the processor 91 executes the various functional applications and data processing of the server, that is, it implements the method for generating a scheduling policy of data storage of the above method embodiments.

The memory 92 may include a program storage area and a data storage area. The program storage area may store an operating system and an application program required by at least one function; the data storage area may store data created according to the use of a processing device operated by the server, or the like. In addition, the memory 92 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 92 may optionally include memory located remotely from the processor 91, and such remote memory may be connected to the network connection device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

One or more modules are stored in the memory 92 and, when executed by the one or more processors 91, perform the method described in any of the above embodiments.

Example 4

The embodiment of the application also provides a non-transitory computer readable storage medium storing computer instructions for causing a computer to execute the method for generating a scheduling policy of data storage described in any one of the above embodiments. The storage medium may be a magnetic disk, an optical disc, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory, a Hard Disk Drive (HDD), a Solid State Drive (SSD), or the like; the storage medium may also comprise a combination of memories of the kinds described above.

It is apparent that the above embodiments are given by way of illustration only and do not limit the implementations. Those of ordinary skill in the art may make other variations or modifications on the basis of the above description. It is neither necessary nor possible to enumerate all implementations here. Obvious variations or modifications derived from the above remain within the scope of protection of the application.

Claims

1. A method for generating a scheduling policy for data storage, comprising:

acquiring data to be operated, and extracting structured information of the data to be operated;

acquiring a storage distribution abstract and a calling record of the current stored data;

generating a scheduling strategy of data storage according to the structured information of the data to be operated, the storage distribution abstract and the calling record;

determining, according to the generated scheduling strategy of data storage, the hierarchy to which the data to be operated belongs among the storage media layered by hot and cold;

adjusting the distribution of the data to be operated in the storage medium according to the hierarchy;

updating the storage distribution abstract of the current stored data according to the adjusted distribution of the data to be operated in the storage medium;

the updating the storage distribution abstract of the current stored data according to the adjusted distribution of the data to be operated in the storage medium specifically comprises the following steps:

determining a first descriptor of a plurality of stored data according to the storage distribution abstract of the current stored data;

determining a second descriptor of the data to be operated according to the adjusted distribution of the data to be operated in the storage medium;

according to first descriptors of a plurality of stored data, respectively determining historical call times of the plurality of stored data;

respectively determining Hamming distances between the stored data and data to be operated according to the first descriptors and the second descriptors of the stored data;

and when the Hamming distance is smaller than a preset distance and the historical call times are highest, determining the stored data linked with the data to be operated, and updating the storage distribution abstract of the current stored data.
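
As a non-limiting illustration, the descriptor comparison recited in claim 1 may be sketched as follows. This is only a sketch under stated assumptions: descriptors are taken to be fixed-width bit strings held as integers, the abstract is taken to be a mapping from file name to a small record, and the helper names `hamming_distance`, `find_linked` and `update_abstract` are hypothetical. The sketch reads the two conditions as "among the stored data whose Hamming distance is smaller than the preset distance, select the one with the highest historical call times", which is one possible interpretation.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two integer descriptors differ."""
    return bin(a ^ b).count("1")


def find_linked(first_descriptors, call_counts, second_descriptor, preset_distance):
    """Return the name of the stored data linked with the data to be operated,
    or None when no stored descriptor is closer than the preset distance."""
    candidates = [
        name
        for name, descriptor in first_descriptors.items()
        if hamming_distance(descriptor, second_descriptor) < preset_distance
    ]
    if not candidates:
        return None
    # Among sufficiently similar stored data, prefer the most frequently called.
    return max(candidates, key=lambda name: call_counts.get(name, 0))


def update_abstract(abstract, new_name, new_tier, linked_name):
    """Return a copy of the storage distribution abstract that records the
    new data, its tier and the stored data it is linked with."""
    updated = dict(abstract)
    updated[new_name] = {"tier": new_tier, "linked_to": linked_name}
    return updated


if __name__ == "__main__":
    first = {"x": 0b1010, "y": 0b1000, "z": 0b0101}
    counts = {"x": 2, "y": 7, "z": 100}
    linked = find_linked(first, counts, 0b1011, preset_distance=3)
    print(update_abstract({"x": {"tier": "hot", "linked_to": None}}, "new", "hot", linked))
```

Here "z" has the highest call count but is excluded because its descriptor is too far from the second descriptor, so the frequently called and sufficiently similar "y" is selected.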

2. The method of claim 1, wherein the structured information comprises generic structured information and independent structured information;

the extracting the structured information of the data to be operated comprises the following steps:

according to the data to be operated, matching the corresponding general structured information in a preset file attribute information base;

and acquiring the independent structured information through a preset first algorithm according to the data to be operated.

3. The method as recited in claim 1, further comprising:

and updating the scheduling policy of the current data storage according to the generated scheduling policy of the data storage.

4. A device for generating a scheduling policy for data storage, comprising:

the extraction module is used for acquiring data to be operated and extracting structured information of the data to be operated;

the acquisition module is used for acquiring a storage distribution abstract and a call record of the current stored data;

the generation module is used for generating a scheduling strategy of data storage according to the structured information of the data to be operated, the storage distribution abstract and the call record;

the storage distribution adjustment module is used for executing the following processes:

5. A system for generating a scheduling policy for data storage, comprising:

at least one control device for performing the steps of the method of generating a scheduling policy for a data store according to any of claims 1-3, determining the scheduling policy for the data store from stored data and data to be operated on.

6. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of generating a scheduling policy for a data store according to any of claims 1-3.