WO2017024802A1 - System in which multiple storage media coexist, method and apparatus for performing file operations, and computer storage medium - Google Patents

System in which multiple storage media coexist, method and apparatus for performing file operations, and computer storage medium

Info

Publication number
WO2017024802A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
storage
path
storage medium
module
Prior art date
Application number
PCT/CN2016/078398
Other languages
English (en)
French (fr)
Inventor
黄德光
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2017024802A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers

Definitions

  • the present invention relates to the field of storage, and in particular to a system in which a plurality of storage media coexist, a method and apparatus for performing file operations, and a computer storage medium.
  • Several typical application scenarios of a storage cluster are shown in Figures 1(a) to 1(c).
  • When file storage is fully shared, the fully shared storage arrangement shown in Figure 1(a) is used.
  • When file storage is not fully shared, there are two arrangements. In one, all files are in mutually exclusive storage, as shown in Figure 1(b); in the other, storage is partly shared and partly mutually exclusive, as shown in Figure 1(c), which is mixed storage.
  • There may be two or more kinds of storage media in the storage cluster.
  • The application scenarios with two kinds of storage media are shown in Figures 2(a) and 2(b).
  • In Figure 2(a), the two storage media (the first storage medium and the second storage medium) corresponding to each server are both mutually exclusive storage; in Figure 2(b), each server corresponds to one mutually exclusive storage medium (the first storage medium), and the two servers correspond to one shared storage medium (the second storage medium).
  • The application scenarios shown in Figures 2(a) and 2(b) belong to the case above in which file storage is not fully shared.
  • the technical problem to be solved by the embodiments of the present invention is to provide a file operation scheme suitable for a system in which a plurality of storage media coexist.
  • the method further includes:
  • when a file needs to be written, selecting a storage medium for writing the file; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than a preset threshold, preferentially selecting the storage medium having the low predetermined performance parameter;
  • when the file needs to be read, querying the storage paths of the file, and selecting one storage path from the queried storage paths as the path for reading the file.
  • the method further includes:
  • for each file that satisfies the cold trigger condition, selecting at least one of the file's storage paths as a path for deleting the file; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially selecting the storage path corresponding to the storage medium having the high predetermined performance parameter; when there are multiple storage media having the high predetermined performance parameter, preferentially selecting the storage path corresponding to the storage medium that uses more storage space;
  • periodically deleting the files that satisfy the cold trigger condition according to the selected paths for deleting files.
  • the method further includes:
  • the method further includes:
  • when the file needs to be rewritten, querying the storage paths of the file and selecting one of them; using the selected storage path as the path for modifying the file; and deleting the file on the other queried storage paths.
  • before writing the file, the method further includes:
  • before reading the file, the method further includes:
  • determining the segment corresponding to the data to be read according to the offset of the data to be read from the beginning of the original file and the size of the segments, and using the determined segment as the file to be read.
  • the file access statistics module is configured to filter out a file that meets a heating trigger condition according to access statistics of each file stored in the system;
  • the file migration module is configured to periodically copy the files that satisfy the heating trigger condition; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than a preset threshold, a file that satisfies the heating trigger condition is preferentially copied to the storage medium having the high predetermined performance parameter.
  • the device further includes:
  • a storage management module configured to: when a file needs to be written, select a storage medium for writing the file; and when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially select the storage medium having the low predetermined performance parameter;
  • a file access module configured to: record a storage path of the file according to the root path of the selected storage medium, and use the recorded storage path as the path for writing the file; and, when the file needs to be read, query the storage paths of the file and instruct the storage management module to select one storage path from the queried storage paths as the path for reading the file.
  • the file access statistics module is further configured to: according to the access statistical information of each file stored in the system, filter out a file that meets a cold trigger condition;
  • the storage management module is further configured to: for each file that satisfies the cold trigger condition, select at least one of the file's storage paths as a path for deleting the file; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially select the storage path corresponding to the storage medium having the high predetermined performance parameter; and when there are multiple storage media having the high predetermined performance parameter, preferentially select the storage path corresponding to the storage medium that uses more storage space;
  • the file migration module is further configured to periodically delete the files that satisfy the cold trigger condition according to the selected paths for deleting files.
  • the file accessing module is further configured to query all storage paths of the file when the file needs to be deleted;
  • the file migration module is further configured to delete files on each of the queried storage paths.
  • the file access module is further configured to: when the file needs to be rewritten, query the storage paths of the file and instruct the storage management module to select one of the queried storage paths; use the selected storage path as the path for modifying the file; and instruct the file migration module to delete the file on the other queried storage paths.
  • the device further includes:
  • a file service module configured to: before the file is written, cut the original file into a plurality of segments, each segment serving as one file; save the mapping relationship between the segments and the original file, and the size of the segments; and, before the file is read, determine the segment corresponding to the data to be read according to the offset of the data to be read from the beginning of the original file and the size of the segments, and use the determined segment as the file to be read.
  • a system in which a plurality of storage media coexist including:
  • the plurality of storage media, comprising at least a first storage medium and a second storage medium, there being one or more first storage media and one or more second storage media;
  • a processor configured to: filter out the files that satisfy a heating trigger condition according to the access statistics of each file stored in the system; and periodically copy the files that satisfy the heating trigger condition, where, when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than a preset threshold, a file that satisfies the heating trigger condition is preferentially copied to the storage medium having the high predetermined performance parameter.
  • the processor is further configured to: when a file needs to be written, select a storage medium for writing the file; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially select the storage medium having the low predetermined performance parameter; record a storage path of the file according to the root path of the selected storage medium, and use the recorded storage path as the path for writing the file; and, when the file needs to be read, query the storage paths of the file and select one storage path from the queried storage paths as the path for reading the file.
  • the processor is further configured to: filter out the files that satisfy a cold trigger condition according to the access statistics of each file stored in the system; for each file that satisfies the cold trigger condition, select at least one of the file's storage paths as a path for deleting the file; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially select the storage path corresponding to the storage medium having the high predetermined performance parameter; when there are multiple storage media having the high predetermined performance parameter, preferentially select the storage path corresponding to the storage medium that uses more storage space; and periodically delete the files that satisfy the cold trigger condition according to the selected paths for deleting files.
  • the processor is further configured to query all storage paths of the file when the file needs to be deleted; and delete the file on each of the queried storage paths.
  • the processor is further configured to: when the file needs to be rewritten, query the storage paths of the file and select one of the queried storage paths; use the selected storage path as the path for modifying the file; and delete the file on the other queried storage paths.
  • the processor is further configured to: before the file is written, cut the original file into multiple segments, each segment serving as one file; save the mapping relationship between the segments and the original file, and the size of the segments; and, before the file is read, determine the segment corresponding to the data to be read according to the offset of the data to be read from the beginning of the original file and the size of the segments, and use the determined segment as the file to be read.
  • a computer storage medium is further provided, and the computer storage medium may store an execution instruction for executing the method in the foregoing embodiment.
  • The method and apparatus of the embodiments of the present invention can uniformly manage clusters composed of different types of storage media and, when the predetermined performance parameters of the storage media differ greatly, serve access to hotspot content preferentially from the storage medium having the high predetermined performance parameter.
  • An optional solution of the embodiments can also implement load balancing of files, and can balance storage space by preferentially selecting the storage medium having the low predetermined performance parameter when files are written and by automatically deleting cold files.
  • An optional solution of the embodiments can also set the operation granularity by managing the segments into which the original file is divided; the embodiments also solve the problem of replacing stored files when rewriting.
  • FIG. 1(a) is a schematic diagram of an application scenario of a storage cluster with shared storage;
  • FIG. 1(b) is a schematic diagram of an application scenario of a storage cluster with mutually exclusive storage;
  • FIG. 1(c) is a schematic diagram of an application scenario of a storage cluster with mixed storage;
  • FIG. 2(a) is a first schematic diagram of an application scenario when two types of storage media exist;
  • FIG. 2(b) is a second schematic diagram of an application scenario when two types of storage media exist;
  • FIG. 3 is a schematic flowchart of the method of Embodiment 1 for performing file operations in a system in which multiple storage media coexist;
  • FIG. 4 is a schematic diagram of an apparatus for performing file operations in a system in which a plurality of storage media coexist;
  • FIG. 5(a) is a schematic diagram showing the application of the apparatus of Embodiment 2 to a system in which a plurality of storage media coexist;
  • FIG. 5(b) is a first schematic diagram of a file operation using the apparatus of Embodiment 2;
  • FIG. 5(c) is a second schematic diagram of a file operation using the apparatus of Embodiment 2;
  • FIG. 6 is a schematic flowchart of a write operation in an implementation manner of Embodiment 2;
  • FIG. 7 is a schematic flowchart of a rewrite operation in an implementation manner of Embodiment 2;
  • FIG. 10 is a schematic flowchart of a file migration operation in an implementation manner of Embodiment 2;
  • FIG. 11 is a schematic diagram of a file cold deletion operation in an implementation manner of Embodiment 2.
  • Figure 12 is a schematic diagram of a system in which a plurality of storage media of the third embodiment coexist.
  • Embodiment 1: a method for performing file operations in a system in which a plurality of storage media coexist, the plurality of storage media comprising at least one or more first storage media and one or more second storage media. As shown in FIG. 3, the method includes:
  • The file refers to a set of related information defined by its creator, and can be logically classified as a structured file or an unstructured file.
  • A structured file consists of a set of similar records, such as the records of all candidates applying to a school, and is also known as a record file; an unstructured file is treated as a stream of characters, such as a binary file or a character file, and is also known as a stream file. The file here can be considered to have the meaning of a file as generally understood.
  • The multiple storage media may be located on one or more storage devices in the system; one storage device may have one or more kinds of storage media, and there may be one or more storage media of the same kind in one storage device.
  • For example, the system has X storage devices (X is a positive integer), such as but not limited to storage servers; Y of them (Y is a positive integer ≤ X) have one or more first storage media, and Z of them (Z is a positive integer ≤ X) have one or more second storage media (such as but not limited to SATA storage). The root path and device number of each storage medium are unique within the system.
  • the different storage media used are managed by a distributed file system or a local file system.
  • one root path may be used to correspond to one storage medium, and different storage media and clients accessing the storage medium may be in the same
  • When a distributed file system is used, the file system may be able to access disks that are not managed by the local storage controller; when accessed, such disks can be presented as a storage path.
  • The performance of different types of storage media may be close (that is, the absolute value of the difference of the predetermined performance parameters is less than or equal to a preset threshold) or may differ significantly (that is, the absolute value of the difference of the predetermined performance parameters is greater than the preset threshold).
  • When the performance is close, the replication policy may be set as needed; for example, files may be copied to any storage medium, or copied preferentially to the first or to the second storage medium.
  • the predetermined performance parameters include, but are not limited to, an I/O (Input/Output) speed of the storage medium.
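The comparison described above can be sketched as follows. This is a minimal illustration, assuming the predetermined performance parameter is a single number such as an I/O speed; the function name and the sample values are hypothetical and do not come from the patent.

```python
def significantly_different(perf_a: float, perf_b: float, threshold: float) -> bool:
    """Return True when |perf_a - perf_b| is greater than the preset threshold.

    perf_a and perf_b stand for the predetermined performance parameters of the
    first and second storage medium (assumed here to be I/O speeds).
    A difference equal to the threshold still counts as "close".
    """
    return abs(perf_a - perf_b) > threshold


# Hypothetical values: a 500 MB/s medium and a 120 MB/s medium, threshold 100 MB/s.
print(significantly_different(500.0, 120.0, 100.0))
```

The order of the two arguments does not matter, because only the absolute value of the difference is compared.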
  • For brevity, a storage medium having the high predetermined performance parameter is referred to below as a high-performance storage medium.
  • A storage medium having the low predetermined performance parameter is referred to below as a low-performance storage medium.
  • Where file storage is not fully shared, when an accessed file becomes a hotspot, the file needs to have multiple copies. If the performance of the storage media differs significantly (that is, the absolute value of the difference of the predetermined performance parameters is greater than the preset threshold), the copies of the file that has become a hotspot are placed on the high-performance storage medium as far as possible.
  • This embodiment provides a method for performing file operations in a storage cluster in which multiple storage media coexist, and can flexibly achieve the effect of preferentially deploying hot content on a high performance storage medium.
  • When copying, the storage medium that serves as the source device, that is, the storage device from which the file is copied, may be selected according to a predetermined load balancing policy.
  • The file that satisfies the heating trigger condition and is stored on that storage medium is copied as the source file. Each copy operation may create one new file (that is, one copy) or several new files.
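The choice of a destination for a new copy of a heated file can be sketched as below. This is only an illustration under stated assumptions: `Medium`, `io_speed`, and `pick_copy_destination` are hypothetical names, the performance parameter is assumed to be an I/O speed, and the "any medium" fallback for close performance is one of the freely configurable policies the text allows.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Medium:
    root_path: str   # root path that identifies the storage medium
    io_speed: float  # assumed predetermined performance parameter


def pick_copy_destination(media: List[Medium],
                          already_on: Iterable[str],
                          threshold: float) -> Optional[Medium]:
    """Pick a medium for one new copy of a file that met the heating condition."""
    holders = set(already_on)
    # Only media that do not yet hold a copy are candidates.
    candidates = [m for m in media if m.root_path not in holders]
    if not candidates:
        return None
    speeds = [m.io_speed for m in media]
    if max(speeds) - min(speeds) > threshold:
        # Significant difference: prefer the high-performance medium.
        return max(candidates, key=lambda m: m.io_speed)
    # Performance is close: the policy is free; here the first candidate is used.
    return candidates[0]


media = [Medium("/sata0", 120.0), Medium("/ssd0", 500.0)]
dest = pick_copy_destination(media, already_on=["/sata0"], threshold=100.0)
print(dest.root_path if dest else None)
```

Returning `None` when every medium already holds a copy leaves the decision about further action to the caller.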
  • the method further includes:
  • when a file needs to be written, selecting a storage medium for writing the file; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than a preset threshold, preferentially selecting the storage medium having the low predetermined performance parameter;
  • when the file needs to be read, querying the storage paths of the file, and selecting one storage path from the queried storage paths as the path for reading the file.
  • The step of recording the storage path of the file according to the root path of the selected storage medium can be implemented in several ways. For example, the root path of the storage medium and the relative path of the file on the storage medium may be recorded separately.
  • In that case, a query may retrieve the root path and the relative path separately, splice them into an absolute path, and return it to the application that needs to write or read the file.
  • Alternatively, the root path of the storage medium and the relative path of the file on the storage medium may be spliced into an absolute path that is recorded as the storage path, in which case a query returns the spliced absolute path. It is also possible for the application to record the relative path of the file, or to calculate the relative path according to a predetermined rule from the file number, file name, or another file identifier, in which case only the root path is recorded, queried, and returned.
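The first recording variant (root path and relative path stored separately, spliced on query) can be sketched as follows. The class name `PathRecord`, the in-memory dictionary, and the example paths are assumptions made for illustration; the patent does not specify how the record is stored.

```python
import posixpath
from typing import Dict, List, Tuple


class PathRecord:
    """Keeps, per file, the (root path, relative path) pairs of its copies."""

    def __init__(self) -> None:
        self._paths: Dict[str, List[Tuple[str, str]]] = {}

    def record(self, file_id: str, root_path: str, relative_path: str) -> None:
        # The root path identifies the storage medium; the relative path
        # locates the file on that medium. relative_path must not start
        # with "/", otherwise posixpath.join would discard the root path.
        self._paths.setdefault(file_id, []).append((root_path, relative_path))

    def query(self, file_id: str) -> List[str]:
        # Splice each pair into an absolute path before returning it.
        return [posixpath.join(root, rel)
                for root, rel in self._paths.get(file_id, [])]


rec = PathRecord()
rec.record("f1", "/mnt/sata0", "video/a.ts")
rec.record("f1", "/mnt/ssd0", "video/a.ts")
print(rec.query("f1"))
```

Keeping the two parts separate means a copy on another medium only needs a new root path entry, since the relative path is the same on every medium.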
  • This option places content preferentially on low-performance storage media to balance storage costs.
  • The storage path corresponding to the high-performance storage medium can be preferentially selected.
  • If there are multiple storage media available, one of them may be selected according to a predetermined load balancing policy.
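The two selection rules can be sketched together: low-performance media preferred for writing, and, on the assumption that the high-performance preference stated above applies to reading, high-performance media preferred for reads. The names `Medium`, `load`, `choose_for_write`, and `choose_for_read` are hypothetical, "least load" is an assumed load balancing policy, and media of the same kind are assumed to share the same `io_speed` value.

```python
from typing import List, NamedTuple


class Medium(NamedTuple):
    root_path: str
    io_speed: float  # assumed predetermined performance parameter
    load: int        # assumed load metric, e.g. number of open requests


def _shortlist(media: List[Medium], threshold: float, prefer_fast: bool) -> List[Medium]:
    """Narrow a non-empty list of media down to the preferred performance class."""
    speeds = [m.io_speed for m in media]
    if max(speeds) - min(speeds) <= threshold:
        return list(media)  # performance is close: every medium is eligible
    target = max(speeds) if prefer_fast else min(speeds)
    return [m for m in media if m.io_speed == target]


def choose_for_write(media: List[Medium], threshold: float) -> Medium:
    # Writes prefer the low-performance class, then the least loaded medium.
    return min(_shortlist(media, threshold, prefer_fast=False), key=lambda m: m.load)


def choose_for_read(replicas: List[Medium], threshold: float) -> Medium:
    # Reads prefer the high-performance class, then the least loaded medium.
    return min(_shortlist(replicas, threshold, prefer_fast=True), key=lambda m: m.load)


media = [Medium("/sata0", 120.0, 5), Medium("/sata1", 120.0, 2), Medium("/ssd0", 500.0, 0)]
print(choose_for_write(media, 100.0).root_path, choose_for_read(media, 100.0).root_path)
```

Both functions expect a non-empty list; for reads, the list should contain only the media that actually hold a copy of the file.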
  • The application can write the file to, and read it from, the file system directly according to the storage path.
  • An interface can also be provided that lets the application see the specific storage path, or lets it see only a virtual storage system.
  • File access may also query the storage path of the file directly; if the access fails, the application layer decides the next step.
  • the method further includes:
  • for each file that satisfies the cold trigger condition, selecting at least one of the file's storage paths as a path for deleting the file; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially selecting the storage path corresponding to the storage medium having the high predetermined performance parameter; when there are multiple storage media having the high predetermined performance parameter, preferentially selecting the storage path corresponding to the storage medium that uses more storage space;
  • periodically deleting the files that satisfy the cold trigger condition according to the selected paths for deleting files.
  • The stored data of the file on the storage medium may thus be deleted automatically, with the copy on the high-performance storage medium deleted preferentially.
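The selection of a deletion path for a cold file can be sketched as below. `Replica`, `used_space`, and `choose_delete_path` are hypothetical names; falling back to the fullest medium when performance is close is an assumption, since the text only fixes the order of preference for the significantly different case.

```python
from typing import List, NamedTuple


class Replica(NamedTuple):
    path: str        # storage path of one copy of the file
    io_speed: float  # assumed performance parameter of the medium holding it
    used_space: int  # storage space in use on that medium


def choose_delete_path(replicas: List[Replica], threshold: float) -> str:
    """Pick the storage path of the copy to delete for a cold file."""
    speeds = [r.io_speed for r in replicas]
    pool = replicas
    if max(speeds) - min(speeds) > threshold:
        # Significant difference: restrict to the high-performance media.
        fastest = max(speeds)
        pool = [r for r in replicas if r.io_speed == fastest]
    # Among the remaining candidates, prefer the medium using more space.
    return max(pool, key=lambda r: r.used_space).path


replicas = [
    Replica("/sata0/f", 120.0, 900),
    Replica("/ssd0/f", 500.0, 300),
    Replica("/ssd1/f", 500.0, 700),
]
print(choose_delete_path(replicas, 100.0))
```

Deleting from the fullest high-performance medium frees the scarcer space first, which matches the storage-space balancing goal stated in the text.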
  • The heating trigger condition and the cold trigger condition can be set as needed.
  • the method further includes:
  • the method further includes:
  • when the file needs to be rewritten, querying the storage paths of the file and selecting one of them; using the selected storage path as the path for modifying the file; and deleting the file on the other queried storage paths.
  • The application performs the overwrite operation on the corresponding file in the file system according to the storage path.
  • The storage path corresponding to the low-performance storage medium may be preferentially selected. Further, if there are multiple storage media available, one of them may be selected according to a predetermined load balancing policy.
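The rewrite step can be sketched as a small planning function: keep one storage path for the modification and mark the copies on every other path for deletion. The function name `plan_rewrite` and the `(path, io_speed)` tuple layout are assumptions for illustration; taking the first path when performance is close stands in for the load balancing policy.

```python
from typing import List, Tuple


def plan_rewrite(replicas: List[Tuple[str, float]],
                 threshold: float) -> Tuple[str, List[str]]:
    """replicas holds (storage_path, io_speed) pairs for one file.

    Returns the path to modify and the stale paths whose copies are deleted.
    """
    speeds = [speed for _, speed in replicas]
    if max(speeds) - min(speeds) > threshold:
        # Significant difference: keep the copy on the low-performance medium.
        keep = min(replicas, key=lambda r: r[1])[0]
    else:
        # Performance is close: any path may be kept; the first one is used here.
        keep = replicas[0][0]
    stale = [path for path, _ in replicas if path != keep]
    return keep, stale


keep, stale = plan_rewrite([("/ssd0/f", 500.0), ("/sata0/f", 120.0), ("/ssd1/f", 500.0)], 100.0)
print(keep, stale)
```

Deleting the other copies prevents readers from being served outdated content after the rewrite.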
  • before reading the file, the method further includes:
  • determining the segment corresponding to the data to be read according to the offset of the data to be read from the beginning of the original file and the size of the segments, and using the determined segment as the file to be read.
  • Internally, the system may specify that the original file is cut into segments before being stored, in which case the file above refers to a segment obtained by dividing the original file.
  • The cutting may also be omitted, in which case the file refers to the original file itself, in its entirety.
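The mapping from a read offset to a segment can be sketched as integer division, assuming that all segments of an original file have the same fixed size (the text only says that the segment size is saved). The function names are hypothetical.

```python
from typing import List


def segment_index(offset: int, segment_size: int) -> int:
    """Index of the segment containing the byte at `offset` of the original file."""
    if offset < 0 or segment_size <= 0:
        raise ValueError("offset must be >= 0 and segment_size must be > 0")
    return offset // segment_size


def segments_for_range(offset: int, length: int, segment_size: int) -> List[int]:
    """Indices of all segments touched by reading `length` bytes from `offset`."""
    if length <= 0:
        raise ValueError("length must be > 0")
    first = segment_index(offset, segment_size)
    last = segment_index(offset + length - 1, segment_size)
    return list(range(first, last + 1))


# With 4-byte segments, byte 10 lies in segment 2 (bytes 8 to 11).
print(segment_index(10, 4), segments_for_range(3, 6, 4))
```

The position inside the chosen segment is `offset % segment_size`, so a read that crosses a segment boundary is served from several consecutive segments.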
  • The steps in Embodiment 1 may each be implemented by a separate functional module, or all or some of the steps may share one functional module.
  • a computer storage medium is further provided, and the computer storage medium may store an execution instruction for executing the method in the foregoing embodiment.
  • Embodiment 2: an apparatus for performing file operations in a system in which a plurality of storage media coexist, the plurality of storage media including at least one or more first storage media and one or more second storage media. As shown in FIG. 4, the apparatus comprises:
  • the file access statistics module 102 is configured to filter, according to the access statistical information of each file stored in the system, a file that meets a heating trigger condition;
  • the file migration module 103 is configured to periodically copy the files that satisfy the heating trigger condition; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than a preset threshold, a file that satisfies the heating trigger condition is preferentially copied to the storage medium having the high predetermined performance parameter.
  • the device may further include a file access module 101 and a storage management module 104.
  • For example, but without limitation, the file access module 101 may, according to the screening result of the file access statistics module 102, instruct the storage management module 104 to select the destination device and the source device of the copy; the selected destination device and source device are then notified to the file migration module 103, which performs the copy operation.
  • the file migration module 103 can also directly interact with the file access statistics module 102 to perform replication according to the screening result.
  • The apparatus of this embodiment can share hotspot load when multiple storage media coexist. If two kinds of storage media with a large performance difference are included, a file that becomes hot can be served preferentially from the higher-performance storage medium.
  • the storage management module 104 is configured to: when a file needs to be written, select a storage medium for writing the file; and when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially select the storage medium having the low predetermined performance parameter;
  • the file access module 101 is configured to: record a storage path of the file according to the root path of the selected storage medium, and use the recorded storage path as the path for writing the file; and, when the file needs to be read, query the storage paths of the file and instruct the storage management module to select one storage path from the queried storage paths as the path for reading the file.
  • the file access statistics module 102 is further configured to: filter, according to the access statistical information of each file stored in the system, a file that meets a cold trigger condition;
  • the storage management module 104 is further configured to: for each file that satisfies the cold trigger condition, select at least one of the file's storage paths as a path for deleting the file; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially select the storage path corresponding to the storage medium having the high predetermined performance parameter; and when there are multiple storage media having the high predetermined performance parameter, preferentially select the storage path corresponding to the storage medium that uses more storage space;
  • the file migration module 103 is further configured to periodically delete the file that satisfies the cold trigger condition according to the path of the selected deleted file.
  • the file accessing module 101 is further configured to query all storage paths of the file when the file needs to be deleted;
  • the file migration module 103 is further configured to delete the files on the queried storage paths.
  • the file access module 101 is further configured to: when the file needs to be rewritten, query the storage paths of the file and instruct the storage management module 104 to select one of the queried storage paths; use the selected storage path as the path for modifying the file; and instruct the file migration module 103 to delete the file on the other queried storage paths.
  • the fragment scheduling and service can be performed.
  • the device further includes:
  • a file service module configured to: cut the original file into a plurality of segments, each segment serving as one file; save the mapping relationship between the segments and the original file, and the size of the segments; and determine the segment corresponding to the data to be read according to the offset of the data to be read from the beginning of the original file and the size of the segments, and use the determined segment as the file to be read.
  • the device includes: a file access module, a file access statistics module, a file migration module, and a storage management module.
  • the file access module is configured to: when a file is written, record the path of the storage medium selected by the storage management module as a storage path of the file; when the file is read, query the storage path of the file; and return the storage path to the application.
  • the file access module is configured to maintain a mapping relationship between the file and the storage path, and confirm a specific storage path of the file when the file is written and read.
  • the file access module is further configured to send a file migration or deletion instruction to the file migration module when the access information of the file satisfies the heating/cold triggering condition.
  • the file migration module performs corresponding operations according to the file migration or deletion instruction.
  • the storage management module is configured to maintain a correspondence between each path and a storage medium.
  • The storage management module may also maintain the space information of each storage device (which may include the space information of each storage medium), the disk I/O information (which may include the I/O information of each storage medium), and the status information (which may include the status information of each storage medium); maintain the correspondence between applications and the available storage devices/media; and maintain load balancing between storage devices/media.
  • When a file is written, the storage management module preferentially selects a storage medium for writing the file.
  • the storage management module may also select one of the plurality of storage media of the selected category according to the load balancing policy when the file is written.
  • When writing, the application accesses the file access module; the file access module interacts with the storage management module to determine the optimal storage path available to the application and returns it to the application, and the application writes the file according to the returned storage path.
  • the application accesses the file access module, and the file access module interacts with the storage management module to select an optimal accessible path to return to the application.
  • The file access statistics module is configured to compile file access heat statistics from the file access information and, according to the heating and cooling algorithm, notify the file access module of the files that reach the heating/cold trigger condition; the file access module interacts with the storage management module to obtain the path of the file to be copied or deleted, and then generates a migration/deletion instruction that notifies the file migration module to schedule or delete the file.
  • the file migration module receives the request, initiates a copy or delete operation in a specified manner (such as periodic execution or immediate execution), and after the operation is completed, notifies the file access module to update the corresponding file information, so that the content is re-initiated next time.
  • the file access module will re-select the path corresponding to the corresponding file.
  • a file service module that manages the division of original files into fragments and the mapping between them, and reads and writes the fragments after division.
  • When an original file is written, the file service module cuts the original file according to its size or other business attributes, names the fragments internally, and records the mapping relationship between the original file and its fragments.
  • the writing of each fragment is completed by interacting with the file access module.
  • When reading, the file service module maps the read offset to a fragment and reads it; the read policy is still applied by the file access module.
  • the files that are stored and accessed are the fragments into which the original file is divided. The objects of access statistics, copying, writing, reading, cooling, and deletion are therefore all fragments, which enables fragment-level heat statistics and load balancing of fragment access.
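The offset-to-fragment mapping described for the file service module can be sketched as follows. This is a minimal illustration, assuming fixed-size fragments; the class name `FragmentMap`, the `.fragNNNNNN` naming scheme, and the method names are hypothetical and do not come from the source, which only says that fragments are named internally.

```python
from dataclasses import dataclass


@dataclass
class FragmentMap:
    """Maps one original file to fixed-size fragments (hypothetical helper)."""
    original_name: str
    fragment_size: int  # bytes per fragment; fixed size is an assumption
    total_size: int     # size of the original file in bytes

    def fragment_name(self, index):
        # Assumed naming scheme; the source does not specify one.
        return f"{self.original_name}.frag{index:06d}"

    def fragment_count(self):
        # Ceiling division: a partial last fragment still counts.
        return (self.total_size + self.fragment_size - 1) // self.fragment_size

    def locate(self, offset, length):
        """Return (fragment name, offset inside fragment, bytes to read)
        for every fragment touched by the range [offset, offset + length)."""
        if offset < 0 or length < 0 or offset + length > self.total_size:
            raise ValueError("read range outside the original file")
        pieces = []
        end = offset + length
        while offset < end:
            index = offset // self.fragment_size
            inner = offset % self.fragment_size
            take = min(self.fragment_size - inner, end - offset)
            pieces.append((self.fragment_name(index), inner, take))
            offset += take
        return pieces
```

A read that crosses a fragment boundary yields one entry per fragment, so each fragment can be requested from the file access module as an ordinary file.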
  • the different storage media are managed by a lower-level distributed file system or a local file system; in a specific application, one root path may correspond to one storage medium. Different storage media and the clients that access them may be on the same storage bus (the different storage media are managed by the same storage controller). When a distributed file system is used, the file system may be able to access disks that are not managed by the local storage controller; when accessed, such a disk appears as a storage path.
  • FIG. 5(a) is a schematic overview of Embodiment 2 applied in a system in which multiple storage media coexist, where storage media 1 to n in one storage device may be the same or different kinds of storage media.
  • the application consists of the application in the business layer and the application agent in the bearer layer; the hypervisor agent module in the bearer layer (which also belongs to the device for file operation provided by Embodiment 2) is mainly responsible for collecting storage device information and executing file heating/cooling.
  • the device performing the file operation performs specific load balancing and access statistics functions.
  • When a business operation needs to be performed, the application first accesses the management program to select a suitable storage medium, such as the storage medium with the largest remaining space or a storage medium with little disk IO; the application then communicates with the application agent of the storage device where the selected storage medium is located, and the follow-up work is done by that application agent.
  • Fig. 5(b) is a schematic diagram showing a file operation using the apparatus of the second embodiment.
  • the device described in the second embodiment includes: a file access module 101, a file access statistics module 102, a file migration module 103, a storage management module 104, and a hypervisor agent module 105.
  • the application and application proxy can act as a single module.
  • Figure 5 (c) shows a second schematic diagram of a file operation using the apparatus of the second embodiment.
  • the device described in the second embodiment includes: a file access module 101, a file access statistics module 102, a file migration module 103, a storage management module 104, a hypervisor proxy module 105, and a file service module 106.
  • the file service module 106 mainly isolates the application from direct access to the file system and performs file fragmentation. Each file is written by the file service module according to the fragment size specified by the application; when the data written for the original file exceeds the fragment size, a new fragment name is generated automatically. Each fragment is treated as a file, and fragments are written through the file access module 101.
  • the file service module maintains the mapping relationship between the original file and its fragments.
  • When reading, the file service module 106 determines whether the data to be read crosses a fragment boundary. If it does, the file service module 106 obtains the file path of the next fragment, reads that fragment, and returns the data to the application to complete the read.
  • the file service module 106 includes two functions: file fragmentation management and a file read/write agent. File fragmentation management maintains the mapping relationship between large original files and small fragments, and starts new fragments and closes old ones according to the offset and the fragment size.
  • For the file read/write agent, part of its function is what the application does in the processes described above, and the other part is to read the data and return it to the application.
  • In this way, fine-grained scheduling of files and control of the service can be achieved. If a file is large, it can be stored in fragments, each fragment being treated as a file, and a fragment is copied when its access heat is high. If the original file is small, or fine-grained control is not needed, the fragment size can be ignored and the original file is treated as a single file. Because fragments are handled inside the file service module, the application does not perceive their existence. And because the application does not access the storage space directly, deploying one file service module per application isolates the storage space each application uses and prevents abnormal cross-application access to stored data.
  • the storage management module 104 is responsible for periodically collecting and managing the storage information of each storage device, including the storage space and disk read/write IO of each storage medium in each storage device, and whether each medium is available.
  • the information collected here is mainly used by other modules, such as the file access module 101 or the file access statistics module 102, when selecting an available storage medium with low load or large remaining space.
  • the storage path is used with a single disk as the minimum granularity, so that the access statistics of a path can be obtained.
  • In the following, the communication with the application agent of the storage device is omitted, and the application and the application agent are described collectively as the application.
  • When the application sends a request to the file access module 101, the file access module 101 requests the storage management module 104 to select, according to the mapping from the stored file's path to storage media, an available storage medium with low load or large remaining space, and to determine the specific storage path to use for the file; the application then writes or reads the file directly according to that storage path. When the access is a read, the file access module 101 also notifies the file access statistics module 102 to record file access statistics.
  • the file access statistics module 102 updates the access statistics it maintains for the file.
  • the statistics of all files are checked periodically for files that satisfy the heating trigger condition and/or the cooling trigger condition. If the heating trigger condition is met, the file access module 101 is notified to perform a copy; the file access module 101 queries the storage media on which the file has already been published, and requests the storage management module 104 to select a published storage medium with a low load as the source storage path of the file, and an available unpublished storage medium with a low load or large remaining space as the destination path of the file. The file access module 101 processes the selection result and sends it to the file migration module 103.
  • If the cooling trigger condition is met, the file access module 101 is notified to perform cooling deletion of the file; the file access module 101 queries the storage media on which the file has been published, and requests the storage management module 104 to select an available storage medium with a large amount of used storage space. The file access module 101 processes the selection result and sends it to the file migration module 103.
  • the file migration module 103 initiates a file copy according to the received source path and destination path, or completes the delete operation according to the received storage path. After completion, it notifies the file access module 101, which marks the new storage path as available or removes the old storage path.
  • FIG. 6 is a specific flow chart of file writing in an implementation manner of Embodiment 2. Steps 401-410 are included.
  • the application determines the relative path or file number of the file to be written.
  • the application requests file writing from the file access module 101.
  • the file accessing module 101 records file information, including a relative path or a file number of the file, and records an application number corresponding to the file.
  • the file access module 101 initiates a load balancing request (carrying an application number) to the storage management module 104.
  • the storage management module 104 uses a preset load balancing policy (which policy is used is determined by configuration) to select, among the accessible low-performance storage media corresponding to the application number, a storage medium with a low load or large remaining space, and returns the device number of the selected storage medium and the root path corresponding to it in the load balancing response.
  • the file accessing module 101 records file information, including: an application number corresponding to the file, a file relative path or a file number, information of the selected storage medium (including a root path and a device number), and marking the file access status as new.
  • the file access module 101 returns a file write path response to the application, where the root path and the relative path are carried.
  • After receiving the file write path response from the file access module 101, the application directly initiates a file write request to the file system using the absolute path spliced from the root path and the relative path in the response.
  • the application sends a write result notification to the file access module 101, and reports the file size to the file access module 101.
  • the file access module 101 finds the corresponding file information of the original record, modifies the file information, and marks the file access status as writing completion.
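The medium selection and path splicing in the write flow above can be sketched as follows. This is only an illustration under stated assumptions: the `Medium` fields, the policy names `"load"` and `"space"`, and the fallback to all media when no low-performance medium is accessible are hypothetical choices, since the source leaves the load balancing policy to configuration.

```python
import posixpath
from dataclasses import dataclass


@dataclass
class Medium:
    device_no: int
    root_path: str
    high_performance: bool
    load: float       # assumed metric, e.g. recent disk IO utilisation
    free_bytes: int


def select_write_medium(media, by="load"):
    """Pick a medium for a new file, preferring low-performance media."""
    if not media:
        raise ValueError("no accessible storage medium")
    low = [m for m in media if not m.high_performance]
    candidates = low or list(media)  # assumed fallback, not in the source
    if by == "load":
        return min(candidates, key=lambda m: m.load)
    if by == "space":
        return max(candidates, key=lambda m: m.free_bytes)
    raise ValueError(f"unknown policy: {by}")


def write_path(medium, relative_path):
    # Absolute path = root path of the medium spliced with the relative path.
    return posixpath.join(medium.root_path, relative_path.lstrip("/"))
```

Writing new files to low-performance media first keeps the high-performance media free for copies of files that later become hot.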
  • FIG. 7 is a specific flowchart of file rewriting in an implementation manner of Embodiment 2. Steps 501-510 are included.
  • the application determines the relative path or file number of the file that needs to be rewritten.
  • the application requests file writing from the file access module 101.
  • the file access module 101 queries the recorded file information, selects the storage media corresponding to the application number, and saves the device numbers of the selected storage media as the used device number list.
  • the file access module 101 initiates a load balancing request (carrying an application number, a list of used device numbers) to the storage management module 104.
  • the storage management module 104 uses a preset load balancing policy (the specific policy is determined by configuration) to select, among the storage media corresponding to the used device number list, a storage medium with a low load or large remaining space, preferably a low-performance storage medium, and returns the root path and device number of the selected storage medium in the load balancing response.
  • the file access module 101 records file information, including the application number corresponding to the file, the file relative path or file number, and the selected storage medium information (including the device number and the root path); it marks the file access status of the selected file (i.e., the rewritten file) as new, and marks the file access status in the file information of every other copy of the file as to be deleted.
  • the file accessing module 101 returns a file write path response to the application, where the absolute path is formed by splicing the root path and the relative path of the selected storage medium.
  • After receiving the file write path response from the file access module 101, the application directly initiates a file overwrite request to the file system using the spliced absolute path in the response.
  • the application sends a write result notification to the file access module 101, and reports the file size to the file access module 101.
  • the file access module 101 finds the originally recorded file information, modifies it, and marks the file access status of the selected file (i.e., the rewritten file) as writing completed.
  • the remaining records marked as to be deleted are removed by the cooling process.
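The status changes in the rewrite flow can be sketched as a small state update over the recorded file information. The status strings, the `FileRecord` fields, and the function names below are assumptions made for illustration; the source names the states only as new, writing completion, and to be deleted.

```python
from dataclasses import dataclass

# Assumed status labels for the sketch.
NEW, WRITTEN, TO_DELETE = "new", "written", "to_delete"


@dataclass
class FileRecord:
    app_no: int
    relative_path: str
    device_no: int
    status: str


def begin_rewrite(records, app_no, relative_path, chosen_device_no):
    """Mark the chosen copy as new and every other copy as to be deleted."""
    copies = [r for r in records
              if r.app_no == app_no and r.relative_path == relative_path]
    if not any(r.device_no == chosen_device_no for r in copies):
        raise ValueError("chosen device holds no copy of this file")
    for r in copies:
        r.status = NEW if r.device_no == chosen_device_no else TO_DELETE
    return copies


def finish_rewrite(records, app_no, relative_path, chosen_device_no):
    """After the application reports the write result, mark the copy written."""
    for r in records:
        key = (r.app_no, r.relative_path, r.device_no)
        if key == (app_no, relative_path, chosen_device_no):
            r.status = WRITTEN
```

Only one copy is overwritten; stale copies are never read again because they are already flagged for the cooling process to remove.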
  • FIG. 8 is a specific flowchart of file deletion in an implementation manner of Embodiment 2. Steps 601-604 are included.
  • the application determines the relative path or file number of the file to be deleted.
  • the application requests file deletion from the file access module 101.
  • the file access module 101 queries the recorded file information, selects all file records corresponding to the application number for the file, and marks their file access status as to be deleted.
  • the records whose status is to be deleted are removed by the cooling process.
  • the file accessing module 101 returns a file deletion response to the application. After the application receives the file deletion response of the file accessing module 101, the deletion operation is completed.
  • FIG. 9 is a flowchart showing a specific process of file reading in an implementation manner of Embodiment 2. Steps 701-709 are included.
  • the application determines the relative path or file number of the file that needs to be read.
  • the application requests file access from the file access module 101.
  • the file accessing module 101 queries the recorded file information to obtain a list of used device numbers.
  • the file access module 101 initiates a load balancing request (carrying an application number, a list of used device numbers) to the storage management module 104.
  • the storage management module 104 uses a preset load balancing policy (which policy is used is determined by configuration) to select, among the accessible storage media corresponding to the used device number list, a storage medium with a low load or large remaining space, preferably a high-performance storage medium, and returns the root path and device number of the selected storage medium in the load balancing response.
  • the file accessing module 101 confirms which file information can match the device number of the selected storage medium, and splices the absolute path according to the matched file information as the optimal access path.
  • the file access module 101 sends a file access notification to the file access statistics module 102, where the optimal access path is carried.
  • the file access module 101 returns a file read path response to the application, where the optimal access path is carried.
  • After receiving the file read path response from the file access module 101, the application directly initiates a file read request to the file system using the spliced absolute path in the response.
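The choice of a copy to read can be sketched as a single ordering over the media that already hold the file. The dictionary keys and the use of load as the tie-breaker are assumptions; the source leaves the concrete load balancing policy to configuration.

```python
def select_read_medium(media, used_device_nos):
    """Pick the copy to read: high-performance media first, then lowest load.

    media is a list of dicts with the assumed keys
    'device_no', 'root_path', 'high_performance', and 'load'.
    """
    holders = [m for m in media if m["device_no"] in used_device_nos]
    if not holders:
        raise LookupError("no accessible copy of the file")
    # False sorts before True, so 'not high_performance' puts fast media first.
    return min(holders, key=lambda m: (not m["high_performance"], m["load"]))
```

Because hot files gain copies on high-performance media over time, this ordering steers most reads of popular content to the faster media.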
  • FIG. 10 is a specific flowchart of file scheduling in an implementation manner of Embodiment 2. Steps 801-810 are included.
  • the file access statistics module 102 periodically evaluates the access heat of the files corresponding to each application number and performs a heating trigger check; if the number of accesses to a file within a window exceeds a preset heating threshold, the file is considered to satisfy the heating trigger condition and to qualify for copying to other storage media.
  • the file access statistics module 102 sends a heating notification to the file access module 101, and notifies the file access module 101 to perform file migration processing.
  • the file accessing module 101 queries the recorded file information to obtain a list of used device numbers.
  • the file access module 101 initiates a load balancing request (carrying an application number, a list of used device numbers) to the storage management module 104.
  • the storage management module 104 uses a preset load balancing policy to select, among the accessible storage media other than those corresponding to the used device number list, a storage medium with a low load or large remaining space (preferably a high-performance storage medium; which policy is used is determined by configuration) as the destination device, and to select, among the storage media corresponding to the used device number list, a storage medium with a low load or large remaining space (the specific policy is determined by configuration) as the source device; the root paths and device numbers of the selected storage media are returned in the load balancing response.
  • the file access module 101 records the application number corresponding to the file, the file relative path or file number, and the information of the selected storage medium (including the root path, i.e., the destination path, and the device number), marks the file access status as to be migrated, and records the source path on the source device where the file to be migrated is located.
  • the file access module 101 periodically initiates file migration operations: it sends the file migration tasks whose file access status is to be migrated to the file migration module 103, and marks their file access status as migrating.
  • After receiving a file migration task, the file migration module 103 reads the content from the source path and then writes the content to the destination path.
  • the file migration module 103 sends a file migration result notification to the file access module 101.
  • the file accessing module 101 updates the recorded file information according to the received file migration result, and marks the file access status as writing completion or to be migrated.
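The heating check and the choice of source and destination in the scheduling flow can be sketched as follows. The sliding-window counter, the class and function names, and the dictionary keys are assumptions made for illustration; the source only says that a file is hot when its access count within a window exceeds a preset heating threshold.

```python
from collections import defaultdict, deque


class AccessStats:
    """Counts reads per file inside a sliding time window (assumed design)."""

    def __init__(self, window_seconds, heat_threshold):
        self.window = window_seconds
        self.heat_threshold = heat_threshold
        self._hits = defaultdict(deque)  # file key -> timestamps of reads

    def record_read(self, file_key, now):
        self._hits[file_key].append(now)

    def _count(self, file_key, now):
        hits = self._hits[file_key]
        # Drop reads that have fallen out of the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        return len(hits)

    def hot_files(self, now):
        return sorted(k for k in list(self._hits)
                      if self._count(k, now) > self.heat_threshold)


def pick_migration(media, used_device_nos):
    """Return (source, destination) dicts, or None if no copy can be made."""
    used = [m for m in media if m["device_no"] in used_device_nos]
    unused = [m for m in media if m["device_no"] not in used_device_nos]
    if not used or not unused:
        return None
    source = min(used, key=lambda m: m["load"])
    # Destination: a medium without a copy, high-performance first.
    dest = min(unused, key=lambda m: (not m["high_performance"], m["load"]))
    return source, dest
```

The periodic job would call `hot_files` and then `pick_migration` for each hot file, handing the resulting pair of paths to the file migration module.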
  • FIG. 11 is a flowchart showing a specific process of file cold deletion in an implementation manner of Embodiment 2. Steps 901 to 911 are included.
  • the file access statistics module 102 periodically evaluates the access heat of the files corresponding to an application number and performs a cooling trigger check; if the number of accesses to a file within a window is less than a preset cooling threshold, the file is considered to satisfy the cooling trigger condition.
  • the file access statistic module 102 sends a cold notification to the file access module 101, and notifies the file access module 101 to perform file deletion processing.
  • the file accessing module 101 queries the recorded file information to obtain a list of used device numbers.
  • the file access module 101 initiates a load balancing request (carrying an application number, a list of used device numbers) to the storage management module 104.
  • the storage management module 104 uses a preset load balancing policy (the specific policy is determined by configuration) to select, among the accessible storage media corresponding to the used device number list, a storage medium with a high load or a large amount of used space, preferably a high-performance storage medium, and returns the root path and device number of the selected storage medium.
  • the file accessing module 101 records an application number corresponding to the file, a file relative path or a file number, information of the selected storage medium (including a root path and a device number), and marks the file access status as to be deleted.
  • the file access module 101 periodically scans for file information whose file access status is to be deleted.
  • the file access module 101 initiates a file deletion operation: it sends the file deletion tasks whose file access status is to be deleted to the file migration module 103, carrying the absolute path spliced from the root path and the relative path recorded in step 906, and marks the file access status as being deleted.
  • the file migration module 103 deletes the file from the absolute path.
  • the file migration module 103 sends a file deletion result notification to the file access module 101.
  • the file accessing module 101 deletes the corresponding file information or marks the file access status as to be deleted according to the received file deletion result.
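The choice of which copy to remove in the cooling flow can be sketched as follows. The dictionary keys and function names are assumptions; the ordering follows the stated preference for high-performance media and, among those, for the medium with the most space in use.

```python
def is_cold(access_count_in_window, cold_threshold):
    """A file is cold when its access count in the window is below the threshold."""
    return access_count_in_window < cold_threshold


def select_copy_to_delete(media, used_device_nos):
    """Pick the copy to delete: high-performance media first, then most used space.

    media is a list of dicts with the assumed keys
    'device_no', 'high_performance', and 'used_bytes'.
    Returns None when no listed medium holds a copy.
    """
    holders = [m for m in media if m["device_no"] in used_device_nos]
    if not holders:
        return None
    return max(holders, key=lambda m: (m["high_performance"], m["used_bytes"]))
```

Deleting cold copies from the fullest high-performance medium first frees the scarce fast storage for files that are currently hot.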
  • Embodiment 3 A system in which a plurality of storage media coexist, including a plurality of storage media; the plurality of storage media includes at least a first storage medium and a second storage medium; the first storage medium includes one or more; The second storage medium includes one or more;
  • the system further includes:
  • a processor configured to: select, according to access statistics of each file stored in the system, the files that satisfy a heating trigger condition; and periodically copy the files that satisfy the heating trigger condition, where, when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than a preset threshold, the files that satisfy the heating trigger condition are preferentially copied to the storage medium whose predetermined performance parameter is higher.
  • the processor is further configured to: when a file needs to be written, select a storage medium for writing the file; when the predetermined performance parameter of the first storage medium and the second storage medium When the absolute value of the difference is greater than the preset threshold, the storage medium having the predetermined low performance parameter is preferentially selected; the storage path of the file is recorded according to the root path of the selected storage medium, and the recorded storage path is written as The path of the file; when the file needs to be read, the storage path of the file is queried; and a storage path is selected from the queried storage path as a path for reading the file.
  • the processor is further configured to: select, according to the access statistics of each file stored in the system, the files that satisfy a cooling trigger condition; for each file that satisfies the cooling trigger condition, select at least one of the file's storage paths as a path for deleting the file; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than a preset threshold, preferentially select the storage path corresponding to the storage medium whose predetermined performance parameter is higher; when there are several storage media whose predetermined performance parameter is higher, preferentially select the storage path corresponding to the storage medium with the most storage space in use; and periodically delete the files that satisfy the cooling trigger condition according to the selected deletion paths.
  • the processor is further configured to query all storage paths of the file when the file needs to be deleted; and delete the file on each of the queried storage paths.
  • the processor is further configured to: when a file needs to be rewritten, query the storage paths of the file and select one of the queried storage paths; use the selected storage path as the path for modifying the file; and delete the files on the other queried storage paths.
  • the processor is further configured to: before writing a file, cut the original file into multiple fragments, each fragment being treated as one file, and save the mapping relationship between the fragments and the original file together with the fragment size; and, before reading a file, determine the fragment corresponding to the data to be read from the offset between that data and the beginning of the original file and from the fragment size, and use the determined fragment as the file to be read.
  • For other implementation details, refer to Embodiment 1 and Embodiment 2.
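The threshold rule that the processor applies when copying hot files can be sketched as a small decision function. The function name, the return labels, and the use of a single number per medium for the predetermined performance parameter are assumptions made for illustration.

```python
def preferred_copy_target(perf_first, perf_second, threshold):
    """Decide which kind of medium a hot file should preferably be copied to.

    Returns 'first' or 'second' when the performance gap exceeds the
    threshold, and None when the gap is within the threshold, in which
    case the copy policy is left to configuration.
    """
    if abs(perf_first - perf_second) <= threshold:
        return None
    return "first" if perf_first > perf_second else "second"
```

For example, with an assumed I/O speed of 500 for the first medium, 150 for the second, and a threshold of 100, hot files would preferably be copied to the first medium.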
  • the modules or steps of the present invention described above can be implemented by a general-purpose computing device; they can be centralized on a single computing device or distributed across a network of multiple computing devices. Alternatively, they may be implemented by program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device; in some cases, the steps shown or described may be performed in an order different from the one given here, or they may be fabricated separately as individual integrated circuit modules, or several of the modules or steps may be fabricated as a single integrated circuit module.
  • the invention is not limited to any specific combination of hardware and software.
  • the foregoing embodiments of the present invention can be applied to the field of mobile communication technologies. They solve the replacement problem of a stored file when it is rewritten, and allow clusters composed of different kinds of storage media to be managed in a unified way, so that when the predetermined performance parameters of the storage media differ substantially, hotspot content is preferentially accessed on the storage medium whose predetermined performance parameter is higher. Optional solutions of the present invention can also implement load balancing of files, balancing storage space by preferentially selecting the storage medium whose predetermined performance parameter is lower when writing files and by automatically deleting cooled files; the operation granularity can be set by managing the fragments into which an original file is divided.


Abstract

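A system in which multiple kinds of storage media coexist, a method and apparatus for performing file operations, and a computer storage medium. The multiple kinds of storage media include at least one or more first storage media and one or more second storage media. The method includes: selecting, according to access statistics of the files stored in the system, the files that satisfy a heating trigger condition (S110); and periodically copying the files that satisfy the heating trigger condition (S120), where, if the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than a preset threshold, the files are preferentially copied to the storage medium whose predetermined performance parameter is higher. A file operation scheme suited to systems in which multiple kinds of storage media coexist is thereby provided.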

Description

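System in Which Multiple Storage Media Coexist, Method and Apparatus for Performing File Operations, and Computer Storage Medium
Technical Field
The present invention relates to the field of storage, and in particular to a system in which multiple kinds of storage media coexist, a method and apparatus for performing file operations, and a computer storage medium.
Background
As times have advanced, demand for file services (for example, but not limited to, multimedia files such as video) has kept rising. A traditional single server can no longer meet the needs of practical applications, and server clusters and distributed storage technologies have emerged in response.
Several typical application scenarios of a storage cluster are shown in Figures 1(a) to 1(c). When file storage is shared, the fully shared storage mode shown in Figure 1(a) is used. When file storage is not fully shared, there are two modes: one in which all files are stored mutually exclusively, as shown in Figure 1(b), and one in which storage is partly shared and partly mutually exclusive, that is, mixed storage, as shown in Figure 1(c).
In the latest hardware environments, a storage cluster may contain two or more kinds of storage media. Application scenarios with two kinds of storage media are shown in Figures 2(a) and 2(b). In Figure 2(a), both kinds of storage media corresponding to each server (the first storage medium and the second storage medium) are stored mutually exclusively; in Figure 2(b), each server corresponds to one mutually exclusive storage medium (the first storage medium), and two servers correspond to one shared storage medium (the second storage medium). Both scenarios belong to the case described above in which file storage is not fully shared.
The prior art lacks a file operation scheme that is applicable to the different application scenarios in which multiple kinds of storage media coexist.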
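Summary
The technical problem to be solved by the embodiments of the present invention is to provide a file operation scheme suited to systems in which multiple kinds of storage media coexist.
To solve the above problem, the following technical solutions are adopted.
A method for performing file operations in a system in which multiple kinds of storage media coexist, the multiple kinds of storage media including at least one or more first storage media and one or more second storage media; the method includes:
selecting, according to access statistics of each file stored in the system, the files that satisfy a heating trigger condition;
periodically copying the files that satisfy the heating trigger condition, where, if the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than a preset threshold, the files that satisfy the heating trigger condition are preferentially copied to the storage medium whose predetermined performance parameter is higher.
Optionally, the method further includes:
when a file needs to be written, selecting a storage medium for writing the file; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially selecting the storage medium whose predetermined performance parameter is lower;
recording the storage path of the file according to the root path of the selected storage medium, and using the recorded storage path as the path for writing the file;
when a file needs to be read, querying the storage paths of the file, and selecting one of the queried storage paths as the path for reading the file.
Optionally, the method further includes:
selecting, according to the access statistics of each file stored in the system, the files that satisfy a cooling trigger condition;
for each file that satisfies the cooling trigger condition, selecting at least one of the file's storage paths as a path for deleting the file; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially selecting the storage path corresponding to the storage medium whose predetermined performance parameter is higher; when there are several storage media whose predetermined performance parameter is higher, preferentially selecting the storage path corresponding to the storage medium with the most storage space in use;
periodically deleting the files that satisfy the cooling trigger condition according to the selected deletion paths.
Optionally, the method further includes:
when a file needs to be deleted, querying all storage paths of the file;
deleting the file on each of the queried storage paths.
Optionally, the method further includes:
when a file needs to be rewritten, querying the storage paths of the file and selecting one of the queried storage paths; using the selected storage path as the path for modifying the file; and deleting the files on the other queried storage paths.
Optionally, before the file is written, the method further includes:
cutting the original file into multiple fragments, each fragment being treated as one file; and saving the mapping relationship between the fragments and the original file, together with the fragment size;
before the file is read, the method further includes:
determining the fragment corresponding to the data to be read from the offset between that data and the beginning of the original file and from the fragment size, and using the determined fragment as the file to be read.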
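An apparatus for performing file operations in a system in which multiple kinds of storage media coexist, the multiple kinds of storage media including at least one or more first storage media and one or more second storage media; the apparatus includes:
a file access statistics module, configured to select, according to access statistics of each file stored in the system, the files that satisfy a heating trigger condition;
a file migration module, configured to periodically copy the files that satisfy the heating trigger condition, where, if the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than a preset threshold, the files that satisfy the heating trigger condition are preferentially copied to the storage medium whose predetermined performance parameter is higher.
Optionally, the apparatus further includes:
a storage management module, configured to select, when a file needs to be written, a storage medium for writing the file, and, when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, to preferentially select the storage medium whose predetermined performance parameter is lower;
a file access module, configured to record the storage path of the file according to the root path of the selected storage medium and use the recorded storage path as the path for writing the file; to query the storage paths of a file when the file needs to be read; and to instruct the storage management module to select one of the queried storage paths as the path for reading the file.
Optionally, the file access statistics module is further configured to select, according to the access statistics of each file stored in the system, the files that satisfy a cooling trigger condition;
the storage management module is further configured to select, for each file that satisfies the cooling trigger condition, at least one of the file's storage paths as a path for deleting the file; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, to preferentially select the storage path corresponding to the storage medium whose predetermined performance parameter is higher; and, when there are several storage media whose predetermined performance parameter is higher, to preferentially select the storage path corresponding to the storage medium with the most storage space in use;
the file migration module is further configured to periodically delete the files that satisfy the cooling trigger condition according to the selected deletion paths.
Optionally, the file access module is further configured to query all storage paths of a file when the file needs to be deleted;
the file migration module is further configured to delete the file on each of the queried storage paths.
Optionally, the file access module is further configured to: when a file needs to be rewritten, query the storage paths of the file and instruct the storage management module to select one of the queried storage paths; use the selected storage path as the path for modifying the file; and instruct the file migration module to delete the files on the other queried storage paths.
Optionally, the apparatus further includes:
a file service module, configured to cut the original file into multiple fragments before a file is written, each fragment being treated as one file; to save the mapping relationship between the fragments and the original file, together with the fragment size; and, before a file is read, to determine the fragment corresponding to the data to be read from the offset between that data and the beginning of the original file and from the fragment size, and to use the determined fragment as the file to be read.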
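A system in which multiple kinds of storage media coexist, including:
multiple kinds of storage media, including at least a first storage medium and a second storage medium; there are one or more first storage media and one or more second storage media;
a processor, configured to select, according to access statistics of each file stored in the system, the files that satisfy a heating trigger condition, and to periodically copy the files that satisfy the heating trigger condition, where, when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than a preset threshold, the files that satisfy the heating trigger condition are preferentially copied to the storage medium whose predetermined performance parameter is higher.
Optionally, the processor is further configured to: when a file needs to be written, select a storage medium for writing the file; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially select the storage medium whose predetermined performance parameter is lower; record the storage path of the file according to the root path of the selected storage medium, and use the recorded storage path as the path for writing the file; when a file needs to be read, query the storage paths of the file, and select one of the queried storage paths as the path for reading the file.
Optionally, the processor is further configured to: select, according to the access statistics of each file stored in the system, the files that satisfy a cooling trigger condition; for each file that satisfies the cooling trigger condition, select at least one of the file's storage paths as a path for deleting the file; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially select the storage path corresponding to the storage medium whose predetermined performance parameter is higher; when there are several storage media whose predetermined performance parameter is higher, preferentially select the storage path corresponding to the storage medium with the most storage space in use; and periodically delete the files that satisfy the cooling trigger condition according to the selected deletion paths.
Optionally, the processor is further configured to query all storage paths of a file when the file needs to be deleted, and to delete the file on each of the queried storage paths.
Optionally, the processor is further configured to: when a file needs to be rewritten, query the storage paths of the file and select one of the queried storage paths; use the selected storage path as the path for modifying the file; and delete the files on the other queried storage paths.
Optionally, the processor is further configured to: before a file is written, cut the original file into multiple fragments, each fragment being treated as one file; save the mapping relationship between the fragments and the original file, together with the fragment size; and, before a file is read, determine the fragment corresponding to the data to be read from the offset between that data and the beginning of the original file and from the fragment size, and use the determined fragment as the file to be read.
An embodiment of the present invention further provides a computer storage medium, which may store execution instructions for performing the method of the above embodiments.
With the method and apparatus described in the embodiments of the present invention, clusters composed of different kinds of storage media can be managed in a unified way, so that when the predetermined performance parameters of the storage media differ substantially, hotspot content is preferentially accessed on the storage medium whose predetermined performance parameter is higher. Optional solutions of the embodiments can also implement load balancing of files, balancing storage space by preferentially selecting the storage medium whose predetermined performance parameter is lower when writing files and by automatically deleting cooled files. Optional solutions of the embodiments can also set the operation granularity by managing the fragments into which an original file is divided. The embodiments also solve the replacement problem of a stored file when it is rewritten.
Other features and advantages of the embodiments of the present invention will be set forth in the description that follows, and will in part become apparent from the description or be understood by practicing the invention. The objectives and other advantages of the invention can be realized and obtained through the structures particularly pointed out in the description, the claims, and the drawings.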
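Brief Description of the Drawings
The drawings provide a further understanding of the technical solutions of the present invention and form part of the description. Together with the embodiments of the present application they serve to explain the technical solutions of the invention, and do not limit them.
Figure 1(a) is a schematic diagram of an application scenario of a storage cluster with shared storage;
Figure 1(b) is a schematic diagram of an application scenario of a storage cluster with mutually exclusive storage;
Figure 1(c) is a schematic diagram of an application scenario of a storage cluster with mixed storage;
Figure 2(a) is the first schematic diagram of an application scenario with two kinds of storage media;
Figure 2(b) is the second schematic diagram of an application scenario with two kinds of storage media;
Figure 3 is a flowchart of the method of Embodiment 1 for performing file operations in a system in which multiple storage media coexist;
Figure 4 is a schematic diagram of the apparatus of Embodiment 2 for performing file operations in a system in which multiple storage media coexist;
Figure 5(a) is an overview schematic diagram of Embodiment 2 applied to a system in which multiple storage media coexist;
Figure 5(b) is the first schematic diagram of file operations performed with the apparatus of Embodiment 2;
Figure 5(c) is the second schematic diagram of file operations performed with the apparatus of Embodiment 2;
Figure 6 is a flowchart of the write operation in one implementation of Embodiment 2;
Figure 7 is a flowchart of the rewrite operation in one implementation of Embodiment 2;
Figure 8 is a flowchart of the delete operation in one implementation of Embodiment 2;
Figure 9 is a flowchart of the read operation in one implementation of Embodiment 2;
Figure 10 is a flowchart of the file migration operation in one implementation of Embodiment 2;
Figure 11 is a schematic diagram of the cold-file deletion operation in one implementation of Embodiment 2;
Figure 12 is a schematic diagram of the system of Embodiment 3 in which multiple storage media coexist.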
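Detailed Description
The technical solutions of the present invention are described in more detail below with reference to the drawings and embodiments.
It should be noted that, provided they do not conflict, the embodiments of the present invention and the features within them may be combined with one another, and all such combinations fall within the protection scope of the invention. In addition, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.
Embodiment 1. A method for performing file operations in a system in which multiple storage media coexist, the multiple storage media including at least one or more first storage media and one or more second storage media; as shown in Figure 3, the method includes:
S110. According to the access statistics of the files stored in the system, screen out the files that satisfy the heating trigger condition;
S120. Periodically copy the files that satisfy the heating trigger condition; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than a preset threshold, preferentially copy those files to the storage medium whose predetermined performance parameter is higher.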
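In this embodiment, a file is a collection of related information defined by its creator, and can logically be either structured or unstructured. A structured file consists of a group of similar records, such as the application records of all candidates applying to a school, and is also called a record file; an unstructured file is treated as a character stream, such as a binary file or a character file, and is also called a stream file. The term "file" here carries its ordinary meaning.
In this embodiment, the multiple storage media may be located on one or more storage devices in the system. One storage device may contain one or more kinds of storage media, and may contain one or more storage media of the same kind. For example, suppose the system has X storage devices (X is a positive integer), such as but not limited to storage servers; Y of them (Y is a positive integer ≤ X) have one or more first storage media (such as but not limited to SSD storage), and Z of them (Z is a positive integer ≤ X) have one or more second storage media (such as but not limited to SATA storage). The root path and the device number of each storage medium are unique within the system.
In this embodiment, the different storage media are managed by a distributed file system or a local file system. In a specific application, one root path may correspond to one storage medium, and the different storage media and the clients that access them may be on the same storage bus (the different storage media are managed by the same storage controller). When a distributed file system is used, the file system may be able to access disks that are not managed by the local storage controller; when accessed, such a disk appears as a storage path.
In this embodiment, the performance of different kinds of storage media (such as the first and second storage media) may be close (the absolute value of the difference between their predetermined performance parameters is less than or equal to the preset threshold) or clearly different (the absolute value of the difference is greater than the preset threshold).
For example, when a file in a second storage medium satisfies the heating trigger condition and is to be copied: if the predetermined performance parameter of the first storage medium is higher than that of the second storage medium and the absolute value of the difference is greater than the preset threshold, the file is preferentially copied to a first storage medium; if the predetermined performance parameter of the first storage medium is lower than that of the second storage medium and the absolute value of the difference is greater than the preset threshold, the file is preferentially copied to a second storage medium; if the absolute value of the difference between the predetermined performance parameters of the two kinds of storage media is less than or equal to the preset threshold, the copy policy can be set freely, for example copying to either kind of storage medium, or preferring the first or the second storage medium.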
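The destination rule in the example above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the patented implementation: the function name `preferred_kind`, the labels "first" and "second", and the choice of I/O speed as the predetermined performance parameter are introduced here for illustration only.

```python
def preferred_kind(perf_first, perf_second, threshold, default=None):
    """Return the kind of storage medium a hot file should preferably be copied to.

    perf_first / perf_second: predetermined performance parameter of each kind
    (assumed here to be an I/O speed). When the gap does not exceed the
    threshold, the copy policy is left to configuration, so `default` is returned.
    """
    if abs(perf_first - perf_second) > threshold:
        # Clear performance gap: prefer the higher-performance kind.
        return "first" if perf_first > perf_second else "second"
    return default


# Hypothetical figures: an SSD-like first medium and a SATA-like second medium.
choice = preferred_kind(500.0, 150.0, threshold=100.0)
```

Returning a configurable `default` mirrors the text's statement that the policy is free when the two kinds perform similarly.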
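In this embodiment, the predetermined performance parameter includes, but is not limited to, the I/O (Input/Output) speed of the storage medium. For convenience, a storage medium whose predetermined performance parameter is high is hereinafter called a high-performance storage medium, and one whose predetermined performance parameter is low is called a low-performance storage medium.
In a system in which multiple storage media coexist, file storage is not fully shared, so once an accessed file becomes a hot spot, multiple copies of it need to exist. In this embodiment, if the performance of the storage media differs considerably (the absolute value of the difference between the predetermined performance parameters is greater than the preset threshold), as many copies of the hot file as possible are placed on the high-performance storage media.
This embodiment provides a method for performing file operations in a storage cluster in which multiple storage media coexist, which flexibly achieves the effect of preferentially deploying hot content on high-performance storage media.
Further, when a file that satisfies the heating trigger condition exists on storage media in multiple storage devices, the storage medium serving as the source can be selected according to a predetermined load-balancing policy; that is, it is decided which storage medium on which storage device provides the source file for copying. Each copy operation may produce a single new file (a replica) or several new files.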
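Optionally, the method further includes:
when a file needs to be written, selecting a storage medium for writing the file; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially selecting the storage medium whose predetermined performance parameter is lower;
recording the storage path of the file according to the root path of the selected storage medium, and using the recorded storage path as the path for writing the file;
when a file needs to be read, querying the storage paths of the file, and selecting one of the queried storage paths as the path for reading the file.
In this optional scheme, the step of recording the storage path of the file according to the root path of the selected storage medium can be implemented in several ways. For example, the root path of the storage medium and the relative path of the file on the storage medium may be recorded separately; a query then looks up the root path and the relative path separately and concatenates them into an absolute path, which is returned to the application that needs to write or read the file. Alternatively, the root path and the relative path may be concatenated into an absolute path that is recorded as the storage path, in which case the query returns the concatenated absolute path. As a further alternative, the application may record the relative path of the file itself, or compute it from a predetermined rule and the file number, file name, or another file identifier; in that case only the root path needs to be recorded, queried, and returned.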
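The first recording variant (root path and relative path stored separately, joined on query) might look like the sketch below. The class name `PathRegistry`, its methods, and the sample mount points are hypothetical; the source does not prescribe a data structure.

```python
import posixpath


class PathRegistry:
    """Stores root paths and relative paths separately and joins them on query."""

    def __init__(self):
        # file_id -> list of (root_path, relative_path), one entry per replica
        self._records = {}

    def record(self, file_id, root_path, relative_path):
        self._records.setdefault(file_id, []).append((root_path, relative_path))

    def absolute_paths(self, file_id):
        # Strip a leading "/" from the relative part so that join() keeps the root.
        return [
            posixpath.join(root, rel.lstrip("/"))
            for root, rel in self._records.get(file_id, [])
        ]


registry = PathRegistry()
registry.record("f1", "/mnt/sata0", "video/a.ts")   # hypothetical root paths
registry.record("f1", "/mnt/ssd0/", "/video/a.ts")
paths = registry.absolute_paths("f1")
```

Keeping the two parts separate makes it cheap to add or drop a replica: only the root path differs between entries for the same file.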
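This optional scheme ensures that content is preferentially published on low-performance storage media, which keeps storage cost in check.
Further, when there are multiple low-performance storage media, a storage medium with less used storage space or with larger storage capacity may be preferred.
Further, when a file is read, the storage path corresponding to a high-performance storage medium may be preferred.
Further, when writing or reading, if multiple storage media are available for selection, one of them may be selected according to a predetermined load-balancing policy.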
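As a rough sketch of the write-side selection just described, the function below prefers the low-performance kind when the performance gap exceeds the threshold and then picks the medium with the least used space. The dictionary keys (`kind`, `root`, `used_bytes`), the two-kind `perf_by_kind` mapping, and the tie-breaking by used space alone are assumptions; a real deployment could plug in any configured load-balancing policy.

```python
def choose_write_medium(media, perf_by_kind, threshold):
    """Pick the storage medium for a new file.

    media: list of dicts with keys "kind", "root", "used_bytes" (assumed layout).
    perf_by_kind: predetermined performance parameter per kind of medium.
    """
    slowest = min(perf_by_kind, key=perf_by_kind.get)
    fastest = max(perf_by_kind, key=perf_by_kind.get)
    pool = media
    if perf_by_kind[fastest] - perf_by_kind[slowest] > threshold:
        low = [m for m in media if m["kind"] == slowest]
        pool = low or media  # fall back if no low-performance medium is offered
    # Among the candidates, prefer the one with the least used storage space.
    return min(pool, key=lambda m: m["used_bytes"])


media = [
    {"kind": "ssd", "root": "/mnt/ssd0", "used_bytes": 10},
    {"kind": "sata", "root": "/mnt/sata0", "used_bytes": 80},
    {"kind": "sata", "root": "/mnt/sata1", "used_bytes": 30},
]
target = choose_write_medium(media, {"ssd": 500.0, "sata": 150.0}, threshold=100.0)
```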
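In this optional scheme, the application can write and read the file directly through the file system according to the returned storage path. An interface may be provided so that the application sees the specific storage path, or the application may be shown only a virtual storage system. In addition, the storage space accessible to each application number can be controlled, which in effect allocates storage devices/media to applications.
In other optional schemes, file access may skip the storage-path query and access the file directly; if the access fails, the application layer decides the next step.
Because the space of a storage device is limited, content must be aged continuously. Besides copying a file once it becomes a hot spot, the file must also be cleared from storage media when it turns cold. Optionally, the method further includes:
according to the access statistics of the files stored in the system, screening out the files that satisfy the cooling trigger condition;
for each file that satisfies the cooling trigger condition, selecting, from among that file's storage paths, at least one storage path as a path from which the file is to be deleted; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially selecting a storage path corresponding to the storage medium whose predetermined performance parameter is higher; when there are multiple storage media whose predetermined performance parameter is higher, preferentially selecting a storage path corresponding to the storage medium with more used storage space;
periodically deleting the files that satisfy the cooling trigger condition according to the selected deletion paths.
In this optional scheme, when a file is no longer an access hot spot and needs to be cooled, its stored data can be deleted from a storage medium automatically, and the copy on a high-performance storage medium is deleted first.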
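The deletion-path choice for a cooled file mirrors the write-side choice in reverse. The sketch below is illustrative only: the replica layout (`kind`, `path`, `used_bytes`) and the function name `choose_delete_replica` are assumptions, and it selects a single path although the text allows selecting more than one.

```python
def choose_delete_replica(replicas, perf_by_kind, threshold):
    """Pick which replica of a cooled file to delete.

    replicas: list of dicts with keys "kind", "path", "used_bytes", where
    "used_bytes" is the used space of the medium holding that replica.
    """
    fastest = max(perf_by_kind, key=perf_by_kind.get)
    gap = perf_by_kind[fastest] - min(perf_by_kind.values())
    pool = replicas
    if gap > threshold:
        high = [r for r in replicas if r["kind"] == fastest]
        pool = high or replicas  # no replica on a high-performance medium
    # Among the candidates, free space on the fullest medium first.
    return max(pool, key=lambda r: r["used_bytes"])


replicas = [
    {"kind": "ssd", "path": "/mnt/ssd0/a.ts", "used_bytes": 70},
    {"kind": "ssd", "path": "/mnt/ssd1/a.ts", "used_bytes": 90},
    {"kind": "sata", "path": "/mnt/sata0/a.ts", "used_bytes": 95},
]
victim = choose_delete_replica(replicas, {"ssd": 500.0, "sata": 150.0}, threshold=100.0)
```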
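The heating trigger condition and the cooling trigger condition can be configured as needed.
Optionally, the method further includes:
when a file needs to be deleted, querying all storage paths of the file;
deleting the file on each of the queried storage paths.
Optionally, the method further includes:
when a file needs to be rewritten, querying the storage paths of the file and selecting one of the queried storage paths; using the selected storage path as the path for modifying the file; and deleting the file on the other queried storage paths.
In this optional scheme, the application overwrites the corresponding file through the file system according to the returned storage path. Further, the storage path corresponding to a low-performance storage medium may be preferred in the selection. Further, if multiple storage media are available for selection, one of them may be selected according to a predetermined load-balancing policy.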
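A rewrite therefore splits the existing replicas into one path to overwrite and a set of stale paths to delete. The following sketch assumes a caller-supplied predicate `is_low_perf` that says whether a path sits on a low-performance medium; the function name and the sample paths are hypothetical.

```python
def plan_rewrite(paths, is_low_perf):
    """Split a file's storage paths into (path to overwrite, paths to delete).

    A path on a low-performance medium is preferred for the overwrite.
    sorted() is stable, so the original order breaks ties.
    """
    if not paths:
        raise ValueError("file has no recorded storage path")
    ordered = sorted(paths, key=lambda p: not is_low_perf(p))
    return ordered[0], ordered[1:]


keep, stale = plan_rewrite(
    ["/mnt/ssd0/a.ts", "/mnt/sata1/a.ts", "/mnt/ssd1/a.ts"],
    is_low_perf=lambda p: p.startswith("/mnt/sata"),
)
```

Deleting the stale replicas keeps readers from being served outdated content after the overwrite.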
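Optionally, before a file is written, the method further includes:
splitting the original file into multiple segments, each segment being treated as one said file; saving the mapping between each segment and the original file, as well as the segment size;
before a file is read, the method further includes:
determining the segment corresponding to the data to be read according to the offset between that data and the beginning of the original file and the segment size, and using the determined segment as the file to be read.
In this optional scheme, internal storage may specify that the original file is split into segments before being stored, and the file referred to above is a segment obtained by splitting the original file. In other optional schemes no splitting is performed, in which case the file is the original file itself, that is, the original file as a whole.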
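With a fixed segment size, mapping a read request to segments is integer arithmetic: the segment index is the offset divided by the segment size, and the position inside the segment is the remainder. The sketch below assumes fixed-size segments numbered from 0 and a hypothetical function name; the source only states that the offset and the segment size determine the segment.

```python
def segments_for_read(offset, length, segment_size):
    """Map a byte range of the original file to (segment_index, inner_offset, count) tuples."""
    if segment_size <= 0 or offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0, length and segment_size must be > 0")
    first = offset // segment_size
    last = (offset + length - 1) // segment_size
    result = []
    position, remaining = offset, length
    for index in range(first, last + 1):
        inner = position - index * segment_size      # offset inside this segment
        count = min(segment_size - inner, remaining)  # bytes taken from this segment
        result.append((index, inner, count))
        position += count
        remaining -= count
    return result


# A 5-byte read at offset 10 with 4-byte segments spans segments 2 and 3.
plan = segments_for_read(offset=10, length=5, segment_size=4)
```

A read that crosses a segment boundary yields more than one tuple, which corresponds to the cross-segment case handled by the file service module later in the description.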
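This optional scheme controls the granularity at which files are used internally.
The steps of Embodiment 1 may each be implemented by separate functional modules, or all or some of the steps may share one functional module.
An embodiment of the present invention further provides a computer storage medium, which may store execution instructions for performing the method of the above embodiments.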
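Embodiment 2. An apparatus for performing file operations in a system in which multiple storage media coexist, the multiple storage media including at least one or more first storage media and one or more second storage media; as shown in Figure 4, the apparatus includes:
a file access statistics module 102, configured to screen out, according to the access statistics of the files stored in the system, the files that satisfy the heating trigger condition;
a file migration module 103, configured to periodically copy the files that satisfy the heating trigger condition, and, if the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than a preset threshold, preferentially copy those files to the storage medium whose predetermined performance parameter is higher.
In this embodiment, the apparatus may further include a file access module 101 and a storage management module 104. For example, but without limitation, the file access module 101 may, according to the screening result of the file access statistics module 102, instruct the storage management module 104 to select the destination device and the source device of the copy, and then notify the file migration module 103 of the selected destination and source devices, whereupon the file migration module 103 performs the copy. Alternatively, the file migration module 103 may interact directly with the file access statistics module 102 and copy according to the screening result.
The apparatus of this embodiment shares hot-spot load when multiple storage media coexist; if there are two kinds of storage media with a large performance gap, a hot file can preferentially be served from the higher-performance storage medium.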
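Optionally, the storage management module 104 is configured to, when a file needs to be written, select a storage medium for writing the file; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially select the storage medium whose predetermined performance parameter is lower;
the file access module 101 is configured to record the storage path of the file according to the root path of the selected storage medium, and use the recorded storage path as the path for writing the file; when a file needs to be read, query the storage paths of the file; and instruct the storage management module to select one of the queried storage paths as the path for reading the file.
Optionally, the file access statistics module 102 is further configured to screen out, according to the access statistics of the files stored in the system, the files that satisfy the cooling trigger condition;
the storage management module 104 is further configured to, for each file that satisfies the cooling trigger condition, select, from among that file's storage paths, at least one storage path as a path from which the file is to be deleted; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially select a storage path corresponding to the storage medium whose predetermined performance parameter is higher; and when there are multiple storage media whose predetermined performance parameter is higher, preferentially select a storage path corresponding to the storage medium with more used storage space;
the file migration module 103 is further configured to periodically delete the files that satisfy the cooling trigger condition according to the selected deletion paths.
Optionally, the file access module 101 is further configured to query all storage paths of a file when the file needs to be deleted;
the file migration module 103 is further configured to delete the file on each of the queried storage paths.
Optionally, the file access module 101 is further configured to, when a file needs to be rewritten, query the storage paths of the file and instruct the storage management module 104 to select one of the queried storage paths; use the selected storage path as the path for modifying the file; and instruct the file migration module 103 to delete the file on the other queried storage paths.
Optionally, if files need to be stored in segments, then in the system described so far, as long as the application performs the segmentation at the application layer, the segments can be scheduled and served.
If the segmentation is to be done at the system level, the apparatus further includes:
a file service module, configured to split the original file into multiple segments, each segment being treated as one said file; save the mapping between each segment and the original file, as well as the segment size; and, before a file is read, determine the segment corresponding to the data to be read according to the offset between that data and the beginning of the original file and the segment size, and use the determined segment as the file to be read.
For other implementation details, see Embodiment 1.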
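In a specific example of this embodiment, the apparatus includes a file access module, a file access statistics module, a file migration module, and a storage management module.
The file access module is configured to, when a file is written, record the path of the storage medium selected by the storage management module as the storage path of the file; when a file is read, query the storage paths of the file; and return the queried storage path to the application.
The file access module is configured to maintain the mapping between files and storage paths, and to determine the specific storage path when a file is written or read.
The file access module is further configured to generate a file migration or deletion instruction and send it to the file migration module when the access information of a file satisfies the heating/cooling trigger condition. The file migration module performs the corresponding operation according to that instruction.
The storage management module is configured to maintain the correspondence between paths and storage media; to maintain, for each storage device, its space information, disk I/O information, and status information (each of which may include the corresponding information for each storage medium in the device); to maintain the correspondence between applications and the storage devices/media they may use; and to maintain load balancing among storage devices/media.
When a file is written, the storage management module first selects a kind of storage medium for the write. It may then select one of the multiple storage media of the selected kind according to the load-balancing policy.
When content needs to be written, the application calls the file access module, which interacts with the storage management module to determine the best storage path available to this application and returns it; the application then writes the file according to the returned storage path.
When content needs to be read, the application calls the file access module, which interacts with the storage management module to select a best accessible path and returns it to the application.
The file access statistics module is configured to compile file access heat statistics from file access information and, according to the heating and cooling algorithms, notify the file access module of files that reach the heating/cooling trigger condition. The file access module interacts with the storage management module to obtain the paths of the file to be copied or deleted, then generates a migration/deletion instruction and notifies the file migration module to schedule or delete the file.
On receiving the request, the file migration module starts the copy or deletion in the specified manner (for example periodically or immediately). When the operation completes, it notifies the file access module to update the corresponding file information, so that the next time a read is initiated the file access module selects the path for that file anew.
The file service module manages the division of original files into segments and the corresponding mapping, as well as the reading and writing of segments after division.
When an original file is written, the file service module splits it according to its size or other service attributes, names the segments internally, and records the mapping from the original file to its segments. Segments are written through interaction with the file access module. When a file is read, the file service module maps the offset to a segment and reads it, with the read policy still applied through the file access module. In this case the stored and accessed files are the segments of the original file, so access statistics, copying, writing, reading, cooling, and deletion all operate on segments, which enables segment-level heat statistics and load balancing of segment access.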
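The implementation of the technical solution is described in further detail below with reference to Figures 5(a) to 5(c). For simplicity, the description assumes that the storage medium groups are different storage media managed by a lower-level distributed file system or local file system, that one root path corresponds to one storage medium, and that the different storage media and the clients accessing them are on the same storage bus (the different storage media are managed by the same storage controller). When a distributed file system is used, the file system may be able to access disks that are not managed by the local storage controller; such a disk appears as a storage path.
Figure 5(a) is an overview schematic diagram of Embodiment 2 applied to a system with multiple storage media. Each of the three storage devices includes storage media 1 to n, and the storage media 1 to n within one storage device may be of the same kind or of different kinds. The application consists of an application at the service layer and an application agent at the bearer layer. The management program agent module at the bearer layer (which also belongs to the file operation apparatus provided by Embodiment 2) is mainly responsible for collecting storage device information and for executing file heating/cooling. The file operation apparatus performs the actual load balancing and access statistics. When a service operation needs to be performed, the application first asks the management program to select a suitable storage medium, such as the one with the most free space or with lower disk I/O, and then communicates with the application agent of the storage device on which the selected storage medium resides; the application agent completes the subsequent work.
Figure 5(b) is the first schematic diagram of file operations performed with the apparatus of Embodiment 2. The apparatus includes: a file access module 101, a file access statistics module 102, a file migration module 103, a storage management module 104, and a management program agent module 105. In a single-device deployment, the application and the application agent can form one module.
Figure 5(c) is the second schematic diagram of file operations performed with the apparatus of Embodiment 2. The apparatus includes: a file access module 101, a file access statistics module 102, a file migration module 103, a storage management module 104, a management program agent module 105, and a file service module 106.
The file service module 106 mainly keeps applications from accessing the file system directly and controls file segmentation. When a file is written, the file service module uses the segment size specified by the application; once the data written for the original file exceeds the segment size, it automatically generates a new segment name, treats each segment as one file, and writes the segment through the file access module 101. The file service module maintains the mapping between original files and segments. When data in a file needs to be read, the file service module 106 determines whether the data to be read crosses a segment boundary; if so, it initiates a file read for the new segment to obtain the actual path, reads the segment, and returns the data to the application to complete the read.
The file service module 106 in fact comprises two functions: file segment management and file read/write proxying. Segment management maintains the mapping from a large original file to its small segments, and starts a new segment and closes the old one according to the offset and the segment size. The read/write proxy performs, on the one hand, the functions carried out by the application in the flows above and, on the other hand, reads data and passes it to the application.
Deploying the file service module 106 enables fine-grained scheduling and service control. A large file can be stored in segments, each treated as one file, and a segment is copied when its access heat is high; if the original file is small or does not need fine-grained control, the segment size is simply left unspecified and the whole original file is treated as one file. Because segments are handled inside the file service module, the application is unaware of them. And because storage space is not accessed directly, when each application is deployed with its own file service module, the storage space used by applications can be isolated, preventing abnormal access to stored data across applications.
The storage management module 104 periodically collects and manages the storage information of each storage device, including the storage space, disk read/write I/O, and availability of each storage medium in the device. This information is mainly used by other modules, such as the file access module 101 or the file access statistics module 102, to select an available storage medium with low load or large free space. Storage paths are used with a single disk as the smallest granularity, so that access statistics can be obtained per path.
In the description that follows, the communication between the application and the application agent of the storage device is omitted, and the application and the application agent are described together as the application.
When the application sends a request to the file access module 101, the file access module 101, based on its stored mapping paths from files to storage media, asks the storage management module 104 to select an available storage medium with low load or large free space; once the specific storage path of the file is determined, the application writes or reads the file directly at that path. When the access is a read, the file access module 101 also notifies the file access statistics module 102 to update the file access statistics.
On receiving the notification from the file access module 101, the file access statistics module 102 updates the access statistics it maintains for the file. It also periodically checks the statistics of all files for files that satisfy the heating trigger condition and/or the cooling trigger condition. If the heating trigger condition is satisfied, it notifies the file access module 101 to copy the file; the file access module 101 looks up the storage media on which the file has been published and asks the storage management module 104 to select a lightly loaded available storage medium among them as the source storage path of the file, and an available storage medium on which the file has not been published, with low load or large free space, as the destination path. The file access module 101 processes the selection result and sends it to the file migration module 103. If the cooling trigger condition is satisfied, it notifies the file access module 101 to delete the cooled file; the file access module 101 looks up the storage media on which the file has been published and asks the storage management module 104 to select an available storage medium with more used storage space. The file access module 101 processes the selection result and sends it to the file migration module 103.
The file migration module 103 copies the file according to the received source and destination paths, or deletes it according to the received storage path, and then notifies the file access module 101, which marks the new storage path as available or removes the old storage path.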
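Figure 6 shows the file write flow in one implementation of Embodiment 2, comprising steps 401 to 410.
401. When the application needs to write a file, it determines the relative path or file number of the file to be written.
402. The application requests a file write from the file access module 101.
403. The file access module 101 records the file information, including the relative path or file number of the file, together with the application number corresponding to the file.
404. The file access module 101 sends a load-balancing request (carrying the application number) to the storage management module 104.
405. Using a preset load-balancing policy, the storage management module 104 selects, among the accessible low-performance storage media corresponding to the application number, a storage medium with low load or large free space (which policy is used is determined by configuration), and returns the device number and the root path of the selected storage medium in the load-balancing response.
406. The file access module 101 records the file information, including the application number corresponding to the file, the relative path or file number of the file, and the information of the selected storage medium (root path and device number), and marks the file access state as "new".
407. The file access module 101 returns a file write path response to the application, carrying the absolute path formed by concatenating the root path and the relative path.
408. On receiving the file write path response from the file access module 101, the application sends a file write request directly to the file system using the absolute path in the response.
409. After the write completes, the application sends a write result notification to the file access module 101 and reports the file size.
410. The file access module 101 finds the previously recorded file information, updates it, and marks the file access state as "write complete".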
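Steps 402 to 410 amount to a small piece of bookkeeping on the file access module's side. The sketch below compresses them into two calls. All names (`FileAccessSketch`, `request_write`, `report_written`, the state strings) are hypothetical, and the storage management module is reduced to a callback that returns a device number and root path.

```python
class FileAccessSketch:
    """Illustrative bookkeeping for the write flow; not the module's actual API."""

    def __init__(self, select_medium):
        # select_medium(app_id) -> (device_id, root_path); stands in for the
        # load-balancing request/response with the storage management module.
        self.select_medium = select_medium
        self.records = {}  # (app_id, rel_path) -> file information

    def request_write(self, app_id, rel_path):
        device_id, root = self.select_medium(app_id)               # steps 404-405
        self.records[(app_id, rel_path)] = {                       # step 406
            "device_id": device_id, "root": root, "state": "new", "size": None,
        }
        return root.rstrip("/") + "/" + rel_path.lstrip("/")       # step 407

    def report_written(self, app_id, rel_path, size):              # steps 409-410
        record = self.records[(app_id, rel_path)]
        record["size"] = size
        record["state"] = "write_complete"


access = FileAccessSketch(lambda app_id: ("sata-1", "/mnt/sata1"))
write_path = access.request_write("app-7", "video/a.ts")
access.report_written("app-7", "video/a.ts", size=1024)
```

The application performs the actual write itself (step 408) using the returned path, so the sketch never touches the file system.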
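Figure 7 shows the file rewrite flow in one implementation of Embodiment 2, comprising steps 501 to 510.
501. When a file needs to be rewritten, the application determines the relative path or file number of the file to be rewritten.
502. The application requests a file write from the file access module 101.
503. The file access module 101 queries the recorded file information, filters out the storage media corresponding to the application number, and saves their device numbers as the list of used device numbers.
504. The file access module 101 sends a load-balancing request (carrying the application number and the list of used device numbers) to the storage management module 104.
505. Using a preset load-balancing policy, the storage management module 104 selects, among the storage media corresponding to the list of used device numbers, an accessible low-performance storage medium with low load or large free space (which policy is used is determined by configuration), and returns the root path and device number of the selected storage medium in the load-balancing response.
506. The file access module 101 records the file information, including the application number corresponding to the file, the relative path or file number of the file, and the information of the selected storage medium (device number and root path), marks the file access state of the selected file (the file to be rewritten) as "new", and marks the file access state in the file information of the file's other replicas as "pending deletion".
507. The file access module 101 returns a file write path response to the application, carrying the absolute path formed by concatenating the root path of the selected storage medium and the relative path.
508. On receiving the file write path response from the file access module 101, the application sends a file overwrite request directly to the file system using the absolute path in the response.
509. After the write completes, the application sends a write result notification to the file access module 101 and reports the file size.
510. The file access module 101 finds the previously recorded file information, updates it, and marks the file access state of the selected file (the rewritten file) as "write complete". The remaining records in the "pending deletion" state are deleted by the cooling flow.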
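Figure 8 shows the file deletion flow in one implementation of Embodiment 2, comprising steps 601 to 604.
601. When a file needs to be deleted, the application determines the relative path or file number of the file to be deleted.
602. The application requests file deletion from the file access module 101.
603. The file access module 101 queries the recorded file information, filters out all records of the file corresponding to the application number, and marks their file access state as "pending deletion". Records in the "pending deletion" state are deleted by the cooling flow.
604. The file access module 101 returns a file deletion response to the application; when the application receives it, the deletion operation is complete.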
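Figure 9 shows the file read flow in one implementation of Embodiment 2, comprising steps 701 to 709.
701. When the application needs to read a file, it determines the relative path or file number of the file to be read.
702. The application requests a file read from the file access module 101.
703. The file access module 101 queries the recorded file information and obtains the list of used device numbers.
704. The file access module 101 sends a load-balancing request (carrying the application number and the list of used device numbers) to the storage management module 104.
705. Using a preset load-balancing policy, the storage management module 104 selects, among the storage media corresponding to the list of used device numbers, an accessible storage medium with low load or large free space (high-performance storage media are preferred; which policy is used is determined by configuration), and returns the root path and device number of the selected storage medium in the load-balancing response.
706. The file access module 101 determines which file information record matches the device number of the selected storage medium, and concatenates an absolute path from the matching record as the best access path.
707. The file access module 101 sends a file access notification carrying the best access path to the file access statistics module 102.
708. The file access module 101 returns a file read path response carrying the best access path to the application.
709. On receiving the file read path response from the file access module 101, the application sends a file read request directly to the file system using the absolute path in the response.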
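Steps 703 to 706 can be sketched as a lookup over the file's records followed by a match on the chosen device number. The record layout, the function name `best_read_path`, and the `pick_device` callback (standing in for the storage management module's policy) are assumptions made for this illustration.

```python
def best_read_path(records, pick_device):
    """Return the absolute path of the replica on the device chosen by pick_device.

    records: list of dicts with keys "device_id", "root", "rel_path" (assumed layout).
    pick_device(device_ids) -> device_id; stands in for steps 704-705.
    """
    used = [r["device_id"] for r in records]                  # step 703
    chosen = pick_device(used)                                # steps 704-705
    for record in records:                                    # step 706
        if record["device_id"] == chosen:
            return record["root"].rstrip("/") + "/" + record["rel_path"].lstrip("/")
    raise LookupError("no record matches the selected device")


def prefer_ssd(device_ids):
    # Hypothetical policy: take the first SSD-like device, else the first device.
    return next((d for d in device_ids if d.startswith("ssd")), device_ids[0])


records = [
    {"device_id": "sata-1", "root": "/mnt/sata1", "rel_path": "video/a.ts"},
    {"device_id": "ssd-0", "root": "/mnt/ssd0/", "rel_path": "video/a.ts"},
]
read_path = best_read_path(records, prefer_ssd)
```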
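Figure 10 shows the file scheduling (migration) flow in one implementation of Embodiment 2, comprising steps 801 to 810.
801. The file access statistics module 102 periodically evaluates the heat of the file access counts corresponding to each application number and performs the heating trigger check; if, within a window, the number of times a file exceeds the heating click threshold is greater than the preset heating threshold, the file is considered to satisfy the heating trigger condition and qualifies for copying to other storage media.
802. The file access statistics module 102 sends a heating notification to the file access module 101, notifying it to perform file migration processing.
803. The file access module 101 queries the recorded file information and obtains the list of used device numbers.
804. The file access module 101 sends a load-balancing request (carrying the application number and the list of used device numbers) to the storage management module 104.
805. Using a predetermined load-balancing policy, the storage management module 104 selects, among the storage media other than those corresponding to the list of used device numbers, an accessible storage medium with low load or large free space (high-performance storage media are preferred; which policy is used is determined by configuration) as the destination device, and selects, among the storage media corresponding to the list of device numbers, an accessible storage medium with low load or large free space (which policy is used is determined by configuration) as the source device; it returns the root paths and device numbers of the selected storage media in the load-balancing response.
806. The file access module 101 records the application number corresponding to the file, the relative path or file number of the file, and the information of the selected storage medium (the root path, which is the destination path, and the device number), marks the file access state as "pending migration", and records the source path of the file on the source device.
807. The file access module 101 periodically initiates file migration: it generates file migration tasks from the file information whose file access state is "pending migration", sends them to the file migration module 103, and marks the file access state as "migrating".
808. On receiving a file migration task, the file migration module 103 reads the content from the source path and writes it to the destination path.
809. After the migration task completes or fails, the file migration module 103 sends a file migration result notification to the file access module 101.
810. According to the received file migration result, the file access module 101 updates the recorded file information and marks the file access state as "write complete" or "pending migration".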
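The heating check of step 801 and the state transitions of steps 807 to 810 can be sketched as follows. Two interpretations are assumed here and are not confirmed by the source: the window is taken to be a list of per-period access counts, and a failed copy returns the record to "pending migration" so that a later cycle retries it. Function names and state strings are hypothetical.

```python
import shutil


def is_hot(period_counts, click_threshold, heat_threshold):
    """Step 801 (assumed reading): count the periods in the window whose access
    count exceeds the click threshold; the file is hot if that number is
    greater than the heating threshold."""
    over = sum(1 for count in period_counts if count > click_threshold)
    return over > heat_threshold


def run_migration(record, copy=shutil.copyfile):
    """Steps 807-810: copy from record["src"] to record["dst"] and update the state."""
    record["state"] = "migrating"
    try:
        copy(record["src"], record["dst"])
    except OSError:
        record["state"] = "pending_migration"  # retried on a later cycle
        return False
    record["state"] = "write_complete"
    return True


hot = is_hot([12, 3, 15, 20], click_threshold=10, heat_threshold=2)
```

The `copy` parameter is injectable so the state handling can be exercised without touching real disks; by default it uses `shutil.copyfile`.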
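Figure 11 shows the cold-file deletion flow in one implementation of Embodiment 2, comprising steps 901 to 911.
901. The file access statistics module 102 periodically evaluates the heat of the file access counts corresponding to an application number and performs the cooling trigger check; if, within a window, the number of times a file falls below the click threshold is less than the preset cooling threshold, the file is considered to satisfy the cooling trigger condition.
902. The file access statistics module 102 sends a cooling notification to the file access module 101, notifying it to perform file deletion processing.
903. The file access module 101 queries the recorded file information and obtains the list of used device numbers.
904. The file access module 101 sends a load-balancing request (carrying the application number and the list of used device numbers) to the storage management module 104.
905. Using a preset load-balancing policy, the storage management module 104 selects, among the storage media corresponding to the list of used device numbers, an accessible storage medium with high load or more used space (high-performance storage media are preferred; which policy is used is determined by configuration), and returns the root path and device number of the selected device in the load-balancing response.
906. The file access module 101 records the application number corresponding to the file, the relative path or file number of the file, and the information of the selected storage medium (root path and device number), and marks the file access state as "pending deletion".
907. The file access module 101 periodically checks for file information whose file access state is "pending deletion".
908. The file access module 101 initiates file deletion: it generates file deletion tasks from the file information whose file access state is "pending deletion" and sends them to the file migration module 103, each carrying the absolute path concatenated from the root path and the relative path recorded in step 906; it marks the file access state as "deleting".
909. On receiving a file deletion task, the file migration module 103 deletes the file at the absolute path.
910. After the deletion task completes or fails, the file migration module 103 sends a file deletion result notification to the file access module 101.
911. According to the received file deletion result, the file access module 101 deletes the corresponding file information or marks the file access state as "pending deletion" again.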
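Steps 908 to 911 follow the same pattern as migration: mark, act, then either drop the record or return it to the pending state. The sketch below assumes a record dictionary keyed by an arbitrary identifier and an injectable `remove` function (defaulting to `os.remove`); these names are illustrative, not taken from the source.

```python
import os


def run_delete(records, key, remove=os.remove):
    """Delete one replica and update the bookkeeping.

    records[key] is assumed to hold "abs_path" and "state". On success the
    record is dropped (step 911); on failure it returns to "pending_delete"
    so that the periodic check of step 907 picks it up again.
    """
    record = records[key]
    record["state"] = "deleting"            # step 908
    try:
        remove(record["abs_path"])          # step 909
    except OSError:
        record["state"] = "pending_delete"  # step 911, failure branch
        return False
    del records[key]                        # step 911, success branch
    return True
```

A caller would typically loop over all records in the "pending_delete" state on each periodic pass and call `run_delete` for each of them.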
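Embodiment 3. A system in which multiple storage media coexist, including multiple kinds of storage media; the multiple kinds of storage media include at least a first storage medium and a second storage medium; there are one or more first storage media and one or more second storage media;
as shown in Figure 12, the system further includes:
a processor, configured to screen out, according to the access statistics of the files stored in the system, the files that satisfy the heating trigger condition; and periodically copy the files that satisfy the heating trigger condition, and, when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than a preset threshold, preferentially copy those files to the storage medium whose predetermined performance parameter is higher.
Optionally, the processor is further configured to, when a file needs to be written, select a storage medium for writing the file; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially select the storage medium whose predetermined performance parameter is lower; record the storage path of the file according to the root path of the selected storage medium, and use the recorded storage path as the path for writing the file; when a file needs to be read, query the storage paths of the file; and select one of the queried storage paths as the path for reading the file.
Optionally, the processor is further configured to screen out, according to the access statistics of the files stored in the system, the files that satisfy the cooling trigger condition; for each file that satisfies the cooling trigger condition, select, from among that file's storage paths, at least one storage path as a path from which the file is to be deleted; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially select a storage path corresponding to the storage medium whose predetermined performance parameter is higher; when there are multiple storage media whose predetermined performance parameter is higher, preferentially select a storage path corresponding to the storage medium with more used storage space; and periodically delete the files that satisfy the cooling trigger condition according to the selected deletion paths.
Optionally, the processor is further configured to query all storage paths of a file when the file needs to be deleted, and delete the file on each of the queried storage paths.
Optionally, the processor is further configured to, when a file needs to be rewritten, query the storage paths of the file and select one of the queried storage paths; use the selected storage path as the path for modifying the file; and delete the file on the other queried storage paths.
Optionally, the processor is further configured to split the original file into multiple segments before a file is written, each segment being treated as one said file; save the mapping between each segment and the original file, as well as the segment size; and, before a file is read, determine the segment corresponding to the data to be read according to the offset between that data and the beginning of the original file and the segment size, and use the determined segment as the file to be read.
For other implementation details, see Embodiment 1 and Embodiment 2.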
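Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with general-purpose computing devices; they can be concentrated on a single computing device or distributed over a network of multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by a computing device; in some cases the steps shown or described may be performed in a different order, or the modules or steps may each be made into individual integrated circuit modules, or several of them may be made into a single integrated circuit module. The present invention is thus not limited to any specific combination of hardware and software.
The above are merely preferred embodiments of the present invention and are not intended to limit it; those skilled in the art may make various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its protection scope.
Industrial Applicability
The embodiments of the present invention described above can be applied in the field of mobile communication technology. They solve the problem of replacing a stored file when it is rewritten; they allow unified management of a cluster composed of different kinds of storage media and, when the predetermined performance parameters of the storage media differ considerably, let hot content be preferentially accessed on the storage media whose predetermined performance parameter is higher. Optional schemes of the invention also balance file load, balancing storage space by preferentially selecting storage media whose predetermined performance parameter is lower when files are written and by automatically deleting cooled files, and can set the operation granularity by managing the segments into which an original file is divided.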

Claims (19)

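  1. A method for performing file operations in a system in which multiple storage media coexist, the multiple storage media including at least one or more first storage media and one or more second storage media; the method including:
    according to the access statistics of the files stored in the system, screening out the files that satisfy the heating trigger condition;
    periodically copying the files that satisfy the heating trigger condition, and, if the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than a preset threshold, preferentially copying those files to the storage medium whose predetermined performance parameter is higher.
  2. The method according to claim 1, further including:
    when a file needs to be written, selecting a storage medium for writing the file; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially selecting the storage medium whose predetermined performance parameter is lower;
    recording the storage path of the file according to the root path of the selected storage medium, and using the recorded storage path as the path for writing the file;
    when a file needs to be read, querying the storage paths of the file, and selecting one of the queried storage paths as the path for reading the file.
  3. The method according to claim 2, further including:
    according to the access statistics of the files stored in the system, screening out the files that satisfy the cooling trigger condition;
    for each file that satisfies the cooling trigger condition, selecting, from among that file's storage paths, at least one storage path as a path from which the file is to be deleted; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially selecting a storage path corresponding to the storage medium whose predetermined performance parameter is higher; when there are multiple storage media whose predetermined performance parameter is higher, preferentially selecting a storage path corresponding to the storage medium with more used storage space;
    periodically deleting the files that satisfy the cooling trigger condition according to the selected deletion paths.
  4. The method according to claim 2, further including:
    when a file needs to be deleted, querying all storage paths of the file;
    deleting the file on each of the queried storage paths.
  5. The method according to claim 2, further including:
    when a file needs to be rewritten, querying the storage paths of the file and selecting one of the queried storage paths; using the selected storage path as the path for modifying the file; and deleting the file on the other queried storage paths.
  6. The method according to claim 2, wherein, before the file is written, the method further includes:
    splitting the original file into multiple segments, each segment being treated as one said file; saving the mapping between each segment and the original file, as well as the segment size;
    and before the file is read, the method further includes:
    determining the segment corresponding to the data to be read according to the offset between that data and the beginning of the original file and the segment size, and using the determined segment as the file to be read.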
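  7. An apparatus for performing file operations in a system in which multiple storage media coexist, the multiple storage media including at least one or more first storage media and one or more second storage media; the apparatus including:
    a file access statistics module, configured to screen out, according to the access statistics of the files stored in the system, the files that satisfy the heating trigger condition;
    a file migration module, configured to periodically copy the files that satisfy the heating trigger condition, and, if the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than a preset threshold, preferentially copy those files to the storage medium whose predetermined performance parameter is higher.
  8. The apparatus according to claim 7, further including:
    a storage management module, configured to, when a file needs to be written, select a storage medium for writing the file; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially select the storage medium whose predetermined performance parameter is lower;
    a file access module, configured to record the storage path of the file according to the root path of the selected storage medium, and use the recorded storage path as the path for writing the file; when a file needs to be read, query the storage paths of the file; and instruct the storage management module to select one of the queried storage paths as the path for reading the file.
  9. The apparatus according to claim 8, wherein:
    the file access statistics module is further configured to screen out, according to the access statistics of the files stored in the system, the files that satisfy the cooling trigger condition;
    the storage management module is further configured to, for each file that satisfies the cooling trigger condition, select, from among that file's storage paths, at least one storage path as a path from which the file is to be deleted; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially select a storage path corresponding to the storage medium whose predetermined performance parameter is higher; and when there are multiple storage media whose predetermined performance parameter is higher, preferentially select a storage path corresponding to the storage medium with more used storage space;
    the file migration module is further configured to periodically delete the files that satisfy the cooling trigger condition according to the selected deletion paths.
  10. The apparatus according to claim 8, wherein:
    the file access module is further configured to query all storage paths of a file when the file needs to be deleted;
    the file migration module is further configured to delete the file on each of the queried storage paths.
  11. The apparatus according to claim 8, wherein:
    the file access module is further configured to, when a file needs to be rewritten, query the storage paths of the file and instruct the storage management module to select one of the queried storage paths; use the selected storage path as the path for modifying the file; and instruct the file migration module to delete the file on the other queried storage paths.
  12. The apparatus according to claim 8, further including:
    a file service module, configured to split the original file into multiple segments before a file is written, each segment being treated as one said file; save the mapping between each segment and the original file, as well as the segment size; and, before a file is read, determine the segment corresponding to the data to be read according to the offset between that data and the beginning of the original file and the segment size, and use the determined segment as the file to be read.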
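  13. A system in which multiple storage media coexist, including:
    multiple kinds of storage media, which include at least a first storage medium and a second storage medium; there are one or more first storage media and one or more second storage media;
    a processor, configured to screen out, according to the access statistics of the files stored in the system, the files that satisfy the heating trigger condition; and periodically copy the files that satisfy the heating trigger condition, and, when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than a preset threshold, preferentially copy those files to the storage medium whose predetermined performance parameter is higher.
  14. The system according to claim 13, wherein:
    the processor is further configured to, when a file needs to be written, select a storage medium for writing the file; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially select the storage medium whose predetermined performance parameter is lower; record the storage path of the file according to the root path of the selected storage medium, and use the recorded storage path as the path for writing the file; when a file needs to be read, query the storage paths of the file; and select one of the queried storage paths as the path for reading the file.
  15. The system according to claim 14, wherein:
    the processor is further configured to screen out, according to the access statistics of the files stored in the system, the files that satisfy the cooling trigger condition; for each file that satisfies the cooling trigger condition, select, from among that file's storage paths, at least one storage path as a path from which the file is to be deleted; when the absolute value of the difference between the predetermined performance parameters of the first storage medium and the second storage medium is greater than the preset threshold, preferentially select a storage path corresponding to the storage medium whose predetermined performance parameter is higher; when there are multiple storage media whose predetermined performance parameter is higher, preferentially select a storage path corresponding to the storage medium with more used storage space; and periodically delete the files that satisfy the cooling trigger condition according to the selected deletion paths.
  16. The system according to claim 14, wherein:
    the processor is further configured to query all storage paths of a file when the file needs to be deleted, and delete the file on each of the queried storage paths.
  17. The system according to claim 14, wherein:
    the processor is further configured to, when a file needs to be rewritten, query the storage paths of the file and select one of the queried storage paths; use the selected storage path as the path for modifying the file; and delete the file on the other queried storage paths.
  18. The system according to claim 14, wherein:
    the processor is further configured to split the original file into multiple segments before a file is written, each segment being treated as one said file; save the mapping between each segment and the original file, as well as the segment size; and, before a file is read, determine the segment corresponding to the data to be read according to the offset between that data and the beginning of the original file and the segment size, and use the determined segment as the file to be read.
  19. A computer storage medium storing execution instructions, the execution instructions being used to perform the method according to any one of claims 1 to 6.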
PCT/CN2016/078398 2015-08-07 2016-04-01 System in which multiple storage media coexist, method and apparatus for performing file operations, and computer storage medium WO2017024802A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510483927.X 2015-08-07
CN201510483927.XA CN105183368A (zh) 2015-08-07 2015-08-07 System in which multiple storage media coexist, and method and apparatus for performing file operations

Publications (1)

Publication Number Publication Date
WO2017024802A1 true WO2017024802A1 (zh) 2017-02-16

Family

ID=54905477

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/078398 WO2017024802A1 (zh) 2015-08-07 2016-04-01 System in which multiple storage media coexist, method and apparatus for performing file operations, and computer storage medium

Country Status (2)

Country Link
CN (1) CN105183368A (zh)
WO (1) WO2017024802A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105183368A (zh) * 2015-08-07 2015-12-23 中兴通讯股份有限公司 System in which multiple storage media coexist, and method and apparatus for performing file operations

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
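JP2010033578A (ja) * 2008-07-30 2010-02-12 Samsung Electronics Co Ltd Data management method, recording medium, and data storage system
CN102364465A (zh) * 2011-09-30 2012-02-29 深圳市赫迪威信息技术有限公司 File storage method and storage cluster
CN102508789A (zh) * 2011-10-14 2012-06-20 浪潮电子信息产业股份有限公司 Method for hierarchical storage in a system
CN103139302A (zh) * 2013-02-07 2013-06-05 浙江大学 Real-time replica scheduling method considering load balancing
CN105183368A (zh) * 2015-08-07 2015-12-23 中兴通讯股份有限公司 System in which multiple storage media coexist, and method and apparatus for performing file operations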

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
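CN115718571A (zh) * 2022-11-23 2023-02-28 深圳计算科学研究院 Data management method and apparatus based on multi-dimensional features
CN115718571B (zh) * 2022-11-23 2023-08-22 深圳计算科学研究院 Data management method and apparatus based on multi-dimensional features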

Also Published As

Publication number Publication date
CN105183368A (zh) 2015-12-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16834428

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16834428

Country of ref document: EP

Kind code of ref document: A1