CN113407506A - Data processing method, device, equipment and storage medium - Google Patents

Data processing method, device, equipment and storage medium

Info

Publication number
CN113407506A
Authority
CN
China
Prior art keywords
file
local cache
cloud storage
read
operation request
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110762931.5A
Other languages
Chinese (zh)
Inventor
黄鹄
张一飞
林洁琬
李凯
吴德承
黄诗嵘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianyi Cloud Technology Co Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd
Priority to CN202110762931.5A
Publication of CN113407506A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/162 Delete operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/178 Techniques for file synchronisation in file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a data processing method, a data processing device, data processing equipment and a storage medium, and relates to the technical field of data processing. The method comprises the following steps: acquiring a file operation request; and performing file operation in a local cache according to the file operation request, and performing forward synchronization or reverse synchronization on the local cache and cloud storage, wherein the forward synchronization is to synchronize the data of the local cache to the cloud storage, and the reverse synchronization is to synchronize the data of the cloud storage to the local cache. The method improves the stability of the service system using the cloud storage.

Description

Data processing method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a data processing method, apparatus, device, and readable storage medium.
Background
With the shift from the Information Technology (IT) era to the Data Technology (DT) era, users' requirements for the capacity, performance, reliability and security of cloud storage keep increasing. Specifically, in the storage field, a user may use a local general-purpose file server, a high-end Network Attached Storage (NAS) device, and a cloud storage product at the same time. As a system evolves, users also need to protect their existing investment and make full use of their original equipment.
In the related art, a user's service system is connected separately to local storage and to cloud storage. When the user encounters network problems while using the cloud storage, the stability of the service system connected to it through the network is affected.
As described above, how to improve the stability of a service system that uses cloud storage is an urgent problem to be solved.
The above information disclosed in this background section is only for enhancement of understanding of the background of the disclosure and therefore it may contain information that does not constitute prior art that is already known to a person of ordinary skill in the art.
Disclosure of Invention
The purpose of the present disclosure is to provide a data processing method, apparatus, device and readable storage medium that improve, at least to a certain extent, the stability of a service system that uses cloud storage.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to an aspect of the present disclosure, there is provided a data processing method including: acquiring a file operation request; and performing file operation in a local cache according to the file operation request, and performing forward synchronization or reverse synchronization on the local cache and cloud storage, wherein the forward synchronization is to synchronize the data of the local cache to the cloud storage, and the reverse synchronization is to synchronize the data of the cloud storage to the local cache.
According to an embodiment of the present disclosure, the file operation request includes a file write request; the file operation is performed in a local cache according to the file operation request, and the forward synchronization or the reverse synchronization of the local cache and the cloud storage comprises: acquiring file data to be written according to the file writing request; writing the file data to be written into the local cache to generate an operation log; and writing the file data to be written into the cloud storage according to the operation log.
According to an embodiment of the present disclosure, the file operation request includes a file read request; the file operation is performed in a local cache according to the file operation request, and the forward synchronization or the reverse synchronization of the local cache and the cloud storage comprises: acquiring a file identifier to be read according to the file reading request; judging whether a file to be read corresponding to the file identification to be read is stored in the local cache or not according to the file identification to be read; and under the condition that the file to be read is not stored in the local cache, placing the file identification to be read in a reverse synchronization queue so as to synchronize the file to be read in the cloud storage to the local cache.
According to an embodiment of the present disclosure, the performing a file operation in a local cache according to the file operation request, and performing forward synchronization or reverse synchronization between the local cache and cloud storage further includes: and under the condition that the file to be read is stored in the local cache, reading the file to be read from the local cache according to the file identification to be read.
According to an embodiment of the present disclosure, the obtaining a file operation request includes: acquiring the file operation request through a Portable Operating System Interface (POSIX) standard file system interface.
According to an embodiment of the present disclosure, the obtaining a file operation request includes: intercepting system calling information corresponding to the file operation request in a kernel space of an operating system, wherein the system calling information is generated according to the file operation request sent by a service system; forwarding the system call information to a user space of the operating system; and analyzing the system calling information through a preset program of a user space of the operating system to acquire the file operation request.
According to an embodiment of the present disclosure, the method further comprises: acquiring an identifier of a file to be cleaned and the expiration time of the file to be cleaned corresponding to the identifier, wherein the file to be cleaned is stored in the local cache; judging whether the file to be cleaned has currently expired according to the expiration time; judging whether the file to be cleaned has been synchronized to the cloud storage according to the identifier of the file to be cleaned; and deleting the file to be cleaned from the local cache when the file to be cleaned has currently expired and has been synchronized to the cloud storage.
According to still another aspect of the present disclosure, there is provided a data processing apparatus including: the operation request acquisition module is used for acquiring a file operation request; and the file operation module is used for performing file operation in a local cache according to the file operation request and performing forward synchronization or reverse synchronization on the local cache and cloud storage, wherein the forward synchronization is to synchronize the data of the local cache to the cloud storage, and the reverse synchronization is to synchronize the data of the cloud storage to the local cache.
According to an embodiment of the present disclosure, the file operation request includes a file write request; the file operation module comprises: the to-be-written file data acquisition module is used for acquiring the to-be-written file data according to the file writing request; the file writing module is used for writing the file data to be written into the local cache to generate an operation log; and the forward synchronization module is used for writing the file data to be written into the cloud storage according to the operation log.
According to an embodiment of the present disclosure, the file operation request includes a file read request; the file operation module comprises: the file identifier acquisition module is used for acquiring the file identifier to be read according to the file reading request; the cache judging module is used for judging whether the local cache stores the file to be read corresponding to the file to be read identifier or not according to the file to be read identifier; and the reverse synchronization module is used for placing the file identification to be read in a reverse synchronization queue under the condition that the file to be read is not stored in the local cache, so that the file to be read in the cloud storage is synchronized to the local cache.
According to an embodiment of the present disclosure, the file operation module further includes: and the file reading module is used for reading the file to be read from the local cache according to the file identification to be read under the condition that the file to be read is stored in the local cache.
According to an embodiment of the present disclosure, the operation request obtaining module is further configured to obtain the file operation request through a Portable Operating System Interface (POSIX) standard file system interface.
According to an embodiment of the present disclosure, the operation request obtaining module includes: the system call information intercepting module is used for intercepting system call information corresponding to the file operation request in a kernel space of an operating system, and the system call information is generated according to the file operation request sent by a service system; the system calling information forwarding module is used for forwarding the system calling information to a user space of the operating system; and the system calling information analysis module is used for analyzing the system calling information through a preset program of a user space of the operating system to acquire the file operating request.
According to an embodiment of the present disclosure, the apparatus further comprises: a to-be-cleaned file information acquisition module, used for acquiring an identifier of a file to be cleaned and the expiration time of the file to be cleaned corresponding to the identifier, wherein the file to be cleaned is stored in the local cache; a failure judging module, used for judging whether the file to be cleaned has currently expired according to the expiration time; a synchronization judging module, used for judging whether the file to be cleaned has been synchronized to the cloud storage according to the identifier of the file to be cleaned; and a file deleting module, used for deleting the file to be cleaned from the local cache when the file to be cleaned has currently expired and has been synchronized to the cloud storage.
According to yet another aspect of the present disclosure, there is provided an apparatus comprising: a memory, a processor and executable instructions stored in the memory and executable in the processor, the processor implementing any of the methods described above when executing the executable instructions.
According to yet another aspect of the present disclosure, there is provided a computer-readable storage medium having stored thereon computer-executable instructions that, when executed by a processor, implement any of the methods described above.
According to the data processing method provided by the embodiments of the present disclosure, a file operation is performed in the local cache according to the acquired file operation request, and the local cache and the cloud storage undergo either forward synchronization, which synchronizes the data of the local cache to the cloud storage, or reverse synchronization, which synchronizes the data of the cloud storage to the local cache. By managing the local cache and the cloud storage in a unified manner, the user service system can access the data in the local cache on the service system side when the network environment is unstable, and the local cache and the cloud storage are synchronized after the network is restored, so that the stability of a service system that uses cloud storage can be improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings.
Fig. 1 shows a schematic diagram of a system architecture in an embodiment of the disclosure.
Fig. 2 shows a flow chart of a data processing method in an embodiment of the present disclosure.
Fig. 3 is a schematic diagram illustrating a processing procedure of step S202 shown in fig. 2 in an embodiment.
Fig. 4 shows a schematic flow chart of a request process according to fig. 2 and 3.
Fig. 5 is a schematic diagram illustrating a processing procedure of step S204 shown in fig. 2 in an embodiment.
Fig. 6 shows a schematic diagram of data flow in a file writing process in an embodiment of the present disclosure.
Fig. 7 shows a schematic processing procedure of step S204 shown in fig. 2 in another embodiment.
Fig. 8 shows a schematic diagram of data flow in a file reading process in an embodiment of the present disclosure.
Fig. 9 shows a flow chart of a data cleaning method according to an exemplary embodiment.
Fig. 10 shows a schematic diagram of a data cleaning process in an embodiment of the present disclosure.
Fig. 11 shows an architecture diagram of a file read-write processing system according to fig. 2 to 10.
Fig. 12 shows an architecture diagram of a caching file system according to fig. 2 to 11.
Fig. 13 shows a block diagram of a data processing apparatus in an embodiment of the present disclosure.
Fig. 14 shows a block diagram of another data processing apparatus in an embodiment of the present disclosure.
Fig. 15 shows a schematic structural diagram of an electronic device in an embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, apparatus, steps, etc. In other instances, well-known structures, methods, devices, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the disclosure.
Furthermore, the terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present disclosure, "a plurality" means at least two, e.g., two, three, etc., unless explicitly specifically limited otherwise. The symbol "/" generally indicates that the former and latter associated objects are in an "or" relationship.
In the present disclosure, unless otherwise expressly specified or limited, the terms "connected" and the like are to be construed broadly, e.g., as meaning electrically connected or in communication with each other; may be directly connected or indirectly connected through an intermediate. The specific meaning of the above terms in the present disclosure can be understood by those of ordinary skill in the art as appropriate.
When cloud storage is used, the data of a user service system can be divided into hot data and cold data according to how frequently the data is accessed, and the service system places higher requirements on the access performance of hot data. As described above, cross-metropolitan access may suffer from network instability, insufficient bandwidth, and the like, which may affect the performance and stability of cloud storage.
Therefore, the present disclosure provides a data processing method that manages a local cache and cloud storage in a unified manner, so that a user service system can access the data in the local cache on the service system side when the network environment is unstable, and the local cache and the cloud storage are synchronized after the network is restored, thereby improving the stability of a service system that uses cloud storage.
Fig. 1 shows an exemplary system architecture 10 to which the data processing method or data processing apparatus of the present disclosure may be applied.
As shown in fig. 1, system architecture 10 may include a terminal device 102, a network 104, and a server 106. The terminal device 102 may be any of a variety of electronic devices that have a display screen and support input and output, including but not limited to smart phones, tablets, laptop computers, desktop computers, wearable devices, virtual reality devices, smart home devices, and the like. Network 104 is the medium used to provide communication links between terminal device 102 and server 106. Network 104 may include various connection types, such as wired communication links, wireless communication links, or fiber optic cables. The server 106 may be a server or a cluster of servers that provides various services.
A user may use terminal device 102 to interact with server 106 via network 104 to receive or transmit data, etc. For example, the server 106 may include a processing server for cache scheduling, and the network 104 may be a local area network; a user operates a service system on the terminal device 102 and inputs a file operation request, and the terminal device 102 transmits the file operation request to the server 106 through the network 104 for processing. For another example, the server 106 may include a local storage server and a cloud storage server cluster for implementing a file system, the network 104 may be the Internet, and the processing server for cache scheduling may synchronize data between the local cache and the cloud storage through the network 104.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
FIG. 2 is a flow chart illustrating a method of data processing according to an exemplary embodiment. The method shown in fig. 2 may be applied to, for example, a server side of the system, and may also be applied to a terminal device of the system.
Referring to fig. 2, a method 20 provided by an embodiment of the present disclosure may include the following steps.
In step S202, a file operation request is acquired. The file operation request may include a file write request and a file read request.
In some embodiments, the data processing method provided by the embodiments of the present disclosure may be implemented by a cache file system developed on the basis of a user-mode file system. The user-mode file system (Filesystem in Userspace, FUSE, also called a user space file system) is a framework provided by the Linux system for file system development, with which a file system for Linux can be defined and implemented in the user space of the operating system. The cache file system processes the file operation requests of the service system and forwards them accordingly to a Network File System (NFS) or a journaling file system (XFS) for the file operation; for a specific embodiment, refer to figs. 3 and 4.
In some embodiments, the file operation request is obtained through a Portable Operating System Interface (POSIX) standard file system interface. The cache file system can provide a standard POSIX file system interface, so that any service system application that supports the POSIX standard interface can be connected to it without modification.
In step S204, a file operation is performed in the local cache according to the file operation request, and forward synchronization or reverse synchronization is performed between the local cache and the cloud storage, where the forward synchronization synchronizes data of the local cache to the cloud storage, and the reverse synchronization synchronizes data of the cloud storage to the local cache.
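As a rough illustration of what a standard POSIX file system interface means for the application, the following minimal sketch writes and reads a file with ordinary `open`/`write`/`read`/`close` calls. The names `save_report`, `load_report` and `mount_point` are hypothetical, and a temporary directory stands in for the directory where a cache file system would be mounted; the application code would look the same in either case.

```python
import os
import tempfile


def save_report(path, payload):
    """Write bytes with plain POSIX-style calls; nothing here knows about a cache."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, payload)
    finally:
        os.close(fd)


def load_report(path):
    """Read the whole file back with plain POSIX-style calls."""
    fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, 4096)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)


# Assumption: a temporary directory stands in for the (hypothetical) mount point
# of the cache file system.
with tempfile.TemporaryDirectory() as mount_point:
    report = os.path.join(mount_point, "report.txt")
    save_report(report, b"quarterly totals")
    restored = load_report(report)
```

Because only the path changes, the same functions could be pointed at a local disk, an NFS mount, or a FUSE mount without touching the application logic.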
In some embodiments, for example, when the file operation request is a file write request, data written by the client through the standard interface in the service system first enters the local cache; an operation log is generated at the same time and the operation result is returned, and the back-end data synchronization module forward-synchronizes the data to the cloud storage according to the operation log. Reference may be made to figs. 5, 6 and 11 for a specific embodiment.
In some embodiments, for example, when the file operation request is a file read request, the read request sent by the service system is preferentially served from the local cache. When the local cache does not hold the required file, the identifier of the missing file (for example, a file name) is placed in a reverse synchronization queue (the data synchronization module synchronizes the files in the queue from the cloud storage to the local cache), and the file is returned after the reverse synchronization is completed. Reference may be made to figs. 7, 8 and 11 for a specific embodiment.
According to the data processing method provided by the embodiments of the present disclosure, a file operation is performed in the local cache according to the acquired file operation request, and the local cache and the cloud storage undergo either forward synchronization, which synchronizes the data of the local cache to the cloud storage, or reverse synchronization, which synchronizes the data of the cloud storage to the local cache. By managing the local cache and the cloud storage in a unified manner, the user service system can access the data in the local cache on the service system side when the network environment is unstable, and the local cache and the cloud storage are synchronized after the network is restored, so that the stability of a service system that uses cloud storage can be improved.
Fig. 3 is a schematic diagram illustrating a processing procedure of step S202 shown in fig. 2 in an embodiment. As shown in fig. 3, in the embodiment of the present disclosure, the step S202 may further include the following steps.
Step S302, intercepting system call information corresponding to the file operation request in the kernel space of the operating system, wherein the system call information is generated according to the file operation request sent by the service system.
Step S304, the system call information is forwarded to the user space of the operating system.
Step S306, analyzing the system calling information through a preset program of a user space of the operating system, and acquiring a file operation request.
In some embodiments, when a service system requests to read or write a file through the cache file system, a system call is generated. The information of the system call is intercepted in the operating system kernel and forwarded to the user space, and a program in the user space parses the call information and then forwards the request to a specific file system so as to read or write the file.
In some embodiments, fig. 4 shows a request processing flow according to figs. 2 and 3. As shown in fig. 4, a client operates through an application 4002 of the service system and calls a file system interface (S402) to operate on a file under an export directory. The operating system kernel 4004 intercepts the file operation request under that directory and passes (S404) the request information to a request scheduling module of the user space file system 4006. The request scheduling module operates on data in the cache, schedules (S406) the actual back-end file system 4008 according to the cache policy and the request type, and determines whether a synchronization log needs to be recorded and, if so, the synchronization type. It then returns a response, and the file operation result is finally returned.
According to the file operation request processing method provided by the embodiments of the present disclosure, different file systems can be combined freely and flexibly through the cache file system. The user space file system combines a high-speed, reliable cache file system with a low-cost, high-capacity, easily expandable cloud file system and provides a unified file system that supports a standard interface, so the service application program does not need to be modified.
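The decision made by the request scheduling module can be sketched roughly as follows. This is only an illustration under stated assumptions: `FileRequest`, `plan_sync` and the string labels are hypothetical names, and the cache policy is reduced to "is the path already in the local cache". It shows how the request type and the cache state could determine whether a synchronization record is needed and of which type.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class FileRequest:
    """A parsed file operation request (hypothetical structure)."""
    op: str          # "read" or "write"
    path: str
    data: bytes = b""


def plan_sync(request: FileRequest, cached_paths: Set[str]) -> Optional[str]:
    """Return the synchronization type to record for a request, or None.

    - A write lands in the local cache and must later go to the cloud: "forward".
    - A read that misses the local cache must pull from the cloud: "reverse".
    - A read that hits the local cache needs no synchronization.
    """
    if request.op == "write":
        return "forward"
    if request.op == "read":
        return None if request.path in cached_paths else "reverse"
    raise ValueError(f"unsupported operation: {request.op!r}")
```

A real scheduler would also carry out the operation against the back-end file system; the sketch isolates only the bookkeeping decision.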
Fig. 5 is a schematic diagram illustrating a processing procedure of step S204 shown in fig. 2 in an embodiment. As shown in fig. 5, in the embodiment of the present disclosure, the step S204 may further include the following steps.
Step S502, obtaining the file data to be written according to the file writing request.
In some embodiments, the file write request may be in the form of, for example, a link to the file data to be written, or the file data itself to be written.
Step S504, writing the file data to be written into the local cache to generate an operation log.
Step S506, writing the file data to be written into the cloud storage according to the operation log.
In some embodiments, for example, fig. 6 shows a schematic diagram of data flow in a file writing process. As shown in fig. 6, when a client service system 602 initiates a write request to a cache file system 604, the cache file system 604 first writes the data to the local cache (cache disk 608) and appends a record of the write operation to an operation log 606; when the system triggers a synchronization operation, the forward synchronization module 610 performs the same file data writing operation in the cloud storage 612 according to the operation log.
According to the file writing method provided by the embodiments of the present disclosure, hot data can be kept, according to a certain policy, in a cache close to the service system. When the network environment is unstable, the user service system can still access data normally from the local cache on the service system side, and the cache synchronizes the data after the network is restored, so as to keep the data in the cloud storage consistent with the data in the local storage. Because the hot data is stored on the service system side, the user's performance requirements are met, the access speed can be improved, and the bandwidth cost is reduced.
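The write flow of steps S502 to S506 can be sketched as below. This is a minimal sketch, not the patented implementation: the class name `LoggedWriteCache`, the JSON-lines log format, and the use of a second local directory as a stand-in for cloud storage are all assumptions made for illustration. The point it shows is that the write returns as soon as the local cache and the operation log are updated, while the cloud copy is produced later by replaying the log.

```python
import json
import os


class LoggedWriteCache:
    """Hypothetical sketch: local cache directory plus an append-only operation log."""

    def __init__(self, cache_dir, cloud_dir, log_path):
        self.cache_dir = cache_dir
        self.cloud_dir = cloud_dir    # assumption: a directory stands in for cloud storage
        self.log_path = log_path
        self.synced_entries = 0       # number of log entries already replayed

    def write(self, name, data):
        # Step S504: write to the local cache first, then record the operation.
        with open(os.path.join(self.cache_dir, name), "wb") as f:
            f.write(data)
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"op": "write", "name": name}) + "\n")
        # The caller gets its result without waiting for the cloud.
        return len(data)

    def forward_sync(self):
        # Step S506: replay the log entries that have not been synchronized yet.
        if not os.path.exists(self.log_path):
            return 0
        with open(self.log_path, encoding="utf-8") as log:
            entries = [json.loads(line) for line in log if line.strip()]
        pending = entries[self.synced_entries:]
        for entry in pending:
            with open(os.path.join(self.cache_dir, entry["name"]), "rb") as f:
                data = f.read()
            with open(os.path.join(self.cloud_dir, entry["name"]), "wb") as f:
                f.write(data)
            self.synced_entries += 1
        return len(pending)
```

If the network is down, `forward_sync` can simply be retried later; entries stay in the log until they have been replayed, which is what lets writes continue locally in the meantime.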
Fig. 7 shows a schematic processing procedure of step S204 shown in fig. 2 in another embodiment. As shown in fig. 7, in the embodiment of the present disclosure, the step S204 may further include the following steps.
Step S702, obtaining the file identification to be read according to the file reading request.
Step S704, determining whether the file to be read corresponding to the file identifier to be read has been stored in the local cache according to the file identifier to be read.
Step S706, under the condition that the file to be read is not stored in the local cache, placing the file identifier to be read in the reverse synchronization queue, so as to synchronize the file to be read in the cloud storage to the local cache.
In step S708, the file to be read is read from the local cache according to the file identifier to be read when the file to be read is stored in the local cache.
In some embodiments, for example, fig. 8 shows a data flow diagram in a file reading process, as shown in fig. 8, when a client service system 802 initiates a read request to a cache file system 804, the cache file system 804 first searches for a file to be read from a cache disk 808, if required data exists, the file to be read is directly returned, if the required data does not exist, an identifier of the file to be read is placed in a request queue 806, and meanwhile, an inverse synchronization module 810 is triggered to consume the request queue 806, the inverse synchronization module 810 performs inverse synchronization one by one, synchronizes corresponding file data from a cloud storage 812 to the cache disk 808, and after synchronization is completed, the cache file system 804 returns the required data to the client service system 802.
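The read flow of steps S702 to S708 can be sketched as follows. This is an illustrative sketch under stated assumptions: `ReadThroughCache` is a hypothetical name, a local directory stands in for cloud storage, and the reverse synchronization queue is drained synchronously in the same thread rather than by a separate module.

```python
import os
import queue
import shutil


class ReadThroughCache:
    """Hypothetical sketch: serve reads from a local cache, filling misses from the cloud."""

    def __init__(self, cache_dir, cloud_dir):
        self.cache_dir = cache_dir
        self.cloud_dir = cloud_dir            # assumption: a directory stands in for cloud storage
        self.reverse_queue = queue.Queue()    # identifiers of files missing from the cache

    def _reverse_sync(self):
        # Consume the queue one item at a time, copying each file cloud -> cache.
        while not self.reverse_queue.empty():
            name = self.reverse_queue.get()
            shutil.copyfile(
                os.path.join(self.cloud_dir, name),
                os.path.join(self.cache_dir, name),
            )

    def read(self, name):
        path = os.path.join(self.cache_dir, name)
        if not os.path.exists(path):
            # Step S706: cache miss, so enqueue the identifier and trigger reverse sync.
            self.reverse_queue.put(name)
            self._reverse_sync()
        # Step S708: the file is now in the local cache (or was there already).
        with open(path, "rb") as f:
            return f.read()
```

In this sketch a file that is missing from both the cache and the cloud surfaces as a `FileNotFoundError` from `shutil.copyfile`; how such a case should be reported is not specified in the text above.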
According to the file reading method provided by the embodiment of the disclosure, the data which is hot is transferred to the cache for reading, so that the influence of network jitter and interruption in a short time can be smoothed when a user uses cloud storage.
FIG. 9 is a flow chart illustrating a method of data cleaning in accordance with an exemplary embodiment. The method shown in fig. 9 may be used for the above-described cache file system.
Referring to fig. 9, a method 90 provided by embodiments of the present disclosure may include the following steps.
In step S902, the file identifier to be cleaned and the expiration time of the file to be cleaned corresponding to the file identifier to be cleaned are obtained, and the file to be cleaned is stored in the local cache.
In step S904, it is determined whether the file to be cleaned is currently invalid according to the expiration time.
In step S906, it is determined whether the file to be cleaned is synchronized to the cloud storage according to the file to be cleaned identifier.
In step S908, in the case where the file to be cleaned has currently failed and is synchronized to the cloud storage, the file to be cleaned is deleted from the local cache.
In some embodiments, fig. 10 shows a data cleaning flow chart. Taking fig. 10 as an example, it may first be determined whether a file to be cleaned has expired and then whether it has been synchronized to the cloud storage. As shown in fig. 10, the cache scheduling module specifies a cleaning directory and a file expiration time and triggers a cache cleaning operation (S1002). The cache cleaning module in the cache file system traverses all files in the specified directory of the local storage (S1004). For each file, it first checks whether the current time exceeds the expiration time of the current file (S1006); if not, it returns to step S1004 to check the next file. If so, it checks whether the current file has been synchronized to the cloud storage (S1008); if not, it returns to step S1004 to check the next file, and if so, it adds the current file to the deletion queue and deletes the file (S1010).
In some embodiments, it may also be determined whether the file to be cleaned is synchronized to the cloud storage first and then determined whether the file to be cleaned fails, and the remaining steps are similar to those in fig. 10 and are not described herein again.
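Because deletion requires both conditions to hold, the order of the two checks does not change the result. A minimal sketch of the cleaning pass is given below; the function name `clean_cache`, the `expires_at` mapping of file identifiers to absolute expiration timestamps, and the `synced_ids` set are assumptions introduced for illustration.

```python
def clean_cache(cache_disk, expires_at, synced_ids, now):
    """Delete files that are both expired and already synchronized to the cloud.

    cache_disk: dict mapping file id -> data (stands in for the local cache)
    expires_at: dict mapping file id -> expiration timestamp
    synced_ids: set of file ids known to be synchronized to cloud storage
    now:        current timestamp, in the same units as expires_at
    """
    deleted = []
    for file_id in list(cache_disk):      # copy keys: the dict is mutated below
        expired = now > expires_at[file_id]
        if expired and file_id in synced_ids:
            del cache_disk[file_id]
            deleted.append(file_id)
    return deleted
```

Files that are expired but not yet synchronized are kept, so no data is lost from the local cache before it reaches cloud storage.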
According to the data cleaning method provided by the embodiment of the disclosure, the cache data is managed in the file system, and the cooled data is transferred to the cloud storage system, so that the access speed can be increased and the bandwidth cost can be reduced.
Fig. 11 shows an architecture diagram of a file read-write processing system according to fig. 2 to 10. As shown in fig. 11, the business system 1102 is connected to the cache 1104 through a POSIX standard interface, and the business system 1102 can access the cache 1104 either locally or from the cloud. The cache 1104 includes three parts: a cache file system 11044, a data synchronization module 11046 and a cache cleaning module 11048. The request scheduling module in the cache scheduling module 1104 may process the file operation request, and a specific embodiment may refer to fig. 3 and 4. The cache scheduling module 1104 may also control file data synchronization between the local storage 1106 and the cloud storage 1108, and a specific synchronization method may refer to fig. 5 to 8. The cache scheduling module 1104 may also control the cache disk 11049 to clear expired data, where the cache disk 11049 is a virtual disk corresponding to a local cache space in the local storage 1106 and is used as the virtual cache space of the cache file system. The local storage and the cloud storage are managed uniformly through the cache file system, which provides a unified file system supporting a standard interface, so the client's business system does not need to be modified in order to access it.
Fig. 12 shows an architecture diagram of a cache file system according to fig. 2 to 11. As shown in FIG. 12, a cache file system 1204 is developed based on a user-mode file system and provides a standard POSIX file system interface 1203, so any application 1202 of the business system that supports the POSIX standard interface can be connected without modification. When a service system reads and writes files through the cache file system 1204, a system call is generated, and the system call information is first intercepted by the virtual file system 1206 in the operating system kernel. The virtual file system 1206 defines a set of standard interfaces and calls a specific implementation, such as XFS, EXT4 (fourth extended file system), or FUSE (Filesystem in Userspace), according to the file system type corresponding to the mount path. After the relevant interface of the virtual file system 1206, for example FUSE, is invoked, the FUSE kernel module transfers the system call information to the user space. The cache file system 1204 running in the user space may then perform operations such as request scheduling (S12042) and log recording (S12044) according to the system call information. The cache file system 1204 also provides an interface 12046 for updating configuration, data synchronization, cache cleaning and the like.
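The user-space side of this flow (parse the forwarded system call information, record a log entry, dispatch to a handler) might look like the following sketch. It does not use any real FUSE binding: the dictionary shape of `syscall_info`, the function names `parse_request` and `schedule`, and the `handlers` table are hypothetical stand-ins for whatever the kernel module actually delivers.

```python
import logging

logger = logging.getLogger("cache_fs")


def parse_request(syscall_info):
    """Turn forwarded system call information into (operation, path, payload)."""
    return syscall_info["op"], syscall_info["path"], syscall_info.get("data")


def schedule(syscall_info, handlers):
    """Log the request, then dispatch it to the handler for its operation."""
    op, path, payload = parse_request(syscall_info)
    logger.info("request op=%s path=%s", op, path)   # log recording
    if op not in handlers:
        raise ValueError(f"unsupported operation: {op}")
    return handlers[op](path, payload)               # request scheduling
```

Keeping the dispatch table separate from the parser lets read, write, and cleaning handlers be registered independently, which loosely matches the modular split between request scheduling and log recording described above.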
An application scenario of the method provided by the embodiment of the present disclosure is described below.
Scenario 1: A hospital generates a large number of image files each day. These were originally stored on a hard disk local to the hospital, but the rapid growth of image files quickly exceeded the capacity of the local disk. The hospital wants the images to be saved in a cloud file system. Meanwhile, to ensure performance and reliability, all images from the last half year are to be kept locally, images older than half a year should still be readable when needed, and the application should not need to switch storage directories when reading images. By adopting the method provided by the embodiment of the disclosure, the local hard disk storage and the cloud file system are managed uniformly through the cache file system, the hospital can make full use of its existing local storage resources and cloud resources, hierarchical storage is realized automatically, and the hospital's various systems do not need to be modified.
Scenario 2: An animation rendering enterprise stores a large number of material files in local Network Attached Storage (NAS). The enterprise has multiple branches nationwide, and employees of each branch need to write and process the material files to produce material files. Animation rendering requires a lot of computing power, so to save cost, a Graphics Processing Unit (GPU) from a cloud provider may be used for rendering. If files are read from the local NAS every time during rendering, a large amount of transmission time is needed; caching files that have been used and not updated in the cloud can increase the rendering speed. However, since the rendering software is commercial software, the user has no way to modify it for this purpose. By adopting the method provided by the embodiment of the disclosure, a user-space cache file system is provided for the user, the user's local NAS and cloud storage space are managed uniformly, and the user's requirements can be met.
FIG. 13 is a block diagram illustrating a data processing apparatus according to an example embodiment. The apparatus shown in fig. 13 may be applied to, for example, a server side of the system, and may also be applied to a terminal device of the system.
Referring to fig. 13, the apparatus 130 provided by the embodiment of the present disclosure may include an operation request obtaining module 1302 and a file operation module 1304.
The operation request obtaining module 1302 may be configured to obtain a file operation request.
The file operation module 1304 may be configured to perform a file operation in the local cache according to the file operation request, and perform forward synchronization or reverse synchronization on the local cache and the cloud storage, where the forward synchronization is to synchronize data of the local cache to the cloud storage, and the reverse synchronization is to synchronize data of the cloud storage to the local cache.
FIG. 14 is a block diagram illustrating another data processing apparatus according to an example embodiment. The apparatus shown in fig. 14 may be applied to, for example, a server side of the system, or may be applied to a terminal device of the system.
Referring to fig. 14, the apparatus 140 provided in the embodiment of the present disclosure may include an operation request obtaining module 1402, a file operation module 1404, a to-be-cleaned file information obtaining module 1406, a failure determining module 1408, a synchronization determining module 1410, and a file deleting module 1412, where the operation request obtaining module 1402 may include a system call information capturing module 14022, a system call information forwarding module 14024, and a system call information parsing module 14026, and the file operation module 1404 may include a to-be-written file data obtaining module 14042, a file writing module 14044, a forward synchronization module 14046, a to-be-read file identification obtaining module 14048, a cache determining module 140410, a reverse synchronization module 140412, and a file reading module 140414.
The operation request obtaining module 1402 may be configured to obtain a file operation request. The file operation request includes a file write request and a file read request.
The operation request obtaining module 1402 is further configured to obtain a file operation request through a portable operating system interface standard file system interface.
The system call information capturing module 14022 may be configured to capture system call information corresponding to the file operation request in a kernel space of the operating system, where the system call information is generated according to the file operation request sent by the service system.
System call information forwarding module 14024 is operable to forward the system call information to the user space of the operating system.
The system call information parsing module 14026 may be configured to parse the system call information through a preset program in a user space of the operating system, and obtain a file operation request.
The file operation module 1404 may be configured to perform a file operation in the local cache according to the file operation request, and perform forward synchronization or reverse synchronization on the local cache and the cloud storage, where the forward synchronization is to synchronize data of the local cache to the cloud storage, and the reverse synchronization is to synchronize data of the cloud storage to the local cache.
The to-be-written file data obtaining module 14042 may be configured to obtain the to-be-written file data according to the file writing request.
The file writing module 14044 may be configured to write the file data to be written into the local cache, and generate an operation log.
The forward synchronization module 14046 may be configured to write the file data to be written to the cloud storage according to the operation log.
The to-be-read file identifier obtaining module 14048 may be configured to obtain a to-be-read file identifier according to the file reading request.
The cache determining module 140410 may be configured to determine whether the file to be read corresponding to the file identifier to be read has been stored in the local cache according to the file identifier to be read.
The reverse synchronization module 140412 may be configured to, in a case that the file to be read is not stored in the local cache, place the file to be read identifier in a reverse synchronization queue, so as to synchronize the file to be read in the cloud storage to the local cache.
The file reading module 140414 may be configured to, in a case that the file to be read has been stored in the local cache, read the file to be read from the local cache according to the file identifier to be read.
The to-be-cleaned file information obtaining module 1406 may be configured to obtain the to-be-cleaned file identifier and the expiration time of the to-be-cleaned file corresponding to the to-be-cleaned file identifier, where the to-be-cleaned file is already stored in the local cache.
The failure determination module 1408 may be configured to determine whether the file to be cleaned is currently failed according to the failure time.
The synchronization determining module 1410 may be configured to determine whether the file to be cleaned is synchronized to the cloud storage according to the file to be cleaned identifier.
File deletion module 1412 may be used to delete the file to be cleaned from the local cache if it is currently stale and synchronized to cloud storage.
The specific implementation of each module in the apparatus provided in the embodiment of the present disclosure may refer to the content in the foregoing method, and is not described herein again.
Fig. 15 shows a schematic structural diagram of an electronic device in an embodiment of the present disclosure. It should be noted that the apparatus shown in fig. 15 is only an example of a computer system, and should not bring any limitation to the function and the scope of the application of the embodiments of the present disclosure.
As shown in fig. 15, the device 1500 includes a Central Processing Unit (CPU)1501 which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)1502 or a program loaded from a storage section 1508 into a Random Access Memory (RAM) 1503. In the RAM 1503, various programs and data necessary for the operation of the device 1500 are also stored. The CPU1501, the ROM 1502, and the RAM 1503 are connected to each other by a bus 1504. An input/output (I/O) interface 1505 is also connected to bus 1504.
The following components are connected to the I/O interface 1505: an input portion 1506 including a keyboard, a mouse, and the like; an output portion 1507 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage portion 1508 including a hard disk and the like; and a communication section 1509 including a network interface card such as a LAN card, a modem, or the like. The communication section 1509 performs communication processing via a network such as the internet. A drive 1510 is also connected to the I/O interface 1505 as needed. A removable medium 1511 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 1510 as necessary, so that a computer program read out therefrom is mounted into the storage section 1508 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 1509, and/or installed from the removable medium 1511. The above-described functions defined in the system of the present disclosure are executed when the computer program is executed by the Central Processing Unit (CPU) 1501.
It should be noted that the computer readable media shown in the present disclosure may be computer readable signal media or computer readable storage media or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present disclosure may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor includes an operation request acquisition module and a file operation module. The names of these modules do not form a limitation to the modules themselves in some cases, for example, the operation request obtaining module may also be described as a "module for obtaining a file operation request through a portable operating system interface standard file system interface".
As another aspect, the present disclosure also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the device. The computer readable medium carries one or more programs which, when executed by a device, cause the device to perform:
acquiring a file operation request; and performing file operation in the local cache according to the file operation request, and performing forward synchronization or reverse synchronization on the local cache and the cloud storage, wherein the forward synchronization is to synchronize the data of the local cache to the cloud storage, and the reverse synchronization is to synchronize the data of the cloud storage to the local cache.
Exemplary embodiments of the present disclosure are specifically illustrated and described above. It is to be understood that the present disclosure is not limited to the precise arrangements or instrumentalities described herein; on the contrary, the disclosure is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (10)

1. A data processing method, comprising:
acquiring a file operation request;
and performing file operation in a local cache according to the file operation request, and performing forward synchronization or reverse synchronization on the local cache and cloud storage, wherein the forward synchronization is to synchronize the data of the local cache to the cloud storage, and the reverse synchronization is to synchronize the data of the cloud storage to the local cache.
2. The method of claim 1, wherein the file operation request comprises a file write request;
the file operation is performed in a local cache according to the file operation request, and the forward synchronization or the reverse synchronization of the local cache and the cloud storage comprises:
acquiring file data to be written according to the file writing request;
writing the file data to be written into the local cache to generate an operation log;
and writing the file data to be written into the cloud storage according to the operation log.
3. The method of claim 1, wherein the file operation request comprises a file read request;
the file operation is performed in a local cache according to the file operation request, and the forward synchronization or the reverse synchronization of the local cache and the cloud storage comprises:
acquiring a file identifier to be read according to the file reading request;
judging whether a file to be read corresponding to the file identification to be read is stored in the local cache or not according to the file identification to be read;
and under the condition that the file to be read is not stored in the local cache, placing the file identification to be read in a reverse synchronization queue so as to synchronize the file to be read in the cloud storage to the local cache.
4. The method of claim 3, wherein the performing the file operation in the local cache according to the file operation request, and the forward or reverse synchronizing the local cache with the cloud storage further comprises:
and under the condition that the file to be read is stored in the local cache, reading the file to be read from the local cache according to the file identification to be read.
5. The method of claim 1, wherein acquiring the file operation request comprises:
and acquiring the file operation request through a portable operating system interface standard file system interface.
6. The method of claim 1, wherein acquiring the file operation request comprises:
intercepting system calling information corresponding to the file operation request in a kernel space of an operating system, wherein the system calling information is generated according to the file operation request sent by a service system;
forwarding the system call information to a user space of the operating system;
and analyzing the system calling information through a preset program of a user space of the operating system to acquire the file operation request.
7. The method of claim 1, further comprising:
acquiring a file identifier to be cleaned and the failure time of the file to be cleaned corresponding to the file identifier to be cleaned, wherein the file to be cleaned is stored in the local cache;
judging whether the file to be cleaned is invalid currently according to the invalidation time;
judging whether the file to be cleaned is synchronized to the cloud storage or not according to the file to be cleaned identifier;
and deleting the file to be cleaned from the local cache under the condition that the file to be cleaned is invalid currently and is synchronized to the cloud storage.
8. A data processing apparatus, comprising:
the operation request acquisition module is used for acquiring a file operation request;
and the file operation module is used for performing file operation in a local cache according to the file operation request and performing forward synchronization or reverse synchronization on the local cache and cloud storage, wherein the forward synchronization is to synchronize the data of the local cache to the cloud storage, and the reverse synchronization is to synchronize the data of the cloud storage to the local cache.
9. An apparatus, comprising: memory, processor and executable instructions stored in the memory and executable in the processor, characterized in that the processor implements the method according to any of claims 1-7 when executing the executable instructions.
10. A computer-readable storage medium having stored thereon computer-executable instructions, which when executed by a processor, implement the method of any one of claims 1-7.
CN202110762931.5A 2021-07-06 2021-07-06 Data processing method, device, equipment and storage medium Pending CN113407506A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110762931.5A CN113407506A (en) 2021-07-06 2021-07-06 Data processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113407506A true CN113407506A (en) 2021-09-17

Family

ID=77685299

Country Status (1)

Country Link
CN (1) CN113407506A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220210

Address after: 100007 room 205-32, floor 2, building 2, No. 1 and No. 3, qinglonghutong a, Dongcheng District, Beijing

Applicant after: Tianyiyun Technology Co.,Ltd.

Address before: No.31, Financial Street, Xicheng District, Beijing, 100033

Applicant before: CHINA TELECOM Corp.,Ltd.
