WO2014128819A1 - Information processing system and data synchronization control method thereof - Google Patents
Information processing system and data synchronization control method thereof
- Publication number
- WO2014128819A1 (PCT/JP2013/053909)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- file
- update
- computer
- data synchronization
- synchronization
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/178—Techniques for file synchronisation in file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1448—Management of the data involved in backup or backup restore
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/104—Peer-to-peer [P2P] networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
Definitions
- the present invention relates to the efficiency of data synchronization processing between file servers operating at a plurality of locations.
- Patent Document 1 discloses a system in which the NAS (Network Attached Storage) file system at one site can be referenced read-only from another site's NAS via a CAS (Content-Addressable Storage) device at a data center.
- the site that publishes the file system periodically (for example, once a day) performs a synchronization process that reflects its updates to the data center.
- the other sites periodically reflect the updates accumulated at the data center into their own sites, thereby realizing file sharing between the sites.
- Patent Document 2 discloses a technique for resolving conflicts during file system data synchronization.
- Patent Document 3 discloses a mechanism that performs exclusive control (locking) of the file system namespace in units of subtrees.
- the name space here is a management structure for managing files in the file system, and generally has a tree structure in which directories are nodes and files are leaves.
- a subtree is a part of a tree structure of a file system.
- with this mechanism, the synchronization range of the file system can be made finer, in units of files or directories.
- however, exclusive control over a WAN suffers from large communication delays. If the exclusion range is made finer, the delay of each synchronization process can be kept short, but the number of exclusion (lock) operations increases and the throughput of the synchronization process decreases. Conversely, if the exclusion range is enlarged, the number of exclusion operations decreases, but the problem of the synchronization delay time remains unsolved.
- the present invention has been made in view of such a situation, and performs synchronization processing of a file system by dividing it into sub-trees of an appropriate size.
- the file system to be synchronized is divided into subtrees based on the sizes of the files to be synchronized, the throughput between the sites and the data center, and the upper limit of the synchronization processing time. The divided subtrees are then synchronized in descending order of conflict occurrence frequency. In other words, files with a high conflict frequency, which are likely to be accessed at other sites, are given a small exclusion range and synchronized with priority, thereby reducing their synchronization delay time.
- files with a low conflict occurrence frequency are unlikely to be accessed at other sites, so high throughput is achieved by synchronizing many of them at once.
- that is, for files with many conflicts, such as group shared files, the delay time of the synchronization processing is reduced by reducing the number of files handled in one synchronization process, while for files with few conflicts, such as personal work files and archives, high throughput is achieved by increasing the number of files handled in one synchronization process.
- as a result, the increase in delay time can be suppressed while also suppressing the decrease in synchronization throughput, and more sites can be supported.
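- As a rough illustration of this trade-off, the following sketch (not part of the embodiment; all numbers and names are assumed) estimates the per-subtree synchronization delay and the number of lock operations for different subtree sizes over a WAN.

```python
# Illustrative sketch (not from the patent): trade-off between exclusion
# (lock) granularity and synchronization delay/throughput over a WAN.

def estimate(num_files, avg_file_mb, files_per_subtree,
             wan_throughput_mbps, lock_roundtrip_s):
    """Return (per-subtree delay in seconds, number of lock operations)."""
    subtrees = -(-num_files // files_per_subtree)   # ceiling division
    per_subtree_data_s = files_per_subtree * avg_file_mb * 8 / wan_throughput_mbps
    per_subtree_delay_s = lock_roundtrip_s + per_subtree_data_s
    return per_subtree_delay_s, subtrees

if __name__ == "__main__":
    # Hypothetical numbers: 10,000 updated files of 1 MB each, a 100 Mbps WAN,
    # and 0.2 s per lock acquisition/release round trip.
    for k in (10, 100, 1000):
        delay, locks = estimate(10_000, 1.0, k, 100.0, 0.2)
        print(f"{k:>5} files/subtree: delay per subtree = {delay:6.1f} s, "
              f"lock operations = {locks}")
```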
- the present invention relates to a technique for managing data in a storage system of an information processing system, and more specifically to a technique for transferring data stored in a NAS to a CAS and synchronizing the data between NAS devices.
- in the following, information used in the present invention is described using tables and lists as examples.
- however, the information is not limited to table or list structures; it may be information that does not depend on any particular data structure.
- the NAS and CAS communication networks are not limited to the adoption of WAN, and a communication network such as a LAN (Local Area Network) can also be adopted.
- Aspects of the present invention are not limited to adopting the NFS (Network File System) protocol; other file sharing protocols, including CIFS (Common Internet File System) and HTTP (Hypertext Transfer Protocol), can also be adopted.
- NAS is used as the storage device on the site side, but this is only an example. It is also possible to use a distributed file system such as a CAS device, HDFS (Hadoop Distributed File System), or Object-based storage as the storage device on the site side. Further, although a CAS device is used as a data center storage device, this is merely an example. In addition to the CAS device, for example, a NAS device, a distributed file system, and Object-based storage can be used.
- in the following, each process may be described with a "program" as the subject; since a program performs its predetermined processing by being executed by a processor using memory and a communication port (communication control device), the description may equally be read with the processor as the subject.
- processing disclosed with a program as the subject may be processing performed by a computer such as a management server or another information processing apparatus. Part or all of a program may be implemented by dedicated hardware, and programs may be modularized.
- Various programs may be installed in each computer by a program distribution server or a storage medium.
- FIG. 1 is a block diagram showing an example of a physical configuration of an information processing system according to an embodiment of the present invention.
- bases A and B are shown, but more bases may be included in the system, and the configuration of each base may be the same.
- the information processing system 10 includes one or more sub computer systems 100 and 110 arranged at the respective sites and a data center system 120 comprising a CAS device 121; the sub computer systems 100 and 110 and the data center system 120 are connected via the networks 130 and 140.
- the sub computer systems 100 and 110 have clients 101 and 111 and NAS devices 102 and 112, which are connected by networks 105 and 115.
- the clients 101 and 111 are one or a plurality of computers that use the file sharing service provided by the NAS devices 102 and 112.
- the clients 101 and 111 use a file sharing service provided by the NAS devices 102 and 112 via the networks 105 and 115 using a file sharing protocol such as NFS or CIFS.
- the administrator accesses the management interface provided by the NAS devices 102 and 112 from the clients 101 and 111, and manages the NAS devices 102 and 112.
- Such management includes, for example, starting operation of the file server, stopping the file server, creating / publishing the file system, managing accounts of the clients 101 and 111, and the like.
- the plurality of NAS devices 102 may be collectively referred to as the NAS device 102 in some cases.
- the NAS devices 102 and 112 include NAS controllers 103 and 113 and storage devices 104 and 114.
- the NAS controllers 103 and 113 provide a file sharing service to the clients 101 and 111, and have a function to cooperate with the CAS device 121.
- the NAS controllers 103 and 113 store various files created by the clients 101 and 111 and file system configuration information in the storage devices 104 and 114.
- the storage devices 104 and 114 provide volumes to the NAS controllers 103 and 113, which use them as the locations where various files and file system configuration information are stored.
- the volume here is a logical storage area associated with a physical storage area.
- a file is a data management unit, and a file system is management information for managing a file in a volume.
- a logical storage area in a volume managed by the file system may be simply referred to as a file system.
- the data center system 120 includes a CAS device 121 and a management terminal 124, which are connected via a network 125.
- the CAS device 121 is an archive and backup destination storage device of the NAS devices 102 and 112.
- the management terminal 124 is a computer used by an administrator who manages the information processing system 10.
- the administrator manages the CAS device 121 from the management terminal 124 through the network 125.
- Examples of such management include creation of a file system assigned to the NAS devices 102 and 112.
- the management terminal 124 has an input / output device.
- Examples of the input / output device include a display, a printer, a keyboard, and a pointer device, but other devices (for example, a speaker, a microphone, etc.) may be used.
- a configuration may also be adopted in which a serial interface is used as the input/output device and a display computer having a display, keyboard, or pointer device is connected to that interface; in this case, display and input may be substituted by transmitting display information to the display computer and receiving input information from it.
- the network 105 is a LAN within site A 100, the network 115 is a LAN within site B 110, and the network 125 is a LAN within the data center of the data center system 120.
- the network 130 is a WAN connecting site A 100 and the data center system 120, and the network 140 is a WAN connecting site B 110 and the data center system 120.
- the type of network is not limited to the above network, and various networks can be used.
- FIG. 2 is a block diagram showing an example of a logical configuration of the information processing system according to the embodiment of the present invention.
- data read and written by the client 101 at the site A100 is stored as a file in the file system FS_A200 created by the NAS device 102.
- similarly, at site B 110, data read and written by the client 111 is stored as files in the file system FS_A' 210 created by the NAS device 112.
- a directory is a management structure for hierarchically managing a plurality of files, and is managed by the file system. The administrator designates the usage of the directory when the directory is disclosed to the clients 101 and 111.
- the user directory in the figure is a directory for storing personal work files
- the group directory is a directory used for group sharing.
- the files stored in the file system FS_A 200 and the file system FS_A ′ 210 are synchronized with the data center system 120 at a certain timing (predetermined or arbitrary timing: for example, nighttime batch processing).
- the file system FS_A ′′ 220 created by the CAS device 121 is a file system associated with the file system FS_A 200 at the site A and the file system FS_A ′ 210 at the site B.
- the file system FS_A 200 and the file system FS_A' 210 periodically synchronize with the CAS device's file system FS_A'' 220 to reflect their updated contents. At this time, if an update at the local site conflicts with an update at another site and the other site's update has priority, the corresponding file is saved in the conflict file save directory (the conflict directory in the figure). Details of these synchronization processes will be described with reference to FIG. 12 and subsequent figures.
- FIG. 3 is a block diagram illustrating an internal configuration example of the NAS device 102.
- the NAS device 112 at the site B110 has the same configuration.
- the NAS device 102 includes a NAS controller 103 and a storage device 104.
- the NAS controller 103 includes a CPU 402 that executes programs stored in a memory 401, a network interface 403 used for communication with the client 101 via the network 105, a network interface 404 used for communication with the data center system 120 via the network 130, a storage interface 405 used for connection to the storage device 104, and the memory 401, which stores programs and data; these components are connected by an internal communication path (for example, a bus).
- the memory 401 stores a file sharing server program 406, a file sharing client program 407, a file system program 408, an operating system 409, a synchronization program 410, a synchronization file division program 411, a management screen display program 412, a local site update file list 413, an other site update file list 414, a divided synchronization file list 415, and a directory management table 416. Note that the programs 406 to 412 and the lists and table 413 to 416 may instead be stored in the storage device 104 and read into the memory 401 by the CPU 402 for execution.
- the file sharing server program 406 is a program that provides a means for the client 101 to perform file operations on files on the NAS device 102.
- the file sharing client program 407 provides a means for the NAS device 102 to perform file operations on files on the CAS device 121; through the file sharing client program 407, the NAS device at each site can execute predetermined file operations on the CAS device 121 against its own site's files as well as against files of other sites.
- the file system program 408 controls the file system FS_A200.
- the operating system 409 has an input / output control function, a read / write control function for a storage device such as a disk and a memory, and the like, and provides these functions to other programs.
- the synchronization program 410 executes file synchronization processing between the NAS device 102 and the CAS device 121.
- the synchronized file division program 411 is called from the synchronization program 410 and divides a file group to be synchronized into a plurality of subtrees.
- the management screen display program 412 controls the management screen for synchronization processing. The administrator can access the management screen provided by the management screen display program 412 via the client 101.
- the own site update list 413 is a list for the NAS device 102 to manage file update processing of its own site.
- the other site update list 414 is a list for the NAS device 102 to manage file update information at other sites.
- the divided synchronization file list 415 is a list for the NAS device 102 to divide and manage a file group subject to synchronization processing at its own site in units of subtrees.
- the directory management table 416 is a table for managing, for each directory disclosed to the client 101, the usage of the stored files and the conflict occurrence frequency. Details of the directory management table 416 will be described later with reference to FIG. 8.
- the storage device 104 includes a storage interface 423 used for connection to the NAS controller 103, a CPU 422 that executes instructions from the NAS controller 103, a memory 421 that stores programs and data, and one or more disks 424. Are connected by an internal communication path (for example, a bus).
- the storage device 104 provides a block-type storage function such as FC-SAN (Fibre Channel Storage Area Network) to the NAS controller 103.
- FIG. 4 is a block diagram illustrating an example of the internal configuration of the CAS device 121.
- the CAS device 121 includes a CAS controller 122 and a storage device 123.
- the CAS controller 122 includes a CPU 502 that executes programs stored in a memory 501, a network interface used for communication with the NAS devices 102 and 112 via the networks 130 and 140, a network interface 504 used for communication with the management terminal 124 via the network 125, a storage interface 505 used for connection with the storage device 123, and the memory 501, which stores programs and data; these components are connected by an internal communication path (for example, a bus).
- the memory 501 stores a file sharing server program 506, a file system program 507, an operating system 508, a synchronized file list 509, and a lock list 510.
- the programs 506 to 508 and the lists 509 to 510 may be stored in the storage device 123 and read out to the memory 501 by the CPU 502 and executed.
- the file sharing server program 506 is a program that provides a means for the NAS devices 102 and 112 to perform file operations on files on the CAS device 121.
- the file system program 507 controls the file system FS_A ′′ 220.
- the operating system 508 provides other programs with an input / output control function and a read / write control function for a storage device such as a disk and a memory.
- the synchronized file list 509 is a list for managing file update processing performed by the NAS device 102 and the NAS device 112 through synchronization processing.
- the lock list 510 is a list for managing the subtree of the file system FS_A ′′ 220 acquired by the NAS device 102 and the NAS device 112 as a lock.
- the storage device 123 includes a storage interface 523 used for connection with the CAS controller 122, a CPU 522 that executes instructions from the CAS controller 122, a memory 521 that stores programs and data, and one or more disks 524. Connected by a common communication path (for example, a bus).
- the storage device 123 provides the CAS controller 122 with a block-type storage function such as FC-SAN (Fibre Channel Storage Area Network).
- FIG. 5 is a diagram illustrating a configuration example of the local site update file list 413 of the NAS device 102.
- the NAS device 112 also manages the same local site update file list 413.
- the own site update file list 413 includes, as configuration items, a file name 413A, an update date / time 413B, an update target 413C, and an update content 413D.
- Each entry in the local site update file list 413 corresponds to an update process for a file or directory generated in the NAS device 102.
- the file name 413A is identification information for identifying the file / directory to be updated, and includes the path of the file or directory.
- the path here is a character string for indicating the location of the file or directory in the file system. In the path, all directories from the root of the file system to the corresponding file / directory are described.
- the update date / time 413B is information indicating the date / time when the file or directory was updated.
- the update target 413C is information indicating the update target, and includes one of a file and a directory.
- the update content 413D is the update content performed on the update file, and one of write, delete, rename source, and rename destination is designated. Note that the writing here includes an operation of increasing or decreasing the files in the directory.
- the own site update file list 413 is newly created at the timing when the NAS device 102 performs the synchronization process. That is, the local site update file list 413 is a list of update processes that have occurred since the last synchronization process.
- Such a local site update file list 413 allows the NAS device 102 to manage update processing that has occurred at the local site since the previous synchronization processing. As a result, the NAS device 102 can notify the CAS device 121 and, by extension, other sites of update processing that has occurred at its own site.
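- As a minimal sketch of how such a list might be maintained, the following code models items 413A to 413D; the field names, the in-memory list, and the snapshot helper are assumptions, not the embodiment's implementation.

```python
# Minimal sketch of a local site update file list (items 413A-413D).
# Field names and the in-memory list are illustrative assumptions.
from dataclasses import dataclass
from datetime import datetime
from typing import List

@dataclass
class UpdateEntry:
    file_name: str         # 413A: full path of the updated file/directory
    update_time: datetime  # 413B: when the update occurred
    update_target: str     # 413C: "file" or "directory"
    update_content: str    # 413D: "write", "delete", "rename_src" or "rename_dst"

local_update_list: List[UpdateEntry] = []

def record_write(path: str, is_directory: bool = False) -> None:
    """Append an entry when a write is served at the local site."""
    local_update_list.append(UpdateEntry(
        file_name=path,
        update_time=datetime.now(),
        update_target="directory" if is_directory else "file",
        update_content="write",
    ))

def start_synchronization() -> List[UpdateEntry]:
    """At synchronization time, snapshot the list and start a new, empty one."""
    global local_update_list
    snapshot, local_update_list = local_update_list, []
    return snapshot
```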
- FIG. 6 is a diagram illustrating a configuration example of the other site update file list 414 of the NAS device 102.
- the NAS device 112 also manages a similar other site update file list 414.
- Each entry in the other site update file list 414 corresponds to a file updated at another site.
- the other site update file list 414 includes a site 414A, a synchronization number 414B, a file name 414C, an update date 414D, and an update target 414E as configuration items.
- the base 414A is information indicating the name of the base where the update has occurred.
- the synchronization number 414B is an identifier that uniquely identifies, in time series across all the NAS devices 102, the synchronization processing performed by a NAS device 102.
- the NAS device 102 uses a value obtained by adding 1 to the newest synchronization number as the synchronization number 414B during the synchronization process.
- the file name 414C is information indicating the path of the updated file.
- the update date and time 414D is information indicating the date and time when the update occurred at another base.
- the update target 414E is information indicating the update target and includes either one of a file or a directory.
- the NAS device 102 acquires the update content of the other site from the synchronized file list 509 of the CAS device 121 during the synchronization process, and adds the file updated at the other site to the other site update file list 414.
- the NAS device 102 performs processing for acquiring the latest file from the CAS device 121 when accessing a file described in the other site update file list 414.
- the other site update file list 414 makes it possible to acquire an update file generated at another site as needed when accessed from the client 101. As a result, the NAS device 102 can immediately access the update files at other sites to the client 101 without reading all the update files from the CAS device 121.
- the reflection process of the update file of another base using the other base update file list 414 is an example.
- the present invention can also be applied to a synchronization method in which an update file is transferred to the CAS device 121 each time the NAS device 102 performs synchronization processing.
- FIG. 7 is a diagram illustrating a configuration example of the divided synchronization file list 415 of the NAS device 102.
- the NAS device 112 also manages the same divided synchronization file list 415.
- Each entry in the divided synchronization file list 415 corresponds to an update process that occurred at the local site between the previous synchronization process and the currently executing synchronization process.
- the divided synchronization file list 415 includes subtree numbers 415A, exclusion ranges 415B, file names 415C, update dates 415D, update targets 415E, and update contents 415F as configuration items.
- the subtree number 415A indicates an identification number for uniquely identifying the subtree in the divided synchronization file list 415.
- the exclusion range 415B indicates an exclusion range necessary when synchronizing the subtree in which the update process has occurred.
- the exclusion range 415B includes one or more exclusion target sub-tree root directory paths.
- the file name 415C is information indicating the path of the file / directory to be processed by the corresponding update process.
- the update date and time 415D indicates the date and time when the corresponding update process was performed.
- the update target 415E is information indicating whether the updated content is a file or a directory.
- the update content 415F is the update content performed on the update file, and one of write, delete, rename source, and rename destination is designated.
- the NAS device 102 creates a copy of the local site update file list 413 (hereinafter referred to as local site update file list copy) during the synchronization process. Thereafter, the NAS device 102 divides the update process included in the local site update file list replica into subtree units, and creates a divided synchronization file list 415.
- FIG. 8 is a diagram illustrating a configuration example of the directory management table 416 of the NAS device 102.
- the NAS device 112 also manages a similar directory management table 416.
- Each entry in the directory management table 416 corresponds to a subtree whose usage is set by the administrator.
- the directory management table 416 includes a top directory 416A, a usage 416B, a conflict frequency 416C, a synchronization time upper limit 416D, and an average throughput 416E as configuration items.
- the top directory 416A is information indicating the path of the top directory of the subtree to be controlled by the synchronization process.
- the top directory 416A may be set to a public directory path or an arbitrary directory / file path set by the administrator.
- when "*" (asterisk) is included in the path of the top directory 416A, the contents of the corresponding entry are applied to all directories whose paths match the portions of the path other than "*".
- the usage 416B is information indicating the usage of the files under the top directory 416A, and any one of personal, group sharing, archive, and backup is designated.
- the usage 416B is designated by the administrator via a public directory setting interface described later when the directory is published.
- the conflict frequency 416C is information indicating the conflict occurrence frequency during the synchronization process.
- for the conflict frequency 416C, a predetermined value set in advance for each usage 416B is used.
- for example, the conflict occurrence frequency is set to "low" for personal use and "high" for group sharing.
- the synchronization time upper limit 416D is the upper limit of the processing time per synchronization process for the corresponding top directory.
- the NAS device 102 determines the size of the subtree to be subjected to the synchronization process so as to satisfy the synchronization time upper limit 416D in the synchronization division process described later.
- the synchronization time upper limit 416D is set to a predetermined value determined in advance for each usage 416B.
- the average throughput 416E is information indicating the throughput of the corresponding directory synchronization processing.
- the NAS device 102 recalculates the average throughput and rewrites the average throughput 416E with the calculation result when the synchronization process for the corresponding directory occurs.
- the recalculation of the average throughput may be performed over statistical information accumulated from the past in order to reflect a long-term trend, or, for example, over only the last five measurements in order to capture the most recent trend.
- the conflict occurrence frequency 416C may be set based on statistical information, or an average value may be set.
- the directory management table manages the requirements for synchronization processing for each directory, and it is possible to construct a subtree of an appropriate size according to the purpose of the file during synchronization processing.
- the usage 416B, the conflict occurrence frequency 416C, and the synchronization time upper limit 416D are set for the top directory 416A, but this is not restrictive.
- the usage 416B, the conflict occurrence frequency 416C, and the synchronization time upper limit 416D may be set for each directory or file other than the top directory.
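- The lookup below is a minimal sketch of how entries of the directory management table 416 could govern a path, including one possible reading of the "*" rule for the top directory 416A; the concrete entries, default values, and the one-component interpretation of "*" are assumptions.

```python
# Sketch of the directory management table 416 (items 416A-416E) and a lookup
# that honors the "*" rule for the top directory 416A. Entries are assumed.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class DirEntry:
    top_directory: str         # 416A: may contain "*"
    usage: str                 # 416B: personal / group sharing / archive / backup
    conflict_frequency: str    # 416C: "high" or "low"
    sync_time_limit_s: float   # 416D: upper limit per synchronization process
    avg_throughput_bps: float  # 416E: measured average throughput

DIRECTORY_MANAGEMENT_TABLE: List[DirEntry] = [
    DirEntry("/home/*",   "personal",      "low",   600.0, 100e6),
    DirEntry("/share/g1", "group sharing", "high",   60.0, 100e6),
    DirEntry("/archive",  "archive",       "low",  3600.0, 100e6),
]

def governs(top_directory: str, path: str) -> bool:
    """True if 'path' falls under the entry's top directory.
    Here a "*" component is assumed to match exactly one path component."""
    top_parts = top_directory.strip("/").split("/")
    path_parts = path.strip("/").split("/")
    if len(path_parts) < len(top_parts):
        return False
    return all(t == "*" or t == p for t, p in zip(top_parts, path_parts))

def lookup(path: str) -> Optional[DirEntry]:
    """Return the table entry whose top directory governs the given path."""
    for entry in DIRECTORY_MANAGEMENT_TABLE:
        if governs(entry.top_directory, path):
            return entry
    return None

# e.g. lookup("/home/userA/report.txt") returns the "personal" entry.
```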
- FIG. 9 is a diagram illustrating a configuration example of the synchronized file list 509 of the CAS device 121.
- Each entry in the synchronized file list 509 corresponds to the update process performed on the CAS device 121 by the synchronization process of the NAS device 102. These update processes are recorded in the order in which they are performed on the CAS device 121.
- the synchronized file list 509 includes, as configuration items, a base 509A, a synchronization number 509B, a file name 509C, an update date 509D, an update target 509E, and an update content 509F.
- the base 509A is information indicating the base where the update process has been performed. Similar to the synchronization number 414B, the synchronization number 509B is information indicating an identifier for uniquely identifying the synchronization processing across all the NAS devices 102 over time.
- the file name 509C indicates the path of the file / directory to be updated. Note that the path stored in the file name 509C is the path of the FS_A ′′ 220 of the CAS device 121 when the update process is performed.
- the update date and time 509D is information indicating the date and time when the update process occurred in the NAS device 102.
- the update target 509E is information indicating whether the updated content is a file or a directory.
- the update content 509F is the content of the update process, and one of write, delete, rename source, and rename destination is designated.
- the NAS device 102 performs an update process corresponding to the update process generated at the local site during the synchronization process on the CAS device 121 and adds the contents to the synchronized file list 509.
- the update processing generated in the NAS device 102 is recorded in the synchronized file list 509 in time series.
- update processes are recorded as update processes for the FS_A ′′ 220 of the CAS device 121, not the update process itself of the NAS device 102.
- when the file to be updated has been stored in the FS_A'' 220 of the CAS device 121 under a different path due to a rename process at another site, the NAS device 102 synchronizes the update process against the renamed path.
- the synchronized file list 509 allows the NAS device 102 to record the update processing generated at its own site in the CAS device 121 and notify the NAS device 102 at another site.
- FIG. 10 is a diagram illustrating a configuration example of the lock list 510 of the CAS device 121.
- Each entry in the lock list 510 corresponds to a subtree in the FS_A ′′ 220 of the CAS device 121 to be excluded during synchronization processing.
- while such an entry exists, a NAS device 102 cannot perform synchronization processing on any subtree that includes the locked subtree.
- the lock list 510 has a subtree root 510A, a lock owner 510B, an acquisition date 510C, and a retention period 510D as constituent items.
- the subtree root 510A is information indicating the top directory of the subtree to be excluded. Note that when a single file, not a subtree, is to be excluded, the information indicates the path of the file.
- the lock owner 510B is information indicating the NAS device 102 that has acquired the lock, and indicates an identifier (such as a host name) for uniquely indicating the NAS device 102 in the system.
- the acquisition date 510C is information indicating the date and time when the lock was acquired.
- the holding period 510D is information indicating the validity period of the lock after it is acquired.
- when performing the synchronization process, the NAS device 102 adds an entry to the lock list 510 (lock) and deletes the entry after the synchronization process (unlock).
- when the NAS device 102 synchronizes a subtree, if the subtree contains a subtree for which another NAS device has acquired a lock, the NAS device 102 does not perform the synchronization process. Further, if the NAS device 102 that acquired a lock neither extends the lock period (by overwriting the acquisition date 510C) nor unlocks before the holding period 510D elapses, the lock becomes invalid. This prevents a situation in which a failure occurs in the NAS device 102 that acquired the lock, recovery is not possible within the retention period, and the other NAS devices 102 can never perform the synchronization process.
- the lock list 510 enables exclusion of synchronization processing between NAS devices 102 in units of subtrees.
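- The following sketch illustrates one way the lock-acquirability check and the retention-period expiry described above could be expressed; the entry fields mirror items 510A to 510D, but the overlap test and expiry handling are assumptions.

```python
# Sketch of the lock list 510 (items 510A-510D) and the checks described above.
# The overlap test and expiry handling are illustrative assumptions.
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

@dataclass
class LockEntry:
    subtree_root: str      # 510A: top directory of the locked subtree
    lock_owner: str        # 510B: identifier of the NAS device (e.g. host name)
    acquired_at: datetime  # 510C: acquisition date and time
    retention_s: float     # 510D: validity period of the lock

def _overlaps(a: str, b: str) -> bool:
    """True if one path is equal to or contained in the other."""
    a, b = a.rstrip("/") + "/", b.rstrip("/") + "/"
    return a.startswith(b) or b.startswith(a)

def can_acquire(lock_list: List[LockEntry], exclusion_ranges: List[str],
                now: datetime) -> bool:
    """A lock is acquirable only if no valid entry overlaps the exclusion range.
    Entries past their retention period are treated as invalid (expired)."""
    for entry in lock_list:
        if now - entry.acquired_at > timedelta(seconds=entry.retention_s):
            continue  # expired lock: ignored so a failed site cannot block others
        if any(_overlaps(entry.subtree_root, rng) for rng in exclusion_ranges):
            return False
    return True

def acquire(lock_list: List[LockEntry], exclusion_ranges: List[str],
            owner: str, retention_s: float) -> bool:
    """Add one entry per exclusion range if, and only if, all can be locked."""
    now = datetime.now()
    if not can_acquire(lock_list, exclusion_ranges, now):
        return False
    for rng in exclusion_ranges:
        lock_list.append(LockEntry(rng, owner, now, retention_s))
    return True
```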
- FIG. 11 is a flowchart for explaining file read / write processing of the NAS device 102 according to the present invention.
- when the NAS device 102 receives a file read/write request from the client 101, it performs the read/write processing shown in FIG. 11. In the following, the process illustrated in FIG. 11 is described in order of step number.
- Step S1000 The file system program 408 receives a file read / write request from the client 101 via the file sharing server program 406.
- the read request includes the file name to be read and the start position and length in the file of the read target data.
- the write request includes the file name to be written, the head position in the file of the write target data, and the write data.
- Step S1001 The file system program 408 searches the other site update file list 414 and determines whether or not the read / write target file at the other site is updated.
- the file system program 408 checks the other site update file list 414 for an entry having the same file name as the read / write target file. This process is performed to determine whether the read / write target file at the local site has already been updated at the other site and it is necessary to obtain the latest version from the CAS device 121.
- Step S1002 If the read / write target file is not in the other site update file list 414, the file system program 408 proceeds to the process of step S1005. In this case, since the file at the local site is the latest, normal read / write is performed on the file of the local site FS_A200. If the read / write target file is in the other site update file list 414, the file system program 408 proceeds to the process of step S1003. In this case, since the file in the FS_A 200 at the local site is not the latest, the latest version file is acquired from the CAS device 121 and the read / write process is performed.
- Step S1003 The file sharing client program 407 reads the latest version of the read / write target file from the CAS device 121, and stores the data in the FS_A 200.
- if acquisition of the latest version fails, an error is returned to the client 101 and the read/write process is terminated.
- in this case, the file system program 408 determines that the read/write target file is in an inconsistent state and prohibits access to the file until the next synchronization process.
- Step S1004 The file system program 408 deletes the file entry acquired in step S1003 from the other site update file list 414. This is because the corresponding file in the FS_A 200 at the local site is replaced with the latest version file in step S1003. This process eliminates the need for CAS access in the subsequent read / write processes of the file.
- Step S1005 The file system program 408 determines whether the request from the client is a file read request or a file write request. In the case of a file read request, the file system program 408 moves to the process of step S1008, and in the case of a file write request, moves to the process of step S1006.
- Step S1006 The file system program 408 performs the writing process of the corresponding file of the FS_A 200 in accordance with the client write request.
- Step S1007 The file system program 408 adds the update process performed in step S1006 to the local site update file list 413, and proceeds to the process of step S1009.
- Step S1008 The file system program 408 reads the data of the read target file.
- Step S1009 The file system program 408 returns a file read / write processing response to the client 101 via the file sharing server program 406.
- through the above processing, the NAS device 102 can, during read/write processing, reflect update processes that occurred at other sites into its own file system and record update processes that occurred at its own site in the local site update file list 413.
- the read / write process for the file is similarly applied to the directory.
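- A condensed sketch of this read/write flow is shown below; the helper callables are hypothetical stand-ins for operations of the file sharing client program 407 and the file system program 408.

```python
# Condensed sketch of the read/write flow of FIG. 11 (steps S1000-S1009).
# other_site_updates is a set of paths; the four callables are hypothetical.

def handle_request(path, op, data,
                   other_site_updates, local_update_list,
                   fetch_latest_from_cas, read_local, write_local):
    """op is 'read' or 'write'."""
    # S1001/S1002: has the target been updated at another site?
    if path in other_site_updates:
        try:
            fetch_latest_from_cas(path)        # S1003: refresh the local copy first
        except OSError:
            return "error"                     # inconsistent: reject until next sync
        other_site_updates.discard(path)       # S1004: local copy is now current
    if op == "write":
        write_local(path, data)                     # S1006
        local_update_list.append(("write", path))   # S1007: record in list 413
        return "ok"
    return read_local(path)                    # S1005/S1008/S1009
```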
- File deletion / rename processing: When the NAS device 102 receives a file deletion or rename request from the client, it deletes or renames the corresponding file of the FS_A 200 at the local site. Thereafter, the update processing information is added to the local site update file list 413.
- in the rename process, if the rename source file/directory, or a file/directory under the rename source directory, is included in the entries of the other site update file list 414, the corresponding entry is converted to the rename destination path.
- specifically, the pre-rename path contained in the file name 414C of the other site update file list 414 is replaced with the post-rename path. For example, if a rename moves /home/userA/File_A to /home/userA/File_B and /home/userA/File_A is included in the other site update file list 414, the file name 414C of the corresponding entry is rewritten to /home/userA/File_B.
- in this way, the file name 414C of each entry in the other site update file list 414 can be kept consistent with the file names in the FS_A 200.
- the rename / deletion process for the file is similarly applied to the directory.
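- The path rewriting on rename described above might look like the following sketch; the list representation and helper name are assumptions.

```python
# Sketch of rewriting entries of the other site update file list 414 when a
# rename occurs at the local site. The list representation is assumed.

def rewrite_on_rename(other_site_files, src, dst):
    """Replace the renamed prefix in every affected entry path."""
    src = src.rstrip("/")
    dst = dst.rstrip("/")
    rewritten = []
    for path in other_site_files:
        if path == src or path.startswith(src + "/"):
            rewritten.append(dst + path[len(src):])
        else:
            rewritten.append(path)
    return rewritten

# The example from the text: /home/userA/File_A renamed to /home/userA/File_B.
assert rewrite_on_rename(["/home/userA/File_A"],
                         "/home/userA/File_A",
                         "/home/userA/File_B") == ["/home/userA/File_B"]
```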
- FIG. 12 is a flowchart for explaining the synchronization processing of the NAS device 102. In the following, the process illustrated in FIG. 12 will be described in order of step number.
- Step S2000 The synchronization program 410 of the NAS device 102 starts periodically, and starts a synchronization process for reflecting the update process of the local site generated after the previous synchronization process to the CAS device 121.
- the execution interval of the synchronization process is set by the administrator via the management screen display program 412.
- first, the synchronization program 410 creates a copy of the local site update file list and deletes all entries in the local site update file list 413. By this processing, the file group to be synchronized is fixed.
- Step S2100 The synchronization program 410 calls the synchronization file division program 411, divides the entries of the local site update file list copy into subtree units according to the directory management table 416 (FIG. 8), and creates the divided synchronization file list 415.
- the file group to be synchronized can be divided into subtree units of a size that can be synchronized within the synchronization time upper limit 416D. Details of this processing will be described later with reference to FIG.
- Step S2200 The synchronization program 410 performs a synchronization process to the CAS device 121 for each subtree created in step S2100.
- in this process, the NAS device 102 applies exclusion (locking) to the file system FS_A'' 220 of the CAS device 121 in units of subtrees and performs the synchronization processing on the excluded subtrees. The order of the synchronization processing is also controlled according to the conflict occurrence frequency of each subtree. Details of this step will be described later with reference to FIG. 14.
- the NAS device 102 can reflect the update processing for the FS_A 200 generated after the previous synchronization processing in the FS_A ′′ 220 of the CAS device 121 in units of subtrees.
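- Taken together, one per-synchronization pass of FIG. 12 can be summarized by the sketch below, where divide_into_subtrees and sync_subtree are hypothetical stand-ins for the processing of FIG. 13 and FIG. 14.

```python
# Sketch of FIG. 12 (S2000-S2200). The two helpers are hypothetical stand-ins
# for the division processing (FIG. 13) and the per-subtree sync (FIG. 14).

def run_synchronization(local_update_list, divide_into_subtrees, sync_subtree):
    # S2000: snapshot the local site update file list and start a fresh one.
    snapshot = list(local_update_list)
    local_update_list.clear()
    # S2100: divide the snapshot into subtrees (divided synchronization file list 415).
    subtrees = divide_into_subtrees(snapshot)
    # S2200: synchronize each subtree with the CAS device, locking per subtree.
    for subtree in subtrees:
        sync_subtree(subtree)
```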
- FIG. 13 is a flowchart for explaining details of the synchronous division processing described in step S2100. In the following, the process illustrated in FIG. 13 will be described in order of step number.
- Step S2101 The synchronous file division program 411 reads the local site update file list copy created in step S2000 and extracts the files and directories that have been updated since the previous synchronization process. Note that the file name of each entry in the local site update file list copy records the path in the FS_A 200 at the time the update process occurred; if the updated file's directory or one of its upper directories has since been renamed, the file name in the copy and the actual path will differ. Therefore, for entries recorded before such a rename process occurred, the synchronous file division program 411 replaces the path name of the rename source file/directory with the path name of the rename destination file/directory.
- Step S2102 The synchronous file division program 411 rearranges the entries included in the update file / directory list extracted in step S2101 so that they are consecutive for each subtree.
- the synchronous file division program 411 first sorts the entries by path, and sequentially arranges files under the same directory. This is because the partial path to the corresponding directory is the same for the files under the same directory.
- next, the synchronous file division program 411 checks the top directories of the directory management table 416; if one top directory contains another top directory, the entries under those contained top directories are rearranged to come after the entries of the containing top directory.
- Step S2103 The synchronous file division program 411 creates a work list on the memory as a work area for creating a subtree.
- the work list is a list for storing reference pointers of files included in the subtree under construction.
- Step S2104 The synchronous file division program 411 selects the first entry of the processing target file list created in step S2102 as the file to be processed (the processing file). At this time, it consults the directory management table 416 to determine the top directory to which the processing file belongs.
- Step S2105 The synchronous file division program 411 adds the file being processed to the work list.
- Step S2106 If the current file being processed is the last file in the processing target file list, the synchronous file division program 411 determines that the file being processed is the last file in the subtree, and proceeds to the processing in step S2110. Otherwise, the process proceeds to step S2107.
- Step S2107 The synchronous file division program 411 checks whether or not the next file in the processing target file list is included in the same top directory as the currently processed file.
- the top directory here corresponds to the top directory 416A of the directory management table 416. If they are not included in the same top directory, the file being processed is determined to be the last file in the subtree, and the process proceeds to step S2110. Otherwise, the process proceeds to step S2108.
- Step S2108 The synchronous file division program 411 uses Formula 1 below to estimate the synchronization time when the next processing target file is added to the files already included in the work list.
- Estimated synchronization time = ((sum of the file sizes included in the work list) + (size of the next processing target file)) / (average throughput)   ... (Formula 1)
- the average throughput uses the value of the average throughput 416E described in the directory management table 416.
- Step S2109 The synchronous file division program 411 determines whether or not the synchronization time estimate obtained in S2108 is greater than the synchronization time upper limit 416D of the top directory to which the file being processed belongs. If the synchronization time estimate is larger than the data synchronization time upper limit 416D, the synchronization file division program 411 determines that the file being processed is the last file in the subtree, and proceeds to the process of step S2110. Otherwise, the process proceeds to step S2115.
- Step S2110 The synchronous file division program 411 creates a subtree to be synchronized from the files included in the work list.
- the subtree is the smallest subtree that includes all the files included in the work list in the tree structure of the file system.
- at its largest, the subtree extends to the entire top directory.
- Step S2111 The synchronous file division program 411 sets the subtree created in S2110 as the exclusive range. Also, the synchronous file division program 411 checks the local site update file list copy, and if the file / directory in the subtree being created is renamed from outside the subtree, the rename source path is also set as the exclusive range. This is to prevent the rename source file / directory from being updated by the synchronization processing of another base during the synchronization processing.
- Step S2112 The synchronization file division program 411 outputs a division synchronization file list 415 for each subtree determined in step S2107.
- the synchronous file division program 411 assigns a unique subtree number within the synchronization process for each subtree and outputs the subtree number to the subtree number 415A.
- the exclusion range determined in step S2111 is output to the exclusion range 415B.
- the file name 415C stores the path of the update file checked in step S2101, and the update date 415D, the update target 415E, and the update content 415F store the contents of the local site update file list 413.
- Step S2113 The synchronous file division program 411 initializes the work list and starts to create the next subtree.
- Step S2114 If there is an unprocessed update file in the synchronization target file group, the synchronous file division program 411 proceeds to the process of step S2115, and otherwise ends the synchronous division process.
- Step S2115 The synchronous file division program 411 performs the processing from step S2105 on for the next update file of the synchronization target file group.
- the NAS device 102 can divide the synchronization process as large as possible within a range that satisfies the synchronization time upper limit 416D.
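- A simplified sketch of this greedy division is given below: files are accumulated into a work list until the top directory changes or the estimate of Formula 1 would exceed the synchronization time upper limit 416D. The data shapes and parameter names are assumptions.

```python
# Simplified sketch of the synchronous division processing (FIG. 13).
# 'files' is a list of (path, size_bytes, top_directory) tuples already sorted
# as in step S2102; the two callables return per-top-directory values from the
# directory management table 416.

def estimated_sync_time(work_sizes, next_size, avg_throughput_bps):
    # Formula 1: (sum of sizes in the work list + next file size) / average throughput
    return (sum(work_sizes) + next_size) / avg_throughput_bps

def divide_into_subtrees(files, sync_time_limit_s, avg_throughput_bps):
    subtrees, work = [], []
    for i, (path, size, top_dir) in enumerate(files):
        work.append((path, size, top_dir))                 # S2105: add to work list
        flush = (i == len(files) - 1)                      # S2106: last file overall?
        if not flush:
            _, next_size, next_top = files[i + 1]
            if next_top != top_dir:                        # S2107: top directory changes
                flush = True
            elif estimated_sync_time([s for _, s, _ in work],   # S2108/S2109
                                     next_size,
                                     avg_throughput_bps(top_dir)) > sync_time_limit_s(top_dir):
                flush = True
        if flush:
            subtrees.append(work)                          # S2110-S2112: emit subtree
            work = []                                      # S2113: start a new one
    return subtrees
```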
- FIG. 14 is a flowchart for explaining the details of the subtree unit synchronization processing described in step S2200. In the following, the process illustrated in FIG. 14 will be described in order of step number.
- Step S2201 The synchronization program 410 checks the divided synchronization file list 415 and the directory management table 416 and determines the conflict frequency 416C of each subtree. Note that the conflict frequency of a subtree is regarded as equal to the conflict frequency 416C of the top directory 416A that contains the subtree.
- Step S2202 The synchronization program 410 sorts the subtrees in descending order of conflict frequency based on the values checked in S2201. If several subtrees have the same conflict frequency, priority is given to the one with the smallest total file size. As a result, files with a high conflict occurrence frequency are synchronized to other sites sooner, so the probability of conflicts occurring can be reduced.
- Step S2203 The synchronization program 410 checks the lock list 510 of the CAS device 121 and determines whether or not a lock can be acquired for the exclusive range 415B of the sub-tree to be processed. A lock is determined to be acquirable only when all the subtree roots 510A in the lock list 510 are not included in the exclusive range 415B.
- when locks can be acquired for all exclusive ranges 415B, the synchronization program 410 adds to the lock list 510 an entry with the exclusive range 415B as the subtree root 510A and itself as the lock owner 510B. The execution date and time of this process is set as the acquisition date 510C, and a period preset in the system (for example, the value designated by the synchronization time upper limit 416D) is set as the retention period 510D.
- so that only one device at a time has update authority over the lock list 510, the NAS device 102 creates a lock file with a specific name on the CAS device 121 at the start of an operation on the lock list 510.
- the NAS device 102 deletes the lock file as soon as the operation on the lock list 510 is completed.
- the other sites do not operate the lock list 510 while the lock file exists.
- Step S2204 If the synchronization program 410 succeeds in acquiring the lock in step S2203, it proceeds to step S2205 in order to synchronize the subtree being processed with the CAS device 121. If the lock acquisition fails, the process proceeds to step S2211 and moves on to the next subtree.
- Step S2205 The synchronization program 410 reads the synchronized file list 509 from the CAS device 121.
- Step S2206 The synchronization program 410 reflects the contents of the synchronized file list 509 read in step S2205 in the file system FS_A200 and the other site update file list 414. Further, when the renamed / deleted process is included in the synchronized file list 509, the contents thereof are also reflected in the local site update file list 413 and the divided synchronized file list 415. Details of the synchronized file list reflection processing will be described later with reference to FIG.
- Step S2207 The synchronization program 410 performs a synchronization process in which the update process for the processing subtree in the divided synchronization file list 415 is reflected on the CAS device 121.
- the synchronization program 410 reads the divided synchronization file list 415 and performs processing equivalent to the file update processing in the subtree performed after the previous synchronization processing on the CAS device 121.
- the synchronization program 410 stores the update target file in the CAS device 121 when the update content 415F is a write. When the update content 415F is rename / delete, the same processing is performed on the CAS device 121.
- in the file name 415C, the path name in the FS_A 200 at the time of the update process is recorded. Therefore, as in step S2101, the file name 415C is first converted into the path in the FS_A 200 at the time of the synchronization process.
- furthermore, if the file has been renamed at another site, it is recorded in the FS_A'' 220 under a different path. Therefore, the rename processes that have occurred since the previous synchronization process are checked in the synchronized file list 509, and the file name 415C is also converted into the corresponding path in the FS_A'' 220. The update process is then performed on the file or directory of the FS_A'' 220 obtained in this way.
- Step S2208 The synchronization program 410 adds the update process to the CAS device 121 performed in step S2207 to the synchronized file list 509 on the CAS device 121. Note that the update process to the synchronized file list 509 is performed exclusively with other sites in the same manner as in step S2203.
- the synchronization program 410 records the cumulative transfer amount and transfer time of the subtree as statistical information, and calculates the average transfer throughput.
- the synchronization program 410 records the average transfer throughput in the average throughput 416E of the directory management table 416.
- Step S2209 The synchronization program 410 deletes the entry added in step S2203 from the lock list 510 of the CAS device 121 and unlocks it. As in step S2203, the operation of the lock list 510 is performed exclusively with other sites.
- Step S2210 If there is an unprocessed subtree, the synchronization program 410 moves the process to S2211 for processing of the next subtree, and ends the synchronization process otherwise.
- Step S2211 The synchronization program 410 selects the subtree with the highest conflict frequency among the unprocessed subtrees as the next subtree, and repeats the processing from step S2203 onward.
- the lock is acquired and released for each subtree synchronization process, but this is merely an example.
- for example, the synchronization program 410 may perform the synchronization processing for all subtrees and then release the locks of those subtrees together.
- alternatively, locks for a plurality of subtrees under different top directories may be acquired collectively, allowing the lock target to be adjusted more finely than by changing the range of the subtrees themselves.
- the determination of the range in which a plurality of locks are collected may take into account the time for synchronization in addition to the above-described conflict frequency. Also, this determination may be made at the stage of sorting the subtree in S2202, for example.
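- The ordering and lock handling of FIG. 14 might be sketched as follows; the subtree attributes and the CAS-side helpers are hypothetical.

```python
# Sketch of the subtree-unit synchronization of FIG. 14 (S2201-S2211).
# Each subtree object is assumed to carry conflict_frequency ("high"/"low"),
# total_size, and exclusion_ranges; the four callables are hypothetical helpers
# for the lock list 510 and the CAS-side transfers.

FREQ_ORDER = {"high": 0, "low": 1}

def sync_all_subtrees(subtrees, try_lock, unlock,
                      reflect_remote_updates, push_updates):
    # S2201/S2202: highest conflict frequency first; ties -> smallest total size.
    ordered = sorted(subtrees,
                     key=lambda s: (FREQ_ORDER[s.conflict_frequency], s.total_size))
    skipped = []
    for subtree in ordered:
        if not try_lock(subtree.exclusion_ranges):   # S2203/S2204
            skipped.append(subtree)                  # lock busy: move to next (S2211)
            continue
        try:
            reflect_remote_updates(subtree)          # S2205/S2206 (FIG. 15)
            push_updates(subtree)                    # S2207/S2208
        finally:
            unlock(subtree.exclusion_ranges)         # S2209
    return skipped   # subtrees whose locks could not be acquired this round
```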
- FIG. 15 is a flowchart for explaining details of the synchronized file list reflection processing of the CAS device 121. In the following, the process illustrated in FIG. 15 will be described in order of step number.
- Step S3000 From the synchronized file list 509 read in step S2205, the synchronization program 410 extracts the update processes performed on the CAS device 121 since the previous synchronization process that fall within the subtree being processed. The synchronization program 410 then checks whether the processing target file of each such update process is included in the local site update file list 413, and determines whether a conflict has occurred.
- if the processing target file is included in both lists, the corresponding update process is extracted as a conflict process.
- if the update content of the other site has priority, the update process is handled as an other-site-priority conflict; otherwise, it is handled as a local-site-priority conflict.
- for rename and delete processes as well, it is determined whether a local-site-priority conflict or an other-site-priority conflict has occurred. If renames are recorded in both the local site update file list 413 and the synchronized file list 509, the file name of the later entry is converted back to the file name before the rename, and it is then determined whether the two entries refer to the same file.
- Step S3001 The synchronization program 410 copies the file or directory that has been processed in the other site priority conflict found in Step S3000 to a dedicated save directory.
- the save directory is a system directory prepared for each file system as a directory for storing conflict occurrence files.
- Step S3002 The synchronization program 410 sets the first entry among the process target entries in the synchronized file list 509 found in step S3000 as the next process target, and proceeds to the process of step S3003.
- Step S3003 If the update process being processed is a local site priority conflict, the synchronization program 410 determines that it is not necessary to reflect the update process of another site, and proceeds to the process of step S3007. Otherwise, the process proceeds to step S3004.
- Step S3004 Based on the update content 509F, the synchronization program 410 determines whether the update process is a write to a file or directory, or a delete or rename process. If the update process is writing, the process proceeds to step S3005. If not, the process proceeds to step S3006.
- Step S3005 The synchronization program 410 adds the update target file 509C of the update process currently being processed to the other site update file list 414.
- By referring to the other site update file list 414, the file system program 408 can determine whether the file or directory being accessed is in its latest state. If the file to be updated, or a directory above it, has been renamed at the local site or at another site, the file name 509C in the synchronized file list 509 may differ from the path in the file system FS_A 200. In that case, the path is converted into a path within FS_A 200 and then registered in the other site update file list 414.
- Step S3006 The synchronization program 410 applies the deletion or rename operation of the synchronized file list 509 to the file system FS_A 200. Before doing so, as in step S3005, if a rename has occurred at the local site or another site, the operation is performed using the converted path within FS_A 200.
- Step S3007 If unprocessed update operations remain for the target subtree of the synchronized file list 509, the synchronization program 410 repeats the processing from step S3002; otherwise, it ends the synchronized file list reflection processing.
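Under the same assumed entry format as the earlier sketch, the loop of steps S3002 to S3007 could be written compactly as below; to_local_path and apply_delete_or_rename are hypothetical stand-ins for the rename-aware path conversion and the file system operation described in steps S3005 and S3006.

```python
# Compact sketch of the reflection loop (steps S3002-S3007); names are illustrative.
from typing import Callable, Dict, List, Set

def reflect_subtree(entries: List[dict],
                    conflicts: Dict[str, str],
                    other_site_update_list: Set[str],
                    to_local_path: Callable[[str], str],
                    apply_delete_or_rename: Callable[[dict], None]) -> None:
    for entry in entries:                                  # S3002 / S3007: walk the subtree's entries
        if conflicts.get(entry["path"]) == "local_priority":
            continue                                       # S3003: local update wins, nothing to reflect
        local_path = to_local_path(entry["path"])          # rename-aware conversion into FS_A
        if entry["op"] == "write":                         # S3004 / S3005: record writes for later access checks
            other_site_update_list.add(local_path)
        else:                                              # S3006: apply delete or rename locally
            apply_delete_or_rename({**entry, "path": local_path})
```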
- FIG. 16 is a schematic diagram showing a management interface for setting a directory to be disclosed to the client 101.
- the public directory setting interface 4160 is provided to the administrator via the client 101 by the management screen display program 412.
- the administrator can set the contents of the directory management table 416 by specifying the usage when setting the public directory of the FS_A 200.
- the public directory setting interface 4160 includes a text input box 4160A and check boxes 4160B to 4160E.
- the text input box 4160A is a text input box for setting a path name of a directory to be disclosed.
- An asterisk (“*”) may be used for the text input box 4160A.
- the input content of the text input box 4160A corresponds to the top directory 416A of the directory management table 416.
- Check boxes 4160B to 4160E are check boxes for specifying the usage of the directory input in the text input box 4160A. If any one of these check boxes is specified, the other check boxes are invalidated.
- the check boxes 4160B to 4160E correspond to values that can be specified in the directory management table 416, respectively.
- a directory path name can be set in the text input box 4160A, and the usage corresponding to the set path name, the conflict frequency, the upper limit of the data synchronization processing time, and the like can be displayed.
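One possible in-memory representation of a row of the directory management table 416 (fields 416A to 416E) populated through this interface is sketched below; the field names, units, and literal values are illustrative assumptions rather than the patent's encoding.

```python
# Assumed representation of one directory management table 416 row.
from dataclasses import dataclass

@dataclass
class DirectoryManagementEntry:
    top_directory: str          # 416A: path entered in text box 4160A (may contain "*")
    usage: str                  # 416B: usage chosen via check boxes 4160B-4160E
    conflict_frequency: str     # 416C: "low" / "medium" / "high"
    sync_time_limit_sec: int    # 416D: upper limit of one data synchronization run
    avg_throughput_mbps: float  # 416E: average transfer rate to the CAS device

entry = DirectoryManagementEntry("/home/userA", "personal", "low", 3600, 100.0)
```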
- FIG. 16 illustrates setting the usage at the time the top directory is published. However, the same management interface may also be provided to the administrator for setting the usage of directories and files below the top directory.
- If a directory below the top directory is published as a share, its usage is set when the directory is published. Otherwise, a management interface is provided in response to a request transmitted from the client 101 so that the administrator can set the usage as needed during operation.
- For files as well, a management interface may be provided in response to a request transmitted from the client 101 so that the administrator can set the usage as needed during operation.
- In this way, file synchronization for usages with a low conflict frequency is performed in units of large subtrees.
- As a result, the number of lock acquisitions can be reduced, and the drop in throughput caused by the overhead of dividing the synchronization processing can be suppressed.
- According to the present invention, the data synchronization delay time between a plurality of sites can be reduced while suppressing the decrease in data synchronization throughput caused by lock acquisition. As a result, more sites can be supported.
- The subtree unit is calculated from the usage, the conflict occurrence frequency, the directory and file sizes, the synchronization time upper limit, the average throughput, and so on, but it is not always necessary to consider all of these factors.
- For example, in the case of a top directory whose usage is personal, the entire top directory may be determined to be a single subtree. This is because a personal top directory is accessed by only one user, so an update from another site is unlikely to occur while an update from one site is in progress, and problems are therefore unlikely to arise even if the entire top directory is locked for a long time.
- In this way, the subtree can be determined easily, and the synchronization processing can be completed in a shorter time.
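As a hedged illustration of how these factors can bound the subtree, the sketch below caps a subtree at the data capacity given by the average throughput multiplied by the synchronization time upper limit and keeps a personal top directory as a single subtree; the function names, units, and thresholds are assumptions.

```python
# Illustrative only: bounds one subtree by (average throughput x sync time limit).
def max_subtree_bytes(avg_throughput_mb_per_s: float, sync_limit_s: int) -> int:
    """Data capacity that one synchronization run may carry."""
    return int(avg_throughput_mb_per_s * 1024 * 1024 * sync_limit_s)

def choose_subtree_unit(usage: str, tree_size_bytes: int,
                        avg_throughput_mb_per_s: float, sync_limit_s: int) -> str:
    if usage == "personal":
        return "whole-top-directory"   # single user, conflicts unlikely
    if tree_size_bytes <= max_subtree_bytes(avg_throughput_mb_per_s, sync_limit_s):
        return "whole-top-directory"
    return "split-into-smaller-subtrees"

print(choose_subtree_unit("shared", 50 * 1024**3, 100.0, 300))  # -> split-into-smaller-subtrees
```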
- In the first embodiment, the NAS device 102 estimates the file conflict occurrence frequency from the usage of the public directory set by the administrator and determines the upper limit of the synchronization processing time accordingly.
- In the second embodiment, the NAS device 102 estimates the file conflict occurrence frequency based on statistical information about the number of conflicts that occur during synchronization processing.
- In the second embodiment, the directory management table 416 does not manage the usage 416B; instead, the conflict frequency 416C and the synchronization time upper limit 416D are set based on the statistical information.
- The synchronization program 410 updates the conflict frequency 416C and the synchronization time upper limit 416D based on the average number of conflicts per synchronization run. For example, when the average number of conflicts per synchronization run is 1 or less, the conflict occurrence frequency 416C is set to “low”; when it is 3 or more, it is set to “high”; otherwise, it is set to “medium”.
- A predetermined value corresponding to the conflict occurrence frequency is used for the synchronization time upper limit. For example, “1 hour” is set when the conflict occurrence frequency is “low”, “30 minutes” when it is “medium”, and “5 minutes” when it is “high”.
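A minimal sketch of this mapping, using the thresholds (1 and 3) and the limits (1 hour, 30 minutes, 5 minutes) given above; the function name and the treatment of the intermediate band as “medium” are assumptions.

```python
# Maps the average conflicts per synchronization run to a frequency level
# and looks up the corresponding synchronization time upper limit.
def conflict_level(avg_conflicts_per_sync: float) -> str:
    if avg_conflicts_per_sync <= 1:
        return "low"
    if avg_conflicts_per_sync >= 3:
        return "high"
    return "medium"

SYNC_TIME_LIMIT_SEC = {"low": 3600, "medium": 1800, "high": 300}

print(conflict_level(0.5), SYNC_TIME_LIMIT_SEC[conflict_level(0.5)])  # low 3600
```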
- In the second embodiment, the synchronization program 410 acquires statistical information on the frequency of conflicts between sites during the synchronized file list reflection processing.
- The differences from the processing of FIG. 15 are described below.
- Step S3000 When performing the conflict detection described above, the synchronization program 410 records, for each top directory 416A of the directory management table 416, the number of conflicts that have occurred. If the conflict occurrence frequency has increased or decreased enough to change the conflict frequency 416C, the values of the conflict frequency 416C and the synchronization time upper limit 416D in the directory management table 416 are updated.
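The bookkeeping in this modified step S3000 could be sketched as follows; the class and field names are hypothetical, and the mapping helpers are repeated from the earlier sketch so that this snippet stands alone.

```python
# Per-top-directory conflict statistics; updates 416C/416D only when the level changes.
from collections import defaultdict

def conflict_level(avg_conflicts_per_sync: float) -> str:
    return "low" if avg_conflicts_per_sync <= 1 else "high" if avg_conflicts_per_sync >= 3 else "medium"

SYNC_TIME_LIMIT_SEC = {"low": 3600, "medium": 1800, "high": 300}

class ConflictStats:
    def __init__(self):
        self.counts = defaultdict(int)   # conflicts observed per top directory
        self.syncs = defaultdict(int)    # synchronization runs per top directory

    def record_sync(self, top_dir: str, conflicts_seen: int, table_row: dict) -> None:
        self.counts[top_dir] += conflicts_seen
        self.syncs[top_dir] += 1
        level = conflict_level(self.counts[top_dir] / self.syncs[top_dir])
        if table_row.get("conflict_frequency") != level:
            table_row["conflict_frequency"] = level
            table_row["sync_time_limit_sec"] = SYNC_TIME_LIMIT_SEC[level]
```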
- In the second embodiment, the NAS device 112 acquires statistical information on the conflict occurrence frequency between sites, so the conflict occurrence frequency can be estimated without the administrator entering the usage.
- This reduces the administrator's management cost and allows the present invention to be applied to directories for which the usage of the files is difficult to guess in advance.
- Whether files under a top directory may be updated from a plurality of sites can also be controlled. Specifically, files under a top directory with a low probability of conflict are allowed to be updated from multiple sites, whereas for files under a top directory with a high probability of conflict, only updates from a specific site are allowed and the other sites can only reference them. In this case, the division of the update files into subtree units described in the first embodiment is applied only to those with a low possibility of conflict occurrence.
Abstract
Description
<Physical Configuration of the System>
FIG. 1 is a block diagram showing an example of the physical configuration of an information processing system according to an embodiment of the present invention. Although only sites A and B are shown in FIG. 1, the system may include more sites, and each site can have the same configuration.
FIG. 2 is a block diagram showing an example of the logical configuration of the information processing system according to the embodiment of the present invention.
FIG. 3 is a block diagram showing an example of the internal configuration of the NAS device 102. The NAS device 112 at site B 110 has the same configuration. The NAS device 102 has a NAS controller 103 and a storage device 104.
FIG. 4 is a block diagram showing an example of the internal configuration of the CAS device 121. The CAS device 121 has a CAS controller 122 and a storage device 123.
FIG. 5 is a diagram showing a configuration example of the local site update file list 413 of the NAS device 102. The NAS device 112 manages a similar local site update file list 413.
FIG. 6 is a diagram showing a configuration example of the other site update file list 414 of the NAS device 102. The NAS device 112 manages a similar other site update file list 414.
FIG. 7 is a diagram showing a configuration example of the divided synchronization file list 415 of the NAS device 102. The NAS device 112 manages a similar divided synchronization file list 415.
FIG. 8 is a diagram showing a configuration example of the directory management table 416 of the NAS device 102. The NAS device 112 manages a similar directory management table 416.
FIG. 9 is a diagram showing a configuration example of the synchronized file list 509 of the CAS device 121.
FIG. 10 is a diagram showing a configuration example of the lock list 510 of the CAS device 121.
FIG. 11 is a flowchart for explaining the file read/write processing of the NAS device 102 according to the present invention. When the NAS device 102 receives a file read or write request from the client 101, it performs the read/write processing shown in FIG. 11. The processing shown in FIG. 11 is described below in step-number order.
When the NAS device 102 receives a file deletion or rename request from a client, it deletes or renames the corresponding file in FS_A 200 at its own site and then appends information about the update operation to the local site update file list 413. In addition, during rename processing, if the entries of the other site update file list 414 include the renamed file or directory, or a file or directory inside the renamed directory, the corresponding entries are converted to the post-rename paths. In this case, the pre-rename path contained in the file name 414C of the other site update file list 414 is replaced with the post-rename path name. For example, if a rename operation moves /home/userA/File_A to /home/userA/File_B and the other site update file list 414 contains /home/userA/File_A, the file name 414C of that entry is rewritten to /home/userA/File_B.
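A minimal sketch of this entry rewrite, assuming the other site update file list is held as a flat list of path strings; the function name is hypothetical, and the example reproduces the /home/userA/File_A to /home/userA/File_B case above.

```python
# Rewrites entries whose path falls under the renamed file/directory to the new path.
from typing import List

def rewrite_on_rename(other_site_list: List[str], old: str, new: str) -> List[str]:
    rewritten = []
    for path in other_site_list:
        if path == old or path.startswith(old + "/"):
            rewritten.append(new + path[len(old):])   # replace the renamed prefix
        else:
            rewritten.append(path)
    return rewritten

print(rewrite_on_rename(["/home/userA/File_A"], "/home/userA/File_A", "/home/userA/File_B"))
# -> ['/home/userA/File_B']
```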
FIG. 12 is a flowchart for explaining the synchronization processing of the NAS device 102. The processing shown in FIG. 12 is described below in step-number order.
FIG. 13 is a flowchart for explaining the details of the synchronization division processing described in step S2100. The processing shown in FIG. 13 is described below in step-number order.
S2116: The synchronization file division program 411 examines the top directories in the directory management table 416. If a top directory contains another top directory, the entries under the contained top directory are reordered so that they follow the entries of the containing top directory.
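A minimal sketch of the reordering in S2116, assuming the list is held as flat (top directory, entry) pairs; it moves the nested top directory's entries so that they follow the containing top directory's own entries, and it assumes the containing top directory has at least one entry. Names and representation are assumptions.

```python
# Moves the nested top directory's entries to just after the containing top directory's entries.
from typing import List, Tuple

def reorder_nested_top_dirs(entries: List[Tuple[str, str]],
                            outer: str, inner: str) -> List[Tuple[str, str]]:
    inner_entries = [e for e in entries if e[0] == inner]
    remaining = [e for e in entries if e[0] != inner]
    last_outer = max(i for i, e in enumerate(remaining) if e[0] == outer)
    return remaining[:last_outer + 1] + inner_entries + remaining[last_outer + 1:]

print(reorder_nested_top_dirs(
    [("/share/projA", "/share/projA/b.txt"), ("/share", "/share/a.txt")],
    outer="/share", inner="/share/projA"))
# -> [('/share', '/share/a.txt'), ('/share/projA', '/share/projA/b.txt')]
```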
FIG. 14 is a flowchart for explaining the details of the subtree-unit synchronization processing described in step S2200. The processing shown in FIG. 14 is described below in step-number order.
FIG. 15 is a flowchart for explaining the details of the synchronized file list reflection processing of the CAS device 121. The processing shown in FIG. 15 is described below in step-number order.
FIG. 16 is a schematic diagram showing a management interface for setting a directory to be published to the client 101.
The second embodiment of the present invention is described below.
In the second embodiment, the directory management table 416 does not manage the usage 416B; instead, the conflict frequency 416C and the synchronization time upper limit 416D are set based on statistical information.
In the second embodiment, the synchronization program 410 acquires statistical information on the conflict occurrence frequency between sites during the synchronized file list reflection processing. The differences from FIG. 15 are described below.
In the second embodiment, the administrator no longer needs to enter the usage when setting a public directory. That is, the check boxes 4160B to 4160E in FIG. 16 become unnecessary.
100 Site A (first sub-computer system)
110 Site B (second sub-computer system)
101, 111 Client
102, 112 NAS device (NAS)
105, 115 Data access network
120 Data center
121 CAS device (CAS)
122 CAS controller
123 Storage device
124 Management terminal
125 Management network
130, 140 Back-end network
200 File system FS_A
210 File system FS_A'
220 File system FS_A''
406 File sharing server program
407 File sharing client program
408 File system program
409 Operating system
410 Synchronization program
411 Synchronization file division program
412 Management screen display program
413 Local site update file list
413A File name
413B Update date and time
413C Update target
413D Update content
414 Other site update file list
414A Site
414B Synchronization number
414C File name
414D Update date and time
414E Update target
415 Divided synchronization file list
415A Subtree number
415B Exclusion range
415C File name
415D Update date and time
415E Update target
415F Update content
416 Directory management table
416A Top directory
416B Usage
416C Conflict frequency
416D Synchronization time upper limit
416E Average throughput
509 Synchronized file list
509A Site
509B Synchronization number
509C File name
509D Update date and time
509E Update target
509F Update content
510 Lock list
510A Subtree root
510B Lock owner
510C Acquisition date and time
510D Holding period
Claims (14)
- An information processing system comprising:
a plurality of first computers; and
a second computer connected to the plurality of first computers, wherein
each of the plurality of first computers comprises a first controller and a first storage medium,
the second computer comprises a second controller and a second storage medium,
the plurality of first computers are connected to a third computer,
the first controller stores a file transmitted from the third computer in the first storage medium,
the second controller stores a file transmitted from one of the plurality of first computers in the second storage medium, the file being updatable from the plurality of first computers, and
as data synchronization processing for reflecting, in the second computer, a file update made at an update-source first computer that is one of the plurality of first computers, the first controller of the update-source first computer divides a data synchronization processing target group into a plurality of file subtrees and executes data synchronization for each of the file subtrees.
- The information processing system according to claim 1, wherein the file subtree is
determined according to the usage of the directories or files that constitute the data synchronization processing target group and are stored in advance in the update-source first computer.
- The information processing system according to claim 2, wherein the file subtree is
determined such that the processing time of one data synchronization run is equal to or shorter than an upper limit time of the data synchronization processing.
- The information processing system according to claim 3, wherein the file subtree is
determined so as to be equal to or smaller than a data capacity calculated by multiplying the average transfer rate between the update-source first computer and the second computer by the upper limit time of the data synchronization processing.
- The information processing system according to claim 4, wherein
a conflict frequency, at which the update-source first computer and one or more first computers other than the update-source first computer among the plurality of first computers update the same file, is determined
according to the usage of the directories or files that constitute the data synchronization processing target group and are stored in advance in the update-source first computer.
- The information processing system according to claim 5, wherein
the upper limit time of the data synchronization processing is calculated by dividing the total size of the files to be synchronized by an average transfer rate determined according to the usage of the directories or files that constitute the data synchronization processing target group and are stored in advance in the update-source first computer.
- The information processing system according to claim 2, wherein the data synchronization processing is
executed while access from first computers other than the update-source first computer to the storage area in the second storage medium in which the update contents of the file subtree are reflected is prohibited.
- The information processing system according to claim 2, wherein
the frequency of conflicts, in which the update-source first computer and one or more first computers other than the update-source first computer among the plurality of first computers update the same file, is calculated by dividing the number of conflicts that occurred during a predetermined time by the predetermined time.
- The information processing system according to claim 1, wherein each of the plurality of first computers
comprises a management interface, running on the first controller, for setting information on the plurality of first computers or outputting information from the plurality of first computers, and
transmits the management interface to the third computer, and
the usage of the directories or files that constitute the data synchronization processing target group and are stored in advance in the plurality of first computers is input through the management interface.
- The information processing system according to claim 1, wherein
the data synchronization processing is executed starting from the files of the file subtree having the highest frequency of conflicts in which the update-source first computer and one or more first computers other than the update-source first computer among the plurality of first computers update the same file.
- The information processing system according to claim 10, wherein, when a plurality of file subtrees have the same conflict frequency, the data synchronization processing is executed giving priority to the file subtree having the smallest data size.
- A data synchronization control scheme for controlling data synchronization between different storage areas, wherein
the different storage areas comprise:
a first storage area that stores files; and
a second storage area that stores files transmitted from the first storage area,
the files stored in the first storage area and the second storage area are updatable, and
in data synchronization processing that reflects the update contents of a file stored in the first storage area in the file stored in the second storage area,
a data synchronization processing target group is divided into a plurality of file subtrees and data synchronization is executed for each of the file subtrees.
- The data synchronization control scheme according to claim 12, wherein the file subtree is
determined according to the usage of the directories or files that constitute the data synchronization processing target group and are stored in advance.
- The information processing system according to claim 1, wherein the first controller of the update-source first computer
divides the data synchronization processing target group into a plurality of file subtrees based on any of:
(1) the usage of the directories that constitute the data synchronization processing target group and are stored in advance in the update-source first computer;
(2) the usage of the files stored in advance in the update-source first computer;
(3) the processing time of one data synchronization run being equal to or shorter than the upper limit time of the data synchronization processing; and
(4) a data capacity calculated by multiplying the average transfer rate between the update-source first computer performing the data synchronization processing and the second computer by the upper limit time of the data synchronization processing, and
the data synchronization processing:
executes data synchronization starting from the file subtree having the highest frequency of conflicts in which the update-source first computer and one or more first computers other than the update-source first computer among the plurality of first computers update the same file;
when a plurality of file subtrees have the same conflict frequency, executes data synchronization giving priority to the file subtree having the smallest data size;
determines whether the update contents of the file subtree can be reflected in the second computer;
when it is determined that the reflection is possible,
prohibits access from first computers other than the update-source first computer to the storage area in the second computer in which the update contents of the file subtree are reflected,
compares the update time of the conflicting file at the update-source first computer with the update time of the conflicting file at the second computer and, when the update time at the first computer is older, saves the file of the file subtree of the second computer to a storage area different from the storage area in which the file of the file subtree is stored, and
releases the prohibition of access when the time during which access has been prohibited exceeds a preset upper limit of the prohibition time; and
when it is determined that the reflection is not possible,
executes data synchronization for a file subtree other than that file subtree.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2013/053909 WO2014128819A1 (ja) | 2013-02-19 | 2013-02-19 | 情報処理システム及びそのデータ同期制御方式 |
JP2015501101A JP6033949B2 (ja) | 2013-02-19 | 2013-02-19 | 情報処理システム |
US14/759,975 US10191915B2 (en) | 2013-02-19 | 2013-02-19 | Information processing system and data synchronization control scheme thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2013/053909 WO2014128819A1 (ja) | 2013-02-19 | 2013-02-19 | 情報処理システム及びそのデータ同期制御方式 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2014128819A1 true WO2014128819A1 (ja) | 2014-08-28 |
Family
ID=51390657
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2013/053909 WO2014128819A1 (ja) | 2013-02-19 | 2013-02-19 | 情報処理システム及びそのデータ同期制御方式 |
Country Status (3)
Country | Link |
---|---|
US (1) | US10191915B2 (ja) |
JP (1) | JP6033949B2 (ja) |
WO (1) | WO2014128819A1 (ja) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016121082A1 (ja) * | 2015-01-30 | 2016-08-04 | 株式会社日立製作所 | 計算機システム、ファイルストレージ装置、およびストレージ制御方法 |
WO2016121084A1 (ja) * | 2015-01-30 | 2016-08-04 | 株式会社日立製作所 | 計算機システム、ファイルストレージコントローラ、及び、データ共有方法 |
WO2016121083A1 (ja) * | 2015-01-30 | 2016-08-04 | 株式会社日立製作所 | 計算機システム、分散オブジェクト共有方法、エッジノード |
JP2020095589A (ja) * | 2018-12-14 | 2020-06-18 | 株式会社アール・アイ | 仮想ファイル処理システム及び仮想ファイル処理プログラム |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5783301B1 (ja) | 2014-06-11 | 2015-09-24 | 富士ゼロックス株式会社 | 通信端末、通信システム及びプログラム |
US9824093B1 (en) * | 2014-06-30 | 2017-11-21 | EMC IP Holding Company LLC | Datacenter maintenance |
US20170154066A1 (en) * | 2015-11-30 | 2017-06-01 | International Business Machines Corporation | Subscription service for monitoring changes in remote content |
JP6677072B2 (ja) * | 2016-05-13 | 2020-04-08 | 富士通株式会社 | 情報処理装置、情報処理システム、情報処理プログラム、及び情報処理方法 |
CN107770273A (zh) * | 2017-10-23 | 2018-03-06 | 上海斐讯数据通信技术有限公司 | 一种大文件云同步方法及系统 |
US10866963B2 (en) | 2017-12-28 | 2020-12-15 | Dropbox, Inc. | File system authentication |
US11593315B2 (en) * | 2018-06-26 | 2023-02-28 | Hulu, LLC | Data cluster migration using an incremental synchronization |
US11645236B2 (en) * | 2020-09-16 | 2023-05-09 | EMC IP Holding Company LLC | Extending retention lock protection from on-premises to the cloud |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000155710A (ja) * | 1998-11-20 | 2000-06-06 | Toshiba Corp | ネットワークコンピュータおよびそのネットワークコンピュータにおける同期処理方法 |
JP2002373101A (ja) * | 2001-06-14 | 2002-12-26 | Hitachi Ltd | ディレクトリ間連携方法 |
JP2008152772A (ja) * | 2006-12-14 | 2008-07-03 | Qnx Software Systems Gmbh & Co Kg | 同期順序の割り込み優先順位を用いた同期を有するメディアシステム |
US20120259813A1 (en) * | 2011-04-08 | 2012-10-11 | Hitachi, Ltd. | Information processing system and data processing method |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5319780A (en) | 1987-10-19 | 1994-06-07 | International Business Machines Corporation | System that implicitly locks a subtree or explicitly locks a node based upon whether or not an explicit lock request is issued |
US6161125A (en) * | 1998-05-14 | 2000-12-12 | Sun Microsystems, Inc. | Generic schema for storing configuration information on a client computer |
JP4378029B2 (ja) * | 1999-04-27 | 2009-12-02 | キヤノン株式会社 | データ処理方法及び装置及び記憶媒体 |
US7085779B2 (en) | 2001-06-04 | 2006-08-01 | Sun Microsystems, Inc. | File tree change reconciler |
US8037438B2 (en) * | 2009-02-27 | 2011-10-11 | International Business Machines Corporation | Techniques for parallel buffer insertion |
EP2740055A4 (en) * | 2011-08-01 | 2015-09-09 | Tagged Inc | SYSTEMS AND METHOD FOR ASYNCHRONOUS DISTRIBUTED DATABASE MANAGEMENT |
EP2791831B1 (en) * | 2012-01-25 | 2020-03-11 | Hitachi, Ltd. | Single instantiation method using file clone and file storage system utilizing the same |
- 2013
- 2013-02-19 JP JP2015501101A patent/JP6033949B2/ja not_active Expired - Fee Related
- 2013-02-19 US US14/759,975 patent/US10191915B2/en active Active
- 2013-02-19 WO PCT/JP2013/053909 patent/WO2014128819A1/ja active Application Filing
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000155710A (ja) * | 1998-11-20 | 2000-06-06 | Toshiba Corp | ネットワークコンピュータおよびそのネットワークコンピュータにおける同期処理方法 |
JP2002373101A (ja) * | 2001-06-14 | 2002-12-26 | Hitachi Ltd | ディレクトリ間連携方法 |
JP2008152772A (ja) * | 2006-12-14 | 2008-07-03 | Qnx Software Systems Gmbh & Co Kg | 同期順序の割り込み優先順位を用いた同期を有するメディアシステム |
US20120259813A1 (en) * | 2011-04-08 | 2012-10-11 | Hitachi, Ltd. | Information processing system and data processing method |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016121082A1 (ja) * | 2015-01-30 | 2016-08-04 | 株式会社日立製作所 | 計算機システム、ファイルストレージ装置、およびストレージ制御方法 |
WO2016121084A1 (ja) * | 2015-01-30 | 2016-08-04 | 株式会社日立製作所 | 計算機システム、ファイルストレージコントローラ、及び、データ共有方法 |
WO2016121083A1 (ja) * | 2015-01-30 | 2016-08-04 | 株式会社日立製作所 | 計算機システム、分散オブジェクト共有方法、エッジノード |
JPWO2016121083A1 (ja) * | 2015-01-30 | 2017-08-24 | 株式会社日立製作所 | 計算機システム、分散オブジェクト共有方法、エッジノード |
US10412163B2 (en) | 2015-01-30 | 2019-09-10 | Hitachi, Ltd. | Computer system, distributed object sharing method, and edge node |
US10459893B2 (en) | 2015-01-30 | 2019-10-29 | Hitachi, Ltd. | Computer system, file storage apparatus, and storage control method |
US11106635B2 (en) | 2015-01-30 | 2021-08-31 | Hitachi, Ltd. | Computer system, file storage controller, and data sharing method |
JP2020095589A (ja) * | 2018-12-14 | 2020-06-18 | 株式会社アール・アイ | 仮想ファイル処理システム及び仮想ファイル処理プログラム |
JP7164176B2 (ja) | 2018-12-14 | 2022-11-01 | アップデータ株式会社 | 仮想ファイル処理システム及び仮想ファイル処理プログラム |
Also Published As
Publication number | Publication date |
---|---|
JPWO2014128819A1 (ja) | 2017-02-02 |
US20150358408A1 (en) | 2015-12-10 |
JP6033949B2 (ja) | 2016-11-30 |
US10191915B2 (en) | 2019-01-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6033949B2 (ja) | 情報処理システム | |
US11650959B2 (en) | System and method for policy based synchronization of remote and local file systems | |
US10019460B2 (en) | Hosted file sync with direct access to hosted files | |
EP2836901B1 (en) | Method and apparatus for virtualization of a file system, data storage system for virtualization of a file system, and file server for use in a data storage system | |
JP5433074B2 (ja) | ストレージクラスタを指定可能な複製されたコンテンツのための非同期的分散オブジェクトアップロード | |
JP5918244B2 (ja) | フォールトトレラントデータベース管理システムにおいてクエリ結果を統合するシステム及び方法 | |
JP5661188B2 (ja) | ファイルシステム及びデータ処理方法 | |
US9798486B1 (en) | Method and system for file system based replication of a deduplicated storage system | |
US8762344B2 (en) | Method for managing information processing system and data management computer system | |
US20170124111A1 (en) | System And Method For Synchronizing File Systems With Large Namespaces | |
EP2615566A2 (en) | Unified local storage supporting file and cloud object access | |
JP5516575B2 (ja) | データ挿入システム | |
US20110196838A1 (en) | Method and System for Managing Weakly Mutable Data In A Distributed Storage System | |
JP2013545162A5 (ja) | ||
TW201227291A (en) | Data deduplication | |
US20120254555A1 (en) | Computer system and data management method | |
JP6279770B2 (ja) | ファイルサーバ装置 | |
WO2017187311A1 (en) | Storage constrained synchronization engine | |
Liu et al. | UGSD: scalable and efficient metadata management for EB-scale file systems | |
US11656946B2 (en) | Cloud-native global file system with reshapable caching | |
WO2023033100A1 (en) | Processing apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 13876068 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2015501101 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 14759975 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 13876068 Country of ref document: EP Kind code of ref document: A1 |