US7100007B2 - Backup system and method based on data characteristics - Google Patents

Publication number
US7100007B2
Authority
US
United States
Prior art keywords
backup
data
computer device
target data
destination computer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related, expires
Application number
US10/794,241
Other versions
US20050060356A1 (en)
Inventor
Nobuyuki Saika
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi Ltd
Original Assignee
Hitachi Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi Ltd
Assigned to HITACHI, LTD. Assignment of assignors interest (see document for details). Assignors: SAIKA, NOBUYUKI
Publication of US20050060356A1
Priority to US11/486,610 (US20060259724A1)
Application granted
Publication of US7100007B2
Expired - Fee Related
Adjusted expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1456 Hardware arrangements for backup
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G06F11/1451 Management of the data involved in backup or backup restore by selection of backup contents
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1464 Management of the backup or restore process for networked environments

Definitions

  • the present invention relates to a technology for backing up data, and, more specifically, to a technology for backing up data to at least one device among a plurality of backup destination devices via a communication network, for example.
  • backup target data a backup destination server constituting a destination for storing data to be backed up
  • the data characteristics relating to the backup target data are not considered when a backup is made.
  • A variety of types of backup target data may exist for the user. For example, some backup target data may need to undergo backup processing distinctly from other data, some may need backup processing with an emphasis on security, and some may require that reliability be secured, as in the case of a backup to a plurality of locations.
  • the backup system comprises: a backup source computer device that stores backup target data; a plurality of backup destination computer devices each connected to the backup source computer device via a network; a backup mode selector that selects, according to data characteristics of the backup target data, any one backup mode from among a plurality of pre-prepared backup modes; and a backup executor that stores the backup target data by transferring same from the backup source computer device to a backup destination computer device that is selected on the basis of the selected backup mode from among the backup destination computer devices.
  • the backup source computer device and the backup destination computer device may be constituted as a computer system that is capable of using storage devices such as hard disk or semiconductor memory device, as in the case of a file server (NAS (Network Attached Storage)) or similar, for example.
  • the backup source computer device can comprise a file system that is shared by a plurality of users, for example.
  • Examples of backup target data may include data files created by each user, data groups constituting the content of a database, and system files defining the constitution and the like of the user system, and so forth. With this embodiment, backups are performed in file units.
  • Data characteristics can be defined as information denoting the data usage characteristics possessed by the backup target data, for example, and can be classified in accordance with the purpose for using the backup target data, and the form of usage.
  • backup modes are prepared in accordance with predefined plural-type data characteristics.
  • the backup mode selector discriminates data characteristics of the backup target data and selects a backup mode that matches the data characteristics.
  • a backup destination computer device that is used as the backup destination of the backup target data is determined by the selection of the backup mode.
  • the backup executor transfers the backup target data to the selected backup destination computer device.
  • the backup target data is accordingly prepared for the backup destination computer device.
  • Methods for transferring backup target data can be broadly classified into two types. One method is a method that transmits backup target data from a backup source computer device to a backup destination computer device. The other method is one in which the backup destination computer device accesses the backup source computer device to download backup target data.
  • the backup mode selector determines whether the backup target data possesses any data characteristic on the basis of pre-prepared characteristic classification conditions.
  • Characteristic classification conditions are discrimination information serving to discriminate whether the backup target data pertains to any of the predefined plural-type data characteristics.
  • data characteristics that are pre-classified by the characteristic classification conditions can be expressed as defined data characteristics, data characteristic types, data characteristic categories, and so forth, for example.
  • the backup mode selector determines whether the backup target data possesses any data characteristics by comparing acquired metadata relating to the backup target data, and characteristic classification conditions.
  • Examples of acquired metadata relating to the backup target data include a file name, file size, an update date and time, a file extension (that is, the file type), access group management information set in a file, names of the users sharing the file, the total number of common users, and the category to which the common users belong (job categories such as the planning department, accounting department, and development department, as well as ranking categories such as person in charge, section manager, head of department, and executive, for example).
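  • As a rough illustration of the metadata items listed above, the following is a minimal sketch that collects a file name, file size, update date and time, and file extension from an ordinary file system. The function name `collect_metadata` and the dictionary keys are hypothetical; the access group, common user, and category items are omitted because they depend on the particular file server.

```python
import os
from datetime import datetime, timezone


def collect_metadata(path):
    """Gather a few of the metadata items named above for one file.

    Hypothetical helper: the key names are illustrative only.
    """
    st = os.stat(path)
    name = os.path.basename(path)
    return {
        "file_name": name,
        "file_size": st.st_size,
        # Last modification time, reported in UTC.
        "updated": datetime.fromtimestamp(st.st_mtime, tz=timezone.utc),
        # Lower-cased extension, e.g. ".doc"; empty string if none.
        "extension": os.path.splitext(name)[1].lower(),
    }
```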
  • Data characteristics include any one of data characteristics that prioritize the securing of data reliability or data characteristics that prioritize the securing of data security.
  • both data characteristics that place emphasis on data reliability and data characteristics that place emphasis on data security are included.
  • Data characteristics that prioritize the securing of data reliability are a data characteristic segment in which data consistency is secured and the prevention of data destruction and loss is required.
  • the data characteristics that prioritize the securing of data reliability can be determined by considering at least one or more judgment elements among judgment elements such as the number of common users, file extension type, file name, and the presence or absence of write permissions, for example.
  • the data characteristics that prioritize the securing of data security are a data characteristic segment in which data secrecy is retained and the prevention of unauthorized copying and so forth is required.
  • Data characteristics that prioritize the securing of data security can be determined by considering at least one or more judgment elements among judgment elements such as the presence or absence of encryption, the number of common users, special features common to common users, the presence or absence of access restrictions, file extension type, file name, and the presence or absence of predetermined keywords, for example.
  • the backup executor selects a backup destination computer device constituting a backup destination on the basis of backup destination mapping information constituted so as to pre-match at least one or more backup destination computer devices of the backup destination computer devices with each backup mode.
  • the backup executor comprises: a backup list generator that generates a backup list that includes information specifying backup target data to be acquired by the backup destination computer device; and a backup list transmitter that transmits the backup list to the backup destination computer device, and wherein the backup destination computer device comprises: a data acquisitor that stores backup target data by acquiring same from the backup source computer device on the basis of the backup list received from the backup source computer device.
  • examples of information specifying the backup target data include information on the path to the backup target data, the file name, and so forth.
  • the backup list is prepared for each of the backup destination computer devices and each of the backup target data items to be acquired by each backup destination computer device is explicit in each backup list. Accordingly, the backup destination computer device is able to specify backup target data to be acquired, and acquire backup target data from the backup source computer device by referencing only the backup list that is addressed to the backup destination computer device.
  • the backup list includes information indicating a backup availability time when backup target data can be acquired from the backup source computer device, and the backup destination computer device accesses the backup source computer device according to the backup availability time to acquire the backup target data.
  • the reading of backup target data at a time other than the backup availability time can be prevented beforehand, and hence stability can be raised.
  • upon receiving the backup list from the backup source computer device, the backup destination computer device generates restore data to be used for restoring the backup target data, and transmits the restore data thus generated to the backup source computer device.
  • the restore data can include discrimination information for specifying data (or a data group) that is backed up to the backup destination computer device.
  • the backup list transmitter controls the time for transmitting the backup list to the backup destination computer device.
  • the time of the backup processing by each backup destination computer device can be adjusted and the processing load on the backup source computer device and the communication network traffic can be controlled.
  • a backup method for performing a backup between a backup source computer device for storing backup target data and a plurality of backup destination computer devices each connected to the backup source computer device via a network, comprises the steps of: determining data characteristics pertaining to backup target data on the basis of characteristic classification conditions for classifying data characteristics; determining a backup destination computer device for each of the backup target data on the basis of the determined data characteristics and of backup destination mapping information that is constituted so that at least one or more of the backup destination computer devices constituting backup destinations correspond(s) with each of the data characteristics; collecting, for each of the backup destination computer devices, the backup target data corresponding with the backup destination computer devices; generating, for each of the backup destination computer devices, a backup list that includes information specifying backup target data to be acquired by the backup destination computer devices; transmitting the generated backup lists to the backup destination computer devices; and transferring the backup target data from the backup source computer device to the backup destination computer devices on the basis of the received backup lists.
  • a computer device comprising: a component that stores characteristic classification conditions for classifying data characteristics; a component that determines data characteristics pertaining to backup target data on the basis of the characteristic classification conditions; a component that stores backup destination mapping information constituted such that at least one or more backup destination computer devices constituting a backup destination correspond(s) with each of the data characteristics; a component that determines a backup destination computer device for each of the backup target data on the basis of the determined data characteristics and the backup destination mapping information; a component that collects, for each of the backup destination computer devices, the backup target data corresponding with the backup destination computer devices and generates, for each of the backup destination computer devices, a backup list including information specifying backup target data to be acquired by the backup destination computer devices; a component that transmits the generated backup lists to the backup destination computer devices; and a component that transfers the backup target data to the backup destination computer devices when the backup destination computer devices request the acquisition of backup target data on the basis of the transmitted backup lists.
  • the computer program according to another aspect of the present invention is a computer program that causes a computer device for storing backup target data to execute a method for issuing a backup request, the backup method comprising the steps of: determining data characteristics pertaining to backup target data on the basis of characteristic classification conditions for classifying data characteristics; determining a backup destination computer device for each of the backup target data on the basis of the determined data characteristics and of backup destination mapping information that is constituted so that at least one or more of the backup destination computer devices constituting backup destinations correspond(s) with each of the data characteristics; generating, for each of the backup destination computer devices, a backup list that includes information specifying backup target data to be acquired by the backup destination computer devices; and transmitting the generated backup lists to the backup destination computer devices.
  • FIG. 1 is an overall constitutional view of the backup system according to an embodiment of the present invention.
  • FIG. 2 is a block diagram showing the functions of the servers 3 and 6 A to 6 C that constitute the backup system according to this embodiment.
  • FIG. 3 shows an example of data characteristic classification definition information.
  • FIG. 4 shows an example of classification result data.
  • FIG. 5 shows an example of backup destination mapping information.
  • FIG. 6 shows an example of backup lists 200 A, 200 B, and 200 C.
  • FIG. 7 is an image diagram of the flow of the processing of a backup request unit 12 .
  • FIG. 8 shows the constitution of an archive file created by a backup request acceptance unit 21 .
  • FIG. 9 is a flowchart showing the flow of the processing of a data characteristic classification unit 11 that a backup source server 3 comprises.
  • FIG. 10 is a flowchart showing the flow of the processing of the backup request unit 12 .
  • FIG. 11 is a flowchart showing the flow of the processing of the backup request unit 12 .
  • FIG. 12 is a flowchart showing the flow of the processing of a download acceptance unit 13 .
  • FIG. 13 shows the flow of the processing of the backup request acceptance unit 21 of the backup destination server.
  • FIG. 14 shows the flow of the processing of the backup request acceptance unit 21 of the backup destination server.
  • FIG. 15 shows an example of a restore file.
  • FIG. 16 shows the flow of the restore processing of the backup source server 3 .
  • FIG. 1 is an overall constitutional view of the backup system according to an embodiment of the present invention.
  • the backup system has a single backup source data center 1 (or a plurality thereof), and a plurality (three, for example) of backup destination data centers ( 2 A, 2 B, 2 C).
  • the backup source data center 1 is a data center constituting the backup source of backup target files.
  • the center 1 comprises one or a plurality of backup source storage devices 4 that store backup target files, and a backup source server 3 , which can be communicably connected to the backup source storage devices 4 via a communication network or the like such as an SAN (Storage Area Network).
  • the backup destination data centers are data centers for storing backups of files that are stored in the backup source data center 1 .
  • the center 2 A comprises one or a plurality of backup destination storage devices 5 A, which are capable of storing backups of backup target files; and a backup destination server 6 A that can be communicably connected to the backup destination storage devices 5 A via a communication network or the like such as an SAN.
  • another center 2 B (and 2 C) also comprises a storage device 5 B (and 5 C) like the center 2 A, and a server 6 B (and 6 C).
  • the servers 3 and 6 A to 6 C, and the storage devices 4 and 5 A to 5 C will be described below.
  • the backup source server 3 classifies one or a plurality of backup target files in the backup source storage device 4 into one or more file groups (groups including one or more backup target files) that have common data characteristics on the basis of respective data characteristics for these backup target files. Further, the backup source server 3 transmits one or more backup target files pertaining to each file group to one or more backup destination servers 6 A to 6 C on the basis of common data characteristics in these file groups.
  • the backup source storage device 4 is a storage system that comprises an external or internal hard disk, or one or a plurality of hard disks in the form of an array, for example, and is able to store backup target files.
  • various files are of a predetermined format and are managed according to a hierarchical structure in which, for example, a second directory lies below a first directory and one or a plurality of files are stored in the second directory. Therefore, if, for example, the operator designates the first directory as the backup target with respect to the backup source server 3 , all the directories and files that lie below the first directory are designated as the backup target.
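  • The directory designation described above can be sketched as follows: designating a first directory makes every file below it a backup target. This is a minimal sketch assuming an ordinary local file system; the function name `expand_backup_target` is hypothetical.

```python
import os


def expand_backup_target(root):
    """Return every file path at or below `root`, sorted for stable output."""
    files = []
    # os.walk descends into every subdirectory (second directory and below).
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            files.append(os.path.join(dirpath, name))
    return sorted(files)
```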
  • the backup destination servers 6 A to 6 C receive backup target files from the backup source server 3 and store these backup target files in the communicably connected backup destination storage devices 5 A to 5 C respectively.
  • the backup destination storage devices 5 A to 5 C are storage devices in which backup target files are stored via the backup destination servers 6 A to 6 C from the backup source server 3 , and are magnetic tape libraries equipped with one or a plurality of magnetic tapes, or storage systems comprising one or a plurality of hard disks in the form of an array, for example.
  • FIG. 2 is a block diagram showing the functions of the servers 3 and 6 A to 6 C that constitute the backup system according to this embodiment.
  • the backup source server 3 comprises an operating system (OS) 10 such as a Microsoft Windows (Trademark) operating system, and comprises, as application software above this OS 10 , a data characteristic classification unit 11 , a backup request unit 12 , a download acceptance unit 13 , and a restore unit 14 .
  • the data characteristic classification unit 11 acquires, with predetermined timing (immediately following the designation of the backup target or at fixed intervals, for example), metadata for the designated backup target file (and/or metadata for one or more directories containing the file) from the backup source storage device 4 .
  • the data characteristic classification unit 11 then classifies the designated backup target file on the basis of the acquired metadata and pre-prepared data characteristic classification definition information.
  • the file (and ‘directory’) ‘metadata’ represents characteristics relating to the file (and directory) and includes at least one type of information among the following subinformation in (1) to (7) below, for example:
  • data characteristic classification definition information is information relating to rules on how backup target files are classified, and is created by a predetermined individual (an operations manager, for example) and then stored in a predetermined location on a communication network (in the backup source server 3 or backup source storage device 4 , for example). More specifically, as shown in FIG. 3 , for example, data characteristic classification definition information includes one or more data characteristic IDs for discriminating one or more data characteristic types and one or more rule bodies that correspond with the one or more data characteristic IDs.
  • the rule body is information representing conditions for assigning the corresponding data characteristic IDs (in other words, information representing the data characteristic type corresponding with the data characteristic ID).
  • Each rule body has a predetermined constitution, comprising a plurality of subconditions and logic operators that link the subconditions, for example (AND, OR, XOR, and so forth, for example).
  • In the example of FIG. 3 , one rule body shows that the file is not encrypted, that there is no ACL setting and no restriction relating to access-enabled user groups (that is, common users are not set), and that the file extension is ‘.html’, ‘.doc’, or ‘.xls’. If a file corresponding to the conditions indicated by this rule body (a file with metadata or actual data satisfying the conditions of the rule body, for example) exists, the data characteristic ID ‘ID- 002 ’ is assigned to this file.
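  • A rule body built from subconditions linked by logic operators could be evaluated along the following lines. This is only a sketch: the nested-tuple representation, the function name `evaluate`, and the metadata keys (`encrypted`, `has_acl`, `common_users`, `extension`) are assumptions made for illustration, not the format used in FIG. 3 . The sample rule mirrors the conditions described for ‘ID- 002 ’.

```python
def evaluate(rule, meta):
    """Evaluate a rule body against a metadata dict.

    A rule is either a callable subcondition or a tuple
    (operator, operand, ...) with operator AND, OR, or XOR.
    XOR over several operands is treated as odd parity (an assumption).
    """
    if callable(rule):
        return bool(rule(meta))
    op, *operands = rule
    results = [evaluate(r, meta) for r in operands]
    if op == "AND":
        return all(results)
    if op == "OR":
        return any(results)
    if op == "XOR":
        return sum(results) % 2 == 1
    raise ValueError(f"unknown operator: {op}")


# Hypothetical encoding of the rule body described for 'ID-002'.
RULE_ID_002 = (
    "AND",
    lambda m: not m["encrypted"],
    lambda m: not m["has_acl"],
    lambda m: not m["common_users"],
    ("OR",
     lambda m: m["extension"] == ".html",
     lambda m: m["extension"] == ".doc",
     lambda m: m["extension"] == ".xls"),
)
```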
  • the individual creating the data characteristic classification definition information (hereinafter called the ‘creator’) is not limited to the example shown in FIG. 3 , and is able to create a variety of rule bodies by preparing subconditions of any kind and connecting any of the prepared plurality of subconditions in some way.
  • a rule body can be created on the basis of at least one aspect of the following aspects (A) to (C), for example.
  • a rule body can be created on the basis of at least one aspect of the following aspects (a) to (f), for example.
  • the data characteristic classification unit 11 assigns, on the basis of metadata (and/or actual data) of one or more designated backup target data files and the above-mentioned data characteristic classification definition information, one or more data characteristic IDs corresponding with one or more rule bodies satisfied by the file to the one or more backup target files (that is, performs classification of the backup target files).
  • the data characteristic classification unit 11 then outputs data relating to the classification result, that is, for example, as illustrated in FIG. 4 , classification result data D 21 that is produced by associating information relating to the files (such as the file names, path names, and data sizes of the files, for example) with one or more data characteristic IDs assigned to these files, for each of the one or more backup target files. Further, as illustrated in FIG.
  • the data characteristic classification unit 11 assigns a predetermined code (‘Default’, for example), which indicates that such a condition is absent, to the file in place of data characteristic IDs or as one data characteristic ID. Furthermore, although not especially shown in FIG. 4 , when a backup target file satisfying a plurality of conditions each indicated by a plurality of rule bodies exists, a plurality of data characteristic IDs are assigned to one backup target file.
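  • The assignment behavior described above (every satisfied rule body contributes its data characteristic ID, and ‘Default’ is used when none is satisfied) can be sketched as follows. The function name `classify`, the predicate-style rules, and the sample IDs are hypothetical.

```python
def classify(files, definitions, default="Default"):
    """Map each file name to the list of data characteristic IDs it satisfies.

    `files` maps a file name to its metadata dict; `definitions` maps a
    data characteristic ID to a predicate over that metadata (assumed form).
    """
    result = {}
    for name, meta in files.items():
        ids = [cid for cid, rule in definitions.items() if rule(meta)]
        # No rule body satisfied: assign the predetermined code instead.
        result[name] = ids or [default]
    return result


# Illustrative definitions only.
DEFINITIONS = {
    "ID-001": lambda m: m["encrypted"],
    "ID-002": lambda m: m["extension"] == ".doc",
}
```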
  • When the backup request unit 12 receives a backup request from outside (the operations manager, for example) with predetermined timing, the backup request unit 12 collates the classification result data D 21 (see FIG. 4 ) output by the data characteristic classification unit 11 , along with pre-prepared backup destination mapping information. Then, on the basis of the classification result data and the backup destination mapping information, the backup request unit 12 prepares, for each of the backup destination servers 6 A to 6 C, information relating to which backup target file is transmitted to which backup destination server, such as a backup list (described later), for example, and then transmits each backup list to the backup destination servers 6 A to 6 C to which these lists are addressed.
  • the backup destination mapping information includes information indicating to which backup destination server a file assigned each data characteristic ID (or the above-mentioned ‘Default’ indicating the absence thereof) is backed up, that is, information (a host (server) name, IP address, and so forth, for example) relating to one (or a plurality of) backup destination servers associated with each of a plurality of data characteristic IDs.
  • This backup destination mapping information is created automatically by a computer or manually by a predetermined individual (an operations manager, for example), and is pre-stored in a predetermined location on a communication network (in the backup source server 3 or backup source storage device 4 , for example).
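  • One possible in-memory form of the backup destination mapping information is a table from data characteristic ID to one or more host names. The sketch below assumes that form; the host names, the IDs, and the function name `destinations_for` are hypothetical and are not taken from FIG. 5 .

```python
# Hypothetical mapping: data characteristic ID -> backup destination hosts.
MAPPING = {
    "ID-001": ["hostA", "hostB"],  # e.g. reliability: more than one location
    "ID-002": ["hostC"],
    "Default": ["hostA"],
}


def destinations_for(char_ids, mapping):
    """Return the hosts for a file's IDs, without duplicates, in first-seen order."""
    hosts = []
    for cid in char_ids:
        for host in mapping.get(cid, []):
            if host not in hosts:
                hosts.append(host)
    return hosts
```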
  • the backup lists are prepared in the same quantity as the backup destination servers.
  • the three backup lists 200 A, 200 B, and 200 C illustrated in FIG. 6 are prepared for three backup destination servers 6 A, 6 B, and 6 C by the backup request unit 12 . If this is described representatively with respect to the backup list 200 A, the ‘acceptance date and time’ and information relating to the backup destination server corresponding with this list 200 A (the ‘host name’ and ‘file list’ as shown, for example) are recorded in the backup list 200 A.
  • the ‘acceptance date and time’ is information indicating the date and time (or permitted time slot) when the backup destination server 6 A is granted access to the backup source server 3 , and is expressed in predetermined units (year/month/day/hour/minutes/seconds, for example).
  • the acceptance date and time is allocated automatically by the backup request unit 12 , for example, but may be established manually by a predetermined user (operations manager, for example).
  • the backup request unit 12 is able to avoid a concentration of the load resulting from the backup processing on the backup destination servers 6 A to 6 C by varying the respective acceptance date and time of the backup lists 200 A to 200 C at fixed time intervals (a time interval that is presumed necessary in order for the backup destination servers 6 A to 6 C to acquire one or more predetermined backup target files from the backup source server 3 , for example). Further, the time required in order to acquire one or more predetermined backup target files can be estimated from the total of the data size of the backup target files, for example.
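  • The staggering of acceptance dates and times can be sketched as follows, assuming that the interval for each server is estimated from the total data size of its files and an assumed transfer rate. The `bytes_per_second` parameter, the function names, and the host names are illustrative assumptions; the source only states that the time can be estimated from the total data size.

```python
from datetime import datetime, timedelta


def estimate_seconds(total_bytes, bytes_per_second):
    """Estimated transfer time in whole seconds, rounded up."""
    return -(-total_bytes // bytes_per_second)


def schedule(start, sizes_by_host, bytes_per_second):
    """Assign consecutive, non-overlapping acceptance times to each host.

    `bytes_per_second` is an assumed throughput, not a value from the source.
    """
    times = {}
    current = start
    for host, total_bytes in sizes_by_host.items():
        times[host] = current
        current += timedelta(
            seconds=estimate_seconds(total_bytes, bytes_per_second)
        )
    return times
```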
  • the ‘host name’ is information indicating the name of the backup destination server 6 A.
  • the ‘file list’ expresses information relating to one or more backup target files classified as backed up to the backup destination server 6 A (the file name, path name, data size, and so forth, of each file, for example) in list format.
  • the other backup lists 200 B and 200 C are substantially the same as the backup list 200 A.
  • the backup request unit 12 creates the backup lists 200 A to 200 C based on the flow described below.
  • FIG. 7 is an image diagram of the flow of the processing of the backup request unit 12 .
  • the backup request unit 12 collates the classification result data D 21 that is output by the data characteristic classification unit 11 , and pre-prepared backup destination mapping information D 22 , and thus obtains data D 23 that is produced by converting a data characteristic ID in the classification result data D 21 into a backup destination host name (backup destination server name).
  • the backup request unit 12 sorts the sets of file names and host names recorded in this data D 23 by host name, and thus converts the data D 23 into sorted data D 24 .
  • the backup request unit 12 then divides up and outputs this data D 24 into files for each host name, and creates the backup lists 200 A to 200 C corresponding to the backup destination servers 6 A to 6 C respectively by adding the above-mentioned acceptance date and time to each output file (further, the backup lists 200 A to 200 C may be divided into even smaller files, in which case the acceptance date and time is added to each further divided file).
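  • The flow from D 21 to the backup lists can be sketched as below: data characteristic IDs are converted to host names (D 23 ), the pairs are sorted by host name (D 24 ), and the result is split per host with an acceptance date and time added. The data shapes, the function name `build_backup_lists`, and the dictionary keys are assumptions made for illustration.

```python
from itertools import groupby


def build_backup_lists(d21, d22, acceptance_times):
    """Build one backup list per host.

    d21: list of (file_path, [data characteristic IDs])   (classification result)
    d22: dict of data characteristic ID -> [host names]   (destination mapping)
    acceptance_times: dict of host name -> acceptance date and time
    """
    # D23: replace each data characteristic ID by its host name(s).
    d23 = {(path, host) for path, ids in d21 for cid in ids for host in d22[cid]}
    # D24: the same pairs, sorted by host name (then by path for stability).
    d24 = sorted(d23, key=lambda pair: (pair[1], pair[0]))
    lists = {}
    # Divide D24 per host and add the acceptance date and time.
    for host, pairs in groupby(d24, key=lambda pair: pair[1]):
        lists[host] = {
            "acceptance": acceptance_times[host],
            "host_name": host,
            "file_list": [path for path, _ in pairs],
        }
    return lists
```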
  • the backup request unit 12 transmits the backup lists 200 A to 200 C so created to the corresponding backup destination servers 6 A to 6 C. Accordingly, the backup destination servers 6 A to 6 C assign discrimination information (hereinafter ‘backup discrimination information’) to the received backup lists 200 A to 200 C and transmit this backup discrimination information to the backup source server 3 .
  • the backup source server 3 receives backup discrimination information from each of the backup destination servers 6 A to 6 C and the backup request unit 12 creates restore information for recovering backup target files on the basis of the backup discrimination information (the restore information as well as the restoration processing that employs this information will be described in detail subsequently).
  • the backup request unit 12 also stores the transmitted backup lists 200 A to 200 C in predetermined storage regions (predetermined storage regions in the backup source server 3 or backup source storage device 4, for example). The stored lists are used, upon receipt of a download request (described subsequently) from the backup destination servers 6 A to 6 C, to check the validity of the date and time at which the request is received and to back up one or more predetermined backup target files to the backup destination server that issued the request.
  • the download acceptance unit 13 accepts backup target file download (transfer) requests from the backup destination servers 6 A to 6 C, and, in the event of a request, checks whether the date and time when the request was received are valid. More specifically, in a case where the download acceptance unit 13 receives a download request from the backup destination server 6 A, for example, the download acceptance unit 13 judges whether there is a complete or substantial match between this date and time and a date and time that is designated in advance by the backup source server 3 with respect to the backup destination server 6 A (that is, the acceptance date and time written in the backup list 200 A that is output by the backup request unit 12 ).
  • if such a match exists, the download acceptance unit 13 reads one or more backup target files that have the one or more file names written in the backup list 200 A from a predetermined location (the backup source storage device 4, for example), and transmits these backup target files to the backup destination server 6 A that is the source of the download request. On the other hand, if no such match exists, the download acceptance unit 13 performs predetermined processing, for example by communicating an error to the backup destination server 6 A.
  • the above-mentioned ‘substantial match’ means that the difference between the current date and time when the download request is received and the acceptance date and time lies within a predetermined error range, for example, and this predetermined error range may be common to all the backup lists or vary from one backup list to the next.
  • the predetermined error range may be varied by a predetermined user or may be fixed so as to be unchangeable. Further, the predetermined error range may be stored in a predetermined storage device separately from the backup list or may be described in the backup list.
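The 'complete or substantial match' test described above amounts to a tolerance comparison between two points in time. A minimal sketch follows, assuming the acceptance date and time is a single instant rather than a time slot; the function name `is_substantial_match` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def is_substantial_match(received_at, acceptance_at, error_range):
    """Return True when the request time lies within the error range of the acceptance time.

    received_at   : datetime at which the download request was received
    acceptance_at : acceptance date and time written in the backup list
    error_range   : timedelta giving the predetermined error range (assumed symmetric)
    """
    return abs(received_at - acceptance_at) <= error_range


# An error range of zero reduces the check to a complete match.
acceptance = datetime(2004, 3, 5, 2, 0)
on_time = is_substantial_match(datetime(2004, 3, 5, 2, 3), acceptance, timedelta(minutes=5))
too_late = is_substantial_match(datetime(2004, 3, 5, 2, 10), acceptance, timedelta(minutes=5))
```

Passing the error range as a parameter reflects the statement that it may be common to all backup lists or may vary from one backup list to the next.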
  • the restore unit 14 restores a backup target file on the basis of the restore information created by the backup request unit 12 (described in detail subsequently with respect to the restore processing).
  • backup destination server 6 A will be described representatively for the backup destination servers 6 A to 6 C with reference to FIG. 2 (further, although the backup destination server 6 A is illustrated representatively in FIG. 2 , the other backup destination servers 6 B and 6 C are also able to communicate with the backup source server 3 ).
  • the backup destination server 6 A comprises an operating system (OS) 20 and the backup request acceptance unit 21 as application software above this OS.
  • the backup request acceptance unit 21 receives the backup list 200 A from the backup source server 3 and stores this list in a predetermined storage region (a predetermined storage region in the backup destination server 6 A or backup destination storage device 5 A, for example). The backup request acceptance unit 21 then generates backup discrimination information for this backup on the basis of the backup list 200 A, and stores this information in a predetermined storage region. Then, after running a process to perform backup processing, the backup request acceptance unit 21 transmits the stored backup discrimination information to the backup source server 3.
  • the backup process is in a standby state until the current date and time reaches the acceptance date and time listed in the received and stored backup list 200 A (until the current date and time falls within the range of a time slot when the acceptance date and time is expressed by this time slot).
  • the backup request acceptance unit 21 runs a backup process, issues a download (transfer) request to the download acceptance unit 13 of the backup source server 3 , and creates an archive file that has the stored backup discrimination information.
  • FIG. 8 shows the constitution of an archive file created by the backup request acceptance unit 21 .
  • the archive file stores the backup discrimination information generated for the backup list 200 A, the entry number of backup target files (that is, the number of backup target files stored in the archive file), and backup target information in an amount corresponding to the entry number (such as the data size, the path within the backup source server 3, and the body (the file itself) of each file, for example).
  • when the backup request acceptance unit 21 shown in FIG. 2 downloads one or more backup target files recorded in the backup list 200 A from the backup source server 3 in response to the download request, it stores these backup target files in the archive file. Once the processing to store the backup target files in the archive file is complete, the backup request acceptance unit 21 stores the archive file containing the backup target files in the backup destination storage device 5 A.
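One way to picture the archive file of FIG. 8 is as a small record structure. The sketch below is an assumption-laden illustration: the class names `ArchiveEntry` and `ArchiveFile`, the field names, and the use of a string for the backup discrimination information are all hypothetical, since the specification does not fix a concrete format.

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveEntry:
    path: str    # path of the file within the backup source server
    size: int    # data size in bytes
    body: bytes  # the file itself


@dataclass
class ArchiveFile:
    discrimination_info: str  # backup discrimination information for the backup list
    entry_number: int         # number of backup target files the archive is expected to hold
    entries: list = field(default_factory=list)

    def store(self, path, body):
        """Store one downloaded backup target file in the archive."""
        self.entries.append(ArchiveEntry(path=path, size=len(body), body=body))

    def is_complete(self):
        """True once every file named in the backup list has been stored."""
        return len(self.entries) == self.entry_number
```

In this sketch the entry number is set when the archive is created from the backup list, and `is_complete` corresponds to the point at which the archive would be written to the backup destination storage device.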
  • FIG. 9 is a flowchart showing the flow of the processing of the data characteristic classification unit 11 that the backup source server 3 comprises.
  • a predetermined user inputs a backup target directory (or a directory with the backup target file, for example) to the backup source server 3 (step S 1 ). Further, the data characteristic classification unit 11 reads (S 2 ) the data characteristic classification definition information (see FIG. 3 ) that has been preset and stored.
  • the data characteristic classification unit 11 searches for directories and files contained in the directory that is input in S 1 (that is, on a level below the directory), and, if the sought directories and files are present (YES in S 3 ), acquires metadata for all these files and directories (and/or actual data) (S 4 ).
  • the data characteristic classification unit 11 collates (S 5 ) metadata (and/or actual data) for the files (and directories) acquired in S 4 and data characteristic classification definition information read in S 2 , performs classification based on the data characteristics of the backup target files by capturing the data characteristic IDs corresponding to the backup target files and then associating these data characteristic IDs with the files, and outputs (S 6 ) the classification result data representing the classification results (see FIG. 4 ), and stores this data in a predetermined storage region.
  • the data characteristic classification unit 11 may read a plurality of files in the directory that was input in S 1 one by one, and then repeatedly execute S 4 to S 6 .
  • the data characteristic classification unit 11 performs S 4 to S 6 by reading out a certain single file from the plurality of files in the directory input in S 1 , and then performs S 4 to S 6 by reading out another single file, and may repeat this processing until it is complete for all these plural files.
  • the one or more backup target files retrieved in S 3 are classified according to a predetermined standard, such as at least one standard among (A) to (C) and (a) to (f) mentioned earlier, for example, based on the file data characteristics. That is, one or more characteristic ID data items is (are) assigned to each backup target file on the basis of at least one item among: the number of common users of the file, special features common to the common users, an extension, a keyword, and the presence or absence of access restriction information such as an ACL, and the presence or absence of encryption, for example.
  • a plurality of conditions expressed by a rule body are sometimes satisfied, in which case a plurality of data characteristic IDs are assigned to one backup target file.
  • similarly, when a plurality of server information items correspond with one data characteristic ID, one backup target file is backed up to two or more servers.
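The classification step (S 5) and the resulting fan-out to several servers can be sketched as a rule table evaluated against each file's metadata. Everything concrete here is an assumption for illustration: the ID strings, the metadata keys, the threshold of ten common users, and the host names are hypothetical and are not taken from the classification definition information of FIG. 3.

```python
def classify(metadata, rules):
    """Return every data characteristic ID whose rule body is satisfied by the metadata."""
    return [char_id for char_id, condition in rules if condition(metadata)]


def destinations(char_ids, mapping):
    """Collect the distinct backup destination hosts for the assigned IDs, in first-seen order."""
    hosts = []
    for char_id in char_ids:
        for host in mapping[char_id]:
            if host not in hosts:
                hosts.append(host)
    return hosts


# Hypothetical rule bodies; each pairs a data characteristic ID with a condition.
RULES = [
    ("RELIABILITY", lambda m: m["common_users"] >= 10),
    ("SECURITY", lambda m: m["encrypted"] or m["has_acl"]),
]

# Hypothetical mapping information; one ID may correspond to several servers.
MAPPING = {"RELIABILITY": ["server6A", "server6B"], "SECURITY": ["server6C"]}

ids = classify({"common_users": 12, "encrypted": True, "has_acl": False}, RULES)
targets = destinations(ids, MAPPING)
```

Here a single file satisfies two rule bodies, receives two data characteristic IDs, and is therefore destined for three servers.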
  • the serial flow above can also be performed with predetermined timing, such as immediately after the designation, for example, or can be performed at fixed or irregular intervals after the designation. In the latter case, for example, if the user designates a pre-prepared desired directory as the backup target and stores a file in this desired directory, the classification of the file stored in the desired directory is performed automatically at fixed intervals or with other predetermined timing.
  • FIGS. 10 and 11 are flowcharts showing the flow of the processing of the backup request unit 12 .
  • when the backup request unit 12 receives a backup request from outside (from the operations manager, for example) with predetermined timing, it reads the classification result data output by the data characteristic classification unit 11 from a predetermined storage region (S 11 ), as shown in FIG. 10.
  • the backup request unit 12 reads the pre-prepared backup destination mapping information (S 12 ).
  • the backup request unit 12 then sets the counter value at ‘0 ’ (S 13 ), and compares this value with the number of files recorded in the classification result data (S 14 ).
  • the backup request unit 12 performs the processing of (S 15 ) to (S 18 ) below until the counter value equals the number of files of the backup target files recorded in the classification result data (NO in S 14 ).
  • the backup request unit 12 acquires (S 15 ) the data characteristic ID corresponding with the file name (or path name) of the target recorded in the classification result data.
  • the backup request unit 12 references the backup destination mapping information to acquire (S 16 ) the host name corresponding with the data characteristic ID acquired in S 15.
  • the backup request unit 12 associates the host name acquired in S 16 with the file name of the target in S 15, renders the set of the file name and the host name one record, and outputs this record to a predetermined temporary file (the data file D 23 shown in FIG. 7, for example).
  • the backup request unit 12 sorts the one or more records recorded in the temporary file by the host names (S 19 ).
  • the backup request unit 12 divides up the temporary file whose records have been sorted by the host name. That is, the backup request unit 12 performs division to produce the same number of files as the types of host names (that is, the backup destination servers 6 A to 6 C) recorded in the temporary file, and creates and outputs (S 20 ) the backup lists 200 A to 200 C corresponding with the backup destination servers 6 A to 6 C by recording the acceptance date and time in the files obtained by this division.
  • the backup request unit 12 performs the following processing on all the backup lists 200 A to 200 C.
  • the backup request unit 12 captures (S 25 ) the backup destination servers 6 A to 6 C by acquiring the host names (backup destination server names) from the backup lists 200 A to 200 C.
  • the backup request unit 12 then transmits (S 26 ) each of the backup lists 200 A to 200 C to the backup destination servers 6 A to 6 C thus captured in S 25 .
  • the backup request unit 12 also stores these backup lists 200 A to 200 C in a predetermined storage region.
  • the backup request unit 12 receives (S 27 ) a response that includes the above-mentioned backup discrimination information from the backup destination servers 6 A to 6 C.
  • the backup request unit 12 then renders the backup discrimination information included in the response and information (host name, for example) relating to the backup destination server constituting the information transmission source a set, and outputs this set (S 28 ) to a predetermined file (for example, a restore file described subsequently).
  • FIG. 12 is a flowchart showing the flow of the processing of the download acceptance unit 13 .
  • when the download acceptance unit 13 receives (YES in S 31 ) a download request including the host name of the server 6 A from the backup destination server 6 A, for example, it acquires (S 32 ) the acceptance date and time and the host name from all the backup lists 200 A to 200 C output by the backup request unit 12.
  • the download acceptance unit 13 compares the host name and the current date and time included in the download request received in S 31 with the host name and acceptance date and time acquired in S 32 , and thus judges whether a match exists (S 33 ).
  • if a match exists, the download acceptance unit 13 reads out one or more backup target files having the one or more file names listed in the backup list 200 A from the backup source storage device 4 and transmits (S 34 ) the one or more backup target files thus read to the backup destination server 6 A that is the transmission source of the download request.
  • if no match exists, the download acceptance unit 13 transmits an error to the backup destination server 6 A (S 35 ).
  • FIGS. 13 and 14 show the flow of the processing of the backup request acceptance unit 21 of the backup destination server.
  • the backup destination server is described below as the backup destination server 6 A.
  • the backup request acceptance unit 21 of the backup destination server 6 A receives (S 41 ) the backup list 200 A from the backup request unit 12 of the backup source server 3 and stores the backup list 200 A in a predetermined storage region.
  • the backup request acceptance unit 21 creates backup discrimination information relating to the backup list 200 A (S 42 ).
  • the backup request acceptance unit 21 then generates and runs the backup process (S 43 ).
  • the backup request acceptance unit 21 transmits (S 44 ) the backup discrimination information thus created in S 42 to the backup request unit 12 of the backup source server 3 .
  • the backup request acceptance unit 21 creates (S 52 ) an archive file (see FIG. 8 ) with the backup discrimination information created in S 42 by means of the backup process run in S 43 .
  • the backup request acceptance unit 21 records (S 53 ) information relating to the backup lists 200 in the archive file. For example, based on the backup lists 200 , the backup request acceptance unit 21 registers the number of file names recorded in the backup list 200 A as the entry number in the created archive file and registers the path (path within the backup source server 3 ) of each file.
  • the backup request acceptance unit 21 receives (YES in S 54 , and S 55 ) one or more backup target files each having one or more file names written in the backup list 200 A from the backup source server 3 and stores the received backup target files in the archive file (S 56 ).
  • when the backup request acceptance unit 21 has downloaded all the backup target files and stored them in the archive file (NO in S 54 ), it stores the archive file in the backup destination storage device 5 A (S 57 ).
  • data characteristic classification definition information in which one or a plurality of data characteristic IDs correspond with one or more data characteristic types, and mapping information in which one, or two or more backup destination server information items (server names, for example) correspond with one or a plurality of data characteristic IDs are prepared.
  • upon receiving a backup target designation, the backup source server 3 acquires metadata (and/or actual data) for the designated files (and/or directories) with predetermined timing. Based on the above data characteristic classification definition information, it sets data characteristic IDs (that is, data characteristic types) for the backup target files; based on the set data characteristic IDs and the mapping information, it determines the backup destination servers 6 A to 6 C of the backup target files, and the backup target files are then transmitted to the servers 6 A to 6 C so determined.
  • the designated backup target is automatically backed up to the backup destination matching the data characteristic type of the backup target on the basis of the data characteristics of the backup target. That is, backup processing, which is suited to the data characteristics relating to the backup target, is performed by means of a method that is simple for the user.
  • when there is no match between the current date and time at which a download request is received from a certain backup destination server 6 A and the acceptance date and time allocated to the backup list 200 A of the server 6 A, that is, when a download request is received at a date and time other than the predetermined acceptance date and time, the backup source server 3 does not perform a backup of the backup target file. Accordingly, unauthorized downloading of the backup target file can be prevented before it takes place, whereby the security of the backup target file can be raised.
  • the backup discrimination information that the backup destination servers 6 A to 6 C create upon receiving the backup lists 200 A to 200 C is used by the backup source server 3 in order to recover the backup target files written in the backup lists 200 A to 200 C.
  • the backup discrimination information corresponding with the backup lists 200 A to 200 C may be any information as long as the backup source server 3 is able to obtain the backup target files written in the corresponding backup list from the backup destination servers 6 A to 6 C.
  • the backup discrimination information can be information including at least one of the backup destination server name, the name of the backed up backup target file, and the data size. In such a case, the backup source server 3 can inform any backup destination server which file is to be stored by managing such information.
  • the backup source server 3 associates and records backup discrimination information corresponding with each of the servers 6 A to 6 C with information relating to a plurality of backup destination servers 6 A to 6 C (host name, for example) in a predetermined restore file D 30 shown in FIG. 15 , for example.
  • the backup source server 3 restores the backup target file to the backup source storage device 4 as follows by using the restore file D 30 .
  • FIG. 16 shows the flow of the restore processing of the backup source server 3 .
  • the backup source server 3 performs the processing of (S 61 ) to (S 65 ) below with respect to each of the servers 6 A to 6 C whose host names are recorded in the restore file D 30. This processing is described representatively for server 6 A below.
  • the backup source server 3 connects to the backup destination server 6 A.
  • the backup source server 3 reads the backup discrimination information for the server 6 A constituting the connection destination from the restore file D 30, sets (S 62 ) the storage destination directory for the backup target file to be subsequently acquired from the backup destination server 6 A in the backup source storage device 4, and acquires the path of this directory.
  • the backup source server 3 communicates the read backup discrimination information to the backup destination server 6 A. Based on this backup discrimination information, the archive file that stores the backup target file constituting the recovery target is specified to the server 6 A and the backup target file is acquired from the specified archive file, whereby the backup source server 3 receives the acquired backup target file from the backup destination server 6 A.
  • the backup source server 3 stores the backup target file received from the backup destination server 6 A in the directory set in S 62 .
  • the backup source server 3 is able to restore one or more backup target files, which have been backed up in the backup destination servers 6 A to 6 C respectively, to the backup source storage device 4.
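The restore loop can be sketched with in-memory stand-ins for the restore file D 30, the backup destination servers, and the backup source storage device. This is only a simplified illustration: the function name `restore_all` and the dictionary-based data shapes are assumptions, and network connection handling is omitted entirely.

```python
def restore_all(restore_file, servers, source_storage):
    """Sketch of restore processing driven by the restore file.

    restore_file   : list of (host name, backup discrimination information), stands in for D30
    servers        : dict of host name -> {discrimination information -> list of (path, body)},
                     a stand-in for the archive files held by each backup destination server
    source_storage : dict of path -> bytes, a stand-in for the backup source storage device
    Returns the number of files restored.
    """
    restored = 0
    for host, discrimination_info in restore_file:
        # "Connect" to the destination and let it locate the archive file
        # identified by the backup discrimination information.
        archive = servers[host][discrimination_info]
        for path, body in archive:
            # Store each received backup target file back under its original path.
            source_storage[path] = body
            restored += 1
    return restored
```

The sketch shows why the backup discrimination information only needs to be sufficient for the destination server to identify the corresponding archive file.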
  • the backup request unit 12 is able to avoid a concentration of the load resulting from the backup processing on the backup destination servers 6 A to 6 C by varying the respective acceptance date and time of the backup lists 200 A to 200 C at fixed time intervals (a time interval that is presumed necessary in order for the backup destination servers 6 A to 6 C to acquire one or more predetermined backup target files from the backup source server 3 , for example).
  • This acceptance date and time may be established manually by the individual requesting the backup or may be established automatically by the backup source server 3 .
  • when the acceptance date and time are established automatically, the backup source server 3 is able to capture the total data size of the one or more backup target files for each of the backup destination servers 6 A to 6 C, estimate the time required for the backup on the basis of the data size, and schedule the acceptance date and time on the basis of the estimated time, for example (the acceptance date and time may be set in the order of the estimated backup time, starting with either the shortest or the longest time, for example).
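Automatic scheduling of staggered acceptance times might look like the following sketch. It assumes a single constant transfer rate for estimating backup time, which the specification does not state; the function name `schedule_acceptance` and its parameters are likewise hypothetical.

```python
from datetime import datetime, timedelta


def schedule_acceptance(total_bytes_by_host, start, bytes_per_second, shortest_first=True):
    """Assign each backup destination server a non-overlapping acceptance date and time.

    total_bytes_by_host : dict of host name -> total data size of its backup target files
    start               : datetime of the first acceptance slot
    bytes_per_second    : assumed transfer rate used to estimate the backup time
    shortest_first      : order slots by estimated time, shortest (True) or longest (False) first
    """
    estimates = {host: timedelta(seconds=size / bytes_per_second)
                 for host, size in total_bytes_by_host.items()}
    order = sorted(estimates, key=estimates.get, reverse=not shortest_first)

    schedule = {}
    slot = start
    for host in order:
        schedule[host] = slot
        slot += estimates[host]  # the next server starts once this one is expected to finish
    return schedule
```

Spacing each slot by the previous server's estimated duration is one way of avoiding a concentration of load; a fixed interval between slots would be an equally valid reading of the text.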
  • the backup destination servers 6 A to 6 C may issue a download request immediately after receiving a backup list from the backup source server 3 .
  • the acceptance date and time need not be written in the backup list, for example.
  • the backup source server 3 may transmit all the backup lists 200 A to 200 C to the backup destination servers 6 A to 6 C at the same time, or may schedule the timing for transmitting the backup lists 200 A to 200 C and perform transmission at another time.
  • the concentration of the load on the backup source server 3 or network can be avoided by adjusting the timing for transmitting the backup lists 200 A to 200 C.
  • the timing for transmitting the backup lists may be scheduled on the basis of an estimated time by capturing the total data size of one or more backup target files for each of the backup destination servers 6 A to 6 C, for example, and estimating the time required for a backup on the basis of this data size (the transmission timing may be brought forward for a shorter or longer estimated backup time, for example).

Abstract

The present invention permits backup processing suited to data characteristics relating to backup target data by means of a method that is simple for the user. Data characteristic classification definition information, in which data characteristic IDs correspond with one or more data characteristic types respectively, and mapping information, in which backup destination server information corresponds with one or more data characteristic IDs respectively, are prepared. The backup source server 3 sets data characteristic IDs in each of the backup target files on the basis of metadata of designated backup target files and of the data characteristic classification definition information, and then determines backup destination servers 6A to 6C for each of the backup target files on the basis of the set data characteristic IDs and mapping information and transmits the backup target files to the determined servers 6A to 6C.

Description

CROSS-REFERENCES TO RELATED APPLICATIONS
This application relates to and claims priority from Japanese Patent Application No. 2003-320771 filed on Sep. 12, 2003, the entire disclosure of which is incorporated herein by reference.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a technology for backing up data, and, more specifically, to a technology for backing up data to at least one device among a plurality of backup destination devices via a communication network, for example.
2. Description of the Related Art
Conventionally known backup systems are systems in which a plurality of backup destination servers exist in a communication network such as the Internet, and at least one backup destination server is selected from this plurality of backup destination servers, and data is backed up by transferring data to the backup destination servers. In such a system, according to Japanese Patent Publication Laid Open No. 2002-215474, for example, a backup destination server constituting a destination for storing data to be backed up (hereinafter “backup target data”) is selected from among the plurality of backup destination servers on the basis of the reliability, performance, or processing speed of the plurality of backup destination servers, and then backup processing is performed by transferring backup target data to the selected backup destination server.
However, the above-mentioned conventional backup systems do not consider the data characteristics of the backup target data when a backup is made. From the user's standpoint, a variety of types of backup target data may exist. For example, some backup target data needs to undergo backup processing distinctly from other data, some needs to undergo backup processing with an emphasis on security, and some must have its reliability secured, as in the case of a backup to a plurality of locations.
In order to perform a backup in which the data characteristics of such backup target data are considered, settings with respect to how the backup processing is performed may be considered one at a time. However, the setting of backup processing in such small units is a tedious operation and troublesome for the user.
Accordingly, it is a feature of the present invention to make it possible to perform backup processing that is suited to the data characteristics pertaining to the backup target data by means of a method that is straightforward for the user.
BRIEF SUMMARY OF THE INVENTION
The backup system according to one aspect of the present invention comprises: a backup source computer device that stores backup target data; a plurality of backup destination computer devices each connected to the backup source computer device via a network; a backup mode selector that selects, according to data characteristics of the backup target data, any one backup mode from among a plurality of pre-prepared backup modes; and a backup executor that stores the backup target data by transferring same from the backup source computer device to a backup destination computer device that is selected on the basis of the selected backup mode from among the backup destination computer devices.
The backup source computer device and the backup destination computer device may be constituted as a computer system that is capable of using storage devices such as a hard disk or a semiconductor memory device, as in the case of a file server (NAS (Network Attached Storage)) or similar, for example. The backup source computer device can comprise a file system that is shared by a plurality of users, for example. Examples of backup target data may include data files created by each user, data groups constituting the content of a database, system files defining the constitution and the like of the user system, and so forth. With this embodiment, backups are performed in file units. Data characteristics can be defined as information denoting the data usage characteristics possessed by the backup target data, for example, and can be classified in accordance with the purpose for using the backup target data and the form of usage.
According to an embodiment of the present invention, backup modes are prepared in accordance with predefined plural-type data characteristics. When a backup instruction is issued by the operations manager or similar, or a preset backup time arrives, the backup mode selector discriminates data characteristics of the backup target data and selects a backup mode that matches the data characteristics. A backup destination computer device that is used as the backup destination of the backup target data is determined by the selection of the backup mode. The backup executor transfers the backup target data to the selected backup destination computer device. The backup target data is accordingly prepared for the backup destination computer device. Methods for transferring backup target data can be broadly classified into two types. One is a method in which the backup source computer device transmits backup target data to a backup destination computer device. The other is a method in which the backup destination computer device accesses the backup source computer device to download backup target data.
According to an embodiment of the present invention, the backup mode selector determines whether the backup target data possesses any data characteristic on the basis of pre-prepared characteristic classification conditions.
Characteristic classification conditions are discrimination information serving to discriminate whether the backup target data pertains to any of the predefined plural-type data characteristics. Here, data characteristics that are pre-classified by the characteristic classification conditions can be expressed as defined data characteristics, data characteristic types, data characteristic categories, and so forth, for example.
According to an embodiment of the present invention, the backup mode selector determines whether the backup target data possesses any data characteristics by comparing acquired metadata relating to the backup target data, and characteristic classification conditions.
Examples of acquired metadata relating to the backup target data include, for example, a file name, file size, an update date and time, a file extension (that is, the file type), access group management information set in a file, names of the users sharing the file, the total number of common users, and the category to which the common users belong (job categories such as the planning department, accounting department, development department, as well as ranking categories such as person in charge, section manager, head of department, and executive, for example), and so forth.
Data characteristics include any one of data characteristics that prioritize the securing of data reliability or data characteristics that prioritize the securing of data security.
According to an embodiment of the present invention, both data characteristics that place emphasis on data reliability and data characteristics that place emphasis on data security are included.
Data characteristics that prioritize the securing of data reliability are a data characteristic segment in which data consistency is secured and the prevention of data destruction and loss is required. The data characteristics that prioritize the securing of data reliability can be determined by considering at least one or more judgment elements among judgment elements such as the number of common users, file extension type, file name, and the presence or absence of write permissions, for example.
The data characteristics that prioritize the securing of data security are a data characteristic segment in which data secrecy is retained and the prevention of unauthorized copying and so forth is required. Data characteristics that prioritize the securing of data security can be determined by considering at least one or more judgment elements among judgment elements such as the presence or absence of encryption, the number of common users, special features common to common users, the presence or absence of access restrictions, file extension type, file name, and the presence or absence of predetermined keywords, for example.
According to an embodiment of the present invention, the backup executor selects a backup destination computer device constituting a backup destination on the basis of backup destination mapping information constituted so as to pre-match at least one or more backup destination computer devices of the backup destination computer devices with each backup mode.
According to an embodiment of the present invention, the backup executor comprises: a backup list generator that generates a backup list that includes information specifying backup target data to be acquired by the backup destination computer device; and a backup list transmitter that transmits the backup list to the backup destination computer device, and wherein the backup destination computer device comprises: a data acquisitor that stores backup target data by acquiring same from the backup source computer device on the basis of the backup list received from the backup source computer device.
Here, examples of information specifying the backup target data include information on the path to the backup target data, the file name, and so forth. The backup list is prepared for each of the backup destination computer devices and each of the backup target data items to be acquired by each backup destination computer device is explicit in each backup list. Accordingly, the backup destination computer device is able to specify backup target data to be acquired, and acquire backup target data from the backup source computer device by referencing only the backup list that is addressed to the backup destination computer device.
According to an embodiment of the present invention, the backup list includes information indicating a backup availability time when backup target data can be acquired from the backup source computer device, and the backup destination computer device accesses the backup source computer device according to the backup availability time to acquire the backup target data.
By including a backup availability time in the backup list, the reading of backup target data at a time other than the backup availability time can be prevented beforehand, and hence stability can be raised.
According to an embodiment of the present invention, upon receiving the backup list from the backup source computer device, the backup destination computer device generates restore data to be used for restoring the backup target data, and transmits the restore data thus generated to the backup source computer device.
For example, the restore data can include discrimination information for specifying data (or a data group) that is backed up to the backup destination computer device.
According to an embodiment of the present invention, the backup list transmitter controls the time for transmitting the backup list to the backup destination computer device.
For example, as a result of a shift between the backup availability time that is granted one backup destination computer device and a backup availability time that is granted another backup destination computer device, the time of the backup processing by each backup destination computer device can be adjusted and the processing load on the backup source computer device and the communication network traffic can be controlled.
A backup method according to another aspect of the present invention for performing a backup between a backup source computer device for storing backup target data and a plurality of backup destination computer devices each connected to the backup source computer device via a network, comprises the steps of: determining data characteristics pertaining to backup target data on the basis of characteristic classification conditions for classifying data characteristics; determining a backup destination computer device for each of the backup target data on the basis of the determined data characteristics and of backup destination mapping information that is constituted so that at least one or more of the backup destination computer devices constituting backup destinations correspond(s) with each of the data characteristics; collecting, for each of the backup destination computer devices, the backup target data corresponding with the backup destination computer devices; generating, for each of the backup destination computer devices, a backup list that includes information specifying backup target data to be acquired by the backup destination computer devices; transmitting the generated backup lists to the backup destination computer devices; and transferring the backup target data from the backup source computer device to the backup destination computer devices on the basis of the received backup lists.
A computer device according to yet another aspect of the present invention comprises: a component that stores characteristic classification conditions for classifying data characteristics; a component that determines data characteristics pertaining to backup target data on the basis of the characteristic classification conditions; a component that stores backup destination mapping information constituted such that at least one or more backup destination computer devices constituting a backup destination correspond(s) with each of the data characteristics; a component that determines a backup destination computer device for each of the backup target data on the basis of the determined data characteristics and the backup destination mapping information; a component that collects, for each of the backup destination computer devices, the backup target data corresponding with the backup destination computer devices and generates, for each of the backup destination computer devices, a backup list including information specifying backup target data to be acquired by the backup destination computer devices; a component that transmits the generated backup lists to the backup destination computer devices; and a component that transfers the backup target data to the backup destination computer devices when the backup destination computer devices request the acquisition of backup target data on the basis of the transmitted backup lists.
The computer program according to another aspect of the present invention is a computer program that causes a computer device for storing backup target data to execute a method for issuing a backup request, the backup method comprising the steps of: determining data characteristics pertaining to backup target data on the basis of characteristic classification conditions for classifying data characteristics; determining a backup destination computer device for each of the backup target data on the basis of the determined data characteristics and of backup destination mapping information that is constituted so that at least one or more of the backup destination computer devices constituting backup destinations correspond(s) with each of the data characteristics; generating, for each of the backup destination computer devices, a backup list that includes information specifying backup target data to be acquired by the backup destination computer devices; and transmitting the generated backup lists to the backup destination computer devices.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is an overall constitutional view of the backup system according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the functions of the servers 3 and 6A to 6C that constitute the backup system according to this embodiment.
FIG. 3 shows an example of data characteristic classification definition information.
FIG. 4 shows an example of classification result data.
FIG. 5 shows an example of backup destination mapping information.
FIG. 6 shows an example of backup lists 200A, 200B, and 200C.
FIG. 7 is an image diagram of the flow of the processing of a backup request unit 12.
FIG. 8 shows the constitution of an archive file created by a backup request acceptance unit 21.
FIG. 9 is a flowchart showing the flow of the processing of a data characteristic classification unit 11 that a backup source server 3 comprises.
FIG. 10 is a flowchart showing the flow of the processing of the backup request unit 12.
FIG. 11 is a flowchart showing the flow of the processing of the backup request unit 12.
FIG. 12 is a flowchart showing the flow of the processing of a download acceptance unit 13.
FIG. 13 shows the flow of the processing of the backup request acceptance unit 21 of the backup destination server.
FIG. 14 shows the flow of the processing of the backup request acceptance unit 21 of the backup destination server.
FIG. 15 shows an example of a restore file.
FIG. 16 shows the flow of the restore processing of the backup source server 3.
DETAILED DESCRIPTION OF THE INVENTION
An embodiment of the present invention will be described hereinbelow with reference to the drawings.
FIG. 1 is an overall constitutional view of the backup system according to an embodiment of the present invention.
As shown in FIG. 1, the backup system according to this embodiment has a single backup source data center 1 (or a plurality thereof), and a plurality (three, for example) of backup destination data centers (2A, 2B, 2C).
The backup source data center 1 is a data center constituting the backup source of backup target files. The center 1 comprises one or a plurality of backup source storage devices 4 that store backup target files, and a backup source server 3, which can be communicably connected to the backup source storage devices 4 via a communication network or the like such as an SAN (Storage Area Network).
Meanwhile, the backup destination data centers (2A, 2B, and 2C) are data centers for storing backups of files that are stored in the backup source data center 1. For example, the center 2A comprises one or a plurality of backup destination storage devices 5A, which are capable of storing backups of backup target files; and a backup destination server 6A that can be communicably connected to the backup destination storage devices 5A via a communication network or the like such as an SAN. As is shown, another center 2B (and 2C) also comprises a storage device 5B (and 5C) like the center 2A, and a server 6B (and 6C).
The servers 3 and 6A to 6C, and the storage devices 4 and 5A to 5C will be described below.
The backup source server 3 classifies one or a plurality of backup target files in the backup source storage device 4 into one or more file groups (groups including one or more backup target files) that have common data characteristics on the basis of respective data characteristics for these backup target files. Further, the backup source server 3 transmits one or more backup target files pertaining to each file group to one or more backup destination servers 6A to 6C on the basis of common data characteristics in these file groups.
The backup source storage device 4 is a storage system that comprises an external or internal hard disk, or one or a plurality of hard disks in the form of an array, for example, and is able to store backup target files. In the backup source storage device 4, various files are of a predetermined format and are managed according to a hierarchical structure in which, for example, a second directory lies below a first directory and one or a plurality of files are stored in the second directory. Therefore, if, for example, the operator designates the first directory as the backup target with respect to the backup source server 3, all the directories and files that lie below the first directory are designated as the backup target.
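The effect of designating a directory as the backup target can be sketched as follows. This is a minimal illustration only, assuming the backup source storage device is exposed as an ordinary file system; the helper name `expand_backup_target` and the sample file names are hypothetical and do not appear in this specification.

```python
import os
import tempfile


def expand_backup_target(first_directory):
    """Return every file path below the designated directory (hypothetical helper)."""
    targets = []
    # os.walk visits the designated directory and all directories below it.
    for dirpath, _dirnames, filenames in os.walk(first_directory):
        for name in filenames:
            targets.append(os.path.join(dirpath, name))
    return sorted(targets)


# Usage: build a small "first directory / second directory" tree and expand it.
with tempfile.TemporaryDirectory() as first:
    second = os.path.join(first, "second")
    os.makedirs(second)
    for name in ("a.doc", "b.xls"):
        with open(os.path.join(second, name), "w") as handle:
            handle.write("x")
    found = [os.path.relpath(p, first) for p in expand_backup_target(first)]
```

Designating only the first directory is therefore enough for both files in the second directory to be treated as backup targets.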
The backup destination servers 6A to 6C receive backup target files from the backup source server 3 and store these backup target files in the communicably connected backup destination storage devices 5A to 5C respectively.
The backup destination storage devices 5A to 5C are storage devices in which backup target files are stored via the backup destination servers 6A to 6C from the backup source server 3, and are magnetic tape libraries equipped with one or a plurality of magnetic tapes, or storage systems comprising one or a plurality of hard disks in the form of an array, for example.
FIG. 2 is a block diagram showing the functions of the servers 3 and 6A to 6C that constitute the backup system according to this embodiment.
The backup source server 3 comprises an operating system (OS) 10 such as a Microsoft Windows (Trademark) operating system, and comprises, as application software above this OS 10, a data characteristic classification unit 11, a backup request unit 12, a download acceptance unit 13, and a restore unit 14.
When a backup target is designated from outside the backup source server 3 (by an operator or a remote external device, for example), the data characteristic classification unit 11 acquires, with predetermined timing (immediately following the designation of the backup target or at fixed intervals, for example), metadata for the designated backup target file (and/or metadata for one or more directories containing the file) from the backup source storage device 4. The data characteristic classification unit 11 then classifies the designated backup target file on the basis of the acquired metadata and pre-prepared data characteristic classification definition information.
Here, the file (and ‘directory’) ‘metadata’ represents characteristics relating to the file (and directory) and includes at least one type of information among the following subinformation in (1) to (7) below, for example:
  • (1) the number of common users (the number of users allowed to access the file (and directory) and view same);
  • (2) extension (such as ‘jpg’ or ‘gif’, for example);
  • (3) keyword (character or character string contained in the file name, directory name and/or actual data, for example);
  • (4) presence or absence of write permissions (whether or not writing is permitted);
  • (5) encryption attribute (information indicating whether the file (and directory) is encrypted);
  • (6) presence or absence of ACL (Access Control List) settings (whether there is corresponding information (that is, an ACL) indicating which users or user groups can gain access and the manner in which they do so (reading, writing, or execution, for example)); and
  • (7) special characteristics of common users (common user posts or departments, for example) (may be another type of information (such as message data, for example) relating to access restrictions instead of the ACL of the ‘presence or absence of ACL settings’ in (6), for example).
Further, data characteristic classification definition information is information relating to rules on how backup target files are classified, and is created by a predetermined individual (an operations manager, for example) and then stored in a predetermined location on a communication network (in the backup source server 3 or backup source storage device 4, for example). More specifically, as shown in FIG. 3, for example, data characteristic classification definition information includes one or more data characteristic IDs for discriminating one or more data characteristic types and one or more rule bodies that correspond with the one or more data characteristic IDs. The rule body is information representing conditions for assigning the corresponding data characteristic IDs (in other words, information representing the data characteristic type corresponding with the data characteristic ID). Each rule body has a predetermined constitution, comprising a plurality of subconditions and logic operators that link the subconditions, for example (AND, OR, XOR, and so forth, for example). In FIG. 3, for example, in the case of the rule body for the data characteristic ‘ID-002’, the file is not encrypted, and there is no ACL setting and no restrictions relating to access-enabled user groups (that is, common users are not set), and the rule body shows that the file extension is ‘.html’, ‘.doc’, or ‘.xls’. If a file corresponding to the conditions indicated by this rule body (a file with metadata or actual data satisfying the conditions of the rule body, for example) exists, the data characteristic ID ‘ID-002’ is assigned to this file.
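A rule body of this kind, that is, subconditions linked by logic operators, could be represented as composable predicates. The following is a minimal sketch that encodes the conditions described above for ‘ID-002’; the class `FileMetadata`, the combinators `all_of`, `any_of`, and `negate`, and the sample file names are assumptions made for illustration and are not part of the definition information shown in FIG. 3.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class FileMetadata:
    """Hypothetical subset of the metadata used by the rule bodies."""
    name: str
    encrypted: bool
    has_acl: bool
    common_users: Tuple[str, ...]  # empty tuple: common users are not set

    @property
    def extension(self) -> str:
        dot = self.name.rfind(".")
        return self.name[dot:].lower() if dot != -1 else ""


Subcondition = Callable[[FileMetadata], bool]


def all_of(*conds: Subcondition) -> Subcondition:
    """AND operator linking subconditions."""
    return lambda m: all(c(m) for c in conds)


def any_of(*conds: Subcondition) -> Subcondition:
    """OR operator linking subconditions."""
    return lambda m: any(c(m) for c in conds)


def negate(cond: Subcondition) -> Subcondition:
    return lambda m: not cond(m)


def has_extension(ext: str) -> Subcondition:
    return lambda m: m.extension == ext


# Rule body for 'ID-002' as described in the text: not encrypted, no ACL,
# no common users set, and extension '.html', '.doc', or '.xls'.
rule_id_002 = all_of(
    negate(lambda m: m.encrypted),
    negate(lambda m: m.has_acl),
    negate(lambda m: bool(m.common_users)),
    any_of(has_extension(".html"), has_extension(".doc"), has_extension(".xls")),
)

report = FileMetadata("report.doc", False, False, ())
secret = FileMetadata("secret.doc", True, False, ())
image = FileMetadata("image.jpg", False, False, ())
```

In this sketch `rule_id_002(report)` is true, so ‘ID-002’ would be assigned to `report.doc`, whereas the encrypted file and the ‘.jpg’ file do not satisfy the rule body.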
The individual creating the data characteristic classification definition information (hereinafter called the ‘creator’) is not limited to the example shown in FIG. 3, and is able to create a variety of rule bodies by preparing subconditions of any kind and connecting any of the prepared plurality of subconditions in some way.
For example, when a file classification with an emphasis on file (or directory) reliability is desired of the data characteristic classification unit 11, a rule body can be created on the basis of at least one aspect of the following aspects (A) to (C), for example.
    • (A) Subconditions based on the number of common users are prepared, and the data characteristic classification unit 11 is thus made to perform file classification on the basis of the number of common users. For example, a file (or directory) for which the number of common users is at or more than a certain value can be judged as being a file for which reliability is emphasized in that it is thought that the number of accessing users is large (or the access frequency is high) and hence the effects of file deterioration are large. For this reason, the creator may set a desired value for the “number of common users” as a subcondition in the rule body, and, by means of processing by the data characteristic classification unit 11 (described in detail subsequently), a data characteristic ID indicating that the file reliability is high is assigned to a certain file (or directory) the number of common users of which is equal to or more than the desired value.
    • (B) Subconditions based on an extension or keyword (character string contained in metadata or actual data, for example) are prepared, and the data characteristic classification unit 11 is thus made to perform file classification on the basis of an extension or a character string that is contained in the file name. For example, when the extension is ‘.sys’, it can be judged that this is data for which reliability is emphasized in that, when a definition file relating to the constitution of a server program, or the like, has been stored and the file content is then changed or destroyed, the server no longer operates correctly and so forth, this being the cause of a fatal error. On the other hand, for example, because a temporary file (extension ‘tmp’), or a work directory (extension ‘wrk’) that is temporarily created is temporary data, a backup is not required, and a judgment to remove this data from backup processing is possible. As a result, the creator may set ‘extension’ or ‘keyword’ as a subcondition within the rule body as a condition to be selected during classification, or may set ‘extension’ or ‘keyword’ as a condition to be excluded from the backup target during classification, whereby file classification based on file reliability is performed by the processing of the data characteristic classification unit 11 (described in detail subsequently).
    • (C) Subconditions based on the number of users with write permission are prepared, and the data characteristic classification unit 11 is thus made to perform file classification based on the number of users with write permission. For example, in the case of files shared by a large number of people and for which writing is permitted, there is a high possibility of file destruction occurring, and hence it can be judged that backup processing in a stable state with the emphasis on reliability is required. Therefore, the creator may set the number of common users and the presence or absence of write permissions as subconditions in the rule body, whereby file classification based on this aspect is performed by the processing of the data characteristic classification unit 11 (described in detail subsequently).
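Aspects (A) to (C) could be combined into a single reliability check along the following lines. This is a sketch only: the threshold values, the function name `classify_reliability`, and the returned labels are assumptions chosen for illustration, and an actual rule body would use whatever values the creator sets.

```python
import os

# Assumed values; the creator would set the desired thresholds and extensions.
EXCLUDED_EXTENSIONS = {".tmp", ".wrk"}      # temporary data, removed from backup
RELIABILITY_EXTENSIONS = {".sys"}           # aspect (B)
MIN_COMMON_USERS = 10                       # aspect (A)
MIN_WRITABLE_SHARED_USERS = 5               # aspect (C)


def classify_reliability(name, common_user_count, writable):
    """Return 'exclude', 'reliability', or None (hypothetical labels)."""
    ext = os.path.splitext(name)[1].lower()
    if ext in EXCLUDED_EXTENSIONS:
        return "exclude"
    if common_user_count >= MIN_COMMON_USERS:            # (A) many common users
        return "reliability"
    if ext in RELIABILITY_EXTENSIONS:                    # (B) critical extension
        return "reliability"
    if writable and common_user_count >= MIN_WRITABLE_SHARED_USERS:  # (C)
        return "reliability"
    return None


results = [
    classify_reliability("scratch.tmp", 50, True),
    classify_reliability("server.sys", 1, False),
    classify_reliability("notes.txt", 6, True),
    classify_reliability("notes.txt", 6, False),
    classify_reliability("plan.doc", 12, False),
]
```

The exclusion test is applied first in this sketch so that temporary data is removed from backup processing even when it is widely shared; that ordering is a design choice of the example, not a requirement stated above.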
Further, when file classification with an emphasis on file (or directory) security is desired of the data characteristic classification unit 11, for example, a rule body can be created on the basis of at least one aspect of the following aspects (a) to (f), for example.
    • (a) Subconditions based on the presence or absence of encryption are prepared, and the data characteristic classification unit 11 is thus made to perform file classification based on the presence or absence of encryption. For example, a predetermined OS (one example of which is Microsoft's Windows2000 (trademark)) supports file encryption in the file system, and the presence or absence of this file encryption can be confirmed from a file attribute (metadata, for example). When encryption settings are in place, it is possible to judge that the file is data for which security must be secured. For this reason, the creator may establish the presence of encryption for the ‘encryption presence or absence’ as a subcondition in the rule body, and hence a data characteristic ID indicating that security is high is assigned to a file determined to be a high security file by the processing of the data characteristic classification unit 11 (described in detail subsequently).
    • (b) Subconditions based on the number of common users are prepared, and the data characteristic classification unit 11 is thus made to perform file classification based on the number of common users. For example, it can be judged that a file having a ‘number of common users’ equal to or less than a certain value is highly secret and file disclosure is restricted. For this reason, the creator may set the value of the ‘number of common users’ as a subcondition in the rule body and hence a data characteristic ID indicating that security is high is assigned to a file (or directory) for which the number of common users is equal to or less than a desired value by the processing of the data characteristic classification unit 11 (described in detail subsequently).
    • (c) Subconditions that are based on the presence or absence of ACL settings are prepared, and the data characteristic classification unit 11 is thus made to perform file classification based on the presence or absence of ACL settings. For example, a file with ACL setting can be judged as being data with access restrictions provided and for which security must be secured. For this reason, the creator may set the presence of ACL settings for ‘presence or absence of ACL settings’ as a subcondition in the rule body, and a data characteristic ID indicating that security is high is assigned to a file determined to be a high security file by the processing of the data characteristic classification unit 11 (described in detail subsequently).
    • (d) Subconditions based on the common user special characteristics (posts or departments, for example) are prepared and the data characteristic classification unit 11 is thus made to perform file classification based on the characteristics of common users. For example, it can be judged that a file that tends toward a characteristic according to which there is a large number of common users with a high position requires security to be secured. The creator may therefore set ‘common user special characteristics’ as a subcondition in the rule body and hence a data characteristic ID indicating that security is high is assigned to a file determined to be a high security file by the processing of the data characteristic classification unit 11 (described in detail subsequently).
    • (e) Subconditions based on a file name (or a directory name) are prepared and the data characteristic classification unit 11 is thus made to perform file classification based on the file name (or directory name). For example, when a file that is known to require the securing of security exists, the creator may set the file name of this file (or the name of the directory with this file) as a subcondition in the rule body, and hence a data characteristic ID indicating that security is high is assigned to the file by the processing of the data characteristic classification unit 11 (described in detail subsequently).
    • (f) Subconditions based on a keyword contained in metadata or actual data are prepared, and the data characteristic classification unit 11 is thus made to perform file classification based on a keyword contained in metadata or actual data. For example, a file in which a keyword such as “(secret)” or “confidential” appears repeatedly in the metadata or actual data can be judged as one for which security must be secured. In this case, the creator may set “(secret)” or “confidential” (or a number of keywords in addition to these) as a “keyword” for the metadata or actual data as a subcondition in the rule body, and hence a data characteristic ID indicating that security is high is assigned to a file determined to be a high security file by the processing of the data characteristic classification unit 11 (described in detail subsequently).
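The security aspects (a) to (f) can likewise be sketched as independent checks over file metadata and content. In the following illustration every threshold, post name, file name, and keyword list is an assumed value, and the class `FileInfo` and function `security_aspects` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed values that the creator would configure in the rule body.
MAX_COMMON_USERS = 3
HIGH_POSTS = {"head of department", "executive"}
KNOWN_SENSITIVE_NAMES = {"payroll.xls"}
KEYWORDS = ("(secret)", "confidential")
MIN_KEYWORD_HITS = 2


@dataclass
class FileInfo:
    name: str
    encrypted: bool = False
    has_acl: bool = False
    common_user_posts: List[str] = field(default_factory=list)  # one per user
    text: str = ""  # metadata or actual data searched for keywords


def security_aspects(info: FileInfo) -> List[str]:
    """Return the letters of the aspects (a) to (f) that the file satisfies."""
    hits = []
    users = len(info.common_user_posts)
    if info.encrypted:
        hits.append("a")
    # Assumption: zero users means "common users not set", so it is skipped.
    if 0 < users <= MAX_COMMON_USERS:
        hits.append("b")
    if info.has_acl:
        hits.append("c")
    senior = sum(1 for post in info.common_user_posts if post in HIGH_POSTS)
    if users and senior * 2 > users:  # majority hold a high position
        hits.append("d")
    if info.name in KNOWN_SENSITIVE_NAMES:
        hits.append("e")
    lowered = info.text.lower()
    if sum(lowered.count(k) for k in KEYWORDS) >= MIN_KEYWORD_HITS:
        hits.append("f")
    return hits


budget = FileInfo("budget.xls", encrypted=True,
                  common_user_posts=["executive", "executive", "section manager"])
memo = FileInfo("memo.txt", text="Confidential. This memo is confidential.")
readme = FileInfo("readme.txt")
```

A rule body could then assign a high security data characteristic ID whenever the returned list is non-empty, or only for particular combinations of aspects, depending on how the creator links the subconditions.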
The data characteristic classification unit 11 assigns, on the basis of metadata (and/or actual data) of one or more designated backup target data files and the above-mentioned data characteristic classification definition information, one or more data characteristic IDs corresponding with one or more rule bodies satisfied by the file to the one or more backup target files (that is, performs classification of the backup target files). The data characteristic classification unit 11 then outputs data relating to the classification result, that is, for example, as illustrated in FIG. 4, classification result data D21 that is produced by associating information relating to the files (such as the file names, path names, and data sizes of the files, for example) with one or more data characteristic IDs assigned to these files, for each of the one or more backup target files. Further, as illustrated in FIG. 4, during file classification, when there is a file for which no condition of the rule body is satisfied, the data characteristic classification unit 11 assigns a predetermined code (‘Default’, for example), which indicates that such a condition is absent, to the file in place of data characteristic IDs or as one data characteristic ID. Furthermore, although not especially shown in FIG. 4, when a backup target file satisfying a plurality of conditions each indicated by a plurality of rule bodies exists, a plurality of data characteristic IDs are assigned to one backup target file.
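The assignment step described above, including the ‘Default’ code and the case in which several rule bodies are satisfied, can be sketched as follows. The data characteristic IDs ‘ID-A’ and ‘ID-B’, the rule predicates, and the dictionary layout of the classification result are hypothetical and are not taken from FIG. 3 or FIG. 4.

```python
def classify(files, rules):
    """Assign every matching data characteristic ID, or 'Default' if none match.

    files: list of metadata dicts; rules: dict mapping ID -> predicate.
    """
    result = []
    for meta in files:
        ids = [rule_id for rule_id, rule in rules.items() if rule(meta)]
        result.append({
            "file_name": meta["file_name"],
            "path": meta["path"],
            "size": meta["size"],
            "ids": ids or ["Default"],  # no rule body satisfied
        })
    return result


# Hypothetical rule bodies for illustration only.
rules = {
    "ID-A": lambda m: m["encrypted"],
    "ID-B": lambda m: m["has_acl"],
}

files = [
    {"file_name": "a.doc", "path": "/dir1/a.doc", "size": 100,
     "encrypted": True, "has_acl": True},
    {"file_name": "b.txt", "path": "/dir1/b.txt", "size": 20,
     "encrypted": False, "has_acl": False},
]

classification_result = classify(files, rules)
```

Here `a.doc` receives both IDs because it satisfies both rule bodies, while `b.txt` satisfies neither and receives ‘Default’.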
Let us now refer to FIG. 2 once again. When the backup request unit 12 receives a backup request from outside (the operations manager, for example) with predetermined timing, the backup request unit 12 collates the classification result data D21 (see FIG. 4) output by the data characteristic classification unit 11, along with pre-prepared backup destination mapping information. Then, on the basis of the classification result data and the backup destination mapping information, the backup request unit 12 prepares, for each of the backup destination servers 6A to 6C, information relating to which backup target file is transmitted to which backup destination server, such as a backup list (described later), for example, and then transmits each backup list to the backup destination servers 6A to 6C to which these lists are addressed.
Here, as illustrated in FIG. 5, the backup destination mapping information includes information indicating to which backup destination server a file is backed up according to the data characteristic ID assigned to the file (or the above-mentioned ‘Default’ indicating the absence thereof), that is, information (a host (server) name, an IP address, and so forth, for example) relating to one (or a plurality of) backup destination servers associated with each of a plurality of data characteristic IDs. This backup destination mapping information is created automatically by a computer or manually by a predetermined individual (an operations manager, for example), and is pre-stored in a predetermined location on a communication network (in the backup source server 3 or backup source storage device 4, for example).
Furthermore, the backup lists are prepared in the same quantity as the backup destination servers. For example, the three backup lists 200A, 200B, and 200C illustrated in FIG. 6 are prepared for three backup destination servers 6A, 6B, and 6C by the backup request unit 12. If this is described representatively with respect to the backup list 200A, the ‘acceptance date and time’ and information relating to the backup destination server corresponding with this list 200A (the ‘host name’ and ‘file list’ as shown, for example) are recorded in the backup list 200A.
The ‘acceptance date and time’ is information indicating the date and time (or permitted time slot) when the backup destination server 6A is granted access to the backup source server 3, and is expressed in predetermined units (year/month/day/hour/minutes/seconds, for example). The acceptance date and time is allocated automatically by the backup request unit 12, for example, but may be established manually by a predetermined user (operations manager, for example). The backup request unit 12 is able to avoid a concentration of the load resulting from the backup processing on the backup destination servers 6A to 6C by varying the respective acceptance date and time of the backup lists 200A to 200C at fixed time intervals (a time interval that is presumed necessary in order for the backup destination servers 6A to 6C to acquire one or more predetermined backup target files from the backup source server 3, for example). Further, the time required in order to acquire one or more predetermined backup target files can be estimated from the total of the data size of the backup target files, for example.
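One way of staggering the acceptance date and time is sketched below. It is an illustration under stated assumptions: the transfer rate `ASSUMED_BYTES_PER_SECOND`, the function name `schedule_acceptance_times`, and the host names are hypothetical, and the sketch spaces the lists by a per-list estimate derived from the total data size rather than by a single fixed interval.

```python
from datetime import datetime, timedelta

ASSUMED_BYTES_PER_SECOND = 10_000_000  # assumed effective transfer rate


def schedule_acceptance_times(list_sizes, first_start,
                              bytes_per_second=ASSUMED_BYTES_PER_SECOND):
    """Give each backup list a start time after the previous list's estimate.

    list_sizes: dict mapping host name -> list of file sizes in bytes,
    processed in insertion order.
    """
    schedule = {}
    start = first_start
    for host, sizes in list_sizes.items():
        schedule[host] = start
        # Estimated acquisition time from the total data size, rounded up.
        seconds = -(-sum(sizes) // bytes_per_second)
        start += timedelta(seconds=seconds)
    return schedule


sizes = {
    "hostA": [30_000_000, 30_000_000],
    "hostB": [5_000_000],
    "hostC": [1],
}
schedule = schedule_acceptance_times(sizes, datetime(2004, 3, 5, 1, 0, 0))
```

With these assumed figures, the second host is granted access six seconds after the first and the third host one second after the second, so that no two destination servers are scheduled to download at the same moment.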
The ‘host name’ is information indicating the name of the backup destination server 6A.
The ‘file list’ expresses information relating to one or more backup target files classified as backed up to the backup destination server 6A (the file name, path name, data size, and so forth, of each file, for example) in list format.
The other backup lists 200B and 200C are substantially the same as the backup list 200A. The backup request unit 12 creates the backup lists 200A to 200C based on the flow described below.
FIG. 7 is an image diagram of the flow of the processing of the backup request unit 12.
As shown in FIG. 7, the backup request unit 12 collates the classification result data D21 that is output by the data characteristic classification unit 11, and pre-prepared backup destination mapping information D22, and thus obtains data D23 that is produced by converting a data characteristic ID in the classification result data D21 into a backup destination host name (backup destination server name). Next, the backup request unit 12 sorts sets of file names and host names recorded in this data D23 by the host names, and thus converts the data D23 into data D24 in which the sets of file names and host names recorded in this data D23 are sorted by the host names. The backup request unit 12 then divides up and outputs this data D24 into files for each host name, and creates the backup lists 200A to 200C corresponding to the backup destination servers 6A to 6C respectively by adding the above-mentioned acceptance date and time to each output file (further, the backup lists 200A to 200C may be divided into even smaller files, in which case the acceptance date and time is added to each further divided file).
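The flow from D21 and D22 through D23 and D24 to the individual backup lists can be sketched as follows. The dictionary layouts, the function name `build_backup_lists`, the data characteristic IDs, and the host names are assumptions made for the example; only the sequence of steps (convert IDs to host names, sort by host name, divide per host, add the acceptance date and time) follows the description above.

```python
from collections import defaultdict


def build_backup_lists(classification_result, mapping, acceptance_times):
    """Build one backup list per backup destination host (illustrative sketch)."""
    # D23: replace each data characteristic ID with its destination host name(s).
    d23 = []
    for entry in classification_result:
        hosts = []
        for cid in entry["ids"]:
            for host in mapping.get(cid, []):
                if host not in hosts:  # avoid listing a file twice per host
                    hosts.append(host)
        for host in hosts:
            d23.append((entry["path"], entry["size"], host))

    # D24: the same rows sorted by host name (sorted() is stable).
    d24 = sorted(d23, key=lambda row: row[2])

    # Divide D24 per host and add the acceptance date and time.
    per_host = defaultdict(list)
    for path, size, host in d24:
        per_host[host].append({"path": path, "size": size})
    return {
        host: {
            "acceptance_datetime": acceptance_times[host],
            "host_name": host,
            "file_list": file_list,
        }
        for host, file_list in per_host.items()
    }


d21 = [
    {"path": "/dir1/a.doc", "size": 100, "ids": ["ID-A", "ID-B"]},
    {"path": "/dir1/b.txt", "size": 20, "ids": ["Default"]},
    {"path": "/dir1/c.xls", "size": 50, "ids": ["ID-B"]},
]
d22 = {"ID-A": ["hostA"], "ID-B": ["hostB"], "Default": ["hostC"]}
times = {"hostA": "2004/03/05 01:00:00",
         "hostB": "2004/03/05 02:00:00",
         "hostC": "2004/03/05 03:00:00"}

backup_lists = build_backup_lists(d21, d22, times)
```

Because a file may carry several data characteristic IDs, the same file can appear in more than one backup list, as `/dir1/a.doc` does here.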
The backup request unit 12 transmits the backup lists 200A to 200C so created to the corresponding backup destination servers 6A to 6C. Accordingly, the backup destination servers 6A to 6C assign discrimination information (hereinafter ‘backup discrimination information’) to the received backup lists 200A to 200C and transmit this backup discrimination information to the backup source server 3. The backup source server 3 receives backup discrimination information from each of the backup destination servers 6A to 6C and the backup request unit 12 creates restore information for recovering backup target files on the basis of the backup discrimination information (the restore information as well as the restoration processing that employs this information will be described in detail subsequently).
Upon receiving a download request (described subsequently) from the backup destination servers 6A to 6C, the backup request unit 12 stores the transmitted backup lists 200A to 200C in predetermined storage regions (predetermined storage regions in the backup source server 3 or backup source storage device 4, for example) in order to perform a validity check on the date and time a request is received and to back up one or more predetermined backup target files in the backup destination servers 6A to 6C constituting the request source.
Let us refer to FIG. 2 once again. The download acceptance unit 13 accepts backup target file download (transfer) requests from the backup destination servers 6A to 6C, and, in the event of a request, checks whether the date and time when the request was received are valid. More specifically, in a case where the download acceptance unit 13 receives a download request from the backup destination server 6A, for example, the download acceptance unit 13 judges whether there is a complete or substantial match between the date and time when the request was received and a date and time that is designated in advance by the backup source server 3 with respect to the backup destination server 6A (that is, the acceptance date and time written in the backup list 200A that is output by the backup request unit 12). In the event of such a match, the download acceptance unit 13 reads one or more backup target files that have one or more file names written in the backup list 200A from a predetermined location (the backup source storage device 4, for example), and transmits these backup target files to the backup destination server 6A that is the source of the download request. On the other hand, if no such match exists, the download acceptance unit 13 performs predetermined processing, for example, communicates an error to the backup destination server 6A. Further, the above-mentioned ‘substantial match’ means that the difference between the current date and time when the download request is received and the acceptance date and time lies within a predetermined error range, for example, and this predetermined error range may be common to all the backup lists or vary from one backup list to the next. In addition, the predetermined error range may be varied by a predetermined user or may be fixed so as to be unchangeable. Further, the predetermined error range may be stored in a predetermined storage device separately from the backup list or may be described in the backup list.
The restore unit 14 restores a backup target file on the basis of the restore information created by the backup request unit 12 (described in detail subsequently with respect to the restore processing).
Each application with which the backup source server 3 is equipped was described above. Next, the backup destination server 6A will be described representatively for the backup destination servers 6A to 6C with reference to FIG. 2 (further, although the backup destination server 6A is illustrated representatively in FIG. 2, the other backup destination servers 6B and 6C are also able to communicate with the backup source server 3).
The backup destination server 6A comprises an operating system (OS) 20 and the backup request acceptance unit 21 as application software above this OS.
The backup request acceptance unit 21 receives the backup list 200A from the backup source server 3 and stores this list in a predetermined storage region (a predetermined storage region in the backup destination server 6A or backup destination storage device 5A, for example). The backup request acceptance unit 21 then generates backup discrimination information for this backup on the basis of the backup list 200A, and stores this information in a predetermined storage region. Then, after running a process to perform backup processing, the backup request acceptance unit 21 transmits the stored backup discrimination information to the backup source server 3. Incidentally, the backup process is in a standby state until the current date and time reaches the acceptance date and time listed in the received and stored backup list 200A (or, when the acceptance date and time is expressed as a time slot, until the current date and time falls within the range of this time slot). When the current date and time reaches the acceptance date and time of the backup list 200A, the backup request acceptance unit 21 runs a backup process, issues a download (transfer) request to the download acceptance unit 13 of the backup source server 3, and creates an archive file that has the stored backup discrimination information.
FIG. 8 shows the constitution of an archive file created by the backup request acceptance unit 21.
For example, as is shown, stored in the archive file for the backup list 200A are: backup discrimination information generated for the backup list 200A, the entry number of backup target files (that is, the number of backup target files stored in the archive file), and backup target information in an amount corresponding to the entry number (such as the data size, path within the backup source server 3, and body (file itself), of each file, for example).
When the backup request acceptance unit 21 shown in FIG. 2 downloads one or more backup target files recorded in the backup list 200A from the backup source server 3 in response to the download request, the backup request acceptance unit 21 stores these backup target files in the archive file. Once the processing to store the backup target files in the archive file is complete, the backup request acceptance unit 21 stores the archive file containing the backup target files in the backup destination storage device 5A.
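One possible byte layout for such an archive file is sketched below. The description only names the stored items (backup discrimination information, entry number, and per-file data size, path, and body); the field widths, byte order, and the helper names `build_archive` and `read_archive` are assumptions chosen for illustration.

```python
import struct


def _pack_str(text):
    """Length-prefixed UTF-8 string (4-byte big-endian length; hypothetical layout)."""
    raw = text.encode("utf-8")
    return struct.pack(">I", len(raw)) + raw


def build_archive(discrimination_info, entries):
    """Serialize backup discrimination information, the entry number, and each entry.

    entries: list of (path_within_backup_source, body_bytes) pairs.
    """
    parts = [_pack_str(discrimination_info), struct.pack(">I", len(entries))]
    for path, body in entries:
        parts.append(struct.pack(">Q", len(body)))  # data size
        parts.append(_pack_str(path))               # path within the backup source
        parts.append(body)                          # body (the file itself)
    return b"".join(parts)


def read_archive(blob):
    """Inverse of build_archive: returns (discrimination_info, entries)."""
    pos = 0

    def take(count):
        nonlocal pos
        chunk = blob[pos:pos + count]
        pos += count
        return chunk

    def take_str():
        (length,) = struct.unpack(">I", take(4))
        return take(length).decode("utf-8")

    info = take_str()
    (entry_number,) = struct.unpack(">I", take(4))
    entries = []
    for _ in range(entry_number):
        (size,) = struct.unpack(">Q", take(8))
        path = take_str()
        entries.append((path, take(size)))
    return info, entries
```

Storing the entry number ahead of the entries lets a reader know how many records to expect before reading any file body, which matches the order in which the items are listed for FIG. 8.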
The flow of the processing of each application above will now be described below by using a flowchart.
FIG. 9 is a flowchart showing the flow of the processing of the data characteristic classification unit 11 that the backup source server 3 comprises.
A predetermined user (operations manager, for example) inputs a backup target directory (or a directory with the backup target file, for example) to the backup source server 3 (step S1). Further, the data characteristic classification unit 11 reads (S2) the data characteristic classification definition information (see FIG. 3) that has been preset and stored.
Next, the data characteristic classification unit 11 searches for directories and files contained in the directory that is input in S1 (that is, on a level below the directory), and, if the sought directories and files are present (YES in S3), acquires metadata for all these files and directories (and/or actual data) (S4).
Next, the data characteristic classification unit 11 collates (S5) metadata (and/or actual data) for the files (and directories) acquired in S4 and data characteristic classification definition information read in S2, performs classification based on the data characteristics of the backup target files by capturing the data characteristic IDs corresponding to the backup target files and then associating these data characteristic IDs with the files, and outputs (S6) the classification result data representing the classification results (see FIG. 4), and stores this data in a predetermined storage region. For example, the data characteristic classification unit 11 may read a plurality of files in the directory that was input in S1 one by one, and then repeatedly execute S4 to S6. That is, the data characteristic classification unit 11 performs S4 to S6 by reading out a certain single file from the plurality of files in the directory input in S1, and then performs S4 to S6 by reading out another single file, and may repeat this processing until it is complete for all these plural files.
Further, as a result of such classification processing, the one or more backup target files retrieved in S3 are classified according to a predetermined standard, such as at least one standard among (A) to (C) and (a) to (f) mentioned earlier, for example, based on the file data characteristics. That is, one or more characteristic ID data items is (are) assigned to each backup target file on the basis of at least one item among: the number of common users of the file, special features common to the common users, an extension, a keyword, and the presence or absence of access restriction information such as an ACL, and the presence or absence of encryption, for example.
Further, in this classification processing, depending on the content of metadata (or actual data) of a backup target file (and/or directory), a plurality of conditions expressed by a rule body are sometimes satisfied, in which case a plurality of data characteristic IDs are assigned to one backup target file. Further, in a case where, in the mapping information, two or more server information items (host names, for example) correspond with one data characteristic ID, one backup target file is backed up to two or more servers.
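The case in which one file receives several data characteristic IDs, and therefore several destinations, can be sketched as follows. The rule conditions, ID strings, and host names are hypothetical examples; the actual rule bodies of the data characteristic classification definition information are defined elsewhere in the description.

```python
def classify(metadata, rules):
    """Return every data characteristic ID whose rule condition the metadata satisfies."""
    return [char_id for char_id, condition in rules if condition(metadata)]


def destinations(char_ids, mapping):
    """Collect the host names mapped to the given IDs, without duplicates."""
    hosts = []
    for char_id in char_ids:
        for host in mapping.get(char_id, []):
            if host not in hosts:
                hosts.append(host)
    return hosts


# Hypothetical rules: an ID is assigned when its condition holds for the file metadata.
RULES = [
    ("ID_RELIABILITY", lambda m: m["extension"] in {".doc", ".xls"}),
    ("ID_SECURITY", lambda m: m["encrypted"] or m["has_acl"]),
]

# Hypothetical mapping information: one ID may correspond to two or more servers.
MAPPING = {
    "ID_RELIABILITY": ["hostA", "hostB"],
    "ID_SECURITY": ["hostC"],
}
```

For example, an encrypted `.doc` file satisfies both hypothetical rules, so `classify` returns both IDs and `destinations` yields three host names, meaning the file would be listed in three backup lists.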
When the backup target is designated, the serial flow above can also be performed with predetermined timing, such as immediately after the designation, for example, or can be performed at fixed or irregular intervals after the designation. In the latter case, for example, if the user designates a pre-prepared desired directory as the backup target and stores a file in this desired directory, the classification of the file stored in the desired directory is performed automatically at fixed intervals or with other predetermined timing.
FIGS. 10 and 11 are flowcharts showing the flow of the processing of the backup request unit 12.
When the backup request unit 12 receives a backup request from outside (the operations manager, for example), for example, with predetermined timing, the classification result data that is output by the data characteristic classification unit 11 is read from a predetermined storage region (S11) as shown in FIG. 10.
Next, the backup request unit 12 reads the pre-prepared backup destination mapping information (S12).
The backup request unit 12 then sets the counter value at ‘0’ (S13), and compares this value with the number of files recorded in the classification result data (S14). The backup request unit 12 performs the processing of (S15) to (S18) below until the counter value equals the number of backup target files recorded in the classification result data (NO in S14).
(S15) The backup request unit 12 acquires the data characteristic ID corresponding with the file name (or path name) of the target recorded in the classification result data.
(S16) The backup request unit 12 references the backup destination mapping information to acquire the host name corresponding with the data characteristic ID acquired in S15.
(S17) The backup request unit 12 associates the host name acquired in S16 with the file name of the target in S15, renders a set of the file name and the host name one record, and outputs same to a predetermined temporary file (the data file D23 shown in FIG. 7, for example).
(S18) The backup request unit 12 increments the counter value by one.
Once the counter value reaches the number of files recorded in the classification result data as a result of the above processing of (S15) to (S18) (YES in S14), the backup request unit 12 sorts the one or more records recorded in the temporary file by the host names (S19).
Next, on the basis of the host name, the backup request unit 12 divides up the temporary file whose records have been sorted by the host name. That is, the backup request unit 12 performs division to produce the same number of files as the types of host names (that is, the backup destination servers 6A to 6C) recorded in the temporary file, and creates and outputs (S20) the backup lists 200A to 200C corresponding with the backup destination servers 6A to 6C by recording the acceptance date and time in the files obtained by this division.
Next, as shown in FIG. 11, the backup request unit 12 performs the following processing on all the backup lists 200A to 200C.
That is, first of all, the backup request unit 12 captures (S25) the backup destination servers 6A to 6C by acquiring the host names (backup destination server names) from the backup lists 200A to 200C.
The backup request unit 12 then transmits (S26) each of the backup lists 200A to 200C to the backup destination servers 6A to 6C thus captured in S25. The backup request unit 12 also stores these backup lists 200A to 200C in a predetermined storage region.
Thereafter, the backup request unit 12 receives (S27) a response that includes the above-mentioned backup discrimination information from the backup destination servers 6A to 6C. The backup request unit 12 then renders the backup discrimination information included in the response and information (host name, for example) relating to the backup destination server constituting the information transmission source a set, and outputs this set (S28) to a predetermined file (for example, a restore file described subsequently).
FIG. 12 is a flowchart showing the flow of the processing of the download acceptance unit 13.
When the download acceptance unit 13 receives (YES in S31) a download request including the host name of the server 6A from the backup destination server 6A, for example, the download acceptance unit 13 acquires (S32) the acceptance date and time and the host name from all the backup lists 200A to 200C output by the backup request unit 12.
The download acceptance unit 13 compares the host name and the current date and time included in the download request received in S31 with the host name and acceptance date and time acquired in S32, and thus judges whether a match exists (S33).
When such a match exists (YES in S33) as a result of the judgment in S33, the download acceptance unit 13 reads out one or more backup target files each having one or more file names listed in the backup list 200A from the backup source storage device 4 and transmits (S34) the one or more backup target files thus read to the backup destination server 6A that is the transmission source of the download request.
When, on the other hand, no such match exists as a result of S33 (NO in S33), the download acceptance unit 13 transmits an error to the backup destination server 6A (S35).
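Steps S31 to S35 can be sketched as a single function. This is an illustrative sketch under stated assumptions: the function name `handle_download_request`, the dictionary form of a backup list, the in-memory `storage` dictionary standing in for the backup source storage device 4, and the `tolerance` parameter standing in for the predetermined error range are all hypothetical.

```python
from datetime import datetime, timedelta


def handle_download_request(request_host, now, backup_lists, storage, tolerance):
    """Return the requested files on a host and time match, otherwise an error.

    backup_lists: list of dicts with "host", "acceptance" (datetime), "files".
    storage: dict of file name -> bytes (stand-in for the backup source storage).
    tolerance: timedelta treated as the error range for a 'substantial match'.
    """
    for backup_list in backup_lists:  # S32: read host name and acceptance time
        host_matches = backup_list["host"] == request_host
        time_matches = abs(now - backup_list["acceptance"]) <= tolerance
        if host_matches and time_matches:  # S33: match exists
            files = {name: storage[name] for name in backup_list["files"]}
            return {"status": "ok", "files": files}  # S34: transmit the files
    return {"status": "error", "files": {}}  # S35: communicate an error
```

A tolerance of `timedelta(0)` corresponds to requiring a complete match; a larger value corresponds to the substantial match described for the download acceptance unit 13.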
FIGS. 13 and 14 show the flow of the processing of the backup request acceptance unit 21 of the backup destination server. The backup destination server is described below as the backup destination server 6A.
As shown in FIG. 13, the backup request acceptance unit 21 of the backup destination server 6A receives (S41) the backup list 200A from the backup request unit 12 of the backup source server 3 and stores the backup list 200A in a predetermined storage region.
Next, the backup request acceptance unit 21 creates backup discrimination information relating to the backup list 200A (S42).
The backup request acceptance unit 21 then generates and runs the backup process (S43).
Thereafter, the backup request acceptance unit 21 transmits (S44) the backup discrimination information thus created in S42 to the backup request unit 12 of the backup source server 3.
Thereafter, as shown in FIG. 14, when it is detected that the current date and time has reached the acceptance date and time listed in the backup list 200A (YES in S51), the backup request acceptance unit 21 creates (S52) an archive file (see FIG. 8) with the backup discrimination information created in S42 by means of the backup process run in S43.
Next, the backup request acceptance unit 21 records (S53) information relating to the backup list 200A in the archive file. For example, based on the backup list 200A, the backup request acceptance unit 21 registers the number of file names recorded in the backup list 200A as the entry number in the created archive file and registers the path (path within the backup source server 3) of each file.
Next, the backup request acceptance unit 21 receives (YES in S54, and S55) one or more backup target files each having one or more file names written in the backup list 200A from the backup source server 3 and stores the received backup target files in the archive file (S56).
Once the backup request acceptance unit 21 has downloaded all the backup target files and stored these files in an archive file (NO in S54), the archive file is stored in the backup destination storage device 5A (S57).
According to the embodiment above, data characteristic classification definition information in which one or a plurality of data characteristic IDs correspond with one or more data characteristic types, and mapping information in which one, or two or more, backup destination server information items (server names, for example) correspond with one or a plurality of data characteristic IDs, are prepared. Upon receiving a backup target designation, the backup source server 3 acquires metadata (and/or actual data) for the designated files (and/or directories) with predetermined timing, sets data characteristic IDs (that is, data characteristic types) for the backup target files based on the above data characteristic classification definition information, determines the backup destination servers 6A to 6C of the backup target files based on the set data characteristic IDs and the mapping information, and then transmits the backup target files to the servers 6A to 6C so determined. Accordingly, as long as the data characteristic classification definition information and mapping information (or, instead, information in which one, or two or more, backup destination server information items correspond with one or more data characteristic types) are prepared, the designated backup target is automatically backed up to the backup destination matching the data characteristic type of the backup target on the basis of the data characteristics of the backup target. That is, backup processing that is suited to the data characteristics of the backup target is performed by means of a method that is simple for the user.
Further, according to the above embodiment, when there is no match between the current date and time when the download request is received from a certain backup destination server 6A and the acceptance date and time allocated to the backup list 200A of the server 6A, that is, when a download request is received at a date and time other than the predetermined acceptance date and time, the backup source server 3 does not perform a backup of the backup target file. Accordingly, unauthorized downloading of the backup target file can be prevented before it takes place, whereby the security of the backup target file can be raised.
Further, according to the embodiment above, the backup discrimination information that the backup destination servers 6A to 6C create upon receiving the backup lists 200A to 200C is used by the backup source server 3 in order to recover the backup target files written in the backup lists 200A to 200C. The backup discrimination information corresponding with the backup lists 200A to 200C may be any information as long as the backup source server 3 is able to obtain the backup target files written in the corresponding backup list from the backup destination servers 6A to 6C. For example, the backup discrimination information can be information including at least one of the backup destination server name, the name of the backed up backup target file, and the data size. In such a case, the backup source server 3 can inform any backup destination server which file is to be stored by managing such information.
When backup discrimination information is received from the backup destination servers 6A to 6C, the backup source server 3 associates and records backup discrimination information corresponding with each of the servers 6A to 6C with information relating to a plurality of backup destination servers 6A to 6C (host name, for example) in a predetermined restore file D30 shown in FIG. 15, for example.
Then the backup source server 3 restores the backup target file to the backup source storage device 4 as follows by using the restore file D30.
FIG. 16 shows the flow of the restore processing of the backup source server 3.
The backup source server 3 performs the processing of (S61) to (S65) below with respect to all the servers 6A to 6C whose host names are recorded in the restore file D30. This processing is described representatively for server 6A below.
(S61) The backup source server 3 connects to the backup destination server 6A.
(S62) The backup source server 3 reads the backup discrimination information for the server 6A constituting the connection destination from the restore file D30, sets the storage destination directory for the backup target file to be subsequently acquired from the backup destination server 6A in the backup source storage device 4, and acquires the path of this directory.
(S63) The backup source server 3 communicates the read backup discrimination information to the backup destination server 6A and, based on this backup discrimination information, specifies the archive file that stores the backup target file constituting the recovery target to the server 6A and acquires the backup target file from the specified archive file, whereby the acquired backup target file is received from the backup destination server 6A.
(S64) Based on the path acquired in S62, the backup source server 3 stores the backup target file received from the backup destination server 6A in the directory set in S62.
(S65) The backup source server 3 breaks the connection with the backup destination server 6A.
As a result of the above processing, the backup source server 3 is able to restore one or more backup target files, which have been backed up in the backup destination servers 6A to 6C respectively, to the backup source storage device 4.
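The restore loop of S61 to S65 can be sketched as follows. The class `FakeDestination`, the function `restore_all`, the per-host restore directory, and the use of an in-memory dictionary in place of the storage device are all assumptions made so that the example is self-contained; they do not describe the actual servers.

```python
class FakeDestination:
    """Hypothetical stand-in for a backup destination server.

    archives: dict of backup discrimination information -> {path: body}.
    """

    def __init__(self, archives):
        self.archives = archives
        self.connected = False

    def connect(self):
        self.connected = True

    def fetch(self, discrimination_info):
        # The destination specifies the archive from the discrimination information.
        return self.archives[discrimination_info]

    def disconnect(self):
        self.connected = False


def restore_all(restore_file, servers, restore_root):
    """restore_file: list of (host name, backup discrimination information) sets."""
    restored = {}
    for host, discrimination_info in restore_file:
        server = servers[host]
        server.connect()                                   # S61
        try:
            target_dir = f"{restore_root}/{host}"          # S62 (assumed layout)
            for path, body in server.fetch(discrimination_info).items():  # S63
                restored[f"{target_dir}/{path.lstrip('/')}"] = body        # S64
        finally:
            server.disconnect()                            # S65
    return restored
```

The `try`/`finally` pairing is a design choice of this sketch: it ensures the connection is broken (S65) even if fetching an archive fails part way through.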
A preferred embodiment of the present invention has been described above but this embodiment is an example serving to illustrate the present invention and is not intended to restrict the scope of the present invention to this embodiment alone. The present invention can also be implemented in a variety of other forms.
For example, the backup request unit 12 is able to avoid a concentration of the load resulting from the backup processing on the backup destination servers 6A to 6C by varying the respective acceptance date and time of the backup lists 200A to 200C at fixed time intervals (a time interval that is presumed necessary in order for the backup destination servers 6A to 6C to acquire one or more predetermined backup target files from the backup source server 3, for example). This acceptance date and time may be established manually by the individual requesting the backup or may be established automatically by the backup source server 3. When the acceptance date and time are established automatically, the backup source server 3 is able to capture the total data size of one or more backed up backup target files for each of the backup destination servers 6A to 6C, estimate the time required for the backup on the basis of the data size, and schedule the acceptance date and time on the basis of the estimated time, for example (the acceptance date and time may be set in the order of the estimated backup time starting with the shortest or longest time first, for example).
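Automatic scheduling from the total data size can be sketched as follows. The function name `schedule_acceptance` and the `bytes_per_second` transfer rate are hypothetical; the description states only that the required time can be estimated from the data size, not how.

```python
from datetime import datetime, timedelta


def schedule_acceptance(total_sizes, start, bytes_per_second, shortest_first=True):
    """Assign each host an acceptance date and time so that backups do not overlap.

    total_sizes: dict of host name -> total bytes of its backup target files.
    bytes_per_second: assumed transfer rate used to estimate the backup time.
    """
    # Estimate the time required for each backup from the total data size.
    estimates = {host: size / bytes_per_second for host, size in total_sizes.items()}
    # Order the hosts by estimated backup time, shortest or longest first.
    order = sorted(estimates, key=estimates.get, reverse=not shortest_first)
    schedule = {}
    slot = start
    for host in order:
        schedule[host] = slot
        slot += timedelta(seconds=estimates[host])  # next slot starts after this one
    return schedule
```

With sizes of 3000, 1000, and 2000 bytes for three hosts and an assumed rate of 10 bytes per second, the shortest-first schedule starts the 1000-byte host at the start time, the 2000-byte host 100 seconds later, and the 3000-byte host 300 seconds after the start.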
In addition, for example, the backup destination servers 6A to 6C may issue a download request immediately after receiving a backup list from the backup source server 3. In this case, the acceptance date and time need not be written in the backup list, for example. Alternatively, when a download request is issued, the download request may be issued once again at the acceptance date and time listed in the backup list only when the communication traffic is congested. Further, in this case, for example, the backup source server 3 may transmit all the backup lists 200A to 200C to the backup destination servers 6A to 6C at the same time, or may schedule the timing for transmitting the backup lists 200A to 200C and perform transmission at another time. When a backup destination server requests a download immediately after receiving a backup list, the concentration of the load on the backup source server 3 or network can be avoided by adjusting the timing for transmitting the backup lists 200A to 200C. Further, the timing for transmitting the backup lists may be scheduled on the basis of an estimated time by capturing the total data size of one or more backup target files for each of the backup destination servers 6A to 6C, for example, and estimating the time required for a backup on the basis of this data size (the transmission timing may be brought forward for a shorter or longer estimated backup time, for example).

Claims (11)

1. A backup system, comprising:
a backup source computer device that stores backup target data;
a plurality of backup destination computer devices each connected to the backup source computer device via a network;
a backup mode selector that selects, according to data characteristics of the backup target data, any one backup mode from among a plurality of pre-prepared backup modes; and
a backup executor that stores the backup target data by transferring the backup target data from the backup source computer device to a backup destination computer device that is selected on the basis of the selected backup mode from among the backup destination computer devices,
wherein the backup executor selects a backup destination computer device constituting a backup destination on the basis of backup destination mapping information constituted so as to pre-match at least one or more backup destination computer devices of the backup destination computer devices with each backup mode.
2. The backup system according to claim 1, wherein the backup mode selector determines whether the backup target data possesses any data characteristic on the basis of pre-prepared characteristic classification conditions.
3. The backup system according to claim 2, wherein the backup mode selector determines whether the backup target data possesses any data characteristics by comparing acquired metadata relating to the backup target data, and the characteristic classification conditions.
4. The backup system according to claim 1, wherein data characteristics include any one of data characteristics that prioritize the securing of data reliability or data characteristics that prioritize the securing of data security.
5. The backup system according to claim 4, wherein the data characteristics that prioritize the securing of data reliability can be determined on the basis of at least one or more of judgment elements comprising the number of common users, file extension type, file name, and the presence or absence of write permissions; and
the data characteristics that prioritize the securing of data security can be determined on the basis of at least one or more of judgment elements comprising the presence or absence of encryption, the number of common users, special features common to common users, the presence or absence of access restrictions, file extension type, file name, and the presence or absence of predetermined keywords.
6. The backup system according to claim 1, wherein the backup executor comprises:
a backup list generator that generates a backup list that includes information specifying backup target data to be acquired by the backup destination computer device; and
a backup list transmitter that transmits the backup list to the backup destination computer device, and
wherein the backup destination computer device comprises:
a data acquisitor that stores backup target data by acquiring same from the backup source computer device on the basis of the backup list received from the backup source computer device.
7. The backup system according to claim 6, wherein the backup list includes information indicating a backup availability time when backup target data can be acquired from the backup source computer device, and the backup destination computer device accesses the backup source computer device according to the backup availability time to acquire the backup target data.
8. The backup system according to claim 6, wherein, upon receiving the backup list from the backup source computer device, the backup destination computer device generates restore data to be used for restoring the backup target data, and transmits the restore data thus generated to the backup source computer device.
9. The backup system according to claim 6, wherein the backup list transmitter controls the time for transmitting the backup list to the backup destination computer device.
10. A backup method that performs a backup between a backup source computer device for storing backup target data and a plurality of backup destination computer devices each connected to the backup source computer device via a network, the backup method comprising:
determining data characteristics pertaining to backup target data on the basis of characteristic classification conditions for classifying data characteristics;
selecting a backup mode by determining a backup destination computer device for each of the backup target data on the basis of the determined data characteristics and of backup destination mapping information that is constituted so that at least one or more of the backup destination computer devices constituting backup destinations correspond(s) with each of the data characteristics;
collecting, for each of the backup destination computer devices, the backup target data corresponding with the backup destination computer devices;
generating, for each of the backup destination computer devices, a backup list that includes information specifying backup target data to be acquired by the backup destination computer devices;
transmitting the generated backup lists to the backup destination computer devices; and
transmitting the backup target data from the backup source computer device to the backup destination computer devices on the basis of the received backup lists.
11. A computer device, comprising:
a component that stores characteristic classification conditions for classifying data characteristics;
a component that determines data characteristics pertaining to backup target data on the basis of the characteristic classification conditions;
a component that stores backup destination mapping information constituted such that at least one or more backup destination computer devices constituting a backup destination correspond(s) with each of the data characteristics;
a component that selects a backup mode by determining a backup destination computer device for each of the backup target data on the basis of the determined data characteristics and the backup destination mapping information;
a component that collects, for each of the backup destination computer devices, the backup target data corresponding with the backup destination computer devices and generating, for each of the backup destination computer devices, a backup list including information specifying backup target data to be acquired by the backup destination computer devices;
a component that transmits the generated backup lists to the backup destination computer devices; and
a component that transfers the backup target data to the backup destination computer devices when the backup destination computer devices request the acquisition of backup target data on the basis of the transmitted backup lists.
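The last two components of claim 11 describe a pull-style exchange: the source transmits a backup list, and the data moves only when the destination requests it on the basis of that list. A minimal in-process Python sketch follows. The class names, method names, and the rule that the source refuses paths that were not on the transmitted list are assumptions of this sketch, not details stated in the claim; a real system would carry these calls over a network.

```python
class BackupSource:
    """Backup source device holding the target data (paths and contents are made up)."""

    def __init__(self, files):
        self._files = dict(files)   # path -> file contents
        self._sent_lists = {}       # destination name -> set of listed paths

    def transmit_backup_list(self, destination, paths):
        # Remember what was listed, then hand the list to the destination.
        self._sent_lists[destination.name] = set(paths)
        destination.receive_backup_list(list(paths))

    def transfer(self, destination_name, path):
        # Assumption of this sketch: only data named on the transmitted list is served.
        if path not in self._sent_lists.get(destination_name, set()):
            raise PermissionError(
                f"{path} is not on the backup list for {destination_name}"
            )
        return self._files[path]


class BackupDestination:
    """Backup destination device that pulls data according to its received list."""

    def __init__(self, name):
        self.name = name
        self.backup_list = []
        self.stored = {}

    def receive_backup_list(self, paths):
        self.backup_list = paths

    def acquire(self, source):
        # Request each listed item from the source and store what comes back.
        for path in self.backup_list:
            self.stored[path] = source.transfer(self.name, path)
```

A typical call order would be `source.transmit_backup_list(destination, paths)` followed by `destination.acquire(source)`, so the destination decides when the transfer actually happens.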
US10/794,241 2003-09-12 2004-03-05 Backup system and method based on data characteristics Expired - Fee Related US7100007B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US11/486,610 US20060259724A1 (en) 2003-09-12 2006-07-13 Backup system and method based on data characteristics

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2003-320771 2003-09-12
JP2003320771A JP4404246B2 (en) 2003-09-12 2003-09-12 Backup system and method based on data characteristics

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US11/486,610 Continuation US20060259724A1 (en) 2003-09-12 2006-07-13 Backup system and method based on data characteristics

Publications (2)

Publication Number Publication Date
US20050060356A1 US20050060356A1 (en) 2005-03-17
US7100007B2 US7100007B2 (en) 2006-08-29

Family

ID=34269936

Family Applications (2)

Application Number Title Priority Date Filing Date
US10/794,241 Expired - Fee Related US7100007B2 (en) 2003-09-12 2004-03-05 Backup system and method based on data characteristics
US11/486,610 Abandoned US20060259724A1 (en) 2003-09-12 2006-07-13 Backup system and method based on data characteristics

Country Status (2)

Country Link
US (2) US7100007B2 (en)
JP (1) JP4404246B2 (en)

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050125467A1 (en) * 2002-12-11 2005-06-09 Fujitsu Limited Backup system, backup controlling apparatus, backup data managing method and a computer readable recording medium recorded thereon backup controlling program
US20060036658A1 (en) * 2004-08-13 2006-02-16 Henrickson David L Combined computer disaster recovery and migration tool for effective disaster recovery as well as the backup and migration of user- and system-specific information
US20060195666A1 (en) * 2005-02-25 2006-08-31 Naoko Maruyama Switching method of data replication mode
US20060288058A1 (en) * 2005-04-28 2006-12-21 Farstone Tech., Inc. Backup/recovery system and methods regarding the same
US20070198610A1 (en) * 2006-02-17 2007-08-23 Hon Hai Precision Industry Co., Ltd. System and method for backing up a database
US20070233828A1 (en) * 2006-03-31 2007-10-04 Jeremy Gilbert Methods and systems for providing data storage and retrieval
US20080140960A1 (en) * 2006-12-06 2008-06-12 Jason Ferris Basler System and method for optimizing memory usage during data backup
US20090254593A1 (en) * 2008-04-03 2009-10-08 Memeo, Inc. Online-assisted backup and restore
US20100005287A1 (en) * 2001-03-27 2010-01-07 Micron Technology, Inc. Data security for digital data storage
US8341121B1 (en) * 2007-09-28 2012-12-25 Emc Corporation Imminent failure prioritized backup
US8656057B1 (en) 2009-04-01 2014-02-18 Emc Corporation Opportunistic restore
CN105868049A (en) * 2015-12-15 2016-08-17 乐视移动智能信息技术(北京)有限公司 Data processing method and apparatus
US9537705B1 (en) * 2009-03-31 2017-01-03 EMC IP Holding Company LLC Global space reduction groups
US9626305B1 (en) * 2009-03-31 2017-04-18 EMC IP Holding Company LLC Complementary space reduction
US10210054B1 (en) * 2017-08-31 2019-02-19 International Business Machines Corporation Backup optimization in hybrid storage environment
US11321189B2 (en) 2014-04-02 2022-05-03 Commvault Systems, Inc. Information management by a media agent in the absence of communications with a storage manager
US11429499B2 (en) 2016-09-30 2022-08-30 Commvault Systems, Inc. Heartbeat monitoring of virtual machines for initiating failover operations in a data storage management system, including operations by a master monitor node
US11449394B2 (en) * 2010-06-04 2022-09-20 Commvault Systems, Inc. Failover systems and methods for performing backup operations, including heterogeneous indexing and load balancing of backup and indexing resources
US11550680B2 (en) 2018-12-06 2023-01-10 Commvault Systems, Inc. Assigning backup resources in a data storage management system based on failover of partnered data storage resources
US11645175B2 (en) 2021-02-12 2023-05-09 Commvault Systems, Inc. Automatic failover of a storage manager
US11663099B2 (en) 2020-03-26 2023-05-30 Commvault Systems, Inc. Snapshot-based disaster recovery orchestration of virtual machine failover and failback operations

Families Citing this family (132)

Publication number Priority date Publication date Assignee Title
US6879988B2 (en) * 2000-03-09 2005-04-12 Pkware System and method for manipulating and managing computer archive files
US7844579B2 (en) * 2000-03-09 2010-11-30 Pkware, Inc. System and method for manipulating and managing computer archive files
US8230482B2 (en) * 2000-03-09 2012-07-24 Pkware, Inc. System and method for manipulating and managing computer archive files
US20060173847A1 (en) * 2000-03-09 2006-08-03 Pkware, Inc. System and method for manipulating and managing computer archive files
US20060143714A1 (en) * 2000-03-09 2006-06-29 Pkware, Inc. System and method for manipulating and managing computer archive files
US20050015608A1 (en) 2003-07-16 2005-01-20 Pkware, Inc. Method for strongly encrypting .ZIP files
US8959582B2 (en) 2000-03-09 2015-02-17 Pkware, Inc. System and method for manipulating and managing computer archive files
JP2004015141A (en) * 2002-06-04 2004-01-15 Fuji Xerox Co Ltd System and method for transmitting data
JP4354233B2 (en) * 2003-09-05 2009-10-28 株式会社日立製作所 Backup system and method
WO2005050952A1 (en) * 2003-11-21 2005-06-02 Nimcat Networks Inc. Back up of network devices
US20060070120A1 (en) * 2004-09-02 2006-03-30 Brother Kogyo Kabushiki Kaisha File transmitting device and multi function device
US8145601B2 (en) 2004-09-09 2012-03-27 Microsoft Corporation Method, system, and apparatus for providing resilient data transfer in a data protection system
US7567974B2 (en) 2004-09-09 2009-07-28 Microsoft Corporation Method, system, and apparatus for configuring a data protection system
TWI252413B (en) * 2004-12-10 2006-04-01 Hon Hai Prec Ind Co Ltd System and method for updating remote computer files
US8122191B2 (en) 2005-02-17 2012-02-21 Overland Storage, Inc. Data protection systems with multiple site replication
US7600133B2 (en) * 2005-02-24 2009-10-06 Lenovo Singapore Pte. Ltd Backing up at least one encrypted computer file
US20060218435A1 (en) * 2005-03-24 2006-09-28 Microsoft Corporation Method and system for a consumer oriented backup
US20060288057A1 (en) * 2005-06-15 2006-12-21 Ian Collins Portable data backup appliance
JP2007025858A (en) * 2005-07-13 2007-02-01 Konica Minolta Photo Imaging Inc Display program
US7734589B1 (en) 2005-09-16 2010-06-08 Qurio Holdings, Inc. System and method for optimizing data uploading in a network based media sharing system
US7747574B1 (en) * 2005-09-19 2010-06-29 Qurio Holdings, Inc. System and method for archiving digital media
US7818160B2 (en) * 2005-10-12 2010-10-19 Storage Appliance Corporation Data backup devices and methods for backing up data
US7813913B2 (en) * 2005-10-12 2010-10-12 Storage Appliance Corporation Emulation component for data backup applications
US7844445B2 (en) * 2005-10-12 2010-11-30 Storage Appliance Corporation Automatic connection to an online service provider from a backup system
US20070162271A1 (en) * 2005-10-12 2007-07-12 Storage Appliance Corporation Systems and methods for selecting and printing data files from a backup system
US20080028008A1 (en) * 2006-07-31 2008-01-31 Storage Appliance Corporation Optical disc initiated data backup
US8195444B2 (en) * 2005-10-12 2012-06-05 Storage Appliance Corporation Systems and methods for automated diagnosis and repair of storage devices
US20070091746A1 (en) * 2005-10-12 2007-04-26 Storage Appliance Corporation Optical disc for simplified data backup
US7822595B2 (en) * 2005-10-12 2010-10-26 Storage Appliance Corporation Systems and methods for selectively copying embedded data files
US8069271B2 (en) * 2005-10-12 2011-11-29 Storage Appliance Corporation Systems and methods for converting a media player into a backup device
US7702830B2 (en) * 2005-10-12 2010-04-20 Storage Appliance Corporation Methods for selectively copying data files to networked storage and devices for initiating the same
US7899662B2 (en) * 2005-10-12 2011-03-01 Storage Appliance Corporation Data backup system including a data protection component
US9141825B2 (en) * 2005-11-18 2015-09-22 Qurio Holdings, Inc. System and method for controlling access to assets in a network-based media sharing system using tagging
US7657550B2 (en) 2005-11-28 2010-02-02 Commvault Systems, Inc. User interfaces and methods for managing data in a metabase
US20070136200A1 (en) * 2005-12-09 2007-06-14 Microsoft Corporation Backup broker for private, integral and affordable distributed storage
US8930496B2 (en) 2005-12-19 2015-01-06 Commvault Systems, Inc. Systems and methods of unified reconstruction in storage systems
US20200257596A1 (en) 2005-12-19 2020-08-13 Commvault Systems, Inc. Systems and methods of unified reconstruction in storage systems
US7966513B2 (en) * 2006-02-03 2011-06-21 Emc Corporation Automatic classification of backup clients
JP2007257814A (en) * 2006-02-27 2007-10-04 Fujitsu Ltd Library system and method for controlling library system
WO2007138463A2 (en) * 2006-05-31 2007-12-06 Pankaj Anand Local data archiving method and system thereof
US8078580B2 (en) 2006-05-31 2011-12-13 Hewlett-Packard Development Company, L.P. Hybrid data archival method and system thereof
US9052826B2 (en) * 2006-07-28 2015-06-09 Condusiv Technologies Corporation Selecting storage locations for storing data based on storage location attributes and data usage statistics
US7870128B2 (en) * 2006-07-28 2011-01-11 Diskeeper Corporation Assigning data for storage based on speed with which data may be retrieved
US7536504B2 (en) * 2006-07-28 2009-05-19 Diskeeper Corporation Online storage medium transfer rate characteristics determination
US20090132621A1 (en) * 2006-07-28 2009-05-21 Craig Jensen Selecting storage location for file storage based on storage longevity and speed
US20080082453A1 (en) * 2006-10-02 2008-04-03 Storage Appliance Corporation Methods for bundling credits with electronic devices and systems for implementing the same
JP2008117342A (en) * 2006-11-08 2008-05-22 Hitachi Ltd Storage system, and controller for controlling remote copying
US20080126446A1 (en) * 2006-11-27 2008-05-29 Storage Appliance Corporation Systems and methods for backing up user settings
JP4930031B2 (en) * 2006-12-13 2012-05-09 富士通株式会社 Control device and control system
US20080172487A1 (en) * 2007-01-03 2008-07-17 Storage Appliance Corporation Systems and methods for providing targeted marketing
US20080226082A1 (en) * 2007-03-12 2008-09-18 Storage Appliance Corporation Systems and methods for secure data backup
US8924844B2 (en) * 2007-03-13 2014-12-30 Visual Cues Llc Object annotation
DE102007013139A1 (en) * 2007-03-15 2008-09-18 Stefan Kistner Method and computer promo product for classifying electronic data
US20080270453A1 (en) * 2007-04-16 2008-10-30 Memeo, Inc. Keyword-based content management
US20090030955A1 (en) * 2007-06-11 2009-01-29 Storage Appliance Corporation Automated data backup with graceful shutdown for vista-based system
US20090031298A1 (en) * 2007-06-11 2009-01-29 Jeffrey Brunet System and method for automated installation and/or launch of software
US20080320053A1 (en) * 2007-06-21 2008-12-25 Michio Iijima Data management method for accessing data storage area based on characteristic of stored data
US7991972B2 (en) * 2007-12-06 2011-08-02 International Business Machines Corporation Determining whether to use a full volume or repository for a logical copy backup space
FR2924839B1 (en) * 2007-12-06 2010-03-19 Agematis METHOD FOR AUTOMATICALLY SAVING DIGITAL DATA PRESERVED IN MEMORY IN A COMPUTER INSTALLATION, COMPUTER-READABLE DATA MEDIUM, COMPUTER-BASED INSTALLATION AND SYSTEM FOR IMPLEMENTING SAID METHOD
US8296301B2 (en) 2008-01-30 2012-10-23 Commvault Systems, Inc. Systems and methods for probabilistic data classification
US7836174B2 (en) 2008-01-30 2010-11-16 Commvault Systems, Inc. Systems and methods for grid-based data scanning
US8819363B2 (en) * 2008-02-12 2014-08-26 Fujitsu Limited Data copying method
WO2009122528A1 (en) 2008-03-31 2009-10-08 富士通株式会社 Integrated configuration management device, disparate configuration management device, and backup data management system
US8745001B1 (en) * 2008-03-31 2014-06-03 Symantec Operating Corporation Automated remediation of corrupted and tempered files
US8307177B2 (en) 2008-09-05 2012-11-06 Commvault Systems, Inc. Systems and methods for management of virtualization data
US8930423B1 (en) * 2008-12-30 2015-01-06 Symantec Corporation Method and system for restoring encrypted files from a virtual machine image
JP5387827B2 (en) * 2009-03-19 2014-01-15 日本電気株式会社 Network management device, network management system, network management method, and program
JP5683088B2 (en) * 2009-08-31 2015-03-11 沖電気工業株式会社 Recovery system, recovery method, and backup control system
US8566288B1 (en) * 2009-08-31 2013-10-22 Cms Products, Inc. Organized data removal or redirection from a cloning process to enable cloning a larger system to a smaller system
WO2011036707A1 (en) * 2009-09-24 2011-03-31 Hitachi, Ltd. Computer system for controlling backups using wide area network
US8566287B2 (en) * 2010-01-29 2013-10-22 Hewlett-Packard Development Company, L.P. Method and apparatus for scheduling data backups
US8413137B2 (en) * 2010-02-04 2013-04-02 Storage Appliance Corporation Automated network backup peripheral device and method
US8725970B2 (en) * 2011-08-23 2014-05-13 Ca, Inc. System and method for backing up data
US8626714B1 (en) * 2011-09-07 2014-01-07 Symantec Corporation Automated separation of corporate and private data for backup and archiving
US8700572B2 (en) 2011-12-20 2014-04-15 Hitachi, Ltd. Storage system and method for controlling storage system
JP5889009B2 (en) * 2012-01-31 2016-03-22 キヤノン株式会社 Document management server, document management method, program
US8521692B1 (en) 2012-02-28 2013-08-27 Hitachi, Ltd. Storage system and method for controlling storage system
US8914663B2 (en) 2012-03-28 2014-12-16 Hewlett-Packard Development Company, L.P. Rescheduling failed backup jobs
CN102750476B (en) * 2012-06-07 2015-04-08 腾讯科技(深圳)有限公司 Method and system for identifying file security
US8892523B2 (en) 2012-06-08 2014-11-18 Commvault Systems, Inc. Auto summarization of content
US10581763B2 (en) 2012-09-21 2020-03-03 Avago Technologies International Sales Pte. Limited High availability application messaging layer
US9967106B2 (en) 2012-09-24 2018-05-08 Brocade Communications Systems LLC Role based multicast messaging infrastructure
US9454995B2 (en) 2012-11-30 2016-09-27 Samsung Electronics Co., Ltd. Information storage medium storing content, content providing method, content reproducing method and apparatus therefor
CN103064757A (en) * 2012-12-12 2013-04-24 鸿富锦精密工业(深圳)有限公司 Method and system for backing up data
US9069482B1 (en) * 2012-12-14 2015-06-30 Emc Corporation Method and system for dynamic snapshot based backup and recovery operations
US10725996B1 (en) * 2012-12-18 2020-07-28 EMC IP Holding Company LLC Method and system for determining differing file path hierarchies for backup file paths
US9223597B2 (en) 2012-12-21 2015-12-29 Commvault Systems, Inc. Archiving virtual machines in a data storage system
US20140181038A1 (en) 2012-12-21 2014-06-26 Commvault Systems, Inc. Systems and methods to categorize unprotected virtual machines
US9703584B2 (en) 2013-01-08 2017-07-11 Commvault Systems, Inc. Virtual server agent load balancing
US20140201162A1 (en) 2013-01-11 2014-07-17 Commvault Systems, Inc. Systems and methods to restore selected files from block-level backup for virtual machines
US9286110B2 (en) 2013-01-14 2016-03-15 Commvault Systems, Inc. Seamless virtual machine recall in a data storage system
US20150370645A1 (en) * 2013-02-27 2015-12-24 Hewlett-Packard Development Company, L.P. Selecting a backup type based on changed data
JP6015850B2 (en) * 2013-03-29 2016-10-26 日本電気株式会社 Information processing system, server device, program, and information processing method
US20150074536A1 (en) 2013-09-12 2015-03-12 Commvault Systems, Inc. File manager integration with virtualization in an information management system, including user control and storage management of virtual machines
US20160019317A1 (en) 2014-07-16 2016-01-21 Commvault Systems, Inc. Volume or virtual machine level backup and generating placeholders for virtual machine files
US9417968B2 (en) 2014-09-22 2016-08-16 Commvault Systems, Inc. Efficiently restoring execution of a backed up virtual machine based on coordination with virtual-machine-file-relocation operations
US9710465B2 (en) 2014-09-22 2017-07-18 Commvault Systems, Inc. Efficiently restoring execution of a backed up virtual machine based on coordination with virtual-machine-file-relocation operations
US9436555B2 (en) 2014-09-22 2016-09-06 Commvault Systems, Inc. Efficient live-mount of a backed up virtual machine in a storage management system
US10776209B2 (en) 2014-11-10 2020-09-15 Commvault Systems, Inc. Cross-platform virtual machine backup and replication
US9983936B2 (en) 2014-11-20 2018-05-29 Commvault Systems, Inc. Virtual machine change block tracking
JP6511795B2 (en) * 2014-12-18 2019-05-15 富士通株式会社 STORAGE MANAGEMENT DEVICE, STORAGE MANAGEMENT METHOD, STORAGE MANAGEMENT PROGRAM, AND STORAGE SYSTEM
US9946603B1 (en) * 2015-04-14 2018-04-17 EMC IP Holding Company LLC Mountable container for incremental file backups
US10078555B1 (en) 2015-04-14 2018-09-18 EMC IP Holding Company LLC Synthetic full backups for incremental file backups
US9996429B1 (en) 2015-04-14 2018-06-12 EMC IP Holding Company LLC Mountable container backups for files
JP2016218906A (en) * 2015-05-25 2016-12-22 パナソニックIpマネジメント株式会社 Data recording and reproduction system
JP6520448B2 (en) 2015-06-18 2019-05-29 富士通株式会社 INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING DEVICE, AND INFORMATION PROCESSING DEVICE CONTROL METHOD
US10642633B1 (en) * 2015-09-29 2020-05-05 EMC IP Holding Company LLC Intelligent backups with dynamic proxy in virtualized environment
US10430283B1 (en) * 2015-09-30 2019-10-01 EMC IP Holding Company LLC Intelligent data dissemination
US10592350B2 (en) 2016-03-09 2020-03-17 Commvault Systems, Inc. Virtual server cloud file system for virtual machine restore to cloud operations
US10540516B2 (en) 2016-10-13 2020-01-21 Commvault Systems, Inc. Data protection within an unsecured storage environment
US10162528B2 (en) 2016-10-25 2018-12-25 Commvault Systems, Inc. Targeted snapshot based on virtual machine location
US10152251B2 (en) 2016-10-25 2018-12-11 Commvault Systems, Inc. Targeted backup of virtual machine
US10389810B2 (en) 2016-11-02 2019-08-20 Commvault Systems, Inc. Multi-threaded scanning of distributed file systems
US10922189B2 (en) 2016-11-02 2021-02-16 Commvault Systems, Inc. Historical network data-based scanning thread generation
US10678758B2 (en) 2016-11-21 2020-06-09 Commvault Systems, Inc. Cross-platform virtual machine data and memory backup and replication
US10896100B2 (en) 2017-03-24 2021-01-19 Commvault Systems, Inc. Buffered virtual machine replication
US10387073B2 (en) 2017-03-29 2019-08-20 Commvault Systems, Inc. External dynamic virtual machine synchronization
US10642886B2 (en) 2018-02-14 2020-05-05 Commvault Systems, Inc. Targeted search of backup data using facial recognition
US20190251204A1 (en) 2018-02-14 2019-08-15 Commvault Systems, Inc. Targeted search of backup data using calendar event data
US10877928B2 (en) 2018-03-07 2020-12-29 Commvault Systems, Inc. Using utilities injected into cloud-based virtual machines for speeding up virtual machine backup operations
US10984122B2 (en) * 2018-04-13 2021-04-20 Sophos Limited Enterprise document classification
JP2020047114A (en) * 2018-09-20 2020-03-26 富士ゼロックス株式会社 Data processing device, data processing method, and data processing program
US10996974B2 (en) 2019-01-30 2021-05-04 Commvault Systems, Inc. Cross-hypervisor live mount of backed up virtual machine data, including management of cache storage for virtual machine data
US10768971B2 (en) 2019-01-30 2020-09-08 Commvault Systems, Inc. Cross-hypervisor live mount of backed up virtual machine data
US11709740B2 (en) * 2019-07-18 2023-07-25 EMC IP Holding Company LLC Automatically determining optimal storage medium based on source data characteristics
US20220188195A1 (en) * 2019-07-18 2022-06-16 EMC IP Holding Company LLC Automatically determining optimal storage medium based on source data characteristics
US11467753B2 (en) 2020-02-14 2022-10-11 Commvault Systems, Inc. On-demand restore of virtual machine data
US11442768B2 (en) 2020-03-12 2022-09-13 Commvault Systems, Inc. Cross-hypervisor live recovery of virtual machines
CN111459718B (en) * 2020-03-31 2024-02-09 珠海格力电器股份有限公司 Multi-terminal system and data backup method and storage medium thereof
US11748143B2 (en) 2020-05-15 2023-09-05 Commvault Systems, Inc. Live mount of virtual machines in a public cloud computing environment
US11656951B2 (en) 2020-10-28 2023-05-23 Commvault Systems, Inc. Data loss vulnerability detection
CN114461762A (en) * 2022-04-08 2022-05-10 深圳市科力锐科技有限公司 Archive change identification method, device, equipment and storage medium

Citations (8)

Publication number Priority date Publication date Assignee Title
US6286085B1 (en) * 1997-04-21 2001-09-04 Alcatel System for backing up data synchronously and a synchronously depending on a pre-established criterion
US20020083085A1 (en) * 2000-12-22 2002-06-27 Davis Ray Charles Virtual tape storage system and method
JP2002215474A (en) 2001-01-15 2002-08-02 Fujitsu Ten Ltd Network data backup system
US20020184559A1 (en) 2001-06-01 2002-12-05 Farstone Technology Inc. Backup/recovery system and methods regarding the same
US20030126247A1 (en) * 2002-01-02 2003-07-03 Exanet Ltd. Apparatus and method for file backup using multiple backup devices
US20040034672A1 (en) 2002-05-30 2004-02-19 Takeshi Inagaki Data backup technique using network
US6850958B2 (en) 2001-05-25 2005-02-01 Fujitsu Limited Backup system, backup method, database apparatus, and backup apparatus
US20050125467A1 (en) 2002-12-11 2005-06-09 Fujitsu Limited Backup system, backup controlling apparatus, backup data managing method and a computer readable recording medium recorded thereon backup controlling program

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
JPH09204362A (en) * 1996-01-26 1997-08-05 F I T:Kk Backup system for file
JPH09214935A (en) * 1996-02-02 1997-08-15 Mitsubishi Electric Corp Video information service system
JP2003330783A (en) * 2002-05-15 2003-11-21 Secom Co Ltd Document storage system, document storage method and information recording medium
US20050081008A1 (en) * 2003-10-10 2005-04-14 Stephen Gold Loading of media

Patent Citations (8)

Publication number Priority date Publication date Assignee Title
US6286085B1 (en) * 1997-04-21 2001-09-04 Alcatel System for backing up data synchronously and a synchronously depending on a pre-established criterion
US20020083085A1 (en) * 2000-12-22 2002-06-27 Davis Ray Charles Virtual tape storage system and method
JP2002215474A (en) 2001-01-15 2002-08-02 Fujitsu Ten Ltd Network data backup system
US6850958B2 (en) 2001-05-25 2005-02-01 Fujitsu Limited Backup system, backup method, database apparatus, and backup apparatus
US20020184559A1 (en) 2001-06-01 2002-12-05 Farstone Technology Inc. Backup/recovery system and methods regarding the same
US20030126247A1 (en) * 2002-01-02 2003-07-03 Exanet Ltd. Apparatus and method for file backup using multiple backup devices
US20040034672A1 (en) 2002-05-30 2004-02-19 Takeshi Inagaki Data backup technique using network
US20050125467A1 (en) 2002-12-11 2005-06-09 Fujitsu Limited Backup system, backup controlling apparatus, backup data managing method and a computer readable recording medium recorded thereon backup controlling program

Cited By (31)

Publication number Priority date Publication date Assignee Title
US20100005287A1 (en) * 2001-03-27 2010-01-07 Micron Technology, Inc. Data security for digital data storage
US20120233454A1 (en) * 2001-03-27 2012-09-13 Rollins Doug L Data security for digital data storage
US8191159B2 (en) * 2001-03-27 2012-05-29 Micron Technology, Inc Data security for digital data storage
US9003177B2 (en) * 2001-03-27 2015-04-07 Micron Technology, Inc. Data security for digital data storage
US20050125467A1 (en) * 2002-12-11 2005-06-09 Fujitsu Limited Backup system, backup controlling apparatus, backup data managing method and a computer readable recording medium recorded thereon backup controlling program
US7539708B2 (en) * 2002-12-11 2009-05-26 Fujitsu Limited Backup system, backup controlling apparatus, backup data managing method and a computer readable recording medium recorded thereon backup controlling program
US20060036658A1 (en) * 2004-08-13 2006-02-16 Henrickson David L Combined computer disaster recovery and migration tool for effective disaster recovery as well as the backup and migration of user- and system-specific information
US8224784B2 (en) * 2004-08-13 2012-07-17 Microsoft Corporation Combined computer disaster recovery and migration tool for effective disaster recovery as well as the backup and migration of user- and system-specific information
US20060195666A1 (en) * 2005-02-25 2006-08-31 Naoko Maruyama Switching method of data replication mode
US7398364B2 (en) * 2005-02-25 2008-07-08 Hitachi, Ltd. Switching method of data replication mode
US20060288058A1 (en) * 2005-04-28 2006-12-21 Farstone Tech., Inc. Backup/recovery system and methods regarding the same
US20070198610A1 (en) * 2006-02-17 2007-08-23 Hon Hai Precision Industry Co., Ltd. System and method for backing up a database
US20070233828A1 (en) * 2006-03-31 2007-10-04 Jeremy Gilbert Methods and systems for providing data storage and retrieval
US20080140960A1 (en) * 2006-12-06 2008-06-12 Jason Ferris Basler System and method for optimizing memory usage during data backup
US8341121B1 (en) * 2007-09-28 2012-12-25 Emc Corporation Imminent failure prioritized backup
US20090254593A1 (en) * 2008-04-03 2009-10-08 Memeo, Inc. Online-assisted backup and restore
US10430289B2 (en) * 2008-04-03 2019-10-01 Unicom Systems, Inc. Online-assisted backup and restore
US9537705B1 (en) * 2009-03-31 2017-01-03 EMC IP Holding Company LLC Global space reduction groups
US9626305B1 (en) * 2009-03-31 2017-04-18 EMC IP Holding Company LLC Complementary space reduction
US8656057B1 (en) 2009-04-01 2014-02-18 Emc Corporation Opportunistic restore
US12001295B2 (en) 2010-06-04 2024-06-04 Commvault Systems, Inc. Heterogeneous indexing and load balancing of backup and indexing resources
US11449394B2 (en) * 2010-06-04 2022-09-20 Commvault Systems, Inc. Failover systems and methods for performing backup operations, including heterogeneous indexing and load balancing of backup and indexing resources
US11321189B2 (en) 2014-04-02 2022-05-03 Commvault Systems, Inc. Information management by a media agent in the absence of communications with a storage manager
CN105868049A (en) * 2015-12-15 2016-08-17 乐视移动智能信息技术(北京)有限公司 Data processing method and apparatus
US11429499B2 (en) 2016-09-30 2022-08-30 Commvault Systems, Inc. Heartbeat monitoring of virtual machines for initiating failover operations in a data storage management system, including operations by a master monitor node
US10552269B2 (en) * 2017-08-31 2020-02-04 International Business Machines Corporation Backup optimization in hybrid storage environment
US10210054B1 (en) * 2017-08-31 2019-02-19 International Business Machines Corporation Backup optimization in hybrid storage environment
US11550680B2 (en) 2018-12-06 2023-01-10 Commvault Systems, Inc. Assigning backup resources in a data storage management system based on failover of partnered data storage resources
US11663099B2 (en) 2020-03-26 2023-05-30 Commvault Systems, Inc. Snapshot-based disaster recovery orchestration of virtual machine failover and failback operations
US11645175B2 (en) 2021-02-12 2023-05-09 Commvault Systems, Inc. Automatic failover of a storage manager
US12056026B2 (en) 2021-02-12 2024-08-06 Commvault Systems, Inc. Automatic failover of a storage manager

Also Published As

Publication number Publication date
US20060259724A1 (en) 2006-11-16
US20050060356A1 (en) 2005-03-17
JP2005092282A (en) 2005-04-07
JP4404246B2 (en) 2010-01-27

Similar Documents

Publication Publication Date Title
US7100007B2 (en) Backup system and method based on data characteristics
US11561931B2 (en) Information source agent systems and methods for distributed data storage and management using content signatures
EP1513065B1 (en) File system and file transfer method between file sharing devices
JP4648723B2 (en) Method and apparatus for hierarchical storage management based on data value
US8700576B2 (en) Method, system, and program for archiving files
EP3133507A1 (en) Context-based data classification
US20080104145A1 (en) Method and appartus for backup of networked computers
US7234077B2 (en) Rapid restoration of file system usage in very large file systems
US6990631B2 (en) Document management apparatus, related document extracting method, and document processing assist method
JP4426280B2 (en) Backup / restore system and method
US7840750B2 (en) Electrical transmission system in secret environment between virtual disks and electrical transmission method thereof
US20020152261A1 (en) Method and system for preventing the infringement of intellectual property rights
US20050086447A1 (en) Program and apparatus for blocking information leaks, and storage medium for the program
US20070276823A1 (en) Data management systems and methods for distributed data storage and management using content signatures
US20100306180A1 (en) File revision management
US20070083487A1 (en) Document preservation
US8065743B2 (en) Content use management system, content-providing system, content-using device and computer readable medium
JPH09214935A (en) Video information service system
RU2715288C1 (en) System and method of deleting files and counteracting recovery thereof
US6941322B2 (en) Method for efficient recording and management of data changes to an object
US7912859B2 (en) Information processing apparatus, system, and method for managing documents used in an organization
US8316008B1 (en) Fast file attribute search
EP1156426A2 (en) Content managing system, content managing method, and camera apparatus
EP1156425A2 (en) Content managing system and content managing method
US9002788B2 (en) System for configurable reporting of network data and related method

Legal Events

Date Code Title Description
AS Assignment

Owner name: HITACHI, LTD., JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:SAIKA, NOBUYUKI;REEL/FRAME:015619/0475

Effective date: 20040626

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FEPP Fee payment procedure

Free format text: PAYER NUMBER DE-ASSIGNED (ORIGINAL EVENT CODE: RMPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FPAY Fee payment

Year of fee payment: 4

FPAY Fee payment

Year of fee payment: 8

FEPP Fee payment procedure

Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.)

LAPS Lapse for failure to pay maintenance fees

Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20180829