US20070198690A1 - Data Management System - Google Patents
- Publication number
- US20070198690A1 (application Ser. No. 11/733,305)
- Authority
- US
- United States
- Prior art keywords
- data
- server
- information
- storage
- stored
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/9032—Query formulation
- G06F16/90324—Query formulation using system suggestions
- G06F16/90328—Query formulation using system suggestions using search space presentation or visualization, e.g. category or range presentation and selection
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99941—Database schema or data structure
- Y10S707/99942—Manipulating data structure, e.g. compression, compaction, compilation
- Y10S707/99943—Generating database or data structure, e.g. via user interface
- Y10S707/99944—Object-oriented database structure
- Y10S707/99945—Object-oriented database structure processing
- Y10S707/99951—File or database maintenance
- Y10S707/99952—Coherency, e.g. same view to multiple users
- Y10S707/99953—Recoverability
Definitions
- This invention relates to systems for storing data, and in particular to storage systems in which data is distributed among large numbers of hard disk drives or other storage media.
- In a typical data storage network, data from many different applications is stored and retrieved, and it is difficult to track the relationships among all of the stored data.
- an e-mail server generates original data and provides it to a storage system.
- An archive server may archive some parts of the data to different parts of the storage system or to different storage systems.
- a replication server may replicate the original data to different storage, and the data may be backed up by a backup server to yet further storage. While each of these data handling processes operates on its associated data in an appropriate manner, the archive server, the replication server, and the backup server each operate independently. Each has its own catalog or other mechanism for managing how the data is stored and retrieved. Because of the distributed nature of the system and the lack of consolidated catalogs, a user of a storage system typically cannot reliably understand where data is situated in that storage system.
- the complexity of storage systems increases the probability of mistakes.
- some parts of the original data are not stored in the original storage, but instead have been stored in the archive storage.
- a replication of the original data will not contain the archive data.
- the backup data will also not contain the archive data. Therefore, when a user restores data from the backup, because the backup data is not a complete backup of the original data, not all of the original data will be restored. All of this complexity makes managing the data in a coherent manner difficult and error-prone.
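The restore gap described above can be sketched as a simple set difference. This is an illustrative sketch, not code from the patent; the function and file names are hypothetical.

```python
def restore_gap(original_files, backup_files):
    """Files in the original data set that a restore from this backup
    cannot recover -- e.g. files that were moved to archive storage
    before the backup ran, so they never entered the backup image."""
    return sorted(set(original_files) - set(backup_files))

# fileC was archived out of primary storage before the backup ran,
# so restoring from the backup loses it.
lost = restore_gap(
    original_files=["fileA", "fileB", "fileC"],
    backup_files=["fileA", "fileB"],
)
```

A consolidated view of the archive, replication, and backup catalogs would let the system compute exactly this difference before a restore is attempted.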
- One commercially available tool for use in management of a data storage system is SANPoint Control.
- Another is the AppIQ Storage Authority Suite.
- This system provides information about the hardware in the storage system, including hosts, bus adapters, switches, disk subsystems, etc. It also provides capabilities for management of particular applications running on the storage system, for example, Oracle databases, file servers, etc.
- Another commercially available tool for use in storage systems is the Aptare StorageConsole.
- This application software provides increased reliability for backup and restore operations in a storage system.
- the Storage Resource Broker from Nirvana is software that enables users of systems to share and manage files stored in various locations. It provides various searching and presentation functions to enable users to find particular files or information stored in various portions of large data storage units.
- It would therefore be desirable to have a system that gives a user a complete view of the data handling processes and the relationships among those processes, to reduce the chance of error and improve the efficiency with which the data is managed.
- a system provides a method for collecting information about data and data handling processes from different types of data applications.
- This invention enables a user of the system to appreciate relationships among the data. It shows the data in a system view and can illustrate the relationships among the data stored in the system with a graphical user interface.
- a data manager collects information about the relationships among data and files stored therein and presents them to a user.
- the graphical user interface provides the user with the option of choosing from among three different views of data handling processes. These include a data view which illustrates how data are related to each other, for example, by showing where a particular file has been archived, replicated, or backed up. Preferably the system also provides a storage view which illustrates how the data volumes are related, for example, indicating which volumes in the storage system have the original data, the archived data, replica data, and backed up data.
- a third view for information in the storage system is referred to as the path view.
- the path view illustrates how data is transferred through the system by various data handling processes, for example indicating which ports, switches, and storage handle particular files or other data.
- a system according to this invention provides a way to detect erroneous configurations of backup data by comparison of the amount of backup data with the amount of original data.
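The misconfiguration check just described compares the amount of backup data against the amount of original data. A minimal sketch (hypothetical function name and tolerance parameter, not specified by the patent):

```python
def backup_looks_incomplete(original_bytes, backup_bytes, tolerance=0.05):
    """Flag a backup whose total size falls short of the original data
    size by more than the tolerance fraction -- a hint that some of the
    original data (e.g. parts moved to archive storage) never entered
    the backup."""
    return backup_bytes < original_bytes * (1.0 - tolerance)

# A 70 GB backup of 100 GB of original data is suspicious.
suspect = backup_looks_incomplete(100 * 2**30, 70 * 2**30)
```

In practice the sizes would come from the backup catalog and the data descriptors the data manager collects.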
- a storage system having a replication server, a backup server, and an archive server further includes a data manager which tracks the stored data in at least two of three approaches.
- the stored data is tracked by presenting file name relationships among the replicated, backup, or archived copies of the stored data.
- the physical locations within the storage system for example, in terms of volumes, are presented.
- path information depicting the processes by which the data arrived at its storage location is provided for the replicated, backup, or archived copies of the stored data.
- FIG. 1 is a block diagram illustrating a system configuration for a typical storage area network including a data manager according to this invention
- FIG. 2 illustrates an archive catalog for an archive profile
- FIG. 3 illustrates an archive catalog for media information
- FIG. 4 illustrates an archive catalog for archived data
- FIG. 5 illustrates a backup catalog for a backup profile
- FIG. 6 illustrates a backup catalog for media information
- FIG. 7 illustrates a backup catalog for backup data
- FIG. 8 illustrates a replication catalog
- FIG. 9 illustrates a device catalog for a volume
- FIG. 10 illustrates a device catalog for storage
- FIG. 11 illustrates a device catalog for a file system
- FIG. 12 illustrates a device catalog for a path
- FIG. 13 illustrates a device catalog for an application
- FIG. 14 illustrates an archive catalog for an archive profile
- FIG. 15 illustrates an archive catalog for archived data
- FIG. 16 is a block diagram of one example of interconnections in a storage system
- FIG. 17 illustrates a data descriptor
- FIG. 18 illustrates a relationship descriptor for archived data
- FIG. 19 illustrates a relationship descriptor for backup data
- FIG. 20 illustrates a relationship descriptor for replication data
- FIG. 21 illustrates a relationship descriptor for application data
- FIG. 22 illustrates another relationship descriptor for archived data
- FIG. 23 illustrates a discovered configuration table
- FIG. 24 is an example of a discovered data table
- FIG. 25 is an example of a discovered relationship table
- FIG. 26 is an example of a GUI for a view of the data
- FIG. 27 is an illustration of a GUI for a view of the storage system
- FIG. 28 is an example of a GUI for a view of the path information
- FIG. 29 illustrates a process for data discovery
- FIG. 30 illustrates details of the Get Data From App process shown in FIG. 29 ;
- FIG. 31 illustrates details of the Get Data From Backup process shown in FIG. 29 ;
- FIG. 32 illustrates further details of the Get Data From Backup process shown in FIG. 29 ;
- FIG. 33 illustrates details of the Get Data From Archive process shown in FIG. 29 ;
- FIG. 34 illustrates further details of the Get Data From Archive process shown in FIG. 29 ;
- FIG. 35 illustrates details of the Get Data from Replica process shown in FIG. 29 ;
- FIG. 36 is a flow chart illustrating the steps for depicting the data view
- FIG. 37 is a flow chart illustrating the steps for depicting the storage view
- FIG. 38 is a flow chart illustrating the steps for depicting the path view.
- FIG. 39 is a flow chart illustrating the steps for checking backup operations
- FIG. 1 is a block diagram illustrating a hypothetical typical storage system as might be found in a complex computing environment. Most of the components of the system shown in FIG. 1 are well known and thus are discussed only briefly herein. The data manager 111 , however, is not well known and is explained in detail below.
- the system shown in FIG. 1 includes two application servers 101 and 102 . These servers run computer programs 101 a and 102 a to provide computing resources to users of the overall system. By execution of a stored program, the applications 101 a and 102 a generate data which is stored in the system illustrated in FIG. 1 .
- a replication server 103 replicates data to different storage systems or volumes within the storage system to provide well known mirroring functionality.
- the replication server maintains a replication catalog 106 as will be discussed below.
- a backup server 104 provides data backup functionality to enable restoration of data at a later date should there be hardware, software, or facilities failures.
- a backup catalog 107 maintains a record of the backup operations, as also discussed below.
- Archive server 105 archives little-used data from primary storage areas to secondary storage areas to provide improved system performance and to reduce costs by maintaining the data on lower-cost media.
- archive server 105 maintains an archive catalog 108 , also explained further below.
- Although servers 101 - 105 have been discussed as though each were a standalone hardware implementation, this is not necessary.
- the servers may be implemented as separate processes running on a single large computer, or as separate processes running on separate processors within a connected array of computers.
- the system shown in FIG. 1 also includes a storage area manager 109 .
- the storage area manager is preferably a management server that manages the entire network depicted in FIG. 1 , including the servers and the storage systems 115 , 116 , and 117 .
- the storage area manager maintains a device catalog 110 which is also discussed below. In essence, the storage area manager can retrieve information from the switches 114 , servers 101 . . . 105 , storage systems 115 - 117 , and the applications 101 a, 102 a.
- Storage area managers such as the one depicted in FIG. 1 are often implemented using a standard protocol such as DMTF's CIM. Another way to implement the storage area manager is to install an agent on each server and have the agent collect information about that server locally and provide it to the storage area manager.
- switches 114 have become an increasingly popular connection technique. These switches are typically switches based on Fibre Channel, Ethernet, or broadband technology.
- the data received by the system or generated by the system as the result of its server operations is stored in storage systems such as 115 , 116 , and 117 .
- Each such storage system includes a disk controller 118 , 119 , and 120 , respectively, as well as hard disk drives 118 a . . . 120 b for storing data.
- FIG. 1 illustrates only two disk drives per storage system. In conventional implementations, however, hundreds of disk drives may be employed in the storage system.
- the disk controllers 118 , 119 and 120 control input and output requests issued from the servers to store and retrieve data from the hard disk drives.
- Storage system 115 is an enterprise Fibre Channel storage system. Such systems typically support SCSI as a data protocol between the servers and the storage systems.
- the Nearline PC storage system 116 operates in a similar manner but uses ATA-format hard disk drives.
- the Network Attached Storage system 117 supports NFS and CIFS as file protocols.
- the system of this invention can be applicable to any type of storage system.
- The components and systems shown in FIG. 1 are interconnected using two techniques.
- a network 100 is provided, for example based on TCP/IP/Ethernet to provide “out of band” communications.
- the main data handling, however, for the storage systems is provided by switches 114 which allow interconnections of desired components as necessitated by the particular operations to be performed.
- the system of this invention adds an additional component 111 , referred to herein as a data manager, to the overall system of FIG. 1 .
- This data manager communicates with the other components via the local area network 100 and the switches 114 .
- the data manager functions to collect data handling process information from the applications and the data applications and present the results to a user.
- the results are typically presented through a graphical user interface running on a console 113 .
- the data manager maintains a data catalog.
- the data catalog enables the data manager to present to the user various “views” of the storage system.
- the data manager 111 and data catalog together enable a user to view information about the physical locations where various files are stored, the path by which the information was stored, and other relationships among the data stored in the storage systems 115 , 116 , and 117 .
- the data manager 111 creates and manages data descriptors, relationship descriptors, a discovered data table (discussed below) and a discovered relationship table (also discussed below). These tables are typically stored in local storage or network storage attached to the data manager.
- the data manager also uses a discovery configuration table as discussed below.
- the data manager itself may be configured by the console 113 .
- the data manager relies upon catalogs created and stored throughout the system as designated in FIG. 1 . These catalogs are discussed next.
- FIG. 2 is a diagram illustrating an archive catalog for the archive profile. This catalog is included within the catalog 108 shown in FIG. 1 .
- the catalog 200 shown in FIG. 2 describes which data is to be archived, at what time, and to which storage. In the example shown in FIG. 2 the data is to be archived if it is not accessed within 30 days.
- the data to be archived is set forth as the Folder, and the media to which it is to be archived is listed under Archive Media.
- FIG. 3 illustrates an archive catalog for media information. This catalog is also included within catalog 108 shown in FIG. 1 .
- the example in FIG. 3 illustrates that the Archive Media is actually an Archive Folder having a specified address associated with the specific server.
- FIG. 3 also indicates that the Folder has a maximum capacity as shown.
- FIG. 4 is a diagram illustrating an archive catalog for archive data. This catalog is included within catalog 108 shown in FIG. 1 .
- the indicated Source Data is shown as being archived at the designated media location as an Archive Stream at the Archive Time shown in FIG. 4 .
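The archive profile of FIG. 2 ("archive if not accessed within 30 days") amounts to a cutoff test on last-access times. The sketch below is illustrative only; the file names, dates, and function signature are hypothetical.

```python
from datetime import datetime, timedelta

def select_for_archive(last_access_times, now, unaccessed_days=30):
    """Apply an archive profile of the form 'archive data not accessed
    within N days': return the names whose last access predates the
    cutoff."""
    cutoff = now - timedelta(days=unaccessed_days)
    return sorted(name for name, last_access in last_access_times.items()
                  if last_access < cutoff)

now = datetime(2007, 4, 10)
stale = select_for_archive(
    {"fileA": datetime(2007, 1, 1),   # untouched for months -> archive
     "fileB": datetime(2007, 4, 5)},  # recently accessed -> keep
    now,
)
```

The archive server would then move the selected files to the Archive Media named in the profile and record the operation in the archive catalog for archived data (FIG. 4).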
- FIGS. 5-7 illustrate backup catalogs stored as catalog 107 in FIG. 1 .
- In FIG. 5 , an exemplary backup catalog for a backup profile is illustrated. This catalog describes how and when data is to be backed up.
- files under the folder designated by Source are to be backed up to the Backup Media at the Backup Time stated.
- the Backup Type indicates that all files are to be backed up, while the Next Backup Time indicates the time and date of the next backup operation.
- FIG. 6 is a diagram illustrating a backup catalog for media information. In a similar manner to FIG. 3 , it illustrates the physical location of the particular media designated, as well as its capacity.
- FIG. 7 illustrates a backup catalog for backup data. This catalog describes when and where data is backed up. In the example shown, two files as designated by Data Source have been backed up to the Backup Media at the time shown.
- FIG. 8 is a diagram illustrating a replication relationship between two devices in the storage system, and is referred to as a replication catalog. This diagram provides additional information with regard to the replication catalog 106 in FIG. 1 .
- the replication catalog describes the relationship between two data storage locations, commonly known as LDEVs in the storage system. As shown by FIG. 8 , the data in the Primary Storage is replicated to the Secondary Storage location. The Mode indicates whether the backup is to be synchronous or asynchronous.
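A replication catalog entry of this kind can be sketched as a small record pairing a primary LDEV with a secondary LDEV plus the mode. This is a hypothetical representation, not the patent's actual on-disk format.

```python
from dataclasses import dataclass

@dataclass
class ReplicationEntry:
    primary_ldev: str     # LDEV holding the original data
    secondary_ldev: str   # LDEV holding the replica
    mode: str             # "sync" or "async"

def replica_of(catalog, primary_ldev):
    """Look up where a primary LDEV is replicated to, if anywhere."""
    for entry in catalog:
        if entry.primary_ldev == primary_ldev:
            return entry.secondary_ldev
    return None

# Matches the example configuration discussed later: LDEV1 -> LDEV3.
catalog = [ReplicationEntry("LDEV1", "LDEV3", "sync")]
```

The data manager consults exactly this kind of mapping when it builds the replication relationship descriptors discussed below.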
- FIG. 9 is a diagram illustrating a device catalog for a volume, with FIGS. 10-13 illustrating other device catalogs, all incorporated within catalog 110 in FIG. 1 .
- the volume catalog 207 shown in FIG. 9 includes the volume identification, name, address, port, logical unit number, etc.
- FIG. 10 illustrates a device catalog 208 for storage.
- This catalog provides information about a storage system. As shown, the catalog includes an identification, name, address, capacity, information about ports coupled to the storage, etc.
- FIG. 11 illustrates a catalog 220 for a file system. As shown there, the catalog includes information about identification, physical volume location, file system type, free space, etc. Similarly, FIG. 12 illustrates a device catalog for a path 221 . This catalog includes identification information and worldwide name identification.
- FIG. 13 is a device catalog 222 for an application. As shown by FIG. 13 , the catalog includes identification, application type, host name, and associated data files.
- FIGS. 14 and 15 illustrate an archive catalog for message based archiving.
- FIGS. 2-4 illustrated archive catalogs for file-based archiving.
- the archiving is performed at an application level. For example, an e-mail server may store messages into data files and an archive server then communicates with the e-mail server to archive the messages themselves, instead of the data files.
- the archive profile also indicates the name of a server and the name of an application.
- FIG. 14 illustrates an archive catalog 223 for an archive profile for the case just described. As shown, the application is indicated with A as well as the media name MN, and the media and timing information. The media information itself may be archived in the same manner as described in conjunction with FIG. 3 .
- FIG. 15 illustrates an archive catalog 224 for archive data.
- the Source Data designates particular messages instead of files.
- the Server Name and information about the media, data, and time are also provided.
- FIG. 16 depicts an exemplary system configuration which is used in the remainder of this application as an example to clarify the explanation.
- several servers 230 are represented across the upper portion of the diagram, including an application server, an archive server, a backup server, and a replication server. Two of the servers are connected with an Ethernet link. In the middle portion of the diagram, two switches 231 couple the various servers to various storage systems 232 .
- the replication server is coupled to the Enterprise Storage A to allow replication in that storage system.
- the application server 230 stores data into LDEV 1 , while the archive server archives some of that data into LDEV 2 .
- the replication server asks storage unit A to replicate LDEV 1 to LDEV 3 , and in response that event occurs.
- the backup server backs up data from LDEV 3 to LDEV 4 .
- FIG. 17 illustrates a sample data descriptor table 240 .
- This table illustrates information collected by the data manager 111 (see FIG. 1 ) about the data being handled by the storage system and the servers.
- the data descriptor table includes considerable information about the particular unit of data discovered. It also includes logical information about the data, including, for example, the host name associated with that data, the path name, the “owner” of the data, any restrictions on access or rewriting of the data, the size, time of creation, time of modification, time of last access, and a count of the number of accesses.
- the data descriptor also includes information about the mount point (where the data is located), the type of file system associated with the data, and the maximum size of that file system.
- the data descriptor includes physical information about the data, including the storage system brand name (Lightning 9900), its IP address, its LDEV, etc. The physical information can also include information about the maximum volume size, the level of RAID protection, etc.
- the logical information includes which server has the data, its logical location within that server, and access control information, as well as size, and other parameters about the stored data.
- the file system information describes the type of file system in which the data is stored.
- the physical information describes the storage system and the LDEVs on which a particular file system has been created.
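Taken together, a data descriptor groups logical, file-system, and physical information for one discovered unit of data. The sketch below shows one plausible shape; the field names and sample values (other than the Lightning 9900 brand name mentioned above) are hypothetical.

```python
# One discovered unit of data, keyed by a global ID (GID).
descriptor = {
    "gid": "GID0001",
    "logical": {          # which server has the data, and where
        "host": "ServerA",
        "path": "\\folder1\\fileA",
        "owner": "user1",
        "size_bytes": 4096,
    },
    "file_system": {      # the file system holding the data
        "mount_point": "\\folder1",
        "type": "NTFS",
        "max_size_gb": 100,
    },
    "physical": {         # where the bits actually live
        "storage": "Lightning 9900",
        "ip_address": "10.0.0.5",
        "ldev": "LDEV1",
        "raid_level": "RAID5",
    },
}
```

Each of the three sections maps directly to the three kinds of information the text describes: logical, file system, and physical.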
- FIGS. 18-22 illustrate relationship descriptor tables to help establish the relationships among the data stored in the storage system.
- FIG. 18 is an example of a relationship descriptor table 241 for the archives
- the table includes information about a descriptor identification, its relationship to the original data, the original data descriptor, the archive data descriptor, the archive time and the retention period thus far.
- the relationship descriptor shows how the discovered data are related and assigns a unique ID (RID).
- FIG. 19 provides a relationship descriptor for backup as shown there.
- Table 242 illustrates that the original data at the specified addresses has been backed up as the data specified at the destination address. The backup date, time, speed, and other parameters are also maintained.
- FIG. 20 is a relationship descriptor table 243 for replication. This table, in addition to the other information provided, maintains the relationship between the original and the replicated data based on their global identification.
- FIG. 21 is a relationship descriptor table 244 for an application. As shown by this table, the e-mail server in the Trinity server has data sources specified by the designated global identification numbers.
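The common element across FIGS. 18-21 is that each discovered relationship gets a unique RID tying source GIDs to destination GIDs. A minimal sketch of such a catalog (class and method names are hypothetical):

```python
class RelationshipCatalog:
    """Assigns a unique RID to each discovered relationship between
    source and destination data (identified by GID). An application
    entry has only source data, so its destination list is empty,
    matching the 'not applicable' case described for FIG. 25."""

    def __init__(self):
        self._count = 0
        self.entries = {}

    def add(self, kind, source_gids, destination_gids=None):
        self._count += 1
        rid = f"RID{self._count:04d}"
        self.entries[rid] = {
            "kind": kind,  # "archive", "backup", "replication", or "application"
            "source": list(source_gids),
            "destination": list(destination_gids or []),
        }
        return rid

rels = RelationshipCatalog()
backup_rid = rels.add("backup", ["GID0001"], ["GID0002"])
app_rid = rels.add("application", ["GID0003"])
```

The GUI views discussed later can then answer "where has file A gone?" by scanning these entries for a given source GID.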
- the data manager 111 also creates a number of tables based upon its interactions with the servers. These tables consist of a discovery configuration table 280 shown in FIG. 23 , a discovered data table 420 shown in FIG. 24 , and a discovered relationship table 430 shown in FIG. 25 . These tables are discussed next.
- the discovered configuration table 280 shown in FIG. 23 shows from which applications and data applications the data manager has gathered information. Each entry in the table, consisting of a row, specifies a type of discovered data, a server from which the information is gathered, an application or data application name, and ID and password information to gain access as needed. For example, in the first row of table 280 , an application program has collected information from server E using the application SAMSoft, and this can be accessed using the ID and password shown at the end of the row.
- FIG. 24 illustrates a discovered data table 420 .
- This table provides management information for the discovered data.
- the data is uniquely identified by the combination of storage system, LDEV and a relative path name.
- Files stored in the storage system are stored using a file system.
- the relative path name provides a path name inside the file system instead of the path name seen when the file system is mounted on a folder in the server. For example, assume LDEV 1 is mounted on \folder 1 at a server, and that a file is stored there with the path name \folder 1 \folder 2 \fileA. The relative path name is then \folder 2 \fileA.
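The relative-path computation amounts to stripping the mount point from the server-side path. A sketch, assuming Windows-style separators and a hypothetical helper name:

```python
def relative_path(full_path, mount_point):
    """Strip the server-side mount point from a file's path to get its
    path inside the file system, so the same file can be identified by
    (storage system, LDEV, relative path) regardless of where the LDEV
    is mounted."""
    if not full_path.startswith(mount_point):
        raise ValueError("file is not under this mount point")
    rel = full_path[len(mount_point):]
    return rel if rel.startswith("\\") else "\\" + rel

# LDEV1 mounted at \folder1; the file \folder1\folder2\fileA has the
# relative path \folder2\fileA inside the file system.
rel = relative_path("\\folder1\\folder2\\fileA", "\\folder1")
```

Keying the discovered data table on this mount-independent path is what lets the same file be recognized even if a server remounts the LDEV elsewhere.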
- FIG. 25 illustrates a discovered relationship table 430 .
- This table manages the identifications of discovered relationships.
- the relationship identified by RID 0002 is a backup relationship indicating that the files having GIDs shown in the column “Source” were backed up as data identified by the “Destination” column. While backup, archive, and replication actions are associated with data at two locations, the application itself only has source data. Thus “destination” is not applicable.
- graphical user interface (GUI)
- FIG. 26 illustrates a “data view” GUI 250 .
- the data manager presents a view related to the data itself.
- the GUI has two parts, a data specifying panel on the left hand side and an information panel on the right hand side of the figure.
- the data specification panel shows all of the applications and all of the data in the system that is being used by those applications.
- the specification panel lists e-mail applications and within those applications an e-mail server A. That e-mail server has a number of files, shown in the example as A, B, and C. The user has chosen file A.
- the GUI is illustrating information about that file in the right hand panel shown in FIG. 26 .
- This panel illustrates the relationship information about the data associated with file A.
- the server and file location are shown, as well as all archived, replicated, and backed up copies of that file.
- file A has been archived by server B at the designated location, has been replicated by server C at the designated location, and has been backed up by server D at the designated location.
- By clicking on the “Details” designation, the user causes the system to retrieve “deeper” information about that data, for example its size, the time of the event, or other information provided in the descriptor tables discussed above, and that data will be presented on the GUI.
- FIG. 27 illustrates the GUI for a “storage view” of the data.
- the left hand panel shown in FIG. 27 corresponds to that discussed in FIG. 26 , enabling the user to select a particular file.
- the user selected file A and thus the right hand panel of the storage view 260 is illustrating information about file A.
- That panel shows the LDEV and storage system where the original data is stored, as well as the LDEVs and the storage systems in which all of the data related to the original data are stored, as well as the relationships among those locations. For example, as shown in the upper portion of the right hand panel, the replica, archive, and backup relationships are illustrated.
- FIG. 28 is a third GUI enabling the user to more easily understand the location of various data in the storage system and the path by which that data is being handled.
- FIG. 28 illustrates the “path view” GUI.
- the left hand side of the GUI 270 enables the user to select the particular file, while the right hand side depicts the topology map of the servers, switches, storage systems, and LDEVs for the original data, and for data related to the original data.
- This diagram also illustrates how data is transferred in the topology.
- across the upper portion of the right hand panel in FIG. 28 are a series of “buttons.” By clicking on one of these buttons, the screen will show a path through which data is transferred by the specified relationship.
- FIG. 29 is a flowchart illustrating a preferred embodiment of the data discovery process by the data manager shown in FIG. 1 .
- The process is initiated by a user at the console 113 shown in FIG. 1.
- At step 290, the data manager retrieves an entry from the discovery configuration table shown in FIG. 23. If that entry is not a replication entry, the flow proceeds immediately downward as shown in FIG. 29.
- Replication entries are deferred; the data discovery process retrieves them from the discovery configuration table later, as shown by step 296.
- For each non-replication entry, the data manager checks the type of server and executes one of three procedures 293, 294 or 295, depending upon the type of server, as shown by the loop in FIG. 29. After that entry is processed, the process reverts to step 290 and is repeated as many times as is necessary to retrieve all of the entries from all of the servers. The details of the particular “get data” procedures 293, 294, and 295 are discussed below. Once these procedures are completed, the system proceeds to check the replication entries, as shown by step 296. Assuming there are replication entries, the procedure follows step 298, which is also discussed later below. Once all of the entries have been retrieved, as shown at step 297, the data discovery process ends.
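The two-pass loop of FIG. 29 can be sketched as follows. This is an illustrative sketch only, not the patent's implementation; the entry format and handler names are hypothetical.

```python
# Illustrative sketch of the FIG. 29 discovery loop: non-replication entries
# are dispatched by server type (procedures 293-295); replication entries
# are handled in a second pass (step 298). All names here are hypothetical.

def run_discovery(config_entries, handlers):
    """config_entries: dicts with a "type" key ("application", "backup",
    "archive", or "replication"); handlers: one callable per type."""
    # First pass: dispatch each non-replication entry by server type.
    for entry in config_entries:
        if entry["type"] != "replication":
            handlers[entry["type"]](entry)
    # Second pass: process the replication entries.
    for entry in config_entries:
        if entry["type"] == "replication":
            handlers["replication"](entry)

# Example: record the order in which entries are processed.
log = []
handlers = {t: (lambda e, t=t: log.append(t))
            for t in ("application", "backup", "archive", "replication")}
run_discovery([{"type": "replication"}, {"type": "backup"},
               {"type": "application"}], handlers)
# The replication entry is processed last even though it was listed first.
```

As in the flowchart, replication entries are only processed after every application, backup, and archive entry has been handled, because the replication relationships refer to data discovered in the first pass.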
- FIG. 30 illustrates in more detail the process flow for getting data from an application as shown by block 293 in FIG. 29 .
- The data manager first connects to the SAM server via the network, using the identification and password in the discovery configuration table for the connection 300. It then retrieves a list of applications from the SAM server 301, and for each application a list of data files from that server, as shown by step 302. As shown by step 303, for each data file on that list, the data manager gets the file system name in which the data file is stored in the SAM server. Then, as shown by step 304, for each file system, a storage name and an LDEV on which the file system is created are also retrieved from the SAM server.
- The data manager creates a new entry in the discovered data table and allocates a new global identification (GID) to it if there is not already an entry for that set.
- A data descriptor is created for each such GID.
- The data manager retrieves logical information, file system information, and physical information from the SAM server and fills that information into the data descriptor table.
- At step 308, for each application, a new entry in the discovered relationship table is created and a new RID is provided if there is not already an entry for that application.
- The relationship descriptor for the application and the file information is then created. Once these steps are completed, the process flow returns to the diagram shown in FIG. 29.
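The GID allocation just described can be sketched as follows; the table layout and field names are hypothetical, not taken from the patent.

```python
# Illustrative sketch of GID allocation: each unique (storage, LDEV, path)
# set gets one entry in the discovered data table, with a data descriptor
# holding logical, file system, and physical information. Field names are
# hypothetical.

import itertools

class DiscoveredDataTable:
    def __init__(self):
        self._gids = {}                  # (storage, ldev, path) -> GID
        self.descriptors = {}            # GID -> data descriptor
        self._counter = itertools.count(1)

    def gid_for(self, storage, ldev, path, file_system=None):
        """Return the existing GID for this set, or allocate a new one and
        create the corresponding data descriptor."""
        key = (storage, ldev, path)
        if key not in self._gids:
            gid = next(self._counter)
            self._gids[key] = gid
            self.descriptors[gid] = {
                "logical": path,               # logical information
                "file_system": file_system,    # file system information
                "physical": (storage, ldev),   # physical information
            }
        return self._gids[key]

# The same set always maps to the same GID; a new set gets a new GID.
table = DiscoveredDataTable()
g1 = table.gid_for("StorageA", "LDEV1", "/mail/data.db")
g2 = table.gid_for("StorageA", "LDEV1", "/mail/data.db")
g3 = table.gid_for("StorageA", "LDEV2", "/mail/archive.db")
```

Keying the table on the full (storage, LDEV, path) set is what lets the same file discovered through different servers resolve to a single GID.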
- FIG. 31 illustrates the process of retrieving data from the backup server, illustrated in FIG. 29 as step 294 .
- The data manager first connects to a backup server via the network, using the ID and password information from the discovery configuration table for the connection, as shown in step 320. It also connects to the SAM server in the same manner, as shown in step 321.
- The data manager then retrieves a list of backup profiles from the backup server. As shown by step 323, for each such backup profile, the data manager obtains a list of backup data from the backup server.
- The data manager retrieves from the backup server the file system in which the backup stream is stored.
- A storage name and an LDEV on which the file system is created are retrieved from the SAM server.
- A new entry is created in the discovered data table and a new GID is allocated if there is not already an entry for that set.
- A data descriptor is created.
- Logical information, file system information, and physical information from the SAM server are retrieved and provided to the data descriptor table.
- FIG. 32 illustrates the process following step 328 .
- The data manager obtains a list of the data sources from the backup server at step 329.
- The file system in which the data source is stored is also retrieved from the backup server at step 330.
- The data manager retrieves a storage name and an LDEV on which the file system is created from the SAM server.
- A new entry is created in the discovered data table, and a new GID is allocated if there is not already an entry for that set.
- A data descriptor is created for each GID.
- Logical information, file system information, and physical information are retrieved from the SAM server and filled into the data descriptor table.
- At step 336, for each backup data, a new entry is created in the discovered relationship table and a new RID is allocated if there is not already an entry for that backup data.
- At step 337, for each RID, a relationship descriptor for the backup information is created and filled into the discovered relationship table. That step concludes operations for the get data from backup step shown generally as step 294 in FIG. 29.
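The RID allocation of steps 336-337 can be sketched as follows; the structures and field names are hypothetical, not taken from the patent.

```python
# Illustrative sketch of steps 336-337: each backup data gets a relationship
# ID (RID) and a relationship descriptor linking the GIDs of the source
# data to the GID of the backup stream. All structures are hypothetical.

import itertools

class DiscoveredRelationshipTable:
    def __init__(self):
        self._rids = {}               # backup data name -> RID
        self.descriptors = {}         # RID -> relationship descriptor
        self._counter = itertools.count(1)

    def add_backup(self, backup_name, source_gids, destination_gid, when):
        """Allocate a RID for this backup data if there is not already one,
        and record its relationship descriptor."""
        if backup_name not in self._rids:
            self._rids[backup_name] = next(self._counter)
        rid = self._rids[backup_name]
        self.descriptors[rid] = {
            "type": "backup",
            "sources": list(source_gids),    # GIDs of the original data
            "destination": destination_gid,  # GID of the backup stream
            "time": when,
        }
        return rid

# Re-registering the same backup data reuses its RID.
rels = DiscoveredRelationshipTable()
r1 = rels.add_backup("nightly-mail", [1, 2], 7, "2004-07-13T02:00")
r2 = rels.add_backup("nightly-mail", [1, 2], 7, "2004-07-14T02:00")
```

The descriptor records both ends of the relationship, which is what later allows the data, storage, and path views to walk from a source GID to every place the data has been copied.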
- FIG. 33 illustrates the details behind the step of getting data from the archive, represented by step 295 in FIG. 29 . As described above, these operations are similar to the other get data operations discussed in the previous few figures.
- The process begins with step 340, in which the data manager connects to the archive server using ID and password information. It also connects to the SAM server with ID and password information, as shown by step 341.
- It then obtains a list of archive profiles, and at step 344, for each archive profile, it obtains a list of archive data from the archive server.
- At step 345, for each archive data, it retrieves from the archive server the file system in which the archive stream is stored.
- A new entry is created in the discovered data table and a new GID is allocated if there is not already one for that set.
- A data descriptor is created, and finally, at step 349, for each such data descriptor, logical information, file system information, and physical information from the SAM server are filled into the data descriptor table. The process then continues with FIG. 34.
- At step 350, for each archived data, a list of data sources is retrieved from the archive server. Then, for each unique data source, the file system for that data source is retrieved from the archive server, as shown by step 351. Then, for each unique file system, the storage name and LDEV on which the file system is created are retrieved from the SAM server. Next, at step 353, for each unique set of a storage name, an LDEV, and a data source relative path name, a new entry is created in the discovered data table and a new GID is allocated if there is not already one for that set.
- A new data descriptor is created for each GID, and for each such data descriptor, logical information, file system information, and physical information are retrieved from the SAM server and filled into the data descriptor table, as shown by step 355. Then, for each archived data, a new entry is created in the discovered relationship table and a new RID is allocated if there is not already one for that data. Finally, a relationship descriptor is created for that RID and filled into the discovered relationship table.
- The process for getting data from the replica servers is similar to those described above. It is illustrated in FIG. 35.
- The process follows a flow of connecting to the replication server with an ID and password 360, connecting to the SAM server 361, and obtaining a list of replication profiles from the replication server 362.
- Selected information is retrieved at step 363, and for each such replication set, the data stored in those volumes is located at step 364.
- A new entry is created in the discovered relationship table, and for each such new RID, a relationship descriptor is created and the information filled into the table at step 366.
- The techniques for showing the data view, storage view, and path view are described in conjunction with FIGS. 36-38. FIG. 36 illustrates the steps for showing the data view in the GUI.
- The data manager receives a server name, an application name, and a data file from the GUI, as shown by step 370. As discussed above, this selection will typically be made by the user choosing an appropriate entry in the left hand panel of the GUI. Then, as shown by step 371, the GID for the specified data is retrieved from the discovered data table, and at step 372, a list is retrieved of all RIDs that contain the GID from the discovered relationship table. If there are none, then the found GIDs may be displayed, as shown by step 376. If there are RIDs, then for each such RID, the GIDs and the destination are also retrieved from the discovered relationship table, as shown by step 374. Once this is completed, the display is produced as shown by step 376.
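The data view lookup just described can be sketched as follows; the table shape is hypothetical, not taken from the patent.

```python
# Illustrative sketch of the FIG. 36 lookup: starting from the GID of the
# selected file, find every relationship (RID) that contains it, then
# collect the GIDs and destinations those relationships reference. The
# table shape is hypothetical.

def find_related(start_gid, relationships):
    """relationships: {rid: {"sources": [gid, ...], "destination": gid}}.
    Returns (rids containing start_gid, all GIDs to display)."""
    found_rids = []
    found_gids = {start_gid}
    for rid, desc in relationships.items():
        if start_gid in desc["sources"] or start_gid == desc["destination"]:
            found_rids.append(rid)
            found_gids.update(desc["sources"])
            found_gids.add(desc["destination"])
    return found_rids, found_gids

# File with GID 1 has been backed up (GID 7) and archived (GID 9);
# GIDs 5 and 6 belong to an unrelated relationship.
rels = {10: {"sources": [1], "destination": 7},
        11: {"sources": [1], "destination": 9},
        12: {"sources": [5], "destination": 6}}
rids, gids = find_related(1, rels)
```

The returned RIDs and GIDs are exactly what the GUI needs to draw the selected file together with its replica, archive, and backup counterparts.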
- FIG. 37 illustrates the steps for showing a storage view in the GUI.
- The user selects various information, as shown in step 380, and the GID for the specified data is retrieved from the discovered data table.
- The flow of operations through steps 382, 383, 384, and 385 matches that of FIG. 36.
- The data manager finds the storage systems and LDEVs in which the data specified by the GID is stored, and shows the storage as a storage icon on the screen and the LDEVs as LDEV icons on the screen.
- The LDEV icons are interconnected by relationship indicators for each found RID.
- FIG. 38 is a flow chart illustrating the manner in which the path view GUI is created. Steps 390 - 395 are the same as those described above for the data and storage views.
- At step 396, for all of the found GIDs and RIDs, the data manager finds the servers, switches, storage systems, and LDEVs that are related to the data or data applications specified by those found GIDs and RIDs.
- At step 397, the physical topology map for all the found hardware components is displayed, and relationship buttons are added at step 398.
- At step 399, if a button is pushed, the system shows the data path by which the designated data is transferred; this information is provided by the SAM server.
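Steps 396-399 can be sketched as follows. The per-RID paths stand in for information the SAM server would supply; the functions and structures are hypothetical, not from the patent.

```python
# Illustrative sketch of the path view: gather the hardware components
# involved in the found relationships, then, when a relationship "button"
# is pushed, show only the path for that RID. Everything here is
# hypothetical.

def topology_components(found_rids, rid_paths):
    """rid_paths: {rid: [server, switch, storage, ldev, ...]}.
    Returns every component appearing in any found relationship."""
    components = set()
    for rid in found_rids:
        components.update(rid_paths.get(rid, []))
    return components

def path_for_button(rid, rid_paths):
    """Step 399: the data path for the pushed relationship button."""
    return rid_paths.get(rid, [])

rid_paths = {10: ["AppServer", "Switch1", "StorageA", "LDEV1"],
             11: ["ArchiveServer", "Switch2", "StorageA", "LDEV2"]}
all_components = topology_components([10, 11], rid_paths)
backup_path = path_for_button(10, rid_paths)
```

The full component set drives the topology map of step 397, while the per-RID path is what gets highlighted when a single relationship button is pushed.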
- FIG. 39 is a flow chart illustrating another feature provided by the system of this invention.
- FIG. 39 provides a technique for detecting a misconfiguration of a data backup by comparing the size of the backup data with the size of the original data.
- The process shown in FIG. 39 may be invoked by the user through the storage console 113 shown in FIG. 1.
- The system receives a server name, an application, and a data file from the GUI, as shown by step 400.
- The GID for the specified data is retrieved from the discovered data table, and the list of RIDs that contain that GID is retrieved from the discovered relationship table. This process is repeated until all RIDs and GIDs are retrieved, as shown by steps 403-405.
- The size of the data files for that application is then computed at step 407.
- If the computed size matches the size of the backup data, a successfully completed message is displayed at step 409, while if the amounts do not match, an error is displayed at step 410.
- The user can then either re-perform the backup or investigate the error and resolve it in some other manner.
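The size comparison of FIG. 39 can be sketched as follows; the function name and messages are illustrative, not from the patent.

```python
# Illustrative sketch of the FIG. 39 check: total the sizes of the
# application's original data files and compare against the size of the
# backup data; a mismatch suggests a misconfigured (incomplete) backup,
# for example one that missed archived data.

def check_backup_size(original_file_sizes, backup_size):
    """original_file_sizes: sizes in bytes of the original data files."""
    total = sum(original_file_sizes)
    if total == backup_size:
        return "backup successfully completed"
    return ("backup error: original data is %d bytes "
            "but backup holds %d bytes" % (total, backup_size))

ok = check_backup_size([4096, 8192], 12288)
bad = check_backup_size([4096, 8192], 8192)  # e.g. archived data was missed
```

A size mismatch does not say *which* files are missing, only that the backup cannot be a complete copy of the original data, which is the misconfiguration signal the flowchart is after.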
- The technology described has numerous applications. These applications are not restricted to backup, archive, replication, etc.
- The invention can be applied to other applications or custom applications in which data is to be analyzed and relationships determined.
- The invention is also not limited to files or data in the local file system or local server. Instead, the invention can be applied to volumes in storage systems, objects in object based storage devices, or files in network attached storage systems. It can be applied to volumes, and to storage systems which replicate volumes by themselves.
- The data manager in such an application can determine from the storage system or the replication server how the volumes are replicated, create a data descriptor for each volume without path information, and also create a relationship descriptor by using the replication relationship.
- In network attached storage systems, the data is uniquely identified by an IP address, an exported file system, and a relative path name.
- The data manager may calculate a hash value for each data item. The data manager can then retrieve the logical location and physical location of such data from a SAM server. If data with duplicate hash values are stored in different locations, the data manager can create a relationship descriptor for those data indicating that they are identical. This enables the user to see how many replications of data are present on the storage system and to determine which data can be deleted.
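The duplicate detection described above can be sketched as follows; the identifier tuple follows the description just given, but the function and structures are hypothetical.

```python
# Illustrative sketch of hash-based duplicate detection: hash each data
# item, identified here by (IP address, exported file system, relative
# path), and group locations whose contents are identical. All names are
# hypothetical.

import hashlib
from collections import defaultdict

def find_identical_data(items):
    """items: {(ip, exported_fs, relative_path): content_bytes}.
    Returns groups of locations holding identical data."""
    by_digest = defaultdict(list)
    for location, content in items.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(location)
    # Only groups with more than one location are candidate duplicates.
    return [group for group in by_digest.values() if len(group) > 1]

items = {
    ("10.0.0.1", "/export/home", "a/report.doc"): b"quarterly numbers",
    ("10.0.0.2", "/export/backup", "a/report.doc"): b"quarterly numbers",
    ("10.0.0.1", "/export/home", "b/notes.txt"): b"unrelated",
}
duplicate_groups = find_identical_data(items)
```

Each returned group is a candidate for an "identical data" relationship descriptor, from which the user can decide which copies to delete.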
- The data manager can also detect at what location in the hierarchy a performance bottleneck exists. In such a case, the data manager retrieves performance information for each relationship and determines whether those numbers are restricted by physical resources or by disturbances caused by other data processing or application software.
- The data manager also provides users a way to search for data and relationships among data by specifying some portion of the data. If the data manager receives such a request, it can find the data descriptors and relationship descriptors that include the specified information and present them, for example, on a graphical user interface.
Abstract
A method of collecting information about data and data handling processes from different types of applications in the context of a storage system is described. The retrieved information is presented to the user to illustrate the relationships among the data, for example, in the form of a data view illustrating the relationship among files, a storage view, illustrating the physical location at which the stored data is located, or a path view illustrating a particular path through the topology of the overall computing system and storage system. Also described are techniques for assuring the accuracy of backed up files.
Description
- The present application is a Continuation Application of U.S. application Ser. No. 10/890,652, filed Jul. 13, 2004, which is incorporated by reference herein in its entirety for all purposes.
- This invention relates to systems for storing data, and in particular to storage systems in which data is distributed among large numbers of hard disk drives or other storage media.
- In a typical data storage network, data from many different applications is stored and retrieved, and it is difficult to track the relationships among all of the data stored. For example, in an e-mail system, an e-mail server generates original data and provides it to a storage system. An archive server may archive some parts of the data to different parts of the storage system or to different storage systems. At the same time a replication server may replicate the original data to different storage, and the data may be backed up by a backup server to yet further storage. While each of these data handling processes operate on the data associated with that process in an appropriate manner, the archive server, the replication server and the backup server each operate independently. Each has its own catalog or other mechanism for managing how the data is stored and retrieved. Because of the distributed nature of the system and the lack of consolidated catalogs, a user of a storage system typically cannot understand where data is situated in that storage system on a reliable basis.
- Furthermore, the complexity of storage systems increases the probability of mistakes. In the example just described, some parts of the original data are not stored in the original storage, but instead have been stored in the archive storage. As a result, a replication of the original data will not contain the archive data. Thus the backup data will also not contain the archive data. Therefore, when a user restores data from the backup, because the backup data is not a complete backup of the original data, not all of the original data will be restored. All of this complexity makes managing the data in a coherent manner difficult and error-prone.
- There are a few tools that help manage data in storage systems. These tools, however, do not address the issues mentioned above. One commercially available tool for use in management of a data storage system is provided by Veritas™ and referred to as SANPoint Control. This system enables keeping track of the hardware devices and their relationships in a storage area network. Another commercially available tool is provided by AppIQ and known as the Storage Authority Suite. This system provides information about the hardware in the storage system, including hosts, bus adapters, switches, disk subsystems, etc. It also provides capabilities for management of particular applications running on the storage system, for example, Oracle databases, file servers, etc.
- Another commercially available tool for use in storage systems is the Aptare StorageConsole. This application software provides increased reliability for backup and restore operations in a storage system. The Storage Resource Broker from Nirvana is software that enables users of systems to share and manage files stored in various locations. It provides various searching and presentation functions to enable users to find particular files or information stored in various portions of large data storage units.
- Therefore, a system is needed which enables a user of the system to have a complete view of the data handling processes and the relationships among processes for management of the data to reduce the chance of error and improve the efficiency with which the data is managed.
- A system according to this invention provides a method for collecting information about data and data handling processes from different types of data applications. This invention enables a user of the system to appreciate relationships among the data. It shows the data in a system view and can illustrate the relationships among the data stored in the system with a graphical user interface. Preferably, in a storage system having arrays of storage devices for storing information, a data manager according to this invention collects information about the relationships among data and files stored therein and presents them to a user.
- In a preferred embodiment, the graphical user interface provides the user with the option of choosing from among three different views of data handling processes. These include a data view which illustrates how data are related to each other, for example, by showing where a particular file has been archived, replicated, or backed up. Preferably the system also provides a storage view which illustrates how the data volumes are related, for example, indicating which volumes in the storage system have the original data, the archived data, replica data, and backed up data.
- A third view for information in the storage system is referred to as the path view. The path view illustrates how data is transferred through the system by various data handling processes, for example indicating which ports, switches, and storage handle particular files or other data. Furthermore, a system according to this invention provides a way to detect erroneous configurations of backup data by comparison of the amount of backup data with the amount of original data.
- In one embodiment, a storage system having a replication server, a backup server, and an archive server further includes a data manager which tracks the stored data in at least two of three approaches. In one approach the stored data is tracked by presenting file name relationships among the replicated, backup, or archived copies of the stored data. In the second approach, the physical locations within the storage system, for example, in terms of volumes, are presented. In the third approach, path information depicting the processes by which the data arrived at its storage location are provided for the replicated, backup, or archived copies of the stored data.
- FIG. 1 is a block diagram illustrating a system configuration for a typical storage area network including a data manager according to this invention;
- FIG. 2 illustrates an archive catalog for an archive profile;
- FIG. 3 illustrates an archive catalog for media information;
- FIG. 4 illustrates an archive catalog for archived data;
- FIG. 5 illustrates a backup catalog for a backup profile;
- FIG. 6 illustrates a backup catalog for media information;
- FIG. 7 illustrates a backup catalog for backup data;
- FIG. 8 illustrates a replication catalog;
- FIG. 9 illustrates a device catalog for a volume;
- FIG. 10 illustrates a device catalog for storage;
- FIG. 11 illustrates a device catalog for a file system;
- FIG. 12 illustrates a device catalog for a path;
- FIG. 13 illustrates a device catalog for an application;
- FIG. 14 illustrates an archive catalog for an archive profile;
- FIG. 15 illustrates an archive catalog for archived data;
- FIG. 16 is a block diagram of one example of interconnections in a storage system;
- FIG. 17 illustrates a data descriptor;
- FIG. 18 illustrates a relationship descriptor for archived data;
- FIG. 19 illustrates a relationship descriptor for backup data;
- FIG. 20 illustrates a relationship descriptor for replication data;
- FIG. 21 illustrates a relationship descriptor for application data;
- FIG. 22 illustrates another relationship descriptor for archived data;
- FIG. 23 illustrates a discovered configuration table;
- FIG. 24 is an example of a discovered data table;
- FIG. 25 is an example of a discovered relationship table;
- FIG. 26 is an example of a GUI for a view of the data;
- FIG. 27 is an illustration of a GUI for a view of the storage system;
- FIG. 28 is an example of a GUI for a view of the path information;
- FIG. 29 illustrates a process for data discovery;
- FIG. 30 illustrates details of the Get Data From App process shown in FIG. 29;
- FIG. 31 illustrates details of the Get Data From Backup process shown in FIG. 29;
- FIG. 32 illustrates further details of the Get Data From Backup process shown in FIG. 29;
- FIG. 33 illustrates details of the Get Data From Archive process shown in FIG. 29;
- FIG. 34 illustrates further details of the Get Data From Archive process shown in FIG. 29;
- FIG. 35 illustrates details of the Get Data from Replica process shown in FIG. 29;
- FIG. 36 is a flow chart illustrating the steps for depicting the data view;
- FIG. 37 is a flow chart illustrating the steps for depicting the storage view;
- FIG. 38 is a flow chart illustrating the steps for depicting the path view; and
- FIG. 39 is a flow chart illustrating the steps for checking backup operations.
FIG. 1 is a block diagram illustrating a hypothetical typical storage system as might be found in a complex computing environment. Most of the components of the system shown in FIG. 1 are well known and thus are discussed only briefly herein. The data manager 111, however, is not well known and is explained in detail below. - The system shown in
FIG. 1 includes two application servers 101 and 102, which run computer programs (applications) whose data is stored in the system of FIG. 1. - A
replication server 103 replicates data to different storage systems or volumes within the storage system to provide well known mirroring functionality. The replication server maintains a replication catalog 106 as will be discussed below. Similarly, a backup server 104 provides data backup functionality to enable restoration of data at a later date should there be hardware, software, or facilities failures. A backup catalog 107 maintains a record of the backup operations, as also discussed below. - Many large storage systems also include a hierarchical storage manager or
archive server 105. Server 105 archives little used data from primary storage areas to secondary storage areas to provide improved system performance and to reduce costs by maintaining the data on lower cost media. As with the other servers, archive server 105 maintains an archive catalog 108, also explained further below. Although servers 101-105 have been discussed as though each were a standalone hardware implementation, this is not necessary. The servers may be implemented as separate processes running on a single large computer, or as separate processes running on separate processors within a connected array of computers. - The system shown in
FIG. 1 also includes a storage area manager 109. The storage area manager is preferably a management server that manages the entire network depicted in FIG. 1, including the servers and the storage systems. The storage area manager maintains a device catalog 110, which is also discussed below. In essence, the storage area manager can retrieve information from the switches 114, servers 101 . . . 105, storage systems 115-117, and the applications. Interfaces to the components shown in FIG. 1 are often implemented using a standard protocol such as DMTF's CIM. Another way to implement the storage area manager is to install an agent on the server and have the agent collect information about the server locality and provide it to the storage area manager. - Although there are a variety of techniques commonly used to interconnect systems such as depicted in
FIG. 1, switches 114 have become an increasingly popular connection technique. These switches are typically switches based on Fibre Channel, Ethernet, or broadband technology. - The data received by the system or generated by the system as the result of its server operations is stored in storage systems such as 115, 116, and 117. Each such storage system includes a
disk controller and a number of disk drives. FIG. 1 illustrates only two disk drives per storage system. In conventional implementations, however, hundreds of disk drives may be employed in the storage system. The disk controllers manage the flow of data between the servers and the disk drives. - For illustration three different types of storage systems are shown in
FIG. 1. Storage system 115 is an enterprise Fibre Channel storage system. Such systems typically support SCSI as a data protocol between the servers and the storage systems. The Nearline FC storage system 116 operates in a similar manner, however, using ATA format hard disk drives. Finally, the Network Attached Storage system 117 supports NFS and CIFS as file protocols. Thus, as depicted in FIG. 1, the system of this invention can be applicable to any type of storage system. - The components and systems shown in
FIG. 1 are interconnected using two techniques. A network 100 is provided, for example based on TCP/IP/Ethernet, to provide “out of band” communications. The main data handling for the storage systems, however, is provided by switches 114, which allow interconnections of desired components as necessitated by the particular operations to be performed. - The system of this invention adds an
additional component 111, referred to herein as a data manager, to the overall system of FIG. 1. This data manager communicates with the other components via the local area network 100 and the switches 114. The data manager functions to collect data handling process information from the applications and the data applications and present the results to a user. The results are typically presented through a graphical user interface running on a console 113. The data manager maintains a data catalog. The data catalog enables the data manager to present to the user various “views” of the storage system. For example, the data manager 111 and data catalog together enable a user to view information about the physical locations where various files are stored, the path by which the information was stored, and other relationships among the data stored in the storage systems. The data manager 111 creates and manages data descriptors, relationship descriptors, a discovered data table (discussed below) and a discovered relationship table (also discussed below). These tables are typically stored in local storage or network storage attached to the data manager. The data manager also uses a discovery configuration table as discussed below. The data manager itself may be configured by the console 113. The data manager relies upon catalogs created and stored throughout the system as designated in FIG. 1. These catalogs are discussed next. -
FIG. 2 is a diagram illustrating an archive catalog for the archive profile. This catalog is included within the catalog 108 shown in FIG. 1. The catalog 200 shown in FIG. 2 describes which data is to be archived, at what time, and to which storage. In the example shown in FIG. 2, the data is to be archived if it is not accessed within 30 days. The data to be archived is set forth as the Folder, and the media to which it is to be archived is listed under Archive Media. -
FIG. 3 illustrates an archive catalog for media information. This catalog is also included within catalog 108 shown in FIG. 1. The example in FIG. 3 illustrates that the Archive Media is actually an Archive Folder having a specified address associated with the specific server. FIG. 3 also indicates that the Folder has a maximum capacity as shown. -
FIG. 4 is a diagram illustrating an archive catalog for archive data. This catalog is included within catalog 108 shown in FIG. 1. In the example of FIG. 4, the indicated Source Data is shown as being archived at the designated media location as an Archive Stream at the Archive Time shown in FIG. 4. -
FIGS. 5-7 illustrate backup catalogs stored as catalog 107 in FIG. 1. In FIG. 5, an exemplary backup catalog for a backup profile is illustrated. This catalog describes how and when data is to be backed up. In the example depicted, files under the folder designated by Source are to be backed up to the Backup Media at the Backup Time stated. The Backup Type indicates that all files are to be backed up, while the Next Backup Time indicates the time and date of the next backup operation. -
FIG. 6 is a diagram illustrating a backup catalog for media information. In a similar manner to FIG. 3, it illustrates the physical location of the particular media designated, as well as its capacity. -
FIG. 7 illustrates a backup catalog for backup data. This catalog describes when and where data is backed up. In the example shown, two files as designated by Data Source have been backed up to the Backup Media at the time shown. -
FIG. 8 is a diagram illustrating a replication relationship between two devices in the storage system, and is referred to as a replication catalog. This diagram provides additional information with regard to the replication catalog 106 in FIG. 1. The replication catalog describes the relationship between two data storage locations, commonly known as LDEVs, in the storage system. As shown by FIG. 8, the data in the Primary Storage is replicated to the Secondary Storage location. The Mode indicates whether the replication is to be synchronous or asynchronous. -
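A replication catalog entry of the kind FIG. 8 describes might be represented as the following record; the field names are hypothetical, not taken from the patent.

```python
# Hypothetical record shape for a replication catalog entry: a pairing of
# primary and secondary storage locations (LDEVs) plus the copy mode.

from dataclasses import dataclass

@dataclass
class ReplicationCatalogEntry:
    primary_storage: str    # storage system holding the original LDEV
    primary_ldev: str
    secondary_storage: str  # storage system holding the replica LDEV
    secondary_ldev: str
    mode: str               # "synchronous" or "asynchronous"

# Mirrors the FIG. 16 example, where LDEV1 is replicated to LDEV3
# within storage system A.
entry = ReplicationCatalogEntry("StorageA", "LDEV1",
                                "StorageA", "LDEV3", "synchronous")
```

Because the catalog pairs LDEVs rather than files, the data manager must later join it against the discovered data table to learn which files live on the replicated volumes.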
FIG. 9 is a diagram illustrating a device catalog for a volume, with FIGS. 10-13 illustrating other device catalogs, all incorporated within catalog 110 in FIG. 1. The volume catalog 207 shown in FIG. 9 includes the volume identification, name, address, port, logical unit number, etc. -
FIG. 10 illustrates a device catalog 208 for storage. This catalog provides information about a storage system. As shown, the catalog includes an identification, name, address, capacity, information about ports coupled to the storage, etc. -
FIG. 11 illustrates a catalog 220 for a file system. As shown there, the catalog includes information about identification, physical volume location, file system type, free space, etc. Similarly, FIG. 12 illustrates a device catalog for a path 221. This catalog includes identification information and worldwide name identification. -
FIG. 13 is a device catalog 222 for an application. As shown by FIG. 13, the catalog includes identification, application type, host name, and associated data files. -
FIGS. 14 and 15 illustrate an archive catalog for message based archiving. (FIGS. 2-4 illustrated archive catalogs for file-based archiving.) In message based archiving, the archiving is performed at an application level. For example, an e-mail server may store messages into data files and an archive server then communicates with the e-mail server to archive the messages themselves, instead of the data files. In these circumstances, the archive profile also indicates the name of a server and the name of an application. -
FIG. 14 illustrates an archive catalog 223 for an archive profile for the case just described. As shown, the application is indicated with A, as well as the media name MN, and the media and timing information. The media information itself may be archived in the same manner as described in conjunction with FIG. 3. -
FIG. 15 illustrates an archive catalog 224 for archive data. As mentioned above, the Source Data designates particular messages instead of files. The Server Name and information about the media, data, and time are also provided. -
FIG. 16 depicts an exemplary system configuration which is used in the remainder of this application as an example to clarify the explanation. As shown in FIG. 16, several servers 230 are represented across the upper portion of the diagram, including an application server, an archive server, a backup server, and a replication server. Two of the servers are connected with an Ethernet link. In the middle portion of the diagram, two switches 231 couple the various servers to various storage systems 232. The replication server is coupled to the Enterprise Storage A to allow replication in that storage system. The application server 230 stores data into LDEV1, while the archive server archives some of that data into LDEV2. The replication server asks storage unit A to replicate LDEV1 to LDEV3, and in response that event occurs. The backup server backs up data from LDEV3 to LDEV4. - In a conventional system without the data manager described in conjunction with
FIG. 1, the various catalogs described above are all separated and the user is not able to see the total relationships of the data and files being managed by the storage system. The addition of the data manager, however, allows communication among the various servers and the data manager, for example using scripts or other well-known interfaces. By communication between the data manager and the various servers these relationships may be discovered and presented to the user as discussed next. -
FIG. 17 illustrates a sample data descriptor table 240. This table illustrates information collected by the data manager 111 (see FIG. 1) about the data being handled by the storage system and the servers. As shown in FIG. 17, the data descriptor table includes considerable information for the particular unit of data discovered. It includes logical information about the data, for example, the host name associated with that data, the path name, the “owner” of the data, any restrictions on access or rewriting of the data, the size, time of creation, time of modification, time of last access, and a count of the number of accesses. The data descriptor also includes information about the mount point (where the data is located), the type of file system associated with the data, and the maximum size of that file system. Finally, the data descriptor includes physical information about the data, including the storage system brand name (Lightning 9900), its IP address, its LDEV, etc. The physical information can also include information about the maximum volume size, the level of RAID protection, etc. - Generally speaking, the logical information includes which server has the data, its logical location within that server, and access control information, as well as size and other parameters about the stored data. Also generally speaking, the file system information describes the type of file system in which the data is stored. The physical information describes the storage system and the LDEVs on which a particular file system has been created.
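The three groupings just described, logical, file system, and physical information, might be modeled as a simple record. The sketch below is illustrative only; the field names are assumptions, not taken verbatim from the table of FIG. 17:

```python
from dataclasses import dataclass

@dataclass
class DataDescriptor:
    # logical information: which server has the data and where it sits logically
    host_name: str
    path_name: str
    owner: str
    size: int
    # file system information: the file system in which the data is stored
    mount_point: str
    fs_type: str
    fs_max_size: int
    # physical information: the storage system and LDEV holding the file system
    storage_name: str
    ip_address: str
    ldev: str

d = DataDescriptor(host_name="ServerE", path_name="\\folder1\\fileA",
                   owner="admin", size=4096,
                   mount_point="\\folder1", fs_type="NTFS", fs_max_size=2**30,
                   storage_name="Lightning 9900", ip_address="192.168.0.10",
                   ldev="LDEV1")
```

A real data manager would also carry the timestamps, access counts, and RAID level noted above.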
-
FIGS. 18-22 illustrate relationship descriptor tables to help establish the relationships among the data stored in the storage system. FIG. 18 is an example of a relationship descriptor table 241 for the archives. The table includes information about a descriptor identification, its relationship to the original data, the original data descriptor, the archive data descriptor, the archive time, and the retention period thus far. The relationship descriptor shows how the discovered data are related and assigns a unique ID (RID). -
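Across FIGS. 18-22, each relationship descriptor ties one or more source data to related data under a unique RID. A minimal in-memory sketch follows; the record layout (type string, source GID list, single destination, with no destination for application entries) is an assumption made for illustration:

```python
# each entry: RID, relationship type, source GIDs, destination (None for applications)
relationships = [
    ("RID0001", "application", ["GID0001", "GID0002"], None),
    ("RID0002", "backup",      ["GID0001"],            "GID0007"),
]

def rids_containing(gid):
    """All relationships whose source list or destination mentions the GID."""
    return [rid for rid, _type, sources, destination in relationships
            if gid in sources or gid == destination]
```

With the sample entries above, `rids_containing("GID0001")` returns both RIDs, since that GID is a source of the application entry and of the backup entry.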
FIG. 19 provides a relationship descriptor for backup, as shown there. Table 242 illustrates that the original data at the specified addresses has been backed up as data specified at that address. The backup date, time, speed, and other parameters are also maintained. -
FIG. 20 is a relationship descriptor table 243 for replication. This table, in addition to the other information provided, maintains the relationship between the original and the replicated data based on their global identification. -
FIG. 21 is a relationship descriptor table 244 for an application. As shown by this table, the e-mail server in the Trinity server has data sources specified by the designated global identification numbers. - As shown by table 245 in
FIG. 22, there is a relationship descriptor for the archive in a message-based system. Because it would be resource-consuming to create a data descriptor and a relationship descriptor for each message, only the relationship between the original data and the archived data is identified in the case of message-based archiving. Of course, if desired, a data descriptor could be created. - The
data manager 111 also creates a number of tables based upon its interactions with the servers. These tables consist of a discovery configuration table 280 shown in FIG. 23, a discovered data table 420 shown in FIG. 24, and a discovered relationship table 430 shown in FIG. 25. These tables are discussed next. - The discovery configuration table 280 shown in
FIG. 23 shows from which applications and data applications the data manager has gathered information. Each entry in the table, consisting of a row, specifies a type of discovered data, a server from which the information is gathered, an application or data application name, and ID and password information to gain access as needed. For example, in the first row of table 280, an application program has collected information from server E using the application SAMSoft, and this can be accessed using the ID and password shown at the end of the row. -
FIG. 24 illustrates a discovered data table 420. This table provides management information for the discovered data. As shown by the table, the data is uniquely identified by the combination of storage system, LDEV, and a relative path name. Files stored in the storage system are stored using a file system. The relative path name provides a path name inside the file system instead of a path name when the file system is mounted on a folder in the server. For example, assume LDEV1 is mounted on \folder1 at a server, and that it holds a file whose full path name is \folder1\folder2\fileA. The relative path name is then \folder2\fileA. -
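The uniqueness rule just described, one global identification per unique (storage system, LDEV, relative path name) set, allocated only when no entry yet exists, can be sketched as follows. The class and helper names are assumed for illustration:

```python
class DiscoveredDataTable:
    """Allocates one GID per unique (storage, LDEV, relative path) set."""

    def __init__(self):
        self._gids = {}   # (storage, ldev, rel_path) -> GID
        self._next = 1

    def gid_for(self, storage, ldev, rel_path):
        key = (storage, ldev, rel_path)
        if key not in self._gids:           # allocate only if no entry yet
            self._gids[key] = "GID%04d" % self._next
            self._next += 1
        return self._gids[key]

def relative_path(mount_point, full_path):
    """A path inside the file system, independent of where it is mounted."""
    if not full_path.startswith(mount_point):
        raise ValueError("path not under mount point")
    return full_path[len(mount_point):]

table = DiscoveredDataTable()
rel = relative_path("\\folder1", "\\folder1\\folder2\\fileA")
g1 = table.gid_for("StorageA", "LDEV1", rel)
g2 = table.gid_for("StorageA", "LDEV1", "\\folder2\\fileA")
# the same set never receives a second GID, so g1 and g2 are equal
```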
FIG. 25 illustrates a discovered relationship table 430. This table manages the identifications of discovered relationships. In the example depicted, the relationship identified by RID 0002 is a backup relationship indicating that the files having GIDs shown in the column “Source” were backed up as data identified by the “Destination” column. While backup, archive, and replication actions are associated with data at two locations, the application itself only has source data. Thus “destination” is not applicable. - Using all of the tables discussed above and the various relationships created, in a manner which will be discussed in detail below, the system is capable of providing a comprehensive view of the relationships among the data stored in the affiliated storage systems. Exemplary graphical user interfaces for presenting these relationships to the user of the storage system are shown in
FIGS. 26, 27, and 28. As should be understood, other graphical user interfaces (GUIs) can also be created for presentation to the user to enable a better understanding of the data in the storage system. These interfaces will typically be of most benefit to an administrator of the data management system. Typically these interfaces will be presented on the console 113 shown in FIG. 1. Typical GUIs are discussed next. -
FIG. 26 illustrates a “data view” GUI 250. In this exemplary GUI, the data manager presents a view related to the data itself. In the embodiment depicted, the GUI has two parts, a data specifying panel on the left hand side and an information panel on the right hand side of the figure. The data specification panel shows all of the applications and all of the data in the system that is being used by those applications. For example, in FIG. 26, the specification panel lists e-mail applications and within those applications an e-mail server A. That e-mail server has a number of files, shown in the example as A, B, and C. The user has chosen file A. In response, the GUI is illustrating information about that file in the right hand panel shown in FIG. 26. This panel illustrates the relationship information about the data associated with file A. As shown at the top of the panel, the server and file location are shown, as well as all archived, replicated, and backed up copies of that file. As illustrated, file A has been archived by server B at the designated location, has been replicated by server C at the designated location, and has been backed up by server D at the designated location. By clicking on the “Details” designation, the user causes the system to retrieve “deeper” information about that data, for example its size, the time of the event, or other information provided in the descriptor tables discussed above, and that data will be presented on the GUI. -
FIG. 27 illustrates the GUI for a “storage view” of the data. The left hand panel shown in FIG. 27 corresponds to that discussed in FIG. 26, enabling the user to select a particular file. In the same manner as described there, the user selected file A, and thus the right hand panel of the storage view 260 is illustrating information about file A. That panel shows the LDEV and storage system where the original data is stored, the LDEVs and the storage systems in which all of the data related to the original data are stored, and the relationships among those locations. For example, as shown in the upper portion of the right hand panel, the replica, archive, and backup relationships are illustrated. -
FIG. 28 is a third GUI enabling the user to more easily understand the location of various data in the storage system and the path by which that data is being handled. FIG. 28 illustrates the “path view” GUI. As with FIGS. 26 and 27 above, the left hand side of the GUI 270 enables the user to select the particular file, while the right hand side depicts the topology map of the servers, switches, storage systems, and LDEVs for the original data, and for data related to the original data. This diagram also illustrates how data is transferred in the topology. To simplify the diagram, across the upper portion of the right hand panel in FIG. 28 are a series of “buttons.” By clicking on one of these buttons, the screen will show a path through which data is transferred by the specified relationship. - The preceding discussion has described the various tables created and used by the
data manager 111, and the graphical user interface for presentation of that data to a user of the system. The remaining portion of this specification discusses the manner in which the system operates to establish those tables and present the graphical user interfaces. -
FIG. 29 is a flowchart illustrating a preferred embodiment of the data discovery process by the data manager shown in FIG. 1. The process is initiated by a user at the console 113 shown in FIG. 1. At a first step 290 the data manager retrieves an entry from the discovery configuration table shown in FIG. 23, unless that entry is a replication entry. If there is a non-replication entry the flow proceeds immediately downward as shown in FIG. 29. On the other hand, if there is no new entry, then the data discovery process retrieves a replication entry from the discovery configuration table as shown by step 296. Assuming there is a new entry, the data manager checks the type of server and executes one of the three procedures 293, 294, or 295 shown in FIG. 29. After that entry is retrieved the process reverts back to step 290, to be repeated as many times as is necessary to retrieve all of the entries from all of the servers. The details of the particular “get data” procedures are discussed below in conjunction with FIGS. 30-34. Assuming there are replication entries at step 296, the procedure follows step 298, which is also discussed later below. Once all of the entries have been retrieved, as shown at step 297, the data discovery process ends. -
FIG. 30 illustrates in more detail the process flow for getting data from an application as shown by block 293 in FIG. 29. The data manager first connects to the SAM server via the network. It uses an identification and password in the discovery configuration table for the connection 300. It then retrieves a list of applications from the SAM server 301, and for each application a list of data files from that server, as shown by step 302. As shown by step 303, for each data file on that list, the data manager gets a file system name in which the data file is stored in the SAM server. Then, as shown by step 304, for each file system a storage name and an LDEV on which the file system is created are also retrieved from the SAM server. Next, for each unique set (a name of a storage system, an LDEV, a data file relative path name) the data manager creates a new entry in the discovered data table and allocates a new global identification to it if there is not already an entry for that set. As shown by step 306, for each such GID, a data descriptor is created. Then, as shown by step 307, for each data descriptor, the data manager will retrieve logical information, file system information, and physical information from the SAM server and fill that information into the data descriptor table. Then, as shown by step 308, for each application a new entry in the discovered relationship table is created and a new RID is provided if there is not already an entry for that application. Finally, as shown by step 309, for each RID the relationship descriptor for the application and the file information is then created. Once these steps are completed, the process flow returns to the diagram shown in FIG. 29. -
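Steps 300-309 amount to a nested walk from each application down to its LDEVs, allocating GIDs and RIDs along the way. A compressed sketch follows; the SAM-server query methods are hypothetical stand-ins for the retrievals described above, not an actual SAM server interface:

```python
def get_data_from_application(sam, gids, relationships):
    """Sketch of FIG. 30. gids maps (storage, ldev, rel_path) -> GID;
    relationships maps application name -> list of source GIDs."""
    for app in sam.list_applications():                       # step 301
        app_gids = []
        for data_file in sam.list_data_files(app):            # step 302
            fs = sam.file_system_of(data_file)                # step 303
            storage, ldev = sam.location_of(fs)               # step 304
            key = (storage, ldev, data_file)
            if key not in gids:                               # step 305
                gids[key] = "GID%04d" % (len(gids) + 1)
            app_gids.append(gids[key])       # descriptor filling (306-307) omitted
        relationships.setdefault(app, []).extend(app_gids)    # steps 308-309

class FakeSAM:
    """Canned answers standing in for the SAM server queries."""
    def list_applications(self):        return ["mailA"]
    def list_data_files(self, app):     return ["\\folder2\\fileA"]
    def file_system_of(self, path):     return "FS1"
    def location_of(self, fs):          return ("StorageA", "LDEV1")

gids, rels = {}, {}
get_data_from_application(FakeSAM(), gids, rels)
```

The backup and archive procedures of FIGS. 31-34 follow the same shape, with an extra pass over the data sources of each backup or archive stream.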
FIG. 31 illustrates the process of retrieving data from the backup server, illustrated in FIG. 29 as step 294. Once this process is invoked, the operation is similar to that described in FIG. 30. In particular, the data manager first connects to a backup server via the network. It uses the ID and password information from the discovery configuration table for the connection, as shown in step 320. It also connects to the SAM server in the same manner, as shown in step 321. At step 322, the data manager retrieves a list of backup profiles from the backup server. As shown by step 323, for each such backup profile the data manager obtains a list of backup data from the backup server. Then, at step 324, for each backup data, the data manager retrieves from the backup server the file system in which the backup stream is stored. Next, as shown by step 325, for each unique file system a storage name and an LDEV on which the file system is created are retrieved from the SAM server. Then, at step 326, for each unique set (storage name, LDEV, and backup stream relative path name) a new entry is created in the discovered data table and a new GID is allocated if there is not already an entry for that set. Next, at step 327, for each GID a data descriptor is created. Then, as shown at step 328, for each data descriptor logical information, file system information, and physical information from the SAM server are retrieved and provided to the data descriptor table. -
FIG. 32 illustrates the process following step 328. As shown in FIG. 32, for each backup data, the data manager obtains a list of the data sources from the backup server at step 329. Then for each unique data source, the file system in which the data source is stored is also retrieved from the backup server at step 330. At step 331, for each unique file system, the data manager retrieves a storage name and an LDEV on which the file system is created from the SAM server. Then, at step 332, for each unique set of storage name, LDEV, and data source relative path name, a new entry is created in the discovered data table, and a new GID is allocated if there is not already an entry for that set. Then at step 333, a data descriptor is created for each GID. At step 334, for each data descriptor, logical information, file system information, and physical information is retrieved from the SAM server and filled into the data descriptor table. Then at step 336, for each backup data, a new entry is created in the discovered relationship table and a new RID is allocated if there is not already an entry for that backup data. Finally, at step 337, for each RID, a relationship descriptor for the backup information is created and filled into the discovered relationship table. That step concludes operations for the get data from backup step shown generally as step 294 in FIG. 29. -
FIG. 33 illustrates the details behind the step of getting data from the archive, represented by step 295 in FIG. 29. As described above, these operations are similar to the other get data operations discussed in the previous few figures. The process begins with step 340 in which the data manager connects to the archive server using ID and password information. It also connects to the SAM server with the ID and password information, as shown by step 341. At step 343, it obtains a list of archive profiles, and at step 344, for each archive profile it obtains a list of archive data from the archive server. At step 345, for each archive data, it retrieves from the archive server the file system in which the archive stream is stored. Then for each unique set of a storage name, an LDEV, and an archive stream relative path name, a new entry is created in the discovered data table and a new GID is allocated if there is not already one for that set. Next, at step 348, for each GID a data descriptor is created, and finally, at step 349, for each such data descriptor logical information, file system information, and physical information from the SAM server are filled into the data descriptor table. The process then continues with FIG. 34. - As shown by
step 350, for each archived data, a list of data sources is retrieved from the archive server. Then for each unique data source, the file system for that data source is retrieved from the archive server, as shown by step 351. Then, for each unique file system, the storage name and LDEV on which the file system is created are retrieved from the SAM server. Next, at step 353, for each unique set of a storage name, an LDEV, and a data source relative path name, a new entry is created in the discovered data table and a new GID is allocated if there is not already one for that set. Then a new data descriptor is created for each GID, and for each such data descriptor, logical information, file system information, and physical information is retrieved from the SAM server and filled into the data descriptor table, as shown by step 355. Then, for each archived data, a new entry is created in the discovered relationship table and a new RID is allocated if there is not already one for that data. Finally, a relationship descriptor is created for that RID and filled into the discovered relationship table. - The process for getting data from the replica servers is similar to that described above. It is illustrated in
FIG. 35. The process follows a flow of connecting to the replication server with an ID and password 360, connecting to the SAM server 361, and obtaining a list of replication profiles from the replication server 362. Then for each replication profile, selected information is retrieved at step 363, and for each such replication set, the data stored in those volumes is located at step 364. Then for each found data set a new entry is created in the discovered relationship table, and for each such new RID a relationship descriptor is created and the information filled into the table at step 366. This completes the description of the processes initially shown in FIG. 29. Next, the techniques for showing the various data, storage, and path views are described. The steps for showing a data view are illustrated by the flow chart of FIG. 36. To show the data view, the data manager receives a server name, an application name, and a data file from the GUI, as shown by step 370. As discussed above, this selection will typically be made by the user choosing an appropriate entry in the left hand panel of the GUI. Then, as shown by step 371, the GID for the specified data is retrieved from the discovered data table, and at step 372, a list is retrieved of all RIDs that contain the GID from the discovered relationship table. If there are none, then the found GIDs may be displayed, as shown by step 376. If there are RIDs, then for each such RID, the GIDs and the destination are also retrieved from the discovered relationship table, as shown by step 374. Once this is completed, the display is produced as shown by step 376. -
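The data-view lookup of steps 370-376 reduces to one table access followed by a scan of the relationship table. A sketch, with assumed table shapes (a dict keyed by the user's selection, and a dict of RID records):

```python
def show_data_view(discovered_data, discovered_rels, server, app, data_file):
    """Sketch of FIG. 36: resolve the selection to a GID (step 371), then
    gather every relationship containing it (steps 372-374) for display."""
    gid = discovered_data[(server, app, data_file)]            # step 371
    related = [(rid, kind, dest)
               for rid, (kind, sources, dest) in discovered_rels.items()
               if gid in sources]                              # steps 372-374
    return gid, related                                        # step 376: display

data = {("ServerA", "e-mail", "fileA"): "GID0001"}
rels = {"RID0002": ("backup", ["GID0001"], "GID0007")}
gid, related = show_data_view(data, rels, "ServerA", "e-mail", "fileA")
```

The storage and path views of FIGS. 37 and 38 start from the same GID/RID resolution and differ only in how the result is rendered.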
FIG. 37 illustrates the steps for showing a storage view in the GUI. In a manner similar to that described with FIG. 36, the user selects various information as shown in step 380, and the GID for the specified data is retrieved from the discovered data table. The flow of operations through the intermediate steps corresponds to that of FIG. 36. Then, at step 386, for each found GID the data manager finds the storage system and LDEVs in which the data specified by the GID is stored, and shows the storage as a storage icon on the screen and the LDEVs as LDEV icons on the screen. Next, as shown by step 387, the LDEV icons are interconnected by relationship indicators for each found RID. -
FIG. 38 is a flow chart illustrating the manner in which the path view GUI is created. Steps 390-395 are the same as those described above for the data and storage views. At step 396, for all of the found GIDs and RIDs, the data manager finds the related servers, switches, storage systems, and LDEVs that are related to the data or data applications specified by these found GIDs and RIDs. Following this step, the physical topology map for all the found hardware components is displayed at step 397, and relationship buttons are added at step 398. At step 399, if a button is pushed, then the system shows the data path by which the designated data is transferred, which information is provided by the SAM server. -
FIG. 39 is a flow chart illustrating another feature provided by the system of this invention. FIG. 39 provides a technique for detecting a misconfiguration of a data backup by comparing the size of the backup data with the size of the original data. The process shown in FIG. 39 may be invoked by the user through the storage console 113 shown in FIG. 1. Upon invocation, the system receives a server name, an application, and a data file from the GUI, as shown by step 400. Then the GID for the specified data is retrieved from the discovered data table and the list of RIDs that contain that GID is retrieved from the discovered relationship table. This process is repeated until all RIDs and GIDs are retrieved, as shown by steps 403-405. At step 406 a calculation is performed for each GID with a full backup to determine the size of the backup stream. The size of the data files for that application is then computed at step 407. At step 408, if the amounts match, a successfully completed message is displayed at step 409, while if the amounts do not match, an error is displayed at step 410. Upon receipt of the error the user can then either re-perform the backup or investigate the error and resolve it in some other manner. - The technology described has numerous applications. These applications are not restricted to backup, archive, replication, etc. The invention can be applied to other applications or custom applications in which data is to be analyzed and relationships determined. The invention is also not limited to files or data in the local file system or local server. Instead, the invention can be applied to volumes in storage systems and objects in object based storage devices, or files in network attached storage systems. It can be applied to volumes, and to storage systems which replicate volumes by themselves. 
The data manager in such an application can determine from the storage system or the replication server how the volumes are replicated and create a data descriptor for each volume without path information, and also create a relationship descriptor by using the replication relationship. In the case of network attached storage, the data is uniquely identified by an IP address, an exported file system and a relative path name.
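Returning to the size comparison of FIG. 39, the core of steps 406-410 might be sketched as follows. Treating strict equality of the summed file sizes and the full backup stream size as the match criterion is a simplifying assumption of this sketch:

```python
def check_full_backup(original_file_sizes, backup_stream_size):
    """Sketch of FIG. 39, steps 406-410: compare the size of a full backup
    stream against the total size of the application's data files."""
    expected = sum(original_file_sizes)            # step 407
    if backup_stream_size == expected:             # step 408
        return "backup completed successfully"     # step 409
    return "error: backup size mismatch"           # step 410
```

On a mismatch the user would, as described above, either re-perform the backup or investigate the discrepancy.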
- While LDEV has been used herein to establish the uniqueness of data, other approaches may be used. The data manager may calculate a hash value for each data. Then the data manager can retrieve the logical location and physical location of such data from a SAM server. If data with duplicate hash values are found at different locations, then the data manager can create a relationship descriptor for these data which indicates that the data are identical. This enables the user to see how many replications of data are present on the storage system and to determine which data can be deleted.
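The hash-based alternative can be sketched directly. SHA-256 stands in here for whatever hash function the data manager might use, and the tuple-shaped locations are illustrative:

```python
import hashlib

def identical_data_groups(discovered):
    """Group discovered (location, payload) pairs by content hash; any group
    spanning more than one location marks identical copies, some of which
    could be deleted."""
    by_digest = {}
    for location, payload in discovered:
        digest = hashlib.sha256(payload).hexdigest()
        by_digest.setdefault(digest, []).append(location)
    return [locations for locations in by_digest.values() if len(locations) > 1]

groups = identical_data_groups([
    (("StorageA", "LDEV1", "\\fileA"), b"payroll records"),
    (("StorageB", "LDEV3", "\\fileA"), b"payroll records"),   # identical replica
    (("StorageA", "LDEV2", "\\fileB"), b"something else"),
])
```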
- By checking a hierarchy of relationships among data and performance information from the data processing, the data manager can also detect at what location in the hierarchy a performance bottleneck exists. In such a case, the data manager retrieves performance information for each relationship and determines if those numbers are restricted by physical resources or by disturbances caused by other data processing or application software. The data manager also provides users a way to search for data and relationships among data by specifying some portion of the data. If the data manager receives such a request, it can find data descriptors and relationship descriptors that include the specified information and present them, for example on a graphical user interface.
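The search facility just described, returning descriptors that include a specified portion of the data, is at its simplest a substring scan over descriptor fields. The dict-shaped descriptors below are illustrative; a real data manager would index these rather than scan:

```python
def search_descriptors(descriptors, term):
    """Return every descriptor in which some field contains the search term."""
    return [d for d in descriptors
            if any(term in str(value) for value in d.values())]

descriptors = [
    {"gid": "GID0001", "host": "ServerE", "path": "\\folder1\\fileA"},
    {"gid": "GID0002", "host": "ServerF", "path": "\\folder9\\fileB"},
]
hits = search_descriptors(descriptors, "fileA")   # matches the first descriptor only
```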
- Although the invention has been described in detail above with respect to a preferred embodiment, it will be appreciated that variations and alterations may be made in the implementation of the invention without departing from its scope as shown by the appended claims.
Claims (6)
1. In a data management system coupled to a first server which processes data to be stored in a first storage system, a second server which provides a copy of the stored data to be stored in a second storage system, and a third server which provides another copy of the stored data to be stored in a third storage system, a data management method comprising steps of:
collecting from the second server, information about the copied data stored in the second storage system;
collecting from the third server, information about the another copied data stored in the third storage system;
creating relationship information indicative of associations among the stored data in the first storage system, the copied data stored in the second storage system and the another copied data stored in the third storage system; and
presenting the relationship information associated with stored data identified in a user's request.
2. The data management method of claim 1, wherein the relationship information includes location information and/or path information.
3. The data management method of claim 2, wherein the path information includes port and switch information.
4. The data management method of claim 1, wherein the relationship information includes physical location information and/or virtual location information.
5. The data management method of claim 1, wherein the presenting of the relationship information includes displaying the relationship information by a graphical user interface.
6. The data management method of claim 1, wherein the first server includes an application server.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/733,305 US20070198690A1 (en) | 2004-07-13 | 2007-04-10 | Data Management System |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/890,652 US7206790B2 (en) | 2004-07-13 | 2004-07-13 | Data management system |
US11/733,305 US20070198690A1 (en) | 2004-07-13 | 2007-04-10 | Data Management System |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/890,652 Continuation US7206790B2 (en) | 2004-07-13 | 2004-07-13 | Data management system |
Publications (1)
Publication Number | Publication Date |
---|---|
US20070198690A1 true US20070198690A1 (en) | 2007-08-23 |
Family
ID=35600711
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/890,652 Expired - Fee Related US7206790B2 (en) | 2004-07-13 | 2004-07-13 | Data management system |
US11/733,305 Abandoned US20070198690A1 (en) | 2004-07-13 | 2007-04-10 | Data Management System |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/890,652 Expired - Fee Related US7206790B2 (en) | 2004-07-13 | 2004-07-13 | Data management system |
Country Status (2)
Country | Link |
---|---|
US (2) | US7206790B2 (en) |
JP (1) | JP4744955B2 (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090106407A1 (en) * | 2007-10-19 | 2009-04-23 | Hitachi, Ltd. | Content transfer system, content transfer method and home server |
US20090204648A1 (en) * | 2008-02-11 | 2009-08-13 | Steven Francie Best | Tracking metadata for files to automate selective backup of applications and their associated data |
US8296414B1 (en) * | 2007-09-28 | 2012-10-23 | Emc Corporation | Techniques for automated application discovery |
US20120271934A1 (en) * | 2007-12-27 | 2012-10-25 | Naoko Iwami | Storage system and data management method in storage system |
US20130110785A1 (en) * | 2011-10-27 | 2013-05-02 | Hon Hai Precision Industry Co., Ltd. | System and method for backing up test data |
US20130151475A1 (en) * | 2011-12-07 | 2013-06-13 | Fabrice Helliker | Data Management System with Console Module |
US20130262374A1 (en) * | 2012-03-28 | 2013-10-03 | Fabrice Helliker | Method Of Managing Data With Console Module |
Families Citing this family (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7571387B1 (en) * | 2005-09-21 | 2009-08-04 | Emc Corporation | Methods and apparatus facilitating management of a SAN |
US7613742B2 (en) * | 2006-05-02 | 2009-11-03 | Mypoints.Com Inc. | System and method for providing three-way failover for a transactional database |
US8903883B2 (en) * | 2006-05-24 | 2014-12-02 | International Business Machines Corporation | Apparatus, system, and method for pattern-based archiving of business events |
US8112396B2 (en) * | 2006-06-07 | 2012-02-07 | Emc Corporation | Backup and recovery of integrated linked databases |
US20080104146A1 (en) * | 2006-10-31 | 2008-05-01 | Rebit, Inc. | System for automatically shadowing encrypted data and file directory structures for a plurality of network-connected computers using a network-attached memory with single instance storage |
CA2668074A1 (en) * | 2006-10-31 | 2008-05-08 | David Schwaab | System for automatically shadowing data and file directory structures that are recorded on a computer memory |
US8266105B2 (en) * | 2006-10-31 | 2012-09-11 | Rebit, Inc. | System for automatically replicating a customer's personalized computer system image on a new computer system |
JP5073348B2 (en) * | 2007-04-04 | 2012-11-14 | 株式会社日立製作所 | Application management support system, management computer, host computer, and application management support method |
US8209443B2 (en) * | 2008-01-31 | 2012-06-26 | Hewlett-Packard Development Company, L.P. | System and method for identifying lost/stale hardware in a computing system |
US7882246B2 (en) * | 2008-04-07 | 2011-02-01 | Lg Electronics Inc. | Method for updating connection profile in content delivery service |
JP5579195B2 (en) | 2008-12-22 | 2014-08-27 | グーグル インコーポレイテッド | Asynchronous distributed deduplication for replicated content addressable storage clusters |
US9329951B2 (en) | 2009-07-31 | 2016-05-03 | Paypal, Inc. | System and method to uniformly manage operational life cycles and service levels |
JP5357068B2 (en) * | 2010-01-20 | 2013-12-04 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Information processing apparatus, information processing system, data archive method, and data deletion method |
US8935612B2 (en) * | 2010-04-07 | 2015-01-13 | Sybase, Inc. | Data replication tracing |
US8527431B2 (en) | 2010-11-18 | 2013-09-03 | Gaurab Bhattacharjee | Management of data via cooperative method and system |
JP2014031458A (en) * | 2012-08-06 | 2014-02-20 | Nsk Ltd | Lubricant composition and bearing unit for a hard disc drive swing arm |
US10664356B1 (en) * | 2013-05-30 | 2020-05-26 | EMC IP Holding Company LLC | Method and system for enabling separation of database administrator and backup administrator roles |
WO2015126971A1 (en) * | 2014-02-18 | 2015-08-27 | Cobalt Iron, Inc. | Techniques for presenting views of a backup environment for an organization on a sub-organizational basis |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5065347A (en) * | 1988-08-11 | 1991-11-12 | Xerox Corporation | Hierarchical folders display |
US5388196A (en) * | 1990-09-07 | 1995-02-07 | Xerox Corporation | Hierarchical shared books with database |
US6282602B1 (en) * | 1998-06-30 | 2001-08-28 | Emc Corporation | Method and apparatus for manipulating logical objects in a data storage system |
US6329985B1 (en) * | 1998-06-30 | 2001-12-11 | Emc Corporation | Method and apparatus for graphically displaying mapping of a logical object |
US20020161855A1 (en) * | 2000-12-05 | 2002-10-31 | Olaf Manczak | Symmetric shared file storage system |
US20030009295A1 (en) * | 2001-03-14 | 2003-01-09 | Victor Markowitz | System and method for retrieving and using gene expression data from multiple sources |
US20030115218A1 (en) * | 2001-12-19 | 2003-06-19 | Bobbitt Jared E. | Virtual file system |
US20030185064A1 (en) * | 2002-04-02 | 2003-10-02 | Hitachi, Ltd. | Clustering storage system |
US20040078376A1 (en) * | 2002-10-21 | 2004-04-22 | Hitachi, Ltd. | Method for displaying the amount of storage use |
US20040139128A1 (en) * | 2002-07-15 | 2004-07-15 | Becker Gregory A. | System and method for backing up a computer system |
US6854035B2 (en) * | 2001-10-05 | 2005-02-08 | International Business Machines Corporation | Storage area network methods and apparatus for display and management of a hierarchical file system extension policy |
US7054892B1 (en) * | 1999-12-23 | 2006-05-30 | Emc Corporation | Method and apparatus for managing information related to storage activities of data storage systems |
Family Cites Families (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5495607A (en) * | 1993-11-15 | 1996-02-27 | Conner Peripherals, Inc. | Network management system having virtual catalog overview of files distributively stored across network domain |
US5890165A (en) * | 1996-03-29 | 1999-03-30 | Emc Corporation | Method and apparatus for automatic discovery of databases |
US5953525A (en) * | 1997-03-31 | 1999-09-14 | International Business Machines Corporation | Multi-tier view project window |
US6173293B1 (en) * | 1998-03-13 | 2001-01-09 | Digital Equipment Corporation | Scalable distributed file system |
WO2000004483A2 (en) * | 1998-07-15 | 2000-01-27 | Imation Corp. | Hierarchical data storage management |
US6952823B2 (en) * | 1998-09-01 | 2005-10-04 | Pkware, Inc. | Software patch generator using compression techniques |
US6380957B1 (en) * | 1998-12-15 | 2002-04-30 | International Business Machines Corporation | Method of controlling view of large expansion tree |
JP2000207264A (en) * | 1999-01-19 | 2000-07-28 | Canon Inc | Backup method and restoring method |
US6950871B1 (en) * | 2000-06-29 | 2005-09-27 | Hitachi, Ltd. | Computer system having a storage area network and method of handling data in the computer system |
WO2002029540A2 (en) * | 2000-10-06 | 2002-04-11 | Ampex Corporation | System and method for transferring data between recording devices |
US6912543B2 (en) * | 2000-11-14 | 2005-06-28 | International Business Machines Corporation | Object-oriented method and system for transferring a file system |
US6636878B1 (en) * | 2001-01-16 | 2003-10-21 | Sun Microsystems, Inc. | Mechanism for replicating and maintaining files in a spaced-efficient manner |
US7478096B2 (en) * | 2003-02-26 | 2009-01-13 | Burnside Acquisition, Llc | History preservation in a computer storage system |
US20050049998A1 (en) * | 2003-08-28 | 2005-03-03 | International Business Machines Corporation | Mechanism for deploying enterprise information system resources |
US7287048B2 (en) * | 2004-01-07 | 2007-10-23 | International Business Machines Corporation | Transparent archiving |
US20050267918A1 (en) * | 2004-05-28 | 2005-12-01 | Gatev Andrei A | System and method for bundling deployment descriptor files within an enterprise archive for fast reliable resource setup at deployment time |
- 2004
  - 2004-07-13 US US10/890,652 patent/US7206790B2/en not_active Expired - Fee Related
- 2005
  - 2005-07-04 JP JP2005194782A patent/JP4744955B2/en not_active Expired - Fee Related
- 2007
  - 2007-04-10 US US11/733,305 patent/US20070198690A1/en not_active Abandoned
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8296414B1 (en) * | 2007-09-28 | 2012-10-23 | Emc Corporation | Techniques for automated application discovery |
US20090106407A1 (en) * | 2007-10-19 | 2009-04-23 | Hitachi, Ltd. | Content transfer system, content transfer method and home server |
US8819205B2 (en) * | 2007-10-19 | 2014-08-26 | Hitachi, Ltd. | Content transfer system, content transfer method and home server |
US20120271934A1 (en) * | 2007-12-27 | 2012-10-25 | Naoko Iwami | Storage system and data management method in storage system |
US8775600B2 (en) * | 2007-12-27 | 2014-07-08 | Hitachi, Ltd. | Storage system and data management method in storage system |
US20090204648A1 (en) * | 2008-02-11 | 2009-08-13 | Steven Francie Best | Tracking metadata for files to automate selective backup of applications and their associated data |
US20130110785A1 (en) * | 2011-10-27 | 2013-05-02 | Hon Hai Precision Industry Co., Ltd. | System and method for backing up test data |
US8538925B2 (en) * | 2011-10-27 | 2013-09-17 | Hong Fu Jin Precision Industry (Shenzhen) Co., Ltd. | System and method for backing up test data |
US20130151475A1 (en) * | 2011-12-07 | 2013-06-13 | Fabrice Helliker | Data Management System with Console Module |
US9514143B2 (en) * | 2011-12-07 | 2016-12-06 | Hitachi Data Systems Corporation | Data management system with console module |
US20170039113A1 (en) * | 2011-12-07 | 2017-02-09 | Hitachi Data Systems Corporation | Data management system with console module |
US10817384B2 (en) * | 2011-12-07 | 2020-10-27 | Hitachi Vantara Llc | Data management system with console module |
US20130262374A1 (en) * | 2012-03-28 | 2013-10-03 | Fabrice Helliker | Method Of Managing Data With Console Module |
Also Published As
Publication number | Publication date |
---|---|
US20060015544A1 (en) | 2006-01-19 |
US7206790B2 (en) | 2007-04-17 |
JP4744955B2 (en) | 2011-08-10 |
JP2006031695A (en) | 2006-02-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20070198690A1 (en) | Data Management System | |
US20200257595A1 (en) | Systems and methods for restoring data from network attached storage | |
US7406473B1 (en) | Distributed file system using disk servers, lock servers and file servers | |
US9740700B1 (en) | Snapshot map | |
US7320060B2 (en) | Method, apparatus, and computer readable medium for managing back-up | |
US9092378B2 (en) | Restoring computing environments, such as autorecovery of file systems at certain points in time | |
US8032491B1 (en) | Encapsulating information in a storage format suitable for backup and restore | |
EP1522926B1 (en) | Systems and methods for backing up data files | |
US20060230243A1 (en) | Cascaded snapshots | |
US8284198B1 (en) | Method for visualizing space utilization in storage containers | |
US9047352B1 (en) | Centralized searching in a data storage environment | |
US20030065780A1 (en) | Data storage system having data restore by swapping logical units | |
US20100223428A1 (en) | Snapshot reset method and apparatus | |
US8719535B1 (en) | Method and system for non-disruptive migration | |
US9690791B1 (en) | Snapshot history map | |
US20050278383A1 (en) | Method and apparatus for keeping a file system client in a read-only name space of the file system | |
US7228306B1 (en) | Population of discovery data | |
US9165003B1 (en) | Technique for permitting multiple virtual file systems having the same identifier to be served by a single storage system | |
KR20060007435A (en) | Managing a relationship between one target volume and one source volume | |
US7685460B1 (en) | Multiple concurrent restore using same user interface | |
US7613720B2 (en) | Selectively removing entities from a user interface displaying network entities | |
US11442815B2 (en) | Coordinating backup configurations for a data protection environment implementing multiple types of replication |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |