SYSTEM AND METHOD FOR IN-PLACE DATA MIGRATION
Related Application
[0001] This application claims the benefit of U.S. Provisional Patent Application Ser. No. 61/378,516, entitled System and Method for In-Place Data Migration, filed on 8/31/2010, the contents of which are incorporated herein by reference in their entirety for all purposes.
TECHNICAL FIELD
[0002] The present invention relates to data storage and digital content management and, more particularly, to a cost-effective system and method for in-place or post-facto migration of data to cloud-based storage services.
BACKGROUND INFORMATION
[0003] It is important for companies to find cost effective ways to manage their digital file storage. Although it may seem that file storage is inexpensive, 80% or more of the total cost of ownership is in managing and administering that storage.
Most organizations' need for file storage is growing at 40% to 50% per year, along with the cost to manage that storage. Today, many companies have so much data that moving it from place to place can be cost-prohibitive.
[0004] A number of storage software vendors provide solutions that will store and organize data. Examples of such solutions include conventional NAS, SAN or DAS storage devices, which are typically deployed and maintained by an enterprise's IT department. In addition, there is currently a trend towards public and private cloud-based or virtual data stores and associated name spaces supported internally and externally, and accessed by users via a Wide Area Network such as the Internet and by legacy protocols such as CIFS and NFS. Examples of these approaches include the Microsoft® SharePoint™, ByCast, and Xanet services, etc.
[0005] One drawback facing the storage industry today is that, unlike in the past, when file data was comparatively small and could be easily copied from one location to another, today's enterprises often have too much data to move other than by necessity. This may be particularly problematic for relatively large users attempting to migrate from conventional user-supported NAS, SAN or DAS storage devices to the aforementioned cloud-based or virtual data stores. Indeed, for an enterprise-class customer that may have several terabytes (or more) of data, such movement may not be realistically feasible, since the resources required for such a data migration may approach or exceed the available resources of their IT infrastructure.
[0006] For example, the US military has recently attempted to standardize on SharePoint™. In total there are approximately 3 million users, hundreds of petabytes of data and trillions of files. Currently, it may be possible to load a trillion records into a database. Indeed, in some applications it may be possible to manipulate a billion records using a conventional desktop computer. However, it is impractical, if not substantially impossible, to move 100 petabytes of data electronically from point A to point B in any reasonable period of time or at an affordable cost.
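By way of non-limiting illustration, the scale of the problem can be estimated with simple arithmetic. The sketch below assumes decimal petabytes and a fully utilized network link; the link speeds shown are hypothetical and do not appear in this disclosure.

```python
# Back-of-the-envelope transfer time for a bulk copy.
# ASSUMPTION: the link speeds below are hypothetical; the disclosure states none.

PETABYTE = 10**15  # bytes (decimal petabyte)


def transfer_days(total_bytes: int, link_bits_per_second: float) -> float:
    """Return the days needed to push total_bytes over a fully utilized link."""
    seconds = total_bytes * 8 / link_bits_per_second
    return seconds / 86_400  # seconds per day


data_bytes = 100 * PETABYTE
for gbps in (1, 10, 100):
    days = transfer_days(data_bytes, gbps * 10**9)
    print(f"{gbps:>3} Gbit/s: {days:,.0f} days")
```

Under these assumptions, a sustained 10 Gbit/s link would need roughly 926 days to move 100 petabytes, before accounting for protocol overhead or contention with production traffic.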
[0007] Accordingly, what is needed is a cost-effective system and method for the virtual, or post-facto, migration of relatively large amounts of data to cloud-based data sharing services or other content management systems.
SUMMARY
[0008] Aspects of the present invention include methods and systems for the in-place or post-facto migration of data to a cloud-based data storage service or other virtual storage environment. The system includes a Cloud Storage Import Utility (CSIU) device including a file selection module and configured to generate a user interface. The user interface is configured for allowing a storage administrator to select one or more files, file folders, or shares to be published to the cloud and optionally migrated from a current storage device to another storage service, and for providing an indication of said selection. The CSIU is configured to capture metadata for the selected files or file folders. The CSIU also provides one or more commands understandable by the cloud-based data storage service, to cause the metadata to be migrated to the cloud-based data storage service independently of the files or file folders, so that the files or file folders are usable by the cloud-based storage service without being moved to the cloud-based storage service.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] These and other features and advantages of the present invention will be better understood by reading the following detailed description, taken together with the drawings wherein:
[0010] Figs. 1 and 2 are block diagrams of systems of the prior art;
[0011] Fig. 3 is a block diagram of an embodiment of a system and method of the present invention;
[0012] Fig. 4 is a block diagram of an alternate embodiment of a system and method of the present invention; and
[0013] Figs. 5-15 are screen displays of an exemplary operation of an embodiment of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0014] An aspect of the invention was the realization that data storage for large scale, enterprise-level applications presents issues that are substantially different from those of relatively small scale applications. The instant inventor also realized that, contrary to conventional wisdom among much of the relevant industry, metadata and the underlying data to which it pertains may be separated from one another without sacrificing desired functionality.
[0015] Turning now to Fig. 1, it can be seen that in order to use conventional cloud-based data storage services 28, all of the data, i.e., the underlying data bits and their corresponding metadata, must be moved from an original location (e.g., data store 14) into the cloud-based service 28. As shown in Fig. 2, once in service 28, the data may be transferred to a remote data store, such as via SharePoint's Remote Blob Storage feature, shown at 14', where it may be accessed by service 28. However, both of these scenarios require the initial upload of the underlying data, as well as its corresponding metadata, to service 28.
[0016] Turning now to Fig. 3, an embodiment of the present invention will be described in connection with an exemplary system 10. As shown, system 10 may be accessed by a storage administrator, via a user device 12, which may take the form of a computer, laptop, PDA, smart phone or the like. Other examples of user devices 12 include a workstation, personal computer, personal digital assistant (PDA), wireless telephone, or any other suitable computing device including a processor, a computer readable medium upon which computer readable program code (including instructions and/or data) may be disposed, and a user interface, all of which may be required or used by a storage administrator to migrate data to a cloud-based or virtual data storage service for primary and/or archiving storage. A similar device usable by an end-user, shown as end-user device 12', may be used in a conventional manner to access files administered by embodiments of the present invention.
[0017] As shown, the user device 12 is communicably couplable via a network 18, e.g., a Wide Area Network such as the Internet, to a storage device 14 that may be used for primary (day to day) storage, and/or that may also be used for long term storage or archiving. The primary storage and long-term or archiving storage may be performed on two different areas of the same physical storage device 14 or, alternatively, may be performed on two physically different and/or remotely located storage devices 14.
[0018] Storage device 14 may include any number of storage devices, including, but not limited to, Network Attached Storage (NAS) such as those available from EMC Corporation (Hopkinton, MA, USA) and NetApp (Sunnyvale, CA, USA), Storage Area Network (SAN) devices such as, but not limited to, those from EMC Corporation (Hopkinton, MA, USA), and direct attached storage devices (DAS) such as, but not limited to, devices running the Microsoft Windows Server operating system.
[0019] A cloud-based (virtual) data store/storage system 28 is also shown communicably coupled to network 18. This storage system 28 may take the form of any number of commercially available services, such as the aforementioned Microsoft® SharePoint™, ByCast, and Xanet services, etc. For ease of explication, the embodiments disclosed herein will be shown and described with respect to the Microsoft® SharePoint™ service, with the understanding that these embodiments/descriptions are applicable to substantially any cloud-based or other virtual storage environment data store/storage system currently available or which may be developed in the future.
[0020] As also shown, system 10 includes a Cloud Storage Import Utility (CSIU) 30. This CSIU 30 is located on a server (e.g., a webserver) that may enable user access via webpage(s). This server may also perform other functions and may provide various other features to the network such as database hosting, etc. The CSIU 30 enables users, such as storage administrators, to select files, e.g., by accessing a file selection application 15, for in-place-migration from a storage device 14 to a SharePoint system 28. The CSIU 30 receives file selections from the file selection application 15 and then captures information (e.g., metadata) associated with the selected files. CSIU 30 is configured to then insert this captured metadata into the metadata database of the SharePoint data store 28. The CSIU 30 may also be configured to index (or to communicate with SharePoint so that SharePoint can index) the files selected by file selection application 15, e.g., to enable end-users to effect content-based, full text searching of the selected files via the SharePoint interface.
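By way of non-limiting illustration, the capture of metadata independently of file contents may be sketched as follows. The table name, the column names, and the use of an in-memory SQLite database as a stand-in for the metadata database of data store 28 are all assumptions made for this example; they are not the schema or interface of any actual service.

```python
import os
import sqlite3
import tempfile


def capture_metadata(path: str) -> dict:
    """Collect file-system metadata only; the file's contents are never read."""
    st = os.stat(path)
    return {
        "name": os.path.basename(path),
        "location": os.path.abspath(path),  # where the underlying data stays
        "size_bytes": st.st_size,
        "modified": st.st_mtime,
    }


def insert_metadata(db: sqlite3.Connection, record: dict) -> None:
    """Insert one captured record into the stand-in (hypothetical) metadata table."""
    db.execute(
        "INSERT INTO file_metadata (name, location, size_bytes, modified) "
        "VALUES (:name, :location, :size_bytes, :modified)",
        record,
    )


# In-memory database standing in for the metadata database of data store 28.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE file_metadata "
    "(name TEXT, location TEXT, size_bytes INTEGER, modified REAL)"
)

# Temporary directory standing in for a share on storage device 14.
with tempfile.TemporaryDirectory() as share:
    sample = os.path.join(share, "EULA.doc")
    with open(sample, "w") as fh:
        fh.write("royalty terms")
    insert_metadata(db, capture_metadata(sample))

rows = db.execute("SELECT name, size_bytes FROM file_metadata").fetchall()
print(rows)
```

Only the name, location, size, and timestamp reach the stand-in database; the file bytes remain where they were written.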
[0021] It should be recognized that the file selection application 15 may be a software application, such as a version of the NTP Software Storage Investigator™ available from NTP Software (Nashua, NH) and incorporated herein by reference, that may be modified in accordance with the teachings hereof, to permit users to designate specific files or categories of files for use by CSIU 30. The file selection application 15 may reside directly on the server hosting CSIU 30, or on another server or platform, including, optionally, user device 12. It should also be recognized that storage device 14 may be substantially any data store which is remote from the SharePoint store 28, including, for example, a data store connected via SharePoint's Remote Blob Storage, shown as 14' in Fig. 4.
[0022] As mentioned hereinabove, user device 12, 12', storage device 14, 14', cloud storage service 28, and the server that holds CSIU 30, are communicably coupled to one another over a network communication path 18, such as the Internet. The user device 12, 12' may be any form of computing or data processing device capable of communicating via network 18.
[0023] Terms such as "server", "application", "engine", "module" and the like are each intended to refer to a computer-related component, including hardware, software, and/or software in execution. For example, an engine may be, but is not limited to being, a process running on a processor, a processor including an object, an executable, a thread of execution, a program, and a computer. Moreover, the various components may be localized on one computer and/or distributed between two or more computers. The term "cloud-based data storage" will be used herein to refer to substantially any virtual storage environment. The term "in-place migration" and/or "post-facto migration" refers to publishing or otherwise making data usable by the cloud-based storage service without having to first move the data to the cloud-based storage service.
[0024] In various embodiments, the CSIU 30 and/or file selection application 15 may provide a user interface that takes any of various forms including, but not limited to, a standard web browser based application that operates with web browsers such as, but not limited to, Microsoft Internet Explorer (IE) and Mozilla Firefox.
[0025] The CSIU 30 is an application configured to effectively translate selections made using the file selection application 15, e.g., using lookup tables, databases, hard coded programming, configuration files or the like, into instructions or commands usable by CSIU 30 as discussed hereinabove. CSIU 30 is also configured to capture information (e.g., metadata) associated with the file selections and effectively package it with these instructions/commands for use by cloud-based service 28. CSIU 30 may also handle appropriate security requirements, e.g., to ensure that the particular user at device 12 has requisite permissions, etc.
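By way of non-limiting illustration, a lookup-table translation of selections into commands, packaged with captured metadata and guarded by a permission check, may be sketched as follows. The command names, the `Selection` type, and the permission model are hypothetical and are not the command set of any actual service.

```python
from dataclasses import dataclass, field

# HYPOTHETICAL lookup table: kind of selection -> command name for the service.
COMMAND_TABLE = {
    "file": "ADD_LINK",
    "folder": "ADD_FOLDER_LINKS",
    "share": "ADD_SHARE_LINKS",
}


@dataclass
class Selection:
    kind: str  # "file", "folder", or "share"
    path: str
    metadata: dict = field(default_factory=dict)


def translate(selection: Selection, user: str, permitted_users: set) -> dict:
    """Translate one selection into a command packaged with its metadata."""
    if user not in permitted_users:
        raise PermissionError(f"{user} may not publish {selection.path}")
    if selection.kind not in COMMAND_TABLE:
        raise ValueError(f"unknown selection kind: {selection.kind}")
    return {
        "command": COMMAND_TABLE[selection.kind],
        "target": selection.path,
        "metadata": selection.metadata,
    }


message = translate(
    Selection("folder", "/home/alice", {"owner": "alice"}),
    user="admin",
    permitted_users={"admin"},
)
print(message)
```

A table-driven design of this kind keeps service-specific command names in one place, so that supporting a different target service would mainly require a different table.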
[0026] In particular embodiments, CSIU 30 may include a version of the NTP Software ODA™ engine commercially available from NTP Software, Inc. (Nashua, NH, USA) and incorporated herein by reference, and which has been modified in accordance with the teachings hereof.
[0027] In a representative method of operating system 10, a user (e.g., storage administrator) may use device 12 to access 40 the file selection application 15 of the CSIU 30 and select files or folders on primary data store 14. The CSIU 30 may then capture information (e.g., metadata) for the selected files and/or folder(s), and translate the intended actions into instructions, including metadata, to be conveyed 42 to the SharePoint service 28 for incorporation into the SharePoint metadata file(s), to effect the desired in-place-migration of the selected files/folders. Thereafter, an end-user 12' may query 44 the SharePoint data store 28, to retrieve 46 data files stored on remote data store 14.
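By way of non-limiting illustration, the select 40, convey 42, query 44, and retrieve 46 steps may be simulated with in-memory stand-ins. The `CloudService` class, the dictionary standing in for data store 14, and the example path are hypothetical.

```python
# In-memory stand-in for data store 14; the path and contents are hypothetical.
primary_store = {"//fileserver/home/alice/EULA.doc": b"royalty terms"}


class CloudService:
    """Stand-in for service 28: holds metadata only, never file contents."""

    def __init__(self) -> None:
        self.metadata = {}

    def ingest(self, instruction: dict) -> None:  # step 42: metadata conveyed
        self.metadata[instruction["name"]] = instruction["location"]

    def query(self, name: str) -> str:  # step 44: end-user query
        return self.metadata[name]


def select_and_capture(path: str) -> dict:  # step 40 plus metadata capture
    return {
        "name": path.rsplit("/", 1)[-1],
        "location": path,
        "size_bytes": len(primary_store[path]),
    }


service = CloudService()
for path in primary_store:
    service.ingest(select_and_capture(path))

location = service.query("EULA.doc")
content = primary_store[location]  # step 46: bytes are read from data store 14
print(location, content)
```

In this sketch the stand-in service stores only a name and a location for each file; retrieval is resolved against the original data store.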
[0028] Turning now to Fig. 4, an alternate embodiment of the present invention is shown as exemplary system 10'. System 10' is substantially similar to system 10 of Fig. 3, while also including another remote data store 14' which may serve as a new repository for the underlying source data for the files/folders selected by the user via device 12. During operation of this system 10', a user (e.g., storage administrator) may use device 12 to access and select 40 files using the file selection application 15 of the CSIU 30. The CSIU 30 may then capture information (e.g., metadata) for the selected files/folder(s), translate the intended actions into instructions, and convey 42 this information, including the metadata, to the SharePoint service 28. The underlying data may also be moved 43 (e.g., in response to a command sent via device 12) from data store 14 to the other data store 14' (e.g., via SharePoint Remote Blob Storage), where it may be handled by cloud-based service 28. In this manner, system 10' effects the desired in-place-migration of the files selected by the user, by moving them to target data store 14' where they may be accessed via service 28 without ever having to be moved to the service 28. Thereafter, an end-user 12' may query 44 the SharePoint data service 28, to retrieve 46 data files stored on remote data store 14'.
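By way of non-limiting illustration, the optional move 43 of the underlying data from data store 14 to data store 14' may be sketched as a relocation that also updates the location recorded by the service. The dictionaries standing in for the two data stores and for the service's link table are hypothetical.

```python
# Hypothetical in-memory stand-ins for data stores 14 and 14'.
stores = {
    "store_14": {"/home/alice/EULA.doc": b"royalty terms"},
    "store_14_prime": {},
}

# Stand-in for the metadata held by service 28: name -> (store, path).
service_links = {"EULA.doc": ("store_14", "/home/alice/EULA.doc")}


def relocate(name: str, target_store: str) -> None:
    """Step 43: move the bytes between stores and update the recorded location."""
    source_store, path = service_links[name]
    stores[target_store][path] = stores[source_store].pop(path)
    service_links[name] = (target_store, path)


def retrieve(name: str) -> bytes:
    """Steps 44 and 46: resolve the link, then read from whichever store holds the data."""
    store, path = service_links[name]
    return stores[store][path]


relocate("EULA.doc", "store_14_prime")
print(service_links["EULA.doc"], retrieve("EULA.doc"))
```

The data moves between the two stand-in stores, while the stand-in service continues to hold only a link that now points to the new location.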
[0029] A more detailed example of in-place-migration in accordance with the present invention will now be shown and described with reference to Figs. 5-15. Turning now to Fig. 5, user device 12 may be used to access a particular end-user's home directory on data store 14. In this example, the entire contents of this home directory will be selected for (in-place) migration into this user's Home Documents site on SharePoint 28.
[0030] It should be recognized that the data files shown in this home directory on data store 14 are indexed, e.g., by the CSIU 30 using any number of conventional indexing approaches, to enable end-users to search the contents based on keywords. For example, as shown in Fig. 6, the word "royalty" has been used to search for the EULA.doc file. The index(es) of this home directory may thus be imported into service 28 as part of the migration process, and/or the data files may be indexed by service 28 after receiving the metadata, as will be discussed in greater detail hereinbelow.
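By way of non-limiting illustration, one conventional indexing approach is an inverted index mapping each keyword to the files that contain it. The sketch below is a generic example and is not asserted to be the indexing method of CSIU 30 or of service 28; the file contents shown are hypothetical.

```python
import re
from collections import defaultdict


def build_index(documents: dict) -> dict:
    """Map each lower-cased word to the set of file names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index: dict, keyword: str) -> list:
    """Return the sorted file names whose text contains the keyword."""
    return sorted(index.get(keyword.lower(), set()))


# Hypothetical file contents; only the file name EULA.doc appears in the figures.
documents = {
    "EULA.doc": "The licensee shall pay a royalty on each copy.",
    "Notes.txt": "Meeting agenda and action items.",
}

index = build_index(documents)
print(search(index, "royalty"))
```

An index of this kind could be built where the files reside and conveyed alongside the metadata, or rebuilt later by the service from the linked files.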
[0031] As shown in Fig. 7, in this example, prior to file migration, the end-user's Home Documents site on SharePoint 28 is empty.
[0032] As shown in Fig. 8, the CSIU 30, e.g., accessed by a storage administrator via device 12, displays a dialog screen by which the user may select data files, e.g., by entering the source directory path of the end-user's home directory on the file server 14, along with that of the target SharePoint site 28. Clicking the "import" button causes the utility to perform the import by capturing and forwarding the corresponding metadata, while leaving the underlying data files in place at data store 14. After the import/in-place-migration is complete, the SharePoint site 28 contains "links" to each file imported, such as shown in Fig. 9.
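By way of non-limiting illustration, an import that takes a source directory path and a target site and produces one "link" per file, without copying any file, may be sketched as follows. The function name `import_in_place`, the record fields, and the site URL are hypothetical.

```python
import os
import tempfile


def import_in_place(source_dir: str, target_site: str) -> list:
    """Build one link record per file under source_dir; no file is copied or moved."""
    links = []
    for folder, _subdirs, files in os.walk(source_dir):
        for name in sorted(files):
            path = os.path.join(folder, name)
            relative = os.path.relpath(path, source_dir).replace(os.sep, "/")
            links.append(
                {
                    "site_item": f"{target_site.rstrip('/')}/{relative}",
                    "source_path": path,  # the data stays at this location
                    "size_bytes": os.path.getsize(path),
                }
            )
    return links


# Temporary directory standing in for a home directory on file server 14.
with tempfile.TemporaryDirectory() as home:
    for name in ("EULA.doc", "Notes.txt"):
        with open(os.path.join(home, name), "w") as fh:
            fh.write("sample")
    # HYPOTHETICAL target site URL.
    links = import_in_place(home, "https://sharepoint.example/sites/home")
    still_in_place = all(os.path.exists(link["source_path"]) for link in links)

print([link["site_item"] for link in links], still_in_place)
```

Each record pairs an item on the target site with the path at which the file continues to reside.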
[0033] To illustrate that the items in SharePoint 28 are simply "links" to the files on file server 14, the screenshot of Fig. 10 shows the contents of a "Draglmg" Word document. This document was launched (e.g., by the end-user device 12') from the "link" in the user's Home Documents site on SharePoint 28.
[0034] Thereafter, as shown in Fig. 11, the title of the Draglmg document file is modified from the end-user's Home directory on the original file server 14 (i.e., not through SharePoint 28), and then stored back to the file server 14.
[0035] Then, the same file is opened through its "link" on SharePoint 28. As can be seen in Fig. 12, the title of this document shows the change made outside of SharePoint 28. Thus, it can be seen that the contents of the file still reside on the original file server 14, not in the SharePoint database 28.
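By way of non-limiting illustration, the behavior shown in Figs. 10-12 may be reproduced with a link that records only a file's location. The file name, its contents, and the `open_via_link` helper are hypothetical.

```python
import os
import tempfile


def open_via_link(link: dict) -> str:
    """Resolve the link and read the file from where it actually resides."""
    with open(link["source_path"]) as fh:
        return fh.read()


# Temporary directory standing in for file server 14.
with tempfile.TemporaryDirectory() as file_server:
    path = os.path.join(file_server, "report.txt")
    with open(path, "w") as fh:
        fh.write("Original title")

    # All that the stand-in service holds: a name and a location, no contents.
    link = {"name": "report.txt", "source_path": path}
    before = open_via_link(link)

    # Edit the file directly on the file server, bypassing the link.
    with open(path, "w") as fh:
        fh.write("Revised title")
    after = open_via_link(link)

print(before, "->", after)
```

Because the link stores no copy of the contents, a change made directly on the file server is visible the next time the file is opened through the link.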
[0036] Turning now to Fig. 13, once the files have been published or "migrated" as described herein, SharePoint 28 may use its indexing service, e.g., as part of its external "Blob Storage" feature, to index the files. This indexing service may be run on a schedule set by the storage administrator. Alternatively, the indexing process may be initiated manually using the "Start Full Crawl" feature as shown.
[0037] Turning to Fig. 14, the end-user may verify successful indexing by returning to his SharePoint home directory site 28 and performing a search for the word "royalty". As shown in Fig. 15, the search results indicate the search string was located in the EULA.doc file, illustrating successful indexing of the files imported using the in-place-migration of the present invention.
[0038] In this manner, the present invention can interface with, and can be programmed to interface with, essentially any archiving application that allows its command set/command interface to be made known to third parties for interfacing with that archiving application.
[0039] It should be recognized that information, e.g., commands, instructions, metadata, etc., may be passed between the various components (modules) disclosed herein by any convenient means, including conventional push or pull technology, without departing from the scope of the present invention. Moreover, modifications and substitutions by one of ordinary skill in the art are considered to be within the scope of the present invention, which is not to be limited except by any allowed claims and their legal equivalents.
[0040] What is claimed is: