EP1573508A4 - Systems for the implementation of a synchronization schemas - Google Patents
Systems for the implementation of a synchronization schemas
Info
- Publication number
- EP1573508A4 (EP04757348A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- item
- changes
- ofthe
- sync
- relationship
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/178—Techniques for file synchronisation in file systems
- G06F16/1787—Details of non-transparently synchronising file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
Definitions
- MSFT-2734 filed on August 21, 2003, entitled “STORAGE PLATFORM FOR ORGANIZING, SEARCHING, AND SHARING DATA”
- U.S. Patent Application No. 10/646,580 (Atty. Docket No. MSFT-2735), filed on August 21, 2003, entitled “SYSTEMS AND METHODS FOR DATA MODELING IN AN ITEM-BASED STORAGE PLATFORM”
- U.S. Patent Application No. 10/692,779 (Atty. Docket No. MSFT-2829), filed on October 24, 2003, entitled “SYSTEMS AND METHODS FOR THE IMPLEMENTATION OF A DIGITAL IMAGES SCHEMA FOR ORGANIZING UNITS OF INFORMATION MANAGEABLE BY A
- MSFT-2845 filed on October 24, 2003, entitled “SYSTEMS AND METHODS FOR PROVIDING RELATIONAL AND HIERARCHICAL SYNCHRONIZATION SERVICES FOR UNITS OF INFORMATION MANAGEABLE BY A HARDWARE/SOFTWARE INTERFACE SYSTEM”; and U.S. Patent Application No. 10/693,574 (Atty. Docket No. MSFT-2847), filed on October 24, 2003, entitled “SYSTEMS AND METHODS FOR EXTENSIONS AND INHERITANCE FOR UNITS OF INFORMATION MANAGEABLE BY A HARDWARE/SOFTWARE INTERFACE SYSTEM”.
- the present invention relates generally to the field of information storage and retrieval, and, more particularly, to an active storage platform for organizing, searching, and sharing different types of data in a computerized system, as well as to the synchronization of multiple instances of a data store or a subset thereof.
- BACKGROUND: Individual disk capacity has been growing at roughly seventy percent (70%) per year over the last decade. Moore's law accurately predicted the tremendous gains in central processing unit (CPU) power that have occurred over the years. Wired and wireless technologies have provided tremendous connectivity and bandwidth.
- The Multics operating system, developed during the 1960s, can be credited with pioneering the use of files, folders, and directories to manage storable units of data at the operating system level.
- Multics used symbolic addresses within a hierarchy of files (thereby introducing the idea of a file path) where physical addresses of the files were not transparent to the user (applications and end-users).
- This file system was entirely unconcerned with the file format of any individual file, and the relationships amongst and between files were deemed irrelevant at the operating system level (that is, other than the location of the file within the hierarchy). Since the advent of Multics, storable data has been organized into files, folders, and directories at the operating system level.
- These files generally include the file hierarchy itself (the "directory") embodied in a special file maintained by the file system.
- This directory maintains a list of entries corresponding to all of the other files in the directory and the nodal location of such files in the hierarchy (herein referred to as the folders).
- Such has been the state of the art for approximately forty years.
- a file system is nevertheless an abstraction of that physical storage system, and therefore utilization of the files requires a level of indirection (interpretation) between what the user manipulates (units having context, features, and relationships to other units) and what the operating system provides (files, folders, and directories).
- PC personal computer
- IM instant messaging
- the basic relational model does not provide a sufficient platform for storage of data on which higher-level applications can easily be developed because the basic relational model requires a level of indirection between the application and the storage system, where the semantic structure of the data might only be visible in the application in certain instances.
- although database vendors are building higher-level functionality into their products (such as providing object relational capabilities, new organizational models, and the like), none have yet provided the kind of comprehensive solution needed, where a truly comprehensive solution is one which provides both useful data model abstractions (such as “Items,” “Extensions,” “Relationships,” and so on) for useful domain abstractions (such as “Persons,” “Locations,” “Events,” etc.).
- the present invention, as well as the related inventions, are collectively directed to a storage platform for organizing, searching, and sharing data.
- the storage platform of the present invention extends and broadens the concept of data storage beyond existing file systems and database systems, and is designed to be the store for all types of data including structured, non-structured, or semi-structured data.
- the storage platform of the present invention comprises a data store implemented on a database engine.
- the database engine comprises a relational database engine with object relational extensions.
- the data store implements a data model that supports organization, searching, sharing, synchronization, and security of data.
- the platform provides a mechanism to extend the set of schemas to define new types of data (essentially subtypes of the basic types provided by the schemas).
- a synchronization capability facilitates the sharing of data among users or systems.
- File-system-like capabilities are provided that allow interoperability of the data store with existing file systems but without the limitations of such traditional file systems.
- a change tracking mechanism provides the ability to track changes to the data store.
- the storage platform further comprises a set of application program interfaces that enable applications to access all of the foregoing capabilities of the storage platform and to access the data described in the schemas.
- An item is a unit of data storable in a data store and can comprise one or more elements and relationships.
- An element is an instance of a type comprising one or more fields (also referred to herein as a property).
- a relationship is a link between two items.
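The three definitions above (item, element, relationship) can be pictured with a minimal Python sketch. The class and field names below (`Element`, `Item`, `Relationship`, `type_name`, `item_id`, and the sample "WorksAt" link) are illustrative assumptions and are not taken from the patent's schemas or API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List
from uuid import UUID, uuid4


@dataclass
class Element:
    """An instance of a type made up of one or more fields (properties)."""
    type_name: str
    fields: Dict[str, Any]


@dataclass
class Item:
    """A storable unit of data that holds one or more elements."""
    item_id: UUID = field(default_factory=uuid4)
    elements: List[Element] = field(default_factory=list)


@dataclass(frozen=True)
class Relationship:
    """A named link between two Items, identified here by their ids."""
    name: str
    source_id: UUID
    target_id: UUID


# Hypothetical sample data: two Items and one Relationship linking them.
alice = Item(elements=[Element("Person", {"display_name": "Alice"})])
office = Item(elements=[Element("Location", {"city": "Redmond"})])
works_at = Relationship("WorksAt", alice.item_id, office.item_id)
```

Keeping the relationship as a separate object that refers to both Items by id, rather than nesting one Item inside the other, is one possible way to reflect the statement that a relationship is a link between two items.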
- the computer system further comprises a plurality of Items where each Item constitutes a discrete storable unit of information that can be manipulated by a hardware/software interface system; a plurality of Item Folders that constitute an organizational structure for said Items; and a hardware/software interface system for manipulating a plurality of Items and wherein each Item belongs to at least one Item Folder and may belong to more than one Item Folder.
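The membership rule stated above (each Item belongs to at least one Item Folder and may belong to several) can be sketched as a many-to-many index. `FolderIndex` and its methods are hypothetical names chosen for illustration; the patent does not describe this structure.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Set


class FolderIndex:
    """Tracks which Items belong to which Item Folders (many-to-many)."""

    def __init__(self) -> None:
        self._folders_of: Dict[str, Set[str]] = defaultdict(set)
        self._items_in: Dict[str, Set[str]] = defaultdict(set)

    def add(self, item_id: str, folder_id: str) -> None:
        """Record that the Item is a member of the folder."""
        self._folders_of[item_id].add(folder_id)
        self._items_in[folder_id].add(item_id)

    def remove(self, item_id: str, folder_id: str) -> None:
        """Drop one membership, refusing to leave the Item with no folder."""
        folders = self._folders_of[item_id]
        if folder_id not in folders:
            return  # nothing to remove
        if len(folders) == 1:
            raise ValueError("an Item must belong to at least one Item Folder")
        folders.remove(folder_id)
        self._items_in[folder_id].discard(item_id)

    def folders_of(self, item_id: str) -> FrozenSet[str]:
        """Return the folders the Item currently belongs to."""
        return frozenset(self._folders_of[item_id])
```

Unlike a traditional directory tree, nothing here forces an Item to have exactly one parent; the only constraint enforced is the "at least one folder" rule.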
- An Item or some of the Item's property values may be computed dynamically as opposed to being derived from a persistent store.
- the hardware/software interface system does not require that the Item be stored, and certain operations are supported, such as the ability to enumerate the current set of Items or the ability to retrieve an Item given its identifier (which is more fully described in the sections that describe the application programming interface, or API, of the storage platform); for example, an Item might be the current location of a cell phone or the temperature reading on a temperature sensor.
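A rough sketch of such non-persisted Items follows: the value is computed on each read, yet the two operations mentioned above (enumerating the current set and retrieving by identifier) are still available. `ComputedItemSource`, `register`, `enumerate_ids`, and `get` are hypothetical names, and the sensor reader is a stand-in for a real device.

```python
from typing import Callable, Dict, List, Tuple


class ComputedItemSource:
    """Exposes Items whose values are computed on demand, not stored."""

    def __init__(self) -> None:
        self._readers: Dict[str, Callable[[], float]] = {}

    def register(self, item_id: str, reader: Callable[[], float]) -> None:
        """Associate an identifier with a function that computes the value."""
        self._readers[item_id] = reader

    def enumerate_ids(self) -> List[str]:
        """Enumerate the current set of Items by identifier."""
        return sorted(self._readers)

    def get(self, item_id: str) -> Tuple[str, float]:
        """Retrieve an Item given its identifier; the value is read now."""
        return item_id, self._readers[item_id]()
```

For example, registering `lambda: read_temperature()` (where `read_temperature` is whatever function talks to the sensor) would make each `get` call return the reading at that moment rather than a stored copy.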
- the hardware/software interface system may manipulate a plurality of Items, and may further comprise Items interconnected by a plurality of Relationships managed by the hardware/software interface system.
- a hardware/software interface system for the computer system further comprises a core schema to define a set of core Items which said hardware/software interface system understands and can directly process in a predetermined and predictable way.
- the computer system interconnects said Items with a plurality of Relationships and manages said Relationships at the hardware/software interface system level.
- the API of the storage platform provides data classes for each item, item extension, and relationship defined in the set of storage platform schemas.
- the application programming interface provides a set of framework classes that define a common set of behaviors for the data classes and that, together with the data classes, provide the basic programming model for the storage platform API.
- the storage platform API provides a simplified query model that enables application programmers to form queries based on various properties of the items in the data store, in a manner that insulates the application programmer from the details of the query language of the underlying database engine.
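The idea of a property-based query that hides the engine's query language can be sketched as below. This is an assumption-laden illustration: `find_all`, the `items` table, and its columns are hypothetical, and Python's built-in `sqlite3` module stands in for the underlying database engine.

```python
import sqlite3
from typing import Any, List, Tuple

# Hypothetical set of queryable properties; doubles as a column whitelist.
QUERYABLE = {"display_name", "city"}


def find_all(conn: sqlite3.Connection, **props: Any) -> List[Tuple]:
    """Return items whose properties equal the given values.

    The caller supplies only property names and values; the SQL text is
    built here, so application code never touches the query language.
    """
    unknown = set(props) - QUERYABLE
    if unknown:
        raise KeyError(f"unknown properties: {sorted(unknown)}")
    where = " AND ".join(f"{name} = ?" for name in props) or "1 = 1"
    sql = (
        "SELECT item_id, display_name, city FROM items "
        f"WHERE {where} ORDER BY item_id"
    )
    return conn.execute(sql, tuple(props.values())).fetchall()


# Sample in-memory store with two hypothetical items.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    "item_id INTEGER PRIMARY KEY, display_name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO items (display_name, city) VALUES (?, ?)",
    [("Alice", "Redmond"), ("Bob", "Seattle")],
)
```

A call such as `find_all(conn, city="Redmond")` expresses the query purely in terms of item properties; values are passed as bound parameters and property names are checked against the whitelist before being placed in the SQL text.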
- the storage platform API also collects changes to an item made by an application program and then organizes them into the correct updates required by the database engine (or any kind of storage engine) on which the data store is implemented. This enables application programmers to make changes to an item in memory, while leaving the complexity of data store updates to the API. Through its common storage foundation and schematized data, the storage platform of the present invention enables more efficient application development for consumers, knowledge workers and enterprises.
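The change-collection behavior described above resembles what is often called a unit-of-work pattern. The sketch below is one possible reading, not the patent's implementation: `TrackedItem`, its methods, and the `items` table in the generated statement are hypothetical.

```python
from typing import Any, Dict, Optional, Tuple


class TrackedItem:
    """Holds an item in memory and remembers which properties changed."""

    def __init__(self, item_id: int, **props: Any) -> None:
        self.item_id = item_id
        self._props: Dict[str, Any] = dict(props)   # last saved values
        self._dirty: Dict[str, Any] = {}            # unsaved changes

    def get(self, name: str) -> Any:
        """Read a property, preferring an unsaved change if there is one."""
        if name in self._dirty:
            return self._dirty[name]
        return self._props[name]

    def set(self, name: str, value: Any) -> None:
        """Change a property in memory only; nothing is written yet."""
        if name in self._props and self._props[name] == value:
            self._dirty.pop(name, None)  # change reverted, nothing to save
        else:
            self._dirty[name] = value

    def pending_update(self) -> Optional[Tuple[str, Tuple[Any, ...]]]:
        """Fold all collected changes into one parameterized UPDATE."""
        if not self._dirty:
            return None
        names = sorted(self._dirty)
        assignments = ", ".join(f"{name} = ?" for name in names)
        params = tuple(self._dirty[name] for name in names) + (self.item_id,)
        return f"UPDATE items SET {assignments} WHERE item_id = ?", params

    def mark_saved(self) -> None:
        """Call after the update has been applied to the store."""
        self._props.update(self._dirty)
        self._dirty.clear()
```

The application only calls `set`; however many edits it makes, a single update statement is produced when the changes are flushed, which is the division of labor the paragraph describes.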
- Fig. 1 is a block diagram representing a computer system in which aspects of the present invention may be incorporated;
- Fig. 2 is a block diagram illustrating a computer system divided into three component groups: the hardware component, the hardware/software interface system component, and the application programs component;
- FIG. 2A illustrates the traditional tree-based hierarchical structure for files grouped in folders in a directory in a file-based operating system
- FIG. 3 is a block diagram illustrating a storage platform
- Fig. 4 illustrates the structural relationship between Items, Item Folders, and Categories
- Fig. 5A is a block diagram illustrating the structure of an Item
- Fig. 5B is a block diagram illustrating the complex property types of the Item of Fig. 5A
- Fig. 5C is a block diagram illustrating the "Location" Item wherein its complex types are further described (explicitly listed);
- FIG. 6A illustrates an Item as a subtype of the Item found in the Base Schema
- Fig. 6B is a block diagram illustrating the subtype Item of Fig. 6A wherein its inherited types are explicitly listed (in addition to its immediate properties);
- Fig. 7 is a block diagram illustrating the Base Schema including its two top-level class types, Item and PropertyBase, and the additional Base Schema types derived therefrom;
- Fig. 8A is a block diagram illustrating Items in the Core Schema;
- Fig. 8B is a block diagram illustrating the property types in the Core Schema;
- FIG. 9 is a block diagram illustrating an Item Folder, its member Items, and the interconnecting Relationships between the Item Folder and its member Items;
- Fig. 10 is a block diagram illustrating a Category (which, again, is an Item itself), its member Items, and the interconnecting Relationships between the Category and its member Items;
- Fig. 11 is a diagram illustrating a reference type hierarchy of the data model of the storage platform;
- Fig. 12 is a diagram illustrating how relationships are classified;
- Fig. 13 is a diagram illustrating a notification mechanism;
- FIG. 14 is a diagram illustrating an example in which two transactions are both inserting a new record into the same B-Tree;
- Fig. 15 illustrates a data change detection process;
- Fig. 16 illustrates an exemplary directory tree;
- Fig. 17 shows an example in which an existing folder of a directory-based file system is moved into the storage platform data store;
- Fig. 18 illustrates the concept of Containment Folders;
- Fig. 19 illustrates the basic architecture of the storage platform API;
- Fig. 20 schematically represents the various components of the storage platform API stack;
- Fig. 21A is a pictorial representation of an exemplary Contacts Item schema;
- FIG. 21B is a pictorial representation of the Elements for the exemplary Contacts Item schema of Fig. 21A;
- Fig. 22 illustrates the runtime framework of the storage platform API;
- Fig. 23 illustrates the execution of a "FindAll" operation;
- Fig. 24 illustrates the process by which storage platform API classes are generated from the storage platform Schema;
- Fig. 25 illustrates a schema on which a File API is based;
- Fig. 26 is a diagram illustrating an access mask format used for data security purposes;
- FIG. 27 depicts a new identically protected security region being carved out of an existing security region;
- Fig. 28 is a diagram illustrating the concept of an Item search view;
- Fig. 29 is a diagram illustrating an exemplary Item hierarchy;
- Fig. 30A illustrates an interface Interface1 as a conduit through which first and second code segments communicate;
- Fig. 30B illustrates an interface as comprising interface objects I1 and I2 which enable first and second code segments of a system to communicate via medium M;
- FIG. 31A illustrates how the function provided by interface Interface1 may be subdivided to convert the communications of the interface into multiple interfaces Interface1A, Interface1B, Interface1C;
- Fig. 31B illustrates how the function provided by interface I1 may be subdivided into multiple interfaces I1a, I1b, I1c;
- Fig. 32A illustrates a scenario where a meaningless parameter precision can be ignored or replaced with an arbitrary parameter;
- Fig. 32B illustrates a scenario where an interface is replaced by a substitute interface that is defined to ignore or add parameters to an interface;
- Fig. 33A illustrates a scenario where 1st and 2nd Code Segments are merged into a module containing them both;
- FIG. 33B illustrates a scenario where part or all of an interface may be written inline into another interface to form a merged interface.
- Fig. 34A illustrates how one or more pieces of middleware might convert communications on the first interface to conform them to one or more different interfaces;
- Fig. 34B illustrates how a code segment can be introduced with an interface to receive the communications from one interface but transmit the functionality to second and third interfaces;
- Fig. 35A illustrates how a just-in-time compiler (JIT) might convert communications from one code segment to another code segment;
- Fig. 35B illustrates how a JIT method of dynamically rewriting one or more interfaces may be applied to dynamically factor or otherwise alter said interface;
- Fig. 36 illustrates three instances of a common data store and the components for synchronizing them; and
- Fig. 37 illustrates one embodiment of the present invention that presumes a simple adapter that is unaware of how state is calculated or its associated metadata is exchanged.
- A. EXEMPLARY COMPUTING ENVIRONMENT
- Fig. 1 and the following discussion are intended to provide a brief general description of a suitable computing environment in which the invention may be implemented.
- various aspects of the invention may be described in the general context of computer executable instructions, such as program modules, being executed by a computer, such as a client workstation or a server.
- program modules include routines, programs, objects, components, data structures and the like that perform particular tasks or implement particular abstract data types.
- an exemplary general purpose computing system includes a conventional personal computer 20 or the like, including a processing unit 21, a system memory 22, and a system bus 23 that couples various system components including the system memory to the processing unit 21.
- the system bus 23 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures.
- the system memory includes read only memory (ROM) 24 and random access memory (RAM) 25.
- ROM 24 read only memory
- RAM random access memory
- the personal computer 20 may further include a hard disk drive 27 for reading from and writing to a hard disk, not shown, a magnetic disk drive 28 for reading from or writing to a removable magnetic disk 29, and an optical disk drive 30 for reading from or writing to a removable optical disk 31 such as a CD ROM or other optical media.
- the hard disk drive 27, magnetic disk drive 28, and optical disk drive 30 are connected to the system bus 23 by a hard disk drive interface 32, a magnetic disk drive interface 33, and an optical drive interface 34, respectively.
- the drives and their associated computer readable media provide non volatile storage of computer readable instructions, data structures, program modules and other data for the personal computer 20.
- the exemplary environment described herein employs a hard disk, a removable magnetic disk 29 and a removable optical disk 31, it should be appreciated by those skilled in the art that other types of computer readable media which can store data that is accessible by a computer, such as magnetic cassettes, flash memory cards, digital video disks, Bernoulli cartridges, random access memories (RAMs), read only memories (ROMs) and the like may also be used in the exemplary operating environment.
- the exemplary environment may also include many types of monitoring devices such as heat sensors and security or fire alarm systems, and other sources of information.
- a number of program modules may be stored on the hard disk, magnetic disk 29, optical disk 31, ROM 24 or RAM 25, including an operating system 35, one or more application programs 36, other program modules 37 and program data 38.
- a user may enter commands and information into the personal computer 20 through input devices such as a keyboard 40 and pointing device 42.
- Other input devices may include a microphone, joystick, game pad, satellite dish, scanner or the like.
- serial port interface 46 that is coupled to the system bus, but may be connected by other interfaces, such as a parallel port, game port or universal serial bus (USB).
- a monitor 47 or other type of display device is also connected to the system bus 23 via an interface, such as a video adapter 48.
- personal computers typically include other peripheral output devices (not shown), such as speakers and printers.
- the exemplary system of Fig. 1 also includes a host adapter 55, Small Computer System Interface (SCSI) bus 56, and an external storage device 62 connected to the SCSI bus 56.
- SCSI Small Computer System Interface
- the personal computer 20 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 49.
- the remote computer 49 may be another personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the personal computer 20, although only a memory storage device 50 has been illustrated in Fig. 1.
- the logical connections depicted in Fig. 1 include a local area network (LAN) 51 and a wide area network (WAN) 52.
- LAN local area network
- WAN wide area network
- Such networking environments are commonplace in offices, enterprise wide computer networks, intranets and the Internet.
- When used in a LAN networking environment, the personal computer 20 is connected to the LAN 51 through a network interface or adapter 53.
- When used in a WAN networking environment, the personal computer 20 typically includes a modem 54 or other means for establishing communications over the wide area network 52, such as the Internet.
- the modem 54, which may be internal or external, is connected to the system bus 23 via the serial port interface 46.
- program modules depicted relative to the personal computer 20, or portions thereof, may be stored in the remote memory storage device. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.
- a computer system 200 can be roughly divided into three component groups: the hardware component 202, the hardware/software interface system component 204, and the applications programs component 206 (also referred to as the “user component” or “software component” in certain contexts herein).
- the hardware component 202 may comprise the central processing unit (CPU) 21, the memory (both ROM 24 and RAM 25), the basic input/output system (BIOS) 26, and various input/output (I/O) devices such as a keyboard 40, a mouse 42, a monitor 47, and/or a printer (not shown), among other things.
- the hardware component 202 comprises the basic physical infrastructure for the computer system 200.
- the applications programs component 206 comprises various software programs including but not limited to compilers, database systems, word processors, business programs, videogames, and so forth. Application programs provide the means by which computer resources are utilized to solve problems, provide solutions, and process data for various users (machines, other computer systems, and/or end-users).
- the hardware/software interface system component 204 comprises (and, in some embodiments, may solely consist of) an operating system that itself comprises, in most cases, a shell and a kernel.
- An “operating system” (OS) is a special program that acts as an intermediary between application programs and computer hardware.
- the hardware/software interface system component 204 may also comprise a virtual machine manager (VMM), a Common Language Runtime (CLR) or its functional equivalent, a Java Virtual Machine (JVM) or its functional equivalent, or other such software components in the place of or in addition to the operating system in a computer system.
- VMM virtual machine manager
- CLR Common Language Runtime
- JVM Java Virtual Machine
- the purpose of a hardware/software interface system is to provide an environment in which a user can execute application programs.
- the goal of any hardware/software interface system is to make the computer system convenient to use, as well as utilize the computer hardware in an efficient manner.
- the hardware/software interface system is generally loaded into a computer system at startup and thereafter manages all of the application programs in the computer system.
- the application programs interact with the hardware/software interface system by requesting services via an application program interface (API).
- API application program interface
- a hardware/software interface system traditionally performs a variety of services for applications. In a multitasking hardware/software interface system where multiple programs may be running at the same time, the hardware/software interface system determines which applications should run in what order and how much time should be allowed for each application before switching to another application for a turn. The hardware/software interface system also manages the sharing of internal memory among multiple applications, and handles input and output to and from attached hardware devices such as hard disks, printers, and dial-up ports.
- the hardware/software interface system also sends messages to each application (and, in certain cases, to the end-user) regarding the status of operations and any errors that may have occurred.
- the hardware/software interface system can also offload the management of batch jobs (e.g., printing) so that the initiating application is freed from this work and can resume other processing and/or operations.
- batch jobs e.g., printing
- On computers that can provide parallel processing, a hardware/software interface system also manages dividing a program so that it runs on more than one processor at a time.
- a hardware/software interface system shell (simply referred to herein as a “shell”) is an interactive end-user interface to a hardware/software interface system.
- a shell may also be referred to as a “command interpreter” or, in an operating system, as an “operating system shell”.
- a shell is the outer layer of a hardware/software interface system that is directly accessible by application programs and/or end-users.
- a kernel is a hardware/software interface system's innermost layer that interacts directly with the hardware components.
- computer system is intended to encompass any and all devices capable of storing and processing information and/or capable of using the stored information to control the behavior or execution of the device itself, regardless of whether such devices are electronic, mechanical, logical, or virtual in nature.
- files are units of storable information that may include the hardware/software interface system as well as application programs, data sets, and so forth.
- files are the basic discrete (storable and retrievable) units of information (e.g., data, programs, and so forth) that can be manipulated by the hardware/software interface system.
- Groups of files are generally organized in “folders.”
- a folder is a collection of files that can be retrieved, moved, and otherwise manipulated as single units of information.
- directory a tree-based hierarchical arrangement
- the terms “directory” and/or “folder” are interchangeable, and early Apple computer systems (e.g., the Apple IIe) used the term “catalog” instead of directory; however, as used herein, all of these terms are deemed to be synonymous and interchangeable and are intended to further include all other equivalent terms for and references to hierarchical information storage structures and their folder and file components.
- a directory of folders is a tree-based hierarchical structure wherein files are grouped into folders and folders, in turn, are arranged according to relative nodal locations that comprise the directory tree.
- a DOS-based file system base folder (or "root directory") 212 may comprise a plurality of folders 214, each of which may further comprise additional folders (as "subfolders" of that particular folder) 216, and each of these may also comprise additional folders 218 ad infinitum.
- Each of these folders may have one or more files 220 although, at the hardware/software interface system level, the individual files in a folder have nothing in common other than their location in the tree hierarchy.
- each folder is a container for its subfolders and its files; that is, each folder owns its subfolders and files.
- when a folder is deleted by the hardware/software interface system, that folder's subfolders and files are also deleted (which, in the case of each subfolder, further includes its own subfolders and files recursively).
- each file is generally owned by only one folder and, although a file can be copied and the copy located in a different folder, a copy of a file is itself a distinct and separate unit that has no direct connection to the original (e.g., changes to the original file are not mirrored in the copy file at the hardware/software interface system level).
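The ownership semantics of a traditional folder tree described above (a folder owns its contents, deletion cascades recursively, and a copied file is independent of the original) can be illustrated with a small in-memory model. The `File` and `Folder` classes and the sample names are assumptions made for the sketch.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class File:
    content: str


@dataclass
class Folder:
    """A container that owns its files and subfolders."""
    files: Dict[str, File] = field(default_factory=dict)
    subfolders: Dict[str, "Folder"] = field(default_factory=dict)

    def count_files(self) -> int:
        """Count files in this folder and, recursively, in its subfolders."""
        nested = sum(sub.count_files() for sub in self.subfolders.values())
        return len(self.files) + nested

    def delete_subfolder(self, name: str) -> int:
        """Delete a subfolder with everything it owns; return files removed."""
        removed = self.subfolders.pop(name)
        return removed.count_files()


# Hypothetical tree: root/docs/a.txt, root/docs/old/b.txt, root/backup/a.txt
root = Folder()
root.subfolders["docs"] = Folder(files={"a.txt": File("v1")})
root.subfolders["docs"].subfolders["old"] = Folder(files={"b.txt": File("x")})
root.subfolders["backup"] = Folder()

# Copying creates a separate unit with no link back to the original.
original = root.subfolders["docs"].files["a.txt"]
root.subfolders["backup"].files["a.txt"] = copy.deepcopy(original)
original.content = "v2"  # this change is not mirrored in the copy
```

After the edit, the copy in `backup` still holds the old content, and deleting `docs` takes its nested `old` folder and both files with it.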
- files and folders are therefore characteristically "physical" in nature because folders are treated like physical containers, and files are treated as discrete and separate physical elements inside these containers.
- the present invention, in combination with the related inventions incorporated by reference as discussed earlier herein, is directed to a storage platform for organizing, searching, and sharing data.
- the storage platform of the present invention extends and broadens the data platform beyond the kinds of existing file systems and database systems discussed above, and is designed to be the store for all types of data, including a new form of data called Items.
- An “Item” is a unit of storable information accessible to a hardware/software interface system that, unlike a simple file, is an object having a basic set of properties that are commonly supported across all objects exposed to an end-user by the hardware/software interface system shell. Items also have properties and relationships that are commonly supported across all Item types including features that allow new properties and relationships to be introduced (and discussed in great detail later herein).
- An “operating system” (OS) is a special program that acts as an intermediary between application programs and computer hardware. An operating system comprises, in most cases, a shell and a kernel.
- a “hardware/software interface system” is software, or a combination of hardware and software, that serves as the interface between the underlying hardware components of a computer system and applications that execute on the computer system.
- a hardware/software interface system typically comprises (and, in some embodiments, may solely consist of) an operating system.
- a hardware/software interface system may also comprise a virtual machine manager (VMM), a Common Language Runtime (CLR) or its functional equivalent, a Java Virtual Machine (JVM) or its functional equivalent, or other such software components in the place of or in addition to the operating system in a computer system.
- the purpose of a hardware/software interface system is to provide an environment in which a user can execute application programs.
- the goal of any hardware/software interface system is to make the computer system convenient to use, as well as utilize the computer hardware in an efficient manner.
- a storage platform 300 comprises a data store 302 implemented on a database engine 314.
- the database engine comprises a relational database engine with object relational extensions.
- the relational database engine 314 comprises the Microsoft SQL Server relational database engine.
- the data store 302 implements a data model 304 that supports the organization, searching, sharing, synchronization, and security of data. Specific types of data are described in schemas, such as schemas 340, and the storage platform 300 provides tools 346 for deploying those schemas as well as for extending those schemas, as described more fully below.
- a change tracking mechanism 306 implemented within the data store 302 provides the ability to track changes to the data store.
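The text does not say here how change tracking is implemented. Purely as an illustration, one common approach is an append-only log with increasing version numbers, which lets a consumer (for instance, a synchronization component) ask for everything that changed since the version it last saw. `TrackedStore`, `Change`, and `changes_since` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Change:
    version: int   # position in the log, starting at 1
    item_id: str
    kind: str      # "insert", "update", or "delete"


class TrackedStore:
    """A toy data store that logs every modification it accepts."""

    def __init__(self) -> None:
        self._items: Dict[str, Any] = {}
        self._log: List[Change] = []

    def put(self, item_id: str, value: Any) -> None:
        """Insert or update an item and record which of the two happened."""
        kind = "update" if item_id in self._items else "insert"
        self._items[item_id] = value
        self._record(item_id, kind)

    def delete(self, item_id: str) -> None:
        """Remove an item (KeyError if absent) and record the deletion."""
        del self._items[item_id]
        self._record(item_id, "delete")

    def _record(self, item_id: str, kind: str) -> None:
        self._log.append(Change(len(self._log) + 1, item_id, kind))

    def changes_since(self, version: int) -> List[Change]:
        """Return all changes recorded after the given version."""
        return [change for change in self._log if change.version > version]
```

A caller that remembers the last version it processed can call `changes_since` with that number and receive only the newer entries.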
- the data store 302 also provides security capabilities 308 and a promotion/demotion capability 310, both of which are discussed more fully below.
- the data store 302 also provides a set of application programming interfaces 312 to expose the capabilities of the data store 302 to other storage platform components and application programs (e.g., application programs 350a, 350b, and 350c) that utilize the storage platform.
- the storage platform of the present invention still further comprises an application programming interface (API) 322, which enables application programs, such as application programs 350a, 350b, and 350c, to access all of the foregoing capabilities of the storage platform and to access the data described in the schemas.
- API application programming interfaces
- the storage platform API 322 may be used by application programs in combination with other APIs, such as the OLE DB API 324 and the Microsoft Windows Win32 API 326.
- the storage platform 300 of the present invention may provide a variety of services 328 to application programs, including a synchronization service 330 that facilitates the sharing of data among users or systems.
- the synchronization service 330 may enable interoperability with other data stores 340 having the same format as data store 302, as well as access to data stores 342 having other formats.
- the storage platform 300 also provides file system capabilities that allow interoperability of the data store 302 with existing file systems, such as the Windows NTFS file system 318.
- the storage platform 300 may also provide application programs with additional capabilities for enabling data to be acted upon and for enabling interaction with other systems. These capabilities may be embodied in the form of additional services 328, such as an Info Agent service 334 and a notification service 332, as well as in the form of other utilities 336.
- the storage platform is embodied in, or forms an integral part of, the hardware/software interface system of a computer system.
- the storage platform of the present invention may be embodied in, or form an integral part of, an operating system, a virtual machine manager (VMM), a Common Language Runtime (CLR) or its functional equivalent, or a Java Virtual Machine (JVM) or its functional equivalent.
- the storage platform of the present invention enables more efficient application development for consumers, knowledge workers and enterprises. It offers a rich and extensible programming surface area that not only makes available the capabilities inherent in its data model, but also embraces and extends existing file system and database access methods.
- the storage platform 300 of the present invention may be referred to as "WinFS." However, use of this name to refer to the storage platform is solely for convenience of description and is not intended to be limiting in any way.
- the data store 302 of the storage platform 300 of the present invention implements a data model that supports the organization, searching, sharing, synchronization, and security of data that resides in the store.
- an "Item" is the fundamental unit of storage information.
- the data model provides a mechanism for declaring Items and Item extensions and for establishing relationships between Items and for organizing Items in Item Folders and in Categories, as described more fully below.
- the data model relies on two primitive mechanisms, Types and Relationships. Types are structures that provide a format governing the form of an instance of the Type. The format is expressed as an ordered set of Properties.
- a Property is a name for a value or set of values of a given Type.
- a USPostalAddress type might have the properties Street, City, Zip, State in which Street, City and State are of type String and Zip is of Type Int32. Street may be multi-valued (i.e. a set of values) allowing the address to have more than one value for the Street property.
- the system defines certain primitive types that can be used in the construction of other types - these include String, Binary, Boolean, Int16, Int32, Int64, Single, Double, Byte, DateTime, Decimal and GUID.
- the Properties of a Type may be defined using any of the primitive types or (with some restrictions noted below) any of the constructed types.
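The USPostalAddress example above can be sketched as a Python dataclass. This is only an illustration: the class layout, the lowercase field names, and the Int32 range check are assumptions made for the sketch and are not the storage platform's own type declaration syntax.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class USPostalAddress:
    """Hypothetical Python rendering of the USPostalAddress Type."""
    street: List[str] = field(default_factory=list)  # multi-valued String property
    city: Optional[str] = None                        # String
    zip: Optional[int] = None                         # Int32
    state: Optional[str] = None                       # String

    def __post_init__(self) -> None:
        # Python ints are unbounded, so emulate the Int32 range explicitly.
        if self.zip is not None and not -2**31 <= self.zip < 2**31:
            raise ValueError("Zip must fit in an Int32")


addr = USPostalAddress(
    street=["1 Main St", "Suite 200"], city="Redmond", zip=98052, state="WA"
)
```

Street is modeled as a list because the text allows it to hold a set of values; the other properties are single-valued.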
- Relationships can be declared and represent a mapping between the sets of instances of two types. For example, there may be a Relationship declared between the Person Type and the Location Type called LivesAt which defines which people live at which locations. The Relationship has a name and two endpoints, namely a source endpoint and a target endpoint. Relationships may also have an ordered set of properties. Both the Source and Target endpoints have a Name and a Type.
- the LivesAt Relationship has a Source called Occupant of Type Person and a Target called Dwelling of Type Location and in addition has properties StartDate and EndDate indicating the period of time for which the occupant lived at the dwelling. Note that a Person may live at multiple dwellings over time and a dwelling may have multiple occupants so the most likely place to put the StartDate and EndDate information is on the relationship itself.
- Relationships define a mapping between instances that is constrained by the types given as the endpoint types. For example the LivesAt relationship cannot be a relationship in which an Automobile is the Occupant because an Automobile is not a Person.
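A minimal sketch of the LivesAt Relationship and its endpoint type constraint follows. The Python classes, the field names, and the use of a runtime `isinstance` check are assumptions chosen for illustration; the text does not say how the constraint is enforced.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Person:
    name: str


@dataclass
class Location:
    label: str


@dataclass
class Automobile:
    model: str


@dataclass
class LivesAt:
    occupant: Person            # source endpoint, named Occupant, of Type Person
    dwelling: Location          # target endpoint, named Dwelling, of Type Location
    start_date: date            # properties carried by the relationship itself
    end_date: Optional[date] = None

    def __post_init__(self) -> None:
        # The endpoint types constrain which instances may be related.
        if not isinstance(self.occupant, Person):
            raise TypeError("Occupant must be a Person")
        if not isinstance(self.dwelling, Location):
            raise TypeError("Dwelling must be a Location")


home = LivesAt(Person("Ann"), Location("12 Elm St"), date(2001, 5, 1))
```

Attempting `LivesAt(Automobile("sedan"), Location("12 Elm St"), date(2001, 5, 1))` raises `TypeError` in this sketch, mirroring the rule that an Automobile cannot be the Occupant.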
- the data model does allow the definition of a subtype-supertype relationship between types.
- the subtype-supertype relationship, also known as the BaseType relationship, is defined in such a way that if Type A is a BaseType for Type B, it must be the case that every instance of B is also an instance of A. Another way of expressing this is that every instance that conforms to B must also conform to A. If, for example, A has a property Name of Type String while B has a property Age of Type Int16, it follows that any instance of B must have both a Name and an Age.
- the type hierarchy may be envisaged as a tree with a single supertype at the root. The branches from the root provide the first level subtypes, the branches at this level provide the second level subtypes and so on to the leaf-most subtypes which themselves do not have any subtypes.
- a given Type may have zero or many subtypes and zero or one super type.
- a given instance may conform to at most one type together with that type's super types. To put it another way, for a given instance at any level in the tree the instance may conform to at most one subtype at that level.
- a type is said to be Abstract if every instance of the type must also be an instance of a subtype of the type.
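The BaseType rule can be illustrated with single inheritance in Python. The classes `A` and `B` follow the example in the text; mapping Int16 to a plain `int` is a simplification made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class A:
    name: str   # Name: String


@dataclass
class B(A):     # A is the BaseType of B; a type has at most one supertype
    age: int    # Age: Int16 in the text, a plain int here


b = B(name="Ann", age=30)
```

Every instance of `B` is also an instance of `A`, so `b` carries both a Name and an Age.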
- An Item is a unit of storable information that, unlike a simple file, is an object having a basic set of properties that are commonly supported across all objects exposed to an end-user or application program by the storage platform.
- Items also have properties and relationships that are commonly supported across all Item types including features that allow new properties and relationships to be introduced, as discussed below.
- Items are the objects for common operations such as copy, delete, move, open, print, backup, restore, replicate, and so forth. Items are the units that can be stored and retrieved, and all forms of storable information manipulated by the storage platform exist as Items, properties of Items, or Relationships between Items, each of which is discussed in greater detail herein below.
- Items are intended to represent real-world and readily-understandable units of data like Contacts, People, Services, Locations, Documents (of all various sorts), and so on.
- Fig. 5A is a block diagram illustrating the structure of an Item.
- the unqualified name of the Item is "Location".
- the qualified name of the Item is "Core.Location" which indicates that this Item structure is defined as a specific type of Item in the Core Schema. (The Core Schema is discussed in more detail later herein.)
- the Location Item has a plurality of properties including EAddresses, MetropolitanRegion, Neighborhood, and PostalAddresses. The specific type of property for each is indicated immediately following the property name and is separated from the property name by a colon (":").
- EAddresses and PostalAddresses are properties of defined types or "complex types" (as denoted herein by capitalization) of types EAddress and PostalAddress respectively.
- a complex type is a type that is derived from one or more simple data types and/or from other complex types.
- the complex types for the properties of an Item also constitute "nested elements" since the details of the complex type are nested into the immediate Item to define its properties, and the information pertaining to these complex types is maintained with the Item that has these properties (within the Item's boundary, as discussed later herein).
- Fig. 5B is a block diagram illustrating the complex property types PostalAddress and EAddress.
- the PostalAddress property type defines that an Item of property type PostalAddress can be expected to have zero or one City values, zero or one CountryCode values, zero or one MailStop values, and any number (zero to many) of PostalAddressTypes, and so on and so forth. In this way, the shape of the data for a particular property in an Item is thereby defined.
- the EAddress property type is similarly defined as shown.
- another way to represent the complex types in the Location Item is to draw the Item with the individual properties of each complex type listed therein.
- Fig. 5C is a block diagram illustrating the Location Item wherein its complex types are further described.
- the storage platform of the present invention also allows subtyping whereby one property type can be a subtype of another (where the one property type inherits the properties of another, parent property type).
- Items inherently represent their own Item Types that can also be the subject of subtyping.
- the storage platform in several embodiments of the present invention allows an Item to be a subtype of another Item (whereby the one Item inherits the properties of the other, parent Item).
- every Item is a subtype of the "Item" Item type which is the first and foundational Item type found in the Base Schema. (The Base Schema will also be discussed in detail later herein.)
- Fig. 6A illustrates an Item, the Location Item in this instance, as being a subtype of the Item Item type found in the Base Schema. In this drawing, the arrow indicates that the Location Item (like all other Items) is a subtype of the Item Item type.
- the Item Item type, as the foundational Item from which all other Items are derived, has a number of important properties such as ItemID and various timestamps, and thereby defines the standard properties of all Items in an operating system. In the present figure, these properties of the Item Item type are inherited by Location and thereby become properties of Location. Another way to represent the properties in the Location Item inherited from the Item Item type is to draw Location with the individual properties of each property type from the parent Item listed therein.
- Fig. 6B is a block diagram illustrating the Location Item wherein its inherited types are described in addition to its immediate properties. It should be noted and understood that this Item is the same Item illustrated in Fig.
- Certain embodiments of the present invention may enable one to request a subset of properties when retrieving a specific Item; however, the default for many such embodiments is to provide the Item with all of its immediate and inherited properties when retrieved.
- the properties of Items can also be extended by adding new properties to the existing properties of that Item's type. These "extensions" are thereafter bona fide properties of the Item and subtypes of that Item type may automatically include the extension properties.
- the "boundary" of the Item is represented by its properties (including complex property types, extensions, and so forth). An Item's boundary also represents the limit of an operation performed on an Item such as copy, delete, move, create, and so on.
- the boundary encompasses the following: • The Item Type of the Item and, if the Item is a subtype of another Item (as is the case in several embodiments of the present invention where all Items are derived from a single Item and Item Type in the Base Schema), any applicable subtype information (that is, information pertaining to the parent Item Type). If the original Item being copied is a subtype of another Item, the copy may also be a subtype of that same Item. • The Item's complex-type properties and extensions, if any.
- the copy may also have the same complex types.
- the Base.Item type defines a field ItemID of type GUID that stores the identity for the Item.
- An Item must have exactly one identity in the data store 302.
- An item reference is a data structure that contains information to locate and identify an Item.
- an abstract type is defined named ItemReference from which all item reference types derive.
- the ItemReference type defines a virtual method named Resolve. The Resolve method resolves the ItemReference and returns an Item. This method is overridden by the concrete subtypes of ItemReference, which implement a function that retrieves an Item given a reference.
- the Resolve method is invoked as part of the storage platform API 322.
- ItemIDReference is a subtype of ItemReference. It defines a Locator and an ItemID field.
- the Locator field names (i.e. identifies) an item domain. It is processed by a locator resolution method that can resolve the value of the Locator to an item domain.
- the ItemID field is of type ItemID.
- ItemPathReference is a specialization of ItemReference that defines a Locator and a Path field.
- the Locator field identifies an item domain. It is processed by a locator resolution method that can resolve the value of the Locator to an item domain.
- the Path field contains a (relative) path in the storage platform namespace rooted at the item domain provided by the Locator.
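The reference types described above can be sketched as an abstract base class with two concrete subtypes. Everything beyond the names ItemReference, ItemIDReference, ItemPathReference, Resolve, Locator, ItemID, and Path is an assumption: `ItemDomain`, the `DOMAINS` lookup table standing in for the locator resolution method, and the dictionary-based storage are hypothetical helpers for this illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict
from uuid import UUID, uuid4


@dataclass
class Item:
    item_id: UUID
    name: str


class ItemDomain:
    """Hypothetical item domain: Items indexed by ID and by relative path."""

    def __init__(self) -> None:
        self.by_id: Dict[UUID, Item] = {}
        self.by_path: Dict[str, Item] = {}

    def add(self, path: str, item: Item) -> None:
        self.by_id[item.item_id] = item
        self.by_path[path] = item


# Stand-in for the locator resolution method: Locator value -> item domain.
DOMAINS: Dict[str, ItemDomain] = {}


class ItemReference(ABC):
    @abstractmethod
    def resolve(self) -> Item:
        """Resolve the reference and return the Item."""


@dataclass
class ItemIDReference(ItemReference):
    locator: str
    item_id: UUID

    def resolve(self) -> Item:
        return DOMAINS[self.locator].by_id[self.item_id]


@dataclass
class ItemPathReference(ItemReference):
    locator: str
    path: str  # relative path rooted at the item domain named by the locator

    def resolve(self) -> Item:
        return DOMAINS[self.locator].by_path[self.path]
```

Both subtypes override `resolve`, so callers can hold an `ItemReference` without knowing whether it locates the Item by identity or by path.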
- an Item can belong to more than one Item Folder, such that when an Item is accessed in one Item Folder and revised, this revised Item can then be accessed directly from another Item folder.
- although access to an Item may occur from different Item Folders, what is actually being accessed is in fact the very same Item.
- an Item Folder does not necessarily own all of its member Items, or may simply co-own Items in conjunction with other folders, such that the deletion of an Item Folder does not necessarily result in the deletion of the Item.
- an Item must belong to at least one Item Folder so that if the sole Item Folder for a particular Item is deleted then, for some embodiments, the Item is automatically deleted or, in alternative embodiments, the Item automatically becomes a member of a default Item Folder (e.g., a "Trash Can" Item Folder conceptually similar to similarly-named folders used in various file-and-folder-based systems).
- Items may also belong to Categories based on common described characteristics such as (a) an Item Type (or Types), (b) a specific immediate or inherited property (or properties), or (c) a specific value (or values) corresponding to an Item property.
- Categories are conceptually different from Item Folders in that, whereas Item Folders may comprise Items that are not interrelated (i.e., without a common described characteristic), each Item in a Category has a common type, property, or value (a "commonality") that is described for that Category, and it is this commonality that forms the basis for its relationship to and among the other Items in the Category.
- Fig. 4 illustrates the structural relationship between Items, Item Folders, and Categories.
- a plurality of Items 402, 404, 406, 408, 410, 412, 414, 416, 418, and 420 are members of various Item Folders 422, 424, 426, 428, and 430. Some Items may belong to more than one Item Folder, e.g., Item 402 belongs to Item Folders 422 and 424.
- Some Items, e.g., Items 402, 404, 406, 408, 410, and 412, are also members of one or more Categories 432, 434, and 436, while other Items, e.g., Items 414, 416, 418, and 420, may belong to no Categories (although this is largely unlikely in certain embodiments where the possession of any property automatically implies membership in a Category, and thus an Item would have to be completely featureless in order not to be a member of any category in such an embodiment).
- both Categories and Item Folders have structures more akin to directed graphs as shown. In any event, the Items, Item Folders, and Categories are all Items (albeit of different Item Types).
- the Base Schema defines certain special types of Items and properties, and the features of these special foundational types from which subtypes can be further derived.
- the use of this Base Schema allows a programmer to conceptually distinguish Items (and their respective types) from properties (and their respective types).
- the Base Schema sets forth the foundational set of properties that all Items may possess as all Items (and their corresponding Item Types) are derived from this foundational Item in the Base Schema (and its corresponding Item Type).
- the Base Schema defines three top-level types: Item, Extension, and PropertyBase.
- the Item type is defined by the properties of this foundational "Item” Item type.
- the top level property type "PropertyBase" has no predefined properties and is merely the anchor from which all other property types are derived and through which all derived property types are interrelated (being commonly derived from the single property type).
- the Extension type properties define which Item the extension extends as well as identification to distinguish one extension from another as an Item may have multiple extensions.
- ItemFolder is a subtype of the Item Item type that, in addition to the properties inherited from Item, features a Relationship for establishing links to its members (if any), whereas both IdentityKey and Property are subtypes of PropertyBase.
- Fig. 8A is a block diagram illustrating Items in the Core Schema.
- Fig. 8B is a block diagram illustrating the property types in the Core Schema.
- the Core Schema defines a set of core Item types that, directly (by Item type) or indirectly (by Item subtype), characterize all Items into one or more Core Schema Item types which the Item-based hardware/software interface system understands and can directly process in a predetermined and predictable way.
- the predefined Item types reflect the most common Items in the Item-based hardware/software interface system and thus a level of efficiency is gained by the Item-based hardware/software interface system understanding these predefined Item types that comprise the Core Schema.
- the Core Schema is not extendable; that is, no additional Item types can be subtyped directly from the Item type in the Base Schema except for the specific predefined derived Item types that are part of the Core Schema.
- the storage platform mandates the use ofthe Core Schema Item types since every subsequent Item type is necessarily a subtype of a Core Schema Item type. This structure enables a reasonable degree of flexibility in defining additional Item types while also preserving the benefits of having a predefined set of core Item types.
- the specific Item types supported by the Core Schema may include one or more of the following: • Categories: Items of this Item Type (and subtypes derived therefrom) represent valid Categories in the Item-based hardware/software interface system. • Commodities: Items that are identifiable things of value. • Devices: Items having a logical structure that supports information processing capabilities. • Documents: Items with content that is not interpreted by the Item-based hardware/software interface system but is instead interpreted by an application program corresponding to the document type. • Events: Items that record certain occurrences in the environment. • Locations: Items representing physical locations (e.g., geographical locations). • Messages: Items of communication between two or more principals (defined below).
- the specific property types supported by the Core Schema may include one or more of the following: • Certificates (derived from the foundational PropertyBase type in the Base Schema) • Principal Identity Keys (derived from the IdentityKey type in the Base Schema) • Postal Address (derived from the Property type in the Base Schema) • Rich Text (derived from the Property type in the Base Schema) • EAddress (derived from the Property type in the Base Schema) • Identity SecurityPackage (derived from the Relationship type in the Base Schema) • RoleOccupancy (derived from the Relationship type in the Base Schema) • BasicPresence (derived from the Relationship type in the Base Schema)
- Relationships are binary relationships where one Item is designated as source and the other Item as target.
- the source Item and the target Item are related by the relationship.
- the source Item generally controls the life-time of the relationship. That is, when the source Item is deleted, the relationship between the Items is also deleted.
- Relationships are classified into: Containment and Reference relationships. The containment relationships control the life-time of the target Items, while the reference relationships do not provide any life-time management semantics.
- Fig. 12 illustrates the manner in which relationships are classified.
- the Containment relationship types are further classified into Holding and Embedding relationships. When all holding relationships to an Item are removed, the Item is deleted.
- a holding relationship controls the life-time of the target through a reference counting mechanism.
- the embedding relationships enable modeling of compound Items and can be thought of as exclusive holding relationships.
- An Item can be a target of one or more holding relationships; but an Item can be the target of exactly one embedding relationship.
- An Item that is a target of an embedding relationship cannot be a target of any other holding or embedding relationships.
- Reference relationships do not control the lifetime of the target Item. They may be dangling - the target Item may not exist. Reference relationships can be used to model references to Items anywhere in the global Item name space (i.e. including remote data stores).
- Fetching an Item does not automatically fetch its relationships. Applications must explicitly request the relationships of an Item.
- Relationship Declaration: The explicit relationship types are defined with the following elements: • A relationship name is specified in the Name attribute. • Relationship type, one of the following: Holding, Embedding, Reference. This is specified in the Type attribute. • Source and target endpoints. Each endpoint specifies a name and the type of the referenced Item. • The source endpoint field is generally of type ItemID (not declared) and it must reference an Item in the same data store as the relationship instance.
- for Holding and Embedding relationships, the target endpoint field must be of type ItemIDReference and it must reference an Item in the same store as the relationship instance.
- for Reference relationships, the target endpoint can be of any ItemReference type and can reference Items in other storage platform data stores.
- one or more fields of a scalar or PropertyBase type can be declared. These fields may contain data associated with the relationship.
- Relationship instances are stored in a global relationships table. • Every relationship instance is uniquely identified by the combination (source ItemID, relationship ID). The relationship ID is unique within a given source ItemID for all relationships sourced in a given Item regardless of their type. The source Item is the owner of the relationship.
- the storage platform API 322 provides mechanisms for exposing relationships associated with an Item.
- the relationship can not be created if the person Item that is referenced by the source reference does not exist. Also, if the person Item is deleted, the relationship instances between the person and organization are deleted. However, if the Organization Item is deleted, the relationship is not deleted and it is dangling.
- Holding relationships are used to model reference count based life-time management of the target Items.
- An Item can be a source endpoint for zero or more relationships to Items.
- An Item that is not an embedded Item can be a target of one or more holding relationships.
- the target endpoint reference type must be ItemIDReference and it must reference an Item in the same store as the relationship instance.
- Holding relationships enforce lifetime management of the target endpoint.
- the creation of a holding relationship instance and the Item that it is targeting is an atomic operation. Additional holding relationship instances can be created that are targeting the same Item. When the last holding relationship instance with a given Item as target endpoint is deleted the target Item is also deleted.
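The reference-counting behavior of holding relationships can be sketched with a toy in-memory store. The `Store` class, its method names, and the use of string identifiers are hypothetical; a real implementation would perform creation and deletion inside a transaction to make them atomic, which this sketch only approximates by doing both steps in one method.

```python
from typing import Set, Tuple


class Store:
    """Toy data store tracking holding relationships as (source, target) pairs."""

    def __init__(self) -> None:
        self.items: Set[str] = {"root"}
        self.holdings: Set[Tuple[str, str]] = set()

    def create_held_item(self, source: str, target: str) -> None:
        # The Item and its first holding relationship are created together.
        if source not in self.items:
            raise KeyError(source)
        self.items.add(target)
        self.holdings.add((source, target))

    def add_holding(self, source: str, target: str) -> None:
        # Additional holding relationships may target an existing Item.
        if source not in self.items or target not in self.items:
            raise KeyError((source, target))
        self.holdings.add((source, target))

    def remove_holding(self, source: str, target: str) -> None:
        self.holdings.remove((source, target))
        # When the last holding relationship is gone, the target is deleted.
        if not any(t == target for _, t in self.holdings):
            self.delete_item(target)

    def delete_item(self, item: str) -> None:
        self.items.discard(item)
        # Relationships are owned by their source, so they are removed with it.
        for s, t in [h for h in self.holdings if h[0] == item]:
            self.remove_holding(s, t)


store = Store()
store.create_held_item("root", "folderA")
store.create_held_item("root", "folderB")
store.create_held_item("folderA", "doc")
store.add_holding("folderB", "doc")
```

Removing the holding relationship from `folderA` leaves `doc` in place because `folderB` still holds it; removing the second one deletes `doc`.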
- the types of the endpoint Items specified in the relationship declaration will generally be enforced when an instance of the relationship is created. The types of the endpoint Items cannot be changed after the relationship is established.
- Holding relationships play a key role in forming the Item namespace.
- the holding relationships form a directed acyclic graph (DAG).
- DAG directed acyclic graph
- the FolderMembers relationship enables the concept of a Folder as a generic collection of Items.
- Embedding relationships model the concept of exclusive control of the lifetime of the target Item. They enable the concept of compound Items.
- the creation of an embedding relationship instance and the Item that it is targeting is an atomic operation.
- An Item can be a source of zero or more embedding relationships.
- an Item can be a target of one and only one embedding relationship.
- An Item that is a target of an embedding relationship can not be a target of a holding relationship.
- the target endpoint reference type must be ItemIDReference and it must reference an Item in the same data store as the relationship instance.
- the types of the endpoint Items specified in the relationship declaration will generally be enforced when an instance of the relationship is created.
- the types of the endpoint Items cannot be changed after the relationship is established.
- Embedding relationships control the operational consistency of the target endpoint. For example, the operation of serializing an Item may include serialization of all the embedding relationships that source from that Item as well as all of their targets; copying an Item also copies all its embedded Items.
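The copy behavior of compound Items can be sketched as a recursive copy over embedded targets. The `Item` class and `copy_item` function are hypothetical, and giving each copy a fresh identity is an assumption based on the rule that an Item has exactly one identity in the data store.

```python
from dataclasses import dataclass, field
from typing import List
from uuid import UUID, uuid4


@dataclass
class Item:
    name: str
    embedded: List["Item"] = field(default_factory=list)  # embedding targets
    item_id: UUID = field(default_factory=uuid4)


def copy_item(item: Item) -> Item:
    """Copy an Item together with everything it embeds; each copy gets a new ID."""
    return Item(name=item.name, embedded=[copy_item(child) for child in item.embedded])


message = Item("message", [Item("attachment")])
duplicate = copy_item(message)
```

Because an embedded Item has exactly one embedding source, the recursion visits each embedded Item once and cannot encounter shared targets.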
- the reference relationships do not guarantee the existence of the target, nor do they guarantee the type of the target as specified in the relationship declaration. This means that the reference relationships can be dangling. Also, the reference relationship can reference Items in other data stores. Reference relationships can be thought of as a concept similar to links in web pages.
- Any reference type is allowed in the target endpoint.
- the Items that participate in a reference relationship can be of any Item type.
- Reference relationships are used to model most non-lifetime management relationships between Items. Since the existence of the target is not enforced, the reference relationship is convenient to model loosely-coupled relationships. The reference relationship can be used to target Items in other data stores including stores on other computers.
- Rules and Constraints: The following additional rules and constraints apply for relationships: • An Item must be a target of (exactly one embedding relationship) or (one or more holding relationships). One exception is the root Item. An Item can be a target of zero or more reference relationships. • An Item that is a target of an embedding relationship cannot be the source of holding relationships.
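The rules above can be expressed as a small consistency check. The tuple encoding of relationships and the `violations` function are hypothetical constructs for this sketch, which checks only the two bulleted rules and the root exception.

```python
from typing import Iterable, List, Tuple

# (kind, source, target); kind is "holding", "embedding", or "reference".
Relationship = Tuple[str, str, str]


def violations(items: Iterable[str], rels: Iterable[Relationship], root: str) -> List[str]:
    """Return a message for each Item that breaks a containment rule."""
    rels = list(rels)
    problems: List[str] = []
    for item in items:
        embeds = sum(1 for k, _, t in rels if k == "embedding" and t == item)
        holds = sum(1 for k, _, t in rels if k == "holding" and t == item)
        contained = (embeds == 1 and holds == 0) or (embeds == 0 and holds >= 1)
        if item != root and not contained:
            problems.append(f"{item}: needs exactly one embedding or at least one holding")
        if embeds and any(k == "holding" and s == item for k, s, _ in rels):
            problems.append(f"{item}: embedded Items cannot source holding relationships")
    return problems


items = ["root", "folder", "doc", "part"]
valid = [
    ("holding", "root", "folder"),
    ("holding", "folder", "doc"),
    ("embedding", "doc", "part"),
    ("reference", "part", "elsewhere"),  # reference targets may dangle
]
```

Reference relationships are ignored by the check, since an Item may be the target of any number of them.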
- the storage platform of the present invention supports ordering of relationships.
- the ordering is achieved through a property named "Order" in the base relationship definition.
- there is no uniqueness constraint on the "Order" field. The order of the relationships with the same "Order" property value is not guaranteed; however, it is guaranteed that they may be ordered after relationships with a lower "Order" value and before relationships with a higher "Order" value.
- Applications can get the relationships in the default order by ordering on the combination ( SourceItemID, RelationshipID, Order). All relationship instances sourced from a given Item are ordered as a single collection regardless of the type of the relationships in the collection. This however guarantees that all relationships of a given type (e.g., FolderMembers) are an ordered subset of the relationship collection for a given Item.
- the data store API 312 for manipulating relationships implements a set of operations that support ordering of relationships.
- the value of the "Order" property of the new relationship may be smaller than OrdFirst.
- InsertAfterLast( SourceItemID, Relationship ) inserts the relationship as the last relationship in the collection. The value of the "Order" property of the new relationship may be greater than OrdLast.
- InsertAt( SourceItemID, ord, Relationship ) inserts a relationship with the specified value for the "Order" property.
- InsertBefore( SourceItemID, ord, Relationship ) inserts the relationship before the relationship with the given order value.
- the new relationship may be assigned an "Order" value that is between OrdPrev and ord, non-inclusive.
- InsertAfter( SourceItemID, ord, Relationship ) inserts the relationship after the relationship with the given order value.
- the new relationship may be assigned an "Order" value that is between ord and OrdNext, non-inclusive.
- MoveBefore( SourceItemID, ord, RelationshipID ) moves the relationship with the given relationship ID before the relationship with the specified "Order" value.
- the relationship may be assigned a new "Order" value that is between OrdPrev and ord, non-inclusive.
- MoveAfter( SourceItemID, ord, RelationshipID ) moves the relationship with the given relationship ID after the relationship with the specified "Order" value.
- the relationship may be assigned a new "Order" value that is between ord and OrdNext, non-inclusive.
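One way to realize these ordering operations is to store "Order" as a floating-point value and pick the midpoint between neighbors. This is only a sketch under that assumption: the text does not state the type of the "Order" property or how intermediate values are chosen, and the class and method names below are hypothetical Python counterparts of the listed operations for a single source Item.

```python
from typing import Dict, List, Optional


class OrderedRelationships:
    """Relationships sourced from one Item: relationship ID -> Order value."""

    def __init__(self) -> None:
        self.order: Dict[str, float] = {}

    def _values(self) -> List[float]:
        return sorted(set(self.order.values()))

    def _prev(self, ord_: float) -> Optional[float]:
        lower = [o for o in self._values() if o < ord_]
        return lower[-1] if lower else None  # OrdPrev, if any

    def _next(self, ord_: float) -> Optional[float]:
        higher = [o for o in self._values() if o > ord_]
        return higher[0] if higher else None  # OrdNext, if any

    def insert_after_last(self, rel_id: str) -> None:
        values = self._values()
        self.order[rel_id] = values[-1] + 1.0 if values else 0.0

    def insert_before(self, ord_: float, rel_id: str) -> None:
        prev = self._prev(ord_)
        # Strictly between OrdPrev and ord; step below ord if nothing precedes it.
        self.order[rel_id] = (prev + ord_) / 2 if prev is not None else ord_ - 1.0

    def insert_after(self, ord_: float, rel_id: str) -> None:
        nxt = self._next(ord_)
        # Strictly between ord and OrdNext; step above ord if nothing follows it.
        self.order[rel_id] = (ord_ + nxt) / 2 if nxt is not None else ord_ + 1.0

    def move_before(self, ord_: float, rel_id: str) -> None:
        del self.order[rel_id]
        self.insert_before(ord_, rel_id)

    def listing(self) -> List[str]:
        # Ties on Order have no guaranteed sequence; the ID is used as a tiebreak.
        return sorted(self.order, key=lambda r: (self.order[r], r))


rels = OrderedRelationships()
for name in ("a", "b", "c"):
    rels.insert_after_last(name)  # Order values 0.0, 1.0, 2.0
rels.insert_before(1.0, "x")      # lands between "a" and "b"
```

Repeated midpoint insertion eventually exhausts floating-point precision, so a practical implementation would need to renumber occasionally or use a different key scheme.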
- a Relationship provides a directed binary relationship that is "extended" by one Item (the source) to another Item (the target).
- a Relationship is owned by the source Item (the Item that extended it), and thus the Relationship is removed if the source is removed (e.g., the Relationship is deleted when the source Item is deleted).
- a Relationship may share ownership of (co-own) the target Item, and such ownership might be reflected in the IsOwned property (or its equivalent) of the Relationship (as shown in Fig. 7 for the Relationship property type).
- creation of a new IsOwned Relationship automatically increments a reference count on the target Item, and deletion of such a Relationship may decrement the reference count on the target Item.
- Items continue to exist if they have a reference count greater than zero, and are automatically deleted if and when the count reaches zero.
- an Item Folder is an Item that has (or is capable of having) a set of Relationships to other Items, these other Items comprising the membership of the Item Folder.
- Other actual implementations of Relationships are possible and anticipated by the present invention to achieve the functionality described herein.
- a Relationship is a selectable connection from one object to another. The ability for an Item to belong to more than one Item Folder, as well as to one or more Categories, and whether these Items, Folders, and Categories are public or private, is determined by the meanings given to the existence (or lack thereof) in an Item-based structure.
- Logical Relationships are the meanings assigned to a set of Relationships, regardless of physical implementation, which are specifically employed to achieve the functionality described herein.
- Logical Relationships are established between the Item and its Item Folder(s) or Categories (and vice versa) because, in essence, Item Folders and Categories are each a special type of Item. Consequently, Item Folders and Categories can be acted upon the same way as any other Item (copied, added to an email message, embedded in a document, and so forth without limitation), and Item Folders and Categories can be serialized and de-serialized (imported and exported) using the same mechanisms as for other Items.
- the aforementioned Relationships which represent the relationship between an Item and its Item Folder(s) can logically extend from the Item to the Item Folder, from the Item Folder to the Item, or both.
- a Relationship that logically extends from an Item to an Item Folder denotes that the Item Folder is public to that Item and shares its membership information with that Item; conversely, the lack of a logical Relationship from an Item to an Item Folder denotes that the Item Folder is private to that Item and does not share its membership information with that Item.
- a Relationship that logically extends from an Item Folder to an Item denotes that the Item is public and sharable to that Item Folder.
- the lack of a logical Relationship from the Item Folder to the Item denotes that the Item is private and non-sharable.
- FIG. 9 is a block diagram illustrating an Item Folder (which, again, is an Item itself), its member Items, and the interconnecting Relationships between the Item Folder and its member Items.
- the Item Folder 900 has as members a plurality of Items 902, 904, and 906.
- Item Folder 900 has a Relationship 912 from itself to Item 902 which denotes that the Item 902 is public and sharable to Item Folder 900, its members 904 and 906, and any other Item Folders, Categories, or Items (not shown) that might access Item Folder 900.
- for Item 902, there is no Relationship from Item 902 to the Item Folder 900, which denotes that Item Folder 900 is private to Item 902 and does not share its membership information with Item 902.
- Item 904 does have a Relationship 924 from itself to Item Folder 900, which denotes that the Item Folder 900 is public and shares its membership information with Item 904.
- for Item 904, there is no Relationship from the Item Folder 900 to Item 904, which denotes that Item 904 is private and not sharable to Item Folder 900, its other members 902 and 906, and any other Item Folders, Categories, or Items (not shown) that might access Item Folder 900.
- In contrast with its Relationships (or lack thereof) to Items 902 and 904, Item Folder 900 has a Relationship 916 from itself to the Item 906 and Item 906 has a Relationship 926 back to Item Folder 900, which together denote that Item 906 is public and sharable to Item Folder 900, its members 902 and 904, and any other Item Folders, Categories, or Items (not shown) that might access Item Folder 900, and that Item Folder 900 is public and shares its membership information with Item 906.
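The public/private semantics described for Item Folder 900 can be checked mechanically from the direction of each Relationship. In this sketch the edge set mirrors the example above (Relationships 912, 924, 916, and 926), while the function names are hypothetical.

```python
from typing import Set, Tuple

Edge = Tuple[int, int]  # (source, target) of a directed Relationship

# Relationships 912, 924, 916 and 926 from the Item Folder 900 example.
FOLDER_900_EDGES: Set[Edge] = {(900, 902), (904, 900), (900, 906), (906, 900)}


def item_is_public_to_folder(edges: Set[Edge], folder: int, item: int) -> bool:
    """A Relationship from the folder to the Item makes the Item public and sharable."""
    return (folder, item) in edges


def folder_is_public_to_item(edges: Set[Edge], folder: int, item: int) -> bool:
    """A Relationship from the Item to the folder makes the folder share its membership."""
    return (item, folder) in edges
```

Item 902 is public but sees a private folder, Item 904 is private but sees a public folder, and Item 906 is public in both directions.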
- A Category is described by a commonality shared by all of its member Items. Consequently the membership of a Category is inherently limited to Items having the described commonality and, in certain embodiments, all Items meeting the description of a Category are automatically made members of the Category. Thus, whereas Item Folders allow trivial type structures to be represented by their membership, Categories allow membership based on the defined commonality. [0168] Of course Category descriptions are logical in nature, and therefore a Category may be described by any logical representation of types, properties, and/or values.
- For example, a logical representation for a Category may be that its membership comprises Items having one of two properties or both. If these described properties for the Category are "A" and "B", then the Category's membership may comprise Items having property A but not B, Items having property B but not A, and Items having both properties A and B.
- This logical representation of properties is described by the logical operator "OR", where the set of members described by the Category are Items having property A OR B. Similar logical operators (including without limitation "AND", "XOR", and "NOT", alone or in combination) can also be used to describe a Category, as will be appreciated by those of skill in the art.
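A Category description of this kind is simply a predicate over an Item's properties. The following sketch assumes Items are represented as a name mapped to a set of property labels; `category_members` and `has_a_or_b` are hypothetical names.

```python
from typing import Callable, Dict, List, Set


def category_members(
    items: Dict[str, Set[str]],
    describes: Callable[[Set[str]], bool],
) -> List[str]:
    """Return the names of all Items whose properties satisfy the Category description."""
    return sorted(name for name, props in items.items() if describes(props))


def has_a_or_b(props: Set[str]) -> bool:
    """The "A OR B" commonality from the example above."""
    return "A" in props or "B" in props


ITEMS: Dict[str, Set[str]] = {
    "item1": {"A"},
    "item2": {"B"},
    "item3": {"A", "B"},
    "item4": {"C"},
}
```

Swapping the predicate for `lambda p: ("A" in p) != ("B" in p)` would express the XOR variant without touching the membership logic.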
- FIG. 10 is a block diagram illustrating a Category (which, again, is an Item itself), its member Items, and the interconnecting Relationships between the Category and its member Items.
- the Category 1000 has as members a plurality of Items 1002, 1004, and 1006, all of which share some combination of common properties, values, or types 1008 as described (commonality description 1008') by the Category 1000.
- Category 1000 has a Relationship 1012 from itself to Item 1002 which denotes that the Item 1002 is public and sharable to Category 1000, its members 1004 and 1006, and any other Categories, Item Folders, or Items (not shown) that might access Category 1000.
- Item 1004 does have a Relationship 1024 from itself to Category 1000, which denotes that the Category 1000 is public and shares its membership information with Item 1004.
- Category 1000 has a Relationship 1016 from itself to Item 1006 and Item 1006 has a Relationship 1026 back to Category 1000, which together denote that Item 1006 is public and sharable to Category 1000, its Item members 1002 and 1004, and any other Categories, Item Folders, or Items (not shown) that might access Category 1000, and that the Category 1000 is public and shares its membership information with Item 1006.
- Item Folder structures and/or Category structures are prohibited, at the hardware/software interface system level, from containing cycles.
- Item Folder and Category structures are akin to directed graphs.
- the embodiments that prohibit cycles are akin to directed acyclic graphs (DAGs) which, by mathematical definition in the art of graph theory, are directed graphs wherein no path starts and ends at the same vertex.
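One way to enforce the acyclic constraint is to reject any new Relationship whose target can already reach its source. The sketch below is an assumption about how such a check might look, not the platform's mechanism; `reaches` and `add_relationship` are hypothetical names.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # source Item -> set of target Items


def reaches(graph: Graph, start: str, goal: str) -> bool:
    """Iterative depth-first search: is there a path from start to goal?"""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return False


def add_relationship(graph: Graph, source: str, target: str) -> None:
    """Add source -> target unless doing so would close a cycle."""
    if source == target or reaches(graph, target, source):
        raise ValueError(f"{source} -> {target} would create a cycle")
    graph.setdefault(source, set()).add(target)
```

Checking before insertion keeps the structure a DAG at all times, at the cost of a traversal per insert.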
- the storage platform is intended to be provided with an initial set of schemas 340, as described above.
- the storage platform allows customers, including independent software vendors (ISVs), to create new schemas 344 (i.e., new Item and Nested Element types).
- This section addresses the mechanism for creating such schemas by extending the Item types and Nested Element types (or simply "Element" types) defined in the initial set of schemas 340.
- extension of the initial set of Item and Nested Element types is constrained as follows: • an ISV is allowed to introduce new Item types, i.e. subtype Base.Item; • an ISV is allowed to introduce new Nested Element types, i.e. subtype Base.NestedElement; • an ISV is allowed to introduce new extensions, i.e. subtype Base.Extension; but, • an ISV cannot subtype any types (Item, Nested Element, or Extension types) defined by the initial set of storage platform schemas 340.
- Extensions are strongly typed instances but (a) they cannot exist independently and (b) they must be attached to an Item or Nested Element.
- Extensions are also intended to address the "multi-typing" issue. Since, in some embodiments, the storage platform may not support multiple inheritance or overlapping subtypes, applications can use Extensions as a way to model overlapping type instances (e.g. a Document is a legal document as well as a secure document).
- the ItemID field contains the ItemID of the item that the extension is associated with. An Item with this ItemID must exist. The extension cannot be created if the item with the given ItemID does not exist.
- When the Item is deleted, all the extensions with the same ItemID are deleted.
- the tuple (ItemID,ExtensionID) uniquely identifies an extension instance.
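The three rules above (the Item must exist, extensions are deleted with their Item, and the (ItemID, ExtensionID) pair is the identity) can be sketched with a dictionary keyed by that pair. `ExtensionStore` and its methods are hypothetical names used only for illustration.

```python
from typing import Any, Dict, Set, Tuple


class ExtensionStore:
    """Hypothetical model of Item extensions keyed by (ItemID, ExtensionID)."""

    def __init__(self) -> None:
        self.items: Set[str] = set()
        self.extensions: Dict[Tuple[str, str], Any] = {}

    def add_item(self, item_id: str) -> None:
        self.items.add(item_id)

    def add_extension(self, item_id: str, extension_id: str, data: Any) -> None:
        if item_id not in self.items:
            # The extension cannot be created without its Item.
            raise KeyError(f"no Item with ItemID {item_id!r}")
        key = (item_id, extension_id)
        if key in self.extensions:
            raise ValueError(f"extension {key!r} already exists")
        self.extensions[key] = data

    def delete_item(self, item_id: str) -> None:
        self.items.discard(item_id)
        # Deleting the Item deletes every extension that carries its ItemID.
        for key in [k for k in self.extensions if k[0] == item_id]:
            del self.extensions[key]
```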
- Extension types have fields; • Fields can be of primitive or nested element types; and • Extension types can be sub-typed.
- Extension types can not be sources and targets of relationships; • Extension type instances can not exist independently from an item; and • Extension types can not be used as field types in the storage platform type definitions
- There are no constraints on the types of extensions that can be associated with a given Item type. Any extension type is allowed to extend any item type. When multiple extension instances are attached to an item, they are independent from each other in both structure and behavior. [0182] The extension instances are stored and accessed separately from the item. All extension type instances are accessible from a global extension view. An efficient query can be composed that will return all the instances of a given type of extension regardless of what type of item they are associated with. The storage platform APIs provide a programming model that can store, retrieve and modify extensions on items. [0183] The extension types can be sub-typed using the storage platform single inheritance model. Deriving from an extension type creates a new extension type.
- Extension type instances can be directly accessed through the view associated with the extension type.
- the ItemID of the extension indicates which item it belongs to and can be used to retrieve the corresponding Item object from the global Item view.
- the extensions are considered part of the item for the purposes of operational consistency.
- the Copy/Move, Backup/Restore and other common operations that the storage platform defines may operate on the extensions as part of the item.
- a Contact type is defined in the Windows Type set.
- CRMExtension and HRExtension are two independent extensions that can be attached to Contact items. They are created and accessed independently of each other. [0188] In the above example, the fields and methods of the CRMExtension type cannot override fields or methods of the Contact hierarchy. It should be noted that instances of the CRMExtension type can be attached to Item types other than Contact.
- NestedElement: the nested element is part of the item.
- Storage. Item: the Item hierarchy is stored in its own tables. Item Extension: the Item extension hierarchy is stored in its own tables. NestedElement: stored with the item.
- Query/Search. Item: can query item tables. Item Extension: can query item extension tables. NestedElement: can generally be queried only within the containing item context.
- Query/Search scope. Item: can search across all instances of an item type. Item Extension: can search across all instances of an item extension type. NestedElement: can generally only search within nested element type instances of a single (containing) item.
- Relationship semantics. Item: can have Relationships to items. Item Extension: no Relationships to item extensions. NestedElement: no Relationships to nested elements.
- Association to items. Item: can be related to other items via Relationships, including soft Relationships. Item Extension: can generally only be related via extension semantics, which are similar to those of an embedded item. NestedElement: related to the item via fields; the elements are part of the item.
- b) Extending NestedElement types
- [0192] Nested Element types are not extended with the same mechanism as the Item types. Extensions of nested elements are stored and accessed with the same mechanisms as fields of nested element types.
- the NestedElement type inherits from this type.
- the NestedElement element type additionally defines a field that is a multi-set of Elements.
- <Type Name="NestedElement" BaseType="Base.
- NestedElement extensions are different from item extensions in the following ways: • Nested element extensions are not extension types. They do not belong to the extension type hierarchy that is rooted in the Base.Extension type. • Nested element extensions are stored along with the other fields of the item and are not globally accessible; a query cannot be composed that retrieves all instances of a given extension type. • These extensions are stored the same way as other nested elements (of the item) are stored. Like other nested sets, the NestedElement extensions are stored in a UDT. They are accessible through the Extensions field of the nested element type.
- Item extensions vs NestedElement extensions:
- Storage. Item Extension: the Item extension hierarchy is stored in its own tables. NestedElement Extension: stored like nested elements.
- Query/Search. Item Extension: can query item extension tables. NestedElement Extension: can generally only be queried within the containing item context.
- Query/Search scope. Item Extension: can search across all instances of an item extension type. NestedElement Extension: can generally only search within nested element type instances of a single (containing) item.
- Programmability. Item Extension: needs special extension APIs and special querying on extension tables. NestedElement Extension: NestedElement extensions are like any other multi-valued field of a nested element; normal nested element type APIs are used.
- Behavior. Item Extension: can associate behavior. NestedElement Extension: no behavior permitted (?).
- Relationship semantics. Item Extension: no Relationships to item extensions. NestedElement Extension: no Relationships to NestedElement extensions.
- Item ID. Item Extension: shares the item id of the item. NestedElement Extension: does not have its own item id.
- the data store is implemented on a database engine.
- the database engine comprises a relational database engine that implements the SQL query language, such as the Microsoft SQL Server engine, with object relational extensions.
- This section describes the mapping of the data model that the data store implements to the relational store and provides information on the logical API consumed by storage platform clients, in accordance with the present embodiment. It is understood, however, that a different mapping may be employed when a different database engine is employed. Indeed, in addition to implementing the storage platform conceptual data model on a relational database engine, it can also be implemented on other types of databases, e.g. object-oriented and XML databases.
- An object-oriented (OO) database system provides persistence and transactions for programming language objects (e.g. C++, Java).
- the storage platform notion of an "item” maps well to an "Object” in object-oriented systems, though embedded collections would have to be added to Objects.
- Other storage platform type concepts, like inheritance and nested element types, also map to object-oriented type systems.
- Object-oriented systems typically already support object identity; hence, item identity can be mapped to object identity.
- the item behaviors (operations) map well to object methods.
- object-oriented systems typically lack organizational capabilities and are poor in searching. Also, object-oriented systems do not provide support for unstructured and semi-structured data.
- XML databases: Similar to object-oriented systems, XML databases, based on XSD (XML Schema Definition), support a single-inheritance based type system. The item type system of the present invention could be mapped to the XSD type model. XSDs also do not provide support for behaviors. The XSDs for items would have to be augmented with item behaviors. XML databases deal with single XSD documents and lack organization and broad search capabilities.
- Fig. 13 is a diagram illustrating a notification mechanism.
- Fig. 14 is a diagram illustrating an example in which two transactions are both inserting a new record into the same B-Tree.
- Fig. 15 illustrates a data change detection process.
- Fig. 16 illustrates an exemplary directory tree.
- Fig. 17 shows an example in which an existing folder of a directory-based file system is moved into the storage platform data store.
- the relational database engine 314, which in one embodiment comprises the Microsoft SQL Server engine, supports built-in scalar types.
- Built-in scalar types are "native" and “simple”. They are native in the sense that the user cannot define their own types and they are simple in that they cannot encapsulate a complex structure.
- User-defined types (hereinafter: UDTs) provide a mechanism for type extensibility above and beyond the native scalar type system by enabling users to extend the type system by defining complex, structured types.
- a UDT can be used anywhere in the type system that a built-in scalar type might be used.
- the storage platform schemas are mapped to UDT classes in the database engine store.
- Data store Items are mapped to UDT classes deriving from the Base.Item type.
- Like Items, Extensions are also mapped to UDT classes and make use of inheritance.
- the root Extension type is Base.Extension, from which all Extension types are derived.
- a UDT is a CLR class - it has state (i.e., data fields) and behavior (i.e., routines).
- UDTs are defined using any of the managed languages (C#, VB.NET, etc.).
- UDT methods and operators can be invoked in T-SQL against an instance of that type.
- a UDT can be: the type of a column in a row, the type of a parameter of a routine in T-SQL, or the type of a variable in T-SQL.
- the mapping of storage platform schemas to UDT classes is fairly straightforward at a high level. Generally, a storage platform Schema is mapped to a CLR namespace. A storage platform Type is mapped to a CLR class. The CLR class inheritance mirrors the storage platform Type inheritance, and a storage platform Property is mapped to a CLR class property.
- Item Mapping
- Given the desirability for Items to be globally searchable, and the support in the relational database of the present embodiment for inheritance and type substitutability, one possible implementation for Item storage in the database store would be to store all Items in a single table with a column of type Base.Item. Using type substitutability, Items of all types could be stored, and searches could be filtered by Item type and sub-type using Yukon's "is of (Type)" operator. [0206] However, due to concerns about the overhead associated with such an approach, in the present embodiment, the Items are divided by top-level type, such that Items of each type "family" are stored in a separate table.
- a "shadow" table is used to store copies of globally searchable properties for all Items. This table may be maintained by the Update() method of the storage platform API, through which all data changes are made. Unlike the type family tables, this global Item table contains only the top-level scalar properties of the Item, not the full UDT Item object.
- the global Item table allows navigation to the Item object stored in a type family table by exposing an ItemID and a TypeID.
- the ItemID will generally uniquely identify the Item within the data store.
- the TypeID may be mapped using metadata, which is not described here, to a type name and the view containing the Item. Since finding an Item by its ItemID may be a common operation, both in the context of the global Item table and otherwise, a GetItem() function is provided to retrieve an Item object given an Item's ItemID.
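The lookup path described above (global Item table, then TypeID metadata, then the type family table) can be sketched with plain dictionaries. This is an illustration under assumptions, not the relational mapping itself: `DataStore`, `update`, and `get_item` are hypothetical Python stand-ins for the shadow table, Update(), and GetItem().

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class DataStore:
    # Global ("shadow") Item table: ItemID -> TypeID.
    global_items: Dict[str, int] = field(default_factory=dict)
    # Metadata: TypeID -> name of the type family table holding the object.
    type_tables: Dict[int, str] = field(default_factory=dict)
    # Type family tables: table name -> {ItemID: full Item object}.
    family_tables: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def update(self, item_id: str, type_id: int, family: str, item: Any) -> None:
        """All changes go through one method so the shadow table stays in step."""
        self.type_tables[type_id] = family
        self.family_tables.setdefault(family, {})[item_id] = item
        self.global_items[item_id] = type_id

    def get_item(self, item_id: str) -> Any:
        """Resolve ItemID -> TypeID -> family table -> Item object."""
        type_id = self.global_items[item_id]
        return self.family_tables[self.type_tables[type_id]][item_id]
```

Routing every write through `update` is what lets the global table remain a faithful index over the separate family tables.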
- all queries of Items might be against views built on the Item tables described above. Specifically, views may be created for each Item type against the appropriate type family table. These type views may select all Items of the associated type, including sub-types.
- Extensions are very similar to Items and have some of the same requirements. As another root type supporting inheritance, Extensions are subject to many of the same considerations and trade-offs in storage. Because of this, a similar type family mapping is applied to Extensions, rather than a single table approach. Of course, in other embodiments, a single table approach could be used.
- an Extension is associated with exactly one Item by ItemID, and contains an ExtensionID that is unique in the context of the Item.
- a function might be provided to retrieve an Extension given its identity, which consists of an ItemID and ExtensionID pair.
- Nested Elements are types that can be embedded in Items, Extensions, Relationships, or other Nested Elements to form deeply nested structures. Like Items and Extensions, Nested Elements are implemented as UDTs, but they are stored within Items and Extensions. Therefore, Nested Elements have no storage mapping beyond that of their Item and Extension containers. In other words, there are no tables in the system which directly store instances of NestedElement types, and there are no views dedicated specifically to Nested Elements.
- 5. Object Identity
- [0211] Each entity in the data model, i.e., each Item, Extension and Relationship, has a unique key value.
- An Item is uniquely identified by its ItemId.
- An Extension is uniquely identified by a composite key of (ItemId, ExtensionId).
- a Relationship is identified by a composite key (ItemId, RelationshipId).
- ItemId, ExtensionId and RelationshipId are GUID values.
- SQL Object Naming
- All objects created in the data store can be stored in a SQL schema name derived from the storage platform schema name. For example, the storage platform Base schema (often called "Base") may produce types in the "[System.Storage]" SQL schema such as "[System.Storage].Item". Generated names are prefixed by a qualifier to eliminate naming conflicts.
- an exclamation character (!) is used as a separator for each logical part of the name.
- The table below outlines the naming convention used for objects in the data store. Each schema element (Item, Extension, Relationship and View) is listed along with the decorated naming convention used to access instances in the data store.
- views are provided to support Relationships and Views (as defined by the Data Model). All SQL views and underlying tables in the storage platform are read-only. Data may be stored or changed using the Update() method of the storage platform API, as described more fully below.
- Each view explicitly defined in a storage platform schema (defined by the schema designer, and not automatically generated by the storage platform) is accessible by the named SQL view [<schema-name>].[View!<view-name>]. For example, a view named "BookSales" in the schema "AcmePublisher.Books" would be accessible using the name "[AcmePublisher.Books].[View!BookSales]".
- Type specific column(s) (Properties of the declared type) • Type specific views (family views) also contain an object column which returns the object.
- Members of each type family are searchable using a series of Item views, with there being one view per Item type in the data store.
- Fig. 28 is a diagram illustrating the concept of an Item search view.
- a) Item
- Each Item search view contains a row for each instance of an Item of the specific type or its subtypes. For example, the view for Document could return instances of Document, LegalDocument and ReviewDocument. Given this example, the Item views can be conceptualized as shown in Fig. 29.
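The subtype-inclusive behavior of an Item search view can be mimicked with an `isinstance` filter over a type family table. The class names follow the Document example above; the `type_view` function and the in-memory list are hypothetical.

```python
from typing import List, Type


class Item:
    def __init__(self, item_id: str) -> None:
        self.item_id = item_id


class Document(Item):
    pass


class LegalDocument(Document):
    pass


class ReviewDocument(Document):
    pass


def type_view(family_table: List[Item], item_type: Type[Item]) -> List[str]:
    """Return the ItemIDs of every row of the given type, including its subtypes."""
    return [row.item_id for row in family_table if isinstance(row, item_type)]


DOCUMENT_FAMILY: List[Item] = [
    Document("d1"),
    LegalDocument("d2"),
    ReviewDocument("d3"),
]
```

The Document view returns all three rows, while the LegalDocument view returns only the one row of that subtype.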
- Master Item Search View
- Each instance of a storage platform data store defines a special Item view called the Master Item View. This view provides summary information on each Item in the data store. The view provides one column per Item type property, a column which describes the type of the Item, and several columns which are used to provide change tracking and synchronization information. The master item view is identified in a data store using the name "[System.Storage].[Master!Item]".
- Each Item type also has a search view. While similar to the root Item view, this view also provides access to the Item object via the "_Item" column.
- Each typed item search view is identified in a data store using the name [schemaName].[itemTypeName]. For example, [AcmeCorp.Doc].[OfficeDoc].
- Item Extensions
- [0221] All Item Extensions in a WinFS Store are also accessible using search views.
- (1) Master Extension Search View
- [0222] Each instance of a data store defines a special Extension view called the Master Extension View. This view provides summary information on each Extension in the data store. The view has a column per Extension property, a column which describes the type of the Extension, and several columns which are used to provide change tracking and synchronization information.
- the master extension view is identified in a data store using the name "[System.Storage].[Master!Extension]".
- Each Extension type also has a search view. While similar to the master extension view, this view also provides access to the Item object via the _Extension column.
- Each typed extension search view is identified in a data store using the name [schemaName].[Extension!extensionTypeName]. For example, [AcmeCorp.Doc].[Extension!OfficeDocExt].
- Each declared Relationship also has a search view which returns all instances of the particular relationship. While similar to the master relationship view, this view also provides named columns for each property of the relationship data.
- Each relationship instance search view is identified in a data store using the name [schemaName].[Relationship!relationshipName]. For example, [AcmeCorp.Doc].[Relationship!DocumentAuthor].
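The decorated names above follow one pattern: a bracketed SQL schema, then a bracketed object name whose logical parts are separated by "!". A small sketch of name builders follows; the function names are hypothetical, while the name shapes are the ones given in the text.

```python
def typed_item_view(schema: str, item_type: str) -> str:
    """[schemaName].[itemTypeName]"""
    return f"[{schema}].[{item_type}]"


def decorated_view(schema: str, qualifier: str, name: str) -> str:
    """[schemaName].[Qualifier!name], with '!' separating the logical parts."""
    return f"[{schema}].[{qualifier}!{name}]"


def schema_view(schema: str, view: str) -> str:
    return decorated_view(schema, "View", view)


def extension_view(schema: str, extension_type: str) -> str:
    return decorated_view(schema, "Extension", extension_type)


def relationship_view(schema: str, relationship: str) -> str:
    return decorated_view(schema, "Relationship", relationship)


def master_view(kind: str) -> str:
    """Master views live in the System.Storage schema, e.g. kind='Item'."""
    return decorated_view("System.Storage", "Master", kind)
```

Because "!" is a separator inside a bracketed identifier, typed Item views (which carry no qualifier) cannot collide with generated View, Extension, or Relationship names.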
- a. CreateItem (creates a new item in the context of an embedding or holding relationship) b. UpdateItem (updates an existing Item)
- Relationship operations a. CreateRelationship (creates an instance of a reference or holding relationship) b. UpdateRelationship (updates a relationship instance) c. DeleteRelationship (removes a relationship instance)
- Change tracking information on the AcmeCorp.Document.Document Item type is available from the Master Item View "[System.Storage].[Master!Item]" and the typed Item search view [AcmeCorp.Document].[Document].
- Change tracking information in the master search views provides information on the creation and update versions of an element, information on which sync partner created the element, which sync partner last updated the element and the version numbers from each partner for creation and update. Partners in sync relationships (described below) are identified by partner key.
- a single UDT object named _ChangeTrackingInfo of type [System.Storage.Store].ChangeTrackingInfo contains all this information. The type is defined in the System.Storage schema. _ChangeTrackingInfo is available in all global search views for Item, Extension and Relationship. The type definition of ChangeTrackingInfo is:
- each typed search view provides additional information recording the sync state of each element in the sync topology.
- Tombstones The data store provides tombstone information for Items, Extensions and Relationships. The tombstone views provide information about both live and tombstoned entities (items, extensions and relationships) in one place. The item and extension tombstone views do not provide access to the corresponding object, while the relationship tombstone view provides access to the relationship object (the relationship object is NULL in the case of a tombstoned relationship).
- Item Tombstones [0236] Item tombstones are retrieved from the system via the view [System.Storage].[Tombstone!Item].
- Extension Tombstones are retrieved from the system using the view [System.Storage].[Tombstone!Extension]. Extension change tracking information is similar to that provided for Items with the addition of the ExtensionId property.
- Relationships Tombstone [0238] Relationship tombstones are retrieved from the system via the view [System.Storage].[Tombstone!Relationship]. Relationships tombstone information is similar to that provided for Extensions. However, additional information is provided on the target ItemRef of the relationship instance. In addition, the relationship object is also selected.
- the data store provides a tombstone cleanup task. This task determines when tombstone information may be discarded. The task computes a bound on the local create / update version and then truncates the tombstone information by discarding all earlier tombstone versions.
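The cleanup task can be pictured as "compute a version bound, then drop every tombstone older than it". The text does not say how the bound is computed; the sketch below assumes, purely for illustration, that it is the lowest local version acknowledged by any sync partner. `Tombstone`, `cleanup_bound`, and `truncate_tombstones` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Tombstone:
    entity_id: str
    version: int  # local version at which the entity was deleted


def cleanup_bound(acknowledged_versions: Dict[str, int]) -> int:
    """Assumed policy: the lowest local version every sync partner has seen."""
    return min(acknowledged_versions.values(), default=0)


def truncate_tombstones(tombstones: List[Tombstone], bound: int) -> List[Tombstone]:
    """Discard all tombstones whose version is earlier than the bound."""
    return [t for t in tombstones if t.version >= bound]
```

Under this assumed policy a tombstone is kept until every partner has synchronized past the deletion, so no partner can miss it.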
- Helper APIs and Functions [0240] The Base mapping also provides a number of helper functions. These functions are supplied to aid common operations over the data model. a) Function [System.Storage].GetItem
- Metadata
- There are two types of metadata represented in the Store: instance metadata (the type of an Item, etc.), and type metadata.
- Schema metadata is stored in the data store as instances of Item types from the Meta schema.
- Instance Metadata is used by an application to query for the type of an Item and to find the extensions associated with an Item. Given the ItemId for an Item, an application can query the global item view to return the type of the Item and use this value to query the Meta.Type view to return information on the declared type of the Item. For example,
- E. SECURITY
- In general, all securable objects arrange their access rights using the access mask format shown in Fig. 26. In this format, the low-order 16 bits are for object-specific access rights, the next 7 bits are for standard access rights, which apply to most types of objects, and the 4 high-order bits are used to specify generic access rights that each object type can map to a set of standard and object-specific rights.
- the ACCESS_SYSTEM_SECURITY bit corresponds to the right to access the object's SACL.
- item-specific rights are placed in the Object Specific Rights section (low-order 16 bits).
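The access mask layout can be made concrete with a few bit masks. The field widths below follow the description above (16 object-specific bits, then 7 standard bits, and 4 generic bits at the top). The position of ACCESS_SYSTEM_SECURITY (bit 24) is the Win32 ACCESS_MASK value and is not stated in the text. The function name is hypothetical.

```python
from typing import Dict

SPECIFIC_RIGHTS_MASK = 0x0000FFFF    # low-order 16 bits: object-specific rights
STANDARD_RIGHTS_MASK = 0x007F0000    # next 7 bits: standard rights
ACCESS_SYSTEM_SECURITY = 0x01000000  # bit 24 (Win32 value): right to access the SACL
GENERIC_RIGHTS_MASK = 0xF0000000     # 4 high-order bits: generic rights


def split_access_mask(mask: int) -> Dict[str, int]:
    """Break a 32-bit access mask into the fields described above."""
    return {
        "specific": mask & SPECIFIC_RIGHTS_MASK,
        "standard": (mask & STANDARD_RIGHTS_MASK) >> 16,
        "system_security": int(bool(mask & ACCESS_SYSTEM_SECURITY)),
        "generic": (mask & GENERIC_RIGHTS_MASK) >> 28,
    }
```

For example, `split_access_mask(0x81020001)` separates one item-specific right, one standard right, the SACL bit, and the top generic bit.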
- Fig. 27 depicts a new identically protected security region being carved out of an existing security region, in accordance with one embodiment of a security model.
- the storage platform provides a notifications capability that allows applications to track data changes. This feature is primarily intended for applications which maintain volatile state or execute business logic on data change events. Applications register for notifications on items, item extensions and item relationships. Notifications are delivered asynchronously after data changes have been committed. Applications may filter notifications by item, extension and relationship type as well as type of operation. [0248] According to one embodiment, the storage platform API 322 provides two kinds of interfaces for notifications. First, applications register for simple data change events triggered by changes to items, item extensions and item relationships. Second, applications create "watcher" objects to monitor sets of items, item extensions and relationships between items.
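The registration, filtering, and deliver-after-commit behavior described above can be sketched as a small in-process hub. This is not the storage platform API: `NotificationHub` and its methods are hypothetical, and asynchronous delivery is simplified to "nothing is delivered until `commit()` is called".

```python
from typing import Callable, List, Optional, Set, Tuple

Change = Tuple[str, str, str]  # (element type, operation, element id)


class NotificationHub:
    """Hypothetical hub that delivers change notifications only after commit."""

    def __init__(self) -> None:
        self._subscriptions: List[tuple] = []
        self._pending: List[Change] = []

    def register(
        self,
        callback: Callable[[Change], None],
        element_types: Optional[Set[str]] = None,
        operations: Optional[Set[str]] = None,
    ) -> None:
        # None means "no filter" for that dimension.
        self._subscriptions.append((callback, element_types, operations))

    def record_change(self, element_type: str, operation: str, element_id: str) -> None:
        # Changes are only queued while the transaction is open.
        self._pending.append((element_type, operation, element_id))

    def rollback(self) -> None:
        self._pending = []

    def commit(self) -> None:
        committed, self._pending = self._pending, []
        for change in committed:
            for callback, types, ops in self._subscriptions:
                if (types is None or change[0] in types) and (ops is None or change[1] in ops):
                    callback(change)
```

A subscriber registered with `element_types={"item"}` and `operations={"update"}` sees only committed Item updates, and changes discarded by `rollback()` are never delivered.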
- the storage platform of the present invention is, in at least some embodiments, intended to be embodied as an integral part of the hardware/software interface system of a computer system.
- the storage platform of the present invention may be embodied as an integral part of an operating system, such as the Microsoft Windows family of operating systems.
- the storage platform API becomes a part of the operating system APIs through which application programs interact with the operating system.
- the storage platform becomes the means through which application programs store information on the operating system, and the Item based data model of the storage platform therefore replaces the traditional file system of such an operating system.
- the storage platform might replace the NTFS file system implemented in that operating system.
- application programs access the services of the NTFS file system through the Win32 APIs exposed by the Windows family of operating systems.
- the storage platform enables application programs which rely on the Win32 programming model to access the contents of both the data store of the storage platform and the traditional NTFS file system.
- the storage platform uses a naming convention that is a superset of the Win32 naming conventions to facilitate easy interoperability.
- the storage platform supports accessing files and directories stored in a storage platform volume through the Win32 API.
- H. STORAGE PLATFORM API [0253] The storage platform comprises an API that enables application programs to access the features and capabilities of the storage platform discussed above and to access items stored in the data store. This section describes one embodiment of a storage platform API of the storage platform of the present invention. Details regarding this functionality can be found in the related applications incorporated by reference earlier herein, with some of this information summarized below for convenience.
- a Containment Folder is an item which contains holding Relationships to other Items and is the equivalent of the common concept of a file system folder.
- FIG. 19 illustrates the basic architecture of the storage platform API, in accordance with the present embodiment.
- the storage platform API uses SQLClient 1900 to talk to the local data store 302 and may also use SQLClient 1900 to talk to remote data stores (e.g., data store 340).
- the local store 302 may also talk to the remote data store 340 using either DQP (Distributed Query Processor) or through the storage platform synchronization service ("Sync") described below.
- the storage platform API 322 also acts as the bridge API for data store notifications, passing applications' subscriptions to the notification engine 332 and routing notifications to the application (e.g., application 350a, 350b, or 350c), as also described above.
- the storage platform API 322 may also define a limited "provider" architecture so that it can access data in Microsoft Exchange and AD.
- Fig. 20 schematically represents the various components of the storage platform API.
- the storage platform API consists of the following components: (1) data classes 2002, which represent the storage platform element and item types; (2) runtime framework 2004, which manages object persistence and provides support classes 2006; and (3) tools 2008, which are used to generate CLR classes from the storage platform schemas.
- Fig. 22 illustrates the runtime framework in operation.
- the runtime framework operates as follows:
  1. An application 350a, 350b, or 350c binds to an item in the storage platform.
  2. The framework 2004 creates an ItemContext object 2202 corresponding to the bound item and returns it to the application.
  3. The application submits a Find on this ItemContext to get a collection of Items; the returned collection is conceptually an object graph 2204 (due to relationships).
  4. The application changes, deletes, and inserts data.
- Fig. 23 illustrates the execution of a "FindAll" operation.
- Fig. 24 illustrates the process by which storage platform API classes are generated from the storage platform Schema.
- Fig. 25 illustrates the schema on which the File API is based.
- the storage platform API includes a namespace for dealing with file objects. This namespace is called System.Storage.Files.
- the data members of the classes in System.Storage.Files directly reflect the information stored in the storage platform store; this information is "promoted" from the file system objects or may be created natively using the Win32 API.
- the System.Storage.Files namespace has two classes: FileItem and DirectoryItem.
- a programming interface (or more simply, interface) may be viewed as any mechanism, process, or protocol for enabling one or more segment(s) of code to communicate with or access the functionality provided by one or more other segment(s) of code.
- a programming interface may be viewed as one or more mechanism(s), method(s), function call(s), module(s), object(s), etc.
- the term "segment of code" in the preceding sentence is intended to include one or more instructions or lines of code, and includes, e.g., code modules, objects, subroutines, functions, and so on, regardless of the terminology applied or whether the code segments are separately compiled, or whether the code segments are provided as source, intermediate, or object code, whether the code segments are utilized in a runtime system or process, or whether they are located on the same or different machines or distributed across multiple machines, or whether the functionality represented by the segments of code is implemented wholly in software, wholly in hardware, or a combination of hardware and software.
- a programming interface may be viewed generically, as shown in Fig. 30A or Fig. 30B.
- Fig. 30A illustrates an interface Interface1 as a conduit through which first and second code segments communicate.
- Fig. 30B illustrates an interface as comprising interface objects I1 and I2 (which may or may not be part of the first and second code segments), which enable first and second code segments of a system to communicate via medium M.
- interface objects I1 and I2 are separate interfaces of the same system and one may also consider that objects I1 and I2 plus medium M comprise the interface.
- aspects of such a programming interface may include the method whereby the first code segment transmits information (where "information" is used in its broadest sense and includes data, commands, requests, etc.) to the second code segment; the method whereby the second code segment receives the information; and the structure, sequence, syntax, organization, schema, timing and content of the information.
- the underlying transport medium itself may be unimportant to the operation of the interface, whether the medium be wired or wireless, or a combination of both, as long as the information is transported in the manner defined by the interface.
- information may not be passed in one or both directions in the conventional sense, as the information transfer may be either via another mechanism (e.g. information placed in a buffer, file, etc. separate from information flow between the code segments) or non-existent, as when one code segment simply accesses functionality performed by a second code segment. Any or all of these aspects may be important in a given situation, e.g., depending on whether the code segments are part of a system in a loosely coupled or tightly coupled configuration, and so this list should be considered illustrative and non-limiting.
- the interfaces of Figs. 30A and 30B may be factored to achieve the same result, just as one may mathematically provide 24, or 2 times 2 times 3 times 2.
- the function provided by interface Interface1 may be subdivided to convert the communications of the interface into multiple interfaces Interface1A, Interface1B, Interface1C, etc. while achieving the same result.
- the function provided by interface I1 may be subdivided into multiple interfaces I1a, I1b, I1c, etc. while achieving the same result.
- interface I2 of the second code segment which receives information from the first code segment may be factored into multiple interfaces I2a, I2b, I2c, etc.
- the number of interfaces included with the 1st code segment need not match the number of interfaces included with the 2nd code segment.
- the functional spirit of interfaces Interface1 and I1 remains the same as with Figs. 30A and 30B, respectively.
- the factoring of interfaces may also follow associative, commutative, and other mathematical properties such that the factoring may be difficult to recognize.
- the interface of Fig. 30A includes a function call Square(input, precision, output), a call that includes three parameters, input, precision and output, and which is issued from the 1st Code Segment to the 2nd Code Segment. If the middle parameter precision is of no concern in a given scenario, as shown in Fig. 32A, it could just as well be ignored or even replaced with a meaningless (in this situation) parameter. One may also add an additional parameter of no concern. In either event, the functionality of square can be achieved, so long as output is returned after input is squared by the second code segment. Precision may very well be a meaningful parameter to some downstream or other portion of the computing system; however, once it is recognized that precision is not necessary for the narrow purpose of calculating the square, it may be replaced or ignored.
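The redefinition described above can be illustrated with a short sketch. It assumes a Python rendering in which `output` becomes the return value; the function name and the `extra` parameter are illustrative only.

```python
def square(value, precision=None, *extra):
    """Second code segment: squares the input.

    `precision` and any extra arguments are accepted so that callers using
    the original three-part call shape still work, but they are ignored.
    """
    return value * value

# Different callers may pass a meaningful precision, a meaningless
# placeholder, nothing at all, or an additional parameter of no concern.
results = [square(4, 2), square(4, "n/a"), square(4), square(4, None, "unused")]
```

Every call in `results` yields the same squared value, which is the sense in which the interface has changed form without changing its function.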
- Inline Coding It may also be feasible to merge some or all of the functionality of two separate code modules such that the "interface" between them changes form. For example, the functionality of Figs.
- in Fig. 33A, the previous 1st and 2nd Code Segments of Fig. 30A are merged into a module containing both of them.
- the code segments may still be communicating with each other but the interface may be adapted to a form which is more suitable to the single module.
- formal Call and Return statements may no longer be necessary, but similar processing or response(s) pursuant to interface Interface1 may still be in effect.
- part (or all) of interface I2 from Fig. 30B may be written inline into interface I1 to form interface I1".
- interface I2 is divided into I2a and I2b, and interface portion I2a has been coded in-line with interface I1 to form interface I1".
- for a concrete example, consider that the interface I1 from Fig. 30B performs a function call square(input, output), which is received by interface I2, which after processing the value passed with input (to square it) by the second code segment, passes back the squared result with output. In such a case, the processing performed by the second code segment (squaring input) can be performed by the first code segment without a call to the interface.
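As a small illustration of inline coding, the sketch below shows a first code segment that calls a separate squaring routine and an equivalent version with the squaring written inline. The function names are hypothetical stand-ins for the code segments in the figures.

```python
def second_segment_square(value):
    # Stands in for the second code segment reached through the interface.
    return value * value

def first_segment_with_call(values):
    # Original form: an explicit call (and return) crosses the interface.
    return [second_segment_square(v) for v in values]

def first_segment_inlined(values):
    # Inlined form: the squaring is performed by the first segment itself,
    # so no call to the interface is needed.
    return [v * v for v in values]
```

Both forms produce the same results for the same inputs; only the shape of the boundary between the segments differs.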
- Divorce A communication from one code segment to another may be accomplished indirectly by breaking the communication into multiple discrete communications. This is depicted schematically in Figs. 34A and 34B. As shown in Fig.
- one or more piece(s) of middleware (Divorce Interface(s), since they divorce functionality and/or interface functions from the original interface) are provided to convert the communications on the first interface, Interface1, to conform them to a different interface, in this case interfaces Interface2A, Interface2B and Interface2C.
- a third code segment can be introduced with divorce interface DI1 to receive the communications from interface I1 and with divorce interface DI2 to transmit the interface functionality to, for example, interfaces I2a and I2b, redesigned to work with DI2, but to provide the same functional result.
- DI1 and DI2 may work together to translate the functionality of interfaces I1 and I2 of Fig. 30B to a new operating system, while providing the same or similar functional result.
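The divorce arrangement resembles what is commonly called an adapter. The sketch below is one hedged illustration: the first code segment keeps calling a single `square(value, precision)` operation, while a middleware class translates that call into two calls on a redesigned second segment. The class and method names are assumptions made for the example.

```python
class NewSecondSegment:
    """Redesigned second code segment exposing two narrower interfaces."""

    def multiply(self, a, b):           # stands in for one new interface
        return a * b

    def round_to(self, value, digits):  # stands in for another new interface
        return round(value, digits)


class DivorceInterface:
    """Middleware that accepts the original call and converts it."""

    def __init__(self, backend):
        self._backend = backend

    def square(self, value, precision):
        # One communication on the original interface becomes
        # multiple discrete communications on the new interfaces.
        product = self._backend.multiply(value, value)
        return self._backend.round_to(product, precision)


def first_segment(interface):
    # Unchanged caller: it still uses the original call shape.
    return interface.square(1.234, 2)
```

The caller is untouched by the redesign; only the middleware knows that the work is now split across two interfaces.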
- the JIT compiler may be written so as to dynamically convert the communications from the 1st Code Segment to the 2nd Code Segment, i.e., to conform them to a different interface as may be required by the 2nd Code Segment (either the original or a different 2nd Code Segment).
- This is depicted in Figs. 35A and 35B.
- in Fig. 35A, this approach is similar to the Divorce scenario described above. It might be done, e.g., where an installed base of applications is designed to communicate with an operating system in accordance with an Interface1 protocol, but then the operating system is changed to use a different interface.
- the JIT Compiler could be used to conform the communications on the fly from the installed-base applications to the new interface of the operating system.
- Section A discloses several embodiments of the present invention, while Section B focuses on various embodiments of an API for synchronization.
- the storage platform provides a synchronization service 330 that (i) allows multiple instances of the storage platform (each with its own data store 302) to synchronize parts of their content according to a flexible set of rules, and (ii) provides an infrastructure for third parties to synchronize the data store of the storage platform of the present invention with other data sources that implement proprietary protocols.
- Storage-platform-to-storage-platform synchronization occurs among a group of participating "replicas." For example, with reference to Fig. 3, it may be desirable to provide synchronization between the data store 302 of the storage platform 300 and another remote data store 338 under the control of another instance of the storage platform, perhaps running on a different computer system. The total membership of this group is not necessarily known to any given replica at any given time.
- Different replicas can make the changes independently (i.e., concurrently). The process of synchronization is defined as making every replica aware of the changes made by other replicas. This synchronization capability is inherently multi-master.
- the synchronization capability of the present invention allows replicas to: determine which changes another replica is aware of; request information about changes that this replica is not aware of; convey information about changes that the other replica is not aware of; determine when two changes are in conflict with each other; apply changes locally; convey conflict resolutions to other replicas to ensure convergence; and resolve the conflicts based on specified policies for conflict resolutions.
- Storage-Platform-to-Storage-Platform Synchronization [0277] The primary application of the synchronization service 330 of the storage platform of the present invention is to synchronize multiple instances of the storage platform (each with its own data store).
- the synchronization service operates at the level of the storage platform schemas (rather than the underlying tables of the database engine 314). Thus, for example, "Scopes" are used to define synchronization sets as discussed below. [0278]
- the synchronization service operates on the principle of "net changes". Rather than recording and sending individual operations (such as with transactional replication), the synchronization service sends the end-result of those operations, thus often consolidating the results of multiple operations into a single resulting change. [0279] The synchronization service does not in general respect transaction boundaries.
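The "net changes" principle can be sketched as collapsing an ordered log of operations into one end result per item. The operation names, the tuple layout, and the rule that a re-insert after a delete is reported as an update are assumptions made for this example, not details taken from the text.

```python
def net_changes(operations):
    """Collapse an ordered log of (op, item_id, value) into net changes.

    Returns a dict mapping item_id to (op, value), holding only the
    end result that would need to be sent to a sync partner.
    """
    net = {}
    for op, item_id, value in operations:
        prior = net.get(item_id)
        if op == "delete":
            if prior is not None and prior[0] == "insert":
                # Inserted and deleted since the last sync: nothing to send.
                del net[item_id]
            else:
                net[item_id] = ("delete", None)
        elif op == "insert":
            # Assumption: a re-insert after a delete nets out to an update.
            kind = "update" if prior is not None and prior[0] == "delete" else "insert"
            net[item_id] = (kind, value)
        else:  # "update"
            # Updates to a not-yet-sent insert are folded into that insert.
            kind = "insert" if prior is not None and prior[0] == "insert" else "update"
            net[item_id] = (kind, value)
    return net
```

Because intermediate states are discarded, a sketch like this also shows why such a service would not in general preserve transaction boundaries.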
- sync is initiated on one side by a Sync Controlling Application (SCA). That SCA informs the local synchronization service to synchronize with the remote partner. On the other side, the synchronization service is awakened by the messages sent by the synchronization service from the originating machine. It responds based on the persistent configuration information (see mappings below) present on the destination machine. The synchronization service can be run on schedule or in response to events. In these cases, the synchronization service implementing the schedule becomes the SCA. [0282] To enable synchronization, two steps need to be taken. First, the schema designer must annotate the storage platform schema with appropriate sync semantics (designating Change Units as described below).
- a fundamental concept of the synchronization service is that of a Change Unit.
- a Change Unit is the smallest piece of schema that is individually tracked by the storage platform. For every Change Unit, the synchronization service may be able to determine whether it changed or did not change since the last sync.
- Designating Change Units in the schema serves several purposes. First, it determines how chatty the synchronization service is on the wire.
- the synchronization service allows schema designers to participate in the process.
- Joe wants to keep the My Documents folders on his several computers in sync.
- Joe defines a community folder called, say, JoesDocuments.
- Joe configures a mapping between the hypothetical JoesDocuments folder and the local My Documents folder. From this point on, when Joe's computers synchronize with each other, they talk in terms of documents in JoesDocuments, rather than their local items. This way, all Joe's computers understand each other without having to know who the others are; the Community Folder becomes the lingua franca of the sync community.
- Configuring the synchronization service consists of three steps: (1) defining mappings between local folders and community folders; (2) defining sync profiles that determine what gets synchronized (e.g.
- Community Folder mappings are stored as XML configuration files on individual machines. Each mapping has the following schema:
  - /mappings/communityFolder: This element names the community folder that this mapping is for. The name follows the syntax rules of Folders.
  - /mappings/localFolder: This element names the local folder that the mapping transforms into. The name follows the syntax rules of Folders. The folder must already exist for the mapping to be valid. The items within this folder are considered for synchronization per this mapping.
  - /mappings/transformations: This element defines how to transform items from the community folder to the local folder and back. If absent or empty, no transformations are performed. In particular, this means that no IDs are mapped. This configuration is primarily useful for creating a cache of a Folder.
  - /mappings/transformations/mapIDs: This element requests that newly generated local IDs be assigned to all of the items mapped from the community folder, rather than reusing community IDs. The Sync Runtime will maintain ID mappings to convert items back and forth.
  - /mappings/transformations/localRoot: This element requests that all root items in the community folder be made children of the specified root.
  - /mappings/runAs: This element controls under whose authority requests against this mapping are processed.
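To make the mapping schema concrete, the sketch below parses a hypothetical mapping file using Python's standard `xml.etree.ElementTree` module. The element names follow the paths listed above; the sample values (`Synced`, `joe`), the use of element text to carry values, and the `load_mapping` helper are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping document; values and text layout are assumptions.
SAMPLE = """
<mappings>
  <communityFolder>JoesDocuments</communityFolder>
  <localFolder>My Documents</localFolder>
  <transformations>
    <mapIDs/>
    <localRoot>Synced</localRoot>
  </transformations>
  <runAs>joe</runAs>
</mappings>
"""

def load_mapping(xml_text):
    """Read one mapping into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    transformations = root.find("transformations")
    has_transformations = transformations is not None
    return {
        "community_folder": root.findtext("communityFolder"),
        "local_folder": root.findtext("localFolder"),
        # An absent or empty transformations element means no IDs are mapped.
        "map_ids": has_transformations and transformations.find("mapIDs") is not None,
        "local_root": transformations.findtext("localRoot") if has_transformations else None,
        "run_as": root.findtext("runAs"),
    }
```

A mapping without a `transformations` element loads with `map_ids` set to `False`, matching the cache-style configuration described above.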
- a Sync Profile is the total set of parameters needed to kick off synchronization. It is supplied by an SCA to the Sync Runtime to initiate sync.
- Sync profiles for storage platform-to-storage platform synchronization contain the following information:
  - Local Folder, to serve as the source and destination for changes;
  - Remote Folder name to synchronize with; this Folder must be published from the remote partner by way of a mapping as defined above;
  - Direction: the synchronization service supports send-only, receive-only, and send-receive sync;
  - Local Filter: selects what local information to send to the remote partner.
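The profile fields listed above can be pictured as a small record type. This sketch is not the runtime CLR class the service provides; `SyncProfile`, `Direction` and `items_to_send` are hypothetical names, and the filter is modeled as a plain predicate over item dictionaries.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Direction(Enum):
    SEND_ONLY = "send-only"
    RECEIVE_ONLY = "receive-only"
    SEND_RECEIVE = "send-receive"


@dataclass
class SyncProfile:
    local_folder: str    # source and destination for changes
    remote_folder: str   # must be published by the remote partner via a mapping
    direction: Direction = Direction.SEND_RECEIVE
    # Assumption: the local filter is a predicate; the default sends everything.
    local_filter: Callable[[dict], bool] = lambda item: True

    def items_to_send(self, local_items):
        """Apply the direction and the local filter to candidate items."""
        if self.direction is Direction.RECEIVE_ONLY:
            return []
        return [item for item in local_items if self.local_filter(item)]
```

For example, `SyncProfile("My Documents", "JoesDocuments", Direction.SEND_ONLY, lambda i: i["type"] == "doc")` would send only items whose `type` is `doc`.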
- the synchronization service provides a runtime CLR class that allows simple building of Sync Profiles.
- Profiles can also be serialized to and from XML files for easy storage (often alongside schedules). However, there is no standard place in the storage platform where all the profiles are stored; SCAs are welcome to construct a profile on the spot without ever persisting it. Note that there is no need to have a local mapping to initiate sync. All sync information can be specified in the profile. The mapping is, however, required in order to respond to sync requests initiated by the remote side. (3) Schedules [0295] In one embodiment, the synchronization service does not provide its own scheduling infrastructure. Instead, it relies on another component to perform this task - the Windows Scheduler available with the Microsoft Windows operating system.
- the synchronization service includes a command-line utility that acts as an SCA and triggers synchronization based on a sync profile saved in an XML file.
- This utility makes it very easy to configure the Windows Scheduler to run synchronization either on schedule, or in response to events such as user logon or logoff.
- Conflict handling in the synchronization service is divided into three stages: (1) conflict detection, which occurs at change application time - this step determines if a change can be safely applied; (2) automatic conflict resolution and logging - during this step (that takes place immediately after the conflict is detected) automatic conflict resolvers are consulted to see if the conflict can be resolved - if not, the conflict can be optionally logged; and (3) conflict inspection and resolution - this step takes place if some conflicts have been logged, and occurs outside the context of the sync session - at this time, logged conflicts can be resolved and removed from the log.
- Conflict Detection [0297] In the present embodiment, the synchronization service detects two types of conflicts: knowledge-based and constraint-based.
- a knowledge-based conflict occurs when two replicas make independent changes to the same Change Unit. Two changes are called independent if they are made without knowledge of each other; in other words, the version of the first is not covered by the knowledge of the second and vice versa.
- the synchronization service automatically detects all such conflicts based on the replicas' knowledge as described above.
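The independence test for knowledge-based conflicts can be sketched as follows. The representation is an assumption: a version is modeled as a `(partner_id, counter)` pair and knowledge as a mapping from partner ID to the highest counter known, which is a simplification chosen for the example rather than the service's actual data structure.

```python
def covers(knowledge, version):
    """True if `knowledge` already includes the change with this version."""
    partner_id, counter = version
    return knowledge.get(partner_id, 0) >= counter


def is_knowledge_conflict(local_change, remote_change):
    """Each change is a dict with 'unit', 'version' and 'knowledge' keys.

    Two changes conflict when they touch the same Change Unit and neither
    was made with knowledge of the other.
    """
    if local_change["unit"] != remote_change["unit"]:
        return False
    local_seen_by_remote = covers(remote_change["knowledge"], local_change["version"])
    remote_seen_by_local = covers(local_change["knowledge"], remote_change["version"])
    return not local_seen_by_remote and not remote_seen_by_local
```

If the remote change was made after the remote replica had already learned of the local change, the remote change simply supersedes it and no conflict is raised.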
- Constraint-Based Conflicts There are cases where independent changes violate an integrity constraint when applied together. For instance, two replicas creating a file with the same name in the same directory could cause such a conflict to occur. [0301] A constraint-based conflict involves two independent changes (just like a knowledge-based one), but they do not affect the same Change Unit. Rather, they affect different Change Units but with a constraint existing between them. [0302] The synchronization service detects constraint violations at change application time and raises constraint-based conflicts automatically. Resolving constraint-based conflicts usually requires custom code that modifies the changes in such a way as to not violate the constraint; the synchronization service does not provide a general-purpose mechanism for doing so.
- the synchronization service can take one of three actions (selected by the sync initiator in the Sync Profile): (1) reject the change, returning it to the sender; (2) log a conflict into a conflict log; or (3) resolve the conflict automatically. [0304] If the change is rejected, the synchronization service acts as if the change did not arrive at the replica. A negative acknowledgement is sent back to the originator. This resolution policy is primarily useful on headless replicas (such as file servers) where logging conflicts is not feasible. Instead, such replicas force the others to deal with the conflicts by rejecting them. [0305] Sync initiators configure conflict resolution in their Sync Profiles.
- the synchronization service supports combining multiple conflict resolvers in a single profile in the following ways - first, by specifying a list of conflict resolvers to be tried one after another, until one of them succeeds; and second, by associating conflict resolvers with conflict types, e.g. directing update-update knowledge-based conflicts to one resolver, but all the other conflicts to the log.
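The two ways of combining resolvers, an ordered list tried until one succeeds and a routing by conflict type, can be sketched together. The resolver signature (a function that returns a result or `None` when it cannot resolve) and all names below are assumptions for this example.

```python
# Hypothetical sketch; resolver signatures and conflict fields are assumptions.

def resolve(conflict, resolvers_by_type, default_chain):
    """Try each resolver configured for this conflict type, in order."""
    chain = resolvers_by_type.get(conflict["type"], default_chain)
    for resolver in chain:
        outcome = resolver(conflict)
        if outcome is not None:  # None means "could not resolve, try the next"
            return outcome
    return None


def make_logger(log):
    def log_conflict(conflict):
        log.append(conflict)
        return "logged"
    return log_conflict


def merge_if_equal(conflict):
    # Succeeds only when both sides happen to have made the same change.
    return conflict["local"] if conflict["local"] == conflict["remote"] else None


def remote_wins(conflict):
    return conflict["remote"]
```

With `{"update-update": [merge_if_equal, remote_wins]}` as the routing table and `[make_logger(log)]` as the default chain, update-update conflicts are resolved automatically while every other conflict type goes to the log.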
- the list of default conflict resolvers includes:
  - local-wins: disregard incoming changes if in conflict with locally stored data;
  - remote-wins: disregard local data if in conflict with incoming changes;
  - last-writer-wins: pick either local-wins or remote-wins per Change Unit based on the timestamp of the change (note that the synchronization service in general does not rely on clock values; this conflict resolver is the sole exception to that rule);
  - deterministic: pick a winner in a manner that is guaranteed to be the same on all replicas, but not otherwise meaningful - one embodiment of the synchronization service uses lexicographic comparisons of partner IDs to implement this feature.
  [0307] In addition, ISVs can implement and install their own conflict resolvers.
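The four policies named above can be sketched as functions that pick a winner between a local and a remote change. The change layout (`partner_id`, `timestamp`), the choice of the lexicographically smaller ID as the deterministic winner, and the tie-break in `last_writer_wins` are assumptions made for illustration.

```python
def local_wins(local, remote):
    return local


def remote_wins(local, remote):
    return remote


def deterministic(local, remote):
    # Lexicographic comparison of partner IDs. Which side of the comparison
    # wins is an assumption; it only has to be the same on every replica.
    return min(local, remote, key=lambda change: change["partner_id"])


def last_writer_wins(local, remote):
    # The only policy here that depends on clock values.
    if local["timestamp"] == remote["timestamp"]:
        return deterministic(local, remote)  # tie-break is an assumption
    return remote if remote["timestamp"] > local["timestamp"] else local
```

Note that `deterministic` gives the same answer regardless of which replica considers itself "local", which is what allows independent replicas to converge without coordination.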
- Custom conflict resolvers may accept configuration parameters; such parameters must be specified by the SCA in the Conflict Resolution section of the Sync Profile.
- when a conflict resolver handles a conflict, it returns the list of operations that need to be performed (in lieu of the conflicting change) back to the runtime. The synchronization service then applies these operations, having properly adjusted remote knowledge to include what the conflict handler has considered.
- conflict resolutions can be viewed as joins — combining two branches to form a single point. Thus, conflict resolutions turn version histories into DAGs.
- Conflict Logging A very particular kind of conflict resolver is the Conflict Logger.
- the synchronization service logs conflicts as Items of type ConflictRecord. These records are related back to the items that are in conflict (unless the items themselves have been deleted). Each conflict record contains: the incoming change that caused the conflict; the type of the conflict: update-update, update-delete, delete-update, insert-insert, or constraint; and the version of the incoming change and the knowledge of the replica sending it. Logged conflicts are available for inspection and resolution as described below.
- Conflict Inspection and Resolution The synchronization service provides an API for applications to examine the conflict log and to suggest resolutions of the conflicts in it.
- the API allows applications to enumerate all conflicts, or conflicts related to a given Item. It also allows such applications to resolve logged conflicts in one of three ways: (1) remote wins: accepting the logged change and overwriting the conflicting local change; (2) local wins: ignoring conflicting parts of the logged change; and (3) suggest new change: where the application proposes a merge that, in its opinion, resolves the conflict. Once conflicts are resolved by an application, the synchronization service removes them from the log. (d) Convergence of Replicas and Propagation of Conflict Resolutions [0313] In complex synchronization scenarios, the same conflict can be detected at multiple replicas.
- the synchronization service forwards conflict resolutions to other replicas.
- the synchronization service automatically finds any conflict records in the log that are resolved by this update and eliminates them. In this sense, a conflict resolution at one replica is binding on all the other replicas.
- the synchronization service applies the principle of binding conflict resolution and picks one of the two resolutions to win over the other automatically.
- the storage platform provides an architecture for ISVs to implement Sync Adapters that allow the storage platform to synchronize to legacy systems such as Microsoft Exchange, AD, Hotmail, etc.
- Sync Adapters benefit from the many sync services provided by the synchronization service, as described below.
- Sync Adapters do not need to be implemented as plug-ins into some storage platform architecture. If desired, a "sync adapter" can simply be any application that utilizes the synchronization service runtime interfaces to obtain services such as change enumeration and application.
- Sync Adapter writers are encouraged to expose the standard Sync Adapter interface, which runs sync given the Sync Profile as described above. The profile provides configuration information to the adapter, some of which adapters pass to the Sync Runtime to control runtime services (e.g.
- the synchronization service provides a number of sync services to adapter writers. For the rest of this section, it is convenient to refer to the machine on which the storage platform is doing synchronization as the "client" and the non-storage platform backend that the adapter is talking to as the "server".
- (1) Change Enumeration [0321] Based on the change-tracking data maintained by the synchronization service, Change Enumeration allows sync adapters to easily enumerate the changes that have occurred to a data store Folder since the last time synchronization with this partner was attempted.
- the primary function of change application is to automatically detect conflicts.
- a conflict is defined as two overlapping changes being made without knowledge of each other.
- Change Application raises a conflict if an overlapping local change that is not covered by the adapter's knowledge is detected.
- adapters may use either stored or supplied anchors.
- Change Application supports efficient storage of adapter-specific metadata. Such data may be attached by the adapter to the changes being applied, and might be stored by the synchronization service. The data might be returned on next change enumeration.
- Sync adapters may specify the policy for conflict resolution when applying changes. If specified, conflicts may be passed on to the specified conflict handler and resolved (if possible). Conflicts can also be logged. It is possible that the adapter may detect a conflict when attempting to apply a local change to the backend. In such a case, the adapter may still pass the conflict on to the Sync Runtime to be resolved according to policy. In addition, Sync Adapters may request that any conflicts detected by the synchronization service be sent back to them for processing. This is particularly convenient in the case where the backend is capable of storing or resolving conflicts.
- Adapter Implementation While some "adapters" are simply applications utilizing runtime interfaces, adapters are encouraged to implement the standard adapter interfaces. These interfaces allow Sync Controlling Applications to: request that the adapter perform synchronization according to a given Sync Profile; cancel ongoing synchronization; and receive progress reporting (percentage complete) on an ongoing sync. 3. Security [0329] The synchronization service strives to introduce as little as possible into the security model implemented by the storage platform. Rather than defining new rights for synchronization, existing rights are used.
- the synchronization service does not maintain secure authorship information.
- if a change is made at replica A by user U and forwarded to replica B, the fact that the change was originally made at A (or by U) is lost. If B forwards this change to replica C, this is done under B's authority, not that of A. This leads to the following limitation: if a replica is not trusted to make its own changes to an item, it cannot forward changes made by others.
- when the synchronization service is initiated, it is done by a Sync Controlling Application.
- the synchronization service impersonates the identity of the SCA and performs all operations (both locally and remotely) under that identity. To illustrate, observe that user U cannot cause the local synchronization service to retrieve changes from a remote storage platform for items to which user U does not have read access.
- 4. Manageability [0332] Monitoring a distributed community of replicas is a complex problem.
- the synchronization service may use a "sweep" algorithm to collect and distribute information about the status of the replicas. The properties of the sweep algorithm ensure that information about all configured replicas is eventually collected and that failing (non-responsive) replicas are detected. [0333] This community-wide monitoring information is made available at every replica.
- Sync Replica Most applications are only interested in tracking, enumerating and synchronizing changes for a given subset of items within the WinFS store. The set of items that take part in a synchronization operation is termed a Synchronization Replica.
- a Replica is defined in terms of items contained within a given WinFS containment hierarchy (usually rooted at a Folder item).
- WinFS Sync provides a mechanism to define, manage and clean up replicas. Every replica has a GUID identifier that uniquely identifies it within a given WinFS store.
- Sync Partner A sync partner is defined as an entity capable of effecting changes on WinFS items, extensions and relationships. Thus, every WinFS store can be termed a sync partner.
- EDS: external data source
- Every partner has a GUID identifier that uniquely identifies it.
- Sync Community A synchronization community is defined as a collection of replicas that are kept in sync by means of peer-to-peer synchronization operations. These replicas may all be in the same WinFS store, different WinFS stores, or even manifest themselves as virtual replicas on non-WinFS stores. WinFS sync does not prescribe or mandate any specific topology for the community, especially if the only sync operations in the community are through the WinFS Sync service (WinFS adapter). Synchronization adapters (defined below) may introduce their own topology restrictions. [0340] Change Tracking, Change Units and Versions: Every WinFS store tracks changes to all local WinFS Items, Extensions and Relationships. Changes are tracked at the level of change unit granularity defined in the schema.
- top-level fields of any Item, Extension and Relationship type can be sub-divided by the schema designer into change units, with the smallest granularity being one top-level field.
- every change unit is assigned a Version, where a version is a pair of sync partner id and a version number (the version number is a partner-specific monotonically increasing number). Versions are updated as changes happen in the store locally or as they are obtained from other replicas.
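The version scheme described above can be sketched as follows. This is a minimal illustration and not the WinFS implementation: the names `Version`, `Store`, `record_local_change` and `record_remote_change` are hypothetical, and partner ids are shown as short strings rather than GUIDs.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class Version:
    partner_id: str  # identifier of the sync partner that made the change
    number: int      # partner-specific, monotonically increasing number


@dataclass
class Store:
    """Hypothetical store that tracks one Version per change unit."""
    partner_id: str
    counter: int = 0
    # (item id, change unit name) -> Version
    versions: Dict[Tuple[str, str], Version] = field(default_factory=dict)

    def record_local_change(self, item_id: str, change_unit: str) -> Version:
        # A local change gets the next number issued by this partner.
        self.counter += 1
        version = Version(self.partner_id, self.counter)
        self.versions[(item_id, change_unit)] = version
        return version

    def record_remote_change(self, item_id: str, change_unit: str,
                             version: Version) -> None:
        # A change obtained from another replica keeps the version
        # assigned by the partner that originally made it.
        self.versions[(item_id, change_unit)] = version


store_a = Store(partner_id="A")
v1 = store_a.record_local_change("contact-1", "Name")
v2 = store_a.record_local_change("contact-1", "Phone")

store_b = Store(partner_id="B")
store_b.record_remote_change("contact-1", "Name", v1)
```

Keeping the originating partner id inside the version is what lets a receiving store tell a forwarded change apart from one it made itself.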
- Sync Knowledge represents the state of a given sync replica at any time, i.e. it encapsulates meta-data about all the changes a given replica is aware of, either local or from other replicas. WinFS sync maintains and updates knowledge for sync replicas across sync operations.
- a synchronization adapter is a managed code application that accesses WinFS Sync services through the Sync Runtime API and enables synchronization of WinFS data to a non-WinFS data store. Depending on the requirements of the scenario, it is up to the adapter developer to decide which subset of WinFS data and which WinFS data types to synchronize.
- the adapter is responsible for communication with the EDS, transforming WinFS schemas to and from EDS supported schemas and defining and managing its own configuration and metadata.
- WinFS Sync Adapter API: For adapters that synchronize WinFS data to external non-WinFS stores and cannot produce or maintain knowledge in WinFS format, WinFS Sync provides services to obtain remote knowledge that can be used for subsequent change enumeration or application operations. Depending on the capabilities of the backend store, the adapter may wish to store this remote knowledge on the backend or on the local WinFS store.
- a synchronization "replica" is a structure that represents a set of data in a "WinFS" store that exists in a single logical location, whereas data on a non-"WinFS" store is called a "data source" and generally requires the use of an adapter.
- Remote Knowledge: When a given sync replica wishes to obtain changes from another replica, it provides its own knowledge as a baseline against which the other replica enumerates changes. Similarly, when a given replica wishes to send changes to another replica, it provides its own knowledge as a baseline that the remote replica can use for detecting conflicts. This knowledge about the other replica that is provided during sync change enumeration and application is termed Remote Knowledge.
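The two uses of knowledge as a baseline can be sketched as below. The source does not say how knowledge is represented, so this sketch assumes a per-partner high-water mark (partner id mapped to the highest version number seen); the names `knows`, `enumerate_changes` and `is_conflict` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Version = Tuple[str, int]      # (partner id, partner-specific version number)
Knowledge = Dict[str, int]     # ASSUMED form: partner id -> highest number seen
ChangeKey = Tuple[str, str]    # (item id, change unit name)


def knows(knowledge: Knowledge, version: Version) -> bool:
    """True if the knowledge already covers the given version."""
    partner, number = version
    return knowledge.get(partner, 0) >= number


def enumerate_changes(local_versions: Dict[ChangeKey, Version],
                      remote_knowledge: Knowledge
                      ) -> List[Tuple[ChangeKey, Version]]:
    """Return the change units the requesting replica has not seen yet."""
    return sorted((key, version) for key, version in local_versions.items()
                  if not knows(remote_knowledge, version))


def is_conflict(local_version: Optional[Version],
                sender_knowledge: Knowledge) -> bool:
    """An incoming change conflicts when the sender had not seen the
    version that the receiver currently holds for the same change unit."""
    return local_version is not None and not knows(sender_knowledge,
                                                   local_version)


source_versions = {("c1", "Name"): ("A", 3), ("c1", "Phone"): ("B", 2)}
requester_knowledge = {"A": 2, "B": 2}
pending = enumerate_changes(source_versions, requester_knowledge)

stale_sender = is_conflict(("B", 3), {"A": 3, "B": 2})
informed_sender = is_conflict(("B", 3), {"A": 3, "B": 3})
```

In this example only the `Name` change unit is enumerated, because the requester already knows partner B's changes up to number 2.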
- the synchronization API separates into two parts: the synchronization configuration API and the synchronization controller API.
- the synchronization Configuration API enables applications to configure synchronization and to specify parameters for a particular synchronization session between two replicas.
- configuration parameters include the set of Items to be synchronized, the type of synchronization (one-way or two-way), information about the remote data source, and the conflict resolution policy.
- the synchronization controller API initiates a synchronization session, cancels synchronization, and receives progress and error information about the on-going synchronization.
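The split between configuration and control can be sketched as follows. All names here (`SyncConfig`, `SyncType`, `SyncController`, the field names and the callback shape) are hypothetical and do not reproduce the actual API surface; error reporting is omitted for brevity.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class SyncType(Enum):
    ONE_WAY = "one-way"
    TWO_WAY = "two-way"


@dataclass
class SyncConfig:
    """Parameters for one synchronization session (hypothetical shape)."""
    item_ids: List[str]     # the set of Items to be synchronized
    sync_type: SyncType     # one-way or two-way
    remote_source: str      # information about the remote data source
    conflict_policy: str    # name of the conflict resolution policy


class SyncController:
    """Starts a session, reports progress and honours cancellation."""

    def __init__(self, config: SyncConfig,
                 on_progress: Callable[[int, int], None]) -> None:
        self.config = config
        self.on_progress = on_progress
        self._cancelled = False

    def cancel(self) -> None:
        self._cancelled = True

    def run(self) -> int:
        total = len(self.config.item_ids)
        done = 0
        for _item_id in self.config.item_ids:
            if self._cancelled:
                break
            # A real session would enumerate and apply changes here.
            done += 1
            self.on_progress(done, total)
        return done


progress = []


def report(done: int, total: int) -> None:
    progress.append((done, total))
    if done == 2:
        controller.cancel()  # simulate the application cancelling mid-session


config = SyncConfig(["item-1", "item-2", "item-3"], SyncType.TWO_WAY,
                    "mailbox://example", "local-wins")
controller = SyncController(config, report)
synced = controller.run()
```

Here the application cancels after the second item, so the third item is never processed.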
- synchronization adapters for synchronizing information between "WinFS" and non-"WinFS" data sources.
- Examples of adapters include an adapter that synchronizes address book information between a "WinFS" contacts folder and a non-WinFS mailbox.
- adapter developers might use the "WinFS" synchronization core services API described herein for accessing services provided by the "WinFS" synchronization platform in order to develop schema transformation code between the "WinFS" schema and the non-"WinFS" data source schema.
- the adapter developer provides protocol support for communicating changes with the non-"WinFS" data source.
- a synchronization adapter is invoked and controlled by using the synchronization controller API and reports progress and errors using this API.
- a synchronization adapter may be unnecessary if "WinFS" to "WinFS" synchronization services are integrated within the hardware/software interface system.
- several such embodiments provide a set of synchronization services for both "WinFS" to "WinFS" and synchronization adapter solutions that include: • Tracking of changes to "WinFS" items, extensions and relationships. • Support for efficient incremental change enumeration since a given past state. • Application of external changes to "WinFS". • Conflict handling during change application.
- FIG. 36 illustrates three instances of a common data store and the components for synchronizing them.
- a first system 3602 has a WinFS data store 3612 comprising WinFS-to-WinFS Sync services 3622 and Core Sync Services 3624, for WinFS-to-non-WinFS synchronization, which exposes 3646 a Sync API 3652 for utilization.
- a second system 3604 has a WinFS data store 3614 comprising WinFS-to-WinFS Sync services 3632 and Core Sync Services 3634, for WinFS-to-non-WinFS synchronization, which exposes 3646 a Sync API 3652 for utilization.
- the first system 3602 and the second system 3604 synchronize 3642 via their respective WinFS-to-WinFS Sync services 3622 and 3632.
- the first system 3602 is aware of and directly synchronizes with both the second system 3604 and third system 3606.
- neither the second system 3604 nor the third system 3606 is aware of the other and, thus, they do not synchronize their changes directly with each other; instead, changes that occur on one system must propagate through the first system 3602.
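The propagation path in this topology can be illustrated with a toy model. The `System` class and `synchronize` function are hypothetical, and each system's state is reduced to a plain dictionary of change ids; the point is only that a change made on the second system reaches the third after two separate sync operations with the first.

```python
class System:
    """Toy stand-in for one system holding a replica."""

    def __init__(self, name):
        self.name = name
        self.changes = {}  # change id -> payload

    def make_change(self, change_id, payload):
        self.changes[change_id] = payload


def synchronize(a, b):
    """Two-way sync: each side receives the changes it lacks."""
    for change_id, payload in list(a.changes.items()):
        b.changes.setdefault(change_id, payload)
    for change_id, payload in list(b.changes.items()):
        a.changes.setdefault(change_id, payload)


first, second, third = System("3602"), System("3604"), System("3606")

second.make_change("chg-1", "rename folder")
synchronize(first, second)                   # 3604 <-> 3602
seen_by_third_before = "chg-1" in third.changes

synchronize(first, third)                    # 3602 <-> 3606
seen_by_third_after = "chg-1" in third.changes
```

The second and third systems never call `synchronize` on each other, yet the change still arrives at the third system through the first.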
- Change Enumeration allows sync adapters to easily enumerate the changes that have occurred to a data store Folder since the last time synchronization with this partner was attempted, based on the change-tracking data maintained by the synchronization service.
- for change enumeration, several embodiments of the present invention are directed to: • the efficient enumeration of changes to Items, Extensions and Relationships in a given replica, relative to a specified Knowledge instance.
- change application allows Sync Adapters to apply changes received from their backend to the local storage platform since the adapters are expected to transform the changes to the storage platform schema.
- several embodiments of the present invention are directed to: • the application of incremental changes from other replicas (or non-WinFS stores) with corresponding updates to WinFS change metadata. • the detection of conflicts on change application at change unit granularity. • the reporting of success, failure and conflicts at individual change unit level on change application, so that applications (including adapters and sync controlling apps) can use that information for progress, error and status reporting and for updating their backend state, if any.
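Change application with per-change-unit reporting can be sketched as below. This is an illustration under stated assumptions: the function `apply_changes`, the three status strings, the `validate` hook and the high-water-mark form of the sender's knowledge are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Version = Tuple[str, int]            # (partner id, version number)
ChangeKey = Tuple[str, str]          # (item id, change unit name)
Change = Tuple[ChangeKey, Version, str]


def apply_changes(store: Dict[ChangeKey, Tuple[Version, str]],
                  incoming: List[Change],
                  sender_knowledge: Dict[str, int],
                  validate: Callable[[str], None],
                  on_status: Callable[[ChangeKey, str], None]) -> None:
    """Apply each incoming change and report its outcome individually."""
    for key, version, value in incoming:
        current = store.get(key)
        if current is not None:
            partner, number = current[0]
            # Conflict: the sender never saw the version we currently hold.
            if sender_knowledge.get(partner, 0) < number:
                on_status(key, "conflict")
                continue
        try:
            validate(value)
        except ValueError:
            on_status(key, "failure")
            continue
        store[key] = (version, value)    # update data and change metadata
        on_status(key, "success")


def reject_empty(value: str) -> None:
    if not value:
        raise ValueError("empty value")


store = {("c1", "Name"): (("B", 4), "Bob")}
incoming = [
    (("c1", "Name"), ("A", 7), "Robert"),
    (("c1", "Phone"), ("A", 8), "555-0100"),
    (("c1", "Email"), ("A", 9), ""),
]
statuses = []
apply_changes(store, incoming, {"A": 9, "B": 3}, reject_empty,
              lambda key, status: statuses.append((key, status)))
```

An adapter could use the collected statuses to decide which changes to retry and how far to advance its backend state.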
- remoteKnowledge = ctx.ReplicaSynchronizer.GetUpdatedRemoteKnowledge(); reader.Close();
- ctx.ReplicaSynchronizer.ConflictPolicy = conflictPolicy;
- ctx.ReplicaSynchronizer.RemotePartnerId = remotePartnerId;
- ctx.ReplicaSynchronizer.RemoteKnowledge = remoteKnowledge;
- ctx.ReplicaSynchronizer.ChangeStatusEvent += FOO_OnChangeStatusEvent;
- // Obtain changes from remote store. Adapter is responsible for retrieving // its backend specific metadata from the store. This can be an extension // on the replica.
- remoteAnchor = FOO_GetRemoteAnchorFromStore();
- FOO_RemoteChangeCollection remoteChanges = FOO_GetRemoteChanges( remoteAnchor );
- the adapter API provides: • A standard mechanism to register adapters with the hardware/software interface system synchronization framework. • A standard mechanism for adapters to declare their capabilities and the type of configuration information needed to initialize the adapter. • A standard mechanism for passing initialization information to the adapter. • A mechanism for adapters to report progress status back to the applications invoking synchronization. • A mechanism to report any errors that occur during synchronization. • A mechanism to request cancellation of an ongoing synchronization operation. [0357] There are two potential process models for adapters, depending on the requirements of the scenario. The adapter can execute in the same process space as the application invoking it or in a separate process all by itself.
- To execute in its own separate process, the adapter defines its own factory class, which is used to instantiate the adapter.
- the factory can return an instance of the adapter in the same process as the invoking application, or return a remote instance of the adapter in a different Microsoft common language runtime application domain or process.
- a default factory implementation is provided which instantiates the adapter in the same process.
- many adapters will run in the same process as the invoking application.
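The factory arrangement can be sketched as follows. The class names are hypothetical, and `RemoteAdapterProxy` is only a stand-in for a handle to an adapter hosted elsewhere: no separate process or application domain is actually created in this sketch.

```python
class Adapter:
    """Adapter instantiated in the invoking application's process."""

    def __init__(self, name):
        self.name = name

    def location(self):
        return "in-process"


class RemoteAdapterProxy:
    """Stand-in for a handle to an adapter running in another process."""

    def __init__(self, name, host):
        self.name = name
        self.host = host

    def location(self):
        return f"remote:{self.host}"


class DefaultAdapterFactory:
    """Default behaviour: create the adapter in the caller's process."""

    def __init__(self, adapter_cls):
        self.adapter_cls = adapter_cls

    def create(self, name):
        return self.adapter_cls(name)


class RemoteAdapterFactory:
    """Adapter-defined factory that hands back a remote instance."""

    def __init__(self, host):
        self.host = host

    def create(self, name):
        return RemoteAdapterProxy(name, self.host)


def start_sync(factory, name):
    # The invoking application only talks to the factory, so it does not
    # need to know which process model the adapter uses.
    adapter = factory.create(name)
    return adapter.location()


local_result = start_sync(DefaultAdapterFactory(Adapter), "contacts")
remote_result = start_sync(RemoteAdapterFactory("sync-service"), "contacts")
```

Because the choice is hidden behind `create`, switching an adapter between process models does not change the calling code.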
- the out of process model is usually required for one or both of the following reasons: • Security purposes: the adapter must run in the process space of a certain process or service. • The adapter has to process requests from other sources (for example, incoming network requests) in addition to processing requests from invoking applications. [0358] Referring to Fig.
- one embodiment of the present invention presumes a simple adapter that is unaware of how state is calculated or how its associated metadata is exchanged.
- synchronization is achieved by the replica, in regard to the data source with which it wants to synchronize, by first, at step 3702, determining which changes have occurred since it last synchronized with said data source; the replica then transmits, to the data source via the adapter, its present state information together with the incremental changes that have occurred since this last synchronization.
- the adapter, upon receiving the change data from the replica in the previous step, implements as many changes to the data source as possible, tracks which changes are successful and which fail, and transmits the success-and-failure info back to WinFS (of the replica).
- WinFS: the hardware/software interface system of the replica.
- WinFS, upon receiving the success-and-failure info from the adapter, then calculates the new state information for the data source, stores this information for future use by its replica, and transmits this new state info back to the data source, that is, to the adapter for storage and subsequent use by the adapter.
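The exchange with a simple adapter can be sketched end to end as below. The source does not say how the new state is calculated, so this sketch assumes the state is a sequence-number high-water mark that advances only up to the first failed change; `enumerate_since`, `SimpleAdapter` and `compute_new_state` are hypothetical names.

```python
from typing import Dict, List


def enumerate_since(change_log: List[dict], last_state: int) -> List[dict]:
    """Replica side: changes that occurred after the last synchronization."""
    return [c for c in change_log if c["seq"] > last_state]


class SimpleAdapter:
    """Applies what it can and reports success or failure per change."""

    def __init__(self, read_only_keys):
        self.data_source = {}
        self.read_only_keys = set(read_only_keys)
        self.stored_state = 0

    def apply(self, changes: List[dict]) -> Dict[int, bool]:
        results = {}
        for change in changes:
            if change["key"] in self.read_only_keys:
                results[change["seq"]] = False       # data source refused it
            else:
                self.data_source[change["key"]] = change["value"]
                results[change["seq"]] = True
        return results


def compute_new_state(last_state: int, changes: List[dict],
                      results: Dict[int, bool]) -> int:
    """ASSUMED rule: advance to just before the first failed change,
    so that failed changes are enumerated again next time."""
    state = last_state
    for change in sorted(changes, key=lambda c: c["seq"]):
        if not results[change["seq"]]:
            break
        state = change["seq"]
    return state


change_log = [
    {"seq": 1, "key": "name", "value": "Ann"},
    {"seq": 2, "key": "phone", "value": "555-0100"},
    {"seq": 3, "key": "locked", "value": "x"},
    {"seq": 4, "key": "email", "value": "ann@example.com"},
]
adapter = SimpleAdapter(read_only_keys=["locked"])
last_state = 1

changes = enumerate_since(change_log, last_state)
results = adapter.apply(changes)
new_state = compute_new_state(last_state, changes, results)
adapter.stored_state = new_state  # state handed back to the adapter to keep
```

The adapter never interprets the state value; it only stores what it is given and presents it again at the next synchronization.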
- Each replica is a defined synchronization subset of data from the entirety of a data store — a slice of data having multiple instances.
- Conflict resolution policies are handled by each replica (and adapter/data source combination) individually; that is, each replica is able to resolve conflicts based on its own criteria and conflict resolution schema. Moreover, while differences in each instance of the data store may result and lead to additional future conflicts, the incremental and sequential enumeration of conflicts as reflected in updated state information is invisible to other replicas that receive that updated state information.
- At the root of the sync schema is the replica, which has a base type to define a root folder (in fact, a root Item) that has a unique ID, an ID for the sync community in which it is a member, and whatever filters and other elements are necessary or desirable for the specific replica.
- Each replica's "mapping" is maintained within the replica and, as such, the mapping for any particular replica is limited to the other replicas such replica knows about. While this mapping may only comprise a subset of the entire sync community, changes to said replica will still propagate to the entire sync community via commonly shared replicas (although any particular replica is unaware of which other replicas it is commonly sharing with an unknown replica).
- each replica may have multiple mappings in order to allow different synchronization behavior with different sync partners in the same sync community.
- a replica's mapping need only contain the community identification and the mapping identification of a sync partner; in this way, the replica is able to synchronize with a partner without necessarily knowing the physical location of the sync partner replica (thus enhancing security for the sync partner replica).
- the sync schema includes a plurality of predefined conflict handlers available to all replicas, as well as the ability to use user/developer-defined custom conflict handlers.
- the schema may also include three special "conflict resolvers": (a) a conflict "filter" which resolves different conflicts in different ways based on, e.g., (i) how to handle the same change unit being changed in two places, (ii) how to handle a change unit being changed in one place but deleted in another, and (iii) how to handle two different change units having the same name in two different locations; (b) a conflict "handler list" where each element of the list specifies a series of actions to attempt in order until the conflict is successfully resolved; and (c) a "do-nothing" log that tracks the conflict but takes no further action without user intervention.
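The three resolver kinds can be sketched as composable functions. Everything named here is hypothetical: the conflict kinds (`update-update`, `update-delete`), the dictionary shape of a conflict, and the individual handlers are illustrative and are not taken from the sync schema itself.

```python
from typing import Callable, Dict, List, Optional

Conflict = dict                                    # assumed shape, see below
Handler = Callable[[Conflict], Optional[str]]      # None means "not resolved"

conflict_log: List[Conflict] = []


def log_only(conflict: Conflict) -> Optional[str]:
    """'Do-nothing' log: record the conflict and leave it for the user."""
    conflict_log.append(conflict)
    return None


def handler_list(handlers: List[Handler]) -> Handler:
    """Try each handler in order until one resolves the conflict."""
    def resolve(conflict: Conflict) -> Optional[str]:
        for handler in handlers:
            result = handler(conflict)
            if result is not None:
                return result
        return None
    return resolve


def conflict_filter(by_kind: Dict[str, Handler], default: Handler) -> Handler:
    """Route each conflict to a handler chosen by the kind of conflict."""
    def resolve(conflict: Conflict) -> Optional[str]:
        return by_kind.get(conflict["kind"], default)(conflict)
    return resolve


def prefer_remote_if_present(conflict: Conflict) -> Optional[str]:
    return conflict["remote"] or None


def local_wins(conflict: Conflict) -> Optional[str]:
    return conflict["local"]


resolver = conflict_filter(
    {
        "update-update": handler_list([prefer_remote_if_present, local_wins]),
        "update-delete": log_only,
    },
    default=log_only,
)

r1 = resolver({"kind": "update-update", "local": "Bob", "remote": ""})
r2 = resolver({"kind": "update-delete", "local": "Bob", "remote": None})
```

In the first call the remote value is empty, so the handler list falls through to `local_wins`; in the second call the conflict is only logged.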
- the sync schema and use of replicas enable a true distributed peer-to-peer multi-master synchronization community. Moreover, there is no sync community type; the sync community exists simply as a value in the community field of the replicas themselves.
- Every replica has its own metadata for tracking incremental change enumeration and storing state information for the other replicas that are known in the sync community.
- Change units have their own metadata comprising: a version comprising a partner key plus a partner change number; an Item/Extension/Relationship versioning for each change unit; Knowledge regarding the changes a replica has seen/received from the sync community; a GUID and Local ID configuration; and a GUID stored on a reference relationship for cleanup.
- the present invention is directed to a storage platform for organizing, searching, and sharing data.
- the storage platform of the present invention extends and broadens the concept of data storage beyond existing file systems and database systems, and is designed to be the store for all types of data, including structured, non-structured, or semi-structured data, such as relational (tabular) data, XML, and a new form of data called Items.
- the storage platform of the present invention enables more efficient application development for consumers, knowledge workers and enterprises. It offers a rich and extensible application programming interface that not only makes available the capabilities inherent in its data model, but also embraces and extends existing file system and database access methods.
- This program code may be stored on a computer-readable medium, such as a magnetic, electrical, or optical storage medium, including without limitation a floppy diskette, CD-ROM, CD-RW, DVD-ROM, DVD-RAM, magnetic tape, flash memory, hard disk drive, or any other machine-readable storage medium, wherein, when the program code is loaded into and executed by a machine, such as a computer or server, the machine becomes an apparatus for practicing the invention.
- the present invention may also be embodied in the form of program code that is transmitted over some transmission medium, such as over electrical wiring or cabling, through fiber optics, over a network, including the Internet or an intranet, or via any other form of transmission, wherein, when the program code is received and loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention.
- When implemented on a general-purpose processor, the program code combines with the processor to provide a unique apparatus that operates analogously to specific logic circuits.
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Applications Claiming Priority (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US646632 | 2003-08-21 | ||
WOPCT/US03/26144 | 2003-08-21 | ||
US10/646,632 US7529811B2 (en) | 2003-08-21 | 2003-08-21 | Systems and methods for the implementation of a core schema for providing a top-level structure for organizing units of information manageable by a hardware/software interface system |
PCT/US2003/026144 WO2005029313A1 (en) | 2003-08-21 | 2003-08-21 | Systems and methods for data modeling in an item-based storage platform |
US10/693,362 US8166101B2 (en) | 2003-08-21 | 2003-10-24 | Systems and methods for the implementation of a synchronization schemas for units of information manageable by a hardware/software interface system |
US693362 | 2003-10-24 | ||
PCT/US2004/024287 WO2005024626A1 (en) | 2003-08-21 | 2004-07-29 | Systems for the implementation of a synchronization schemas |
Publications (2)
Publication Number | Publication Date |
---|---|
EP1573508A1 EP1573508A1 (en) | 2005-09-14 |
EP1573508A4 true EP1573508A4 (en) | 2006-04-26 |
Family
ID=34279605
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP04757348A Ceased EP1573508A4 (en) | 2003-08-21 | 2004-07-29 | Systems for the implementation of a synchronization schemas |
Country Status (5)
Country | Link |
---|---|
EP (1) | EP1573508A4 (en) |
JP (1) | JP4583375B2 (en) |
KR (1) | KR101109399B1 (en) |
CN (1) | CN1739093B (en) |
WO (1) | WO2005024626A1 (en) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8166101B2 (en) | 2003-08-21 | 2012-04-24 | Microsoft Corporation | Systems and methods for the implementation of a synchronization schemas for units of information manageable by a hardware/software interface system |
US8131739B2 (en) | 2003-08-21 | 2012-03-06 | Microsoft Corporation | Systems and methods for interfacing application programs with an item-based storage platform |
US8238696B2 (en) | 2003-08-21 | 2012-08-07 | Microsoft Corporation | Systems and methods for the implementation of a digital images schema for organizing units of information manageable by a hardware/software interface system |
US7590643B2 (en) | 2003-08-21 | 2009-09-15 | Microsoft Corporation | Systems and methods for extensions and inheritance for units of information manageable by a hardware/software interface system |
US7805422B2 (en) | 2005-02-28 | 2010-09-28 | Microsoft Corporation | Change notification query multiplexing |
US7788163B2 (en) | 2005-03-11 | 2010-08-31 | Chicago Mercantile Exchange Inc. | System and method of utilizing a distributed order book in an electronic trade match engine |
US7991740B2 (en) * | 2008-03-04 | 2011-08-02 | Apple Inc. | Synchronization server process |
EP2625630A4 (en) * | 2010-10-04 | 2017-02-08 | Telefonaktiebolaget LM Ericsson (publ) | Data model pattern updating in a data collecting system |
TWI497311B (en) * | 2013-03-28 | 2015-08-21 | Quanta Comp Inc | Inter-device communication transmission system and method thereof |
US10402744B2 (en) | 2013-11-18 | 2019-09-03 | International Business Machines Corporation | Automatically self-learning bidirectional synchronization of a source system and a target system |
US9542467B2 (en) | 2013-11-18 | 2017-01-10 | International Business Machines Corporation | Efficiently firing mapping and transform rules during bidirectional synchronization |
US9367597B2 (en) | 2013-11-18 | 2016-06-14 | International Business Machines Corporation | Automatically managing mapping and transform rules when synchronizing systems |
US11977549B2 (en) | 2016-09-15 | 2024-05-07 | Oracle International Corporation | Clustering event processing engines |
CN113886191B (en) * | 2021-10-25 | 2024-07-12 | 北京轻舟智航科技有限公司 | Real-time tracking data processing method and device for automatic driving system |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5774717A (en) * | 1995-12-15 | 1998-06-30 | International Business Machines Corporation | Method and article of manufacture for resynchronizing client/server file systems and resolving file system conflicts |
WO2002075539A2 (en) * | 2001-03-16 | 2002-09-26 | Novell, Inc. | Client-server model for synchronization of files |
US6553391B1 (en) * | 2000-06-08 | 2003-04-22 | International Business Machines Corporation | System and method for replicating external files and database metadata pertaining thereto |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE69625751T2 (en) * | 1995-07-07 | 2003-10-02 | Sun Microsystems, Inc. | Method and system for synchronizing execution of events when testing software |
US5893106A (en) * | 1997-07-11 | 1999-04-06 | International Business Machines Corporation | Object oriented server process framework with interdependent-object creation |
EP1187421A3 (en) * | 2000-08-17 | 2004-04-14 | FusionOne, Inc. | Base rolling engine for data transfer and synchronization system |
US7711771B2 (en) * | 2001-05-25 | 2010-05-04 | Oracle International Corporation | Management and synchronization application for network file system |
JP2005509979A (en) * | 2001-11-15 | 2005-04-14 | ヴィスト・コーポレーション | Asynchronous synchronization system and method |
GB0128243D0 (en) * | 2001-11-26 | 2002-01-16 | Cognima Ltd | Cognima patent |
-
2004
- 2004-07-29 JP JP2006523856A patent/JP4583375B2/en not_active Expired - Fee Related
- 2004-07-29 WO PCT/US2004/024287 patent/WO2005024626A1/en active Application Filing
- 2004-07-29 EP EP04757348A patent/EP1573508A4/en not_active Ceased
- 2004-07-29 CN CN2004800023968A patent/CN1739093B/en not_active Expired - Fee Related
- 2004-07-29 KR KR1020057012324A patent/KR101109399B1/en active IP Right Grant
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5774717A (en) * | 1995-12-15 | 1998-06-30 | International Business Machines Corporation | Method and article of manufacture for resynchronizing client/server file systems and resolving file system conflicts |
US6553391B1 (en) * | 2000-06-08 | 2003-04-22 | International Business Machines Corporation | System and method for replicating external files and database metadata pertaining thereto |
WO2002075539A2 (en) * | 2001-03-16 | 2002-09-26 | Novell, Inc. | Client-server model for synchronization of files |
Non-Patent Citations (4)
Title |
---|
RICHARD G GUY ET AL: "Implementation of the Ficus Replicated File System", PROCEEDINGS OF THE SUMMER USENIX CONFERENCE, 30 June 1990 (1990-06-30), pages 63 - 71, XP002234187 * |
See also references of WO2005024626A1 * |
SESHADRI P ET AL: "SQLServer for Windows CE-a database engine for mobile and embedded platforms", DATA ENGINEERING, 2000. PROCEEDINGS. 16TH INTERNATIONAL CONFERENCE ON SAN DIEGO, CA, USA 29 FEB.-3 MARCH 2000, LOS ALAMITOS, CA, USA,IEEE COMPUT. SOC, US, 29 February 2000 (2000-02-29), pages 642 - 644, XP010378761, ISBN: 0-7695-0506-6 * |
SYNCML CONSORTIUM: "SyncML Sync Protocol, version 1.0", SYNCML CONSORTIUM, 7 December 2000 (2000-12-07), XP002217356 * |
Also Published As
Publication number | Publication date |
---|---|
JP4583375B2 (en) | 2010-11-17 |
CN1739093B (en) | 2010-05-12 |
KR20070083241A (en) | 2007-08-24 |
WO2005024626A1 (en) | 2005-03-17 |
CN1739093A (en) | 2006-02-22 |
KR101109399B1 (en) | 2012-01-30 |
EP1573508A1 (en) | 2005-09-14 |
JP2007503049A (en) | 2007-02-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7743019B2 (en) | Systems and methods for providing synchronization services for units of information manageable by a hardware/software interface system | |
US8166101B2 (en) | Systems and methods for the implementation of a synchronization schemas for units of information manageable by a hardware/software interface system | |
US7483923B2 (en) | Systems and methods for providing relational and hierarchical synchronization services for units of information manageable by a hardware/software interface system | |
US7512638B2 (en) | Systems and methods for providing conflict handling for peer-to-peer synchronization of units of information manageable by a hardware/software interface system | |
US7401104B2 (en) | Systems and methods for synchronizing computer systems through an intermediary file system share or device | |
US7917534B2 (en) | Systems and methods for extensions and inheritance for units of information manageable by a hardware/software interface system | |
CA2512185C (en) | Systems and methods for providing synchronization services for units of information manageable by a hardware/software interface system | |
ZA200504391B (en) | Systems and methods for extensions and inheritance for units of information manageable by a hardware/software interface system | |
EP1620781A2 (en) | Systems and methods for the implementation of a digital images schema for organizing units of information manageable by a hardware/software interface system | |
CA2506337C (en) | Systems and methods for extensions and inheritance for units of information manageable by a hardware/software interface system | |
JP4580389B2 (en) | System and method for synchronizing computer systems via an intermediary file system share or intermediary device | |
EP1573508A1 (en) | Systems for the implementation of a synchronization schemas | |
NZ540221A (en) | Systems and methods for extensions and inheritance for units of information manageable by a hardware/software interface system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
17P | Request for examination filed | Effective date: 20050620 |
AK | Designated contracting states | Kind code of ref document: A1 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PL PT RO SE SI SK TR |
AX | Request for extension of the european patent | Extension state: AL HR LT LV MK |
A4 | Supplementary search report drawn up and despatched | Effective date: 20060315 |
RIC1 | Information provided on ipc code assigned before grant | Ipc: G06F 17/30 20060101AFI20060309BHEP |
DAX | Request for extension of the european patent (deleted) | ||
RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC |
REG | Reference to a national code | Ref country code: DE Ref legal event code: R003 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION HAS BEEN REFUSED |
18R | Application refused | Effective date: 20171116 |