WO2011110534A2 - Virtual shell - Google Patents

Info

Publication number
WO2011110534A2
Authority
WO
WIPO (PCT)
Prior art keywords
module
virtual
objects
requestable
namespaces
Prior art date
Application number
PCT/EP2011/053410
Other languages
French (fr)
Other versions
WO2011110534A3 (en)
Inventor
Ernest Artiaga Amouroux
Antonio CORTÉS ROSSELLÓ
Jonathan MARTÍ FRAIZ
Original Assignee
Universitat Politècnica De Catalunya
Barcelona Supercomputing Center - Centro Nacional De Supercomputación
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Universitat Politècnica De Catalunya, Barcelona Supercomputing Center - Centro Nacional De Supercomputación
Priority to EP11709070.4A (EP2585946A2)
Publication of WO2011110534A2
Publication of WO2011110534A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/188 Virtual file systems
    • G06F16/192 Implementing virtual folder structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/188 Virtual file systems

Definitions

  • The present invention relates to a virtual shell for producing one or more responses to a user request related to one or more objects.
  • The present invention also relates to a context monitoring parameters module, to a multi-namespaces module, to an underlying interface module, to an interception module, and to a logic module, each of said modules for being used in the virtual shell.
  • The present invention further relates to a method for producing one or more responses to a user request related to one or more objects, and to a computer program product comprising program instructions for causing a computer to perform said method.
  • Storage systems are designed to keep data meant to be persistent (at least for a certain time).
  • the time to access data in a storage system is usually divided into “seek time” (time to prepare the access - e.g. to move the read/write head to the right position in a rotational disk and wait for the desired data to reach the position below the head) and “transfer time” (time to actually send the data from the storage system to the system where it is to be used).
  • storage systems try to improve performance by reducing the impact of the "seek time”, and a simple way to do this is to access as much data as possible at the same position; as a consequence, data is usually grouped in large “blocks", which can be read or written at once.
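The effect of block size on access time can be illustrated with a small calculation. This is only a sketch under simplifying assumptions that are not in the source: every block is assumed to cost one full seek, and the device figures (10 ms per seek, 100 000 bytes per millisecond) and the function name `access_time_ms` are hypothetical.

```python
def access_time_ms(total_bytes, block_bytes, seek_ms, bandwidth_bytes_per_ms):
    """Time to read total_bytes when each block costs one seek (assumed worst case)."""
    blocks = -(-total_bytes // block_bytes)            # ceiling division
    transfer_ms = total_bytes / bandwidth_bytes_per_ms
    return blocks * seek_ms + transfer_ms

# Hypothetical device: 10 ms per seek, 100 000 bytes per ms.
small_blocks = access_time_ms(1_000_000, 4_000, 10.0, 100_000)
one_large_block = access_time_ms(1_000_000, 1_000_000, 10.0, 100_000)
```

The transfer time is identical in both cases; only the number of seeks changes, which is why grouping data into large blocks reduces the total.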
  • the "pure data” required by the user are not the only possible contents of a storage system. Additional information may be required by the storage system, for example, to provide higher-level abstractions (such as files or directories) and link them to the "pure data", to provide a namespace to organize and provide means to locate a specific piece of data, or to attach "attributes” to pieces of data (e.g. the type of said data or who is the owner). This additional information is called “metadata”. As well as “pure data”, the additional "metadata" is usually expected to be persistent; therefore, it also needs to be stored in a storage system (either in the same media where the "pure data” resides, or somewhere else), and the considerations about the benefits of large “blocks” also apply to them.
  • file systems can be considered a particular type of storage system offering abstractions such as “files” or “directories” on top of “pure data” (the files' contents) and “metadata” (the namespace and file's attributes, among other possible things).
  • Another trend consists in integrating data and metadata with tight links, possibly storing them together (as the "metadata" of a file, for example, will probably be needed to access the "pure data" of that file adequately). Examples of these trends can be easily found. For instance, the old FAT-based file system grouped inside a directory entry the name of the file or directory, its attributes and its starting physical location. Directories were a special type of file with a collection of entries corresponding to their children in the hierarchy, so that all the "metadata" of the directory contents was retrieved together.
  • file attributes and physical location information in most Unix-based file systems are grouped in a single data structure called "i-node", while the namespace is specified via directory contents (a special file containing file and subdirectory names, and their corresponding "i-node” identifier).
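The separation between the "i-node" and the directory contents can be sketched as two small data structures. This is an illustrative model only, not the on-disk layout of any real file system; the class `Inode`, the identifiers 2 and 11, and the helper `lookup` are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Inode:
    """Attributes and physical location grouped in a single structure."""
    owner: str
    size: int
    block_addresses: list = field(default_factory=list)

# Hypothetical i-node table: identifier -> i-node.
inodes = {
    2: Inode(owner="root", size=0),                          # a directory
    11: Inode(owner="alice", size=5, block_addresses=[840]),  # a regular file
}

# Directory contents: name -> i-node identifier. The namespace lives here,
# so the i-node itself carries no name.
root_dir = {"notes.txt": 11}

def lookup(directory, name):
    """Resolve a name through the directory, then fetch the i-node."""
    return inodes[directory[name]]

# A second name for the same i-node (a hard link) only touches the directory.
root_dir["alias.txt"] = 11
```

Because names are kept in the directory and not in the i-node, adding a second name does not duplicate any attributes or data.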
  • MS Windows NTFS file system groups most file attributes in a large Master File Table (MFT).
  • MFT entries also contain the name of the file, and in case of small files (or directories) may also contain the actual "pure data" of the file (or the list of file or subdirectory entries, for directories).
  • the amalgamation of "pure data” and "metadata” may be even higher than in the case of some Unix-based file systems.
  • The hierarchical file system structure becomes rigid in the sense that, by default, a specific file or directory is only available in a unique precise location of such directory tree. Moreover, the file organization is fixed for the whole system: it is possible to protect or restrict access to certain areas, but it is not possible to offer radically distinct fully functional views to different users or applications, for example.
  • The word "attribute" usually refers to pieces of data intrinsically related to objects. It is important to remark that the values categorized as "attributes" depend on the object itself, and not on external factors, the environment, history, or other parts of the system. For instance, when considering file system objects (such as files, directories, or symbolic links), the following are examples of "attributes": the file size, the owner of the file, the creation or modification times, or specific access permissions associated with each defined security domain. Some examples of things that cannot be considered "attributes" in the current sense are: the user accessing an object or his/her role, the time of the system, the properties of the back-end storage where the object resides (e.g. the disk size, or the maximum file size allowed in a particular file system), or the path in a hierarchical tree followed to reach an object for a particular access (as some file systems allow different paths to point to a single object, and the specific path followed for a particular access depends on factors external to the object).
  • The "attributes" of an object may contain information that the system (or, in particular, the file system) knows how to interpret. For example, a file system may know that an object representing a file with a size value of zero represents an empty file (i.e. a file with no data) and may use that information for its own advantage (e.g. to apply certain optimizations); similarly, a system may also be able to interpret an access control list associated to a particular object, and be able to prevent unauthorized access based on said access control list.
  • The "attributes" may also contain data that do not necessarily have a meaning for the system; in other words, they may consist of arbitrary pieces of data assigned by an upper layer, an application, or a user.
  • Some literature refers to this particular type of “attributes” as “extended attributes", and they are also intrinsically attached to a particular object.
  • a typical example of such arbitrary “attributes” could be "tags” that some systems assign to objects in order to categorize them. For example, they could be used to tag an audio file as containing rock music, and/or belonging to a certain album by a certain author; then, an application could use such information to perform specific actions, such as display the name of the file in a particular fashion.
  • Arbitrary "attributes” can also be used as a means by middleware or system software to provide support for functionalities not available in an underlying file system. For example, if an underlying file system does not support long names, a middleware could save a long name in an attribute and display it when required, while the underlying file system keeps using short names for its own management.
  • object “attributes” can also consist of references to other objects (either direct or indirect), or data that can be used to generate such references.
  • a directory object in a file system may have an attribute indicating which its parent directory is, or a symbolic link may contain a path that can be converted to a reference to a different file system object.
  • Attributes are those pieces of data considered properties intrinsically attached to the object itself and not depending on the environment, properties or behaviour of elements other than the object. Unless explicitly stated otherwise, no distinction will be made between "attributes", "extended attributes" and "tags", and all of them will be referred to simply as "attributes".
  • a feature of the present invention is the ability of some "objects” to "refer to” other "objects".
  • the "referred objects” represent the "contents” of the "referring object” (for example, in a file system environment, it could be considered that a file, as an abstraction, is represented by a first object, while the actual data contained in the file is represented by a second object, referred by the first object - so the second object represents the "contents" of the first object).
  • Such “references” to "objects” are, in fact, pieces of data that allow (either directly or indirectly) to identify, and eventually to reach, the "referred object” and can be considered as pieces of data intrinsically related to the "referring object”; they can be considered, therefore, “attributes” of the "referring object”.
  • the expression “contents location” will be used to categorize entities specifically related to such pieces of data.
  • a piece of data leading to a "virtual object” (which will be defined below) is called a “virtual contents location”
  • a piece of data leading to a "requestable object” (also defined below) is called a "requestable object contents location”.
  • An entity able to contain one or more of such pieces of data is called a "contents location container". Therefore, saying that a certain type of object (e.g. a "virtual object") comprises a "contents location container" is equivalent to saying that said object has an "attribute" whose values are one or more pieces of data of type "virtual contents location" or "requestable object contents location". (Note that having a "contents location container" able to contain multiple "contents locations" of either type is equivalent to having multiple "contents location containers" with single values in them; the first form will be mainly used, but both forms are interchangeable.)
  • A "contents location" piece of data does not need to be a direct reference to an object: it may be an indirect reference (requiring several "hops" to reach the "target" object) or may even require, at some step, some processing to be converted to an entity allowing the "target" object to be reached (e.g. an opaque piece of data that has to be decoded by a specific module to determine to which object it refers).
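Resolving a "contents location" that is direct, indirect, or opaque could look roughly like the following. This is a minimal sketch, not the mechanism claimed in the patent: the tuple encoding `(kind, value)`, the tables `objects` and `aliases`, the use of Base64 as the "opaque" encoding, and the hop limit are all assumptions made for illustration.

```python
import base64

# Hypothetical object store: identifier -> object (plain dicts here).
objects = {"obj-1": {"kind": "requestable", "payload": "hello"}}

# Hypothetical indirection table: alias -> another contents location.
aliases = {"alias-a": ("alias", "alias-b"), "alias-b": ("direct", "obj-1")}

def resolve(location, max_hops=8):
    """Follow a contents location until it yields the target object."""
    for _ in range(max_hops):
        kind, value = location
        if kind == "direct":
            return objects[value]
        if kind == "alias":        # indirect reference: one more hop
            location = aliases[value]
        elif kind == "opaque":     # must be decoded by a specific module first
            location = ("direct", base64.b64decode(value).decode("ascii"))
        else:
            raise ValueError(f"unknown location kind: {kind}")
    raise RuntimeError("too many hops")
```

The hop limit is a design choice of this sketch to guard against cyclic references; the source does not state how such cycles would be handled.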
  • a "user” makes requests related to one or more "objects", and responses to such requests are produced by a "virtual shell".
  • By "user" it is meant the user, application, library, daemon or any other kind of system or mechanism able to make requests related to "objects" through any means.
  • an "object” may be any entity that can be operated with (e.g. queried, updated, modified, created, removed, interacted, etc.); for the sake of clarity, different types of objects will be defined and differently named to avoid confusions.
  • The term "underlying objects" is used to refer to "objects" that exist outside the "virtual shell".
  • underlying objects may have “attributes” and, for convenience, it may be assumed that such "underlying objects” are comprised in one or more "underlying object repositories” (or simply “underlying repositories”).
  • the "underlying objects” could be files, directories and/or any other file system objects, and the file system itself could be considered an "underlying repository”.
  • the "underlying repository” may have properties which do not depend on any particular "underlying object” contained in it, but may be considered related to them (e.g. in the case of a file system, the total amount of free space, or the maximum size allowed for a file, could be considered examples of such properties associated to the "underlying repository” itself).
  • In order to be "accessible" by the "user", an "underlying object" must have one or more corresponding "requestable objects". Unlike "underlying objects", "requestable objects" exist inside the "virtual shell", and may refer to one or more "underlying objects". There are multiple possible relations between "requestable objects" and "underlying objects". The simplest one would be a "requestable object" being related to a single "underlying object" (e.g. a file in an underlying file system made available as a single "requestable file" through the virtual shell). Nevertheless, other combinations are possible: a single "requestable object" may refer to multiple "underlying objects" (e.g. a "requestable file" being striped into two or more fragments stored in different devices, or a "requestable file" referring to different "underlying files" containing different historical versions of the "requestable file" contents); multiple "requestable objects" may refer to a single "underlying object" (e.g. two different "requestable files" being stored in different ranges of a single "underlying file", such as in a package file like "tar" or "zip"); and, of course, any combination of the previous. Additionally, a "requestable object" may not refer to any "underlying object" at all; that would be the case where the "requestable object" is generated and handled internally by the "virtual shell" itself (e.g. a file whose contents can be dynamically generated as needed, such as /dev/null or /dev/zero in Unix environments).
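The relations between "requestable objects" and "underlying objects" described above can be sketched as follows. The repository contents, the names, and the helper functions are hypothetical, and the fragments are simply concatenated, which is a simplification of real striping.

```python
# Hypothetical underlying repository: name -> bytes.
underlying = {
    "disk0/part": b"HELLO ",
    "disk1/part": b"WORLD",
    "archive.tar": b"aaaabbbbbb",
}

def read_fragments(parts):
    """One requestable file assembled from several underlying files."""
    return b"".join(underlying[p] for p in parts)

def read_range(name, start, end):
    """A requestable file stored in a byte range of one underlying file."""
    return underlying[name][start:end]

def read_zeros(length):
    """A requestable file with no underlying object at all (like /dev/zero)."""
    return bytes(length)

# Requestable objects: each one knows how to produce its own contents.
requestable = {
    "greeting": lambda: read_fragments(["disk0/part", "disk1/part"]),
    "member_a": lambda: read_range("archive.tar", 0, 4),
    "member_b": lambda: read_range("archive.tar", 4, 10),
    "zeros": lambda: read_zeros(4),
}
```

Here "greeting" maps one requestable object to two underlying objects, "member_a" and "member_b" share a single underlying object, and "zeros" has none.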
  • There may also be "underlying objects" with no corresponding "requestable objects" inside the "virtual shell"; such objects cannot be explicitly manipulated by "user" requests, but may be used internally by the "virtual shell" to keep any kind of data (e.g. the "virtual shell" can use one or more "underlying files" to keep its state information and be able to recover in case of failure, and such "underlying files" may not be accessible, or even visible, by the "user", so that no corresponding "requestable object" is related to them).
  • In the context of this invention, it is considered that, when an "underlying object" is related to at least one "requestable object", the "underlying object" attributes can be made accessible as equivalent "requestable object" attributes; in other words, when a "user" request involves querying or modifying a "requestable object" attribute, the corresponding "underlying object" attribute or attributes can also be queried or modified accordingly.
  • the invention deals with a third type of objects: the "virtual objects”.
  • The "virtual objects" exist inside the "virtual shell" and, as any object, can have a set of "attributes" (for clarity, attributes of "virtual objects" will often be referred to as "virtual attributes"). They serve two main purposes: providing structural support, and providing "virtual views" of "requestable objects".
  • a "virtual object” does not have to refer to any particular “requestable object”: they may be purely virtual constructs that can be used, for example, as containers or as references to other elements (for example, in a file system context, a "virtual object” may act as a "virtual directory", containing a list of entries, and not corresponding to any particular actual directory in an underlying file system; also, as a different example, a "virtual object” may have an attribute referring to a particular entry in a namespace, acting as a "symbolic link” or "shortcut”).
  • When acting as "virtualization" elements, "virtual objects" usually refer to one or more "requestable objects", having a set of virtual attributes independent from the "attributes" of the referred "requestable objects". Then, the "virtual objects" may either offer potentially different "virtual views" of the same "requestable object", or combine multiple "requestable objects" into a unified "view".
  • For example, different "virtual objects" may contain different values for a set of categorization attributes, but refer to the same "requestable object"; then, a categorization tool would classify the same "requestable object" in different ways depending on the "virtual object" being accessed (which, in turn, could depend on the user, his/her role, or the application being used).
  • A "virtual object" can transparently alter the accesses to a referred "requestable object" (e.g. accessing a first "virtual object" may cause the encryption of the object contents, while accessing a second "virtual object" referring to the same "requestable object" may access the contents in clear - and the selection of the "virtual object" to use could depend, for example, on the user/application making the access).
  • A "virtual object" could refer to several "requestable objects" and divert the requested access to one of them depending on environment conditions (for example, a "virtual object" may represent a file containing a photograph; when accessing from a fully fledged computer, the "virtual object" may direct the access to a "requestable object" containing a high resolution version, while when accessing from low capacity hardware - or through a slow connection - the "virtual object" may divert the access to a different "requestable object" containing a scaled down version).
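The photograph example can be sketched as a "virtual object" that picks one of two "requestable objects" per access. The class name `VirtualPhoto`, the file names, and the bandwidth threshold are hypothetical; the source does not specify how the environment condition is measured.

```python
from dataclasses import dataclass

@dataclass
class VirtualPhoto:
    """A virtual object referring to two requestable objects."""
    high_res: str
    low_res: str
    min_bandwidth_kbps: int = 1000   # hypothetical threshold

    def select(self, bandwidth_kbps):
        """Pick the requestable object to serve for this particular access."""
        if bandwidth_kbps >= self.min_bandwidth_kbps:
            return self.high_res
        return self.low_res

photo = VirtualPhoto(high_res="photo_4000px.jpg", low_res="photo_800px.jpg")
```

The caller always addresses the same virtual object; only the environment condition passed to `select` decides which version is delivered.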
  • The term "namespace" refers to the way objects are named and organized.
  • "Namespace" management or handling includes the set of data required to maintain such organization, and the operations that can be done to create, destroy, modify, or query such organization.
  • Examples of "namespaces" are the hierarchical tree-like organizations used by many popular file systems (such as NTFS or ext4).
  • Another example could be the hierarchy, usually generated in an automatic way by some applications, that organizes, for example, music files by author and album name.
  • The term "entry" is used to refer to an element in a "namespace".
  • an “entry” may either act as a container for other "entries”, or be final. Independently of being final or not, an “entry” may also represent an actual object, or may be simply an artefact to facilitate the organization of the actual objects.
  • a directory is an actual type of object (with its corresponding data structures stored in the disk and its own internal identifier), which is able to act as a "container” of other file system objects (either other directories, files, or other types of objects), and which has an "entry" representing it in the file system "namespace”.
  • an "entry" representing a particular album and acting as a container for the files with the album tracks does not have to correspond to an actual object (e.g. a directory or a file in the underlying system).
  • Some "namespaces" may be generated automatically from a set of objects and some generating rules, and might not be directly operable. In contrast, other "namespaces" allow direct manipulation.
  • Typical operations on a “namespace” may include creating or removing "entries” in a given “container entry”, hiding or showing specific "entries”, renaming “entries”, moving or duplicating “entries” from one container into another, or linking an "entry” to either an object or another "entry”.
  • Manipulating the "namespace” may or may not have an effect on the actual objects being represented, depending on the system offering the "namespace”.
  • a particular system providing one or more "namespaces” to organize its objects may restrict the available operations, either by reducing the set of operations that can be performed, or by imposing restrictions on the conditions for a certain operation to be valid or authorized.
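A "namespace" whose "entries" may be pure containers or may refer to objects, together with a few of the operations listed above, could be modelled like this. It is a minimal sketch with hypothetical names (`Entry`, `Namespace`, `obj-42`) and no restrictions or permissions; paths are given as lists of names.

```python
class Entry:
    """A namespace entry: may contain other entries, refer to an object, or both."""
    def __init__(self, target=None):
        self.target = target     # identifier of the object represented, if any
        self.children = {}       # name -> Entry

class Namespace:
    def __init__(self):
        self.root = Entry()

    def _walk(self, path):
        entry = self.root
        for name in path:
            entry = entry.children[name]
        return entry

    def create(self, path, target=None):
        self._walk(path[:-1]).children[path[-1]] = Entry(target)

    def move(self, src, dst):
        """Rename or move an entry; the object it refers to is untouched."""
        entry = self._walk(src[:-1]).children.pop(src[-1])
        self._walk(dst[:-1]).children[dst[-1]] = entry

    def resolve(self, path):
        return self._walk(path).target

ns = Namespace()
ns.create(["Artist"])                                  # container, no object
ns.create(["Artist", "Album"])                         # container, no object
ns.create(["Artist", "Album", "track1"], target="obj-42")
ns.create(["Favourites"])
ns.create(["Favourites", "best"], target="obj-42")     # second entry, same object
ns.move(["Artist", "Album", "track1"], ["Artist", "Album", "01-intro"])
```

In this model, manipulating the namespace never changes the object itself, which is one of the two behaviours the text allows.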
  • The term "virtual namespace" (or simply "namespace") will be used to refer to each of the namespaces handled by the "multi-namespaces module".
  • Such "namespaces" will comprise "entries" able to refer to either "virtual objects" or "requestable objects" (depending on the particular embodiments) and to organize them.
  • The term "underlying namespace" will be explicitly used to refer to the organization of "underlying objects" in a particular "underlying repository".
  • requests made by the "user” include accesses to the "objects” (either “virtual objects” or “requestable objects") but also requests about the "virtual shell” itself and the environment (e.g. how many "objects” are handled by the “virtual shell”, or how much storage capacity the “virtual shell” can handle through any attached “underlying repositories”).
  • By "accesses to objects" it is meant any type of "access", including: manipulation of "contents" (pure data associated to the "object"), such as creating contents, removing the contents totally or in part, replacing the contents totally or in part, adding contents, reading the contents totally or in part, querying information about the contents, or any combination of the previous; manipulation of "attributes", such as creating attributes, removing attributes, replacing attribute values totally or in part, adding attribute values, reading values, querying information about the attributes, or any combination of the previous; and operations on the object as a whole, such as creating a new object, removing an object, duplicating an object, querying or setting attributes from or to the object (such as setting the owner, or querying the size of associated contents), or associating or de-associating a particular object to or from an "entry" in a given "namespace".
  • requests from the "user” may also involve manipulation of namespaces, including, for example, namespace creation or removal, selection of namespaces, or
  • a given object may be potentially associated to multiple "entries” in a "namespace", and even appear simultaneously in several "namespaces”.
  • the requests allowed when referring to an "object” via the different “entries” and the responses to such requests do not have to be necessarily the same, but the "object” must be left in a consistent state (i.e. further requests must be possible - unless the object has been destroyed - and the expected responses must be generated).
  • The "virtual shell" may be responsible for guaranteeing such consistency.
  • The "virtual shell" may also be responsible for coordinating simultaneous or concurrent requests and for guaranteeing consistent responses and a consistent state of the "objects".
  • file systems also provide a specific "semantics".
  • File systems offer a high level view of the storage to the user (and the applications), allowing the organization of data into files, which are usually placed in a hierarchical namespace composed of directories, subdirectories and other types of objects.
  • Applications use the file system functionalities via the file system interface, which is composed by a set of operations (sequences of requests and responses) that allow the interactions with the file system. So, file systems also define the "semantics" of such operations, establishing if a particular request is valid and the possible effects it may have under specific circumstances.
  • A "virtual shell" is able to offer "file system semantics" if it is able to intercept an interface of requests and responses and provide a behaviour which is compatible with the "semantics" of a conventional file system (e.g. POSIX or NTFS).
  • A "virtual shell" can offer multiple "semantics" simultaneously (e.g. it could offer a POSIX standard interface to POSIX applications, NTFS semantics to a Windows-based application, and even a different semantics to a specially tailored application), maintaining a consistent behaviour for each particular "semantics".
  • "File system semantics" implies that the result of a particular interface request is determined not only by "attributes" intrinsically related to the object being accessed (file, directory, etc.), but also by more complex factors, such as the position of the object in a particular hierarchy, the path followed to reach it, environment factors such as the time of access or the geographical location of the client trying to access it, or even the timing of a sequence of operations.
  • For example, on POSIX, the authorization to remove an object does not depend on the permissions of the object itself, but on the permissions of the parent directory, according to the path followed to reach the object; on NTFS, certain directory operations such as renames can be forbidden if certain requests are being processed on descendant objects in the "namespace" hierarchy; and as a last example, the position where some new content is written into a file may also depend on other requests being concurrently carried out by other users or applications.
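The POSIX removal rule mentioned above can be observed directly. The sketch below assumes it runs on a POSIX system with an ordinary temporary directory that the current user can write to; the function name `remove_unwritable_file` is hypothetical.

```python
import os
import tempfile

def remove_unwritable_file():
    """Create a file with no permission bits, then remove it via its parent."""
    with tempfile.TemporaryDirectory() as parent:
        path = os.path.join(parent, "locked.txt")
        with open(path, "w") as handle:
            handle.write("data")
        os.chmod(path, 0o000)     # the object itself grants no access at all
        os.remove(path)           # allowed: the parent directory is writable
        return os.path.exists(path)
```

The file can neither be read nor written after the `chmod`, yet the removal succeeds, because the decision is taken from the permissions of the directory through which the file is reached.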
  • The term "monitoring parameters" will be used to refer to any parameter, property or value that can be used to determine and/or specify a condition of the system or the virtual shell (e.g. functional or performance conditions), and that can be used to launch predetermined actions as a response to said conditions, excluding the values that can be intrinsically related to particular objects (i.e. excluding the "attributes").
  • the "virtual shell” may determine the requests that can be executed according to said conditions which are comprised in the "monitoring parameters" (even regardless of the "attributes" of the object being accessed - e.g. forbidding all administrator accesses out of office hours).
  • the virtual shell may also modify the result of such request, that is to say, the same request on the same virtual object may be allowed but deliver different responses depending on the "monitoring parameters" - for example, depending on the application issuing the request (e.g. providing encrypted data to a backup application, and clear data to an editor).
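Both uses of "monitoring parameters" described above (rejecting a request, and changing the response to an allowed request) can be sketched in one function. The dictionary keys, the 9:00 to 17:00 office hours, and the XOR transformation are assumptions for illustration; in particular the XOR step is only a placeholder and is not real encryption.

```python
def handle_read(contents, monitoring):
    """Answer a read request using monitoring parameters, not object attributes."""
    # Condition on the environment: no administrator access out of office hours
    # (hypothetical office hours: 9:00 to 17:00).
    if monitoring["role"] == "administrator" and not 9 <= monitoring["hour"] < 17:
        raise PermissionError("administrator access outside office hours")
    # Same request on the same object, different response per application.
    if monitoring["application"] == "backup":
        return bytes(b ^ 0x5A for b in contents)   # placeholder, NOT encryption
    return contents
```

Note that nothing in the function inspects the object being read: the decision and the response depend only on who is asking, when, and through which application.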
  • the "monitoring parameters" can take into account different combinations of variables such as, for example, the user making the request, the access pattern followed by previous requests, the optimum object size of the repository of the object being accessed, the time of the request, the available resources (e.g. hardware capacity, connection bandwidth, etc.), arbitrary information communicated to the system or the virtual shell by any means (e.g. a user or application sending a message activating a specific "operation mode" through some interface), and so on.
  • An example of a monitoring parameter is the path in a hierarchical organization followed to reach a particular object for a particular access.
  • Many systems allow a single object to appear in different positions of the organization (in other words, a single object may have multiple "paths"); even if the list of all possible paths might be considered an "attribute" of the object, the precise path used by a particular access does not depend on the object itself, but probably on the history of previous requests issued by the same user (which does not have anything to do with intrinsic properties of the object being accessed).
  • Other examples of "monitoring parameters" that are not object "attributes" are the connection bandwidth between the user making a request and the system fulfilling such request, the properties of any hardware used by the system (e.g. the maximum capacity of a particular disk, the amount of memory of the device used by the client - user or application - to use the system, or the CPU speed of a certain server), performance characteristics of any related software (e.g. the optimum directory size in a particular underlying file system), the user making the request and his/her role, the time when the request was issued, the history of previous requests made by a user or to the system in general, the number of users, the geographical location of the user making the request, or a particular "mode of operation" activated either automatically or explicitly through any mechanism.
  • the "monitoring parameters" are managed by the virtual shell, but said parameters can be located in the virtual shell, or in some of the underlying file systems, or in any related devices, or in the environment, or provided by any other source by any means, or any combination of the previous.
  • the virtual shell managing the "monitoring parameters” must be understood as the virtual shell creating and managing said created parameters, or the virtual shell capturing and managing said captured parameters from underlying file systems, related devices, the environment, other sources, or any combination of the previous.
  • the “monitoring parameters” have been divided into two different sets: those which allow the "virtual shell” to obtain the necessary context information to offer a minimum “file system semantics” (which we call “context monitoring parameters”) and the rest of monitoring parameters (the so-called “non-context monitoring parameters”).
  • “context monitoring parameters” comprise: the identity of the "user” issuing the request (by “identity” referring to user identifiers, groups and/or any specific identity information used by a particular system), the path or paths in a particular namespace used to reach the object or objects related to a particular request, the history of previous requests related to a coherent sequence of requests (e.g. an open/read/close sequence), the capacity and/or availability of the underlying object repositories (if any), and the current time.
  • the rest of the "monitoring parameters", i.e. those not being "context monitoring parameters", will be referred to as "non-context monitoring parameters".
  • the present invention provides a virtual shell for producing one or more responses to a user request related to one or more objects, the virtual shell comprising:
  • a context monitoring parameters module for producing data related to one or more context monitoring parameters
  • a multi-namespaces module for producing data related to one or more namespaces, each of said namespaces comprising one or more entries, each of said entries referring to a set of requestable objects, and each of the requestable objects having a set of requestable object attributes and being referred from one or more entries of one or more of said namespaces, wherein the organization of each namespace is decoupled from the attributes of the requestable objects referred from the entries of the namespace, said organizations allowing arbitrarily structured namespaces;
  • an underlying interface module for producing data related to a set of underlying objects comprising data related to requestable objects related to the requested objects
  • a logic module for interacting with the context monitoring parameters module, for interacting with the multi-namespaces module, for interacting with the underlying interface module, and for producing data for the interception module from a set of the data produced by the context monitoring parameters module, a set of the data produced by the multi-namespaces module and a set of the data produced by the underlying interface module, the interaction with the multi-namespaces module depending on a subset of the set of the data produced by the context monitoring parameters module;
  • an interception module for intercepting the user request, for interacting with the logic module and for producing the responses to the user request from a set of the data produced by the logic module.
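The interplay of the five modules listed above can be illustrated with a minimal Python sketch. All class and method names (`ContextModule`, `MultiNamespaces`, `UnderlyingInterface`, `LogicModule`, `InterceptionModule`, `intercept`, `handle`, `resolve`) are hypothetical and chosen for illustration only, and the rule "one namespace per user" is an assumption, not something stated in the description.

```python
class ContextModule:
    """Produces context monitoring parameters (here: only the user identity)."""
    def context_for(self, request):
        return {"user": request["user"]}


class MultiNamespaces:
    """Maps (namespace, path) entries to requestable object identifiers."""
    def __init__(self, entries):
        self.entries = entries  # {(namespace, path): object_id}

    def resolve(self, namespace, path):
        return self.entries.get((namespace, path))


class UnderlyingInterface:
    """Gives access to underlying objects (an in-memory dict in this sketch)."""
    def __init__(self, store):
        self.store = store  # {object_id: bytes}

    def read(self, object_id):
        return self.store[object_id]


class LogicModule:
    """Coordinates the other modules to build the data for a response."""
    def __init__(self, context, namespaces, underlying):
        self.context = context
        self.namespaces = namespaces
        self.underlying = underlying

    def handle(self, request):
        ctx = self.context.context_for(request)
        # Assumption: the namespace is selected from the user identity,
        # i.e. from a context monitoring parameter.
        object_id = self.namespaces.resolve(ctx["user"], request["path"])
        if object_id is None:
            return {"status": "not-found"}
        return {"status": "ok", "data": self.underlying.read(object_id)}


class InterceptionModule:
    """Single entry point: intercepts the request and returns the response."""
    def __init__(self, logic):
        self.logic = logic

    def intercept(self, request):
        return self.logic.handle(request)


shell = InterceptionModule(LogicModule(
    ContextModule(),
    MultiNamespaces({("alice", "/docs/a.txt"): 1, ("bob", "/a.txt"): 1}),
    UnderlyingInterface({1: b"hello"}),
))
```

In this sketch the same underlying object is reachable through two differently organized namespaces, and the user never talks to the underlying store directly.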
  • This virtual shell allows overcoming the above-mentioned limitations of current storage systems related to restricted organization possibilities in terms of, for example, low flexibility and difficulty of integration, thanks to the combination of the different virtual shell modules.
  • the virtual shell allows one or more users to organize the same set of objects in different ways, and by combining it with a context monitoring parameters module and the logic module, each of such multiple organizations can be used according to a fully fledged semantics (for example, a file system semantics, which requires not only object data, but also environment information - monitoring parameters - to be implemented); so, the virtual shell can provide flexibility through multiple fully-functional views of a set of objects.
  • the underlying interface module (which allows accessing underlying storage systems), combined with the logic module and the context monitoring parameters module, makes it possible to leverage the different functionalities of different storage systems (so that the resulting unified storage system is not limited by the least capable system, but can be complemented by the modules in the virtual shell to compensate for any missing functionalities).
  • the virtual shell also provides easy means for integrating multiple underlying storage systems in a completely transparent way (even making it possible, for example, to divert files in a single directory to different underlying storage systems, according to any given criteria), and for providing multiple fully functional and flexible simultaneous organizations of data.
  • the consistency of the system is achieved through the function of the interception module which, by intercepting user requests and preventing direct access to the underlying systems, allows the virtual shell to have complete control of the involved objects and guarantee their consistency.
  • the present invention provides a context monitoring parameters module for use in the virtual shell, the context monitoring parameters module comprising:
  • This context monitoring parameters module allows handling effectively the environment conditions that allow providing a non-trivial semantics (for example, a file system semantics). Such a semantics cannot be implemented based only on "attributes" or tags associated to individual objects, and the context monitoring parameters module provides the additional means to capture, handle, and store (if needed) the additional necessary data.
  • the context monitoring parameters module needs to deal, at least, with the identity of the user making the request, the paths specified to reach a particular object in a particular request, the history of previous requests associated to a particular sequence of requests, the capacity and/or availability of the underlying object repositories (if any), and the current time. From this information, the context monitoring parameters module can provide data to the logic module, so that it can interact with the other modules providing them with data derived from context monitoring parameters.
  • a multi-namespaces module for use in the virtual shell, the multi-namespaces module comprising:
  • This multi-namespaces module allows the simultaneous co-existence of multiple namespaces (or organizations of objects) and decoupling the entries representing the objects from the objects themselves, so that a particular object can be referred by multiple entries, possibly from different namespaces. Additionally, this module handles the multiple namespaces in such a way that each namespace is decoupled from a particular underlying repository; this allows, for example, to offer a file system directory hierarchy (a namespace) where the files in a particular directory actually reside in different underlying object repositories.
  • the decoupling of namespaces and entries from the corresponding underlying objects and underlying repositories also allows a separated management that leads to ease of implementation and performance optimizations.
  • the present invention also provides an underlying interface module for use in the virtual shell, the underlying interface module comprising: computing means for interacting with the logic module of the virtual shell;
  • an interception module for use in the virtual shell, the interception module comprising:
  • This interception module prevents direct interactions between the user and the underlying repositories (and, in particular, the underlying objects), thus providing a unique entry point from the user to the virtual shell by intercepting the user requests. This effectively gives complete control of the underlying repositories (and consequently, the underlying objects) to the virtual shell, enabling the virtual shell to guarantee the consistency of the whole system. Additionally, the interception of all requests from the user and the generation of corresponding responses allow the virtual shell to emulate an actual underlying repository and let any unmodified user applications work on top of the virtual shell as if it were a native underlying repository (e.g. in a Windows system, a virtual shell could use the interception module to emulate an NTFS file system and let any standard applications run on top of said virtual shell, regardless of which the actual underlying file systems are).
  • the present invention provides a logic module for use in the virtual shell, the logic module comprising:
  • This logic module allows dealing with user requests according to one or more specific semantics, by coordinating the interaction with the other modules of the virtual shell and by elaborating the data received from them, according to specific rules, heuristics and/or algorithms.
  • the virtual shell is able to behave following non-trivial semantics and, for example, be able to completely emulate a file system (such as NTFS or POSIX file systems).
  • the logic module is also able to provide the multi-namespaces module with the necessary data to select particular namespaces to use for particular requests, either based on internal algorithms or from data received from other modules of the virtual shell.
  • the present invention provides a method for producing one or more responses to a user request related to one or more objects, the method comprising:
  • producing, by means of a logic module, data for an interception module from a set of the data produced by the context monitoring parameters module and obtained by means of the logic module interacting with the context monitoring parameters module, a set of the data produced by the multi-namespaces module and obtained by means of the logic module interacting with the multi-namespaces module, and a set of the data produced by the underlying interface module and obtained by means of the logic module interacting with the underlying interface module, the interaction of the logic module with the multi-namespaces module depending on a subset of the set of the data produced by the context monitoring parameters module;
  • a computer program product comprising program instructions for causing a computer to perform the method for producing one or more responses to a user request related to one or more objects.
  • the invention also relates to such a computer program product embodied on a storage medium (for example, a CD-ROM, a DVD, a USB drive, on a computer memory or on a read-only memory) or carried on a carrier signal (for example, on an electrical or optical carrier signal).
  • Figure 1 is a schematic representation of a modular architecture of a virtual shell according to an embodiment of the invention
  • Figure 2 is an entity-relationship diagram of a data model related to a virtual shell according to an embodiment of the invention
  • Figure 3 is an entity-relationship diagram of a data model related to a virtual shell according to an embodiment of the invention, said data model further comprising requestable object contents locations in relation to Figure 2;
  • Figure 4 is an entity-relationship diagram of a data model related to a virtual shell according to an embodiment of the invention, said data model further comprising virtual objects and virtual contents locations in relation to Figure 2;
  • Figure 5 is an entity-relationship diagram of a data model related to a virtual shell according to an embodiment of the invention, said data model further comprising requestable object contents locations in relation to Figure 4;
  • Figure 6 is a schematic representation of a possible structure of data and some related logic for the multi-namespaces module producing a reproducible ordered list of entries, according to an embodiment of the invention.
  • the virtual shell 100 may be described as comprising:
  • a context monitoring parameters module 116 for producing data related to one or more 202 context monitoring parameters 201;
  • a multi-namespaces module 118 for producing data related to one or more 205 namespaces 206, each of said namespaces 206 comprising one or more 207 entries 208, each of said entries 208 referring to a set 210 of requestable objects 211, and each of the requestable objects 211 having a set 213 of requestable object attributes 212 and being referred from one or more 209 entries 208 of one or more of said namespaces 206, wherein the organization of each namespace 206 is decoupled from the attributes 212 of the requestable objects 211 referred from the entries 208 of the namespace 206, said organizations allowing arbitrarily structured namespaces 206;
  • an underlying interface module 117 for producing data related to a set of underlying objects comprising data related to requestable objects related to the requested objects;
  • the virtual shell may produce responses to user requests in such a way that the objects related to requests represent files, directories and other file system objects, and the virtual shell of the invention presents the user with virtual, but fully functional, views of file system directories, allowing the user to organize the files, directories and other file system objects as he/she pleases, while implementing a completely different directory layout in one or more underlying file systems.
  • the user is assumed to communicate by some means with another system or mechanism actually able to perform manipulations, accesses or operations on a particular object or objects.
  • the interception module is responsible for capturing or intercepting such communication between the user and the system or mechanism meant to perform the manipulation, access or operation on the desired object or objects. That involves intercepting user requests and generating the corresponding responses (so that the virtual shell appears as "invisible" to the user).
  • all direct interaction from the user to underlying repositories and underlying objects comprised in them may be prevented.
  • This allows the virtual shell to have full control of the underlying repository, being able, for example, to generate and handle metadata that allow extending the functionalities of the underlying repositories (e.g. providing multiple arbitrary namespaces), without risking inconsistencies due to untracked interactions from the user.
  • a certain system (e.g. a POSIX-based file system) could create an additional arbitrary namespace by using directories and symbolic links to the entries in the "official" hierarchy; nevertheless, nothing prevents the user from removing the targets of the links in the "official" hierarchy, so that the links would be broken and the additional namespace would become unusable (not to mention the fact that symbolic links do not behave exactly as regular files, which may also cause trouble).
  • attempts to remove the target of a reference would be tracked and could either be prevented, or an action could be taken on the referrer (e.g. also removing it) to avoid inconsistent states.
  • Using the interception module, for example, the "official" underlying hierarchy could be completely hidden from all or some users (effectively preventing any unwanted manipulation), and the responses to requests directed to referrer entries could be manipulated so that they behave as actual files, and not symbolic links, for example.
  • One of the possible approaches to capture such communication is making the interception module appear as the entity the user wants to communicate with (e.g. by providing the same interface and semantics, i.e. the conventions used to send requests and/or receive responses, and the protocol used in the communications).
  • the interception module could provide a file system interface in order to allow the software application to work as if it was communicating with a real file system.
  • the interception module may fulfil another functionality: translating the potentially diverse interfaces used by the user (e.g. when expecting different file systems with different calls and possibly different semantics) to a common interface with the logic module. This allows isolating all interface-dependent characteristics, making the development of the rest of the virtual shell easier. Obviously, having a single interception module able to deal with different user interfaces and translating them to a single logic module interface can be considered equivalent to having multiple interception modules specialized in specific user interfaces and connected to the same logic module with the common interface.
  • the interception module could involve the modification of any combination of the application itself, any libraries used by the application, the operating system call mechanism, the operating system and related drivers, any daemons involved at any level and related libraries, and/or the media used to transmit the communications.
  • FUSE (File system in User space)
  • FreeBSD, NetBSD, Mac OS X/Darwin, OpenSolaris, GNU/Hurd, Windows-based system
  • FUSE provides a kernel module that exports VFS-like callbacks to user-space applications.
  • An embodiment of the virtual shell for the decoupling framework was implemented as one of such applications receiving the file system operation callbacks.
  • the interception module component in such embodiment is the one responsible for receiving the callbacks from FUSE, forwarding the requests to the logic module, and providing sensible defaults in case some of the requests are not implemented by the virtual shell.
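The forwarding-with-defaults behaviour described above can be sketched without any FUSE binding. The sketch below is only an illustration: `LogicStub`, `Interception` and `callback` are hypothetical names, and returning a negative `ENOSYS` value as the default merely mirrors the usual FUSE convention for unimplemented operations; the description does not specify which defaults the embodiment uses.

```python
import errno


class LogicStub:
    """Hypothetical logic module that implements only `getattr`."""
    def getattr(self, path):
        return {"path": path, "size": 0}


class Interception:
    """Forwards callbacks to the logic module; unimplemented ones get a default."""
    def __init__(self, logic):
        self.logic = logic

    def callback(self, operation, *args):
        handler = getattr(self.logic, operation, None)
        if handler is None:
            # Assumed default: report "function not implemented".
            return -errno.ENOSYS
        return handler(*args)


interception = Interception(LogicStub())
```

A real binding would register one callback per file system operation; here a single dispatcher stands in for all of them.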
  • a similar approach in a MS Windows based environment could consist in using the so-called mini-filter drivers.
  • the mini-filter drivers can be hooked into the Windows Operating System Input/Output stack, and make it possible to intercept the requests and responses to and from the underlying file system, as well as manipulate, or even cancel, such requests and/or responses.
  • the context monitoring parameters module may be able to capture, store and produce data related to context monitoring parameters. For example, in some embodiments of the virtual shell offering a file system interface and semantics, the user identity can be obtained from the user request, which could be forwarded to the context monitoring parameters module by the logic module.
  • the path used to refer to a particular object in a namespace can be decomposed into directory components by the context monitoring parameters module, and the individual components can be sent back to the logic module; the logic module may then ask the multi-namespaces module for individual validation of each path component and for obtaining references to any related objects, so that they can be transiently kept by the context monitoring parameters module and provided to the logic module when required for any procedure (e.g. to implement a semantics feature involving the parent directory of an object being accessed).
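The decomposition and per-component validation described above might look as follows. This is a sketch under stated assumptions: the function names `split_components` and `resolve`, and the representation of a namespace as a mapping from (parent entry id, name) to entry id, are hypothetical.

```python
def split_components(path):
    """Decompose a path into its directory components."""
    return [part for part in path.split("/") if part]


def resolve(entries, root_id, path):
    """Validate each component in turn; return the chain of entry ids, or None."""
    chain = [root_id]
    for name in split_components(path):
        child = entries.get((chain[-1], name))
        if child is None:
            return None  # a component failed validation
        chain.append(child)
    return chain


# Hypothetical namespace: (parent entry id, name) -> entry id
entries = {(0, "home"): 1, (1, "alice"): 2, (2, "notes.txt"): 3}
chain = resolve(entries, 0, "/home/alice/notes.txt")
```

Keeping the whole chain, rather than only the final entry, is what lets a later step look up the parent directory of the accessed object (`chain[-2]`).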
  • Sequences of related requests can be handled by the context monitoring parameters module by creating a handle upon the first request of the set of related requests (e.g. an open request on a file), keeping such handle, and forwarding a reference to it to the logic module for inclusion into the corresponding response; subsequent related requests (e.g. read or write operations) may include the reference to the handle, so that the logic module can ask the context monitoring parameters module to validate it and determine that the sequence of requests is valid.
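The handling of a coherent sequence of requests (e.g. open/read/close) through handles can be sketched as below. `HandleTable` and its methods are hypothetical names, and tying a handle to the user who opened it is an assumed validation rule used only for illustration.

```python
import itertools


class HandleTable:
    """Keeps handles for coherent sequences of requests (open/read/close)."""
    def __init__(self):
        self._next = itertools.count(1)
        self._open = {}  # handle -> data associated on "open"

    def open(self, user, entry_id):
        handle = next(self._next)
        self._open[handle] = {"user": user, "entry": entry_id}
        return handle

    def validate(self, user, handle):
        """Assumed rule: the handle must exist and belong to the same user."""
        info = self._open.get(handle)
        return info is not None and info["user"] == user

    def close(self, handle):
        self._open.pop(handle, None)


table = HandleTable()
h = table.open("alice", entry_id=3)
```

The data stored with the handle is also a natural place to keep the selected namespace and entry for use by later requests in the same sequence.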
  • the context monitoring parameters module can also request the logic module to interact with the underlying interface module to gather information about which underlying interface modules are available, if any, what their capacity is, and how much free storage space is available, for example.
  • the context monitoring parameters module may use means to obtain the current time (for example, by accessing specific hardware, or by querying it from an external source).
  • the context monitoring parameters module can maintain a cache for any piece of data gathered, so that they can be reused if needed.
  • the context monitoring parameters module can keep persistent or temporary tracking information about the user requests and the corresponding responses generated, so that a log of the virtual shell activity can be produced.
  • the multi-namespaces module may be in charge of, for example, translating the file system object identifier issued by the application (e.g. a file path) into a file system object reference internal to the virtual shell (e.g. an internal identifier). This module may also handle requests related to directory or folder content management, and also requests related to renaming file system objects, or moving them to different locations in the directory tree of a particular namespace or several namespaces.
  • the multi-namespaces module may generate data to be used internally by the virtual shell (usually to be kept by the context monitoring parameters module, in order to be used for future requests). For example, in embodiments of the virtual shell offering a file system semantics, an "open" operation of a certain path may cause the resolution of that path against a particular namespace selected according to context monitoring parameters; in that situation, the response from the multi-namespaces module may comprise not only the data related to the "open" object, but also information about the selected namespace, the particular entry, and/or its "parent" entry, if any (for example, to be associated by the context monitoring parameters module to the corresponding "handle", so that the specific entry and namespace can be tracked and used for handling further related requests).
  • the multi-namespaces module may use directives received through the interaction with the logic module to determine the particular namespace to use for certain requests. Such directives from the logic module can be based on data provided by the context monitoring parameters module.
  • the multi-namespaces module can use a repository (e.g. a database) to store the entries of all namespaces. Entries of different namespaces can have some means to indicate to which namespace they belong (for example, entries may have an associated tag referring to the namespace).
  • the multi-namespaces module can use multiple repositories to store entries corresponding to different namespaces (e.g. one repository per namespace). In this case, a namespace identifier can be used to select the appropriate repository to use.
  • the namespace to be used can be selected from information produced by the context monitoring parameters module (for example, based on identity information: depending on the user making a request, or its role, a particular namespace may be used).
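A single repository holding the entries of all namespaces, each tagged with its namespace, combined with an identity-based selection of the namespace, could be sketched with an in-memory SQLite database. The table layout, the `ROLE_TO_NAMESPACE` rule and the `lookup` function are assumptions made for illustration, not details from the description.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# One repository for all namespaces; each entry carries a namespace tag.
db.execute("CREATE TABLE entries (namespace TEXT, path TEXT, object_id INTEGER)")
db.executemany(
    "INSERT INTO entries VALUES (?, ?, ?)",
    [("engineering", "/specs/design.txt", 7), ("sales", "/clients/design.txt", 7)],
)

# Assumed rule: the role of the requesting user selects the namespace.
ROLE_TO_NAMESPACE = {"engineer": "engineering", "seller": "sales"}


def lookup(role, path):
    """Resolve `path` in the namespace selected from the user's role."""
    namespace = ROLE_TO_NAMESPACE[role]
    row = db.execute(
        "SELECT object_id FROM entries WHERE namespace = ? AND path = ?",
        (namespace, path),
    ).fetchone()
    return row[0] if row else None
```

With one repository per namespace instead, the namespace identifier would select the repository rather than appear as a column.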
  • Interactions between the logic module and the multi-namespaces module may comprise requests to make arbitrary manipulations to one or more namespaces, such as creating a folder (an entry containing other entries), removing a folder, renaming a folder, creating entries, removing entries, renaming entries, moving an entry from one folder to another, etc., just to cite a few examples.
  • Such requests may be directly related to requests from the user.
  • data related to namespace organization may be associated with the entries.
  • an entry may contain a reference (e.g. an identifier) to the entry containing it (e.g. a parent directory in a namespace emulating a file system directory hierarchy).
  • an entry may also contain other information, such as data identifying the namespace to which it belongs.
  • additional data may need to be explicitly kept to maintain such arbitrary namespaces (for example, which entry is the root entry of a particular namespace - assuming that the concept of root entry exists for such namespace organization).
  • the multi-namespaces module could maintain the container hierarchy by having a container entry containing references to its contained entries (e.g. in a file system directory hierarchy, a directory entry could contain references to its children - files, subdirectories, etc.)
  • this functionality can be provided by the multi-namespaces module independently from any existing or non-existing support in the underlying repositories, if any.
  • the multi-namespaces module also keeps references and other data related to objects referred by the entries of the namespaces.
  • the multi-namespaces module may provide the references and any other data related to objects referred by the entries of the namespaces.
  • Where the multi-namespaces module also keeps references and other data (such as attributes) related to objects referred by the entries of the namespaces, such pieces of data related to objects can be kept as separate entities from the entries themselves. This way, the entries and the objects referred by them can be decoupled, so that the same object can be simultaneously referred from multiple entries, possibly from multiple namespaces. In the same way, multiple entries, even from different namespaces, may refer to the same object.
  • a way to implement the decoupling between entries and referred objects consists of having the entry containing a reference (e.g. an identifier) of the object being referred.
  • an entry may also contain some other immutable pieces of data related to the referred objects (such as the type of the object being referred - e.g. in a file system environment, if the entry corresponds to a regular file, a directory, a symbolic link, etc.)
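The decoupling between entries and referred objects can be sketched with two separate record types: an entry holds only a reference (an identifier) to the object plus immutable data such as the object type, while mutable attributes live with the object. The class and field names below are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RequestableObject:
    object_id: int
    attributes: dict = field(default_factory=dict)  # mutable, kept apart from entries


@dataclass(frozen=True)
class Entry:
    namespace: str
    name: str
    parent: Optional[str]  # reference to the containing entry, if any
    object_id: int         # reference to the object being referred
    object_type: str       # immutable data, e.g. "file" or "directory"


objects = {10: RequestableObject(10, {"owner": "alice"})}
entries = [
    Entry("by-project", "report.txt", "/apollo", 10, "file"),
    Entry("by-date", "report.txt", "/2011/03", 10, "file"),
]


def referrers(object_id):
    """All entries, from any namespace, that refer to the given object."""
    return [e for e in entries if e.object_id == object_id]
```

Because attributes are stored once per object, a change made through one entry is visible through every other entry referring to the same object.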
  • the multi-namespaces module may use a repository (e.g. a database table) to keep data related to the objects referred by the entries (e.g. requestable objects), including, for example, all or part of their attributes.
  • the multi-namespaces module can keep some of the attributes related to different types of file system objects (regular files, directories, etc.). For example, it may keep security elements (owner, group, access permissions and control lists), statistical information (access times, utilization indicators, etc.), infrastructure-related fields (such as file system identifiers, hard link counters, symbolic link paths, etc.), and extended attributes.
  • the logic module can be responsible to request such information and handle it appropriately (e.g. by applying security access restrictions and determining if a particular object referred by an entry is accessible or not).
  • the data kept by the multi-namespaces module related to objects referred from the entries of the namespaces may contain pieces of data that can be used by the logic module to request accesses to the underlying repositories.
  • a requestable object representing a file may have the data belonging to such file in an actual file in an underlying file system;
  • the data related to the requestable object kept by the multi-namespaces module may comprise, for example, the path of the actual file with the data in the underlying file system, so that the logic module can use that path to request the underlying interface module to access the file contents, or to recover attributes related to the underlying object (i.e. the file in the underlying file system) such as, for example, the file size.
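As a small illustration of recovering an attribute from the underlying object, the sketch below stores the underlying path with the requestable object and asks the underlying file system for the size. The dictionary layout and the name `underlying_size` are hypothetical; a temporary directory stands in for the underlying repository.

```python
import os
import tempfile


def underlying_size(requestable):
    """Recover the size attribute from the underlying file."""
    return os.stat(requestable["underlying_path"]).st_size


with tempfile.TemporaryDirectory() as repo:
    path = os.path.join(repo, "obj-0001")
    with open(path, "wb") as f:
        f.write(b"12345")
    # The requestable object keeps the path of the file holding its data.
    requestable = {"object_id": 1, "underlying_path": path}
    size = underlying_size(requestable)
```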
  • each reference to requestable objects comprises a set of data related to a set of underlying objects comprised in one or more underlying repositories; and the underlying interface module is adapted to interact with the underlying repositories taking into account said data related to said set of underlying objects.
  • the data related to the requestable object that a multi-namespaces module can keep does not need to contain direct information about the underlying object, but one or more pieces of data that the logic module can convert into information to access the underlying repositories through the underlying interface module.
  • the multi-namespaces module can keep opaque handles related to requestable objects that the logic module can convert into usable directions to underlying files.
  • the multi-namespaces module may handle symbolic and hard links as part of the namespace management.
  • symbolic links can be handled by having entries comprising paths pointing to other entries in either the same or a different namespace; hard links can be achieved by having different entries to refer to the same object.
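The two kinds of links can be sketched as follows: a hard link is simply a second entry referring to the same object identifier, while a symbolic link is an entry whose payload is a (namespace, path) pair pointing to another entry. The entry layout and the hop limit `max_hops` are assumptions of this sketch.

```python
# (namespace, path) -> entry; names and layout are hypothetical
entries = {
    ("work", "/a.txt"): {"type": "file", "object_id": 1},
    ("work", "/hard.txt"): {"type": "file", "object_id": 1},  # hard link
    ("home", "/sym.txt"): {"type": "symlink", "target": ("work", "/a.txt")},
}


def resolve(namespace, path, max_hops=8):
    """Follow symbolic links (possibly across namespaces) up to max_hops."""
    for _ in range(max_hops):
        entry = entries.get((namespace, path))
        if entry is None:
            return None  # dangling link or unknown path
        if entry["type"] != "symlink":
            return entry["object_id"]
        namespace, path = entry["target"]
    return None  # too many levels of symbolic links
```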
  • the multi-namespaces module can be able to overlap several namespaces, so that the namespace visible by the user is a combination of two or more namespaces handled by the multi-namespace module.
  • the rules to combine namespaces could be specified by the user through specific requests from the user captured by the interception module, or could be determined internally by the virtual shell from the context monitoring parameters. For example, in embodiments offering a file system semantics, a user could use a namespace for home files and a different one for work files; the home namespace could be hidden during work hours and, during non-work hours, it could be visible, overlapped to work namespace - additionally, the files accessed through the work namespace during off-work hours could be treated as read-only.
  • Such a combination can be achieved by the logic module checking a path associated to a request against multiple namespaces handled by the multi-namespaces module, and applying the corresponding modifications to the responses according to the namespace used to resolve the path, the context monitoring parameters, and possible previous requests from the user indicating rules to combine multiple namespaces.
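The home/work example can be sketched as a function that builds the combined view from the current time. The work hours, the namespace contents and the function name `visible` are assumptions; the precedence of work entries over home entries on a path clash is also an arbitrary choice of this sketch.

```python
from datetime import time

WORK_START, WORK_END = time(9, 0), time(17, 0)  # assumed work hours

namespaces = {
    "work": {"/report.txt": 1},
    "home": {"/photos.txt": 2},
}


def visible(now):
    """Return {path: (object_id, read_only)} for the combined view at `now`."""
    working = WORK_START <= now < WORK_END
    # Work files are always visible, but read-only outside work hours.
    view = {p: (oid, not working) for p, oid in namespaces["work"].items()}
    if not working:
        # Home entries are overlapped onto the work namespace off hours.
        for p, oid in namespaces["home"].items():
            view.setdefault(p, (oid, False))
    return view
```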
  • the attributes of the requestable objects representing file system objects may include access control (owner, group and related access permissions), directory specific information (e.g. the number of entries) and/or size and time data for non-regular files.
  • Sizes and access times for regular files might be associated to attributes of the underlying object containing the data of the represented file, if any; therefore, to access them, a request could be sent to the logic module to trigger the recovery of the underlying object attributes via the underlying interface module.
  • the main goal of the underlying interface module is to isolate the virtual shell from the specific characteristics of the underlying object repositories. In some embodiments, this allows the logic module to refer to underlying objects without having to understand how the underlying objects are stored or even organized in the underlying repositories.
  • the logic module can issue a request with an explicit reference to an underlying object (e.g. a file path in an underlying file system): even if the logic module is not able to interpret it, the piece of data can be directly used by the underlying interface module to access the underlying repository.
  • the logic module may issue opaque handles, that are converted into actual references to underlying repositories by the underlying interface module (e.g. by maintaining a mapping table between handles and actual reference information, or by applying a certain function to the handle - e.g. "decoding” and “verifying” using cryptographic means).
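One way to "decode and verify" a handle by cryptographic means is to attach a keyed digest to the reference and check it before use. The sketch below uses HMAC-SHA256 from the Python standard library; the handle format, the key and the function names are assumptions, and the handle is only signed, not encrypted, so it is tamper-evident rather than secret.

```python
import hashlib
import hmac

SECRET = b"demo-key"  # assumption: a key held by the module converting handles


def encode_handle(path):
    """Build a handle made of a keyed digest and the underlying reference."""
    tag = hmac.new(SECRET, path.encode(), hashlib.sha256).hexdigest()
    return f"{tag}:{path}"


def decode_handle(handle):
    """Verify the tag and return the underlying path, or None if tampered."""
    tag, _, path = handle.partition(":")
    expected = hmac.new(SECRET, path.encode(), hashlib.sha256).hexdigest()
    if hmac.compare_digest(tag.encode(), expected.encode()):
        return path
    return None


handle = encode_handle("/repo1/obj-0001")
```

The alternative mentioned in the text, a mapping table between handles and actual references, would replace both functions with a dictionary lookup and keep the reference fully hidden.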
  • Said conversion of underlying object handles could also be influenced by context monitoring parameters; for example, a particular request for an underlying object creation could be diverted to different underlying repositories depending on the free space in each of the available underlying repositories.
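Diverting an object creation according to free space can be sketched as a simple selection function. The policy shown (pick the repository with the most free space that can still hold the object) is only one possible criterion and is an assumption, as are the repository names and figures.

```python
def choose_repository(free_space, size):
    """Pick the repository with the most free space that can hold `size` bytes."""
    candidates = {name: free for name, free in free_space.items() if free >= size}
    if not candidates:
        return None  # no available repository can hold the object
    return max(candidates, key=candidates.get)


free_space = {"repo-a": 10_000, "repo-b": 250_000}  # hypothetical figures, in bytes
```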
  • the underlying interface module can also be the way to access and/or interact with an underlying repository as a whole (e.g. attaching or detaching a particular repository to or from the virtual shell, or querying characteristics of the repository, such as the capacity, the maximum file size - if it is a file system -, the optimum performance parameters, etc.)
  • the underlying interface module may have an additional functionality consisting of being able to access data from the environment (e.g. the current time, the available memory, network availability, etc.). This type of information is likely to be consumed by the context monitoring parameters module.
  • An embodiment of the virtual shell can use a file system as an underlying repository for the underlying objects (e.g. data corresponding to files represented by entries in the namespaces can be stored in actual files in an underlying repository).
  • the interaction of the virtual shell's underlying interface module with the underlying file system can be done via basic POSIX calls, which means that the virtual shell has no dependency on a particular underlying file system, and can operate on top of anyone providing such interface.
  • Some embodiments of the virtual shell can use file systems which do not provide a strict POSIX interface (such as NTFS, NFS, or SMB/CIFS-based systems).
  • the interaction of the virtual shell's underlying interface module with the underlying file system can be done through the specific file system calls supported by said file systems.
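As a rough illustration of the POSIX-based interaction, the following Python sketch retrieves data-dependent attributes (size and access times) of an underlying object through the generic `stat()` call, which Python exposes as `os.stat`. The function name `underlying_attributes` and the object name `object-0001` are assumptions made for the example; a temporary directory stands in for the underlying repository.

```python
import os
import tempfile


def underlying_attributes(path: str) -> dict:
    """Fetch data-dependent attributes through the generic stat() call."""
    st = os.stat(path)
    return {"size": st.st_size, "atime": st.st_atime, "mtime": st.st_mtime}


# A temporary directory plays the role of the underlying repository.
with tempfile.TemporaryDirectory() as repo:
    path = os.path.join(repo, "object-0001")   # hypothetical object name
    with open(path, "wb") as f:
        f.write(b"hello")
    attrs = underlying_attributes(path)
```

Because only `stat()` is used, the same code would work on any file system that offers this call, which is the independence property the text describes.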
  • the data used by the underlying interface module does not include any knowledge of low-level data storage (in particular, there are no references to disks, blocks or other storage objects - though they could be easily included if necessary). It is up to the underlying file system (or any other advanced storage system) to take decisions about low-level data server selection, striping, block/object placement, etc. Therefore, the references to underlying objects kept, handled or generated by other modules (e.g. the logic module), or the attributes of requestable objects related to underlying objects can be adjusted accordingly - i.e. they do not need to contain, refer to or generate low-level storage information.
  • advanced storage systems e.g. file systems or databases, possibly being local, re mote, distributed, etc.
  • the logic module is responsible for coordinating and communicating with the rest of the components of the virtual shell in order to provide a behaviour implementing one or more semantics, where semantics is understood as a set of rules, specifications and conventions that allow fulfilling certain expectations when the user interacts with the virtual shell.
  • the logic module may exchange data and/or requests with one or more of the other components of the virtual shell, and it may keep its own transient and/or persistent data.
  • the logic module may act either as a reaction to user requests, and/or due to changes to any type of context monitoring parameters, and/or requests from other system components.
  • a request to obtain, for instance, a regular file's attributes may return part of the requestable objects attributes from the multi-namespaces module (e.g. access permissions, etc.) and then trigger a request to the logic module to retrieve attributes of the related underlying object containing the file's data (e.g. the size and/or the modification times).
  • the logic module may act autonomously. For example, it may initiate a transfer and/or duplication of underlying objects across different underlying object repositories (for instance, to improve request balancing and/or to guarantee availability).
  • the logic module may also perform operations on the underlying repositories autonomously from application requests for maintenance, information gathering, data reorganization, data migration, data transformation, physical metadata checks, physical metadata updates, or any other cause.
  • the logic module may forward requests from other virtual shell modules to the underlying interface module in order to obtain data from or perform actions on one or more of the underlying repositories, or the underlying objects contained in them.
  • the logic module may perform accesses to the underlying repositories based on data coming from other modules of the virtual shell (e.g. attributes of requestable objects obtained from the multi-namespaces module or from any other module). Such data may comprise direct or indirect references to underlying objects. The logic module may be able to translate such references into data usable by the underlying interface module, or may use other modules of the virtual shell to perform either the final or intermediate translations.
  • the logic module could maintain a table mapping references from other modules into data suitable to be included in a request to the underlying interface module.
  • Another way to implement this functionality would be the references from the other modules having the necessary data encoded in some way, and the logic module decoding such references to obtain the necessary data.
  • additional post-processing of the data would be possible (e.g. a function could be applied, so part of the data is altered before being sent to the underlying interface module or after being received from the underlying interface module - for instance, to encrypt or decrypt data).
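The alternative in which references carry their own encoded data, to be decoded by the logic module, might look like the following Python sketch. It assumes a JSON payload protected by an HMAC-SHA256 tag, which is only one possible way of "encoding" and verifying a reference; the key `SECRET` and the functions `encode_reference` and `decode_reference` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"illustrative-key"  # assumption: a key private to the virtual shell


def encode_reference(repository: str, path: str) -> str:
    """Pack a reference into a self-describing, tamper-evident token."""
    payload = json.dumps({"repo": repository, "path": path}, sort_keys=True).encode()
    tag = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(tag + payload).decode()


def decode_reference(token: str) -> dict:
    """Verify and unpack a token; raise ValueError if it was altered."""
    raw = base64.urlsafe_b64decode(token.encode())
    tag, payload = raw[:32], raw[32:]          # SHA-256 tags are 32 bytes long
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("reference failed verification")
    return json.loads(payload)
```

Compared with a mapping table, this design keeps no per-reference state in the logic module, at the cost of making references longer and tying their validity to the key.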
  • the logic module can use context monitoring parameters to determine the actual data to be used for interacting with the underlying interface module.
  • the "main approach" of the virtual shell of the invention provides very high flexibility to storage systems and, at the same time, simplifies the integration of multiple underlying storage repositories.
  • the virtual shell comprises a multi-namespaces module able to deal with multiple namespaces simultaneously. This simultaneousness is achieved by decoupling the entries of the namespaces from the requestable objects, so that namespaces can be organized arbitrarily and independently with respect to the requestable objects referred by the entries. Nevertheless, this level of flexibility may be higher in terms of, for example, the objects required by the user having different attributes depending on the namespace through which they are accessed.
  • different entries, possibly from different namespaces, can refer to different requestable objects to access "data-independent" attributes, and such requestable objects could access the same underlying objects to access both data and attributes related to such data.
  • a single requestable object could be used, referred by multiple entries, and having different sets of "data-independent" attributes tagged by the namespace being used, and still maintaining a reference to an underlying object for "data-dependent" accesses - note that this implementation could be considered slightly restrictive with respect to the previous one, as a particular object would depend on the namespace to select the set of attribute values to use; extensions to the tags could be used to overcome this limitation, but this could bring other drawbacks in actual implementations: large attribute sets difficult to handle, possible access contention issues, etc.
  • the selection of particular namespaces and the operation with the referred objects need to be based on the context monitoring parameters provided by the context monitoring parameters module; without them, the logic module would not be able to offer complex semantics to the user (such as that of a conventional file system).
  • the internal consistency of the virtual shell is guaranteed by the interception module, which intercepts all interactions from the user to the required objects, avoiding untracked modifications that could create inconsistencies between the virtual shell internal representations and the underlying repositories and the underlying objects contained in them.
  • the underlying interface module permits isolation between the internals of the virtual shell and the differential aspects of different underlying repositories, making it easier to extend support for new potential underlying repositories.
  • the "main approach" of the virtual shell of the invention offers very flexible organization possibilities by providing multiple independent namespaces (i.e. where changes through one of them do not need to necessarily imply changes in a different namespace - neither regarding the structure of the namespace, nor the objects referred by the namespace entries), each namespace being able to offer fully functional complex semantics (e.g. emulating a file system), and being able to use diverse storage systems as underlying repositories for storing data related to objects accessible through the virtual shell.
  • the multi-namespaces module 118 of the virtual shell 100 is adapted to produce the data related to namespaces 206 further taking into account that each entry 208 of each namespace 206 refers to its related set 210 of requestable objects 211 through one or more 301 requestable object contents locations 300, each requestable object contents location 300 referring to a subset 303 of the related set 210 of requestable objects 211, and each of the requestable objects 211 being referred from one or more 302 requestable object contents locations 300;
  • the virtual shell 100 further comprises a contents locations module 110 for producing data related to one or more references to one or more requestable objects 211 from data related to requestable object contents locations 300; and
  • the logic module 109 of the virtual shell 100 is adapted to interact 115 with the contents locations module 110 and to produce the set of data for the interception module 104 further from a set of the data produced by the contents locations module 110, the interaction 115 with the contents locations module 110 depending on a subset of the set of the data produced by the context monitoring parameters module 116.
  • the multi-namespaces module keeps information about both the multiple namespaces and about the requestable objects referred from the entries of the namespaces.
  • the multi-namespaces module has to understand the characteristics of both sets of entities (the namespaces and the entries on one side, and the requestable objects and/or their related data on the other side). This situation may add complexity to the module, as characteristics, behaviour, and potential requests on both sets of entities are different.
  • a possibility to mitigate the issue mentioned above is to separate the management of namespaces and entries from the management of the objects referred by those entries.
  • a way to achieve this is to use an opaque handle to implement the reference from entries to referred objects, and let such a handle be interpreted outside the multi-namespaces module.
  • the objects referred by the entries may be of potentially different types, without affecting the structure, functionality or characteristics of the multi-namespaces module (e.g. we could use the same multi-namespaces module to handle objects which represent different things, have different attributes, or are encoded in different ways).
  • Each of the entries of namespaces handled by the multi-namespaces module may comprise a content location container to contain such opaque references to objects.
  • such content location containers may be able to contain "requestable object contents location" data for one or more requestable objects.
  • the multi-namespaces module may be adapted to produce the data related to namespaces further taking into account that each entry of each namespace comprises said contents location container adapted to contain one or more requestable object contents locations.
  • the data in a contents location container may contain, possibly encoded in some way, a tag or an equivalent piece of data indicating to which type of object the opaque handle refers (for example, if it contains a requestable object contents location). This tag can be used by some other component to identify which module may be able to interpret and/or deal with the opaque handle.
  • the data in the contents location container may contain additional pieces of data related to the referred object; such pieces of data might be extracted and interpreted directly by some other component, without having to access the whole referred object.
  • the "requestable object contents location" in the contents location container associated to a particular entry of a namespace may contain, encoded in some way, additional information such as the file system object internal identifier (e.g. an i-node number) and/or the type of the file system object (e.g. if it represents a regular file, a symbolic link, a directory, etc.).
  • the additional pieces of data are immutable in a file system context; note that having to make a request to the multi-namespaces module to partially replace a part of the additional information encoded in the data in a contents location container (e.g.
  • the contents location container may contain data referring to multiple objects (of the same, or of different types). This can be achieved either by storing a set of "handles" (e.g. a list of "requestable object contents locations"), or by storing a single "handle" encoding multiple references to objects.
  • the multi-namespaces module may provide an interface to add or remove individual "handles" in an entry's contents location container, as well as to access all of them as a whole.
  • a contents location container of some entries could be empty (or, equivalently, contain "null" values - e.g. "requestable object contents locations" not referring to any requestable object).
  • Such a feature can be used for several purposes such as, for example, dealing with entries of namespaces which simply act as namespace structural support (e.g. a container not backed by any corresponding requestable object), or dealing with particular features or situations (e.g. placeholders, or errors).
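A contents location container with tagged opaque handles could be modelled as in the Python sketch below. The names `Handle`, `Entry`, `kind` and `payload` are hypothetical; the sketch only assumes that each handle carries a tag identifying which module can interpret it and that a container may hold zero, one or several handles.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Handle:
    kind: str        # tag naming the type of object the handle refers to
    payload: bytes   # opaque to the multi-namespaces module


@dataclass
class Entry:
    name: str
    container: list = field(default_factory=list)  # contents location container

    def add(self, handle: Handle) -> None:
        self.container.append(handle)

    def remove(self, handle: Handle) -> None:
        self.container.remove(handle)

    def handles(self, kind=None) -> list:
        """Return all handles, or only those carrying the given tag."""
        return [h for h in self.container if kind is None or h.kind == kind]
```

An entry created without handles, such as `Entry("docs")`, has an empty container and can act as pure structural support in a namespace.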
  • the "requestable object contents locations" are opaque handles from the perspective of the multi-namespaces module. So, they must be interpreted and handled somewhere else.
  • a first approach would consist of having the logic module deal with them. Despite being possible, that would increase the burden on the logic module, making it more complex than necessary.
  • the virtual shell may introduce a new component: a contents locations module able to interpret and deal with "requestable object contents locations".
  • the contents locations module may keep data related to requestable objects, including their attributes (or even keep the whole requestable objects themselves).
  • the contents location module may provide the attributes, other related data, or references (direct or indirect) to underlying objects related to the requestable objects, if any.
  • a requestable object may not have any related underlying objects. This may imply that either said requestable object does not have "pure data" contents, or it may imply that its "data" is generated internally by the virtual shell.
  • the contents locations module is responsible for providing the means to generate such data on demand. For example, a "/dev/zero" file in a Unix system could be implemented by the contents locations module generating zero-filled data when the logic module requests information about the file's contents; in a similar way, it could generate random data for a "/dev/random" file, or discard any modification requests to the contents of a "/dev/null" file.
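The on-demand generation of data for requestable objects without underlying objects can be sketched in Python as follows. The functions `read_synthetic` and `write_synthetic` are hypothetical names, and returning an empty result when reading "/dev/null" follows usual Unix behaviour rather than anything stated in the text.

```python
import os


def read_synthetic(name: str, length: int) -> bytes:
    """Produce contents for a requestable object with no underlying object."""
    if name == "/dev/zero":
        return bytes(length)            # zero-filled data
    if name == "/dev/random":
        return os.urandom(length)       # random data from the operating system
    if name == "/dev/null":
        return b""                      # assumption: reads see an empty file
    raise KeyError(name)


def write_synthetic(name: str, data: bytes) -> int:
    """Discard writes to /dev/null while reporting them as accepted."""
    if name == "/dev/null":
        return len(data)
    raise NotImplementedError(name)     # other names are outside this sketch
```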
  • a requestable object may refer to related underlying objects.
  • a requestable object represents a regular file in a file system
  • a related underlying object e.g. a file in an underlying file system
  • a requestable object may refer to multiple underlying objects containing data associated with the requestable object. Such multiplicity of underlying objects may contain either alternative versions of the data or complementary versions of the data (e.g. different underlying objects may contain alternative contents for a file depending on the user accessing it, or they may contain different sections of the data which, concatenated, could represent the totality of the data associated with the requestable object). In such situations, the contents location module may be responsible for selecting the underlying object (or underlying objects) to act upon, possibly based on context-monitoring parameters and the particular received requests.
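The two kinds of multiplicity can be contrasted with a short Python sketch. It assumes that alternative versions are keyed by user name with a `"default"` fallback and that complementary sections are keyed by an integer index; both conventions, and the function names `select_underlying` and `read_all`, are hypothetical.

```python
def select_underlying(refs: dict, mode: str, user: str) -> list:
    """Choose which underlying objects a request should act upon.

    refs maps a label to an underlying-object reference. In "alternative"
    mode the labels are user names; in "complementary" mode they are
    section indices (both conventions are assumptions of this sketch).
    """
    if mode == "alternative":
        return [refs.get(user, refs["default"])]
    if mode == "complementary":
        return [refs[i] for i in sorted(refs)]
    raise ValueError(mode)


def read_all(store: dict, selected: list) -> bytes:
    # Complementary sections are concatenated in index order.
    return b"".join(store[ref] for ref in selected)
```

Here the user name plays the role of a context monitoring parameter; other parameters, such as the time of the request, could be used in the same way.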
  • a requestable object may also refer to related underlying objects containing additional metadata (e.g. additional attributes) about the requestable object.
  • a requestable object may have a related underlying object containing the "pure data" associated with the requestable object, and additional related underlying objects containing information about the pure data (e.g. the date, time and place where a photo has been taken), its format, or even code able to process the "pure data".
  • a requestable object may refer to multiple underlying objects containing additional metadata (e.g. additional attributes) about the requestable object, such multiplicity of underlying objects being able to contain either alternative or complementary fragments of the metadata.
  • the contents location module may be responsible for selecting the appropriate underlying object or underlying objects to act upon, possibly depending on context-monitoring parameters and the received requests.
  • References (either direct or indirect) to underlying objects related to requestable objects can be considered as attributes of the requestable objects.
  • the contents location module can be responsible for providing such references to underlying objects to the logic module upon request, so that the logic module can request the appropriate access to the corresponding underlying repository through the underlying interface module.
  • Such references to underlying objects may be directly usable by the underlying interface module, or may require some additional processing (e.g. decoding, or selection - if multiple references are provided) by the logic module or the underlying interface module itself.
  • a request from the user to access data related to a requestable object through a namespace entry would involve the logic module sending requests to the multi-namespaces module with data related to the original user request and from the context monitoring parameters module, in order to obtain (a) "requestable object contents location(s)" associated to the desired entry; then, the logic module would feed such "requestable object contents location(s)" to the contents location module, possibly providing additional information from the context monitoring parameters module; the contents locations module would then produce data related to the requestable object referred by the "requestable object contents location(s)", possibly including the references to the underlying objects required by the logic module to perform the requested access to the data through the underlying interface module.
  • a request to access requestable object attributes related with data stored in underlying objects may also result in the contents locations module producing references to underlying objects, to be used by the logic module through the underlying interface module.
  • a request to update the represented file's modification time could result in the logic module obtaining a "requestable object contents location" from the multi-namespaces module, feeding such "requestable object contents location" to the contents locations module, obtaining a reference to the underlying object (the actual file) whose modification time has to be updated, and requesting the modification of such attribute of the underlying file from the underlying interface module.
  • a similar path would be followed, for instance, for requests related to querying or modifying the size of a file represented by a requestable object linked to an underlying object.
  • a request for accessing requestable objects attributes may not require the contents location module to produce references to the underlying objects, especially when the particular attributes to be accessed are not directly related to relevant underlying object properties; then, the required access can be performed internally by the contents locations module.
  • a request to access some file attributes such as the access permissions can be performed directly by the contents location module without requiring access to underlying repositories, if the contents locations module has its own means to keep such information as attributes of the requestable object.
  • the request path would consist, as usual, of the logic module interacting with the multi-namespaces module to obtain a "requestable object contents location" to be fed to the contents location module, together with data about the access to be performed and, possibly, data from the context monitoring parameters module.
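The request paths described above, for data-dependent and data-independent attributes, can be summarised in a small Python sketch. All class and method names are hypothetical, the repositories are plain dictionaries, and "size" is assumed to be the only data-dependent attribute.

```python
class MultiNamespaces:
    def __init__(self, namespaces):
        self._ns = namespaces                 # {namespace: {entry path: location}}

    def location(self, namespace, path):
        return self._ns[namespace][path]


class ContentsLocations:
    def __init__(self, objects):
        self._objects = objects               # {location: requestable object data}

    def attribute(self, location, name):
        return self._objects[location]["attrs"][name]   # served internally

    def underlying_ref(self, location):
        return self._objects[location]["underlying"]


class UnderlyingInterface:
    def __init__(self, repo):
        self._repo = repo                     # {reference: bytes}

    def size(self, ref):
        return len(self._repo[ref])


class LogicModule:
    DATA_DEPENDENT = {"size"}                 # assumption of this sketch

    def __init__(self, namespaces, locations, underlying):
        self.ns, self.cl, self.ui = namespaces, locations, underlying

    def get_attribute(self, namespace, path, name):
        loc = self.ns.location(namespace, path)          # step 1: entry -> location
        if name in self.DATA_DEPENDENT:
            # step 2a: location -> underlying reference -> underlying repository
            return self.ui.size(self.cl.underlying_ref(loc))
        # step 2b: answered by the contents locations module alone
        return self.cl.attribute(loc, name)
```

In this sketch the logic module never inspects the location value itself; it only passes it between modules, which matches the opaque-handle role described in the text.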
  • the contents location module can keep some of the attributes related to different types of file system objects (regular files, directories, etc.) For example, it may keep security elements (owner, group, access permissions, control lists, etc.), statistical information, extended attributes, etc. As already mentioned, the contents location module can also keep references to underlying objects where additional information related to the requestable object can be stored (e.g. the data of a file represented by a requestable object).
  • One of the functions of the contents locations module is to convert the "requestable object contents locations" fed into it into direct references to requestable objects, so that the contents locations module can act on the specified and related requestable objects and/or produce data related to them.
  • a "requestable object contents location" can be a direct reference to data in the contents location module related to the corresponding requestable object (e.g. a memory pointer to a data structure, or a key in a database table storing requestable objects data).
  • Some processing may be needed to convert the "requestable object contents locations" into usable references to requestable objects (e.g. a mapping table can be used to map "requestable object contents locations" into references to requestable objects; another possibility is to apply a certain function to the "requestable object contents location" to "decode" it and extract a requestable object reference from the result of the function).
  • Conversion of a "requestable object contents location" into a reference to a specific requestable object can be influenced by context monitoring parameters.
  • a "requestable object contents location" could contain references to multiple requestable objects; the specific requestable object to use could be determined by the contents locations module based on context monitoring parameters (e.g. a "requestable object contents location" coming from an entry representing a file could be resolved into different requestable objects - possibly referring to different underlying files containing different data - depending on the user making the request, or the time of the request).
  • a "requestable object contents location" can be generated by the contents location module either upon request or automatically when a requestable object is created, and communicated to other modules (e.g. via the logic module) so that it can be used to refer to the requestable object when needed (e.g. to have an entry in a namespace refer to a requestable object).
  • the contents locations module may use a repository (e.g. a database table) to keep data related to the requestable objects, possibly including their attributes.
  • the contents location module can take into account information about underlying repositories (e.g. obtained via the logic module) to decide where and how to organize the underlying objects related to requestable objects.
  • the contents location module may be able to produce references to underlying objects and/or underlying repositories that are directly usable by the logic module to access them via the underlying interface module.
  • References to underlying objects from requestable objects may be kept as direct references, and the logic module can simply forward them to the underlying interface module.
  • the contents locations module may keep references from requestable objects to related underlying objects in such a form that requests containing them may need to be processed in order to be forwarded to the underlying interface module via the logic module (e.g. a reference to an underlying object can be kept in a "repository-agnostic" way, and then be converted into a form adapted to the characteristics of the actual underlying repository where the underlying object resides).
  • the contents locations module comprises one or more contents locations sub-modules, each of said contents locations sub-modules being adapted to produce data related to underlying objects comprised in underlying repositories with equivalent properties.
  • the contents locations module may be able to alter in any way the data being stored to or retrieved from underlying objects related to requestable objects (e.g. encrypting/decrypting and/or compressing/uncompressing data), possibly based on context-monitoring parameters. In the same way, it may also alter responses related to requestable object attributes (e.g. clearing the returned value of a "last access time" attribute depending on the user making the request).
  • the logic module could also be able to perform totally or partially some of the functionalities described for the contents location module (such as resolving multiple references to requestable objects - and possibly selecting which to use -, resolving multiple references to underlying objects - and possibly selecting which to use-, altering the data being stored to or retrieved from underlying objects, and adapting the references to underlying objects according to specific characteristics of the underlying repositories, just to mention some examples).
  • the responsibility for such possibly overlapped functionality can be shifted from the logic module to the contents location module.
  • the contents location module may assume the entire burden for handling the requestable objects and related underlying objects, while the logic module can be released from such tasks and focus on the coordination of the different modules and the implementation of the algorithms providing the desired semantics of the virtual shell. It is also important to note that, with the introduction of "requestable object contents locations", the logic module can also be made agnostic with respect to the internals of requestable object management, and even with respect to the specific characteristics of the underlying objects and the underlying repositories (thanks to the contents locations module being able to deal with them).
  • the "requestable object contents locations approach" of the virtual shell, in some embodiments of the invention supported by Figure 1 and Figure 4:
  • the multi-namespaces module 118 of the virtual shell 100 is adapted to produce the data related to namespaces 206 further taking into account that each entry 208 of each namespace 206 refers to its related set 210 of requestable objects 211 through one or more 400 virtual contents locations 401, each virtual contents location 401 referring to a set 403 of virtual objects 404 and each virtual object 404 referring to a subset 406 of the related set 210 of requestable objects 211, each of the requestable objects 211 being referred from one or more 405 virtual objects 404 and each virtual object 404 being referred from one or more 402 virtual contents locations 401;
  • the virtual shell 100 further comprises a virtual objects module 107 for producing data related to one or more references to one or more requestable objects 211 from data related to virtual contents locations 401; and the virtual objects module 107 is adapted to produce the data related to references to requestable objects 211 further taking into account that each virtual object 404 has a set 410 of virtual attributes 409, said virtual attributes 409 being decoupled from attributes 212 related to requestable objects 211;
  • the multi-namespaces module 118 is adapted to produce the data related to namespaces 206 further taking into account that the organization of each namespace 206 is decoupled from the virtual attributes 409 of the virtual objects 404 referred from the entries 208 of the namespace 206 through the virtual contents locations 401;
  • the logic module 109 is adapted to interact 111 with the virtual objects module 107 and to produce the set of data for the interception module 104 further from a set of the data produced by the virtual objects module 107, the interaction 111 with the virtual objects module 107 depending on a subset of the set of the data produced by the context monitoring parameters module 116.
  • the entries of namespaces handled by the multi-namespaces module may contain references to requestable objects, and each of said requestable objects may have a set of related attributes and a set of underlying objects (which, for example, may contain "pure data" related to the requestable object).
  • each requestable object may be referred from multiple entries, possibly from different namespaces.
  • the flexibility of the virtual shell may be further improved by overcoming some limitations of the previously described embodiments; for example: attributes of a requestable object are the same independently of the entry used to access the object; nevertheless, it could be interesting to see different attributes (e.g. owner, access permissions, tags, etc.) depending on the specific namespace used to reach the object (or even depending on the specific entry used).
  • a possible embodiment to provide support for such multiple views of attributes may be having each requestable object attribute adapted to contain a set of values, and tag each value with an identifier of the entry or entries through which it should be possible.
  • although such a solution is theoretically possible, it has several drawbacks.
  • a requestable object may be referred by multiple virtual objects having their own set of virtual attributes (potentially independent from the attributes in the referred requestable objects - i.e. with attributes not present in the requestable object, or with different values) which, in turn, may be referred by multiple entries.
  • for a set of entries (from the same or from different namespaces) to refer to a requestable object while maintaining a consistent view of the attributes, said set of entries may refer to the same virtual object which, in turn, may refer to the desired requestable object; conversely, if different views need to be provided from a set of entries, said set of entries can refer to different virtual objects which, in turn, may refer to a single desired requestable object.
  • the multiplicity in the other direction can be maintained: a single entry can refer to multiple virtual objects, a single virtual object can refer to multiple requestable objects and, of course, a combination of both the previous situations is also possible.
  • virtual objects can be operated upon based on requests from the user, as if they effectively were the requestable objects of previous approaches. In particular, this means that they can be fully functional and be used according to any semantics provided by the virtual shell. For example, without virtual objects, it is still possible to modify the values of a requestable object (e.g. "on-the-fly") when responding to user requests to offer alternative views; but this would be a "read-only” alternative view: if the user tries to modify the value of the altered attribute (e.g.
  • attributes in a virtual object can be accessed in any way (including reading and writing) independently of the requestable object being referred, and the logic module can operate independently on them according to the desired semantics (for example, different virtual objects referring to the same requestable object could have different access permissions, and the logic module could allow or disallow access to the requestable object depending on which virtual object has been used as "intermediate" hop).
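The relationship between entries, virtual objects and requestable objects described above can be illustrated with a minimal Python sketch. All class names, field names and the `can_read` helper are hypothetical and chosen only for illustration; the source does not prescribe any particular data layout.

```python
from dataclasses import dataclass, field


@dataclass
class RequestableObject:
    # Attributes tied to the "pure data" (e.g. its size); hypothetical layout.
    attributes: dict = field(default_factory=dict)


@dataclass
class VirtualObject:
    # Virtual attributes are kept independently of the requestable object.
    virtual_attributes: dict = field(default_factory=dict)
    requestable: list = field(default_factory=list)


@dataclass
class Entry:
    name: str
    virtual_objects: list = field(default_factory=list)


def can_read(entry, user):
    """Allow access if any virtual object reachable from the entry lists the user."""
    return any(user in vo.virtual_attributes.get("readers", ())
               for vo in entry.virtual_objects)


# One requestable object seen through two virtual objects in two namespaces.
data = RequestableObject({"size": 1024})
work_view = VirtualObject({"owner": "alice", "readers": {"alice", "bob"}}, [data])
public_view = VirtualObject({"owner": "nobody", "readers": {"alice"}}, [data])
ns_work = {"report.txt": Entry("report.txt", [work_view])}
ns_public = {"shared/report": Entry("shared/report", [public_view])}

# Writing a virtual attribute in one view leaves the other view untouched.
work_view.virtual_attributes["owner"] = "carol"
```

In this sketch the same data is readable by "bob" through the entry in `ns_work` but not through the entry in `ns_public`, because the permission check is made against the intermediate virtual object rather than against the shared requestable object.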
  • the requestable objects can be used to contain attributes related to "pure data" (e.g., references to underlying objects, if any, or the size of "pure data” associated to the object - either if contained in an underlying object or if generated inside the virtual shell), whereas virtual objects can be used to keep in their virtual attributes all the data that does not depend on the "pure data" associated to the object or on related underlying objects, if any.
  • a requestable object could be used to keep references to the underlying object and related information (e.g.
  • Each of the entries of namespaces handled by the multi-namespaces module may comprise a content location container to contain such opaque references to objects.
  • such content location containers may be able to contain "virtual contents location" data for one or more virtual objects.
  • the multi-namespaces module may be adapted to produce the data related to namespaces further taking into account that each entry of each namespace comprises said contents location container adapted to contain one or more virtual contents locations.
  • tags to identify the type of handle in the container
  • additional information in the handle that can be extracted without needing to recover the referred object
  • the possibility of encoding references to multiple objects (in the case of a "virtual contents location", to multiple virtual objects)
  • the possibility to have empty containers or, equivalently in this case, having "virtual contents locations” not referring to any virtual objects - i.e. being "empty”
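The container features listed above (a type tag, extra information readable without fetching the object, references to multiple objects, and empty locations) could be modelled roughly as follows. This is a sketch under assumptions: the field names `kind`, `refs` and `hint` and both class names are invented for illustration and do not come from the source.

```python
from dataclasses import dataclass, field
from typing import Tuple


@dataclass(frozen=True)
class ContentsLocation:
    kind: str                    # tag identifying the handle type, e.g. "virtual"
    refs: Tuple[str, ...] = ()   # opaque object identifiers; may be empty
    hint: str = ""               # extra info usable without recovering the object

    def is_empty(self):
        return not self.refs


@dataclass
class ContentsLocationContainer:
    locations: list = field(default_factory=list)

    def add(self, location):
        self.locations.append(location)

    def of_kind(self, kind):
        return [loc for loc in self.locations if loc.kind == kind]

    def is_empty(self):
        # A container with no locations, or only empty ones, counts as empty.
        return all(loc.is_empty() for loc in self.locations)
```

The multi-namespaces module would store such a container in each entry and pass its contents on unchanged, since the handles are opaque from its perspective.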
  • the "virtual contents locations” are opaque handles from the multi-namespace module perspective. So, they must be interpreted and handled somewhere else.
  • the virtual shell in the invention may comprise a virtual objects module able to interpret and deal with "virtual contents locations”.
  • the virtual objects module may keep data related to virtual objects, including their virtual attributes (or even the whole virtual objects themselves).
  • the virtual objects module may keep data related to requestable objects (including their attributes - or even the whole requestable objects themselves) referred from the virtual objects. On request by the logic module, the virtual objects module may produce virtual attributes or other data related to the virtual objects and/or data related to the requestable objects referred from the virtual objects, if any. (Depending on the particular embodiment, either references to the requestable objects, or other type of data related to requestable objects or to underlying objects referred by the requestable objects, if any.)
  • a virtual object may not have any related requestable object; for example, a virtual object may be able to keep the relevant associated data in virtual attributes, without requiring a requestable object (e.g. a virtual object representing a directory of a POSIX file system that exists in a single namespace, and that has no corresponding data in an underlying repository; another example would be a virtual object implementing a "shortcut" in a Windows-like system: one of the virtual attributes of the virtual object would just keep a path in a namespace, regardless of what such entry represents, or even if it exists).
  • a virtual object may refer to multiple requestable objects.
  • Such multiplicity of requestable objects may refer to either alternative versions of data and/or attributes or complementary versions of data and/or attributes (e.g. different requestable objects may contain references and different information on how to access different underlying objects with alternative contents for a file represented by a virtual object; or the different requestable objects may contain directions to different underlying objects containing different fragments of the file represented by the virtual object).
  • References (either direct or indirect) to requestable objects related to virtual objects can be considered as virtual attributes of the virtual objects.
  • the virtual objects module can be responsible for providing such references to requestable objects to the logic module upon request, so that the logic module can perform additional operations on them, or feed them to additional modules for further processing.
  • a request from the user to access data related to a virtual object through a namespace entry would involve the logic module sending requests to the multi-namespaces module with data related to the original user request and from the context monitoring parameters module, in order to obtain (a) "virtual contents location(s)" associated to the desired entry; then, the logic module would feed such "virtual contents location(s)" to the virtual objects module, possibly providing additional information from the context monitoring parameters module; the virtual objects module could then produce data related to the virtual object(s) referred by the "virtual contents location(s)", possibly including the references to related requestable objects; at that point, the logic module may decide to further interact with additional modules to act upon the obtained requestable object(s) in order to perform the user request.
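The request flow just described (logic module, then multi-namespaces module, then virtual objects module) can be sketched as below. The class and method names, the dictionary layouts and the `namespace` context key are assumptions made for this illustration, not an interface defined by the source.

```python
class MultiNamespacesModule:
    def __init__(self, namespaces):
        # {namespace name: {path: [virtual contents location, ...]}}
        self.namespaces = namespaces

    def resolve(self, path, context):
        # The context monitoring parameters select the namespace to use.
        return self.namespaces[context["namespace"]][path]


class VirtualObjectsModule:
    def __init__(self, virtual_objects):
        # {virtual contents location: {"attrs": {...}, "requestable": [...]}}
        self.virtual_objects = virtual_objects

    def describe(self, locations, context):
        return [self.virtual_objects[loc] for loc in locations]


class LogicModule:
    def __init__(self, namespaces_module, virtual_objects_module):
        self.namespaces_module = namespaces_module
        self.virtual_objects_module = virtual_objects_module

    def handle(self, path, context):
        # Step 1: obtain the opaque virtual contents location(s) for the entry.
        locations = self.namespaces_module.resolve(path, context)
        # Step 2: let the virtual objects module interpret them.
        objects = self.virtual_objects_module.describe(locations, context)
        # Step 3: collect references to the related requestable objects.
        return [ref for vo in objects for ref in vo["requestable"]]


logic = LogicModule(
    MultiNamespacesModule({"home": {"/a.txt": ["vcl-1"]}}),
    VirtualObjectsModule({"vcl-1": {"attrs": {"owner": "alice"},
                                    "requestable": ["ro-9"]}}),
)
```

Calling `logic.handle("/a.txt", {"namespace": "home"})` returns the requestable object references on which the logic module could then act, possibly through additional modules.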
  • Such additional modules may comprise the virtual objects module again (e.g. when the requestable objects are also handled and kept by the virtual objects module); in that case, the virtual objects module may perform the final desired action on the corresponding requestable object without the intermediate step of sending a requestable object reference back to the logic module.
  • a request for accessing virtual object attributes does not necessarily require the virtual objects module to produce references to the requestable objects, especially when the particular attributes to be accessed are not directly related to relevant requestable object or underlying object properties; then, the required access can be performed internally by the virtual objects module.
  • a request to access some file attributes such as the access permissions can be performed directly by the virtual objects module without having to refer to underlying requestable objects.
  • requestable objects can be managed by the virtual objects module as they were managed by the multi-namespaces module and/or the logic module as described for previous embodiment descriptions.
  • When the virtual objects module is adapted to deal with and keep information about the requestable objects, the responsibility for such possibly overlapping functionality can be shifted to the virtual objects module.
  • the advantage of this is reducing the complexity of implementation of both the logic module and the multi-namespaces module by having the virtual objects module deal completely with aspects related to requestable objects.
  • One of the functions of the virtual objects module may be converting the "virtual contents locations" fed into it into direct references to virtual objects, so that the virtual objects module can act on the specified and related virtual objects, or produce data related to them.
  • all the mechanisms that were described in previous descriptions allowing the contents location module to convert a "requestable object contents location” into a direct reference to a requestable object can also be used by the virtual objects module to convert a "virtual contents location” into a direct reference to a virtual object.
  • the conversion of a "virtual contents location” into a reference to a specific virtual object can be influenced by context monitoring parameters. For example, a "virtual contents location", as mentioned before, could contain references to multiple virtual objects; the specific virtual object to use could be determined by the contents locations module based on context monitoring parameters.
  • the "virtual object contents locations" can be generated by the virtual objects module either upon request or automatically when a virtual object is created, and communicated to other modules (e.g. via the logic module) so that they can be used to refer to the virtual objects when needed (e.g. to have an entry in a namespace to refer to a virtual object).
  • the virtual objects module may use a repository (e.g. a database table) to keep data related to the virtual objects, possibly including their virtual attributes.
  • the virtual objects module may use a repository (e.g. a database table) to keep data related to the requestable objects, possibly including their attributes.
  • the virtual objects module may use the same structures to internally represent both types of objects.
  • if the virtual objects module uses repositories to keep data related to virtual objects and data related to requestable objects, then the same repository (e.g. a database table) could be used to keep data about both types of objects and their attributes.
  • the virtual objects module may be able to alter in any way the data being accessed through related requestable objects (including, for example, "pure data" and/or attributes), possibly based on context monitoring parameters. For example, it could digitally sign and/or verify digital signatures of data being accessed through a particular virtual object.
  • the virtual objects module 107 of the virtual shell 100 is adapted to produce the data related to references to requestable objects further taking into account that each virtual object 404 refers to its related subset 406 of requestable objects 211 through one or more 500 requestable object contents locations 501, each requestable object contents location 501 referring to a subset 503 of the related subset 406 of requestable objects 211, and each of the requestable objects 211 being referred from one or more 502 requestable object contents locations 501;
  • the virtual shell 100 further comprises a contents locations module 110 for producing data related to one or more references to one or more requestable objects 211 from data related to requestable object contents locations 501; and
  • the logic module 109 is adapted to interact 115 with the contents locations module 110 and to produce the set of data for the interception module 104 further from a set of the data produced by the contents locations module 110, said interaction 115 with the contents locations module 110 depending on a subset of the set of the data produced by the context monitoring parameters module 116.
  • the virtual objects module keeps information about both the virtual objects and about the requestable objects referred from the virtual objects. Additionally, it must be noted that requestable objects may be complex entities, as they may contain references to underlying objects contained into underlying repositories. Thus, the virtual objects module would have to understand the characteristics of both virtual and requestable objects and, possibly, be also able to interact with underlying objects that may exist on underlying repositories with different characteristics. All these factors may introduce certain complexities to the implementation and/or potential extensibility of the virtual objects module.
  • the "requestable object contents locations approach" introduces a "contents location module" specialized in handling requestable objects and "requestable object contents locations" (opaque handles referencing requestable objects), improving the flexibility of the virtual shell in relation to the "main approach". Nevertheless, this "requestable object contents locations approach" may imply certain inflexibility when, for example, performing actions on objects through a given namespace (e.g. changing an attribute - for instance, a tag - associated to an object) and producing the corresponding effects isolated from other namespaces while keeping the intended functionality (i.e., following the same example, having the tag appear unchanged in other namespaces, while having the desired functionality - for instance, to generate a categorization - in the namespace where said tag has been changed).
  • said inflexibility is resolved by the introduction of virtual objects as intermediaries between entries and requestable objects, and by having a virtual objects module separated from the multi-namespaces module to handle the virtual objects and the related objects (namely requestable objects and related underlying objects).
  • said "virtual objects approach” may be further improved by adding some kind of "knowledge" to the virtual objects about the characteristics of the requestable objects and related data.
  • a virtual object may comprise a contents location container able to contain one or more of such "requestable objects contents locations”.
  • the virtual objects module (107) may be adapted to produce the data related to references to requestable objects further taking into account that each virtual object comprises said contents location container adapted to contain one or more requestable object contents locations.
  • the contents location containers introduced in the previous paragraph may have the same characteristics and functionalities as the contents location containers of the "requestable object contents locations approach". As a matter of fact, the same structure could be used to implement both contents locations containers. So, any characteristics and techniques described so far related to contents locations containers associated to entries may also be applied to contents locations containers associated to virtual objects.
  • any previously described characteristics and techniques related to either "virtual contents locations” and/or “requestable object contents locations” when used with contents locations containers associated to entries may also be applied to "virtual contents locations” and/or “requestable object contents locations” when used with contents locations containers associated to virtual objects.
  • the same type of "virtual contents locations” could be used for both types of containers, and the same type of "requestable object contents locations” could also be used for both types of containers.
  • the "virtual contents locations" and the “requestable object contents locations” could have the same format and/or characteristics (e.g. they could be encoded in the same way, or have the same mechanism for conversion into actual references to either virtual or requestable objects).
  • a request from a user asking to perform a certain access to an object may result in: the logic module interacting with the multi-namespaces module to resolve a "path" against a certain namespace to locate a particular entry (possibly taking into account context monitoring parameters) in order to obtain a "virtual content location"; the logic module interacting with the virtual objects module, feeding the "virtual content location” to the virtual objects module (possibly taking into account context monitoring parameters), and the virtual objects module performing the required access to the corresponding virtual object if possible, or returning a "requestable object contents location", in case of said "requestable object contents location” being necessary to perform the required access; in the case of the virtual objects module returning said "requestable object contents location", the logic module interacting with the contents locations module, feeding the "requestable object contents locations” to the contents location module (possibly taking into account context monitoring parameters); and the contents locations module performing the required access to the corresponding requestable object (including, if necessary, actions upon related underlying objects carried out by
  • All characteristics, techniques and functionalities about the contents locations module described for previously mentioned embodiments may also apply to the contents location module in embodiments where the virtual shell also comprises a virtual objects module.
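For embodiments combining a virtual objects module with a contents locations module, the staged access described above can be sketched as follows: the virtual objects module answers from virtual attributes when it can, and otherwise hands back a requestable object contents location for the contents locations module to resolve. The function names, status strings and dictionary layouts are hypothetical, and actions on underlying objects are left out.

```python
# Hypothetical state held by each module.
VIRTUAL_OBJECTS = {
    "vcl-7": {"attrs": {"permissions": "rw-r--r--"}, "rocl": "rocl-42"},
}
REQUESTABLE_OBJECTS = {
    "rocl-42": {"size": 2048},
}


def virtual_objects_access(vcl, attribute):
    """Serve the attribute from the virtual object, or ask to forward."""
    vo = VIRTUAL_OBJECTS[vcl]
    if attribute in vo["attrs"]:
        return ("done", vo["attrs"][attribute])
    return ("forward", vo["rocl"])


def contents_locations_access(rocl, attribute):
    """Resolve the opaque handle and access the requestable object."""
    return REQUESTABLE_OBJECTS[rocl][attribute]


def logic_access(vcl, attribute):
    """Logic-module-style orchestration of the two modules."""
    status, payload = virtual_objects_access(vcl, attribute)
    if status == "done":
        return payload
    return contents_locations_access(payload, attribute)
```

In this sketch, a request for the access permissions never reaches the contents locations module, whereas a request for the size is forwarded to it.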
  • the virtual objects module could be adapted to deal with and keep data (e.g. attributes) about requestable objects, and even about their related underlying objects, if any.
  • all characteristics, techniques and functionalities about the virtual objects module described for previously mentioned embodiments may also apply to the virtual objects module in embodiments where the virtual shell also comprises a contents location module.
  • different modules may use repositories (e.g. databases) to keep data related to the objects they manage.
  • different modules may share repositories to keep their data. For example, if, in a particular embodiment, the structures for virtual objects and requestable objects are the same, the contents locations module and the virtual objects module could use the same repository to keep all or part of their data.
  • these solutions may either be equivalent, or have positive or negative effects in several aspects (for example, in terms of performance).
  • the addition of the virtual objects module to the virtual shell in combination with a contents locations module allows increasing the flexibility of the virtual shell by allowing the implementation of really independent namespaces (by allowing further decoupling between the multiple namespaces and the requestable objects) and, at the same time, simplifies the implementation of the virtual shell by concentrating the burden of dealing with the specifics of potentially different underlying repositories into the contents locations module, thus isolating other components (namely the multi-namespaces module, the virtual objects module and, up to a certain extent, the logic module) from aspects which may be considered external to the virtual shell.
  • the virtual shell 100 further comprises a non-context monitoring parameters module 105 for producing data related to one or more non-context monitoring parameters 204;
  • the logic module 109 is adapted to interact 108 with the non-context monitoring parameters module 105 and to produce the set of data for the interception module 104 further from a set of the data produced by the non-context monitoring parameters module 105;
  • the logic module 109 is adapted to interact 114 with the multi-namespaces module 118 further depending on a subset of the set of the data produced by the non-context monitoring parameters module 105.
  • Some of the described embodiments are focused on providing file system semantics to the user, so that applications can run on top of the virtual shell as if the virtual shell were a regular file system.
  • some embodiments of the virtual shell provide a POSIX interface for the applications to deal with the file systems in a Unix-like environment; some embodiments developed for Windows-based operating systems provide an NTFS-like interface for the applications to deal with the virtual shell as a file system.
  • said embodiments take into account a minimum set of monitoring parameters required to implement such specific semantics (e.g. user identity, current time, handling of request history to track related requests, or capacity of data repositories), which have been named "context monitoring parameters". Nevertheless, there are additional monitoring parameters that can be captured, generated and/or used by a virtual shell to provide additional features; in order to distinguish them from the minimum set mentioned above, we call these additional monitoring parameters "non-context monitoring parameters". In general, any previously described principle referring to context monitoring parameters can be applied to non-context monitoring parameters, and vice versa (unless explicitly specified otherwise).
  • any module being able to take any decision or alter any procedure and/or data based on data related to context monitoring parameters can also be able to take any decision or alter any procedure and/or data based on data related to non-context monitoring parameters. Adding non-context monitoring parameters to any decision procedure does not require any complex additional infrastructure: just being able to accept new pieces of data as inputs of the functions taking the decisions.
  • any module able to interact with the context monitoring parameters module may also be able to interact with a non-context monitoring parameters module in an equivalent way.
  • the context monitoring parameters module and the non-context monitoring parameters module could be united in a combined monitoring parameters module being able to produce data related to context and/or non-context monitoring parameters.
  • the list of connected networks and their characteristics are considered non-context monitoring parameters.
  • This type of data can be obtained, for example, from the operating system using system calls or specific commands.
  • such data could be used, for example, by a multi-namespaces module to select a namespace containing versions of files with reduced size (e.g. low resolution images, or low quality audio files) when the network available is a 3G network, and use a namespace with full-sized versions of files when a Wi-Fi network is available.
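The network-based namespace selection in the example above could look roughly like this in Python. The namespace names, the file names and the `type` field of a network record are assumptions made for illustration; falling back to the reduced namespace whenever no Wi-Fi network is connected is also an assumption, since the source only covers the 3G and Wi-Fi cases.

```python
# Two hypothetical namespaces exposing the same path with different contents.
NAMESPACES = {
    "full": {"photo.jpg": "photo_4000x3000.jpg"},
    "reduced": {"photo.jpg": "photo_640x480.jpg"},
}


def select_namespace(connected_networks):
    """Pick a namespace from the list of connected networks."""
    kinds = {net["type"] for net in connected_networks}
    # Assumption: anything other than Wi-Fi is treated as a constrained link.
    return "full" if "wifi" in kinds else "reduced"


def resolve(path, connected_networks):
    return NAMESPACES[select_namespace(connected_networks)][path]
```

The application keeps asking for `photo.jpg`; only the non-context monitoring parameter changes which version is served.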
  • the geographical location of the user issuing a request is considered a non-context monitoring parameter.
  • the geographical location of the computer can be obtained in several ways: for example, getting the coordinates from a built-in GPS and checking whether these coordinates fall inside a given geographical area, getting the approximate geographical location from the user's current IP address, or by having the user feed that information through a specific user request intercepted by the virtual shell (other ways may be also conceived).
  • Some embodiments could use the geographical location of the user to select specific namespaces when creating and accessing objects representing files so that, for example, a file can only be accessed when the user is in the country where the file was created (as may be required in some cases for medical files, for example).
  • the monitoring parameters may affect the interaction with namespaces and/or the interaction with requestable objects and/or the interaction with any related underlying objects.
  • the value of certain monitoring parameters when a requestable object is created e.g. geographical location of the user
  • requestable objects are handled by a module other than the multi-namespaces module (e.g. the virtual objects module)
  • said other module may also be able to accept monitoring parameters as part of the interactions in which it participates.
  • the monitoring parameters may affect the interaction with namespaces and/or the interaction with virtual objects and/or the interaction with requestable objects and/or any related underlying objects.
  • the virtual objects module may be able to accept monitoring parameters as part of its interaction with other modules. For example, the value of certain monitoring parameters when a virtual object is created may be recorded as an attribute of the virtual object, so that it can be used later to compare with current values of monitoring parameters and make a selection (as commented before for requestable objects in other embodiment descriptions).
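Recording a monitoring parameter at creation time and comparing it with the current value later can be sketched as below, using the country-restricted file mentioned earlier as the example. The `country` key, the `created_in` attribute and both function names are hypothetical.

```python
def create_virtual_object(name, monitoring):
    """Record the creation-time value of a monitoring parameter as a virtual attribute."""
    return {
        "name": name,
        "virtual_attributes": {"created_in": monitoring["country"]},
    }


def may_access(virtual_object, monitoring):
    """Compare the recorded value with the current monitoring parameter."""
    return virtual_object["virtual_attributes"]["created_in"] == monitoring["country"]


record = create_virtual_object("patient-001.dat", {"country": "ES"})
```

With this sketch, `may_access(record, {"country": "ES"})` is true, while the same request issued from another country is refused.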
  • any module may use monitoring parameters (either context or non-context) as input to take any decisions or to alter its behaviour in any way.
  • any module can be easily adapted to accept monitoring parameters (context or non-context) as part of its interaction with other modules.
  • the use of non-context monitoring parameters allows the virtual shell to automatically tune its behaviour according to a large number of environment conditions (e.g. hardware capacity, network availability, geographical location, etc.). Thanks to this, the virtual shell can adapt to changing conditions and isolate the user from changes in the operating environment as much as possible.
  • the logic module 109 is adapted to interact 115 with the contents locations module 110 further depending on a subset of the set of the data produced by the non-context monitoring parameters module 105. That is to say, non-context monitoring parameters may also affect the interactions and accesses to objects handled by the contents locations module (e.g. requestable objects and/or their related underlying objects, if any).
  • the contents location module may use a non-context monitoring parameter such as the network availability to decide which underlying object to use when several underlying objects containing data replicas are associated to a particular requestable object (e.g. it may decide to use an underlying object residing in a local underlying repository if the network is not available, instead of an underlying object in a remote underlying repository).
  • the contents location module may use a non-context monitoring parameter, such as the geographical location of the user, to decide in which underlying repository to create underlying objects related to user requests (e.g. it may choose a remote repository geographically near the current location of the user for improved performance).
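Replica selection driven by network availability, as in the example above, might be sketched like this. The replica records, the `local` flag and the rule of taking the first listed replica when the network is up are assumptions; the source only states that a local underlying object may be preferred when the network is unavailable.

```python
def choose_underlying_object(replicas, network_available):
    """Return the replica to use, or None if none is reachable."""
    if network_available:
        candidates = replicas  # assumption: keep the listed preference order
    else:
        candidates = [r for r in replicas if r["local"]]
    return candidates[0] if candidates else None


replicas = [
    {"repository": "remote-store", "local": False},
    {"repository": "local-disk", "local": True},
]
```

The same requestable object thus maps to different underlying objects depending on a non-context monitoring parameter, without the entry or the virtual object being involved.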
  • the virtual objects module 107 comprises one or more virtual objects sub-modules, each of said virtual objects sub-modules being adapted to access virtual attributes 409 of virtual objects 404 and/or attributes 212 of requestable objects 211, each of said attributes 409; 212 having a volatility level according to a predetermined volatility classification.
  • Values of virtual attributes of virtual objects may have different levels of volatility, from immutable values to highly volatile attributes. It was mentioned in the background art section that traditional trends tended to group metadata (e.g. attributes) together, so that, being usually small, they could be packed and transferred in an efficient way. Nevertheless, such packing also has negative consequences: for example, updating an attribute usually implies that the implementation locks the attribute set in order to keep it consistent; at the same time, this may prevent another process from accessing an unmodified attribute while another attribute is being updated (lock contention). In distributed systems, the effects are even more noticeable. For example, the virtual objects module could be implemented in a distributed way (e.g.
  • a usual technique in such implementations is to use a cache mechanism to avoid unnecessary transfers and, in some embodiments, this could involve keeping a local copy of virtual attributes of a virtual object from the remote repository.
  • File systems are a clear example where this may occur.
  • i-node keeping all file system object (file, directory, etc.) related information.
  • Such information includes some immutable information which is needed very often (e.g. the i-node identifier, and the file object type), information which is also used very often (possibly by multiple clients) and rarely updated (e.g. the owner, the group, and the access permissions) and information which is rarely needed but updated very often (e.g. the file size, or the access and modification times).
  • the i-node could be, in some embodiments, represented by a virtual object, the information of the i-node being the virtual attributes. If the virtual objects module treats the virtual attributes of the virtual object as a unit, the issues mentioned in the previous paragraph may arise.
  • a solution for said type of situations may be to classify the attributes in "volatility levels" according to how often they are updated (and possibly also taking into account how often they are needed). Using such classification, a virtual objects module can treat each set of attributes in a specific way, according to their characteristics (for example, it may use different caching policies - e.g. with different lease times, etc.). In particular, some embodiments may provide a virtual objects module having virtual objects sub-modules specialized in treating each set of virtual attributes according to a certain volatility classification.
  • the classification of virtual attributes in volatility levels may be done a-priori and be static (i.e. a virtual attribute of a virtual object has a pre-defined volatility level) or it may be dynamic (i.e. the classification of a virtual attribute of a virtual object may change according to its use - and therefore, its handling may be delegated to different sub-modules at different times). It is important to mention that all considerations about volatility of virtual attributes of virtual objects are also applicable to attributes of requestable objects.
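A static volatility classification with per-level cache lease times can be sketched as follows, using the i-node attributes from the example above. The level names, the lease values and the `AttributeCache` class are illustrative assumptions; the time is passed in explicitly so that the behaviour is easy to follow.

```python
# Hypothetical lease time (in seconds) per volatility level.
LEASE_SECONDS = {"immutable": float("inf"), "stable": 60.0, "volatile": 1.0}

# Static, a-priori classification of i-node-like attributes.
VOLATILITY = {
    "inode_id": "immutable", "file_type": "immutable",
    "owner": "stable", "group": "stable", "permissions": "stable",
    "size": "volatile", "atime": "volatile", "mtime": "volatile",
}


class AttributeCache:
    """Caches each attribute separately, with a lease set by its volatility."""

    def __init__(self, fetch):
        self.fetch = fetch    # callable retrieving the value from the remote repository
        self.entries = {}     # attribute name -> (value, time fetched)

    def get(self, name, now):
        lease = LEASE_SECONDS[VOLATILITY[name]]
        cached = self.entries.get(name)
        if cached is not None and now - cached[1] < lease:
            return cached[0]
        value = self.fetch(name)
        self.entries[name] = (value, now)
        return value
```

Because each attribute is cached on its own, refreshing a volatile attribute such as the size does not invalidate the cached owner or permissions. A dynamic classification could be obtained by updating `VOLATILITY` at run time according to observed use.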
  • the virtual objects sub-modules dedicated to specific volatility levels could also be used to handle attributes of requestable objects.
  • the contents locations module could comprise contents locations sub-modules dedicated to handle attributes of requestable objects with specific volatility levels, analogously to the virtual objects sub-modules dedicated to handle virtual attributes with specific volatility levels.
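The volatility classification described above can be sketched as a small cache that applies a different lease time to each volatility level. This is only an illustrative sketch: the level names, the lease values, the attribute names and the `VolatilityCache` class are assumptions made for the example, not part of the described embodiments.

```python
import time

# Hypothetical volatility levels and lease times in seconds (illustrative values).
LEASE_SECONDS = {"immutable": float("inf"), "stable": 60.0, "volatile": 1.0}

# Hypothetical static (a-priori) classification of i-node attributes.
VOLATILITY = {
    "inode_id": "immutable", "type": "immutable",
    "owner": "stable", "group": "stable", "mode": "stable",
    "size": "volatile", "atime": "volatile", "mtime": "volatile",
}


class VolatilityCache:
    """Caches attribute values with a lease that depends on the volatility level."""

    def __init__(self, fetch, clock=time.monotonic):
        self._fetch = fetch    # callable(name) -> value read from the repository
        self._clock = clock
        self._cached = {}      # attribute name -> (value, expiry time)

    def get(self, name):
        now = self._clock()
        hit = self._cached.get(name)
        if hit is not None and now < hit[1]:
            return hit[0]      # lease still valid: no repository access
        value = self._fetch(name)
        lease = LEASE_SECONDS[VOLATILITY[name]]
        self._cached[name] = (value, now + lease)
        return value
```

A dynamic classification could be approximated by updating the `VOLATILITY` table at run time according to observed update rates.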
  • the virtual objects module 107 comprises a contents locations virtual objects sub-module being adapted to access virtual attributes 409 of virtual objects 404, said virtual attributes 409 referring to requestable objects 211.
  • the requestable objects can be used to keep low level data and attributes (e.g. those referring to underlying objects); such attributes and related data could be independent from the namespaces or particular entries (e.g. possible attributes of a requestable object related to an underlying file system could be the path of an underlying file in an underlying file system - the reference to the underlying object - and possibly the file size, or the time of the last modification; such values do not depend on the particular entry or namespace used to access the object, and a change to them is likely to have a global effect across namespaces); moreover, the requestable objects, by further possibly depending on underlying objects, may require complex protocols for updating certain attributes (e.g. the value of an attribute - e.g. the size of an underlying object - may be required to be kept consistent with the actual size of the related underlying object).
  • virtual objects can be used to keep high level virtual attributes which may have a more restricted scope and may effectively depend on the particular entry or namespace used to access them, while other virtual attributes may be related to or refer to requestable objects and, possibly being of global scope, may have stricter requirements of consistency guarantees. In particular, this may involve different needs regarding consistency and, depending on the embodiment, lock management. Therefore, in some embodiments, the virtual objects module may have a sub-module (the contents location virtual objects sub-module) specialized in dealing with virtual attributes of virtual objects related to requestable objects.
  • Such sub-module may implement the appropriate means for manipulating the references to requestable objects and related data with the adequate consistency guarantees (e.g. by using an appropriate locking protocol when performing updates to virtual attributes related to requestable objects).
  • virtual attributes not related to requestable objects may be able to use simplified consistency mechanisms, if any at all.
  • the virtual objects module 107 comprises a non-contents locations virtual objects sub-module being adapted to access virtual attributes 409 of virtual objects 404, said virtual attributes 409 not referring to requestable objects 211.
  • virtual attributes referring to requestable objects may require specialized protocols for handling, so that, in some embodiments, the virtual objects module may comprise a contents location virtual objects sub-module specialized to deal with such particular type of virtual attributes.
  • the virtual objects module comprises a non-contents locations virtual objects sub-module, specialized in dealing with virtual attributes which are independent from requestable objects and related data. Being aware of the independence from related requestable objects, such sub-module may implement optimizations in both the way to access and the way to keep data about said virtual attributes (e.g. said sub-module may handle accesses to such virtual attributes without requiring locks, or it may keep them in a separate repository, so that accessing them does not interfere with accesses related to requestable objects which, possibly, may require more complex access methods).
  • the virtual objects module 107 comprises an exclusive process for each virtual object 404, said exclusive process having exclusivity for performing updating access to virtual attributes 409 of said virtual object 404.
  • the virtual shell of the invention may work in an environment where more than one user may issue requests simultaneously.
  • a well known technique to deal with these situations is using explicit locking mechanisms: the code implementing a request tries to acquire locks on the required data structures; if the locks are granted, the code may continue; otherwise, it has to wait until the current lock owner releases them. Nevertheless, in situations with lots of activity, or when large groups of structures have to be manipulated, this technique can lead to lock contention. Even with no conflicts, acquiring and releasing locks may produce a certain overhead.
  • the virtual objects module can use a novel approach to deal with concurrent activity: instead of locking data structures to allow their use from different "threads" or "processes", each data structure has an assigned "thread" or "process" which is the only one with permissions to modify the data.
  • the virtual objects module may associate one of these "exclusive processes" to each virtual object (e.g. representing a file system object), and such "exclusive process” would be the only one with permissions to modify the virtual attributes of the virtual object associated to said "exclusive process” (e.g., the data related to requestable objects pointing to underlying files containing the represented file's data - if any -, the target path of a symbolic link - if the virtual object represents a symbolic link - and directory specific fields - if the virtual object represents a directory).
  • other processes may be allowed to read virtual attributes that do not belong to their associated virtual objects (e.g. by directly accessing them from a repository - for instance, a database). This information is usually treated as a hint. Nevertheless, the values of certain virtual attributes can be considered safe, even when read by non-owner processes, because the authorized "exclusive processes" modify them in a particular order that allows checking their correctness (e.g. using version numbers and non-reusable object identifiers is a well-known technique that allows the implementation of some of these checking methods).
  • when a request arrives at the virtual object module to update a virtual object, it is forwarded to the "exclusive process" associated to the target virtual object. If the request involves a modification, then the process performs such modification (e.g. by updating a repository with virtual attributes data such as, for instance, a database); otherwise, the requested data is included in the response. Once the process sends the response back to the caller, it may wait for the next request.
  • the request is sent to the "main" object's "exclusive process” (the directory, in the example) which may interact with the "exclusive process” of the other involved virtual object (the file to be removed, for example) by any necessary means (e.g. by exchanging messages).
  • the implementation of "exclusive processes" associated to single virtual objects, used by the virtual objects module has been done using the Erlang/OTP environment, which provides support for extremely lightweight processes (a single Erlang node may handle up to a few million processes).
  • the data related to virtual objects was kept in a repository based on the Mnesia database (a database provided with the Erlang/OTP environment).
  • a single "exclusive process” may be used to handle more than one virtual object.
  • the subsets of virtual objects handled by different "exclusive processes" should be disjoint (i.e. two different processes with any overlap in time should not have ownership of the same virtual object, even at different times).
  • some embodiments of the virtual shell may avoid contention problems and costly synchronization mechanisms by the virtual object module using a dedicated "exclusive process" for each virtual object (at least, to perform updates to its corresponding virtual object). This mechanism may improve the performance and/or the scalability of the virtual shell, especially when the virtual shell is adapted to operate in environments where several users may issue simultaneous requests to the virtual shell (for example, parallel and/or distributed systems).
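The "exclusive process" idea can be illustrated with a minimal sketch in which one dedicated thread owns the attributes of one virtual object and all requests reach it through a message queue, so no explicit locks are taken on the attributes. The described implementation used Erlang/OTP processes; Python threads are swapped in here only for illustration, and the `ExclusiveProcess` class, its operation names and the `version` counter are assumptions of the example.

```python
import queue
import threading


class ExclusiveProcess:
    """Owns one virtual object's attributes; only its own thread mutates them."""

    def __init__(self, attributes):
        self._attrs = dict(attributes)
        self._inbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Requests are served one at a time, in arrival order.
        while True:
            op, name, value, reply = self._inbox.get()
            if op == "stop":
                reply.put(None)
                return
            if op == "set":
                self._attrs[name] = value
                # Version number that readers could use to validate hints.
                self._attrs["version"] = self._attrs.get("version", 0) + 1
            reply.put(self._attrs.get(name))

    def call(self, op, name=None, value=None):
        """Send a request to the owning thread and wait for its response."""
        reply = queue.Queue(maxsize=1)
        self._inbox.put((op, name, value, reply))
        return reply.get()
```

For example, `ExclusiveProcess({"owner": "alice"}).call("set", "owner", "bob")` is serialized with every other request for the same object, whichever thread issues it.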
  • the virtual objects module 107 is adapted to activate and/or deactivate exclusive processes for virtual objects.
  • the virtual object module uses dedicated "exclusive processes" to handle individual virtual objects, not all virtual objects must have an associated active "exclusive process” at all times.
  • the "exclusive process” may be eliminated and disappear, for example, after a period of inactivity, and be re-created when the associated object is needed again.
  • the virtual objects module may be responsible for activating and/or deactivating the dedicated "exclusive processes" associated to individual virtual objects
  • a new “exclusive process” may be created for such virtual object for taking ownership of the corresponding virtual attributes, possibly keeping them internally cached.
  • the virtual objects module may forward the request to said "exclusive process” for processing.
  • the virtual shell and, in particular, the virtual objects module can maintain a greater control on the amount of resources being used to handle virtual objects.
  • the virtual objects module 107 is adapted to deactivate exclusive processes for virtual objects depending on an inactivity time threshold and/or available resources.
  • the virtual objects module can keep track of the inactivity time of the "exclusive processes" associated to virtual objects, and eliminate them after a given period of inactivity.
  • a simple way to do this could be to maintain a "least recently used" list of active "exclusive processes", and set up a timer to check said list at intervals and eliminate the processes exceeding a certain inactivity time threshold.
  • the inactivity time threshold for different "exclusive processes” may also be different. For example, it may depend on the cost to start a given "exclusive process" associated to a particular virtual object, if different virtual objects involve different costs. Another possibility is to use heuristics based on knowledge about the system (for example, it may happen that a certain virtual object is used regularly, but with inactivity intervals larger than for other virtual objects - then, said virtual object may have its inactivity time threshold adapted accordingly).
  • the virtual objects module may decide to eliminate "exclusive processes” when it detects that operating resources are scarce. This may be used to balance between performance (having "exclusive processes” active and ready) and capacity (having enough resources to start new "exclusive processes” if a bunch of requests on new virtual objects arrive)
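The "least recently used" list with an inactivity threshold can be sketched as follows. The `ProcessRegistry` class and its method names are hypothetical; a real embodiment would start and stop actual processes where this sketch only tracks identifiers, and `reap` would be driven by a periodic timer.

```python
import time
from collections import OrderedDict


class ProcessRegistry:
    """Tracks active exclusive processes in least-recently-used order."""

    def __init__(self, threshold, clock=time.monotonic):
        self._threshold = threshold    # inactivity time threshold, in seconds
        self._clock = clock
        self._active = OrderedDict()   # object id -> last use time, oldest first

    def touch(self, object_id):
        """Record a request; return True if the process had to be activated."""
        created = object_id not in self._active
        self._active[object_id] = self._clock()
        self._active.move_to_end(object_id)
        return created

    def reap(self):
        """Deactivate every process idle for longer than the threshold."""
        now = self._clock()
        removed = []
        while self._active:
            object_id, last_used = next(iter(self._active.items()))
            if now - last_used <= self._threshold:
                break                  # remaining entries are more recent
            del self._active[object_id]
            removed.append(object_id)
        return removed

    def active(self):
        return list(self._active)
```

Per-object thresholds or resource-based eviction could be layered on top by replacing the single `threshold` with a lookup or by calling `reap` with a lower threshold when resources are scarce.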
  • each exclusive process for virtual object is adapted to deactivate itself and/or to activate/deactivate one or more of the other exclusive processes.
  • Each "exclusive process" associated to a virtual object may be able to deactivate itself when certain conditions occur.
  • the advantage of having an "exclusive process” to deactivate itself is that each "exclusive process” may keep its own state and track the conditions that make it a candidate for deactivation, instead of the virtual objects module having to maintain global information about all the "exclusive processes” and their state and conditions in order to deactivate them when necessary.
  • the capacity of an "exclusive process” to deactivate itself does not necessarily prevent the virtual objects module (or other "exclusive processes") from being able to eliminate such "exclusive process”.
  • An “exclusive process” may be able to activate other processes without the intervention of the virtual objects module (for example, when said "exclusive process” requires the collaboration of a different "exclusive process” for a different virtual object which is not activated)
  • An "exclusive process" may eliminate another "exclusive process". For example, a requirement to deactivate an "exclusive process" associated to a virtual object representing a directory in a file system may cause said "exclusive process" to request the deactivation of other "exclusive processes" associated to virtual objects representing the children of said directory.
  • each exclusive process for virtual object is adapted to deactivate itself and/or to deactivate other exclusive processes depending on an inactivity time threshold and/or available resources.
  • each exclusive process for virtual object comprises one or more exclusive sub-processes, each of said exclusive sub-processes being dedicated to virtual attributes 409 of said virtual object 404, each of said virtual attributes 409 having a volatility level according to a predetermined volatility classification.
  • virtual attributes of virtual objects may have different volatility levels: some virtual attributes may be immutable or rarely modified, and some virtual attributes may be updated more often.
  • each "exclusive process" may have one or more collaborating sub-processes specialized in handling virtual attributes with a certain level of volatility, in the same way that the objects locations sub-modules operated in previously described embodiments.
  • each exclusive process for virtual object comprises a contents locations exclusive sub-process being dedicated to virtual attributes 409 of said virtual object 404, each of said virtual attributes 409 referring to requestable objects.
  • dealing with requestable objects may be more complex than dealing with virtual objects, as requestable objects may have related underlying objects and it may be necessary (for some purposes) to keep at least certain attributes of requestable objects synchronized with data related to underlying objects. Therefore, when virtual attributes of virtual objects refer to requestable objects, extra care may be needed to operate with them (e.g. an update may eventually involve interaction with a possibly unreliable underlying repository - thus having an increased risk of delays or errors that may need special handling).
  • each of such "exclusive processes" may have an associated "contents location exclusive sub-process" specialized in dealing with virtual attributes related (or requiring access) to requestable objects related to the virtual object.
  • each exclusive process for virtual object comprises a non-contents locations exclusive sub-process being dedicated to virtual attributes 409 of said virtual object 404, each of said virtual attributes 409 not referring to requestable objects.
  • such "exclusive processes" may have a sub-process specialized in dealing with virtual attributes related to requestable objects.
  • the "exclusive processes” may also have an associated "non-content locations exclusive sub-process” specialized in dealing with virtual attributes not related to requestable objects. This may be useful, for example, when such virtual attributes not related to requestable objects are handled and/or stored in a different way (e.g.
  • the multi-namespaces module 118 is adapted to produce a reproducible ordered list 605 of entries comprising one or more disjoint partial results 606 from a set of entries of namespaces, each entry of said set of entries having one or more related properties being able to be inputted to at least one hash function, said reproducible ordered list being obtained from:
  • the multi-namespaces module can be requested to produce a list of entries according to certain criteria (for example, to obtain the list of all entries in a namespace being associated to another entry acting as a container).
  • the virtual shell provides file system semantics, such feature can be used, for example, to provide a directory listing
  • a possible approach could be using a lock mechanism (or a database transaction, if the entries are stored in a database) to prevent modifications to the affected entries while producing the listing. For short listings of entries this may be effective; nevertheless for potentially very large listings, this has a number of problems: for example, generating a large listing may take a considerable amount of time, causing performance problems due to delays to other requests waiting for the locks to be released. Also, in distributed environments where the listing may have to be transferred between components in different nodes, potential problems arise: either all the listing is transferred at once (probably causing long response times due to having to wait for all data to arrive) or the listing is fragmented (facing complex error situations in case the communication fails at the middle of the transfer, leaving the locks acquired).
  • POSIX file system semantics
  • a well-known solution in the field of file systems to handle large directories is to maintain the entries of a directory in an ordered tree structure, which is usually stored as a special file.
  • By having the entries ordered it is relatively simple to keep a pointer to the last entry listed, and generate the listing fragment by fragment. If a new entry is added before the current pointer, it will be ignored; if an already listed entry was removed and then re-created, it will be placed before the current pointer, so it will not be listed twice; and, finally if an entry is either created or removed after the current pointer, it does not affect the part of the listing already generated.
  • the trees also have issues, especially when growing to large scales: in order to be efficient, trees should be kept balanced when new entries are added and when existing entries are removed.
  • Balancing a tree while keeping it ordered may be a costly operation, which in the case of file systems is aggravated by the fact that, in order to take advantage of block devices, the leaves of the trees are not entries, but groups of entries packaged in one or more consecutive blocks. As such groups can keep only a limited number of entries, adding or removing an entry may involve splitting or uniting blocks of entries, causing non-trivial (and potentially time consuming) reorganizations of data. Furthermore, in current distributed environments, maintaining an ordered tree which could be distributed across several nodes for scalability purposes adds even more complexity and cost to its management, making it a poorly scalable solution. Ideally, a possible solution could be storing the entries in a table in a database.
  • Some embodiments of the invention comprise a mechanism that allows storing entries in non-explicitly ordered repositories (such as database tables), while allowing the implementation of entry listings (even fulfilling complex semantics such as POSIX).
  • the mechanism is based on the fact that all entries have some distinctive characteristic that allows differentiating them (for example, all entries in a directory must have different "names"), so that such characteristic can be used as input to a function (e.g. a hash function), and so that the set of distinct results obtained from applying such function to the entries to be listed can be smaller than the set of entries to be listed.
  • the way to produce and to list entries in a reproducible order from partial subsets of entries starts by generating the list of distinct results of the selected hash function applied to the chosen characteristic of the entries to be listed (e.g. their names).
  • Each distinct hash value defines a disjoint subset of the entries to be listed (said subset containing the entries for which the hash function returns the same hash value).
  • the hash function should be chosen in such a way that the number of distinct hash values and their size (their number of bits) is adequate to be transferred in a single shot from the component having the set of entries to be listed to the component generating the list.
  • each subset of entries defined by each hash value should also have a size adequate for being transferred in a single shot with a reasonable cost, and sorted in the component generating the ordered listing (note that a relatively large number of hash values with relatively few bits can be packed in relatively small space, and be used to define a large number of subsets, which can be potentially small).
  • the component generating the ordered listing may sort said hash values following any arbitrary sorting criteria. Then, the component generating the ordered listing requests the subset of entries corresponding to the first hash value, orders the subset of entries according to any criteria, and starts processing it to generate the whole list (e.g. forwarding the subset to the user entry by entry). The same procedure is then repeated for each distinct hash value defining a subset of entries, until all subsets of entries have been obtained and, thus, the list of all entries has been generated.
  • Replaying the entry listing in the same order from an arbitrary point can be achieved as long as the list of distinct hash values is maintained (or recovered and re-organized into the original order). For example, a handle to indicate a given position in the listing to start the replay in an efficient way would consist of the name of the desired entry and the hash value corresponding to it.
  • embodiments using the above mechanism improve the flexibility of the previously described embodiments, because the user can request listings of large amounts of entries and the virtual shell produces the corresponding results with very high reliability and efficiency.
  • said listings of potentially large amounts of entries in a reproducible way, fragment by fragment are obtained without the need of locking large amounts of data or for long (or even arbitrary) amounts of time.
  • the mechanism does not require any state in the component storing the entries (e.g. to track the last entry returned), thus avoiding potential causes of errors in distributed environments.
  • the repository used to keep the entries to be listed does not need to have any predefined ordering semantics (in particular, tables in arbitrary databases can be used, enabling the system to take advantage of the database technology, including scalability and distributed operation).
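The listing mechanism described above can be sketched as follows: the listing component obtains the distinct hash values, sorts them, then fetches and sorts one subset of entries at a time, and a (hash value, name) handle allows restarting from an arbitrary point. The `Repository` class stands in for an unordered store such as a database table; it, the `bucket` function, the use of SHA-256 and the 8-bit hash width are assumptions chosen for the example.

```python
import hashlib


def bucket(name, bits=8):
    """Hash value of an entry name, reduced to its leading `bits` bits (0..32)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - bits)


class Repository:
    """Stand-in for a store with no ordering semantics and no listing state."""

    def __init__(self, names):
        self.names = set(names)

    def distinct_hashes(self):
        return {bucket(n) for n in self.names}

    def entries_with_hash(self, h):
        return [n for n in self.names if bucket(n) == h]


def list_entries(repo, resume_after=None):
    """Yield (hash, name) pairs in a reproducible order, one subset at a time.

    `resume_after` is an optional (hash, name) handle; the listing restarts
    right after that position.
    """
    for h in sorted(repo.distinct_hashes()):
        if resume_after is not None and h < resume_after[0]:
            continue                       # whole subset already listed
        for name in sorted(repo.entries_with_hash(h)):
            if resume_after is not None and (h, name) <= resume_after:
                continue                   # entry already listed
            yield (h, name)
```

Because the position is carried entirely by the handle, the repository keeps no per-listing state, and an entry created before the handle position is simply not reported when the listing resumes.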
  • the multi-namespaces module 118 is adapted to obtain each hash output from selecting a set of bits from the result of applying the hash function to the properties of each entry of the set of entries, said set of bits depending on the number of entries from which the reproducible ordered list of entries is produced.
  • the multi-namespaces module of some embodiments may use the mechanism described above to generate listings of entries from partial sets of entries defined by hash values. Nevertheless, this mechanism has a drawback: given a certain hash function generating a hash value of a given number of bits for each entry, the potential number of subsets of the set of entries generated does not take into account the number of entries to be listed. So, it may happen that, for a small set of entries (e.g.
  • the subsets generated are too small to be handled efficiently (they may have very few entries and the sequence of obtaining each subset might have a higher cost than getting all the entries at once); on the contrary, for very large sets of entries, the division may produce too few subsets (so the subsets are still too large to be handled effectively).
  • a solution for this drawback can consist of using a hash function returning a relatively large number of bits for each entry, and then using a subset of the bits depending on the number of entries to be listed: for small sets, just a few bits (or even none at all, to select the whole set in one shot) may be used; on the contrary, the larger the set, the more bits from the hash value can be used to define the subsets, so the size of the generated subset of entries for partial lists is kept within a reasonable amount.
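One possible way to pick the number of hash bits from the number of entries is sketched below. The function name `bits_for`, the target subset size of 1000 entries and the 32-bit cap are illustrative assumptions; the sketch also assumes the hash spreads entries roughly evenly across subsets.

```python
def bits_for(entry_count, target_subset_size=1000, max_bits=32):
    """Number of leading hash bits to use so that each subset holds
    roughly `target_subset_size` entries or fewer."""
    # Subsets needed, rounded up; 2**bits must be at least this large.
    subsets_needed = -(-entry_count // target_subset_size)
    if subsets_needed <= 1:
        return 0                       # small set: fetch everything in one shot
    bits = (subsets_needed - 1).bit_length()
    return min(bits, max_bits)
```

For instance, under these assumptions a set of one million entries would use 10 bits (1024 subsets of about 1000 entries each), while a set of 500 entries would use no bits at all.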
  • each entry of namespaces comprises at least one hash container adapted to contain the result of applying the hash function to the properties of the entry.
  • the hash value could be stored as a column in the record containing the entry information. With such implementation, recovering the list of distinct hash values from the database could be done in an easy and efficient way (especially if the column is indexed).
  • variable number of bits e.g. a variable length prefix of the hash value
  • recovering it could be implemented by selecting entries from a database table with hash values within a specific range, where the minimum value of the range could be obtained by using the desired hash value prefix followed by zeros (0) to complete the full length of the hash values, and the maximum value of the range could be obtained by using the desired hash value prefix followed by ones (1) to complete the full length of the hash values.
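The range computation described above amounts to padding the prefix with zeros for the lower bound and with ones for the upper bound. A minimal sketch, in which the function name `prefix_range` and the default 32-bit hash length are assumptions:

```python
def prefix_range(prefix, prefix_bits, total_bits=32):
    """Return (low, high), the inclusive range of full-length hash values
    whose leading `prefix_bits` bits equal `prefix`."""
    if not 0 <= prefix_bits <= total_bits:
        raise ValueError("prefix_bits must be between 0 and total_bits")
    if prefix >> prefix_bits:
        raise ValueError("prefix does not fit in prefix_bits")
    pad = total_bits - prefix_bits
    low = prefix << pad                # prefix followed by zeros
    high = low | ((1 << pad) - 1)      # prefix followed by ones
    return low, high
```

With an indexed hash column, the resulting bounds could be used in a range selection such as a `BETWEEN low AND high` condition; the column and table layout are left to the embodiment.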
  • the multi-namespaces module 118 is adapted to select the hash function to be applied to one or more of the properties of each entry of the set of entries depending on data from the context monitoring parameters module 116 and/or from the non-context monitoring parameters module 105.
  • the hash function used to divide a set of entries into smaller subsets for listing purposes could be modified depending on the number of entries to be listed (note that changing the number of used bits from the result of the hash value is, in fact, equivalent to changing the hash function).
  • the component generating the listing e.g. the user client
  • a low capacity hardware e.g. a mobile device
  • the selection of the hash function to use for generating subsets of entries may depend on context and/or non-context monitoring parameters (such as, for example, the bandwidth between components, or the type of hardware being used by the user client). All the details provided in previous descriptions about the context monitoring parameters module comprised in the virtual shell are also applicable to the context monitoring parameters module for being used in the virtual shell.
  • each of the modules or sub-modules can be detached and run in the same or different nodes.
  • any module or sub-module can be divided (for performance, scalability or any other implementation reasons) into components and sub-components, which may also be detached and run either locally or in remote locations with respect to the other components.
  • the logic module and the virtual objects module could run in different systems.
  • a particular module for instance, the multi-namespace module
  • could be divided in components also running in different nodes for instance, a component running in the user's client system - e.g.
  • the virtual shell may be able to deal with multiple users simultaneously, possibly in different remote locations.
  • the underlying repositories may also be remote and/or of parallel or distributed nature (e.g. a remote network underlying file system or a parallel or distributed underlying file system).
  • the virtual shell can be able to deal with underlying data contents being modified from several locations (either simultaneously or not).
  • the virtual shell of the invention may present the user a virtual view of file system directories, allowing the user to organize the files as he/she pleases, while implementing a completely different directory layout in an underlying file system.
  • Some embodiments of the virtual shell were developed for a Unix-like operating system (namely Linux) which provided a POSIX interface for the applications to deal with the file systems. This involves dealing with particular pieces of metadata and a specific set of operations to manipulate the files. Nevertheless, one embodiment of the virtual shell has been designed so that it can be easily adapted to contain and handle different pieces of data (monitoring parameters, attributes, etc.) and respond to a different set of requests. In other words, the design has no hard dependencies on a specific operating system or a specific file system interface. Embodiments developed for Windows-based operating systems and providing an NTFS-like interface for the applications to deal with file systems follow the same principles.
  • the virtual shell can offer POSIX file system semantics and the file system objects (e.g. files, directories, etc.) are represented by requestable objects
  • the attributes of such requestable objects may include access control (owner, group and related access permissions), symbolic and hard link management, and size and time data for non-regular files (the management of sizes and access times for regular files relies on the underlying file system used as underlying repository).
  • Directory handling can combine the features of the multi-namespaces module (to handle the directory hierarchy) with the attributes of requestable objects associated to the entries acting as containers (e.g.
  • the attributes of the requestable objects do not need to include any knowledge of low-level data storage (in particular, there are no references about disks, blocks or other storage objects - though they could be easily included if necessary): the reference to the underlying object could be reduced to the path of the underlying file (and it may be left up to the underlying file system to take decisions about low-level data server selection, striping, block/object placement, etc.)
  • POSIX-based systems assign an internal identifier (the so-called "i-node" number) to each file system object, and such "i-node" number must be made available to the user; then, the virtual shell also needs to generate such "i-node" numbers (for a number of technical reasons, it is not feasible to simply forward "i-node" numbers from the underlying file system: for example, there may be several underlying file systems - with possibly duplicated "i-no
  • i-node numbers are somewhat dependent on the operating system used by the user, as they have to be fed back to it (one of the possible requests in a POSIX system is a request for translation from an entry name into an "i-node" number) and they can be used by the kernel as identifiers in future requests. Most systems use integers with 32 or 64 bits, while a few can use larger integer sizes. It is usually assumed by the system that an i-node number represents a unique object in a given file system, and that such i-node number will not be reused until the previous object is destroyed.
  • an i-node number could be reused as long as the underlying system keeps no reference to the previous object.
  • some virtual shell embodiments may use a "forget" request from the user to notify that some references to a particular i-node number have been released. (Given this scenario, it is clear that user applications should never rely on i-node numbers to identify files.)
  • Some systems use a "generation number" associated to the i-node number which is increased when such i-node number is re-used; nevertheless, this is not usually visible outside the system.
  • the virtual shell may internally identify and refer to requestable objects representing file system objects by means of identifiers (internally called "inums" in some embodiments). These identifiers are unique, at least for the lifetime of the object. Such requestable object identifiers handled internally do not necessarily match the i-node numbers provided to the kernel. So, an inum value (the internal identifier), in general, has to be mapped to a locally unused "i-node" number adequate for the user.
  • the virtual shell module may perform an explicit map. This could be achieved, for example, by keeping a list (or a similar structure) of locally unused i-node numbers, and a hash table (or a similar structure) mapping inum values to assigned i-node numbers, and possibly also the other way round (such data structures could be either local or remote, and could be temporary - and, for example, cleaned up on system restart - or made persistent, using, for example, a database or some other kind of backend storage).
  • Some possible situations where having a limited size numeric identifier for the file system objects could be a handicap are, among others: having a very large set of file objects exceeding the capacity of the local i-node numbers (larger identifiers could be used to identify the file system objects, and only those in the "active" working set would need to be assigned a smaller sized i-node number); using monotonically increasing identifiers which may, eventually, exceed the limited size of the i-node numbers (using monotonically increasing and/or any other type of non-repeatable identifiers has some interesting properties that may be used to optimize certain internal algorithms); using mechanisms to generate distributed unique identifiers (such identifiers are usually large, to avoid the chances of independently generating the same identifier for different objects); etc.
  • Some file system requests, such as file creation, are delicate operations which require careful coordination between the different modules of the virtual shell.
  • the multi-namespaces module, with the assistance of the logic module, will have to check the access permissions and create a new entry in a namespace.
  • a contents locations module may need to request the creation of an actual underlying file somewhere in the underlying file system and possibly create the representation of the corresponding requestable object and its attributes.
  • the new entry and the new requestable object will be linked (by means of a "requestable object contents location", for example).
  • the context monitoring parameters module may be required to generate a "handle" to the new file, so that further requests on it (e.g. reads and writes) can be tracked and associated to the original file creation operation (and to any particular flags indicated during the creation of the file) without needing to repeat searches of the path in the namespace.
  • requestable objects are managed by a contents locations module (though, of course, it is clear that the same techniques and mechanisms described can be applied to any other module able to handle requestable objects in different embodiments).
  • the first step to fulfil a file creation request is to generate an identifier for the new requestable object (the so-called "inum").
  • This identifier should be unique across a possibly distributed system where several users may issue simultaneous requests to create files, and where the modules of the virtual shell (including the contents locations module) can also be distributed in multiple components.
  • the module component responsible for creating the requestable object (which may be a contents locations module component, if such module is available) uses the following mechanism: instead of having each component generate the identifiers on its own, the components request them from a global identifier server component (which may depend, for example, on the logic module) which gets requests from all participating contents locations module components and guarantees that they receive different identifiers.
  • a specific embodiment of the identifier service uses a monotonically increasing counter to generate identifiers, and has the particularity that assigned identifiers are never reused, even if the corresponding requestable object has been deleted (for long running systems, counter overflow is not a problem, as the number of bits used for the counter could be dynamically increased).
  • the contents location module components may request a range of identifiers (instead of a single one), and keep them locally and use them whenever needed.
  • a file creation request also requires generating an identifier for the underlying object that may need to be created (e.g. an actual file in an underlying file system).
  • This identifier should also be unique across a possibly distributed system (where different contents locations module components may be simultaneously serving different requests on their own - and therefore, maybe trying to simultaneously create underlying objects, possibly in the same underlying repository).
  • some embodiments allow each contents locations module component to independently generate unique identifiers for the underlying objects. One way to achieve this would be assigning a unique identifier to each component of each virtual shell module (in particular, to each component of a distributed contents location module).
  • the component identifier can be used to "tag" a locally generated identifier for the file being created (which could be based, for example, on a monotonically increasing counter), with the possibility of reusing identifiers for objects not used anymore at a local scope by keeping, for example, a list of unused identifiers.
  • This mechanism allows a contents locations module component to request the creation of an underlying object associated to a requestable object representing a file system object without risk of underlying object name collisions with other contents locations module components.
  • the identifier of the underlying object could also be based on, or even contain, the unique requestable object identifier (the "inum") or use it either directly or as a base to locally generate unique file names in an underlying file system.
  • the second step consists in requesting the underlying interface module to create, if needed, the corresponding underlying object in the underlying repository, as indicated by the contents locations module (in particular, using the unique underlying object identifier generated as explained in previous paragraphs). Whether a file or any other thing is actually "created" will depend on the particular underlying repository behaviour and/or policies implemented by the contents location module (for example, creation of the underlying object could be delayed until first use, or a pre-created underlying object from a pool of previously created underlying objects could be used). As mentioned for previous embodiments, the interaction between the contents locations module and the underlying interface module could use the logic module as an intermediate step.
  • the contents location module will have a valid reference to an underlying object and will be able to associate it to the corresponding requestable object as an attribute (e.g. by keeping the path of the underlying file in the underlying file system as a requestable object attribute).
  • the next step consists of updating the requestable object related to the parent directory of the file system object being created (a reference to such requestable object - probably in the form of a "requestable object contents location" - should have been obtained in an initial step from the multi-namespaces module, when the entry corresponding to the parent directory path has been checked against a given namespace).
  • a request to the multi-namespaces module (possibly via the logic module) can be sent in order to create the corresponding entry in the desired namespace, and associate it to a "requestable object contents location" referring to the requestable object for the newly created file system object.
  • if the request to the multi-namespaces module fails (for example, because an entry in the desired namespace already existed with the same name), a notification is sent back to the contents location module, so that the created underlying file (if any) can be removed and the corresponding requestable object can also be eliminated.
  • the discarded identifiers e.g. the "inum" for the new requestable object
  • the identifier can be stored locally by the corresponding contents location module component to be used later (as the corresponding requestable object was not associated to any entry and it was eliminated, the identifier was not "officially assigned", and therefore a convenient rule about not re-using assigned identifiers is maintained).
  • the procedure just described can be used not only for files, but also for other types of file system objects (in particular, those that can be created with a mknod system call in a POSIX-based file system), such as named pipes, Unix domain sockets, or character and block devices.
  • Directories and symbolic links, on the contrary, do not require the existence of corresponding underlying objects (all necessary information can be kept in requestable object attributes not related to any underlying repository) and, thus, the procedure can be simplified for these types of objects.
  • file systems may have more elaborate semantics.
  • some of them may allow a conditional behaviour: preparing a file for access ("opening" the file) if it exists, and creating it if it does not exist and then preparing it for access.
  • Such combined semantics may produce a race situation in a distributed system.
  • the initial procedure is similar to the simple "create-only" method: each component generates a new requestable object with its corresponding underlying object (with any required interaction with the underlying repositories via the underlying interface module and, possibly, the logic module) and, then, tries to register it in a namespace via the multi-namespaces module.
  • At least one of them should succeed, but the other (or others, if more than two contents locations module components were trying to create files), instead of just getting the error, will get the "requestable object contents location" corresponding to the file that was successfully registered; then, they will discard their own requestable object and proceed to prepare the access to the registered one (which involves interaction with the logic module in order to make the necessary checks on the registered file system object type and permissions - it could happen that different types of file system objects were intended to be created with the same entry name: for instance, a file and a directory).
  • An alternative implementation just discards the information and re-tries to create its own file again (theoretically, in a highly volatile environment, the described failure situation could be repeated indefinitely, leading to starvation; to avoid that, a retry counter could be used to return an error after a certain number of retries).
  • a request to access certain file system object attributes may require accessing the underlying file related to the requestable object representing the target file.
  • file system object attributes e.g. "utime" to modify a file's access and modification times, or "stat" to retrieve a file's size
  • the reason is that these values are quite related to actual data manipulation, and the underlying file systems usually handle them well; duplicating them in the requestable object attributes and keeping them synchronized with the underlying file attributes would probably be an unnecessary cost (though it could be done, if necessary).
  • the virtual shell modules and their components may maintain a cache mechanism by themselves, or take advantage of mechanisms already implemented in the base technology (e.g. FUSE, when used as one of the base technologies, allows the multi-namespaces module to specify the expiration times for cached directory entries in the operating system kernel, and also allows the contents location module components to specify expiration times for some of the file system object attributes cached in the operating system kernel).
  • some embodiments implement a specific cache mechanism to keep requestable object attributes and entry related data in "local" components of the corresponding modules to reduce interactions with "remote" components.
  • the cached data is provided with a limited-time lease.
  • the provider module may issue specific invalidation requests whenever the affected pieces of data are going to be modified from a different component.
  • lease handling is decoupled from data requests (i.e. from requests asking for specific entry information to the multi-namespaces module or asking for requestable object related information to the contents locations module, for example); in other words, the lease is not sent back to the caching component together with the response to a request: instead, the response is sent as fast as possible from the component having the desired data, and it triggers a decoupled concurrent mechanism that will end up in the corresponding lease being sent to the requesting component afterwards. Though this may not seem intuitive, it may result in better response times, while the cache efficacy is not affected by the delay in a significant way.
  • the number of leases granted for a specific piece of data can be limited.
  • This limitation may help to keep the synchronization and invalidation costs bounded. Beyond this limit, the system may work as a no-cache system. The possible performance penalties could be compensated by the reduction of synchronization costs.
  • Most file systems store information related to the file system objects and the namespaces using specific structures (for example i-nodes and directories) which are usually grouped (e.g. the i-nodes are usually packed together in certain sections of disks, and directories are potentially huge collections of entries with the same "parent") and stored in some storage media managed by the file system.
  • some embodiments of the virtual shell organize the data required by the different virtual shell modules as records in large tables indexed by a key value (or, at most, the combination of a few values) without any explicit grouping. Therefore, any hash-like structure (a set of entries accessible via a key) would be adequate for storing the necessary pieces of data, such as entry information for the multi-namespaces module, requestable object attributes for the contents locations module or virtual objects and their virtual attributes for a virtual objects module (of course, other structures such as lists, ordered lists, tables, etc. would also be possible implementations, though probably less efficient, either in access time or required space).
  • databases fulfil the functionality of hashes, with some additional features: multiple keys, or atomic transactions, for example.
  • advanced database engines may also provide fault tolerance mechanisms or distribution support.
  • some embodiments of the virtual shell can use one or more databases as backend for the data required by the different virtual shell modules.
  • any database could be used (Berkeley DB, MySQL, Oracle, PostgreSQL, etc., just to name a few of them - of course, any missing feature could be supplied by the code of the module using it).
  • some embodiments of the virtual shell can use Mnesia, a database that is part of the Erlang/OTP suite.
  • Said database is optimized for simple queries in soft real time distributed environments, and has built-in support for transactions, fault tolerance mechanisms and data distribution.
  • An interesting property is that said database is able to keep and manipulate its tables in memory (for efficiency) while sending the information in the background to a persistent medium (for safety).
  • although the embodiments of the invention described with reference to the drawings comprise computer apparatus and processes performed in computer apparatus, the invention also extends to computer programs, particularly computer programs on or in a carrier, adapted for putting the invention into practice.
  • the program may be in the form of source code, object code, a code intermediate source and object code such as in partially compiled form, or in any other form suitable for use in the implementation of the processes according to the invention.
  • the carrier may be any entity or device capable of carrying the program.
  • the carrier may comprise a storage medium, such as a ROM, for example a CD ROM or a semiconductor ROM, or a magnetic recording medium, for example a floppy disc or hard disk.
  • the carrier may be a transmissible carrier such as an electrical or optical signal, which may be conveyed via electrical or optical cable or by radio or other means.
  • the carrier may be constituted by such cable or other device or means.
  • the carrier may be an integrated circuit in which the program is embedded, the integrated circuit being adapted for performing, or for use in the performance of, the relevant processes.
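
The explicit map between internal inums and local i-node numbers described in the list above (a list of locally unused i-node numbers plus a hash table in both directions) could be sketched as follows. This is only a minimal in-memory illustration under stated assumptions: the class name InodeMapper, its method names, the 32-bit default limit and the starting number 2 are hypothetical and do not come from the description.

```python
from typing import Dict, List


class InodeMapper:
    """Hypothetical explicit map between internal inums and local i-node numbers."""

    def __init__(self, first_inode: int = 2, max_inode: int = 2**32 - 1) -> None:
        self._next_fresh = first_inode          # lowest i-node number never handed out
        self._max_inode = max_inode             # assumed size limit of local i-node numbers
        self._free: List[int] = []              # released i-node numbers, available for reuse
        self._inum_to_inode: Dict[int, int] = {}
        self._inode_to_inum: Dict[int, int] = {}

    def inode_for(self, inum: int) -> int:
        """Return the i-node number mapped to inum, assigning an unused one if needed."""
        if inum in self._inum_to_inode:
            return self._inum_to_inode[inum]
        if self._free:
            inode = self._free.pop()
        elif self._next_fresh <= self._max_inode:
            inode = self._next_fresh
            self._next_fresh += 1
        else:
            raise OverflowError("no unused local i-node numbers left")
        self._inum_to_inode[inum] = inode
        self._inode_to_inum[inode] = inum
        return inode

    def inum_for(self, inode: int) -> int:
        """Translate an i-node number received from the kernel back to the inum."""
        return self._inode_to_inum[inode]

    def forget(self, inode: int) -> None:
        """Release a mapping once the user side no longer references the i-node number."""
        inum = self._inode_to_inum.pop(inode)
        del self._inum_to_inode[inum]
        self._free.append(inode)
```

In this sketch the inum may be arbitrarily large (Python integers are unbounded), while only the objects in the active working set occupy one of the limited-size i-node numbers.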

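The identifier service mentioned in the list above (a global server handing out monotonically increasing, never reused identifiers, with components optionally requesting a whole range and consuming it locally) could be sketched as follows. The names IdentifierServer, ComponentAllocator, allocate_range and new_inum, as well as the batch size, are hypothetical; a real embodiment would place the server behind some remote interface rather than in the same process.

```python
import threading
from typing import Tuple


class IdentifierServer:
    """Hypothetical global server: monotonically increasing, never reuses identifiers."""

    def __init__(self, start: int = 1) -> None:
        self._next = start
        self._lock = threading.Lock()

    def allocate_range(self, count: int) -> Tuple[int, int]:
        """Reserve count consecutive identifiers and return (first, last), inclusive."""
        if count < 1:
            raise ValueError("count must be positive")
        with self._lock:
            first = self._next
            self._next += count
        return first, first + count - 1


class ComponentAllocator:
    """Hypothetical per-component helper that consumes a locally held range."""

    def __init__(self, server: IdentifierServer, batch: int = 64) -> None:
        self._server = server
        self._batch = batch
        self._next = 0
        self._last = -1  # empty range: forces a request on first use

    def new_inum(self) -> int:
        """Return a fresh identifier, asking the server for a new range when exhausted."""
        if self._next > self._last:
            self._next, self._last = self._server.allocate_range(self._batch)
        inum = self._next
        self._next += 1
        return inum
```

Because ranges never overlap, two components can create requestable objects concurrently without contacting the server for every single request.
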
Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Stored Programmes (AREA)
  • Testing And Monitoring For Control Systems (AREA)

Abstract

Virtual shell for producing one or more responses to a user request related to one or more objects, the virtual shell comprising:
  • a context monitoring parameters module for producing data related to one or more context monitoring parameters;
  • a multi-namespaces module for producing data related to one or more arbitrarily structured namespaces;
  • an underlying interface module for producing data related to a set of underlying objects comprising data related to requestable objects related to the requested objects;
  • a logic module for producing data for an interception module from data from the context monitoring parameters module, the multi-namespaces module and the underlying interface module, said data from the multi-namespaces module depending on data from the context monitoring parameters module;
  • the interception module for intercepting the user request and for producing the responses to the user request from data from the logic module.

Description

VIRTUAL SHELL
The present invention relates to a virtual shell for producing one or more responses to a user request related to one or more objects.
The present invention also relates to a context monitoring parameters module, to a multi-namespaces module, to an underlying interface module, to an interception module, and to a logic module, each of said modules for being used in the virtual shell.
Furthermore, the present invention relates to a method for producing one or more responses to a user request related to one or more objects, and to a computer program product comprising program instructions for causing a computer to perform said method.
BACKGROUND ART
Storage systems are designed to keep data meant to be persistent (at least for a certain time). The time to access data in a storage system is usually divided into "seek time" (time to prepare the access - e.g. to move the read/write head to the right position in a rotational disk and wait for the desired data to reach the position below the head) and "transfer time" (time to actually send the data from the storage system to the system where it is to be used). Traditionally, storage systems try to improve performance by reducing the impact of the "seek time", and a simple way to do this is to access as much data as possible at the same position; as a consequence, data is usually grouped in large "blocks", which can be read or written at once.
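The effect of block size on the seek/transfer split can be illustrated with a small calculation. The sketch below assumes a simplified worst-case model in which every block access pays one full seek; the function names and the figures used in the usage note (10 ms seek, 100 MB/s transfer rate) are illustrative assumptions, not values from the text.

```python
def access_time_ms(block_bytes: int, seek_ms: float, bandwidth_bytes_per_s: float) -> float:
    """Time to fetch one block: one seek plus the transfer of the block."""
    return seek_ms + 1000.0 * block_bytes / bandwidth_bytes_per_s


def effective_throughput_mb_s(total_bytes: int, block_bytes: int,
                              seek_ms: float, bandwidth_bytes_per_s: float) -> float:
    """Throughput when total_bytes are read block by block, one seek per block."""
    blocks = -(-total_bytes // block_bytes)  # ceiling division
    total_ms = blocks * seek_ms + 1000.0 * total_bytes / bandwidth_bytes_per_s
    return (total_bytes / 1e6) / (total_ms / 1000.0)
```

Under these assumptions, reading 100 MB in 1 MB blocks is dominated by transfer time, whereas reading the same data in 4 KB blocks is dominated by the seeks, which is why grouping data in large blocks pays off.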
The "pure data" required by the user are not the only possible contents of a storage system. Additional information may be required by the storage system, for example, to provide higher-level abstractions (such as files or directories) and link them to the "pure data", to provide a namespace to organize and provide means to locate a specific piece of data, or to attach "attributes" to pieces of data (e.g. the type of said data or who is the owner). This additional information is called "metadata". As well as "pure data", the additional "metadata" is usually expected to be persistent; therefore, it also needs to be stored in a storage system (either in the same media where the "pure data" resides, or somewhere else), and the considerations about the benefits of large "blocks" also apply to them.
Moreover, due to the fact that "metadata" is usually related to the characteristics of the storage system where the "pure data" is stored, "metadata" for different storage systems may probably be different and incompatible.
Note that file systems can be considered a particular type of storage system offering abstractions such as "files" or "directories" on top of "pure data" (the files' contents) and "metadata" (the namespace and file's attributes, among other possible things). Both file systems and other types of storage systems (e.g. object storage devices) share many characteristics, so both expressions will be used as equivalent, unless explicitly stated.
Traditionally, storage systems (and, in particular, file systems) have evolved to group together the relatively small pieces of "metadata", so that they could be stored and retrieved in a single shot (for example by accessing consecutive "blocks" in a disk), and then kept in memory to avoid additional transfers when accessing the actual "pure data" of the files. Obviously, it may be interesting to group metadata about files that are going to be used together, or in a closely related way. To this end, some file systems, for example, tend to interpret directories as a "hint" to indicate "locality" or "relationship". Another trend consists in integrating data and metadata with tight links, possibly storing them together (as the "metadata" of a file, for example, will probably be needed to adequately access the "pure data" of that file). Examples of these trends can be easily found. For instance, the old FAT-based file system grouped inside a directory entry the name of the file or directory, its attributes and its starting physical location. Directories were a special type of file with a collection of entries corresponding to their children in the hierarchy, so that all the "metadata" of the directory contents was retrieved together.
In a different example, file attributes and physical location information in most Unix-based file systems are grouped in a single data structure called "i-node", while the namespace is specified via directory contents (a special file containing file and subdirectory names, and their corresponding "i-node" identifier). Though the exact disk layout depends on each particular implementation, it can be organized so that directory listings and attributes of their files are physically close in the storage system, interleaving both "pure data" and "metadata". Some implementations are even able to store the "pure data" of small files inside the very same "i-node" structure.
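As a rough illustration of that layout, the sketch below models an "i-node" as a record holding attributes and physical location, and a directory as a mapping from names to "i-node" identifiers. It is a simplified model under stated assumptions: the field names and the resolve helper are hypothetical, and no real on-disk format is implied.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Inode:
    """Simplified i-node: attributes and physical location in one structure."""
    number: int
    is_dir: bool
    size: int = 0
    owner: str = "root"
    blocks: List[int] = field(default_factory=list)        # physical location of the data
    entries: Dict[str, int] = field(default_factory=dict)  # directory contents: name -> i-node number


def resolve(table: Dict[int, Inode], root: int, path: str) -> Inode:
    """Walk a slash-separated path through directory contents, starting at the root i-node."""
    node = table[root]
    for name in filter(None, path.split("/")):
        if not node.is_dir:
            raise NotADirectoryError(name)
        node = table[node.entries[name]]
    return node
```

Note that the name of an object lives only in its parent directory, so two entries may point to the same "i-node" number, while the attributes and location are stored once.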
Finally, the MS Windows NTFS file system groups most file attributes in a large Master File Table (MFT). The MFT entries also contain the name of the file, and in case of small files (or directories) may also contain the actual "pure data" of the file (or the list of file or subdirectory entries, for directories). As a result, the amalgamation of "pure data" and "metadata" may be even higher than in the case of some Unix-based file systems.
The advance in computing systems has brought new ways to use and access the stored data (e.g. distributed systems, secure data sharing, etc.), which push the architecture of file storage systems to the limits, making them inadequate to handle the new needs. There are two key aspects that conventional systems make difficult to handle: flexibility and integration.
Due to the interlinked nature of "metadata" (in particular metadata related to the namespace) and "pure data" in storage systems, the hierarchical file system structure becomes rigid in the sense that, by default, a specific file or directory is only available in a unique precise location of such directory tree. Moreover, the file organization is fixed for the whole system: it is possible to protect or restrict access to certain areas, but it is not possible to offer radically distinct fully functional views to different users or applications, for example. Note that there are mechanisms to offer different "virtual" views of a file system, usually based on file attributes - nevertheless, their functionality is commonly limited to "searching" for items or organizing them according to certain criteria, and they have to rely on the "official" underlying tree for manipulation of such items; moreover, the possible organizations are usually limited and not arbitrary, for example, by being automatically generated from attributes of the displayed objects.
Another situation that commonly arises is the fact that a computer system may use several storage systems at the same time (e.g. several internal disks with different file systems, an external USB disk, or remote file systems such as SMB/CIFS or NFS). Unfortunately, the architecture and internal behaviour of file systems, as well as the differences in their respective metadata, prevent their integration at low level. As a consequence, users are forced to deal with separate file and directory hierarchies for each file system and handle them in different ways (as different file systems usually have different functionalities and/or semantics that applications must be aware of - e.g. an application working flawlessly in NTFS may refuse to work on SMB/CIFS, or may fail on an NFS-based system).
Some attempts to address this issue consist of "hiding" the multiple file systems behind a common system (normally offering a SMB/CIFS or NFS interface, since the functionality of such systems is somewhat limited and easy to implement from other systems). Still, some issues are not solved by this approach: the applications cannot use the full semantics of the system behind (e.g. NTFS behind SMB/CIFS behaves as SMB/CIFS to the application - so the application must be adapted to use SMB/CIFS, and not its preferred file system), and there is a single "official" namespace, which usually results from the combination of the "official" namespaces of each system behind, but without merging them (e.g. all files in a given directory are usually stored in the same storage system). As mentioned before, some systems may offer "virtual" views on top of the "official" combined view, but with the limitations mentioned above (e.g. restricted organization possibilities, or having to refer to the "official" path for object manipulation). For example, the US patent application US2008/0256354A1, which discloses systems and methods for managing digital assets in a distributed computing, states (paragraph 112) that "client software provides functionality to display the digital assets according to a user defined organizational semantic" that "used in combination with virtual folders, the users can create on-the-fly organizations of the digital assets". Thus, it follows that the systems/methods of said US application have the limitation of providing very rigid or restricted organizations of digital assets for display purposes only. These systems/methods are just capable of displaying the digital assets differently from the underlying system, and only according to a pre-determined organization (based on asset categories).
PRELIMINARY DEFINITIONS
In order to avoid confusion and facilitate understanding of descriptions related to the present invention, this section provides numerous and detailed definitions of key concepts in the context of the present invention.
In the field of the invention, the word "attribute" usually refers to pieces of data intrinsically related to objects. It is important to remark the fact that the values categorized as "attributes" depend on the object itself, and not on external factors, the environment, history, or other parts of the system. For instance, when considering file system objects (such as files, directories, or symbolic links, for example) the following are examples of "attributes": the file size, the owner of the file, the creation or modification times, or specific access permissions associated to each defined security domain. Some examples of things that cannot be considered "attributes" in the current sense are: the user accessing an object or his/her role, the time of the system, the properties of the back-end storage where the object resides (e.g. the disk size, or the maximum file size allowed in a particular file system), or the path in a hierarchical tree followed to reach an object for a particular access (as some file systems allow different paths to point to a single object, and the specific path followed for a particular access depends on factors external to the object).
The "attributes" of an object may contain information that the system (or, in particular, the file system) knows how to interpret. For example, a file system may know that an object representing a file with a size value of zero represents an empty file (i.e. a file with no data) and may use that information for its own advantage (e.g. to apply certain optimizations); similarly, a system may also be able to interpret an access control list associated to a particular object, and be able to prevent unauthorized access based on said access control list.
On the other hand, the "attributes" may also contain data that do not necessarily have a meaning for the system; in other words, they may consist of arbitrary pieces of data assigned by an upper layer, an application, or a user. Some literature refers to this particular type of "attributes" as "extended attributes", and they are also intrinsically attached to a particular object. A typical example of such arbitrary "attributes" could be "tags" that some systems assign to objects in order to categorize them. For example, they could be used to tag an audio file as containing rock music, and/or belonging to a certain album by a certain author; then, an application could use such information to perform specific actions, such as displaying the name of the file in a particular fashion. Of course, the fact that a particular file contains rock music does not have to have any meaning for the underlying file system. Other examples of use could be a document management system using such "attributes" to associate document properties to a document file (e.g. an expiration date, the author, or who modified the file and when).
Arbitrary "attributes" can also be used as a means by middleware or system software to provide support for functionalities not available in an underlying file system. For example, if an underlying file system does not support long names, a middleware could save a long name in an attribute and display it when required, while the underlying file system keeps using short names for its own management.
Finally, object "attributes" can also consist of references to other objects (either direct or indirect), or data that can be used to generate such references. For example, a directory object in a file system may have an attribute indicating which its parent directory is, or a symbolic link may contain a path that can be converted to a reference to a different file system object.
In summary, "attributes" are those pieces of data considered properties intrinsically attached to the object itself and not depending on the environment, properties or behaviour of elements other than the object. Unless explicitly said otherwise, no distinction will be made between "attributes", "extended attributes" and "tags", and all of them will simply be named "attributes".
As will be explained in later descriptions, a feature of the present invention is the ability of some "objects" to "refer to" other "objects". In several embodiments, it will be considered that the "referred objects" represent the "contents" of the "referring object" (for example, in a file system environment, it could be considered that a file, as an abstraction, is represented by a first object, while the actual data contained in the file is represented by a second object, referred to by the first object - so the second object represents the "contents" of the first object). Such "references" to "objects" are, in fact, pieces of data that allow identifying (either directly or indirectly), and eventually reaching, the "referred object" and can be considered as pieces of data intrinsically related to the "referring object"; they can be considered, therefore, "attributes" of the "referring object". Despite their being "attributes", the expression "contents location" will be used to categorize entities specifically related to such pieces of data. In particular, a piece of data leading to a "virtual object" (which will be defined below) is called a "virtual contents location", and a piece of data leading to a "requestable object" (also defined below) is called a "requestable object contents location". An entity able to contain one or more of such pieces of data is called a "contents location container". Therefore, when it is said that a certain type of object (e.g. a "virtual object") comprises a "contents location container", this is equivalent to saying that said object has an "attribute" whose values are one or more pieces of data of type "virtual contents location" or "requestable object contents location".
(Note that having a "contents location container" able to contain multiple "contents locations" of either type, is equivalent to having multiple "contents location containers" with single values in them; the first form will be mainly used, but it must be noted that both forms are interchangeable.)
It is important to note that a "contents location" piece of data does not need to be a direct reference to an object: it may be an indirect reference (requiring several "hops" to reach the "target" object) or may even require, at some step, some processing to be converted to an entity allowing the "target" object to be reached (e.g. an opaque piece of data that has to be decoded by a specific module to determine to which object it refers).
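The resolution of direct, indirect and opaque "contents locations" described above can be sketched as follows. This is a minimal illustration only: the class names, the `resolve` function and the use of a plain dictionary as the decoding module are assumptions made for the example, not elements of the described system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class RequestableObject:
    name: str
    attributes: Dict[str, object] = field(default_factory=dict)


@dataclass
class VirtualObject:
    name: str
    # The "contents location container": an attribute whose values are
    # one or more contents locations (hypothetical representation).
    contents_locations: List["Location"] = field(default_factory=list)


@dataclass
class Direct:
    """A contents location that points straight at an object."""
    target: Union[RequestableObject, VirtualObject]


@dataclass
class Opaque:
    """A contents location that must be decoded before it can be followed."""
    token: str


Location = Union[Direct, Opaque]


def resolve(location, decoder, max_hops=16):
    """Follow a contents location until a requestable object is reached.

    `decoder` stands in for the module able to decode opaque data; here it
    is simply a dict mapping tokens to objects (an assumption).
    """
    for _ in range(max_hops):
        if isinstance(location, Opaque):
            location = Direct(decoder[location.token])
        target = location.target
        if isinstance(target, RequestableObject):
            return target
        # Indirect reference: hop through the first location of the
        # intermediate virtual object.
        location = target.contents_locations[0]
    raise RuntimeError("too many hops while resolving a contents location")
```

In this sketch an opaque location that decodes to a virtual object, which in turn holds a direct location, is resolved in two hops.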
In the context of the present invention, a "user" makes requests related to one or more "objects", and responses to such requests are produced by a "virtual shell". By "user" it is meant the user, application, library, daemon or any other kind of system or mechanism able to make requests related to "objects" through any means. In a general sense, an "object" may be any entity that can be operated with (e.g. queried, updated, modified, created, removed, interacted, etc.); for the sake of clarity, different types of objects will be defined and differently named to avoid confusions.
The expression "underlying objects" is used to refer to "objects" that exist outside the "virtual shell". As any other object, such "underlying objects" may have "attributes" and, for convenience, it may be assumed that such "underlying objects" are comprised in one or more "underlying object repositories" (or simply "underlying repositories"). For example, in the context of a conventional file system, the "underlying objects" could be files, directories and/or any other file system objects, and the file system itself could be considered an "underlying repository". Note that the "underlying repository" may have properties which do not depend on any particular "underlying object" contained in it, but may be considered related to them (e.g. in the case of a file system, the total amount of free space, or the maximum size allowed for a file, could be considered examples of such properties associated to the "underlying repository" itself).
In the context of the invention, the "user" cannot issue a request directly referring to an "underlying object". In order to be "accessible" by the "user", an "underlying object" must have one or more corresponding "requestable objects". In contrast to "underlying objects", "requestable objects" exist inside the "virtual shell", and may refer to one or more "underlying objects". There are multiple possible relations between "requestable objects" and "underlying objects". The simplest one would be a "requestable object" being related to a single "underlying object" (e.g. a file in an underlying file system made available as a single "requestable file" through the virtual shell). Nevertheless, other combinations are possible: a single "requestable object" may refer to multiple "underlying objects" (e.g. a "requestable file" being striped into two or more fragments stored in different devices, or a "requestable file" referring to different "underlying files" containing different historical versions of the "requestable file" contents); multiple "requestable objects" may refer to a single "underlying object" (e.g. two different "requestable files" being stored in different ranges of a single "underlying file", such as in a "tar" or "zip" package file); and, of course, any combination of the previous. Additionally, a "requestable object" may not refer to any "underlying object" at all; that would be the case where the "requestable object" is generated and handled internally by the "virtual shell" itself (e.g. a file whose contents can be dynamically generated as needed, such as /dev/null or /dev/zero in Unix environments). Finally, there may also be "underlying objects" with no corresponding "requestable objects" inside the "virtual shell"; such objects cannot be explicitly manipulated by "user" requests, but may be used internally by the "virtual shell" to keep any kind of data (e.g. the "virtual shell" can use one or more "underlying files" to keep its state information and be able to recover in case of failure, and such "underlying files" may not be accessible, or even visible, by the "user", so that no corresponding "requestable object" is related to them).
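The relations between "requestable objects" and "underlying objects" listed above can be illustrated with a short sketch. All class names are hypothetical, underlying objects are modelled as in-memory byte strings, and striping is simplified to a concatenation of consecutive fragments; none of this is prescribed by the description.

```python
class UnderlyingFile:
    """Stand-in for an object living outside the virtual shell."""
    def __init__(self, data: bytes):
        self.data = data


class FragmentedRequestable:
    """One requestable object referring to several underlying objects."""
    def __init__(self, fragments):
        self.fragments = fragments

    def read(self, offset, size):
        # Simplification: fragments are consecutive pieces of the contents.
        whole = b"".join(f.data for f in self.fragments)
        return whole[offset:offset + size]


class RangeRequestable:
    """A requestable object stored in a byte range of a shared underlying
    object, as with members of a package file."""
    def __init__(self, underlying, start, length):
        self.underlying = underlying
        self.start = start
        self.length = length

    def read(self, offset, size):
        end = min(offset + size, self.length)
        return self.underlying.data[self.start + offset:self.start + end]


class ZeroRequestable:
    """A requestable object with no underlying object: contents are
    generated on demand, in the spirit of /dev/zero."""
    def read(self, offset, size):
        return bytes(size)
```

Two `RangeRequestable` instances built on the same `UnderlyingFile` correspond to the "multiple requestable objects, single underlying object" case, while `ZeroRequestable` corresponds to the case with no underlying object at all.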
Regarding "attributes", in the context of this invention, it is considered that, when an "underlying object" is related to at least one "requestable object", the "underlying object" attributes can be made accessible as equivalent "requestable object" attributes; in other words, when a "user" request involves querying or modifying a "requestable object" attribute, the corresponding "underlying object" attribute or attributes can also be queried or modified accordingly.
As an example, in a conventional file system, a "requestable object" could be considered the representation, inside the "virtual shell", of the data contents of a file (both in the case where such data is stored externally in an "underlying object" - e.g. a set of blocks in the disk - and in the case where it is generated on-the-fly by the system itself).
Finally, the invention deals with a third type of objects: the "virtual objects". The "virtual objects" exist inside the "virtual shell" and, as any object, can have a set of "attributes" (for clarity, attributes of "virtual objects" will often be referred to as "virtual attributes"). They serve two main purposes: providing structural support, and providing "virtual views" of "requestable objects".
As a structural support element, a "virtual object" does not have to refer to any particular "requestable object": they may be purely virtual constructs that can be used, for example, as containers or as references to other elements (for example, in a file system context, a "virtual object" may act as a "virtual directory", containing a list of entries, and not corresponding to any particular actual directory in an underlying file system; also, as a different example, a "virtual object" may have an attribute referring to a particular entry in a namespace, acting as a "symbolic link" or "shortcut").
When acting as "virtualization" elements, "virtual objects" usually refer to one or more "requestable objects", having a set of virtual attributes independent from the "attributes" of the referred "requestable objects". Then, the "virtual objects" may either offer potentially different "virtual views" of the same "requestable object", or combine multiple "requestable objects" into a unified "view".
As an example of the virtualization functionality (offering different "virtual views" of the same "requestable object"), different "virtual objects" may contain different values for a set of categorization attributes, but refer to the same "requestable object"; then, a categorization tool would classify the same "requestable object" in different ways depending on the "virtual object" being accessed (which, in turn, could depend on the user, his/her role, or the application being used). In a related example, a "virtual object" can transparently alter the accesses to a referred "requestable object" (e.g. accessing a first "virtual object" may cause the encryption of the object contents, while accessing a second "virtual object" referring to the same "requestable object" may access the contents in clear - and the selection of the "virtual object" to use could depend, for example, on the user/application making the access).
As another example of the virtualization functionality, a "virtual object" could refer to several "requestable objects" and divert the requested access to one of them depending on environment conditions (for example, a "virtual object" may represent a file containing a photograph; when accessing from a fully fledged computer, the "virtual object" may direct the access to a "requestable object" containing a high resolution version, while when accessing from low capacity hardware - or through a slow connection - the "virtual object" may divert the access to a different "requestable object" containing a scaled down version).
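The photograph example can be sketched as a "virtual object" that selects one of two "requestable objects" according to environment conditions. The field names, the string identifiers used for the requestable objects and the bandwidth threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    bandwidth_kbps: int
    low_capacity_device: bool


@dataclass
class VirtualPhoto:
    high_res: str  # hypothetical identifier of a requestable object
    low_res: str   # hypothetical identifier of a requestable object
    min_bandwidth_kbps: int = 1000  # arbitrary threshold for the sketch

    def select(self, env: Environment) -> str:
        """Return the requestable object to which the access is diverted."""
        slow = env.bandwidth_kbps < self.min_bandwidth_kbps
        if env.low_capacity_device or slow:
            return self.low_res
        return self.high_res
```

The same request on the same "virtual object" therefore reaches different "requestable objects" depending only on values that are not "attributes" of any object.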
When the "virtual shell" has support for "virtual objects", requests from the "user" are related to such "virtual objects"; when "virtual objects" are not supported, then the requests are related to "requestable objects". In the context of the present invention, requests are never directly related to "underlying objects".
All types of "objects" are "organized" in some way, so that they can be located and/or referred to. The term "namespace" refers to the way objects are named and organized. Thus, the terms "namespace" management or handling include the set of data required to maintain such organization, and the operations that can be done to create, destroy, modify, or query such organization.
Examples of "namespaces" are the hierarchical tree-like organizations used by many popular file systems (such as NTFS, or ext4, for example). Another example could be the hierarchy, usually generated in an automatic way by some applications, that organizes, for example, music files by author and album name.
The term "entry" is used to refer to an element in a "namespace". In general, an "entry" may either act as a container for other "entries", or be final. Independently of being final or not, an "entry" may also represent an actual object, or may be simply an artefact to facilitate the organization of the actual objects.
For example, in the ext4 file system, a directory is an actual type of object (with its corresponding data structures stored in the disk and its own internal identifier), which is able to act as a "container" of other file system objects (either other directories, files, or other types of objects), and which has an "entry" representing it in the file system "namespace".
On the contrary, in a system that organizes existing music files by author and album name, an "entry" representing a particular album and acting as a container for the files with the album tracks does not have to correspond to an actual object (e.g. a directory or a file in the underlying system).
Some "namespaces" may be generated automatically from a set of objects and some generating rules, and might not be directly operable. In contrast, other "namespaces" can be manipulated directly. Typical operations on a "namespace" may include creating or removing "entries" in a given "container entry", hiding or showing specific "entries", renaming "entries", moving or duplicating "entries" from one container into another, or linking an "entry" to either an object or another "entry". Manipulating the "namespace" may or may not have an effect on the actual objects being represented, depending on the system offering the "namespace". Also, a particular system providing one or more "namespaces" to organize its objects may restrict the available operations, either by reducing the set of operations that can be performed, or by imposing restrictions on the conditions for a certain operation to be valid or authorized.
In the context of the present invention, the term "virtual namespace" (or simply "namespace") will be used to refer to each of the namespaces handled by the "multi-namespaces module". Such "namespaces" will comprise "entries" able to refer to either "virtual objects" or "requestable objects" (depending on the particular embodiments) and to organize them. The term "underlying namespace" will be explicitly used to refer to the organization of "underlying objects" in a particular "underlying repository".
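A minimal sketch of several "namespaces" whose "entries" refer to the same object is given below. The `Namespace` class, its method names and the representation of container entries as nested dictionaries are assumptions for the example; the description does not prescribe any particular data structure.

```python
class StoredObject:
    """Stand-in for a virtual or requestable object."""
    def __init__(self, name):
        self.name = name


class Namespace:
    """A hierarchical namespace; container entries are plain dicts and do
    not correspond to any actual object."""
    def __init__(self, name):
        self.name = name
        self.root = {}

    def link(self, path, obj):
        """Create a final entry at `path` referring to `obj`."""
        *containers, leaf = path.strip("/").split("/")
        node = self.root
        for part in containers:
            node = node.setdefault(part, {})
        node[leaf] = obj

    def lookup(self, path):
        """Return whatever the entry at `path` refers to."""
        node = self.root
        for part in path.strip("/").split("/"):
            node = node[part]
        return node

    def unlink(self, path):
        """Remove the entry only; the referred object is left untouched."""
        *containers, leaf = path.strip("/").split("/")
        node = self.root
        for part in containers:
            node = node[part]
        return node.pop(leaf)
```

Because entries only refer to objects, removing an entry from one namespace leaves the object reachable through the entries of any other namespace.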
Note that "requests" made by the "user" include accesses to the "objects" (either "virtual objects" or "requestable objects") but also requests about the "virtual shell" itself and the environment (e.g. how many "objects" are handled by the "virtual shell", or how much storage capacity the "virtual shell" can handle through any attached "underlying repositories"). By "accesses" to "objects" it is meant any type of "access" including manipulation of "contents" (pure data associated to the "object"), such as creating contents, removing totally or in part the contents, replacing totally or in part the contents, adding contents, reading totally or in part the contents, querying information about the contents or any combination of the previous; manipulation of "attributes", such as creating attributes, removing attributes, replacing totally or in part attribute values, adding attribute values, reading values, querying information about the attributes, or any combination of the previous; and also including operations on the object as a whole, such as creating a new object, removing an object, duplicating an object, querying or setting attributes from or to the object (such as setting the owner, or querying the size of associated contents, for example), or associating or de-associating a particular object to or from an "entry" in a given "namespace". Finally, requests from the "user" may also involve manipulation of namespaces, including, for example, namespace creation or removal, selection of namespaces, or changes in the organization of a particular namespace.
As mentioned when discussing the "namespaces", a given object may be potentially associated to multiple "entries" in a "namespace", and even appear simultaneously in several "namespaces". The requests allowed when referring to an "object" via the different "entries" and the responses to such requests do not have to be necessarily the same, but the "object" must be left in a consistent state (i.e. further requests must be possible - unless the object has been destroyed - and the expected responses must be generated). In the context of the current invention, the "virtual shell" may be responsible for guaranteeing such consistency.
Additionally, several "requests" may be "processed" simultaneously, related to the same or to different "objects". In the context of the current invention, the "virtual shell" may also be responsible for coordinating simultaneous or concurrent requests and guaranteeing consistent responses and a consistent state of the "objects".
It is important to note that certain requests may make sense only when performed in certain sequences (for example, in a file system environment, a "read" or "write" request may be valid only when there has been a previous successful "open" request, and before a corresponding "close" request). In such situations, the "virtual shell" may also be responsible for tracking the history of requests and responses, and for keeping any necessary data to guarantee that any possible request dependences are satisfied and responses are correct.
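The open/read/close dependence mentioned above can be sketched with a small tracker of request history. The `RequestTracker` class and its integer handles are hypothetical; an actual "virtual shell" could keep this state in any form.

```python
class RequestTracker:
    """Keeps the minimal history needed to validate request sequences."""

    def __init__(self):
        self._open_handles = set()
        self._next_handle = 0

    def open(self):
        """Record a successful "open" and return a handle for later requests."""
        handle = self._next_handle
        self._next_handle += 1
        self._open_handles.add(handle)
        return handle

    def is_valid(self, request, handle):
        """A "read", "write" or "close" is valid only between open and close."""
        return request in ("read", "write", "close") and handle in self._open_handles

    def close(self, handle):
        if handle not in self._open_handles:
            raise ValueError("close without a matching open")
        self._open_handles.remove(handle)
```

A "read" issued after the corresponding "close" is rejected, even though nothing about the object itself has changed.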
The term "semantics" will be used to refer to the specification of "user" requests and responses to interact with the "virtual shell", the conditions under which such requests are valid (including the inter-request dependences mentioned above), the expected results of such requests, as well as the potential effects on the "objects" related to the requests, or any other "objects" referred by them or by the "virtual shell" itself.
In the field of storage systems, it may be understood that file systems also provide a specific "semantics". File systems offer a high level view of the storage to the user (and the applications), allowing the organization of data into files, which are usually placed in a hierarchical namespace composed of directories, subdirectories and other types of objects. Applications use the file system functionalities via the file system interface, which is composed of a set of operations (sequences of requests and responses) that allow the interactions with the file system. So, file systems also define the "semantics" of such operations, establishing if a particular request is valid and the possible effects it may have under specific circumstances. Some of such "semantics" are considered, officially or de facto, as "standard" (such as POSIX, or the NTFS behaviour in the Windows world); applications expect file systems to honour them, and failure to do so may result in the application either refusing to work or malfunctioning (unless the corresponding application program is tailored to recognize and deal with the non-standard specifics of the file system).
In the context of this invention, it will be said that a "virtual shell" is able to offer a "file system semantics" if it is able to intercept an interface of requests and responses and provide a behaviour which is compatible with the "semantics" of a conventional file system (e.g. POSIX or NTFS). In particular, this means that a conventional application should be able to work and be fully functional on top of the "virtual shell" as if it was a conventional file system, without needing to make any changes to the application.
Obviously, it can be possible for a "virtual shell" to offer multiple "semantics" simultaneously (e.g. it could offer a POSIX standard interface to POSIX applications, an NTFS semantics to a Windows-based application, and even a different semantics to a specially tailored application), maintaining a consistent behaviour for each particular "semantics".
It is important to remark that, for the majority of file systems, the "file system semantics" implies that the result of a particular interface request is determined not only by "attributes" intrinsically related to the object (file, directory...) being accessed, but also by more complex factors, such as the position of the object in a particular hierarchy, the path followed to reach it, environment factors such as time of access or the geographical location of the client trying to access it, or even the timing of a sequence of operations. For example, in POSIX, the authorization to remove an object does not depend on the permissions of the object itself, but on the permissions of the parent directory, according to the path followed to reach the object; on NTFS, certain directory operations such as renames can be forbidden if certain requests are being processed on descendant objects in the "namespace" hierarchy; and as a last example, the position where some new content is written into a file may also depend on other requests being concurrently carried out by other users or applications. In order to handle these conditions and use them effectively, a basic infrastructure to store "attributes" or tags associated to an object is not enough: additional means to capture and handle the relevant information are needed. In the context of the present invention, data and relevant factors not being intrinsically related to particular objects (i.e. not being "attributes") will be referred to as "monitoring parameters".
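The POSIX removal example can be made concrete with a simplified model. The function `may_remove` is hypothetical and deliberately ignores privileged users, the sticky bit and access control lists; it only shows that the decision takes the permissions of the parent directory as input, and no attribute of the object being removed.

```python
import stat


def may_remove(parent_mode, parent_uid, parent_gid, uid, gids):
    """Simplified POSIX-style check for removing an entry from a directory.

    Requires write and search permission on the parent directory for the
    permission class (owner, group, other) that applies to the caller.
    The mode of the object being removed is intentionally not a parameter.
    """
    if uid == parent_uid:
        write_bit, search_bit = stat.S_IWUSR, stat.S_IXUSR
    elif parent_gid in gids:
        write_bit, search_bit = stat.S_IWGRP, stat.S_IXGRP
    else:
        write_bit, search_bit = stat.S_IWOTH, stat.S_IXOTH
    return bool(parent_mode & write_bit) and bool(parent_mode & search_bit)
```

Since the same object may be reachable through different parent directories, the outcome depends on the path followed for the particular request, which is exactly the kind of information that is not an "attribute".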
So, the term "monitoring parameters" will be used to refer to any parameter, property or value that can be used to determine and/or specify a condition of the system or the virtual shell (e.g. functional or performance conditions), and that can be used to launch predetermined actions as response to said conditions, excluding the values that can be intrinsically related to particular objects (i.e. excluding the "attributes").
For instance, if the "user" issuing a request satisfies a determined role profile and the request is made within a certain time period, the "virtual shell" may determine the requests that can be executed according to said conditions which are comprised in the "monitoring parameters" (even regardless of the "attributes" of the object being accessed - e.g. forbidding all administrator accesses out of office hours). The virtual shell may also modify the result of such request, that is to say, the same request on the same virtual object may be allowed but deliver different responses depending on the "monitoring parameters" - for example, depending on the application issuing the request (e.g. providing encrypted data to a backup application, and clear data to an editor).
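The two situations just described (forbidding administrator accesses out of office hours, and delivering encrypted or clear data depending on the application) can be sketched as a decision function over "monitoring parameters". The field names, the role and application strings and the office hours are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class MonitoringParameters:
    role: str         # role of the "user" issuing the request
    application: str  # application issuing the request
    now: time         # time at which the request is made


def decide(params, office_start=time(9), office_end=time(18)):
    """Return "deny", "encrypted" or "clear" for a read request.

    No attribute of the accessed object is consulted: the decision
    depends only on monitoring parameters.
    """
    in_office_hours = office_start <= params.now < office_end
    if params.role == "administrator" and not in_office_hours:
        return "deny"
    if params.application == "backup":
        return "encrypted"
    return "clear"
```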
The "monitoring parameters" can take into account different combinations of variables such as, for example, the user making the request, the access pattern followed by previous requests, the optimum object size of the repository of the object being accessed, the time of the request, the available resources (e.g. hardware capacity, connection bandwidth, etc.), arbitrary information communicated to the system or the virtual shell by any means (e.g. a user or application sending a message activating a specific "operation mode" through some interface), and so on.
Another example of a "monitoring parameter" is the path in a hierarchical organization followed to reach a particular object for a particular access. Many systems allow a single object to appear in different positions of the organization (in other words, a single object may have multiple "paths"); even if the list of all possible paths might be considered an "attribute" of the object, the precise path used by a particular access does not depend on the object itself, but probably on the history of previous requests issued by the same user (which does not have anything to do with intrinsic properties of the object being accessed).
Other examples of "monitoring parameters" not being object "attributes" are, for instance, the connection bandwidth between the user making a request and the system fulfilling such request, the properties of any hardware used by the system (e.g. the maximum capacity of a particular disk, the amount of memory of the device used by the client - user or application - to use the system, or the CPU speed of a certain server), performance characteristics of any related software (e.g. the optimum directory size in a particular underlying file system), the user making the request and his/her role, the time when the request was issued, the history of previous requests made by a user, or to the system in general, the number of users, the geographical location of the user making the request or a particular "mode of operation" activated either automatically or explicitly through any mechanism.
In the context of the present invention, the "monitoring parameters" are managed by the virtual shell, but said parameters can be located in the virtual shell, or in some of the underlying file systems, or in any related devices, or in the environment, or provided by any other source by any means, or any combination of the previous. Thus, the virtual shell managing the "monitoring parameters" must be understood as the virtual shell creating and managing said created parameters, or the virtual shell capturing and managing said captured parameters from underlying file systems, related devices, the environment, other sources, or any combination of the previous.
For the sake of clarity, and considering that some of the embodiments of the invention are focused on providing a "virtual shell" offering "file system semantics", the "monitoring parameters" have been divided into two different sets: those which allow the "virtual shell" to obtain the necessary context information to offer a minimum "file system semantics" (which we call "context monitoring parameters") and the rest of monitoring parameters (the so-called "non-context monitoring parameters").
In particular, "context monitoring parameters" comprise: the identity of the "user" issuing the request (by "identity" referring to user identifiers, groups and/or any specific identity information used by a particular system), the path or paths in a particular namespace used to reach the object or objects related to a particular request, the history of previous requests related to a coherent sequence of requests (e.g. an open/read/close sequence), the capacity and/or availability of the underlying object repositories (if any), and the current time. The rest of "monitoring parameters" (i.e. not being "context monitoring parameters") will be referred to as "non-context monitoring parameters".
SUMMARY OF THE INVENTION
Taking into account the previously described limitations of the currently known technologies, there thus still exists a need for improved systems, methods and computer programs solving the drawbacks of said current technologies, which offer very restricted organization possibilities. The object of the present invention is to fulfil such a need.
Said object is achieved with a virtual shell according to claim 1, a context monitoring parameters module according to claim 22, a multi-namespaces module according to claim 23, an underlying interface module according to claim 24, an interception module according to claim 25, a logic module according to claim 26, a method according to claim 27, and a computer program product according to claim 31.
In a first aspect, the present invention provides a virtual shell for producing one or more responses to a user request related to one or more objects, the virtual shell comprising:
• a context monitoring parameters module for producing data related to one or more context monitoring parameters;
• a multi-namespaces module for producing data related to one or more namespaces, each of said namespaces comprising one or more entries, each of said entries referring to a set of requestable objects, and each of the requestable objects having a set of requestable object attributes and being referred from one or more entries of one or more of said namespaces, wherein the organization of each namespace is decoupled from the attributes of the requestable objects referred from the entries of the namespace, said organizations allowing arbitrarily structured namespaces;
• an underlying interface module for producing data related to a set of underlying objects comprising data related to requestable objects related to the requested objects;
• a logic module for interacting with the context monitoring parameters module, for interacting with the multi-namespaces module, for interacting with the underlying interface module, and for producing data for the interception module from a set of the data produced by the context monitoring parameters module, a set of the data produced by the multi-namespaces module and a set of the data produced by the underlying interface module, the interaction with the multi-namespaces module depending on a subset of the set of the data produced by the context monitoring parameters module;
• an interception module for intercepting the user request, for interacting with the logic module and for producing the responses to the user request from a set of the data produced by the logic module.
This virtual shell allows overcoming the above-mentioned limitations of current storage systems related to restricted organization possibilities in terms of, for example, low flexibility and difficulty of integration, thanks to the combination of the different virtual shell modules. In particular, by integrating a multi-namespaces module where entries can be indirectly linked to objects, the virtual shell allows one or more users to organize the same set of objects in different ways, and by combining it with a context monitoring parameters module and the logic module, each of such multiple organizations can be used according to a fully fledged semantics (for example, a file system semantics, which requires not only object data, but also environment information - monitoring parameters - to be implemented); so, the virtual shell can provide flexibility through multiple fully-functional views of a set of objects. Additionally, the underlying interface module (which allows accessing underlying storage systems), combined with the logic module and the context monitoring parameters module, allows leveraging the different functionalities of different storage systems (so that the resulting unified storage system is not limited by the least capable system, but can be complemented by the modules in the virtual shell to compensate for any missing functionalities). When finally combined with the multi-namespaces module, which decouples the namespaces from a given storage system, the virtual shell also provides easy means for integrating multiple underlying storage systems, in a completely transparent way (even allowing, for example, files in a single directory to be diverted to different underlying storage systems, according to any given criteria), and for providing multiple fully functional and flexible simultaneous organizations of data.
The consistency of the system is achieved through the function of the interception module which, by intercepting user requests and preventing direct access to the underlying systems, allows the virtual shell to have complete control of the involved objects and guarantee their consistency.
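The cooperation of the five modules of the first aspect can be sketched as follows. This is a toy arrangement under stated assumptions: requests are plain dictionaries, namespaces are flat path-to-identifier maps, the underlying repository is an in-memory dictionary, and the selection of a namespace per user is one possible way, chosen for the example, in which the interaction with the multi-namespaces module may depend on context monitoring parameters.

```python
import time


class ContextMonitoringParametersModule:
    """Produces data related to context monitoring parameters."""
    def context_for(self, request):
        return {"user": request["user"], "path": request["path"],
                "time": time.time()}


class MultiNamespacesModule:
    """Holds several namespaces; entries map a path to a requestable object id."""
    def __init__(self, namespaces):
        self.namespaces = namespaces

    def resolve(self, namespace, path):
        return self.namespaces[namespace][path]


class UnderlyingInterfaceModule:
    """Hides how underlying repositories are reached (here: a dict)."""
    def __init__(self, repository):
        self.repository = repository

    def read(self, object_id):
        return self.repository[object_id]


class LogicModule:
    """Combines the data produced by the other three modules."""
    def __init__(self, context, namespaces, underlying, namespace_of_user):
        self.context = context
        self.namespaces = namespaces
        self.underlying = underlying
        self.namespace_of_user = namespace_of_user  # hypothetical policy

    def process(self, request):
        ctx = self.context.context_for(request)
        # The namespace consulted depends on context data (the user identity).
        namespace = self.namespace_of_user[ctx["user"]]
        object_id = self.namespaces.resolve(namespace, ctx["path"])
        return self.underlying.read(object_id)


class InterceptionModule:
    """Single entry point: user requests never reach underlying objects directly."""
    def __init__(self, logic):
        self.logic = logic

    def handle(self, request):
        try:
            return {"ok": True, "data": self.logic.process(request)}
        except KeyError:
            return {"ok": False, "data": None}
```

With two namespaces that place the same object under different paths, two users reach identical contents through their own organization, and a path from one namespace is not visible to a user bound to the other.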
In a second aspect, the present invention provides a context monitoring parameters module for use in the virtual shell, the context monitoring parameters module comprising:
• computing means for interacting with the logic module of the virtual shell;
• computing means for producing data related to one or more context monitoring parameters.
This context monitoring parameters module allows handling effectively the environment conditions that allow providing a non-trivial semantics (for example, a file system semantics). Such a semantics cannot be implemented based only on "attributes" or tags associated to individual objects, and the context monitoring parameters module provides the additional means to capture, handle, and store (if needed) the additional necessary data. In particular, the context monitoring parameters module needs to deal, at least, with the identity of the user making the request, the paths specified to reach a particular object in a particular request, the history of previous requests associated to a particular sequence of requests, the capacity and/or availability of the underlying object repositories (if any), and the current time. From this information, the context monitoring parameters module can provide data to the logic module, so that it can interact with the other modules providing them with data derived from context monitoring parameters.
In a third aspect of the invention, there is provided a multi-namespaces module for use in the virtual shell, the multi-namespaces module comprising:
• computing means for interacting with the logic module of the virtual shell; • computing means for producing data related to one or more namespaces, each of said namespaces comprising one or more entries, each of said entries referring to a set of requestable objects, and each of the requestable objects having a set of requestable object attributes and being referred from one or more entries of one or more of said namespaces, wherein the organization of each namespace is decoupled from the attributes of the requestable objects referred from the entries of the namespace, said organizations allowing arbitrarily structured namespaces. This multi-namespaces module allows the simultaneous co-existence of multiple namespaces (or organizations of objects) and decoupling the entries representing the objects from the objects themselves, so that a particular object can be referred by multiple entries, possibly from different namespaces. Additionally, this module handles the multiple namespaces in such a way that each namespace is decoupled from a particular underlying repository; this allows, for example, to offer a file system directory hierarchy (a namespace) where the files in a particular directory actually reside in different underlying object repositories. The decoupling of namespaces and entries from the corresponding underlying objects and underlying repositories also allows a separated management that leads to ease of implementation and performance optimizations.
In a fourth aspect, the present invention also provides an underlying interface module for use in the virtual shell, the underlying interface module comprising:
• computing means for interacting with the logic module of the virtual shell;
• computing means for producing data related to a set of underlying objects comprising data related to requestable objects related to the requested objects.
The underlying interface module allows isolating the means to interact with the possibly multiple underlying repositories containing underlying objects. This allows providing the logic module with a simplified interface to deal with underlying objects, while the underlying interface module can take care of the different characteristics of the different underlying repositories, enabling the virtual shell to be extended to support new types of underlying repositories.
In a fifth aspect of the invention, there is provided an interception module for use in the virtual shell, the interception module comprising:
• computing means for intercepting the user request;
• computing means for interacting with the logic module of the virtual shell;
• computing means for producing the responses to the user request from a set of data produced by the logic module.
This interception module prevents direct interactions between the user and the underlying repositories (and, in particular, with the underlying objects), thus providing a unique entry point from the user to the virtual shell by intercepting the user requests. This effectively gives complete control of the underlying repositories (and consequently, the underlying objects) to the virtual shell, enabling the virtual shell to guarantee the consistency of the whole system. Additionally, the interception of all requests from the user and the generation of the corresponding responses allows the virtual shell to emulate an actual underlying repository and lets any unmodified user applications work on top of the virtual shell as if it were a native underlying repository (e.g. in a Windows system, a virtual shell could use the interception module to emulate an NTFS file system, and let any standard applications run on top of said virtual shell, regardless of which the actual underlying file systems are).
In a sixth aspect, the present invention provides a logic module for use in the virtual shell, the logic module comprising:
• computing means for interacting with the context monitoring parameters module of the virtual shell;
• computing means for interacting with the multi-namespaces module of the virtual shell;
• computing means for interacting with the underlying interface module of the virtual shell;
• computing means for producing data for the interception module of the virtual shell from a set of data produced by the context monitoring parameters module, a set of data produced by the multi-namespaces module and a set of data produced by the underlying interface module, the interaction with the multi-namespaces module depending on a subset of the set of data produced by the context monitoring parameters module.
This logic module allows dealing with user requests according to one or more specific semantics, by coordinating the interaction with the other modules of the virtual shell and by elaborating the data received from them, according to specific rules, heuristics and/or algorithms. As a result, the virtual shell is able to behave following non-trivial semantics and, for example, be able to completely emulate a file system (such as NTFS or POSIX file systems). Additionally, the logic module is also able to provide the multi-namespaces module with the necessary data to select particular namespaces to use for particular requests, either based on internal algorithms or from data received from other modules of the virtual shell.
In a seventh aspect, the present invention provides a method for producing one or more responses to a user request related to one or more objects, the method comprising:
• producing, by means of a context monitoring parameters module, data related to one or more context monitoring parameters;
• producing, by means of a multi-namespaces module, data related to one or more namespaces, each of said namespaces comprising one or more entries, each of said entries referring to a set of requestable objects, and each of the requestable objects having a set of requestable object attributes and being referred from one or more entries of one or more of said namespaces, wherein the organization of each namespace is decoupled from the attributes of the requestable objects referred from the entries of the namespace, said organizations allowing arbitrarily structured namespaces;
• producing, by means of an underlying interface module, data related to a set of underlying objects comprising data related to requestable objects related to the requested objects;
• producing, by means of a logic module, data for an interception module from a set of the data produced by the context monitoring parameters module and obtained by means of the logic module interacting with the context monitoring parameters module, a set of the data produced by the multi-namespaces module and obtained by means of the logic module interacting with the multi-namespaces module, and a set of the data produced by the underlying interface module and obtained by means of the logic module interacting with the underlying interface module, the interaction of the logic module with the multi-namespaces module depending on a subset of the set of the data produced by the context monitoring parameters module;
• intercepting, by means of the interception module, the user request and producing, by means of the interception module, the responses to the user request from a set of the data produced by the logic module and obtained by means of the interception module interacting with the logic module.
In an eighth aspect of the invention, there is provided a computer program product comprising program instructions for causing a computer to perform the method for producing one or more responses to a user request related to one or more objects. The invention also relates to such a computer program product embodied on a storage medium (for example, a CD-ROM, a DVD, a USB drive, on a computer memory or on a read-only memory) or carried on a carrier signal (for example, on an electrical or optical carrier signal).
Optional and advantageous features of the virtual shell and related method are set out in the dependent claims.
Additional objects, advantages and features of embodiments of the invention will become apparent to those skilled in the art upon examination of the description, or may be learned by practice of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
Particular embodiments of the present invention will be described in the following by way of non-limiting examples, with reference to the appended drawings, in which:
Figure 1 is a schematic representation of a modular architecture of a virtual shell according to an embodiment of the invention;
Figure 2 is an entity-relationship diagram of a data model related to a virtual shell according to an embodiment of the invention;
Figure 3 is an entity-relationship diagram of a data model related to a virtual shell according to an embodiment of the invention, said data model further comprising requestable object contents locations in relation to Figure 2;
Figure 4 is an entity-relationship diagram of a data model related to a virtual shell according to an embodiment of the invention, said data model further comprising virtual objects and virtual contents locations in relation to Figure 2;
Figure 5 is an entity-relationship diagram of a data model related to a virtual shell according to an embodiment of the invention, said data model further comprising requestable object contents locations in relation to Figure 4;
Figure 6 is a schematic representation of a possible structure of data and some related logic for the multi-namespaces module producing a reproducible ordered list of entries, according to an embodiment of the invention.
DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION
In the following descriptions, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be understood, however, by one skilled in the art, that the present invention may be practiced without some or all of these specific details. In other instances, well known elements have not been described in detail in order not to unnecessarily obscure the present invention.
With reference to Figure 1 and Figure 2, the virtual shell 100 may be described as comprising:
• a context monitoring parameters module 116 for producing data related to one or more 202 context monitoring parameters 201;
• a multi-namespaces module 118 for producing data related to one or more 205 namespaces 206, each of said namespaces 206 comprising one or more 207 entries 208, each of said entries 208 referring to a set 210 of requestable objects 211, and each of the requestable objects 211 having a set 213 of requestable object attributes 212 and being referred from one or more 209 entries 208 of one or more of said namespaces 206, wherein the organization of each namespace 206 is decoupled from the attributes 212 of the requestable objects 211 referred from the entries 208 of the namespace 206, said organizations allowing arbitrarily structured namespaces 206;
• an underlying interface module 117 for producing data related to a set of underlying objects comprising data related to requestable objects related to the requested objects;
• a logic module 109 for interacting 112 with the context monitoring parameters module 116, for interacting 114 with the multi-namespaces module 118, for interacting 113 with the underlying interface module 117, and for producing data for the interception module 104 from a set of the data produced by the context monitoring parameters module 116, a set of the data produced by the multi-namespaces module 118 and a set of the data produced by the underlying interface module 117, the interaction 114 with the multi-namespaces module 118 depending on a subset of the set of the data produced by the context monitoring parameters module 116;
• an interception module 104 for intercepting the user request 102, for interacting 106 with the logic module 109 and for producing the responses 103 to the user request 102 from a set of the data produced by the logic module 109.
This "main approach" of the virtual shell, introduced in the previous paragraph, mainly offers more flexible organization possibilities than the currently known technologies, as will be argued in the following descriptions.
The virtual shell may produce responses to user requests in such a way that the objects related to requests represent files, directories and other file system objects, and the virtual shell of the invention presents the user with virtual, but fully functional, views of file system directories, allowing the user to organize the files, directories and other file system objects as he/she pleases, while implementing a completely different directory layout in one or more underlying file systems.
The user is assumed to communicate by some means with another system or mechanism actually able to perform manipulations, accesses or operations on a particular object or objects. The interception module is responsible for capturing or intercepting such communication between the user and the system or mechanism meant to perform the manipulation, access or operation on the desired object or objects. That involves intercepting user requests and generating the corresponding responses (so that the virtual shell appears as "invisible" to the user).
By intercepting all requests from the user, all direct interaction from the user to underlying repositories and underlying objects comprised in them may be prevented. This allows the virtual shell to have full control of the underlying repository, being able, for example, to generate and handle metadata that allow extending the functionalities of the underlying repositories (e.g. providing multiple arbitrary namespaces), without risking inconsistencies due to untracked interactions from the user.
For example, without the interception module, a certain system (e.g. a POSIX-based file system) could create an additional arbitrary namespace by using directories and symbolic links to the entries in the "official" hierarchy; nevertheless, nothing prevents the user from removing the targets of the links in the "official" hierarchy, so that the links would be broken, and the additional namespace would become unusable (not to mention the fact that symbolic links do not behave exactly as regular files, which may also cause trouble). By contrast, with the interception module intercepting the user requests, attempts to remove the target of a reference would be tracked and could either be prevented, or an action could be taken on the referrer (e.g. also removing it) to avoid inconsistent states. Moreover, more sophisticated functionality is also available through an interception module: for example, the "official" underlying hierarchy could be completely hidden from all or some users (effectively preventing any unwanted manipulation), and the responses to requests directed to referrer entries could be manipulated so that they behave as actual files, and not symbolic links, for example.
One of the possible approaches to capture such communication is making the interception module appear as the entity the user wants to communicate with (e.g. by providing the same interface and semantics, i.e. the conventions used to send requests and/or receive responses, and the protocol used in the communications). For example, if the user was a software application that retrieves and/or stores data in a file system, the interception module could provide a file system interface in order to allow the software application to work as if it was communicating with a real file system.
The interception module may fulfil another functionality: translating the potentially diverse interfaces used by the user (e.g. when expecting different file systems with different calls and possibly different semantics) to a common interface with the logic module. This allows isolating all interface-dependent characteristics, making the development of the rest of the virtual shell easier. Obviously, having a single interception module able to deal with different user interfaces and translating them to a single logic module interface can be considered equivalent to having multiple interception modules specialized in specific user interfaces and connected to the same logic module with the common interface.
Other approaches could consist of hooking into the user, the intended target system, the communication means itself, the communication media, any auxiliary and/or complementary systems used by either the user and/or the intended target system, or any combination of the previous, with the intention of being able to totally or partially prevent, emulate, capture, record, duplicate, manipulate or alter in any way such communication. Using the previous example of a software application using a file system, the interception module could involve the modification of any combination of the application itself, any libraries used by the application, the operating system call mechanism, the operating system and related drivers, any daemons involved at any level and related libraries, and/or the media used to transmit the communications.
There are multiple and well-known techniques for interception. Just to mention a couple of examples, an embodiment intended to provide a POSIX file system semantics could use FUSE as a base for the interception module. FUSE (Filesystem in Userspace) is a well-known and robust tool to intercept file system operations and implement file system replacements. Originally developed for Linux, it is currently available for several operating systems (including FreeBSD, NetBSD, Mac OS X/Darwin, OpenSolaris and GNU/Hurd), and has also been ported to Windows. Therefore, a similar interception component could be implemented, at least, for all these operating systems. FUSE provides a kernel module that exports VFS-like callbacks to user-space applications. An embodiment of the virtual shell for the decoupling framework was implemented as one of such applications receiving the file system operation callbacks. In particular, the interception module component in such embodiment is the one responsible for receiving the callbacks from FUSE, forwarding the requests to the logic module, and providing sensible defaults in case some of the requests are not implemented by the virtual shell.
A similar approach in an MS Windows based environment could consist in using the so-called mini-filter drivers. The mini-filter drivers can be hooked into the Windows Operating System Input/Output stack, and allow intercepting the requests and responses to and from the underlying file system, as well as manipulating, or even cancelling, such requests and/or responses. These features can be used to implement a mechanism similar to the one provided by FUSE for POSIX-like file systems.
Of course, other well-known mechanisms exist and could be used as interception techniques for virtual shells offering file system semantics, as well as non-file system semantics (e.g. library interposition, etc.).
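The forwarding role described above for the FUSE-based embodiment can be illustrated with a minimal sketch. It does not use FUSE itself; the class names, the `operations` table and the choice of `-ENOSYS` as the default answer are assumptions made for illustration only, not details taken from the described embodiment.

```python
import errno


class InterceptionModule:
    """Receives file-system-style callbacks and forwards them to a logic module."""

    def __init__(self, logic_module):
        self.logic = logic_module

    def handle(self, operation, *args):
        # Look up the operation in the table exposed by the logic module.
        handler = self.logic.operations.get(operation)
        if handler is None:
            # Assumed "sensible default" for requests that are not implemented.
            return -errno.ENOSYS
        return handler(*args)


class DemoLogicModule:
    """Hypothetical logic module implementing a single operation."""

    def __init__(self):
        self.operations = {"getattr": self.getattr}

    def getattr(self, path):
        return {"path": path, "size": 0}


shell = InterceptionModule(DemoLogicModule())
attributes = shell.handle("getattr", "/docs/report.txt")
fallback = shell.handle("mknod", "/docs/device")
```

In this sketch a request for an operation the logic module does not offer never reaches any underlying repository; the interception module answers it on its own.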
The context monitoring parameters module may be able to capture, store and produce data related to context monitoring parameters. For example, in some embodiments of the virtual shell offering a file system interface and semantics, the user identity can be obtained from the user request, which could be forwarded to the context monitoring parameters module by the logic module.
The path used to refer to a particular object in a namespace can be decomposed into directory components by the context monitoring parameters module, and the individual components can be sent back to the logic module; the logic module may then ask the multi-namespaces module for individual validation of each path component and for obtaining references to any related objects, so that they can be transiently kept by the context monitoring parameters module and provided to the logic module when required for any procedure (e.g. to implement a semantics feature involving the parent directory of an object being accessed).
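A minimal sketch of the path handling just described is given below. The `children` mapping stands in for the multi-namespaces module and, like the function names, is a hypothetical structure chosen for illustration.

```python
def split_path(path):
    """Decompose a slash-separated path into its directory components."""
    return [part for part in path.split("/") if part]


def resolve(path, children, root_id=0):
    """Validate each component in turn; return the chain of entry ids or None.

    `children` maps (parent_entry_id, name) -> entry_id (hypothetical layout).
    """
    chain = [root_id]
    for name in split_path(path):
        entry_id = children.get((chain[-1], name))
        if entry_id is None:
            return None  # a component failed validation
        chain.append(entry_id)
    return chain


children = {(0, "home"): 1, (1, "alice"): 2, (2, "notes.txt"): 3}
chain = resolve("/home/alice/notes.txt", children)
parent_of_target = chain[-2]  # kept so parent-directory semantics can be applied
```

Keeping the whole chain, rather than only the final entry, is what makes the parent directory available for later semantic checks.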
Sequences of related requests can be handled by the context monitoring parameters module by creating a handle upon the first request of the set of related requests (e.g. an open request on a file), keeping such handle, and forwarding a reference to it to the logic module for inclusion into the corresponding response; subsequent related requests (e.g. read or write operations) may include the reference to the handle, so that the logic module can ask the context monitoring parameters module to validate it and determine that the sequence of requests is valid.
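The handle mechanism can be sketched as a small table kept by the context monitoring parameters module. The class name, the integer handles and the per-user check are illustrative assumptions.

```python
import itertools


class HandleTable:
    """Tracks handles created on 'open' and validated on later requests."""

    def __init__(self):
        self._next = itertools.count(1)
        self._open = {}

    def open(self, user, object_id):
        handle = next(self._next)
        self._open[handle] = (user, object_id)
        return handle

    def validate(self, handle, user):
        """Return the object id if the handle belongs to this user, else None."""
        record = self._open.get(handle)
        if record is None or record[0] != user:
            return None
        return record[1]

    def close(self, handle):
        self._open.pop(handle, None)


table = HandleTable()
handle = table.open("alice", object_id=42)   # first request of the sequence
target = table.validate(handle, "alice")     # later read/write request
```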
The context monitoring parameters module can also request the logic module to interact with the underlying interface module to gather information about which underlying interface modules are available, if any, what their capacity is, and how much free storage space is available, for example.
In some embodiments, the context monitoring parameters module may use means to obtain the current time (for example, by accessing specific hardware, or by querying it from an external source).
The context monitoring parameters module can maintain a cache for any piece of data gathered, so that it can be reused if needed.
In some embodiments, the context monitoring parameters module can keep persistent or temporary tracking information about the user requests and the corresponding responses generated, so that a log of the virtual shell activity can be produced.
The multi-namespaces module may be in charge of, for example, translating the file system object identifier issued by the application (e.g. a file path) into a file system object reference internal to the virtual shell (e.g. an internal identifier). This module may also handle requests related to directory or folder content management, and also requests related to renaming file system objects, or moving them to different locations in the directory tree of a particular namespace or several namespaces.
The multi-namespaces module may generate data to be used internally by the virtual shell (usually to be kept by the context monitoring parameters module, in order to be used for future requests). For example, in embodiments of the virtual shell offering a file system semantics, an "open" operation of a certain path may cause the resolution of that path against a particular namespace selected according to context monitoring parameters; in that situation, the response from the multi-namespaces module may comprise not only the data related to the "open" object, but also information about the selected namespace, the particular entry, and/or its "parent" entry, if any (for example, to be associated by the context monitoring parameters module to the corresponding "handle", so that the specific entry and namespace can be tracked and used for handling further related requests).
The multi-namespaces module may use directives received through the interaction with the logic module to determine the particular namespace to use for certain requests. Such directives from the logic module can be based on data provided by the context monitoring parameters module.
The multi-namespaces module can use a repository (e.g. a database) to store the entries of all namespaces. Entries of different namespaces can have some means to indicate to which namespace they belong (for example, entries may have an associated tag referring to the namespace). In some embodiments, the multi-namespaces module can use multiple repositories to store entries corresponding to different namespaces (e.g. one repository per namespace). In this case, a namespace identifier can be used to select the appropriate repository to use.
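As an illustration of the single-repository variant, the sketch below stores entries of all namespaces in one table, each tagged with its namespace, and selects the namespace from a context value. The table layout, the namespace names and the role-based rule are assumptions; the text only requires that entries indicate the namespace they belong to.

```python
import sqlite3


def make_repository():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE entries ("
        " namespace TEXT, path TEXT, object_id INTEGER,"
        " PRIMARY KEY (namespace, path))"
    )
    return db


def add_entry(db, namespace, path, object_id):
    db.execute("INSERT INTO entries VALUES (?, ?, ?)", (namespace, path, object_id))


def select_namespace(context):
    """Hypothetical directive: pick a namespace from the user's role."""
    return "admin_view" if context.get("role") == "admin" else "user_view"


def lookup(db, context, path):
    row = db.execute(
        "SELECT object_id FROM entries WHERE namespace = ? AND path = ?",
        (select_namespace(context), path),
    ).fetchone()
    return row[0] if row else None


db = make_repository()
add_entry(db, "user_view", "/report", 7)
add_entry(db, "admin_view", "/all/report", 7)  # same object, different namespace
```

Both entries refer to object 7, but which path resolves depends on the namespace chosen from the context of the request.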
When validating or accessing particular entries, the namespace to be used can be selected from information produced by the context monitoring parameters module (for example, based on identity information: depending on the user making a request, or its role, a particular namespace may be used).
Interactions between the logic module and the multi-namespaces module may comprise requests to make arbitrary manipulations to one or more namespaces, such as creating a folder (an entry containing other entries), removing a folder, renaming a folder, creating entries, removing entries, renaming entries, moving an entry from one folder to another, etc., just to cite a few examples. Such requests may be directly related to requests from the user. This allows creating arbitrary organizations of the namespaces according to user needs, and completely unrelated to the objects being referred from the entries in the namespaces. Nevertheless, in some embodiments, such arbitrary namespace manipulations can be generated internally by the virtual shell itself, depending, for example, on object attributes; the multi-namespaces module in the virtual shell of the invention is not limited to that possibility, but allows completely arbitrary namespaces.
In order to achieve said arbitrary namespaces, data related to namespace organization may be associated with the entries. For example, in a hierarchical namespace, an entry may contain a reference (e.g. an identifier) to the entry containing it (e.g. a parent directory in a namespace emulating a file system directory hierarchy). This way, the namespace organization can be handled independently from the objects referred by the entries or their attributes. Additionally, an entry may also contain other information, such as data identifying the namespace to which it belongs.
Obviously, in some embodiments, depending on the implementation, additional data may need to be explicitly kept to maintain such arbitrary namespaces (for example, which entry is the root entry of a particular namespace - assuming that the concept of root entry exists for such namespace organization). Other embodiments could maintain the container hierarchy by having a container entry containing references to its contained entries (e.g. in a file system directory hierarchy, a directory entry could contain references to its children - files, subdirectories, etc.). Note that this functionality can be provided by the multi-namespaces module independently from any existing or non-existing support in the underlying repositories, if any.
In some embodiments of the invention, the multi-namespaces module also keeps references and other data related to objects referred by the entries of the namespaces. On request by the logic module, the multi-namespaces module may provide the references and any other data related to objects referred by the entries of the namespaces.
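The parent-reference scheme for hierarchical namespaces can be sketched as follows. The `Entry` fields and helper names are hypothetical; the point is that moving or renaming an entry only touches namespace data and leaves the referred object untouched.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    entry_id: int
    namespace: str
    name: str
    parent_id: Optional[int]          # None marks the root entry
    object_id: Optional[int] = None   # reference to the requestable object


def path_of(entries, entry_id):
    """Rebuild the path of an entry by following parent references."""
    parts = []
    current = entries[entry_id]
    while current.parent_id is not None:
        parts.append(current.name)
        current = entries[current.parent_id]
    return "/" + "/".join(reversed(parts))


def move(entries, entry_id, new_parent_id, new_name=None):
    """Move and optionally rename an entry; the referred object is unchanged."""
    entry = entries[entry_id]
    entry.parent_id = new_parent_id
    if new_name is not None:
        entry.name = new_name


entries = {
    1: Entry(1, "ns", "", None),
    2: Entry(2, "ns", "docs", 1),
    3: Entry(3, "ns", "tmp", 1),
    4: Entry(4, "ns", "a.txt", 3, object_id=99),
}
before = path_of(entries, 4)
move(entries, 4, new_parent_id=2, new_name="b.txt")
after = path_of(entries, 4)
```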
In the embodiments where the multi-namespaces module also keeps references and other data (such as attributes) related to objects referred by the entries of the namespaces, such pieces of data related to objects can be kept as separate entities from the entries themselves. This way, the entries and the objects referred by them can be decoupled, so that the same object can be simultaneously referred from multiple entries, possibly from multiple namespaces. In the same way, multiple entries, even from different namespaces, may refer to the same object.
A way to implement the decoupling between entries and referred objects consists of having the entry containing a reference (e.g. an identifier) of the object being referred. For performance reasons, or simply for implementation convenience, in some embodiments, an entry may also contain some other immutable pieces of data related to the referred objects (such as the type of the object being referred - e.g. in a file system environment, if the entry corresponds to a regular file, a directory, a symbolic link, etc.)
One of the advantages of decoupling the entries from the referred objects is that the namespaces can be handled independently from the referred objects. This has two main consequences: increased flexibility (an n-to-n relation between entries and objects can easily be implemented) and performance advantages (requests related to namespaces do not affect the referred objects, and vice versa; so, there is less risk of contention in the virtual shell operation).
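The n-to-n relation between entries and objects mentioned above can be sketched with two indexes kept side by side. The `LinkTable` class and the namespace names are assumptions used only to make the relation concrete.

```python
from collections import defaultdict


class LinkTable:
    """Keeps the n-to-n relation between namespace entries and objects."""

    def __init__(self):
        self.objects_of = defaultdict(set)  # (namespace, entry) -> object ids
        self.entries_of = defaultdict(set)  # object id -> (namespace, entry)

    def link(self, namespace, entry_name, object_id):
        key = (namespace, entry_name)
        self.objects_of[key].add(object_id)
        self.entries_of[object_id].add(key)

    def unlink(self, namespace, entry_name, object_id):
        key = (namespace, entry_name)
        self.objects_of[key].discard(object_id)
        self.entries_of[object_id].discard(key)


links = LinkTable()
links.link("by_date", "2011/photo", 5)
links.link("by_topic", "holiday/photo", 5)   # same object, second namespace
links.link("by_topic", "holiday/photo", 6)   # same entry, second object
```

Neither operation reads or writes the objects themselves, which is the source of the reduced contention noted in the text.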
In some embodiments, the multi-namespaces module may use a repository (e.g. a database table) to keep data related to the objects referred by the entries (e.g. requestable objects), including, for example, all or part of their attributes.
In embodiments where the virtual shell is able to provide file system behaviour, the multi-namespaces module can keep some of the attributes related to different types of file system objects (regular files, directories, etc.). For example, it may keep security elements (owner, group, access permissions and control lists), statistical information (access times, utilization indicators, etc.), infrastructure-related fields (such as file system identifiers, hard link counters, symbolic link paths, etc.), and extended attributes, for example.
In some embodiments where the multi-namespaces module keeps information about attributes of objects related to entries of namespaces, the logic module can be responsible for requesting such information and handling it appropriately (e.g. by applying security access restrictions and determining if a particular object referred by an entry is accessible or not).
In some embodiments, the data kept by the multi-namespaces module related to objects referred from the entries of the namespaces may contain pieces of data that can be used by the logic module to request accesses to the underlying repositories. For example, a requestable object representing a file may have the data belonging to such file in an actual file in an underlying file system; in some embodiments, the data related to the requestable object kept by the multi-namespaces module may comprise, for example, the path of the actual file with the data in the underlying file system, so that the logic module can use that path to request the underlying interface module to access the file contents, or to recover attributes related to the underlying object (i.e. the file in the underlying file system) such as, for example, the file size.
In preferred embodiments, each reference to requestable objects comprises a set of data related to a set of underlying objects comprised in one or more underlying repositories; and the underlying interface module is adapted to interact with the underlying repositories taking into account said data related to said set of underlying objects.
In some embodiments, the data related to the requestable object that a multi-namespaces module can keep does not need to contain direct information about the underlying object, but one or more pieces of data that the logic module can convert into information to access the underlying repositories through the underlying interface module. For example, the multi-namespaces module can keep opaque handles related to requestable objects that the logic module can convert into usable directions to underlying files.
In some embodiments providing a file system for each namespace, the multi-namespaces module may handle symbolic and hard links as part of the namespace management. For example, symbolic links can be handled by having entries comprising paths pointing to other entries in either the same or a different namespace; hard links can be achieved by having different entries refer to the same object. These functionalities are independent from any support (or lack of support) for equivalent features in the underlying repositories, if any.
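A minimal sketch of link handling inside the namespace data is shown below. Entries are keyed by a hypothetical (namespace, path) pair; the `kind` field and the hop limit are assumptions added so that the example terminates on link cycles.

```python
def resolve(entries, key, max_hops=8):
    """Follow symbolic-link entries until a non-link entry is reached."""
    for _ in range(max_hops):
        entry = entries.get(key)
        if entry is None:
            return None                 # dangling link or unknown entry
        if entry["kind"] != "symlink":
            return entry["object_id"]
        key = entry["target"]           # may point into another namespace
    return None                         # too many levels of symbolic links


def link_count(entries, object_id):
    """Hard-link count: number of entries referring to the same object."""
    return sum(1 for e in entries.values() if e.get("object_id") == object_id)


entries = {
    ("work", "/a"): {"kind": "file", "object_id": 1},
    ("work", "/b"): {"kind": "file", "object_id": 1},            # hard link
    ("home", "/link"): {"kind": "symlink", "target": ("work", "/a")},
    ("home", "/loop"): {"kind": "symlink", "target": ("home", "/loop")},
}
```

Nothing here relies on the underlying repository supporting links, since both kinds of link live entirely in the entry data.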
In some embodiments, the multi-namespaces module may be able to overlap several namespaces, so that the namespace visible to the user is a combination of two or more namespaces handled by the multi-namespaces module. The rules to combine namespaces could be specified by the user through specific requests captured by the interception module, or could be determined internally by the virtual shell from the context monitoring parameters. For example, in embodiments offering a file system semantics, a user could use a namespace for home files and a different one for work files; the home namespace could be hidden during work hours and, during non-work hours, it could be visible, overlapped with the work namespace - additionally, the files accessed through the work namespace during off-work hours could be treated as read-only. This can be implemented, for example, by the logic module checking a path associated to a request against multiple namespaces handled by the multi-namespaces module, and applying the corresponding modifications to the responses according to the namespace used to resolve the path, the context monitoring parameters, and possible previous requests from the user indicating rules to combine multiple namespaces.
In some embodiments implementing file system semantics, the attributes of the requestable objects representing file system objects may include access control (owner, group and related access permissions), directory specific information (e.g. the number of entries) and/or size and time data for non-regular files. Sizes and access times for regular files might be associated to attributes of the underlying object containing the data of the represented file, if any; therefore, to access them, a request could be sent to the logic module to trigger the recovery of the underlying object attributes via the underlying interface module.
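The home/work overlap example given earlier can be sketched as a lookup over an ordered list of visible namespaces. The work-hours range, the lookup order (home before work) and the dictionary layout are assumptions; the text leaves the combination rules open.

```python
WORK_HOURS = range(9, 17)  # assumed working hours, 09:00 to 16:59


def resolve(namespaces, path, hour):
    """Return (namespace, object_id, read_only) for the first match, else None."""
    at_work = hour in WORK_HOURS
    # During work hours the home namespace is hidden; otherwise it overlaps work.
    order = ["work"] if at_work else ["home", "work"]
    for name in order:
        object_id = namespaces[name].get(path)
        if object_id is not None:
            # Work files reached outside work hours are treated as read-only.
            read_only = name == "work" and not at_work
            return name, object_id, read_only
    return None


namespaces = {"home": {"/photos": 1}, "work": {"/report": 2}}
```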
The main goal of the underlying interface module is to isolate the virtual shell from the specific characteristics of the underlying object repositories. In some embodiments, this allows the logic module to refer to underlying objects without having to understand how the underlying objects are stored or even organized in the underlying repositories.
There are several ways to implement such isolation. For example, the logic module can issue a request with an explicit reference to an underlying object (e.g. a file path in an underlying file system): even if the logic module is not able to interpret it, the piece of data can be directly used by the underlying interface module to access the underlying repository. Alternatively, the logic module may issue opaque handles that are converted into actual references to underlying repositories by the underlying interface module (e.g. by maintaining a mapping table between handles and actual reference information, or by applying a certain function to the handle - e.g. "decoding" and "verifying" using cryptographic means). Said conversion of underlying object handles could also be influenced by context monitoring parameters; for example, a particular request for an underlying object creation could be diverted to different underlying repositories depending on the free space in each of the available underlying repositories.
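The "decoding and verifying" alternative could be sketched as follows. This is only one possible reading, assuming that the handle carries the reference together with an HMAC-SHA256 tag; the key `SECRET`, the function names `encode_handle` and `decode_handle`, and the handle layout are hypothetical and not part of the described embodiments.

```python
import base64
import hashlib
import hmac
from typing import Optional

SECRET = b"example-key"   # hypothetical key known only to the interface module
TAG_LEN = 32              # size of a SHA-256 digest in bytes


def encode_handle(reference: str) -> str:
    """Build an opaque handle: base64(tag || reference)."""
    payload = reference.encode("utf-8")
    tag = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(tag + payload).decode("ascii")


def decode_handle(handle: str) -> Optional[str]:
    """Return the underlying reference, or None if the handle is not authentic."""
    try:
        raw = base64.urlsafe_b64decode(handle.encode("ascii"))
    except ValueError:                      # malformed base64 or non-ASCII input
        return None
    tag, payload = raw[:TAG_LEN], raw[TAG_LEN:]
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return payload.decode("utf-8")
```

Compared with a mapping table, this variant keeps no per-handle state in the underlying interface module, at the cost of exposing the handle length to the other modules.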
In addition to being used as a means to access underlying objects, the underlying interface module can also be the way to access and/or interact with an underlying repository as a whole (e.g. attaching or detaching a particular repository to or from the virtual shell, or querying characteristics of the repository, such as the capacity, the maximum file size - if it is a file system -, the optimum performance parameters, etc.).
Obviously, different underlying repositories with different interfaces may co-exist and be used simultaneously by the virtual shell. Several alternatives may be used to deal with this situation: for example, a single underlying interface module could be able to deal with different underlying repository interfaces. Another embodiment would consist of having different specialized underlying interface modules, and letting the logic module direct the requests to the appropriate one.
In some embodiments, the underlying interface module may have an additional functionality consisting of being able to access data from the environment (e.g. the current time, the available memory, network availability, etc.). This type of information is likely to be consumed by the context monitoring parameters module.
An embodiment of the virtual shell can use a file system as an underlying repository for the underlying objects (e.g. data corresponding to files represented by entries in the namespaces can be stored in actual files in an underlying repository). The interaction of the virtual shell's underlying interface module with the underlying file system can be done via basic POSIX calls, which means that the virtual shell has no dependency on a particular underlying file system, and can operate on top of any file system providing such an interface.
Some embodiments of the virtual shell can use file systems which do not provide a strict POSIX interface (such as NTFS, NFS, or SMB/CIFS-based systems). In that case, the interaction of the virtual shell's underlying interface module with the underlying file system can be done through the specific file system calls supported by said file systems.
As the strictly required operations are very simple (create and remove pieces of data, and read or write data at specific offsets), it would be very easy to implement an embodiment of the virtual shell able to use any other new file system interface. Similarly, it would not be hard to port the underlying interface module so that it could work directly on top of certain storage media (for instance, storage media meaning either a raw storage device - e.g. a disk or a similar block device - or more sophisticated constructs such as object storage devices -OSD- or any other mechanism able to store data).
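The small set of strictly required operations can be made concrete with a sketch of an interface and one trivial backend. The names `UnderlyingInterface` and `InMemoryRepository` are hypothetical, and the in-memory backend is an assumption used here only so that the example is self-contained; a real backend would map the same four operations onto file system or block device calls.

```python
from typing import Dict, Protocol


class UnderlyingInterface(Protocol):
    """The four operations named in the text: create, remove, read, write."""

    def create(self, ref: str) -> None: ...
    def remove(self, ref: str) -> None: ...
    def read(self, ref: str, offset: int, length: int) -> bytes: ...
    def write(self, ref: str, offset: int, data: bytes) -> int: ...


class InMemoryRepository:
    """Toy backend keeping each underlying object as a bytearray."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytearray] = {}

    def create(self, ref: str) -> None:
        self._objects[ref] = bytearray()

    def remove(self, ref: str) -> None:
        del self._objects[ref]

    def read(self, ref: str, offset: int, length: int) -> bytes:
        return bytes(self._objects[ref][offset:offset + length])

    def write(self, ref: str, offset: int, data: bytes) -> int:
        buf = self._objects[ref]
        if len(buf) < offset:
            buf.extend(b"\x00" * (offset - len(buf)))   # zero-fill the gap
        buf[offset:offset + len(data)] = data
        return len(data)


repo: UnderlyingInterface = InMemoryRepository()
repo.create("a")
repo.write("a", 4, b"data")
```

Any other class offering the same four methods could be substituted for `InMemoryRepository` without changes to the caller, which is the isolation property discussed above.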
When embodiments of the virtual shell use advanced storage systems as underlying repositories (e.g. file systems or databases, possibly being local, remote, distributed, etc.), the data used by the underlying interface module does not include any knowledge of low-level data storage (in particular, there are no references to disks, blocks or other storage objects - though they could be easily included if necessary). It is up to the underlying file system (or any other advanced storage system) to take decisions about low-level data server selection, striping, block/object placement, etc. Therefore, the references to underlying objects kept, handled or generated by other modules (e.g. the logic module), or the attributes of requestable objects related to underlying objects, can be adjusted accordingly - i.e. they do not need to contain, refer to or generate low-level storage information.
The logic module is responsible for coordinating and communicating with the rest of the components of the virtual shell in order to provide a behaviour implementing one or more semantics, understanding by semantics a set of rules, specifications and conventions that allow fulfilling certain expectations when the user interacts with the virtual shell.
In order to perform its functions, the logic module may exchange data and/or requests with one or more of the other components of the virtual shell, and it may keep its own transient and/or persistent data.
In some embodiments, the logic module may act either as a reaction to user requests, and/or due to changes to any type of context monitoring parameters, and/or requests from other system components.
For example, if an embodiment uses requestable objects to represent file system objects, and some attributes of them are kept by the multi-namespaces module, a request to obtain, for instance, a regular file's attributes may return part of the requestable object's attributes from the multi-namespaces module (e.g. access permissions, etc.) and then trigger a request to the logic module to retrieve attributes of the related underlying object containing the file's data (e.g. the size and/or the modification times).
In some embodiments, the logic module may act autonomously. For example, it may initiate a transfer and/or duplication of underlying objects across different underlying object repositories (for instance, to improve request balancing and/or to guarantee availability). For example, the logic module may also perform operations on the underlying repositories autonomously from application requests for maintenance, information gathering, data reorganization, data migration, data transformation, physical metadata checks, physical metadata updates, or any other cause.
In some embodiments, the logic module may forward requests from other virtual shell modules to the underlying interface module in order to obtain data from or perform actions on one or more of the underlying repositories, or the underlying objects contained in them.
In some embodiments, the logic module may perform accesses to the underlying repositories based on data coming from other modules of the virtual shell (e.g. attributes of requestable objects obtained from the multi-namespaces module or from any other module). Such data may comprise direct or indirect references to underlying objects. The logic module may be able to translate such references into data usable by the underlying interface module, or may use other modules of the virtual shell to perform either the final or intermediate translations.
For example, in some embodiments, the logic module could maintain a table mapping references from other modules into data suitable to be included in a request to the underlying interface module. Another way to implement this functionality would be the references from the other modules having the necessary data encoded in some way, and the logic module decoding such references to obtain the necessary data. Of course, additional post-processing of the data would be possible (e.g. a function could be applied, so part of the data is altered before being sent to the underlying interface module or after being received from the underlying interface module - for instance, to encrypt or decrypt data). In some embodiments, the logic module can use context monitoring parameters to determine the actual data to be used for interacting with the underlying interface module.
In summary, the "main approach" of the virtual shell of the invention provides very high flexibility to storage systems and, at the same time, simplifies the integration of multiple underlying storage repositories. To this end, the virtual shell comprises a multi-namespaces module able to deal with multiple namespaces simultaneously. This simultaneousness is achieved by decoupling the entries of the namespaces from the requestable objects, so that namespaces can be organized arbitrarily and independently with respect to the requestable objects referred by the entries. Nevertheless, this level of flexibility may be higher in terms of, for example, the objects required by the user having different attributes depending on the namespace through which they are accessed. This can be achieved by decoupling the attributes associated to the required object but independent from the "pure data" associated to that object, from those attributes related to the "pure data" of the object (such as, for example, the size of the data), which may be achieved by decoupling the requestable object referred by entries from one or more underlying objects representing the "pure data" of the object. By these means, different entries, possibly from different namespaces, can refer to different requestable objects to access "data-independent" attributes, and such requestable objects could access the same underlying objects to access both data and attributes related to such data.
In other embodiments, a single requestable object could be used, referred by multiple entries, and having different sets of "data-independent" attributes tagged by the namespace being used, and still maintaining a reference to an underlying object for "data-dependent" accesses - note that this implementation could be considered slightly restrictive with respect to the previous one, as a particular object would depend on the namespace to select the set of attribute values to use; extensions to the tags could be used to overcome this limitation, but this could bring other drawbacks in actual implementations: large attribute sets difficult to handle, possible access contention issues, etc.
The selection of particular namespaces and the operation with the referred objects need to be based on the context monitoring parameters provided by the context monitoring parameters module; without them, the logic module would not be able to offer complex semantics to the user (such as that of a conventional file system). The internal consistency of the virtual shell is guaranteed by the interception module, which intercepts all interactions from the user to the required objects, avoiding untracked modifications that could create inconsistencies between the virtual shell internal representations and the underlying repositories and the underlying objects contained in them. The underlying interface module permits isolation between the internals of the virtual shell and the differential aspects of different underlying repositories, making it easier to extend support for new potential underlying repositories.
In conclusion, the "main approach" of the virtual shell of the invention offers very flexible organization possibilities by providing multiple independent namespaces (i.e. where changes through one of them do not need to necessarily imply changes in a different namespace - neither regarding the structure of the namespace, nor the objects referred by the namespace entries), each namespace being able to offer fully functional complex semantics (e.g. emulating a file system), and being able to use diverse storage systems as underlying repositories for storing data related to objects accessible through the virtual shell.
In some embodiments of the invention supported by Figure 1 and Figure 3:
· the multi-namespaces module 118 of the virtual shell 100 is adapted to produce the data related to namespaces 206 further taking into account that each entry 208 of each namespace 206 refers to its related set 210 of requestable objects 211 through one or more 301 requestable object contents locations 300, each requestable object contents location 300 referring to a subset 303 of the related set 210 of requestable objects 211, and each of the requestable objects 211 being referred from one or more 302 requestable object contents locations 300;
· the virtual shell 100 further comprises a contents locations module 110 for producing data related to one or more references to one or more requestable objects 211 from data related to requestable object contents locations 300; and
· the logic module 109 of the virtual shell 100 is adapted to interact 115 with the contents locations module 110 and to produce the set of data for the interception module 104 further from a set of the data produced by the contents locations module 110, the interaction 115 with the contents locations module 110 depending on a subset of the set of the data produced by the context monitoring parameters module 116.
This "requestable object contents locations approach" of the virtual shell, introduced in the previous paragraph, has the advantage of improving the flexibility of the "main approach", as it will be argued in following descriptions.
In the "main approach", the multi-namespaces module keeps information about both the multiple namespaces and the requestable objects referred from the entries of the namespaces. Thus, the multi-namespaces module has to understand the characteristics of both sets of entities (the namespaces and the entries on one side, and the requestable objects and/or their related data on the other side). This situation may add complexity to the module, as characteristics, behaviour, and potential requests on both sets of entities are different.
A possibility to mitigate the issue mentioned above is to separate the management of namespaces and entries from the management of the objects referred by those entries. A way to achieve this is to use an opaque handle to implement the reference from entries to referred objects, and let such a handle be interpreted outside the multi-namespaces module.
Apart from the advantage of simplifying the multi-namespaces module, this possibility has other advantages; for example, the objects referred by the entries may be of potentially different types, without affecting the structure, functionality or characteristics of the multi-namespaces module (e.g. we could use the same multi-namespaces module to handle objects which represent different things, have different attributes, or are encoded in different ways). Each of the entries of namespaces handled by the multi-namespaces module may comprise a content location container to contain such opaque references to objects. In particular, when the entries refer to requestable objects, such content location containers may be able to contain "requestable object contents location" data for one or more requestable objects. In this case, the multi-namespaces module may be adapted to produce the data related to namespaces further taking into account that each entry of each namespace comprises said contents location container adapted to contain one or more requestable object contents locations.
The data in a contents location container may contain, possibly encoded in some way, a tag or an equivalent piece of data indicating to which type of object the opaque handle refers (for example, if it contains a requestable object contents location). This tag can be used by some other component to identify which module may be able to interpret and/or deal with the opaque handle. The data in the contents location container may contain additional pieces of data related to the referred object; such pieces of data might be extracted and interpreted directly by some other component, without having to access the whole referred object. For example, in embodiments of the virtual shell offering file system semantics and representing file system objects with requestable objects, the "requestable object contents location" in the contents location container associated to a particular entry of a namespace may contain, encoded in some way, additional information such as the file system object internal identifier (e.g. an i-node number) and/or the type of the file system object (e.g. if it represents a regular file, a symbolic link, a directory, etc.). In the mentioned example, the additional pieces of data are immutable in a file system context; note that having to make a request to the multi-namespaces module to partially replace a part of the additional information encoded in the data in a contents location container (e.g. to replace data about the file system type of the represented object) would break the philosophy of the multi-namespaces module being agnostic with respect to the referred objects, and thus would reduce the advantages of using opaque handles in contents location containers.
The contents location container may contain data referring to multiple objects (of the same, or of different types). This can be achieved either by storing a set of "handles" (e.g. a list of "requestable object contents locations"), or by storing a single "handle" encoding multiple references to objects. In the first case, the multi-namespaces module may provide an interface to add or remove individual "handles" in an entry's contents location container, as well as to access all of them as a whole. The way to proceed when multiple references to objects are available will be determined by the component able to deal with such objects. For example, different "requestable object contents locations" in a contents location container may represent different replicas of a file, and any of them could be chosen; or they may represent different "stripes" of a file, having to access all of them in order to obtain the complete file. The "requestable object contents locations" may contain additional information or tags indicating how such multiplicity should be handled.
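The replica and stripe cases can be sketched as follows. The tag values "replica" and "stripe", the class `ContentsLocationContainer`, the function `read_all` and the dictionary `store` standing in for a backend are all assumptions introduced for illustration; the text only states that some tag may indicate how the multiplicity should be handled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ContentsLocationContainer:
    mode: str              # hypothetical tag: "replica" or "stripe"
    locations: List[str]   # opaque handles, meaningless to the namespace layer


def read_all(container: ContentsLocationContainer,
             fetch: Callable[[str], bytes]) -> bytes:
    """Combine the data behind several locations according to the tag."""
    if not container.locations:
        return b""                          # empty container: no backing data
    if container.mode == "replica":
        last_error = None
        for location in container.locations:
            try:
                return fetch(location)      # any replica is good enough
            except KeyError as error:       # this replica is unavailable
                last_error = error
        raise last_error
    if container.mode == "stripe":
        # Every stripe is needed, concatenated in order.
        return b"".join(fetch(location) for location in container.locations)
    raise ValueError(f"unknown mode: {container.mode}")


# Hypothetical backend: handle -> data ("r0" is deliberately missing).
store: Dict[str, bytes] = {"r1": b"hello", "s0": b"hel", "s1": b"lo"}
replicated = ContentsLocationContainer("replica", ["r0", "r1"])
striped = ContentsLocationContainer("stripe", ["s0", "s1"])
```

With this data, `read_all(replicated, store.__getitem__)` falls back to the second replica, and `read_all(striped, store.__getitem__)` concatenates both stripes.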
A contents location container of some entries could be empty (or, equivalently, contain "null" values - e.g. "requestable object contents locations" not referring to any requestable object). Such feature can be used for several purposes such as, for example, deal with entries of namespaces which simply act as namespace structural support (e.g. a container not backed by any corresponding requestable object), or to deal with particular features or situations (e.g. placeholders, or errors). As mentioned above, the "requestable object contents locations" are opaque handles from the multi-namespace module perspective. So, they must be interpreted and handled somewhere else. A first approach would consist of having the logic module deal with them. Despite being possible, that would increase the burden on the logic module, making it more complex than necessary. As an alternative solution, the virtual shell may introduce a new component: a contents locations module able to interpret and deal with "requestable object contents locations". The contents locations module may keep data related to requestable objects, including their attributes (or even keep the whole requestable objects themselves). On request by the logic module, the contents location module may provide the attributes, other related data, or references (direct or indirect) to underlying objects related to the requestable objects, if any.
A requestable object may not have any related underlying objects. This may imply that either said requestable object does not have "pure data" contents, or it may imply that its "data" is generated internally by the virtual shell. In this latter case, the contents locations module is responsible for providing the means to generate such data on demand. For example, a "/dev/zero" file in a Unix system could be implemented by the contents locations module generating zero-filled data when the logic module requests information about the file's contents; in a similar way, it could generate random data for a "/dev/random" file, or discard any modification requests to the contents of a "/dev/null" file.
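A minimal sketch of such internally generated data is given below. The kind names ("zero", "random", "null") and the functions `read_synthetic` and `write_synthetic` are hypothetical; the behaviour on reading a "null" object (no data) follows the usual Unix convention and is an assumption, since the text only mentions discarding modifications.

```python
import os
from typing import Callable, Dict


def _zeros(length: int) -> bytes:
    return bytes(length)            # zero-filled data, as for /dev/zero


def _random(length: int) -> bytes:
    return os.urandom(length)       # random data, as for /dev/random


def _nothing(length: int) -> bytes:
    return b""                      # assumed: reading /dev/null yields no data


GENERATORS: Dict[str, Callable[[int], bytes]] = {
    "zero": _zeros,
    "random": _random,
    "null": _nothing,
}


def read_synthetic(kind: str, length: int) -> bytes:
    """Generate the contents of a requestable object with no underlying object."""
    return GENERATORS[kind](length)


def write_synthetic(kind: str, data: bytes) -> int:
    """Discard the modification and report it as fully written."""
    if kind not in GENERATORS:
        raise KeyError(kind)
    return len(data)
```

No underlying repository is touched in either function, which is what distinguishes these requestable objects from those backed by underlying objects.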
A requestable object may refer to related underlying objects. For example, when a requestable object represents a regular file in a file system, a related underlying object (e.g. a file in an underlying file system) may contain all or part of the represented file's data contents.
A requestable object may refer to multiple underlying objects containing data associated to the requestable object. Such multiplicity of underlying objects may contain either alternative versions of the data or complementary versions of the data (e.g. different underlying objects may contain alternative contents for a file depending on the user accessing it, or they may contain different sections of the data which, concatenated, could represent the totality of the data associated to the requestable object). In such situations, the contents location module may be responsible for selecting the underlying object (or underlying objects) to act upon, possibly based on context monitoring parameters and the particular received requests.
Additionally, a requestable object may also refer to related underlying objects containing additional metadata (e.g. additional attributes) about the requestable object. For example, a requestable object may have a related underlying object containing the "pure data" associated to the requestable object, and additional related underlying objects containing information about the pure data (e.g. the date, time and place where a photo has been taken), its format, or even code able to process the "pure data".
As in the case of underlying objects containing "pure data" related to requestable objects, a requestable object may refer to multiple underlying objects containing additional metadata (e.g. additional attributes) about the requestable object, such multiplicity of underlying objects being able to contain either alternative or complementary fragments of the metadata. In such cases, the contents location module may be responsible to select the appropriate underlying object or underlying objects to act upon, possibly depending on context-monitoring parameters and the received requests.
References (either direct or indirect) to underlying objects related to requestable objects can be considered as attributes of the requestable objects. The contents location module can be responsible of providing such references to underlying objects to the logic module upon request, so that the logic module can request the appropriate access to corresponding underlying repository through the underlying interface module. Such references to underlying objects may be directly usable by the underlying interface module, or may require some additional processing (e.g. decoding, or selection - if multiple references are provided) by the logic module or the underlying interface module itself. For example, a request from the user to access data related to a requestable object through a namespace entry would involve the logic module sending requests to the multi-namespaces module with data related to the original user request and from the context monitoring parameters module, in order to obtain (a) "requestable object contents location(s)" associated to the desired entry; then, the logic module would feed such "requestable object contents location(s)" to the contents location module, possibly providing additional information from the context monitoring parameters module; the contents locations module would then produce data related to the requestable object referred by the "requestable object contents location(s)", possibly including the references to the underlying objects required by the logic module to perform the requested access to the data through the underlying interface module. A request to access requestable object attributes related with data stored in underlying objects may also result in the contents locations module producing references to underlying objects, to be used by the logic module through the underlying interface module. 
For example, in embodiments offering file system semantics and representing files with requestable objects with data in an actual file in an underlying file system, a request to update the represented file's modification time could result in the logic module obtaining a "requestable object contents location" from the multi-namespaces module, feeding such "requestable object contents location" to the contents locations module, obtaining a reference to the underlying object (the actual file) whose modification time has to be updated, and requesting the modification of such attribute of the underlying file to the underlying interface module. A similar path would be followed, for instance, for requests related to querying or modifying the size of a file represented by a requestable object linked to an underlying object.
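The request path described above (namespace entry, then contents location, then underlying reference, then underlying interface) can be sketched with four toy classes. All class and method names, the handle "loc-7" and the repository name "repo-a" are hypothetical; the sketch only shows the order of the interactions coordinated by the logic module, for a time attribute kept on the underlying file.

```python
from typing import Dict, Tuple


class MultiNamespacesModule:
    def __init__(self) -> None:
        # (namespace, path) -> opaque requestable object contents location
        self.entries: Dict[Tuple[str, str], str] = {}

    def contents_location(self, namespace: str, path: str) -> str:
        return self.entries[(namespace, path)]


class ContentsLocationsModule:
    def __init__(self) -> None:
        # contents location -> (repository, reference to the underlying object)
        self.underlying_refs: Dict[str, Tuple[str, str]] = {}

    def underlying_reference(self, location: str) -> Tuple[str, str]:
        return self.underlying_refs[location]


class UnderlyingInterfaceModule:
    def __init__(self) -> None:
        self.mtimes: Dict[Tuple[str, str], float] = {}

    def set_mtime(self, repository: str, ref: str, mtime: float) -> None:
        self.mtimes[(repository, ref)] = mtime


class LogicModule:
    """Coordinates the other modules; interprets neither handles nor references."""

    def __init__(self, namespaces: MultiNamespacesModule,
                 locations: ContentsLocationsModule,
                 underlying: UnderlyingInterfaceModule) -> None:
        self.namespaces = namespaces
        self.locations = locations
        self.underlying = underlying

    def update_mtime(self, namespace: str, path: str, mtime: float) -> None:
        location = self.namespaces.contents_location(namespace, path)      # step 1
        repository, ref = self.locations.underlying_reference(location)    # step 2
        self.underlying.set_mtime(repository, ref, mtime)                  # step 3


namespaces = MultiNamespacesModule()
locations = ContentsLocationsModule()
underlying = UnderlyingInterfaceModule()
namespaces.entries[("work", "/report.txt")] = "loc-7"
locations.underlying_refs["loc-7"] = ("repo-a", "/data/0001")
logic = LogicModule(namespaces, locations, underlying)
logic.update_mtime("work", "/report.txt", 1700000000.0)
```

A size query would follow the same three steps, with only the final call to the underlying interface module changing.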
A request for accessing requestable object attributes, such requestable objects possibly being related to underlying objects, may not require the contents location module to produce references to the underlying objects, especially when the particular attributes to be accessed are not directly related to relevant underlying object properties; then, the required access can be performed internally by the contents locations module. For example, in a virtual shell offering file system semantics and having the files represented by requestable objects, a request to access some file attributes such as the access permissions can be performed directly by the contents location module without requiring access to underlying repositories, if the contents locations module has its own means to keep such information as attributes of the requestable object. (Of course, the request path would consist, as usual, of the logic module interacting with the multi-namespaces module to obtain a "requestable object contents location" to be fed to the contents location module, together with data about the access to be performed and, possibly, data from the context monitoring parameters module).
In some embodiments where the virtual shell is able to provide file system semantics, the contents location module can keep some of the attributes related to different types of file system objects (regular files, directories, etc.) For example, it may keep security elements (owner, group, access permissions, control lists, etc.), statistical information, extended attributes, etc. As already mentioned, the contents location module can also keep references to underlying objects where additional information related to the requestable object can be stored (e.g. the data of a file represented by a requestable object).
One of the functions of the contents locations module is to convert the "requestable object contents locations" fed into it into direct references to requestable objects, so that the contents locations module can act on the specified and related requestable objects and/or produce data related to them. Several ways can be used to perform this conversion. A "requestable object contents location" can be a direct reference to data in the contents location module related to the corresponding requestable object (e.g. a memory pointer to a data structure, or a key in a database table storing requestable objects data).
Some processing may be needed to convert the "requestable object contents locations" into usable references to requestable objects (e.g. a mapping table can be used to map "requestable object contents locations" into references to requestable objects; another possibility is to apply a certain function to the "requestable object content location" to "decode" it and extract a requestable object reference from the result of the function).
Conversion of a "requestable object contents location" into a reference to a specific requestable object can be influenced by context monitoring parameters. For example, a "requestable object contents location", as mentioned before, could contain references to multiple requestable objects; the specific requestable object to use could be determined by the contents locations module based on context monitoring parameters (e.g. a "requestable object contents locations" coming from an entry representing a file could be resolved into different requestable objects - possibly referring to different underlying files containing different data - depending on the user making the request, or the time of the request).
A "requestable object contents locations" can be generated by the contents location module either upon request or automatically when a requestable object is created, and communicated to other modules (e.g. via the logic module) so that they can be used to refer to the requestable objects when needed (e.g. to have an entry in a namespace to refer to a requestable object). The contents locations module may use a repository (e.g. a database table) to keep data related to the requestable objects, possibly including their attributes. The contents location module can take into account information about underlying repositories (e.g. obtained via the logic module) to decide where and how to organize the underlying objects related to requestable objects. For example, it may decide to select a particular underlying repository to store an underlying object with "pure data" related to a requestable object based on the available capacity of such underlying repository, or may decide to split and/or replicate the data into multiple underlying objects, possibly in multiple underlying repositories, to improve, for instance, the performance or the reliability. The contents location module may be able to produce references to underlying objects and/or underlying repositories that are directly usable by the logic module to access them via the underlying interface module.
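The capacity-based placement decision mentioned above could look like the following sketch. The function name `choose_repository` and the policy (pick the repository with the most free space among those that can hold the object) are assumptions; the text leaves the actual policy open.

```python
from typing import Dict, Optional


def choose_repository(free_bytes: Dict[str, int], needed: int) -> Optional[str]:
    """Pick the repository with the most free space that can hold the object.

    `free_bytes` maps a repository name to its available capacity, as it
    might be reported through the underlying interface module.
    """
    candidates = {name: free for name, free in free_bytes.items() if free >= needed}
    if not candidates:
        return None             # no single repository fits; splitting could be tried
    return max(candidates, key=lambda name: candidates[name])
```

For example, `choose_repository({"a": 10, "b": 50}, 20)` selects "b", while a request that fits nowhere returns `None`, at which point an embodiment might split the data across repositories as described above.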
References to underlying objects from requestable objects may be kept as direct references, and the logic module can simply forward them to the underlying interface module.
The contents locations module may keep references from requestable objects to related underlying objects in such a form that requests containing them may need to be processed in order to be forwarded to the underlying interface module via the logic module (e.g. a reference to an underlying object can be kept in a "repository-agnostic" way, and then be converted into a form adapted to the characteristics of the actual underlying repository where the underlying object resides).
In embodiments of the invention, the contents locations module comprises one or more contents locations sub-modules, each of said contents locations sub-modules being adapted to produce data related to underlying objects comprised in underlying repositories with equivalent properties.
The contents locations module may be able to alter in any way the data being stored to or retrieved from underlying objects related to requestable objects (e.g. encrypting/decrypting and/or compressing/uncompressing data), possibly based on context monitoring parameters. In the same way, it may also alter responses related to requestable object attributes (e.g. clearing the returned value of a "last access time" attribute depending on the user making the request).
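As an illustration of altering data on its way to and from underlying objects, the sketch below compresses on store and uncompresses on retrieve using the standard zlib module. The class name `CompressingStore` and the in-memory dictionary standing in for the underlying objects are assumptions; encryption could be layered in the same two places.

```python
import zlib
from typing import Dict


class CompressingStore:
    """Toy wrapper: callers see plain data, the backend holds compressed data."""

    def __init__(self) -> None:
        self._raw: Dict[str, bytes] = {}    # stands in for underlying objects

    def store(self, ref: str, data: bytes) -> None:
        self._raw[ref] = zlib.compress(data)

    def retrieve(self, ref: str) -> bytes:
        return zlib.decompress(self._raw[ref])

    def stored_size(self, ref: str) -> int:
        """Size actually occupied in the backend, after compression."""
        return len(self._raw[ref])


blobs = CompressingStore()
blobs.store("x", b"a" * 1000)
```

The transformation is invisible to the requester: `blobs.retrieve("x")` returns the original 1000 bytes, although the backend holds far fewer.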
As can be observed, the logic module (as mentioned previously) could also be able to perform totally or partially some of the functionalities described for the contents location module (such as resolving multiple references to requestable objects - and possibly selecting which to use -, resolving multiple references to underlying objects - and possibly selecting which to use -, altering the data being stored to or retrieved from underlying objects, and adapting the references to underlying objects according to specific characteristics of the underlying repositories, just to mention some examples). In preferred embodiments of the invention, when a contents location module is available, the responsibility for such possibly overlapped functionality can be shifted from the logic module to the contents location module. The advantage of this is reducing the complexity of implementation of the logic module by having the contents location module completely deal with aspects related to requestable objects and their related underlying objects. This said, nothing prevents, of course, that in some embodiments the responsibility can be shared (or even replicated) in both modules for performance, reliability, or simply convenience of implementation, for example.
Summarizing the main features of the "requestable object contents locations approach", it has to be emphasized that the addition of the contents location module to the "main approach" allows simplifying the implementation of the virtual shell and increasing the flexibility of the multi-namespaces module, as well as enabling optimizations to requestable object management by having a specialized module focused on them. In particular, the introduction of the "contents location container" in the entries of the namespace, together with the use of "requestable object contents locations" as opaque handles referencing requestable objects, allows isolating the namespace management from any specifics of the objects being referred by the entries. Then, the contents location module may assume the entire burden for handling the requestable objects and related underlying objects, while the logic module can be released from such tasks and focus on the coordination of the different modules and the implementation of the algorithms providing the desired semantics of the virtual shell. It is also important to note that, with the introduction of "requestable object contents locations", the logic module can also be made agnostic with respect to the internals of requestable object management, and even with respect to the specific characteristics of the underlying objects and the underlying repositories (thanks to the contents locations module being able to deal with them).
Alternatively to the "requestable object contents locations approach" of the virtual shell, in some embodiments of the invention supported by Figure 1 and Figure 4:
the multi-namespaces module 118 of the virtual shell 100 is adapted to produce the data related to namespaces 206 further taking into account that each entry 208 of each namespace 206 refers to its related set 210 of requestable objects 211 through one or more 400 virtual contents locations 401, each virtual contents location 401 referring to a set 403 of virtual objects 404 and each virtual object 404 referring to a subset 406 of the related set 210 of requestable objects 211, each of the requestable objects 211 being referred from one or more 405 virtual objects 404 and each virtual object 404 being referred from one or more 402 virtual contents locations 401;
the virtual shell 100 further comprises a virtual objects module 107 for producing data related to one or more references to one or more requestable objects 211 from data related to virtual contents locations 401; and the virtual objects module 107 is adapted to produce the data related to references to requestable objects 211 further taking into account that each virtual object 404 has a set 410 of virtual attributes 409, said virtual attributes 409 being decoupled from attributes 212 related to requestable objects 211;
the multi-namespaces module 118 is adapted to produce the data related to namespaces 206 further taking into account that the organization of each namespace 206 is decoupled from the virtual attributes 409 of the virtual objects 404 referred from the entries 208 of the namespace 206 through the virtual contents locations 401; and
the logic module 109 is adapted to interact 111 with the virtual objects module 107 and to produce the set of data for the interception module 104 further from a set of the data produced by the virtual objects module 107, the interaction 111 with the virtual objects module 107 depending on a subset of the set of the data produced by the context monitoring parameters module 116.
This "virtual objects approach" of the virtual shell introduced in the previous paragraph, has the advantage of improving the flexibility of the "requestable object contents locations approach", as it will be argued in following descriptions.
As said before, the entries of namespaces handled by the multi-namespace module may contain references to requestable objects, and said requestable object may have a set of related attributes and a set of underlying objects (which, for example, may contain "pure data" related to the requestable object). At the same time, each requestable object may be referred from multiple entries, possibly from different namespaces. Nevertheless, the flexibility of the virtual shell may be further improved by overcoming some limitations of the previously described embodiments; for example: attributes of a requestable object are the same independently of the entry used to access the object; nevertheless, it could be interesting to see different attributes (e.g. owner, access permissions, tags, etc.) depending on the specific namespace used to reach the object (or even depending on the specific entry used).
A possible embodiment to provide support for such multiple views of attributes may be having each requestable object attribute adapted to contain a set of values, and tagging each value with an identifier of the entry or entries through which it should be visible. Although such a solution is theoretically possible, it has several drawbacks. First, it further entangles the multi-namespace entities (i.e. namespaces and entries) with the referred objects (the requestable objects), making it difficult to handle both worlds separately and, therefore, making implementations more complex. Second, it is a cumbersome and non-scalable way to handle the different views (as there may be a large number of entries/namespaces, and also a large number of attributes with variations, so selecting the right set of values to use for each particular access could be complex and time-consuming).
By introducing virtual objects as an intermediate entity between the entries of the namespaces and the referred requestable objects, a higher flexibility of the virtual shell is achieved. Following this model, a requestable object may be referred by multiple virtual objects having their own set of virtual attributes (potentially independent from the attributes in the referred requestable objects - i.e. with attributes not present in the requestable object, or with different values) which, in turn, may be referred by multiple entries. Whenever a set of entries (from the same or from different namespaces) needs to refer to a requestable object while maintaining a consistent view of the attributes, said set of entries may refer to the same virtual object which, in turn, may refer to the desired requestable object; on the contrary, if different views need to be provided from a set of entries, it is possible to have said set of entries referring to different virtual objects which, in turn, may refer to a single desired requestable object. Of course, the multiplicity in the other direction can be maintained: a single entry can refer to multiple virtual objects, a single virtual object can refer to multiple requestable objects and, of course, a combination of both the previous situations is also possible.
The introduction of virtual objects has another important advantage: the virtual objects can be operated upon based on requests from the user, as if they effectively were the requestable objects of previous approaches. In particular, this means that they can be fully functional and be used according to any semantics provided by the virtual shell. For example, without virtual objects, it is still possible to modify the values of a requestable object (e.g. "on-the-fly") when responding to user requests to offer alternative views; but this would be a "read-only" alternative view: if the user tries to modify the value of the altered attribute (e.g. access permissions), either the change has to be discarded (so it is not functional) or it is necessary to set it in the requestable object and, therefore, it is changed for all "views" (so the independence between "views" is lost - for example, between separate namespaces). On the contrary, attributes in a virtual object can be accessed in any way (including being read and written) independently of the requestable object being referred, and the logic module can operate independently on them according to the desired semantics (for example, different virtual objects referring to the same requestable object could have different access permissions, and the logic module could allow or disallow access to the requestable object depending on which virtual object has been used as "intermediate" hop).
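The independence between "views" described above can be illustrated with a minimal sketch, in which two entries in different namespaces refer to distinct virtual objects that share one requestable object. All class, attribute and namespace names (`RequestableObject`, `VirtualObject`, `readers`, `work`, `home`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class RequestableObject:
    # Attributes tied to the "pure data" (e.g. its size).
    attributes: dict = field(default_factory=dict)

@dataclass
class VirtualObject:
    # Virtual attributes are kept apart from the requestable object's own.
    target: RequestableObject
    virtual_attributes: dict = field(default_factory=dict)

def may_read(vobj: VirtualObject, user: str) -> bool:
    # Check performed on the intermediate virtual object only.
    return user in vobj.virtual_attributes.get("readers", set())

shared = RequestableObject({"size": 1024})
work_view = VirtualObject(shared, {"owner": "alice", "readers": {"alice", "bob"}})
home_view = VirtualObject(shared, {"owner": "alice", "readers": {"alice"}})

# Two namespaces; each entry refers to a different virtual object.
namespaces = {
    "work": {"/report": work_view},
    "home": {"/docs/report": home_view},
}

# Writing an attribute through one view leaves the other view untouched.
namespaces["home"]["/docs/report"].virtual_attributes["readers"].add("carol")
```

Here the change made through the `home` namespace is fully functional (it is a real write, not a discarded one), yet it does not propagate to the `work` namespace, because the permissions live in the virtual objects and not in the shared requestable object.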
The requestable objects can be used to contain attributes related to "pure data" (e.g., references to underlying objects, if any, or the size of "pure data" associated to the object - either if contained in an underlying object or if generated inside the virtual shell), whereas virtual objects can be used to keep in their virtual attributes all the data that does not depend on the "pure data" associated to the object or on related underlying objects, if any. For example, in a virtual shell offering file system semantics and using an underlying file system to keep the data of the represented files, a requestable object could be used to keep references to the underlying object and related information (e.g. its path in the underlying file system, or the access times), whereas a corresponding virtual object could be used to keep all attributes which are independent from the underlying repositories (e.g. the owner, access permissions, etc.). It is also important to mention that the introduction of virtual objects does not have any impact on the organization of the namespaces. The same techniques mentioned previously that allow having namespace organizations independent from the requestable objects (or their attributes) referred from the entries of the namespace still apply when such references are made through virtual objects; in other words, the namespace organization may also be independent from the virtual objects and their corresponding virtual attributes.
It was mentioned previously that having the multi-namespaces module keep information about the multiple namespaces and about the requestable objects referred from the entries of the namespaces may add complexity to the module and jeopardize its flexibility (as the multi-namespaces module has to understand the characteristics of both sets of entries). Obviously, the addition of virtual objects and their virtual attributes may aggravate this situation. A possible approach to mitigate this issue was also mentioned previously: using an opaque handle to implement the references from entries to the requestable objects (including the intermediate virtual objects), and letting such a handle be interpreted outside the multi-namespaces module.
Each of the entries of namespaces handled by the multi-namespaces module may comprise a content location container to contain such opaque references to objects. In particular, when the entries refer to virtual objects, such content location containers may be able to contain "virtual contents location" data for one or more virtual objects. In this case, the multi-namespaces module may be adapted to produce the data related to namespaces further taking into account that each entry of each namespace comprises said contents location container adapted to contain one or more virtual contents locations.
It was commented previously that some embodiments could use content location containers in the entries to contain "requestable object contents locations". The same content location container can be adapted to contain both types of data ("virtual contents locations" and "requestable object contents locations"), without the multi-namespaces module needing to distinguish between both types of data. In any case, all characteristics and techniques applicable to the previously described content location containers for containing "requestable object contents locations" and the characteristics of the multi-namespaces module to be able to deal with them are also applicable to content location containers for containing "virtual contents locations". Examples of such characteristics and features include the possible use of tags to identify the type of handle in the container, the possibility of encoding additional information in the handle that can be extracted without needing to recover the referred object, the possibility of encoding references to multiple objects (in the case of a "virtual contents location", to multiple virtual objects) in a single handle, the possibility of storing multiple "handles" (possibly of different types) in the same content location container, and the possibility of having empty containers (or, equivalently in this case, having "virtual contents locations" not referring to any virtual objects - i.e. being "empty").
As mentioned above, the "virtual contents locations" are opaque handles from the multi-namespaces module perspective. So, they must be interpreted and handled somewhere else. A first approach would consist of having the logic module deal with them. Despite being possible, that would increase the burden on the logic module, making it more complex than necessary. As an alternative solution, the virtual shell of the invention may comprise a virtual objects module able to interpret and deal with "virtual contents locations".
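The container features listed above (type tags, several handles of different types in one container, several objects encoded in one handle, and empty containers) could be sketched as follows. This is a minimal illustration under assumed names; the tag values and the payload encoding (`b"vobj:17,vobj:42"`) are hypothetical and not prescribed by the described embodiments.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Handle:
    tag: str        # type of handle, e.g. "virtual" or "requestable" (assumed values)
    payload: bytes  # opaque to the multi-namespaces module

@dataclass
class ContentsLocationContainer:
    handles: list = field(default_factory=list)

    def add(self, handle: Handle) -> None:
        self.handles.append(handle)

    def is_empty(self) -> bool:
        return not self.handles

    def by_tag(self, tag: str) -> list:
        # Only the tag is inspected; the payload is never decoded here.
        return [h for h in self.handles if h.tag == tag]

container = ContentsLocationContainer()
container.add(Handle("virtual", b"vobj:17,vobj:42"))  # one handle, two virtual objects
container.add(Handle("requestable", b"robj:7"))
```

The container itself never interprets a payload, so the module holding the entries can store and return handles without knowing which kind of object they refer to.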
The virtual objects module may keep data related to virtual objects, including their virtual attributes (or even the whole virtual objects themselves).
The virtual objects module may keep data related to requestable objects (including their attributes - or even the whole requestable objects themselves) referred from the virtual objects. On request by the logic module, the virtual objects module may produce virtual attributes or other data related to the virtual objects and/or data related to the requestable objects referred from the virtual objects, if any. (Depending on the particular embodiment, either references to the requestable objects, or other type of data related to requestable objects or to underlying objects referred by the requestable objects, if any.)
A virtual object may not have any related requestable object; for example, a virtual object may be able to keep the relevant associated data in virtual attributes, without requiring a requestable object (e.g. a virtual object representing a directory of a POSIX file system that exists in a single namespace, and that has no corresponding data in an underlying repository; another example would be a virtual object implementing a "shortcut" in a Windows-like system: one of the virtual attributes of the virtual object would just keep a path in a namespace, regardless of what such entry represents, or even whether it exists).
A virtual object may refer to multiple requestable objects. Such multiplicity of requestable objects may refer to either alternative versions of data and/or attributes or complementary versions of data and/or attributes (e.g. different requestable objects may contain references and different information on how to access different underlying objects with alternative contents for a file represented by a virtual object; or the different requestable objects may contain directions to different underlying objects containing different fragments of the file represented by the virtual object).
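The two kinds of multiplicity mentioned above (complementary fragments versus alternative versions) might be handled as in the following sketch. The dictionary keys `offset`, `label` and `data`, and the `mode` values, are assumptions made for the example only.

```python
def read_file(requestables: list, mode: str, preferred: str = "") -> bytes:
    if mode == "fragments":
        # Complementary objects: concatenate the fragments in offset order.
        ordered = sorted(requestables, key=lambda r: r["offset"])
        return b"".join(r["data"] for r in ordered)
    if mode == "alternatives":
        # Alternative objects: return the preferred version if present,
        # otherwise fall back to the first one.
        for r in requestables:
            if r.get("label") == preferred:
                return r["data"]
        return requestables[0]["data"]
    raise ValueError(f"unknown mode: {mode}")

fragments = [
    {"offset": 5, "data": b"world"},
    {"offset": 0, "data": b"hello"},
]
versions = [
    {"label": "draft", "data": b"v1"},
    {"label": "final", "data": b"v2"},
]
```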
References (either direct or indirect) to requestable objects related to virtual objects can be considered as virtual attributes of the virtual objects.
The virtual objects module can be responsible for providing such references to requestable objects to the logic module upon request, so that the logic module can perform additional operations on them, or feed them to additional modules for further processing.
For example, a request from the user to access data related to a virtual object through a namespace entry would involve the logic module sending requests to the multi-namespaces module with data related to the original user request and from the context monitoring parameters module, in order to obtain (a) "virtual contents location(s)" associated to the desired entry; then, the logic module would feed such "virtual contents location(s)" to the virtual objects module, possibly providing additional information from the context monitoring parameters module; the virtual objects module could then produce data related to the virtual object(s) referred by the "virtual contents location(s)", possibly including the references to related requestable objects; at that point, the logic module may decide to further interact with additional modules to act upon the obtained requestable object(s) in order to perform the user request. Such additional modules may comprise the virtual objects module again (e.g. when the requestable objects are also handled and kept by the virtual objects module); in that case, the virtual objects module may perform the final desired action on the corresponding requestable object without the intermediate step of sending a requestable object reference back to the logic module.
A request for accessing attributes of virtual objects, such virtual objects possibly being related to requestable objects, does not necessarily require the virtual objects module to produce references to the requestable objects, especially when the particular attributes to be accessed are not directly related to relevant requestable object or underlying object properties; then, the required access can be performed internally by the virtual objects module. For example, in a virtual shell offering file system semantics and having the files represented by virtual objects, a request to access some file attributes such as the access permissions can be performed directly by the virtual objects module without having to refer to underlying requestable objects.
In case of a virtual object being related to requestable objects, such requestable objects can be managed by the virtual objects module in the same way as they were managed by the multi-namespaces module and/or the logic module in the previously described embodiments.
When the virtual objects module is adapted to deal with and keep information about the requestable objects, the responsibility for such possibly overlapped functionality can be shifted to the virtual objects module. The advantage of this is reducing the complexity of implementation of both the logic module and the multi-namespaces module by having the virtual objects module to completely deal with aspects related to requestable objects.
One of the functions of the virtual objects module may be converting the "virtual contents locations" fed into it into direct references to virtual objects, so that the virtual objects module can act on the specified and related virtual objects, or produce data related to them. In general, all the mechanisms that were described in previous descriptions allowing the contents location module to convert a "requestable object contents location" into a direct reference to a requestable object can also be used by the virtual objects module to convert a "virtual contents location" into a direct reference to a virtual object. The conversion of a "virtual contents location" into a reference to a specific virtual object can be influenced by context monitoring parameters. For example, a "virtual contents location", as mentioned before, could contain references to multiple virtual objects; the specific virtual object to use could be determined by the contents locations module based on context monitoring parameters.
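As a purely illustrative sketch of a conversion influenced by context monitoring parameters, a single handle could encode several candidate virtual objects, with the resolving module selecting one based on the requesting user. The textual encoding (`key=id` pairs separated by `;`, with `*` as a fallback) and the `user` context key are hypothetical and assumed only for this example.

```python
def resolve(virtual_contents_location: str, context: dict) -> str:
    """Pick one virtual object identifier from a handle that encodes several."""
    # Assumed encoding: "alice=vobj-1;*=vobj-2", where "*" is the default.
    candidates = dict(
        pair.split("=", 1) for pair in virtual_contents_location.split(";")
    )
    return candidates.get(context.get("user"), candidates["*"])

handle = "alice=vobj-1;*=vobj-2"
```

With this encoding, the module that stores the handle never needs to parse it; only the resolving module knows the format.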
The "virtual contents locations" can be generated by the virtual objects module either upon request or automatically when a virtual object is created, and communicated to other modules (e.g. via the logic module) so that they can be used to refer to the virtual objects when needed (e.g. to have an entry in a namespace refer to a virtual object).
The virtual objects module may use a repository (e.g. a database table) to keep data related to the virtual objects, possibly including their virtual attributes.
In case of the virtual objects module also keeping data related to the requestable objects, the virtual objects module may use a repository (e.g. a database table) to keep data related to the requestable objects, possibly including their attributes.
In case of the virtual objects module being able to deal with both data related to virtual objects and requestable objects, the virtual objects module may use the same structures to internally represent both types of objects. In particular, if the virtual objects module uses repositories to keep data related with virtual objects and data related with requestable objects, then the same repository (e.g. a database table) could be used to keep data about both types of objects and their attributes.
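A shared repository of the kind described above might, for instance, be sketched with an in-memory SQLite database in which one table holds both virtual and requestable objects and a second table holds their attributes. The table names, column names and sample values are assumptions for illustration and do not reflect any particular embodiment.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# One table for both kinds of object; "kind" distinguishes them.
db.execute(
    "CREATE TABLE objects (id INTEGER PRIMARY KEY, kind TEXT NOT NULL, refers_to INTEGER)"
)
db.execute("CREATE TABLE attributes (object_id INTEGER, name TEXT, value TEXT)")

db.execute("INSERT INTO objects VALUES (1, 'requestable', NULL)")
db.execute("INSERT INTO objects VALUES (2, 'virtual', 1)")  # virtual object -> requestable 1
db.executemany(
    "INSERT INTO attributes VALUES (?, ?, ?)",
    [(1, "path", "/srv/store/0001"), (2, "owner", "alice")],
)

def attributes_of(object_id: int) -> dict:
    # The same query serves virtual and requestable objects alike.
    rows = db.execute(
        "SELECT name, value FROM attributes WHERE object_id = ?", (object_id,)
    )
    return dict(rows.fetchall())
```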
The virtual objects module can be able to alter in any way the data being accessed through related requestable objects (including, for example, "pure data" and/or attributes) possibly based on context-monitoring parameters. For example, it could digitally sign and/or verify digital signatures of data being accessed through a particular virtual object.
Summarizing the main features of the "virtual objects approach", it has to be emphasized that the addition of virtual objects module to the "requestable object contents locations approach" allows increasing the flexibility of the virtual shell by allowing further decoupling between the multiple namespaces and the requestable objects, allowing the implementation of really independent namespaces, where operations through a particular namespace do not need to affect the other namespaces. The use of the "contents location container" in the entries of the namespaces and the use of "virtual contents locations" as opaque handles referencing virtual objects also contribute to release the multi-namespaces module from having to know about any specifics of the objects being referred by the entries. Additionally the virtual objects module may partially assume the burden for handling requestable objects while the logic module can be partially released from that task.
In preferred embodiments of the invention according to the "virtual objects approach" of the virtual shell, said embodiments being supported by Figure 1 and Figure 5:
the virtual objects module 107 of the virtual shell 100 is adapted to produce the data related to references to requestable objects further taking into account that each virtual object 404 refers to its related subset 406 of requestable objects 211 through one or more 500 requestable object contents locations 501, each requestable object contents location 501 referring to a subset 503 of the related subset 406 of requestable objects 211, and each of the requestable objects 211 being referred from one or more 502 requestable object contents locations 501;
the virtual shell 100 further comprises a contents locations module 110 for producing data related to one or more references to one or more requestable objects 211 from data related to requestable object contents locations 501; and
the logic module 109 is adapted to interact 115 with the contents locations module 110 and to produce the set of data for the interception module 104 further from a set of the data produced by the contents locations module 110, said interaction 115 with the contents locations module 110 depending on a subset of the set of the data produced by the context monitoring parameters module 116.
This "virtual objects and requestable object contents locations approach" of the virtual shell, introduced in the previous paragraph, has the advantage of improving the flexibility of the "main approach", of the "requestable object contents locations approach" and of the "virtual objects approach" as it will be argued in following descriptions.
In the "virtual objects approach", the virtual objects module keeps information about both the virtual objects and about the requestable objects referred from the virtual objects. Additionally, it must be noted that requestable objects may be complex entities, as they may contain references to underlying objects contained into underlying repositories. Thus, the virtual objects module would have to understand the characteristics of both virtual and requestable objects and, possibly, be also able to interact with underlying objects that may exist on underlying repositories with different characteristics. All these factors may introduce certain complexities to the implementation and/or potential extensibility of the virtual objects module.
The "requestable object contents locations approach" introduces a "contents location module" specialized in handling requestable objects and "requestable object contents locations" (opaque handles referencing requestable objects), improving the flexibility of the virtual shell in relation to the "main approach". Nevertheless, this "requestable object contents locations approach" may imply certain inflexibility when, for example, performing actions on objects through a given namespace (e.g. changing an attribute - for instance, a tag - associated to an object) and producing the corresponding effects isolated from other namespaces while keeping the intended functionality (i.e., following the same example, having the tag appear unchanged in other namespaces, while having the desired functionality - for instance, to generate a categorization - in the namespace where said tag has been changed). In the embodiments according to the "virtual objects approach", said inflexibility is addressed by the introduction of virtual objects as intermediaries between entries and requestable objects, and by having a virtual objects module separated from the multi-namespaces module to handle the virtual objects and the related objects (namely requestable objects and related underlying objects). However, said "virtual objects approach" may be further improved by adding some kind of "knowledge" to the virtual objects about the characteristics of the requestable objects and related data.
A possibility to improve, even more, both flexibility and simplicity (and thus, ease of extensibility) of the virtual shell may be a combination of the principles of both solutions: the "requestable object contents locations approach" and the "virtual objects approach". A way to achieve this would be having the entries of the namespaces refer to virtual objects by means of "virtual contents locations", which could be handled by the virtual objects module, and then having the virtual objects refer to their related requestable objects by means of "requestable object contents locations", which could be handled by the contents location module, together with their related underlying objects, if any. Thus, a virtual object may comprise a contents location container able to contain one or more of such "requestable object contents locations". It must be noted that, since such a contents location container is associated to each particular virtual object, it can also be considered as a virtual attribute of the virtual object. Then, the virtual objects module (107) may be adapted to produce the data related to references to requestable objects further taking into account that each virtual object comprises said contents location container adapted to contain one or more requestable object contents locations.
The contents location containers introduced in the previous paragraph may have the same characteristics and functionalities as the contents location containers of the "requestable object contents locations approach". As a matter of fact, the same structure could be used to implement both contents locations containers. So, any characteristics and techniques described so far related to contents locations containers associated to entries may also be applied to contents locations containers associated to virtual objects.
In an analogous way, any previously described characteristics and techniques related to either "virtual contents locations" and/or "requestable object contents locations" when used with contents locations containers associated to entries may also be applied to "virtual contents locations" and/or "requestable object contents locations" when used with contents locations containers associated to virtual objects. In other words, the same type of "virtual contents locations" could be used for both types of containers, and the same type of "requestable object contents locations" could also be used for both types of containers.
Furthermore, in embodiments where virtual objects and requestable objects are represented internally with the same structures, the "virtual contents locations" and the "requestable object contents locations" could have the same format and/or characteristics (e.g. they could be encoded in the same way, or have the same mechanism for conversion into actual references to either virtual or requestable objects).
Then, a request from a user asking to perform a certain access to an object may result in: the logic module interacting with the multi-namespaces module to resolve a "path" against a certain namespace to locate a particular entry (possibly taking into account context monitoring parameters) in order to obtain a "virtual content location"; the logic module interacting with the virtual objects module, feeding the "virtual content location" to the virtual objects module (possibly taking into account context monitoring parameters), and the virtual objects module performing the required access to the corresponding virtual object if possible, or returning a "requestable object contents location", in case of said "requestable object contents location" being necessary to perform the required access; in the case of the virtual objects module returning said "requestable object contents location", the logic module interacting with the contents locations module, feeding the "requestable object contents locations" to the contents location module (possibly taking into account context monitoring parameters); and the contents locations module performing the required access to the corresponding requestable object (including, if necessary, actions upon related underlying objects carried out by feeding the necessary information to the logic module, so that the logic module can access the underlying repositories through the underlying interface module, if necessary).
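The sequence of interactions described above can be summarized in a minimal sketch in which the logic module coordinates the three other modules. Class names, method names, the handle strings (`vcl-1`, `rocl-9`) and the in-memory dictionaries are all hypothetical; context monitoring parameters are passed through but not used, and access to underlying repositories is omitted.

```python
class MultiNamespaces:
    def __init__(self, namespaces):
        self.namespaces = namespaces

    def resolve(self, namespace, path, context):
        # Resolve a path to the entry's "virtual contents location".
        return self.namespaces[namespace][path]

class VirtualObjects:
    def __init__(self, objects):
        self.objects = objects

    def access(self, vcl, attribute, context):
        vobj = self.objects[vcl]
        if attribute in vobj["virtual_attributes"]:
            # The access can be served by the virtual object itself.
            return ("done", vobj["virtual_attributes"][attribute])
        # Otherwise hand back a "requestable object contents location".
        return ("needs_requestable", vobj["requestable_location"])

class ContentsLocations:
    def __init__(self, objects):
        self.objects = objects

    def access(self, rocl, attribute, context):
        return self.objects[rocl][attribute]

class Logic:
    def __init__(self, ns, vo, cl):
        self.ns, self.vo, self.cl = ns, vo, cl

    def handle(self, namespace, path, attribute, context):
        vcl = self.ns.resolve(namespace, path, context)
        status, value = self.vo.access(vcl, attribute, context)
        if status == "done":
            return value
        return self.cl.access(value, attribute, context)

logic = Logic(
    MultiNamespaces({"home": {"/report": "vcl-1"}}),
    VirtualObjects({"vcl-1": {"virtual_attributes": {"owner": "alice"},
                              "requestable_location": "rocl-9"}}),
    ContentsLocations({"rocl-9": {"size": 1024}}),
)
```

In this sketch a request for `owner` never reaches the contents locations module, whereas a request for `size` passes through all three stages.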
All characteristics, techniques and functionalities about the contents locations module described for previously mentioned embodiments (including, but not limited to, any handling of underlying objects and/or underlying repositories) may also apply to the contents location module in embodiments where the virtual shell also comprises a virtual objects module. The virtual objects module could be adapted to deal with and keep data (e.g. attributes) about requestable objects, and even about their related underlying objects, if any. In general, all characteristics, techniques and functionalities about the virtual objects module described for previously mentioned embodiments may also apply to the virtual objects module in embodiments where the virtual shell also comprises a contents location module. Nevertheless, in preferred embodiments, when both a contents locations module and a virtual objects module are available, the responsibility of dealing with requestable objects and any related underlying objects can be shifted to the contents location module, while the virtual objects module can be left with just the responsibility to deal with virtual objects. This has the advantage of simplifying the virtual objects module by removing its need to deal with aspects possibly related to access to underlying repositories.
As mentioned in previously described embodiments, different modules may use repositories (e.g. databases) to keep data related to the objects they manage. In some embodiments, different modules may share repositories to keep their data. For example, if, in a particular embodiment, the structures for virtual objects and requestable objects are the same, the contents locations module and the virtual objects module could use the same repository to keep all or part of their data. Depending on the particular implementations, these solutions may either be equivalent, or have positive or negative effects in several aspects (for example, in terms of performance).
In conclusion, the addition of the virtual objects module to the virtual shell, in combination with a contents locations module, increases the flexibility of the virtual shell by allowing the implementation of truly independent namespaces (through further decoupling between the multiple namespaces and the requestable objects) and, at the same time, simplifies the implementation of the virtual shell by concentrating the burden of dealing with the specifics of potentially different underlying repositories into the contents locations module, thus isolating other components (namely the multi-namespaces module, the virtual objects module and, to a certain extent, the logic module) from aspects which may be considered external to the virtual shell.
In preferred embodiments:
the virtual shell 100 further comprises a non-context monitoring parameters module 105 for producing data related to one or more non-context monitoring parameters 204;
the logic module 109 is adapted to interact 108 with the non-context monitoring parameters module 105 and to produce the set of data for the interception module 104 further from a set of the data produced by the non-context monitoring parameters module 105; and
the logic module 109 is adapted to interact 114 with the multi-namespaces module 118 further depending on a subset of the set of the data produced by the non-context monitoring parameters module 105.
Some of the described embodiments are focused on providing file system semantics to the user, so that applications can run on top of the virtual shell as if the virtual shell were a regular file system. For example, some embodiments of the virtual shell provide a POSIX interface for the applications to deal with the file systems in a Unix-like environment; some embodiments developed for Windows-based operating systems provide an NTFS-like interface for the applications to deal with the virtual shell as a file system. As the capacity of emulating file systems is one of the goals of some embodiments of the invention, said embodiments take into account a minimum set of monitoring parameters required to implement such specific semantics (e.g. user identity, current time, handling of request history to track related requests, or capacity of data repositories), which have been named "context monitoring parameters". Nevertheless, there are additional monitoring parameters that can be captured, generated and/or used by a virtual shell to provide additional features; in order to distinguish them from the minimum set mentioned above, we call these additional monitoring parameters "non-context monitoring parameters". In general, any previously described principle referring to context monitoring parameters can be applied to non-context monitoring parameters, and vice versa (unless explicitly specified otherwise).
In some embodiments, any module able to take any decision or alter any procedure and/or data based on data related to context monitoring parameters (e.g. the multi-namespaces module) may also be able to take any decision or alter any procedure and/or data based on data related to non-context monitoring parameters. Adding non-context monitoring parameters to any decision procedure does not require any complex additional infrastructure: it only requires being able to accept new pieces of data as inputs of the functions taking the decisions.
In some embodiments, any module able to interact with the context monitoring parameters module (e.g. the logic module) may also be able to interact with a non-context monitoring parameters module in an equivalent way. From now on, when referring to "monitoring parameters" in any following description, we will refer to context monitoring parameters and non-context monitoring parameters without distinction.
In some embodiments, the context monitoring parameters module and the non-context monitoring parameters module could be united in a combined monitoring parameters module being able to produce data related to context and/or non-context monitoring parameters.
In some embodiments, the list of connected networks and their characteristics are considered non-context monitoring parameters. This type of data can be obtained, for example, from the operating system using system calls or specific commands. As using different connections may have different implications (e.g. a wifi connection is usually fast, unlimited and free, while a 3G connection is usually slow, may have limits on allowed use, and usually has to be paid for), such a monitoring parameter could be used, for example, by a multi-namespaces module to select a namespace containing versions of files with reduced size (e.g. low resolution images, or low quality audio files) when the network available is a 3G network, and to use a namespace with full-sized versions of files when a wifi network is available.
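As a hedged illustration of the namespace selection just described, the following Python sketch picks a namespace from a list of connected networks. The namespace names, the dictionary layout of a network description and the fallback to the reduced-size namespace when no wifi network is present are assumptions made for the example only.

```python
# Hypothetical namespace names, chosen for this example only.
FULL_SIZE_NAMESPACE = "full-size"
REDUCED_SIZE_NAMESPACE = "reduced-size"


def select_namespace(connected_networks):
    """Choose a namespace from the list of connected networks.

    Each network is assumed to be a dict with a "kind" key such as
    "wifi" or "3g"; how the list is obtained from the operating system
    is outside the scope of this sketch.
    """
    kinds = {network["kind"] for network in connected_networks}
    if "wifi" in kinds:
        return FULL_SIZE_NAMESPACE
    # Assumption: anything other than wifi is treated as constrained.
    return REDUCED_SIZE_NAMESPACE
```

The function is just one more input to the decision taken by the multi-namespaces module; no additional infrastructure is implied.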
In some embodiments, the geographical location of the user issuing a request is considered a non-context monitoring parameter. The geographical location of the computer can be obtained in several ways: for example, getting the coordinates from a built-in GPS and checking whether these coordinates fall inside a given geographical area, getting the approximate geographical location from the user's current IP address, or having the user feed that information through a specific user request intercepted by the virtual shell (other ways may also be conceived). Some embodiments could use the geographical location of the user to select specific namespaces when creating and accessing objects representing files so that, for example, a file can only be accessed when the user is in the country where the file was created (as may be required in some cases for medical files, for example).
In some embodiments, the monitoring parameters may affect the interaction with namespaces and/or the interaction with requestable objects and/or the interaction with any related underlying objects. For example, the value of certain monitoring parameters when a requestable object is created (e.g. geographical location of the user) may be recorded as an attribute of the requestable object, so that it can be used later to compare with current values of monitoring parameters and make a selection. Note that when requestable objects are handled by a module other than the multi-namespaces module (e.g. the virtual objects module), said other module may also be able to accept monitoring parameters as part of the interactions in which it participates. In some embodiments where virtual objects are available, the monitoring parameters may affect the interaction with namespaces and/or the interaction with virtual objects and/or the interaction with requestable objects and/or any related underlying objects. In that case, the virtual objects module may be able to accept monitoring parameters as part of its interaction with other modules. For example, the value of certain monitoring parameters when a virtual object is created may be recorded as an attribute of the virtual object, so that it can be used later to compare with current values of monitoring parameters and make a selection (as commented before for requestable objects in other embodiment descriptions).
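The idea of recording a monitoring parameter at creation time and comparing it later can be sketched as follows. This is a minimal Python illustration; the attribute name "creation_country", the "country" key of the monitoring data and the equality-based policy are hypothetical choices, not part of any described embodiment.

```python
from dataclasses import dataclass, field


@dataclass
class RequestableObject:
    reference: str                      # reference to the underlying object
    attributes: dict = field(default_factory=dict)


def create_object(reference, monitoring):
    """Create an object, recording the current country as an attribute."""
    obj = RequestableObject(reference)
    obj.attributes["creation_country"] = monitoring["country"]
    return obj


def may_access(obj, monitoring):
    """Allow access only from the country where the object was created."""
    return obj.attributes.get("creation_country") == monitoring.get("country")
```

The same pattern would apply unchanged to virtual attributes of virtual objects.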
In some embodiments, any module may use monitoring parameters (either context or non-context) as input to take any decisions or to alter its behaviour in any way. In other words, any module can be easily adapted to accept monitoring parameters (context or non-context) as part of its interaction with other modules.
The use of non-context monitoring parameters allows the virtual shell to automatically tune its behaviour according to a large number of environment conditions (e.g. hardware capacity, network availability, geographical location, etc.). Thanks to this, the virtual shell can adapt to changing conditions and isolate the user from changes in the operating environment as much as possible.
In preferred embodiments of the invention, the logic module 109 is adapted to interact 115 with the contents locations module 110 further depending on a subset of the set of the data produced by the non-context monitoring parameters module 105.
That is to say, non-context monitoring parameters may also affect the interactions and accesses to objects handled by the contents locations module (e.g. requestable objects and/or their related underlying objects, if any). For example, in some embodiments, the contents locations module may use a non-context monitoring parameter such as the network availability to decide which underlying object to use when several underlying objects containing data replicas are associated to a particular requestable object (e.g. it may decide to use an underlying object residing in a local underlying repository if the network is not available, instead of an underlying object in a remote underlying repository).
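A possible replica selection policy along these lines is sketched below in Python. The representation of a replica as a dict with a "local" flag, and the choice to prefer a remote replica whenever the network is available, are assumptions made only to keep the example concrete.

```python
def choose_underlying_object(replicas, network_available):
    """Pick one replica of a requestable object's data.

    Each replica is assumed to be a dict with a boolean "local" key.
    Returns None when no usable replica exists.
    """
    local = [r for r in replicas if r["local"]]
    remote = [r for r in replicas if not r["local"]]
    # Assumed policy: use the remote repository when it is reachable.
    if network_available and remote:
        return remote[0]
    # Without a network, only a local underlying repository can be used.
    if local:
        return local[0]
    return None
```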
In some embodiments, the contents locations module may use a non-context monitoring parameter, such as the geographical location of the user, to decide in which underlying repository to create underlying objects related to user requests (e.g. it may choose a remote repository geographically near the current location of the user for improved performance).
In some embodiments of the invention, the virtual objects module 107 comprises one or more virtual objects sub-modules, each of said virtual objects sub-modules being adapted to access virtual attributes 409 of virtual objects 404 and/or attributes 212 of requestable objects 211, each of said attributes 409; 212 having a volatility level according to a predetermined volatility classification.
Values of virtual attributes of virtual objects may have different levels of volatility, from immutable values to highly volatile attributes. It was mentioned in the background art section that traditional trends tended to group metadata (e.g. attributes) together, so that, being usually small, they could be packed and transferred in an efficient way. Nevertheless, such packing also has negative consequences: for example, updating an attribute usually implies the implementation locking the attribute set in order to keep the consistency; but at the same time, this may prevent another process from accessing an unmodified attribute while another attribute is being updated (lock contention). In distributed systems, the effects are even more noticeable. For example, the virtual objects module could be implemented in a distributed way (e.g. using a local component interacting with a remote repository to access the required data); a usual technique in such implementations is to use a cache mechanism to avoid unnecessary transfers and, in some embodiments, this could involve keeping a local copy of virtual attributes of a virtual object from the remote repository. The issue arises when frequently used attributes with low volatility levels are packed and/or handled together with rarely needed attributes with high volatility: changes in the rarely needed attributes may require the invalidation of the group of cached attributes, removing also the highly needed - and probably unmodified - low volatility attributes (thus limiting the effectiveness of some techniques such as caching).
File systems are a clear example where this may occur. For example, most POSIX-based file systems use an internal structure called i-node keeping all the information related to a file system object (file, directory, etc.). Such information includes some immutable information which is needed very often (e.g. the i-node identifier, and the file object type), information which is also used very often (possibly by multiple clients) and rarely updated (e.g. the owner, the group, and the access permissions) and information which is rarely needed but updated very often (e.g. the file size, or the access and modification times). In a virtual shell offering POSIX-based file system semantics, the i-node could be, in some embodiments, represented by a virtual object, the information of the i-node being the virtual attributes. If the virtual objects module treats the virtual attributes of the virtual object as a unit, the issues mentioned in the previous paragraph may arise.
A solution for said type of situations may be to classify the attributes into "volatility levels" according to how often they are updated (and possibly also taking into account how often they are needed). Using such classification, a virtual objects module can treat each set of attributes in a specific way, according to their characteristics (for example, it may use different caching policies - e.g. with different lease times, etc.). In particular, some embodiments may provide a virtual objects module having virtual objects sub-modules specialized in treating each set of virtual attributes according to a certain volatility classification.
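A volatility classification with per-level caching policies could look like the following Python sketch. The three levels, the lease times and the classification of the i-node-like attributes are illustrative assumptions; only the general idea (separate handling per volatility level) comes from the description.

```python
import time
from enum import Enum


class Volatility(Enum):
    IMMUTABLE = "immutable"
    LOW = "low"
    HIGH = "high"


# Assumed lease times in seconds; None means the value never expires.
LEASE_SECONDS = {Volatility.IMMUTABLE: None,
                 Volatility.LOW: 300.0,
                 Volatility.HIGH: 1.0}

# Assumed static classification of i-node-like virtual attributes.
CLASSIFICATION = {"inode_id": Volatility.IMMUTABLE,
                  "type": Volatility.IMMUTABLE,
                  "owner": Volatility.LOW,
                  "mode": Volatility.LOW,
                  "size": Volatility.HIGH,
                  "mtime": Volatility.HIGH}


class AttributeCache:
    """Keeps one independent store per volatility level."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._stores = {level: {} for level in Volatility}

    def put(self, name, value):
        level = CLASSIFICATION[name]
        self._stores[level][name] = (value, self._clock())

    def get(self, name):
        level = CLASSIFICATION[name]
        entry = self._stores[level].get(name)
        if entry is None:
            return None
        value, stored_at = entry
        lease = LEASE_SECONDS[level]
        if lease is not None and self._clock() - stored_at > lease:
            del self._stores[level][name]   # lease expired
            return None
        return value

    def invalidate_level(self, level):
        # Invalidating one level leaves the other levels untouched.
        self._stores[level].clear()
```

Because each level has its own store, a burst of updates to "size" or "mtime" never evicts the cached "owner" or "mode" values, which is the benefit the classification is meant to provide.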
The classification of virtual attributes in volatility levels may be done a priori and be static (i.e. a virtual attribute of a virtual object has a pre-defined volatility level) or it may be dynamic (i.e. the classification of a virtual attribute of a virtual object may change according to its use - and therefore, its handling may be delegated to different sub-modules at different times). It is important to mention that all considerations about volatility of virtual attributes of virtual objects are also applicable to attributes of requestable objects.
In some embodiments, if the virtual objects module is also in charge of keeping information about requestable objects, the virtual objects sub-modules dedicated to specific volatility levels could also be used to handle attributes of requestable objects.
In some embodiments, where a contents locations module is in charge of keeping information about requestable objects, the contents locations module could comprise contents locations sub-modules dedicated to handle attributes of requestable objects with specific volatility levels, analogously to the virtual objects sub-modules dedicated to handle virtual attributes with specific volatility levels.
Preferably, the virtual objects module 107 comprises a contents locations virtual objects sub-module being adapted to access virtual attributes 409 of virtual objects 404, said virtual attributes 409 referring to requestable objects 211.
In some embodiments, the requestable objects can be used to keep low level data and attributes (e.g. those referring to underlying objects); such attributes and related data could be independent from the namespaces or particular entries (e.g. possible attributes of a requestable object related to an underlying file system could be the path of an underlying file in an underlying file system - the reference to the underlying object - and possibly the file size, or the time of the last modification; such values do not depend on the particular entry or namespace used to access the object, and a change to them is likely to have a global effect across namespaces); moreover, the requestable objects, by further possibly depending on underlying objects, may require complex protocols for updating certain attributes (e.g. the value of an attribute - e.g. the size of an underlying object - may be required to be kept consistent with the actual size of the related underlying object). On the contrary, virtual objects can be used to keep high level virtual attributes which may have a more restricted scope and may effectively depend on the particular entry or namespace used to access them, while other virtual attributes may be related to or refer to requestable objects and, possibly being of global scope, may have stricter requirements of consistency guarantees. In particular, this may involve different needs regarding consistency and, depending on the embodiment, lock management. Therefore, in some embodiments, the virtual objects module may have a sub-module (the contents locations virtual objects sub-module) specialized in dealing with virtual attributes of virtual objects related to requestable objects. Such sub-module may implement the appropriate means for manipulating the references to requestable objects and related data with the adequate consistency guarantees (e.g. by using an appropriate locking protocol when performing updates to virtual attributes related to requestable objects). On the contrary, virtual attributes not related to requestable objects may be able to use simplified consistency mechanisms, if any at all.
In a preferred embodiment of the invention, the virtual objects module 107 comprises a non-contents locations virtual objects sub-module being adapted to access virtual attributes 409 of virtual objects 404, said virtual attributes 409 not referring to requestable objects 211.
As mentioned before, virtual attributes referring to requestable objects may require specialized protocols for handling, so that, in some embodiments, the virtual objects module may comprise a contents locations virtual objects sub-module specialized to deal with such particular type of virtual attributes.
In a similar way, in some embodiments, the virtual objects module comprises a non-contents locations virtual objects sub-module, specialized in dealing with virtual attributes which are independent from requestable objects and related data. Being aware of the independence from related requestable objects, such sub-module may implement optimizations in both the way to access and the way to keep data about said virtual attributes (e.g. said sub-module may handle accesses to such virtual attributes without requiring locks, or it may keep them in a separate repository, so that accessing them does not interfere with accesses related to requestable objects which, possibly, may require more complex access methods).
In some embodiments, the virtual objects module 107 comprises an exclusive process for each virtual object 404, said exclusive process having exclusivity for performing updating access to virtual attributes 409 of said virtual object 404.
The virtual shell of the invention may work in an environment where more than one user may issue requests simultaneously.
It is not uncommon that certain requests may affect several virtual objects (for example, in a virtual shell offering file system semantics and representing file system objects as virtual objects, a file creation may involve initializing the new virtual object and modifying the object representing the parent directory); in other occasions, different requests may cause conflicts because they need to access the same virtual object in different ways (for example, trying to create a child of a directory while the directory is being listed).
A well known technique to deal with these situations is using explicit locking mechanisms: the code implementing a request tries to acquire locks on the required data structures; if the locks are granted, the code may continue; otherwise, it has to wait until the current lock owner releases them. Nevertheless, in situations with lots of activity, or when large groups of structures have to be manipulated, this technique can lead to lock contention. Even with no conflicts, acquiring and releasing locks may produce a certain overhead.
The virtual objects module can use a novel approach to deal with concurrent activity: instead of locking data structures to allow their use from different "threads" or "processes", each data structure has an assigned "thread" or "process" which is the only one with permissions to modify the data.
In embodiments of the virtual shell, the virtual objects module may associate one of these "exclusive processes" to each virtual object (e.g. representing a file system object), and such "exclusive process" would be the only one with permissions to modify the virtual attributes of the virtual object associated to said "exclusive process" (e.g., the data related to requestable objects pointing to underlying files containing the represented file's data - if any -, the target path of a symbolic link - if the virtual object represents a symbolic link - and directory specific fields - if the virtual object represents a directory).
In some embodiments, other processes may be allowed to read virtual attributes that do not belong to their associated virtual objects (e.g. by directly accessing them from a repository - for instance, a database). This information is usually treated as a hint. Nevertheless, the values of certain virtual attributes can be considered safe, even when read by non-owner processes, because the authorized "exclusive processes" modify them in a particular order that allows checking their correctness (e.g. using version numbers and non-reusable object identifiers is a well-known technique that allows the implementation of some of these checking methods). The fact that only one "exclusive process" is allowed to modify a specific virtual attribute of a particular virtual object, together with a strict ordering in which certain virtual attributes are updated, avoids the need to enclose certain operations in large and costly synchronization procedures (such as database transactions), even when virtual attributes are read from non-owner processes. For example, when a new file is created, the virtual attributes of the virtual object representing the new file are updated first, then the virtual object corresponding to the parent directory, and finally an entry is added to the namespace; therefore, if an external process is able to reach the object by following a path, it can be sure that the related virtual attributes have already been set and are consistent.
In some embodiments, whenever a request arrives at the virtual objects module to update a virtual object, it is forwarded to the "exclusive process" associated to the target virtual object. If the request involves a modification, then the process performs such modification (e.g. by updating a repository with virtual attributes data such as, for instance, a database); otherwise, the requested data is included in the response. Once the process sends the response back to the caller, it may wait for the next request. In cases where several virtual objects are involved (for example, removing a file, which may affect the virtual object representing the file itself and the virtual object representing the parent directory) the request is sent to the "main" object's "exclusive process" (the directory, in the example) which may interact with the "exclusive process" of the other involved virtual object (the file to be removed, for example) by any necessary means (e.g. by exchanging messages).
In some embodiments, the implementation of "exclusive processes" associated to single virtual objects, used by the virtual objects module, has been done using the Erlang/OTP environment, which provides support for extremely lightweight processes (a single Erlang node may handle up to a few million processes). In such environments, the data related to virtual objects (including virtual attributes) was kept in a repository based on the Mnesia database (a database provided with the Erlang/OTP environment). Of course, an implementation following the same principles could also be done using different programming languages, and different database support.
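To make the request forwarding concrete, the following Python sketch emulates "exclusive processes" with one thread and one message queue per virtual object. It is not the Erlang/OTP implementation mentioned above: Python threads are far heavier than Erlang processes, and the class names, the message format and the in-memory attribute store are assumptions made purely for illustration.

```python
import queue
import threading


class ExclusiveProcess(threading.Thread):
    """The only writer of the virtual attributes of one virtual object."""

    def __init__(self, object_id):
        super().__init__(daemon=True)
        self.object_id = object_id
        self._attributes = {}        # touched only by this thread
        self._inbox = queue.Queue()

    def run(self):
        while True:
            request, reply = self._inbox.get()
            if request is None:      # stop message
                reply.put(None)
                return
            kind, name, value = request
            if kind == "update":
                self._attributes[name] = value
                reply.put("ok")
            else:                    # any other kind is treated as a read
                reply.put(self._attributes.get(name))

    def call(self, kind, name, value=None):
        """Send a request message and wait for the response."""
        reply = queue.Queue(maxsize=1)
        self._inbox.put(((kind, name, value), reply))
        return reply.get()

    def stop(self):
        reply = queue.Queue(maxsize=1)
        self._inbox.put((None, reply))
        reply.get()


class VirtualObjectsModule:
    """Forwards each request to the exclusive process of its target."""

    def __init__(self):
        self._processes = {}
        # Protects only the registry; object data needs no lock.
        self._registry_lock = threading.Lock()

    def request(self, object_id, kind, name, value=None):
        with self._registry_lock:
            process = self._processes.get(object_id)
            if process is None:
                process = ExclusiveProcess(object_id)
                process.start()
                self._processes[object_id] = process
        return process.call(kind, name, value)
```

Since all modifications of a given object are serialized through its inbox, no lock is ever taken on the attribute data itself; the single registry lock in this sketch is an artifact of the Python emulation rather than of the approach.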
In some embodiments, for performance or implementation reasons, a single "exclusive process" may be used to handle more than one virtual object. In order to achieve similar benefits to having a dedicated "exclusive process" per virtual object, the subsets of virtual objects handled by different "exclusive processes" should be disjoint (i.e. two different processes with any overlap in time should not have ownership of the same virtual object, even at different times).
It is clear that equivalent approaches can be used to handle requestable objects (when handled by the virtual objects module, the contents locations module, or any other module).
In summary, some embodiments of the virtual shell may avoid contention problems and costly synchronization mechanisms by the virtual objects module using a dedicated "exclusive process" for each virtual object (at least, to perform updates to its corresponding virtual object). This mechanism may improve the performance and/or the scalability of the virtual shell, especially when the virtual shell is adapted to operate in environments where several users may issue simultaneous requests to the virtual shell (for example, parallel and/or distributed systems).
In preferred embodiments, the virtual objects module 107 is adapted to activate and/or deactivate exclusive processes for virtual objects.
In embodiments where the virtual objects module uses dedicated "exclusive processes" to handle individual virtual objects, not every virtual object needs to have an associated active "exclusive process" at all times. The "exclusive process" may be eliminated and disappear, for example, after a period of inactivity, and be re-created when the associated object is needed again. In some embodiments, the virtual objects module may be responsible for activating and/or deactivating the dedicated "exclusive processes" associated to individual virtual objects.
In some embodiments, when a request arrives at the virtual objects module to update a virtual object and its dedicated "exclusive process" does not exist, a new "exclusive process" may be created for such virtual object for taking ownership of the corresponding virtual attributes, possibly keeping them internally cached. Once the "exclusive process" is created, the virtual objects module may forward the request to said "exclusive process" for processing.
By activating and deactivating "exclusive processes", the virtual shell and, in particular, the virtual objects module, can maintain a greater control on the amount of resources being used to handle virtual objects.
In preferred embodiments of the invention, the virtual objects module 107 is adapted to deactivate exclusive processes for virtual objects depending on an inactivity time threshold and/or available resources.
The virtual objects module can keep track of the inactivity time of the "exclusive processes" associated to virtual objects, and eliminate them after a given period of inactivity. A simple way to do this could be to maintain a "least recently used" list of active "exclusive processes", and set up a timer to check said list at intervals and eliminate the processes exceeding a certain inactivity time threshold.
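The "least recently used" list with a periodic check can be sketched in Python as follows. The registry only tracks identifiers and timestamps; actually starting or stopping a process, and the timer that would call sweep() at intervals, are left out. All names are hypothetical, and a single global threshold is assumed for simplicity.

```python
import time
from collections import OrderedDict


class ExclusiveProcessRegistry:
    """Tracks active exclusive processes in least-recently-used order."""

    def __init__(self, inactivity_threshold, clock=time.monotonic):
        self._threshold = inactivity_threshold
        self._clock = clock
        # object id -> time of last use; the oldest entry comes first.
        self._last_used = OrderedDict()

    def touch(self, object_id):
        """Activate the process if needed and mark it as just used."""
        self._last_used[object_id] = self._clock()
        self._last_used.move_to_end(object_id)

    def is_active(self, object_id):
        return object_id in self._last_used

    def sweep(self):
        """Deactivate every process idle for longer than the threshold.

        Intended to be called at intervals by a timer. Because entries
        are ordered by last use, the scan stops at the first process
        that is still within the threshold.
        """
        now = self._clock()
        deactivated = []
        while self._last_used:
            object_id, last = next(iter(self._last_used.items()))
            if now - last <= self._threshold:
                break
            self._last_used.popitem(last=False)
            deactivated.append(object_id)
        return deactivated
```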
The inactivity time threshold for different "exclusive processes" may also be different. For example, it may depend on the cost to start a given "exclusive process" associated to a particular virtual object, if different virtual objects involve different costs. Another possibility is to use heuristics based on knowledge about the system (for example, it may happen that a certain virtual object is used regularly, but with inactivity intervals larger than for other virtual objects - then, said virtual object may have its inactivity time threshold adapted accordingly).
The virtual objects module may decide to eliminate "exclusive processes" when it detects that operating resources are scarce. This may be used to balance between performance (having "exclusive processes" active and ready) and capacity (having enough resources to start new "exclusive processes" if a burst of requests on new virtual objects arrives).
In some embodiments of the invention, each exclusive process for virtual object is adapted to deactivate itself and/or to activate/deactivate one or more of the other exclusive processes.
Each "exclusive process" associated to a virtual object may be able to deactivate itself when certain conditions occur. The advantage of having an "exclusive process" deactivate itself is that each "exclusive process" may keep its own state and track the conditions that make it a candidate for deactivation, instead of the virtual objects module having to maintain global information about all the "exclusive processes" and their state and conditions in order to deactivate them when necessary. Of course, the capacity of an "exclusive process" to deactivate itself does not necessarily prevent the virtual objects module (or other "exclusive processes") from being able to eliminate such "exclusive process". An "exclusive process" may be able to activate other processes without the intervention of the virtual objects module (for example, when said "exclusive process" requires the collaboration of a different "exclusive process" for a different virtual object which is not activated). An "exclusive process" may also eliminate another "exclusive process". For example, a requirement to deactivate an "exclusive process" associated to a virtual object representing a directory in a file system may cause said "exclusive process" to request the deactivation of other "exclusive processes" associated to virtual objects representing the children of said directory.
Preferably, each exclusive process for virtual object is adapted to deactivate itself and/or to deactivate other exclusive processes depending on an inactivity time threshold and/or available resources.
The criteria discussed in previously described embodiments to deactivate "exclusive processes" by the virtual objects module can also be used by "exclusive processes" as criteria to deactivate themselves or other "exclusive processes".
In a preferred embodiment of the invention, each exclusive process for virtual object comprises one or more exclusive sub-processes, each of said exclusive sub-processes being dedicated to virtual attributes 409 of said virtual object 404, each of said virtual attributes 409 having a volatility level according to a predetermined volatility classification.
As mentioned in some previous embodiment descriptions, virtual attributes of virtual objects may have different volatility levels: some virtual attributes may be immutable or rarely modified, and some virtual attributes may be updated more often. Having different volatility levels may mean that optimal policies to handle them may also be different (for example, in a distributed environment, virtual attributes with low volatility ratios could be leased for caching purposes for a long time, while high volatility virtual attributes should probably be associated with shorter cache expiration times - or not cached at all).
In embodiments where the virtual objects have "exclusive processes" dedicated to handle them, each "exclusive process" may have one or more collaborating sub-processes specialized in handling virtual attributes with a certain level of volatility, in the same way that the objects locations sub-modules operated in previously described embodiments.
In some embodiments, there may be additional collaborating processes of "exclusive processes", which are not necessarily related to virtual objects. Such collaborating processes can be used, for instance, to assist in virtual attribute lease management for caching purposes. For example, when a cache request is accepted, a collaborating process representing the client can be created. From then on, all the cache-related interaction with that particular client in both directions (possibly including lease revocation, voluntary releases of cached data, and lease renewal) is done through the newly created collaborating process. When no data is cached, the process may die after a period of inactivity. The interaction between the collaborating process and "exclusive processes" or "exclusive sub-processes" handling data related to virtual objects can be done by various means (e.g. by exchanging messages).
In some embodiments, each exclusive process for virtual object comprises a contents locations exclusive sub-process being dedicated to virtual attributes 409 of said virtual object 404, each of said virtual attributes 409 referring to requestable objects.
As mentioned in some previous embodiment descriptions, dealing with requestable objects may be more complex than dealing with virtual objects, as requestable objects may have related underlying objects and it may be necessary (for some purposes) to keep at least certain attributes of requestable objects synchronized with data related to underlying objects. Therefore, when virtual attributes of virtual objects refer to requestable objects, extra care may be needed to operate with them (e.g. an update may eventually involve interaction with a possibly unreliable underlying repository - thus having an increased risk of delays or errors that may need special handling).
When virtual objects are handled by "exclusive processes", each of such "exclusive processes" may have an associated "contents location exclusive sub-process" specialized in dealing with virtual attributes related (or requiring access) to requestable objects related to the virtual object.
In preferred embodiments, each exclusive process for virtual object comprises a non-contents locations exclusive sub-process being dedicated to virtual attributes 409 of said virtual object 404, each of said virtual attributes 409 not referring to requestable objects.
As mentioned previously, in embodiments providing "exclusive processes" associated to virtual objects, such "exclusive processes" may have a sub-process specialized in dealing with virtual attributes related to requestable objects. In the same way, the "exclusive processes" may also have an associated "non-content locations exclusive sub-process" specialized in dealing with virtual attributes not related to requestable objects. This may be useful, for example, when such virtual attributes not related to requestable objects are handled and/or stored in a different way (e.g. by being stored in a repository separated from another one used by virtual attributes related to requestable objects, and with different interfaces).
In preferred embodiments of the invention, supported by Figure 6, the multi-namespaces module 118 is adapted to produce a reproducible ordered list 605 of entries comprising one or more disjoint partial results 606 from a set of entries of namespaces, each entry of said set of entries having one or more related properties being able to be inputted to at least one hash function, said reproducible ordered list being obtained from:
• obtaining a list of distinct hash outputs 601 from applying the hash function to one or more of the properties of each entry of the set of entries;
• for each distinct hash output 602 according to a first ordering criteria:
    • obtaining a subset 604 of the set of entries being related to said distinct hash output 602;
    • ordering the obtained subset 604 of entries according to a second ordering criteria;
    • providing the ordered subset 607 of entries as one of the disjoint partial results.
The multi-namespaces module can be requested to produce a list of entries according to certain criteria (for example, to obtain the list of all entries in a namespace being associated to another entry acting as a container). In particular, when the virtual shell provides file system semantics, such a feature can be used, for example, to provide a directory listing.
In environments where the virtual shell may receive several requests simultaneously (possibly from different users), situations may occur where entries from the set being listed are to be removed while the listing is in progress, and/or where new entries that could appear in the listing are to be created while the listing is in progress. In these situations, most systems demand a certain degree of consistency in the listing; usually, this means, at least, that the listing should not contain the same entries (i.e. entries with the same name and/or identifier, for example, regardless of them referring to the same or to different objects) more than once. In other words, removed entries may or may not appear in the listing, and newly created entries may either not appear at all, or appear in the listing as long as a previously existing entry with the same name (but already removed) was not already listed. A possible approach could be using a lock mechanism (or a database transaction, if the entries are stored in a database) to prevent modifications to the affected entries while producing the listing. For short listings of entries this may be effective; nevertheless, for potentially very large listings, this has a number of problems: for example, generating a large listing may take a considerable amount of time, causing performance problems due to delays to other requests waiting for the locks to be released. Also, in distributed environments where the listing may have to be transferred between components in different nodes, potential problems arise: either the whole listing is transferred at once (probably causing long response times due to having to wait for all data to arrive) or the listing is fragmented (facing complex error situations in case the communication fails in the middle of the transfer, leaving the locks acquired).
An additional problem is posed by some file system semantics such as POSIX, which defines an interface that allows listing a directory fragment by fragment (entry by entry, specifically) and allows a user to start the listing, leave it unfinished, and continue with it after an arbitrary amount of time. Furthermore, it may require the ability to partially replay a directory listing keeping the previous listing order (via the telldir/seekdir interface).
A well-known solution in the field of file systems to handle large directories is to maintain the entries of a directory in an ordered tree structure, which is usually stored as a special file. By having the entries ordered, it is relatively simple to keep a pointer to the last entry listed, and generate the listing fragment by fragment. If a new entry is added before the current pointer, it will be ignored; if an already listed entry was removed and then re-created, it will be placed before the current pointer, so it will not be listed twice; and, finally, if an entry is either created or removed after the current pointer, it does not affect the part of the listing already generated. Nevertheless, the trees also have issues, especially when growing to large scales: in order to be efficient, trees should be kept balanced when new entries are added and when existing entries are removed. Balancing a tree while keeping it ordered may be a costly operation, which in the case of file systems is aggravated by the fact that, in order to take advantage of block devices, the leaves of the trees are not entries, but groups of entries packed in one or more consecutive blocks. As such groups can keep only a limited amount of entries, adding or removing an entry may involve splitting or uniting blocks of entries, causing non-trivial (and potentially time consuming) reorganizations of data. Furthermore, in current distributed environments, maintaining an ordered tree which could be distributed across several nodes for scalability purposes adds even more complexity and cost to its management, making it a poorly scalable solution.
Ideally, a possible solution could be storing the entries in a table in a database. Current database technologies allow full scale distribution of the data (so that the mechanism is scalable and can operate on distributed systems easily, contrary to what happened with ordered trees), and storage and retrieval of data (entries, in our case) is fast and effective. The issue with database tables is that, in general, they do not guarantee a particular ordering of the data contained in them. So, the obvious way to generate listings from them would be locking the table or using transactions (with the cost and potential problems commented above).
Some embodiments of the invention comprise a mechanism that allows storing entries in non-explicitly ordered repositories (such as database tables), while allowing the implementation of entry listings (even fulfilling complex semantics such as POSIX). The mechanism is based on the fact that all entries have some distinctive characteristic that allows them to be differentiated (for example, all entries in a directory must have different "names"), so that such characteristic can be used as input to a function (e.g. a hash function), and so that the set of distinct results obtained from applying such function to the entries to be listed can be smaller than the set of entries to be listed.
The way to produce and to list entries in a reproducible order from partial subsets of entries starts by generating the list of distinct results of the selected hash function applied to the chosen characteristic of the entries to be listed (e.g. their names). Each distinct hash value defines a disjoint subset of the entries to be listed (said subset containing the entries for which the hash function returns the same hash value). The hash function should be chosen in such a way that the number of distinct hash values and their size (their number of bits) are adequate to be transferred in a single shot from the component having the set of entries to be listed to the component generating the list. Additionally, each subset of entries defined by each hash value should also have a size adequate for being transferred in a single shot with a reasonable cost, and sorted in the component generating the ordered listing (note that a relatively large number of hash values with relatively few bits can be packed in a relatively small space, and be used to define a large number of subsets, which can be potentially small). Once the set of distinct hash values is received, the component generating the ordered listing may sort said hash values following any arbitrary sorting criteria. Then, the component generating the ordered listing requests the subset of entries corresponding to the first hash value, orders the subset of entries according to any criteria, and starts processing it to generate the whole list (e.g. forwarding the subset to the user entry by entry). The same procedure is then repeated for each distinct hash value defining a subset of entries, until all subsets of entries have been obtained and, thus, the list of all entries has been generated.
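The procedure described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the names name_hash, EntryStore and list_entries are hypothetical, the hash is arbitrarily taken as the top bits of a SHA-256 digest of the entry name, and an in-memory set stands in for the unordered repository.

```python
import hashlib
from typing import Iterator, List, Set


def name_hash(name: str, bits: int = 4) -> int:
    """Hypothetical hash: the top `bits` bits of the SHA-256 digest of the name."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - bits)


class EntryStore:
    """Stand-in for an unordered repository such as a database table."""

    def __init__(self, names: Set[str]) -> None:
        self._names = set(names)

    def add(self, name: str) -> None:
        self._names.add(name)

    def remove(self, name: str) -> None:
        self._names.discard(name)

    def distinct_hashes(self) -> Set[int]:
        return {name_hash(n) for n in self._names}

    def subset(self, hash_value: int) -> List[str]:
        return [n for n in self._names if name_hash(n) == hash_value]


def list_entries(store: EntryStore) -> Iterator[str]:
    # One request: the distinct hash values, sorted by the first ordering criterion.
    for hash_value in sorted(store.distinct_hashes()):
        # One request per subset; the local copy is sorted by the second criterion.
        for name in sorted(store.subset(hash_value)):
            yield name
```

Because each subset is copied before it is processed, an entry that is removed and re-created after it was listed falls into an already processed subset and is not produced a second time.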
It is important to note that the mechanism described above fulfils the required objectives: considering a "current" subset being processed, entries added or removed having hash values corresponding to previous entries are ignored (in particular, if an entry from a previous subset is removed and a new entry with the same name is added, it will fall within the same subset - already processed - so it will not appear twice); entries added or removed to the current set do not affect the listing (as they are added into or removed from the "official" repository, while the component generating the listing is working with its own copy); finally entries added or removed from unprocessed subsets will appear or disappear, respectively, from the final listing (but they will not produce duplicated results). There is a possibility that a new entry is added with a new corresponding hash value that was not in the initial list of distinct hash values received by the component generating the ordered listing; in this case, such entry will not appear in the final listing (which is also compatible with most of the file systems semantics - and particularly with POSIX).
Replaying the entry listing in the same order from an arbitrary point can be achieved as long as the list of distinct hash values is maintained (or recovered and re-organized into the original order). For example, a handle to indicate a given position in the listing from which to start the replay in an efficient way would consist of the name of the desired entry and the hash value corresponding to it.
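A replay from such a handle could look like the following Python sketch. The names bucket and listing, the CRC-32 based hash and the choice of ascending numeric order for hash values and lexicographic order for names are assumptions made only for this illustration; the handle is the pair (hash value, entry name) and the replay starts at the designated entry.

```python
import zlib
from typing import Iterator, List, Optional, Set, Tuple

Handle = Tuple[int, str]  # (hash value, entry name)


def bucket(name: str) -> int:
    """Hypothetical hash: the low 4 bits of the CRC-32 of the entry name."""
    return zlib.crc32(name.encode("utf-8")) & 0xF


def listing(entries: Set[str], ordered_hashes: List[int],
            start: Optional[Handle] = None) -> Iterator[Handle]:
    """Yield (hash, name) pairs; `ordered_hashes` is assumed sorted ascending."""
    for h in ordered_hashes:
        if start is not None and h < start[0]:
            continue  # the whole subset precedes the handle: never fetched
        for name in sorted(n for n in entries if bucket(n) == h):
            if start is not None and (h, name) < start:
                continue  # earlier entry inside the handle's own subset
            yield (h, name)
```

Only the subset designated by the handle has to be fetched and partially skipped; all previous subsets are bypassed without any request, and no state is kept in the component storing the entries.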
In summary, embodiments using the above mechanism improve the flexibility of the previously described embodiments, because the user can request listings of large amounts of entries and the virtual shell produces the corresponding results with very high reliability and efficiency. In particular, said listings of potentially large amounts of entries in a reproducible way, fragment by fragment, are obtained without the need of locking large amounts of data or for long (or even arbitrary) amounts of time. Additionally, the mechanism does not require any state in the component storing the entries (e.g. to track the last entry returned), thus avoiding potential causes of errors in distributed environments. Finally, the repository used to keep the entries to be listed does not need to have any predefined ordering semantics (in particular, tables in arbitrary databases can be used, enabling the system to take advantage of the database technology, including scalability and distributed operation).
In some embodiments of the invention, the multi-namespaces module 118 is adapted to obtain each hash output from selecting a set of bits from the result of applying the hash function to the properties of each entry of the set of entries, said set of bits depending on the number of entries from which the reproducible ordered list of entries is produced.
The multi-namespaces module of some embodiments may use the mechanism described above to generate listings of entries from partial sets of entries defined by hash values. Nevertheless, this mechanism has a drawback: given a certain hash function generating a hash value of a given number of bits for each entry, the potential number of subsets of the set of entries generated does not take into account the number of entries to be listed. So, it may happen that, for a small set of entries (e.g. a small directory) the subsets generated are too small to be handled efficiently (they may have very few entries and the sequence of obtaining each subset might have a higher cost than getting all the entries at once); on the contrary, for very large sets of entries, the division may produce too few subsets (so the subsets are still too large to be handled effectively).
A solution for this drawback can consist of using a hash function returning a relatively large number of bits for each entry, and then using a subset of the bits depending on the number of entries to be listed: for small sets, just a few bits (or even none at all, to select the whole set in one shot) may be used; on the contrary, the larger the set, the more bits from the hash value can be used to define the subsets, so the size of the generated subset of entries for partial lists is kept within a reasonable amount.
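As a hedged illustration of this idea, the Python sketch below picks the number of hash bits from the number of entries. The 32-bit hash width, the target subset size of 1000 entries and the names bits_for and hash_prefix are assumptions introduced for the example, not values taken from the description.

```python
HASH_BITS = 32  # assumed width of the full hash value


def bits_for(entry_count: int, target_subset_size: int = 1000) -> int:
    """Smallest number of bits whose expected subset size fits the target."""
    bits = 0
    while bits < HASH_BITS and entry_count > target_subset_size * (1 << bits):
        bits += 1
    return bits


def hash_prefix(full_hash: int, bits: int) -> int:
    """Keep only the leading `bits` bits of a HASH_BITS-wide hash value."""
    return full_hash >> (HASH_BITS - bits)
```

With these assumptions a directory of up to 1000 entries uses no bits at all (a single subset), while one million entries use 10 bits, that is, up to 1024 subsets of roughly 1000 entries each if the hash distributes names evenly.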
Preferably, each entry of namespaces comprises at least one hash container adapted to contain the result of applying the hash function to the properties of the entry.
Having to compute hash values for the entries to be listed can be time-consuming. A possible way to avoid this cost is to calculate the hash value associated to a given entry when the entry is created, and store the hash value together with the entry.
In embodiments where the entries are stored in a database table, the hash value could be stored as a column in the record containing the entry information. With such an implementation, recovering the list of distinct hash values from the database could be done in an easy and efficient way (especially if the column is indexed).
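A minimal sketch of such a table, using SQLite through the Python standard library, is given below. The table layout, the column names, the CRC-32 based entry_hash function and the helper names are all hypothetical and serve only to show the hash being computed once at creation time and recovered with an indexed query.

```python
import sqlite3
import zlib
from typing import List


def entry_hash(name: str) -> int:
    """Hypothetical hash: the low 8 bits of the CRC-32 of the entry name."""
    return zlib.crc32(name.encode("utf-8")) & 0xFF


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries ("
    " parent TEXT NOT NULL, name TEXT NOT NULL, name_hash INTEGER NOT NULL,"
    " PRIMARY KEY (parent, name))"
)
# Index on the hash column so that distinct values and subsets are cheap to get.
conn.execute("CREATE INDEX idx_entries_hash ON entries (parent, name_hash)")


def create_entry(parent: str, name: str) -> None:
    # The hash is computed once, when the entry is created, and stored with it.
    conn.execute("INSERT INTO entries VALUES (?, ?, ?)",
                 (parent, name, entry_hash(name)))


def distinct_hashes(parent: str) -> List[int]:
    rows = conn.execute(
        "SELECT DISTINCT name_hash FROM entries WHERE parent = ?"
        " ORDER BY name_hash", (parent,))
    return [row[0] for row in rows]


def entries_with_hash(parent: str, hash_value: int) -> List[str]:
    rows = conn.execute(
        "SELECT name FROM entries WHERE parent = ? AND name_hash = ?"
        " ORDER BY name", (parent, hash_value))
    return [row[0] for row in rows]
```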
The use of a variable number of bits (e.g. a variable length prefix of the hash value) to define a subset, and recovering it, could be implemented by selecting entries from a database table with hash values within a specific range, where the minimum value of the range could be obtained by using the desired hash value prefix followed by zeros (0) to complete the full length of the hash values, and the maximum value of the range could be obtained by using the desired hash value prefix followed by ones (1) to complete the full length of the hash values.
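The range computation just described amounts to a few bit operations, sketched here in Python. The function name prefix_range and the default 32-bit hash width are assumptions for the example.

```python
from typing import Tuple


def prefix_range(prefix: int, prefix_bits: int, hash_bits: int = 32) -> Tuple[int, int]:
    """Return (minimum, maximum) hash values sharing the given leading bits."""
    if not 0 <= prefix_bits <= hash_bits:
        raise ValueError("prefix_bits must be between 0 and hash_bits")
    if prefix < 0 or prefix >> prefix_bits:
        raise ValueError("prefix does not fit in prefix_bits")
    free_bits = hash_bits - prefix_bits
    low = prefix << free_bits              # prefix followed by zeros
    high = low | ((1 << free_bits) - 1)    # prefix followed by ones
    return low, high


# The bounds could then parameterize a range query on a hypothetical table, e.g.
#   SELECT name FROM entries WHERE name_hash BETWEEN ? AND ?
```

For instance, with 8-bit hash values the 3-bit prefix 101 selects the range 10100000 to 10111111, and a prefix of zero bits selects the whole set.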
In the situation where a particular database does not allow the specification of ranges (e.g. some non-SQL databases), a possible approach would be dividing the hash value into fragments, and matching the records against the desired number of hash value fragments. This does not allow a tuning mechanism as fine grained as a range, but it may be used to get a fair approximation.
In preferred embodiments of the invention, the multi-namespaces module 118 is adapted to select the hash function to be applied to one or more of the properties of each entry of the set of entries depending on data from the context monitoring parameters module 116 and/or from the non-context monitoring parameters module 105.
In some of the previously described embodiments, the hash function used to divide a set of entries into smaller subsets for listing purposes could be modified depending on the number of entries to be listed (note that changing the number of used bits from the result of the hash value is, in fact, equivalent to changing the hash function).
Additionally, there may be other factors that may influence the choice of a given hash function, so that the resulting subsets are smaller or larger. For example, if the component generating the listing (e.g. the user client) is running on a low capacity hardware (e.g. a mobile device), it may be interesting to have smaller subsets, so that they can be adequately handled by such device; on the contrary, in an environment where the components generating the listing and storing the entries have high capacity and are connected through a robust network with large bandwidth, then it may be useful to generate larger subsets to reduce the overhead of multiple requests.
Therefore, in some embodiments, the selection of the hash function to use for generating subsets of entries (or, in particular, the selection of the number of bits to use from the hash values) may depend on context and/or non-context monitoring parameters (such as, for example, the bandwidth between components, or the type of hardware being used by the user client).
All the details provided in previous descriptions about the context monitoring parameters module comprised in the virtual shell are also applicable to the context monitoring parameters module for being used in the virtual shell.
All the details provided in previous descriptions about the multi-namespaces module comprised in the virtual shell are also applicable to the multi-namespaces module for being used in the virtual shell.
All the details provided in previous descriptions about the underlying interface module comprised in the virtual shell are also applicable to the underlying interface module for being used in the virtual shell.
All the details provided in previous descriptions about the interception module comprised in the virtual shell are also applicable to the interception module for being used in the virtual shell.
All the details provided in previous descriptions about the logic module comprised in the virtual shell are also applicable to the logic module for being used in the virtual shell.
An important point to mention is that some embodiments of the virtual shell can be designed to operate in both local and distributed environments. In particular, each of the modules or sub-modules can be detached and run in the same or different nodes. Similarly, any module or sub-module can be divided (for performance, scalability or any other implementation reasons) into components and sub-components, which may also be detached and run either locally or in remote locations with respect to the other components. For example, the logic module and the virtual objects module could run in different systems. In another example, a particular module (for instance, the multi-namespaces module) could be divided into components also running in different nodes (for instance, a component running in the user's client system - e.g. to implement a local cache for multi-namespace data to be used by a local component of the logic module - and a remote component responsible for keeping the necessary data about the entries, possibly in a database, which could also be remote or distributed). On the other hand, in some embodiments, different modules and/or components of such modules could be merged into single units for efficiency or other implementation reasons.
In some embodiments, the virtual shell may be able to deal with multiple users simultaneously, possibly in different remote locations. Regarding the underlying repositories, they may also be remote and/or of parallel or distributed nature (e.g. a remote network underlying file system or a parallel or distributed underlying file system). In those scenarios, the virtual shell can be able to deal with underlying data contents being modified from several locations (either simultaneously or not).
The virtual shell of the invention may present the user a virtual view of file system directories, allowing the user to organize the files as he/she pleases, while implementing a completely different directory layout in an underlying file system. Some embodiments of the virtual shell were developed for a Unix-like operating system (namely Linux) which provided a POSIX interface for the applications to deal with the file systems. This involves dealing with particular pieces of metadata and a specific set of operations to manipulate the files. Nevertheless, one embodiment of the virtual shell has been designed so that it can be easily adapted to contain and handle different pieces of data (monitoring parameters, attributes, etc.) and respond to a different set of requests. In other words, the design has no hard dependencies on a specific operating system or a specific file system interface. Embodiments developed for Windows-based operating systems and providing an NTFS-like interface for the applications to deal with file systems follow the same principles.
In some embodiments where the virtual shell can offer POSIX file system semantics and the file system objects (e.g. files, directories, etc.) are represented by requestable objects, the attributes of such requestable objects may include access control (owner, group and related access permissions), symbolic and hard link management, and size and time data for non-regular files (sizes and access times management for regular files rely on the underlying file system used as underlying repository). Directory handling can combine the features of the multi-namespaces module (to handle the directory hierarchy) with the attributes of requestable objects associated to the entries acting as containers (e.g. to maintain extra information related to the directory, such as access permissions, number of entries, etc.). The attributes of the requestable objects do not need to include any knowledge of low-level data storage (in particular, there are no references about disks, blocks or other storage objects - though they could be easily included if necessary): the reference to the underlying object could be reduced to the path of the underlying file (and it may be left up to the underlying file system to take decisions about low-level data server selection, striping, block/object placement, etc.).
It is important to mention that POSIX-based systems assign an internal identifier (the so-called "i-node" number) to each file system object, and such "i-node" number must be made available to the user; then, the virtual shell also needs to generate such "i-node" numbers (for a number of technical reasons, it is not feasible to simply forward "i-node" numbers from the underlying file system: for example, there may be several underlying file systems - with possibly duplicated "i-node" numbers - or some of the underlying file systems may not be POSIX-compliant and not provide such "i-node" numbers).
It is worth mentioning that the valid range for i-node numbers is somewhat dependent on the operating system used by the user, as they have to be fed back to it (one of the possible requests in a POSIX system is a request for translation from an entry name into an "i-node" number) and they can be used by the kernel as identifiers in future requests. Most systems use integers with 32 or 64 bits, while a few can use larger integer sizes. It is usually assumed by the system that an i-node number represents a unique object in a given file system, and that such i-node number will not be reused until the previous object is destroyed. Nevertheless, some systems allow for a more dynamic schema: an i-node number could be reused as long as the underlying system keeps no reference about the previous object. To this end, some virtual shell embodiments may use a "forget" request from the user to notify that some references to a particular i-node number have been released. (Given this scenario, it is clear that user applications should never rely on i-node numbers to identify files.) Some systems use a "generation number" associated to the i-node number which is increased when such i-node number is re-used; nevertheless, this is not usually visible outside the system.
In some embodiments, the virtual shell may internally identify and refer to requestable objects representing file system objects by means of identifiers (internally called "inums" in some embodiments). These identifiers are unique, at least for the life time of the object. Such requestable object identifiers handled internally do not necessarily match the i-node numbers provided to the kernel. So, an inum value (the internal identifier), in general, has to be mapped to a locally unused "i-node" number adequate for the user. Of course, if it can be guaranteed that the possible range of inum values is always a subset of the values accepted as i-node numbers by the user's operating system kernel, a possible map consists in directly using the inum value as i-node number.
Nevertheless, it would be possible to generate identifiers (inums) that do not match the valid i-node number range, or which are not even numbers. In this case, the virtual shell module (e.g. the logic module) may perform an explicit map. This could be achieved, for example, by keeping a list (or a similar structure) of locally unused i-node numbers, and a hash table (or a similar structure) with the mapped inum values to assigned i-node numbers, and possibly also the other way round (such data structures could be either local or remote, and could be temporary - and, for example, cleaned up on system restart - or made persistent using, for example, a database or some other kind of backend storage).
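One possible shape of such an explicit map is sketched below in Python. The class name InodeMap, its method names and the default 32-bit i-node range starting at 2 are assumptions chosen for the illustration; the sketch keeps the structures in local memory only and hands out previously released numbers before fresh ones.

```python
from typing import Dict, Hashable, List


class InodeMap:
    """Maps arbitrary internal identifiers (inums) to bounded i-node numbers."""

    def __init__(self, first: int = 2, last: int = 2**32 - 1) -> None:
        self._next = first                            # next never-used number
        self._last = last                             # largest valid number
        self._released: List[int] = []                # numbers free for reuse
        self._ino_by_inum: Dict[Hashable, int] = {}
        self._inum_by_ino: Dict[int, Hashable] = {}   # the other way round

    def ino_for(self, inum: Hashable) -> int:
        ino = self._ino_by_inum.get(inum)
        if ino is None:
            ino = self._allocate()
            self._ino_by_inum[inum] = ino
            self._inum_by_ino[ino] = inum
        return ino

    def inum_for(self, ino: int) -> Hashable:
        return self._inum_by_ino[ino]

    def forget(self, inum: Hashable) -> None:
        """Release the mapping, e.g. after a "forget" request from the user."""
        ino = self._ino_by_inum.pop(inum, None)
        if ino is not None:
            del self._inum_by_ino[ino]
            self._released.append(ino)

    def _allocate(self) -> int:
        if self._released:
            return self._released.pop()
        if self._next > self._last:
            raise RuntimeError("no free i-node numbers")
        ino = self._next
        self._next += 1
        return ino
```

Since the keys only need to be hashable, the inums may be very large integers or non-numeric values, while the numbers handed to the user always stay within the configured range.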
Some possible situations where having a limited size numeric identifier for the file system objects could be a handicap are, among others: having a very large set of file objects exceeding the capacity of the local i-node numbers (larger identifiers could be used to identify the file system objects, and only those in the "active" working set would need to be assigned a smaller sized i-node number); using monotonically increasing identifiers which may, eventually, exceed the limited size of the i-node numbers (using monotonically increasing and/or any other type of non-repeatable identifiers has some interesting properties that may be used to optimize certain internal algorithms); using mechanisms to generate distributed unique identifiers (such identifiers are usually large, to avoid the chances of independently generating the same identifier for different objects); etc.
Some file system requests, such as file creation, are delicate operations which require careful coordination between the different modules of the virtual shell. On one side, the multi-namespaces module, with the assistance of the logic module, will have to check the access permissions and create a new entry in a namespace. On the other side, a contents locations module may need to request the creation of an actual underlying file somewhere in the underlying file system and possibly create the representation of the corresponding requestable object and its attributes. Eventually, the new entry and the new requestable object will be linked (by means of a "requestable object contents location", for example). Last but not least, the context monitoring parameter module may be required to generate a "handle" to the new file, so that further requests on it (e.g. reads and writes) can be tracked and associated to the original file creation operation (and to any particular flags indicated during the creation of the file) without needing to repeat searches of the path in the namespace. For simplicity, we will assume in the following examples that requestable objects are managed by a contents locations module (though, of course, it is clear that the same techniques and mechanisms described can be applied to any other module able to handle requestable objects in different embodiments).
The first step to fulfil a file creation request is to generate an identifier for the new requestable object (the so-called "inum"). This identifier should be unique across a possibly distributed system where several users may issue simultaneous requests to create files, and where the modules of the virtual shell (including the contents locations module) can also be distributed in multiple components. In some embodiments, the module component responsible for creating the requestable object (which may be a contents locations module component, if such module is available) uses the following mechanism: instead of having each component generate the identifiers on its own, the components request them from a global identifier server component (which may depend, for example, on the logic module) which gets requests from all participating contents locations module components and guarantees that they receive different identifiers. A specific embodiment of the identifier service uses a monotonically increasing counter to generate identifiers, and has the particularity that assigned identifiers are never reused, even if the corresponding requestable object has been deleted (for long running systems, counter overflow is not a problem, as the number of bits used for the counter could be dynamically increased). In order to avoid frequent requests, the contents location module components may request a range of identifiers (instead of a single one), keep them locally and use them whenever needed.
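The range-based identifier service can be sketched as follows in Python. The class names IdentifierServer and ComponentAllocator, the batch size and the use of a thread lock to stand in for whatever coordination a distributed deployment would need are assumptions for the illustration; Python integers are unbounded, which mirrors the remark that the counter width can grow.

```python
import threading
from typing import Iterator


class IdentifierServer:
    """Global service handing out disjoint ranges of never-reused identifiers."""

    def __init__(self, first: int = 1) -> None:
        self._next = first
        self._lock = threading.Lock()

    def reserve(self, count: int) -> range:
        if count < 1:
            raise ValueError("count must be positive")
        with self._lock:
            start = self._next
            self._next += count  # monotonically increasing, never rewound
        return range(start, start + count)


class ComponentAllocator:
    """Local side: keeps a reserved range and refills it only when exhausted."""

    def __init__(self, server: IdentifierServer, batch: int = 100) -> None:
        self._server = server
        self._batch = batch
        self._pool: Iterator[int] = iter(())

    def new_inum(self) -> int:
        try:
            return next(self._pool)
        except StopIteration:
            self._pool = iter(self._server.reserve(self._batch))
            return next(self._pool)
```

A component contacts the server only once per batch, so most identifier requests are served locally while identifiers remain distinct across components.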
Additionally, a file creation request also requires generating an identifier for the underlying object that may need to be created (e.g. an actual file in an underlying file system). This identifier should also be unique across a possibly distributed system (where different contents locations module components may be simultaneously serving different requests on their own - and therefore, maybe trying to simultaneously create underlying objects, possibly in the same underlying repository). Instead of using a global identifier service, some embodiments allow each contents locations module component to independently generate unique identifiers for the underlying objects. One way to achieve this would be assigning a unique identifier to each component of each virtual shell module (in particular, to each component of a distributed contents location module). This could be done in a coordinated way (for example by having one or more components - probably associated to the logic module - being able to generate identifiers and distribute them under request) or independently (for example by combining locally available information that cannot be repeated in different components - e.g. a combination of host address, uptime, local time, process or thread id of the component and possibly a random number would probably be enough). Then, the component identifier can be used to "tag" a locally generated identifier for the file being created (which could be based, for example, on a monotonically increasing counter), with the possibility of reusing identifiers for objects not used anymore at a local scope by keeping, for example, a list of unused identifiers. This mechanism allows a contents locations module component to request the creation of an underlying object associated to a requestable object representing a file system object without risk of underlying object name collisions with other contents locations module components.
Note that the identifier of the underlying object could also be based on, or even contain, the unique requestable object identifier (the "inum"), which could be used either directly or as a base to locally generate unique file names in an underlying file system.
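The independent variant, in which a component identifier tags locally generated identifiers, could be sketched as follows in Python. The function make_component_id, the class UnderlyingNameGenerator, the particular ingredients of the component identifier (host name, process id, local time and a random token) and the "tag.counter" name format are assumptions made for this example.

```python
import itertools
import os
import secrets
import socket
import time
from typing import List, Optional


def make_component_id() -> str:
    """Combine local information that is unlikely to repeat across components."""
    return "-".join([socket.gethostname(), str(os.getpid()),
                     str(time.time_ns()), secrets.token_hex(8)])


class UnderlyingNameGenerator:
    """Generates underlying object names tagged with the component identifier."""

    def __init__(self, component_id: Optional[str] = None) -> None:
        self.component_id = component_id or make_component_id()
        self._counter = itertools.count(1)   # monotonically increasing
        self._unused: List[int] = []         # local identifiers free for reuse

    def new_name(self) -> str:
        local = self._unused.pop() if self._unused else next(self._counter)
        return f"{self.component_id}.{local}"

    def release(self, name: str) -> None:
        tag, local = name.rsplit(".", 1)
        if tag != self.component_id:
            raise ValueError("name was generated by another component")
        self._unused.append(int(local))
```

Two generators with different component identifiers can never produce the same name, so no coordination is needed at file creation time; reuse of released local identifiers stays confined to the component that issued them.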
The second step consists in requesting the underlying interface module to create, if needed, the corresponding underlying object in the underlying repository, as indicated by the contents locations module (in particular, using the unique underlying object identifier generated as explained in previous paragraphs). Whether a file or any other object is actually "created" will depend on the particular underlying repository behaviour and/or policies implemented by the contents location module (for example, creation of the underlying object could be delayed until first use, or a pre-created underlying object from a pool of previously created underlying objects could be used). As mentioned for previous embodiments, the interaction between the contents locations module and the underlying interface module could use the logic module as an intermediate step.
At this point, if the underlying object "creation" was successful, the contents location module will have a valid reference to an underlying object will be able to associate it to the corresponding requestable object as an attribute (e.g. by keeping the path of the underlying file in the underlying file system as a requestable object attribute).
The next step consists of updating the requestable object related to the parent directory of the file system object being created (a reference to such requestable object - probably in the form of a "requestable object contents location" - should have been obtained in an initial step from the multi- namespaces module, when the entry corresponding to the parent directory path has been checked against a given namespace).
Finally, a request to the multi-namespaces module (possibly via the logic module) can be sent in order to create the corresponding entry in the desired namespace, and associate it to a "requestable object contents location" referring to the requestable object for the newly created file system object.
In case the request to the multi-namespaces module fails (for example, because an entry in the desired namespace already existed with the same name), a notification is sent back to the contents location module, so that the created underlying file (if any) can be removed and the corresponding requestable object can also be eliminated. Additionally, the discarded identifiers (e.g. the "inum" for the new requestable object) can be stored locally by the corresponding contents location module component to be used later (as the corresponding requestable object was not associated to any entry and it was eliminated, the identifier was not "officially assigned", and therefore a convenient rule about not re-using assigned identifiers is maintained).
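The creation procedure and its rollback on a failed namespace registration can be sketched as follows. This is a simplified, single-process illustration: the classes `Namespace` and `ContentsLocations`, the `obj-<inum>` naming and the in-memory sets standing in for the underlying repository are all assumptions, and the update of the parent directory's requestable object is omitted.

```python
class NamespaceError(Exception):
    """Raised when an entry with the same name already exists."""


class Namespace:
    def __init__(self):
        self.entries = {}  # entry path -> requestable object identifier

    def register(self, path, inum):
        if path in self.entries:
            raise NamespaceError(path)
        self.entries[path] = inum


class ContentsLocations:
    def __init__(self):
        self.objects = {}        # inum -> requestable object attributes
        self.underlying = set()  # names of created underlying objects
        self.spare_inums = []    # discarded, never officially assigned
        self._next = 1

    def _new_inum(self):
        if self.spare_inums:
            return self.spare_inums.pop()
        inum = self._next
        self._next += 1
        return inum

    def create_file(self, namespace, path):
        inum = self._new_inum()
        underlying_name = f"obj-{inum}"
        self.underlying.add(underlying_name)  # "create" the underlying object
        self.objects[inum] = {"underlying": underlying_name}
        try:
            namespace.register(path, inum)    # final step: create the entry
        except NamespaceError:
            # Roll back: remove the underlying object and the requestable
            # object, and keep the identifier for later use.
            self.underlying.discard(underlying_name)
            del self.objects[inum]
            self.spare_inums.append(inum)
            raise
        return inum
```

Because the discarded identifier never reached a namespace entry, reusing it does not break the rule that assigned identifiers are not re-used.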
The procedure just described can be used not only for files, but also for other types of file system objects (in particular, those that can be created with a mknod system call in a POSIX-based file system), such as named pipes, Unix domain sockets, or character and block devices. Directories and symbolic links, on the contrary, do not require the existence of corresponding underlying objects (all necessary information can be kept in requestable object attributes not related to any underlying repository) and, thus, the procedure can be simplified for these types of objects.
Apart from simply creating the file system object, some file systems may have more elaborate semantics. In particular, some of them may allow a conditional behaviour: preparing a file for access ("opening" the file) if it exists, and creating it if it does not exist and then preparing it for access.
Such combined semantics may produce a race condition in a distributed system. In the case that at least two contents locations module components are trying to create and prepare the requestable object and corresponding underlying object for a non-existing file, the initial procedure is similar to the simple "create-only" method: each component generates a new requestable object with its corresponding underlying object (with any required interaction with the underlying repositories via the underlying interface module and, possibly, the logic module) and, then, tries to register it in a namespace via the multi-namespaces module. At least one of them should succeed, but the other (or others, if more than two contents locations module components were trying to create files), instead of just getting the error, will get the "requestable object contents location" corresponding to the file that was successfully registered; then, they will discard their own requestable object and proceed to prepare the access to the registered one (which involves interaction with the logic module in order to make the necessary checks on the registered file system object type and permissions - it could happen that different types of file system objects were intended to be created with the same entry name: for instance, a file and a directory).
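The open-or-create race can be pictured with the following sketch, in which registration either installs the caller's location or returns the one already registered. The names `register_or_get` and `open_or_create` are hypothetical, a lock inside a single process stands in for whatever atomicity the multi-namespaces module provides, and the type and permission checks mentioned above are left out.

```python
import threading


class Namespace:
    def __init__(self):
        self._entries = {}
        self._lock = threading.Lock()

    def register_or_get(self, name, location):
        """Return (registered_location, created_by_caller)."""
        with self._lock:
            if name in self._entries:
                return self._entries[name], False
            self._entries[name] = location
            return location, True


def open_or_create(namespace, name, component, discarded):
    # Each component first builds its own requestable object; a string
    # stands in for its "requestable object contents location".
    own = f"{component}:{name}"
    location, created = namespace.register_or_get(name, own)
    if not created:
        # Another component won: drop our own object and use the winner's.
        discarded.append(own)
    return location
```

Every caller ends up with the same location, and each losing component discards exactly one object of its own.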
When a contents locations module component (or a logic module component), during an open/create request, is returned the "requestable object contents location" of someone else's file to open, instead of the "requestable object contents location" generated by itself, there is a chance that the actual file in the underlying repository is removed before it can be effectively opened. This situation could be handled in different ways. For example, an "open-in-progress" counter could be maintained for the data structures related to the file (e.g. the requestable object) and, despite the entry being removed from the namespace by the multi-namespaces module, the removal of the requestable object and the related underlying file could be delayed until such counter reaches zero. An alternative implementation just discards the information and retries creating its own file (theoretically, in a highly volatile environment, the described failure situation could be repeated indefinitely, leading to starvation; to avoid that, a retry counter could be used to return an error after a certain number of retries).
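A minimal sketch of the "open-in-progress" counter follows. The class `RequestableObject` and its method names are illustrative assumptions; the point is only that removal from the namespace marks the object, and destruction happens once no open is pending.

```python
import threading


class RequestableObject:
    def __init__(self, underlying):
        self.underlying = underlying
        self.open_in_progress = 0
        self.unlinked = False    # entry already removed from the namespace
        self.destroyed = False   # object and underlying file really removed
        self._lock = threading.Lock()

    def begin_open(self):
        with self._lock:
            if self.destroyed:
                raise FileNotFoundError(self.underlying)
            self.open_in_progress += 1

    def end_open(self):
        with self._lock:
            self.open_in_progress -= 1
            self._maybe_destroy()

    def unlink(self):
        with self._lock:
            self.unlinked = True
            self._maybe_destroy()

    def _maybe_destroy(self):
        # Removal is delayed until the counter reaches zero.
        if self.unlinked and self.open_in_progress == 0:
            self.destroyed = True
```

An open that started before the removal therefore still finds the underlying file, while an open attempted after destruction fails immediately.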
Continuing with the example of a virtual shell providing POSIX semantics on top of an underlying file system, a request to access certain file system object attributes (e.g. "utime" to modify a file's access and modification times, or "stat" to retrieve a file's size) may require accessing the underlying file related to the requestable object representing the target file. In the case of data access and modification times, the reason is that these values are closely tied to actual data manipulation, and the underlying file systems usually handle them well; duplicating them in the requestable object attributes and keeping them synchronized with the underlying file attributes would probably be an unnecessary cost (though it could be done, if necessary). A similar situation occurs for the file "size": it is much cheaper to check with the actual underlying object when needed than to keep it synchronized at all times. Nevertheless, as mentioned before, directories and symbolic links do not need any related underlying objects; so, their time-related information and their size can be directly maintained as attributes of the corresponding requestable object by, for example, the contents locations module.
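As an illustration of this split, the sketch below answers an attribute request by combining the attributes kept in the requestable object with the size and times read from the underlying file. The dictionary layout and the key `underlying_path` are assumptions; only the standard `os.stat` call is used to query the underlying file system.

```python
import os


def get_attributes(requestable_object):
    """Merge stored attributes with those delegated to the underlying file."""
    attrs = dict(requestable_object["attributes"])
    path = requestable_object.get("underlying_path")
    if path is not None:
        # Files: size and times are taken from the underlying object on demand.
        st = os.stat(path)
        attrs.update(size=st.st_size, atime=st.st_atime, mtime=st.st_mtime)
    # Directories and symbolic links have no underlying object, so whatever
    # is stored in the requestable object is returned unchanged.
    return attrs
```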
When the modules of the virtual shell can be distributed, and even each individual module can be distributed in several components, caching is a useful technique to avoid some communication among components and improve performance. Nevertheless, care must be taken to keep the cache synchronized with the actual data. The virtual shell modules and their components may maintain a cache mechanism by themselves, or take advantage of mechanisms already implemented in the base technology (e.g. FUSE, when used as one of the base technologies, allows the multi-namespaces module to specify the expiration times for cached directory entries in the operating system kernel, and also allows the contents locations module components to specify expiration times for some of the file system object attributes cached in the operating system kernel). Nevertheless, most of the caching mechanisms in the base technologies were not adapted to the specific virtual shell needs so, eventually, some embodiments implemented a specific cache mechanism to keep requestable object attributes and entry-related data in "local" components of the corresponding modules to reduce interactions with "remote" components. The cached data is provided with a limited-time lease. Additionally, the provider module may issue specific invalidation requests whenever the affected pieces of data are going to be modified from a different component.
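A time-limited lease combined with explicit invalidation can be sketched as below. The class `LeaseCache` and the `fetch` callback (standing for a request to a "remote" component) are hypothetical, and for simplicity the lease is attached locally when the data is fetched, which is an assumption of this sketch rather than a description of how the provider grants leases.

```python
import time


class LeaseCache:
    def __init__(self, lease_seconds, clock=time.monotonic):
        self._lease = lease_seconds
        self._clock = clock
        self._data = {}  # key -> (value, lease expiry time)

    def get(self, key, fetch):
        """Return the cached value while its lease is valid, else refetch."""
        now = self._clock()
        entry = self._data.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]
        value = fetch(key)  # interaction with the "remote" component
        self._data[key] = (value, now + self._lease)
        return value

    def invalidate(self, key):
        """Called when the provider announces a modification elsewhere."""
        self._data.pop(key, None)
```

The injectable `clock` parameter exists only to make the expiry behaviour easy to exercise without waiting for real time to pass.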
A particular detail of the cache implementation in some embodiments is that lease handling is decoupled from data requests (i.e. from requests asking the multi-namespaces module for specific entry information or asking the contents locations module for requestable object related information, for example); in other words, the lease is not sent back to the caching component together with the response to a request: instead, the response is sent as fast as possible from the component having the desired data, and it triggers a decoupled concurrent mechanism that will end up in the corresponding lease being sent to the requesting component afterwards. Though this may not seem intuitive, it may result in better response times, while the cache efficacy is not affected by the delay in a significant way. The number of leases granted for a specific piece of data can be limited. This limitation may help to keep the synchronization and invalidation costs bounded. Beyond this limit, the system may work as a no-cache system. The possible performance penalties could be compensated by the reduction of synchronization costs.
Most file systems store information related to the file system objects and the namespaces using specific structures (for example i-nodes and directories) which are usually grouped (e.g. the i-nodes are usually packed together in certain sections of disks, and directories are potentially huge collections of entries with the same "parent") and stored in some storage media managed by the file system. On the contrary, rather than grouping them in explicit structures, some embodiments of the virtual shell organize the data required by the different virtual shell modules as records in large tables indexed by a key value (or, at most, the combination of a few values) without any explicit grouping.
Therefore, any hash-like structure (a set of entries accessible via a key) would be adequate for storing the necessary pieces of data, such as entry information for the multi-namespaces module, requestable object attributes for the contents locations module or virtual objects and their virtual attributes for a virtual objects module (of course, other structures such as lists, ordered lists, tables, etc. would also be possible implementations, though probably less efficient, either in access time or required space).
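The flat, key-indexed organization can be pictured with two plain dictionaries. The key layout `(namespace, parent, name)` and the function names are assumptions chosen for the example; the source only states that records are indexed by a key value or a combination of a few values.

```python
# Entry table for the multi-namespaces module:
#   (namespace, parent identifier, entry name) -> requestable object identifier
entries = {}

# Attribute table for the contents locations module:
#   requestable object identifier -> attributes
objects = {}


def add_entry(namespace, parent, name, inum):
    entries[(namespace, parent, name)] = inum


def lookup(namespace, parent, name):
    """Resolve an entry to the attributes of its requestable object."""
    return objects[entries[(namespace, parent, name)]]


def list_children(namespace, parent):
    # No per-directory structure exists: a listing is a selection over keys.
    return sorted(n for (ns, p, n) in entries if ns == namespace and p == parent)
```

Because entries and object attributes live in separate tables, the same requestable object can be referred from entries in several namespaces without duplicating its attributes.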
In particular, databases fulfil the functionality of hashes, with some additional features: multiple keys, or atomic transactions, for example. Moreover, advanced database engines may also provide fault tolerance mechanisms or distribution support. For these reasons, some embodiments of the virtual shell can use one or more databases as backend for the data required by the different virtual shell modules. Essentially, any database could be used (Berkeley DB, MySQL, Oracle, Postgres, etc., just to name a few of them - of course, any missing feature could be completed by the code of the module using it). Specifically, some embodiments of the virtual shell can use Mnesia, a database that is part of the Erlang/OTP suite. Said database is optimized for simple queries in soft real time distributed environments, and has built-in support for transactions, fault tolerance mechanisms and data distribution. An interesting property is that said database is able to keep and manipulate its tables in memory (for efficiency) while sending the information in the background to a persistent medium (for safety).
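To illustrate how a database backend supplies composite keys and atomic updates, the sketch below uses SQLite through Python's standard `sqlite3` module. SQLite is substituted here purely for convenience and is not one of the engines named above; the table layout and the function names `open_store` and `register_entry` are assumptions.

```python
import sqlite3


def open_store():
    """Create an in-memory entry table keyed by (namespace, parent, name)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE entries ("
        " namespace TEXT, parent INTEGER, name TEXT, inum INTEGER,"
        " PRIMARY KEY (namespace, parent, name))"
    )
    return db


def register_entry(db, namespace, parent, name, inum):
    """Insert an entry atomically; return False if the name already exists."""
    try:
        with db:  # commits on success, rolls back if an exception is raised
            db.execute(
                "INSERT INTO entries VALUES (?, ?, ?, ?)",
                (namespace, parent, name, inum),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

The primary-key constraint gives the "entry already exists" failure used during file creation without any additional locking in the calling module.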
Although this invention has been disclosed in the context of certain preferred embodiments and examples, it will be understood by those skilled in the art that the present invention extends beyond the specifically disclosed embodiments to other alternative embodiments and/or uses of the invention and obvious modifications and equivalents thereof. Thus, it is intended that the scope of the present invention herein disclosed should not be limited by the particular disclosed embodiments described before, but should be determined only by a fair reading of the claims that follow.
Further, although the embodiments of the invention described with reference to the drawings comprise computer apparatus and processes performed in computer apparatus, the invention also extends to computer programs, particularly computer programs on or in a carrier, adapted for putting the invention into practice. The program may be in the form of source code, object code, a code intermediate source and object code such as in partially compiled form, or in any other form suitable for use in the implementation of the processes according to the invention. The carrier may be any entity or device capable of carrying the program.
For example, the carrier may comprise a storage medium, such as a ROM, for example a CD ROM or a semiconductor ROM, or a magnetic recording medium, for example a floppy disc or hard disk. Further, the carrier may be a transmissible carrier such as an electrical or optical signal, which may be conveyed via electrical or optical cable or by radio or other means.
When the program is embodied in a signal that may be conveyed directly by a cable or other device or means, the carrier may be constituted by such cable or other device or means. Alternatively, the carrier may be an integrated circuit in which the program is embedded, the integrated circuit being adapted for performing, or for use in the performance of, the relevant processes.

Claims

1. A virtual shell (100) for producing one or more responses (103) to a user request (102) related to one or more objects, the virtual shell (100) comprising:
• a context monitoring parameters module (116) for producing data related to one or more (202) context monitoring parameters (201);
• a multi-namespaces module (118) for producing data related to one or more (205) namespaces (206), each of said namespaces (206) comprising one or more (207) entries (208), each of said entries (208) referring to a set (210) of requestable objects (211), and each of the requestable objects (211) having a set (213) of requestable object attributes (212) and being referred from one or more (209) entries (208) of one or more of said namespaces (206), wherein the organization of each namespace (206) is decoupled from the attributes (212) of the requestable objects (211) referred from the entries (208) of the namespace (206), said organizations allowing arbitrarily structured namespaces (206);
• an underlying interface module (117) for producing data related to a set of underlying objects comprising data related to requestable objects related to the requested objects;
• a logic module (109) for interacting (112) with the context monitoring parameters module (116), for interacting (114) with the multi-namespaces module (118), for interacting (113) with the underlying interface module (117), and for producing data for the interception module (104) from a set of the data produced by the context monitoring parameters module (116), a set of the data produced by the multi-namespaces module (118) and a set of the data produced by the underlying interface module (117), the interaction (114) with the multi-namespaces module (118) depending on a subset of the set of the data produced by the context monitoring parameters module (116);
• an interception module (104) for intercepting the user request (102), for interacting (106) with the logic module (109) and for producing the responses (103) to the user request (102) from a set of the data produced by the logic module (109).
2. The virtual shell (100) according to claim 1, wherein the multi-namespaces module (118) is adapted to produce the data related to namespaces (206) further taking into account that each entry (208) of each namespace (206) refers to its related set (210) of requestable objects (211) through one or more (301) requestable object contents locations (300), each requestable object contents location (300) referring to a subset (303) of the related set (210) of requestable objects (211), and each of the requestable objects (211) being referred from one or more (302) requestable object contents locations (300);
wherein the virtual shell (100) further comprises a contents locations module (110) for producing data related to one or more references to one or more requestable objects (211) from data related to requestable object contents locations (300);
wherein the logic module (109) is adapted to interact (115) with the contents locations module (110) and to produce the set of data for the interception module (104) further from a set of the data produced by the contents locations module (110), the interaction (115) with the contents locations module (110) depending on a subset of the set of the data produced by the context monitoring parameters module (116).
3. The virtual shell (100) according to claim 1, wherein the multi-namespaces module (118) is adapted to produce the data related to namespaces (206) further taking into account that each entry (208) of each namespace (206) refers to its related set (210) of requestable objects (211) through one or more (400) virtual contents locations (401), each virtual contents location (401) referring to a set (403) of virtual objects (404) and each virtual object (404) referring to a subset (406) of the related set (210) of requestable objects (211), each of the requestable objects (211) being referred from one or more (405) virtual objects (404) and each virtual object (404) being referred from one or more (402) virtual contents locations (401);
wherein the virtual shell (100) further comprises a virtual objects module (107) for producing data related to one or more references to one or more requestable objects (211) from data related to virtual contents locations (401);
wherein the virtual objects module is adapted to produce the data related to references to requestable objects (211) further taking into account that each virtual object (404) has a set (410) of virtual attributes (409), said virtual attributes (409) being decoupled from attributes (212) related to requestable objects (211);
wherein the multi-namespaces module (118) is adapted to produce the data related to namespaces (206) further taking into account that the organization of each namespace (206) is decoupled from the virtual attributes (409) of the virtual objects (404) referred from the entries (208) of the namespace (206) through the virtual contents locations (401);
wherein the logic module (109) is adapted to interact (111) with the virtual objects module (107) and to produce the set of data for the interception module (104) further from a set of the data produced by the virtual objects module (107), the interaction (111) with the virtual objects module (107) depending on a subset of the set of the data produced by the context monitoring parameters module (116).
4. The virtual shell (100) according to claim 3, wherein the virtual objects module (107) is adapted to produce the data related to references to requestable objects further taking into account that each virtual object (404) refers to its related subset (406) of requestable objects (211) through one or more (500) requestable object contents locations (501), each requestable object contents location (501) referring to a subset (503) of the related subset (406) of requestable objects (211), and each of the requestable objects (211) being referred from one or more (502) requestable object contents locations (501);
wherein the virtual shell (100) further comprises a contents locations module (110) for producing data related to one or more references to one or more requestable objects (211) from data related to requestable object contents locations (501);
wherein the logic module (109) is adapted to interact (115) with the contents locations module (110) and to produce the set of data for the interception module (104) further from a set of the data produced by the contents locations module (110), said interaction (115) with the contents locations module (110) depending on a subset of the set of the data produced by the context monitoring parameters module (116).
5. The virtual shell (100) according to any of claims 3 or 4, wherein the virtual objects module (107) comprises one or more virtual objects sub-modules, each of said virtual objects sub-modules being adapted to access to virtual attributes (409) of virtual objects (404) and/or to attributes (212) of requestable objects (211), each of said attributes (409; 212) having a volatility level according to a predetermined volatility classification.
6. The virtual shell (100) according to any of claims 3 to 5, wherein the virtual objects module (107) comprises a contents locations virtual objects sub-module being adapted to access to virtual attributes (409) of virtual objects (404), said virtual attributes (409) referring to requestable objects (211).
7. The virtual shell (100) according to any of claims 3 to 6, wherein the virtual objects module (107) comprises a non-contents locations virtual objects sub-module being adapted to access to virtual attributes (409) of virtual objects (404), said virtual attributes (409) not referring to requestable objects (211).
8. The virtual shell (100) according to any of claims 3 to 7, wherein the virtual objects module (107) comprises:
• an exclusive process for each virtual object (404), said exclusive process having exclusivity for performing updating access to virtual attributes (409) of said virtual object (404).
9. The virtual shell (100) according to claim 8, wherein the virtual objects module (107) is adapted to activate and/or deactivate exclusive processes for virtual objects.
10. The virtual shell (100) according to claim 9, wherein the virtual objects module (107) is adapted to deactivate exclusive processes for virtual objects depending on an inactivity time threshold and/or available resources.
11. The virtual shell (100) according to any of claims 8 to 10, wherein each exclusive process for virtual object is adapted to deactivate itself and/or to activate/deactivate one or more of the other exclusive processes.
12. The virtual shell (100) according to claim 11, wherein each exclusive process for virtual object is adapted to deactivate itself and/or to deactivate other exclusive processes depending on an inactivity time threshold and/or available resources.
13. The virtual shell (100) according to any of claims 8 to 12, wherein each exclusive process for virtual object comprises one or more exclusive sub-processes, each of said exclusive sub-processes being dedicated to virtual attributes (409) of said virtual object (404), each of said virtual attributes (409) having a volatility level according to a predetermined volatility classification.
14. The virtual shell (100) according to any of claims 8 to 13, wherein each exclusive process for virtual object comprises a contents locations exclusive sub-process being dedicated to virtual attributes (409) of said virtual object (404), each of said virtual attributes (409) referring to requestable objects.
15. The virtual shell (100) according to any of claims 8 to 14, wherein each exclusive process for virtual object comprises a non-contents locations exclusive sub-process being dedicated to virtual attributes (409) of said virtual object (404), each of said virtual attributes (409) not referring to requestable objects.
16. The virtual shell (100) according to any of claims 1 to 15, the virtual shell (100) further comprising a non-context monitoring parameters module (105) for producing data related to one or more non-context monitoring parameters (204);
wherein the logic module (109) is adapted to interact (108) with the non-context monitoring parameters module (105) and to produce the set of data for the interception module (104) further from a set of the data produced by the non-context monitoring parameters module (105);
wherein the logic module (109) is adapted to interact (114) with the multi-namespaces module (118) further depending on a subset of the set of the data produced by the non-context monitoring parameters module (105).
17. The virtual shell (100) according to claim 16, when claim 16 depends on claim 2 or any of claims 4 to 15, when claim 8 depends on any of claims 4 to 7, when claim 7 depends on any of claims 4 to 6, when claim 6 depends on any of claims 4 or 5, when claim 5 depends on claim 4, wherein the logic module (109) is adapted to interact (115) with the contents locations module (110) further depending on a subset of the set of the data produced by the non-context monitoring parameters module (105).
18. The virtual shell (100) according to any of claims 1 to 17, wherein the multi-namespaces module (118) is adapted to produce a reproducible ordered list (605) of entries comprising one or more disjoint partial results (606) from a set of entries of namespaces, each entry of said set of entries having one or more related properties being able to be inputted to at least one hash function, said reproducible ordered list being obtained from:
• obtaining a list of distinct hash outputs (601) from applying the hash function to one or more of the properties of each entry of the set of entries;
• for each distinct hash output (602) according to a first ordering criteria:
  • obtaining a subset (604) of the set of entries being related to said distinct hash output (602);
  • ordering the obtained subset (604) of entries according to a second ordering criteria;
  • providing the ordered subset (607) of entries as one of the disjoint partial results.
19. The virtual shell (100) according to claim 18, wherein the multi-namespaces module (118) is adapted to obtain each hash output from selecting a set of bits from the result of applying the hash function to the properties of each entry of the set of entries, said set of bits depending on the number of entries from which the reproducible ordered list of entries is produced.
20. The virtual shell (100) according to any of claims 18 or 19, wherein each entry of namespaces comprises at least one hash container adapted to contain the result of applying the hash function to the properties of the entry.
21. The virtual shell (100) according to any of claims 18 to 20, wherein the multi-namespaces module (118) is adapted to select the hash function to be applied to one or more of the properties of each entry of the set of entries depending on data from the context monitoring parameters module (116) and/or from the non-context monitoring parameters module (105).
22. A context monitoring parameters module (116) for use in a virtual shell (100) according to any of claims 1 to 21, the context monitoring parameters module (116) comprising:
• computing means for interacting (112) with the logic module (109) of the virtual shell (100);
• computing means for producing data related to one or more (202) context monitoring parameters (201).
23. A multi-namespaces module (118) for use in a virtual shell (100) according to any of claims 1 to 21, the multi-namespaces module (118) comprising:
• computing means for interacting (114) with the logic module (109) of the virtual shell (100);
• computing means for producing data related to one or more (205) namespaces (206), each of said namespaces (206) comprising one or more (207) entries (208), each of said entries (208) referring to a set (210) of requestable objects (211), and each of the requestable objects (211) having a set (213) of requestable object attributes (212) and being referred from one or more (209) entries (208) of one or more of said namespaces (206), wherein the organization of each namespace (206) is decoupled from the attributes (212) of the requestable objects (211) referred from the entries (208) of the namespace (206), said organizations allowing arbitrarily structured namespaces (206).
24. An underlying interface module (117) for use in a virtual shell (100) for producing one or more responses (103) to a user request (102) related to one or more objects, the virtual shell (100) according to any of claims 1 to 21, the underlying interface module (117) comprising:
• computing means for interacting (113) with the logic module (109) of the virtual shell (100);
• computing means for producing data related to a set of underlying objects comprising data related to requestable objects related to the requested objects.
25. An interception module (104) for use in a virtual shell (100) for producing one or more responses (103) to a user request (102), the virtual shell (100) according to any of claims 1 to 21, the interception module (104) comprising:
• computing means for intercepting the user request (102);
• computing means for interacting (106) with the logic module (109) of the virtual shell (100);
• computing means for producing the responses (103) to the user request (102) from a set of data produced by the logic module (109).
26. A logic module (109) for use in a virtual shell (100) according to any of claims 1 to 21, the logic module (109) comprising:
• computing means for interacting (112) with the context monitoring parameters module (116) of the virtual shell (100);
• computing means for interacting (114) with the multi-namespaces module (118) of the virtual shell (100);
• computing means for interacting (113) with the underlying interface module (117) of the virtual shell (100);
• computing means for producing data for the interception module (104) of the virtual shell (100) from a set of data produced by the context monitoring parameters module (116), a set of data produced by the multi-namespaces module (118) and a set of data produced by the underlying interface module (117), the interaction (114) with the multi-namespaces module (118) depending on a subset of the set of data produced by the context monitoring parameters module (116).
27. A method for producing one or more responses (103) to a user request (102) related to one or more objects, the method comprising:
• producing, by means of a context monitoring parameters module (116), data related to one or more (202) context monitoring parameters (201);
• producing, by means of a multi-namespaces module (118), data related to one or more (205) namespaces (206), each of said namespaces (206) comprising one or more (207) entries (208), each of said entries (208) referring to a set (210) of requestable objects (211), and each of the requestable objects (211) having a set (213) of requestable object attributes (212) and being referred from one or more (209) entries (208) of one or more of said namespaces (206), wherein the organization of each namespace (206) is decoupled from the attributes (212) of the requestable objects (211) referred from the entries (208) of the namespace (206), said organizations allowing arbitrarily structured namespaces (206);
• producing, by means of an underlying interface module (117), data related to a set of underlying objects comprising data related to requestable objects related to the requested objects;
• producing, by means of a logic module (109), data for an interception module (104) from a set of the data produced by the context monitoring parameters module (116) and obtained by means of the logic module (109) interacting (112) with the context monitoring parameters module (116), a set of the data produced by the multi-namespaces module (118) and obtained by means of the logic module (109) interacting (114) with the multi-namespaces module (118), and a set of the data produced by the underlying interface module (117) and obtained by means of the logic module (109) interacting (113) with the underlying interface module (117), the interaction (114) of the logic module (109) with the multi-namespaces module (118) depending on a subset of the set of the data produced by the context monitoring parameters module (116);
• intercepting, by means of the interception module (104), the user request and producing the responses (103) to the user request (102) from a set of the data produced by the logic module (109) and obtained by means of the interception module (104) interacting (106) with the logic module (109).
28. The method according to claim 27, further comprising:
· producing, by means of the multi-namespaces module (118), the data related to namespaces (206) further taking into account that each entry (208) of each namespace (206) refers to its related set (210) of requestable objects (211) through one or more (301) requestable object contents locations (300), each requestable object contents location (300) referring to a subset (303) of the related set (210) of requestable objects (211), and each of the requestable objects (211) being referred from one or more (302) requestable object contents locations (300);
· producing, by means of a contents locations module (110), data related to one or more references to one or more requestable objects (211) from data related to requestable object contents locations (300);
· producing, by means of the logic module (109), the set of data for the interception module (104) further from a set of the data produced by the contents locations module (110) and obtained by means of the logic module (109) interacting (115) with the contents locations module (110), said interaction (115) of the logic module (109) with the contents locations module (110) depending on a subset of the set of the data produced by the context monitoring parameters module (116).
29. The method according to claim 27, further comprising:
· producing, by means of the multi-namespaces module (118), the data related to namespaces (206) further taking into account that each entry (208) of each namespace (206) refers to its related set (210) of requestable objects (211) through one or more (400) virtual contents locations (401), each virtual contents location (401) referring to a set (403) of virtual objects (404) and each virtual object (404) referring to a subset (406) of the related set (210) of requestable objects (211), each of the requestable objects (211) being referred from one or more (405) virtual objects (404) and each virtual object (404) being referred from one or more (402) virtual contents locations (401);
· producing, by means of a virtual objects module (107), data related to one or more references to one or more requestable objects (211) from data related to virtual contents locations (401);
· producing, by means of the virtual objects module (107), the data related to references to requestable objects (211) further taking into account that each virtual object (404) has a set (410) of virtual attributes (409), said virtual attributes (409) being decoupled from attributes (212) related to requestable objects (211);
· producing, by means of the multi-namespaces module (118), the data related to namespaces (206) further taking into account that the organization of each namespace (206) is decoupled from the virtual attributes (409) of the virtual objects (404) referred from the entries (208) of the namespace (206) through the virtual contents locations (401);
· producing, by means of the logic module (109), the set of data for the interception module (104) further from a set of the data produced by the virtual objects module (107) and obtained by means of the logic module (109) interacting (111) with the virtual objects module (107), said interaction (111) of the logic module (109) with the virtual objects module (107) depending on a subset of the set of the data produced by the context monitoring parameters module (116).
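The extra level of indirection in claim 29 might be pictured as follows. This is a hedged sketch, not the claimed implementation: `VirtualObject`, `VirtualObjectsModule`, the location names `vloc-1` and `vloc-2`, and the `owner`/`mode` attributes are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class RequestableObject:
    object_id: str
    attributes: dict  # attributes (212) as kept by the underlying system


@dataclass
class VirtualObject:
    name: str
    virtual_attributes: dict  # virtual attributes (409), stored separately
    object_ids: list = field(default_factory=list)  # referred requestable objects


class VirtualObjectsModule:
    """Hypothetical stand-in for the virtual objects module (107)."""

    def __init__(self):
        # virtual contents location -> list of virtual objects
        self._locations = {}

    def register(self, location, virtual_object):
        self._locations.setdefault(location, []).append(virtual_object)

    def references(self, location):
        """Return the requestable object ids reachable from a location."""
        found = []
        for virtual_object in self._locations.get(location, []):
            for object_id in virtual_object.object_ids:
                if object_id not in found:
                    found.append(object_id)
        return found


objects = {"obj1": RequestableObject("obj1", {"owner": "root", "mode": 0o600})}
report = VirtualObject("report", {"owner": "alice", "mode": 0o644}, ["obj1"])

module = VirtualObjectsModule()
# One virtual object may be referred from several virtual contents locations.
module.register("vloc-1", report)
module.register("vloc-2", report)

# Changing a virtual attribute leaves the requestable object's attributes alone.
report.virtual_attributes["owner"] = "bob"
```

Here the virtual attributes live in their own dictionary, so updating them never touches the attributes of the requestable object that the virtual object refers to.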
30. The method according to claim 29, further comprising:
· producing, by means of the virtual objects module (107), the data related to references to requestable objects further taking into account that each virtual object (404) refers to its related subset (406) of requestable objects (211) through one or more (500) requestable object contents locations (501), each requestable object contents location (501) referring to a subset (503) of the related subset (406) of requestable objects (211), and each of the requestable objects (211) being referred from one or more (502) requestable object contents locations (501);
· producing, by means of a contents locations module (110), data related to one or more references to one or more requestable objects (211) from data related to requestable object contents locations (501);
· producing, by means of the logic module (109), the set of data for the interception module (104) further from a set of the data produced by the contents locations module (110) and obtained by means of the logic module (109) interacting (115) with the contents locations module (110), said interaction (115) of the logic module (109) with the contents locations module (110) depending on a subset of the set of the data produced by the context monitoring parameters module (116).
31. A computer program product comprising program instructions for causing a computer to perform a method for producing one or more responses (103) to a user request (102) related to one or more objects, said method according to any of claims 27 to 30.
32. The computer program product according to claim 31, embodied on a storage medium.
33. The computer program product according to claim 31, carried on a carrier signal.
PCT/EP2011/053410 2010-03-08 2011-03-08 Virtual shell WO2011110534A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP11709070.4A EP2585946A2 (en) 2010-03-08 2011-03-08 Virtual shell

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US31165910P 2010-03-08 2010-03-08
EP10155837.7 2010-03-08
EP10155837 2010-03-08
US61/311,659 2010-03-08

Publications (2)

Publication Number Publication Date
WO2011110534A2 (en) 2011-09-15
WO2011110534A3 (en) 2012-01-19

Family

ID=42289405

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2011/053410 WO2011110534A2 (en) 2010-03-08 2011-03-08 Virtual shell

Country Status (2)

Country Link
EP (1) EP2585946A2 (en)
WO (1) WO2011110534A2 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080256354A1 (en) 2005-11-17 2008-10-16 Steven Blumenau Systems and methods for exception handling

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7925682B2 (en) * 2003-03-27 2011-04-12 Microsoft Corporation System and method utilizing virtual folders
KR101366220B1 (en) * 2006-05-23 2014-02-21 노리안 홀딩 코포레이션 Distributed storage
US8209365B2 (en) * 2007-07-23 2012-06-26 Hewlett-Packard Development Company, L.P. Technique for virtualizing storage using stateless servers

Also Published As

Publication number Publication date
WO2011110534A3 (en) 2012-01-19
EP2585946A2 (en) 2013-05-01

Similar Documents

Publication Publication Date Title
US10956601B2 (en) Fully managed account level blob data encryption in a distributed storage environment
US11669544B2 (en) Allocation and reassignment of unique identifiers for synchronization of content items
US8849759B2 (en) Unified local storage supporting file and cloud object access
US9043372B2 (en) Metadata subsystem for a distributed object store in a network storage system
US8548957B2 (en) Method and system for recovering missing information at a computing device using a distributed virtual file system
JP5047988B2 (en) Distributed storage system with web service client interface
JP4896541B2 (en) Discoverability and enumeration mechanisms in hierarchically secure storage systems
US20170123931A1 (en) Object Storage System with a Distributed Namespace and Snapshot and Cloning Features
US20060059204A1 (en) System and method for selectively indexing file system content
EP2329379A1 (en) Shared namespace for storage clusters
CN104778192B9 (en) Directory structure representing content addressable storage system
EP2585946A2 (en) Virtual shell
Schneller et al. Mysql Admin Cookbook Lite: Replication and Indexing
Štědronský A decentralized file synchronization tool
Klosterman Delayed instantiation bulk operations for management of distributed, object-based storage systems
Artiaga Amouroux File system metadata virtualization
Klosterman Delayed Instantiation Bulk Operations for Management of Distributed, Object-based Storage Systems (CMU–PDL–09–108)

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11709070

Country of ref document: EP

Kind code of ref document: A2

WPC Withdrawal of priority claims after completion of the technical preparations for international publication

Ref document number: 61/311,659

Country of ref document: US

Date of ref document: 20120125

Free format text: WITHDRAWN AFTER TECHNICAL PREPARATION FINISHED

Ref document number: 10155837.7

Country of ref document: EP

Date of ref document: 20120125

Free format text: WITHDRAWN AFTER TECHNICAL PREPARATION FINISHED

NENP Non-entry into the national phase in:

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2011709070

Country of ref document: EP