EP2776952A2 - Logically and end-user-specific physically storing an electronic file - Google Patents
Logically and end-user-specific physically storing an electronic file
- Publication number
- EP2776952A2 (application EP12784586.5A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- storage device
- physical storage
- file
- user
- storing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/11—File system administration, e.g. details of archiving or snapshots
- G06F16/122—File system administration, e.g. details of archiving or snapshots using management policies
Definitions
- the invention relates to a method of using a data storage system, to a data storage system, and to control software for executing the method.
- a data processing system such as a personal computer (PC), a mobile telephone or Smartphone, a laptop PC, a tablet PC, etc., has a data storage system configured for storing electronic files, e.g., electronic documents, video files, audio files, e-mails, still pictures, etc., for later retrieval.
- GUI: graphical user-interface
- the electronic files are typically presented as logically organized in file folders. That is, the user of the data storage system determines in advance the logical organization, or: grouping, of his/her collection of electronic files.
- the logical organization is represented in the GUI as a set of multiple file folders.
- the set may be a hierarchical set: a particular file folder at a certain level in the hierarchy may contain one or more further file folders at a next lower level in the hierarchy.
- Each particular one of the electronic files can then be accommodated in one or more specific ones of the multiple file folders, depending on a semantic aspect or on a plurality of semantic aspects of the particular electronic file.
- the expression “semantic aspect” refers to a characteristic of the particular electronic file that is meaningful to the user for the purpose of logically organizing the electronic files based on, e.g., a respective topical aspect of the contents of a respective one of the electronic files (e.g., a title of the respective electronic file), a respective source or respective author of the respective electronic file, a respective file format of the respective electronic file, etc.
- the characteristic enables the user to discriminate between different electronic files that, from the point of view of this individual user, belong to different file folders.
- the user may have created individual file folders for each individual file format of the electronic files.
- a first file folder is created for holding only video files
- a second file folder is created for holding only audio files
- a third file folder is created for holding only text documents
- a fourth file folder is created for holding only still pictures, etc. That is, in the first scenario, the organization of file folders is created on the basis of the file format or the rendering type of the electronic file.
- the user may have created individual file folders based on semantic topics.
- a fifth file folder is created for holding electronic files relating to a first topic, e.g., classic cars; a sixth file folder is created for holding electronic files relating to a second topic, e.g., science; a seventh file folder is created for holding electronic files relating to a third topic, e.g., performing arts; an eighth file folder is created for holding electronic files relating to a fourth topic, e.g., world history, etc.
- the logical organization of electronic files in the file folders represents a systematic classification of the electronic files.
- the classification is typically determined in advance and is specific to the individual end-user.
- the classification is meant to assist the end-user in managing his/her electronic files, e.g., for ease of identifying or of retrieving a particular one of the electronic files.
- the distribution of the electronic files among the file folders is therefore typically driven by the classification scheme that the individual end-user has adopted for his/her purposes and, therefore, is subjective.
- the data storage system is also configured for physically storing the electronic files at one or more physical storage devices of the data processing system.
- Typical examples of such physical storage devices are a hard disk drive (HDD), a solid-state drive (SSD), a network-attached storage device (NAS) and a server of a cloud service provider.
- the data storage system forms a functional component of a data processing system.
- Operation of the data storage system is typically controlled by a file system.
- the file system forms a part of the operating system of the data processing system.
- the file system interacts with a respective one of the physical storage devices via a respective device driver.
- Each respective device driver translates commands from the operating system into commands that are specific to the hardware of the respective physical storage device, typically using SCSI (Small Computer System Interface).
- SCSI is a set of standards for physically connecting a computer to one or more peripheral devices and for transferring data between the computer and the peripheral device.
- the operation of a conventional file system is typically such that there is a one-to-one relationship between a particular file folder and a particular physical storage device: all files allocated to a particular file folder are stored at a single physical storage device.
- a namespace is an abstract container or environment that is created to hold a logical grouping of unique identifiers, here: file names.
- An identifier defined in a particular namespace is associated only with that particular namespace. The same identifier can be independently defined in multiple namespaces.
- Data storage devices support namespaces. Data storage devices use directories (or folders) as namespaces.
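- as a minimal illustrative sketch in Python (not part of the original disclosure), each file folder can be modeled as its own namespace, so that the same file name is defined independently in two folders; the file name `notes.txt` and the stored strings are hypothetical:

```python
# Each file folder is modeled as a separate namespace: file name -> content.
# Folder names follow the topics mentioned earlier; contents are made up.
namespaces = {
    "classic cars": {},
    "science": {},
}

# The same identifier is defined independently in both namespaces.
namespaces["classic cars"]["notes.txt"] = "restoration checklist"
namespaces["science"]["notes.txt"] = "lab observations"


def resolve(folder, name):
    """Look up a file name within the namespace of one folder only."""
    return namespaces[folder][name]
```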
- Dynamic tiering or “dynamic storage tiering”.
- the dynamic tiering is enabled by controllers at the physical storage devices. Dynamic storage tiering allows root users, i.e., system administrators or processes pre-defined by system administrators, to move files among different volumes, allocate files to different volumes at file creation time, and independently recover volumes, without altering the namespace of the file system.
- the term “volume”, as used within the context of computer operating systems, refers to a single accessible storage area within a single file system, typically resident on a single partition of a hard disk.
- a SAN is a dedicated network that provides access to consolidated, block-level data storage.
- SANs are primarily used to make physical storage devices, such as disk arrays, tape libraries, and optical jukeboxes, accessible to servers, so that the physical storage devices appear as devices locally attached to the operating system.
- a SAN typically has its own network of physical storage devices that are generally not accessible through the local area network by other devices.
- a standard connection type for SAN in enterprise storage is Fibre Channel.
- Fibre Channel is a gigabit-speed network technology primarily used for storage networking. System administrators can move files among different volumes, and can allocate files to different volumes at Fibre Channel level. However, for the file system there is still a one-to-one relationship between the file folder and the physical storage device. The commands, however, are redirected at the level of the Fibre Channel controller.
- typical file systems maintain a one-to-one relationship between, on the one hand, a particular file folder logically storing one or more electronic files and, on the other hand, a physical storage device physically storing the electronic files of the particular file folder.
- Dynamic storage tiering and the combination of SAN and Fibre Channel are examples of storage scenarios, wherein the logical storage and physical storage can be decoupled. However, the decoupling is under control of the system administrator, and not under control of the end-user of the electronic files.
- control over determining of which particular one of the electronic files allocated to the same file folder is stored at which particular one of multiple physical storage devices is given to the end-user of the electronic files before actually storing the particular electronic file.
- a particular one of the physical storage devices is selected under combined control of a storage policy specific to this end-user, and one or more attributes of the particular electronic file.
- the storage policy specifies the requirements of this end-user with regard to the storing of his/her electronic files in, and/or the retrieval of his/her electronic files from, the data storage system.
- the requirements are specified in terms of one or more storage properties of each individual one of the physical storage devices available to this end-user, and in terms of one or more attributes that an electronic file can have and that are relevant to this end-user.
- the storage policy determines for each electronic file at which one of the physical storage devices the electronic file is going to be stored, given the one or more attributes of the electronic file and given the storage properties of the physical storage devices.
- the applicable storage policy may be determined in advance, or may be determined solely from an explicit user-input at the time of logically allocating a particular electronic file to the particular file folder, or may be determined dynamically as a result of a history of this end-user interacting with the data storage system, or the applicable storage policy may be determined by a combination of two or more of these determining factors.
- the data storage system is configured for physically storing a specific electronic file of a specific file folder at a specific one of the physical storage devices, independent of at which one of the physical storage devices another electronic file of the same file folder has been stored.
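- the difference between the conventional one-to-one relationship and the per-file placement described above may be sketched as follows; this is an illustrative Python model only, and the file names and device labels (`ssd`, `nas`, `hdd`) are hypothetical:

```python
# Conventional file system: every file of a folder lives on one device.
folder_to_device = {"My Media": "hdd"}

# Decoupled placement: the device is recorded per (folder, file name) pair,
# so two files of the same folder may live on different devices.
placement = {
    ("My Media", "holiday.mp4"): "ssd",
    ("My Media", "archive.zip"): "nas",
}


def files_in_folder(folder):
    """Logical view: the folder still lists all of its files."""
    return sorted(name for (f, name) in placement if f == folder)


def devices_used_by(folder):
    """Physical view: the set of devices that hold files of this folder."""
    return {device for (f, _), device in placement.items() if f == folder}
```

The logical view of the folder is unchanged, while the physical view shows more than one device behind it.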
- storage property refers to one or more characteristics of the functioning of the relevant physical storage device in operational use, such as, for example, speed of storing or of retrieving of data (e.g., expressed in Mbit/sec), storage capacity available, cost per unit of storage capacity (e.g., cost per megabit of storage), latency on a data connection via which data is stored at, or retrieved from, a relevant one of the first physical storage device and the second physical storage device, whether the relevant one of the first physical storage device and the second physical storage device is a Write-Once-Read-Many (WORM) storage device, geographic location of the relevant physical storage device, etc.
- WORM: Write-Once-Read-Many
- the storage property of a physical storage device can be quantified by a value of a parameter indicative of a specific characteristic, or by multiple values, each respective one of the values being of a respective one of multiple parameters and being indicative of a respective one of multiple characteristics.
- the storage property of a specific physical storage device is characterized by a specific region in a one-dimensional parameter space or in a multi-dimensional parameter space, wherein each particular dimension is indicative of a particular characteristic, or of a particular combination of characteristics, of the kinds specified above.
- a role of the quantity "storage property" as discussed above, is to be able to discriminate between different types of physical storage devices and to be able to determine which one or which ones suit the purposes of the end-user best.
- the storage properties of the physical storage devices may therefore also be specified in qualitative terms so long as a comparison is possible between different ones of the physical storage devices on the basis of their storage properties to determine a suitable one of the physical storage devices given the end-user's requirements laid down in the storage policy.
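- one way to picture a storage property as a point in a multi-dimensional parameter space, and an end-user requirement as a region of that space, is sketched below in Python; the class names, the chosen dimensions and all numeric values are assumptions made for illustration, not values from the disclosure:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageProperty:
    speed_mbit_s: float       # speed of storing or retrieving
    cost_per_megabit: float   # cost per unit of storage capacity
    latency_ms: float         # latency on the data connection
    worm: bool                # Write-Once-Read-Many device or not


@dataclass(frozen=True)
class Requirement:
    """A region of the parameter space that suits the end-user."""
    min_speed_mbit_s: float = 0.0
    max_cost_per_megabit: float = float("inf")
    max_latency_ms: float = float("inf")
    needs_worm: bool = False

    def admits(self, p: StorageProperty) -> bool:
        return (p.speed_mbit_s >= self.min_speed_mbit_s
                and p.cost_per_megabit <= self.max_cost_per_megabit
                and p.latency_ms <= self.max_latency_ms
                and (p.worm or not self.needs_worm))


# Hypothetical devices with made-up property values.
devices = {
    "ssd": StorageProperty(4000.0, 0.010, 0.1, False),
    "nas": StorageProperty(800.0, 0.002, 5.0, False),
    "worm_disc": StorageProperty(100.0, 0.001, 50.0, True),
}


def suitable(requirement):
    """Names of the devices whose storage property lies in the required region."""
    return sorted(n for n, p in devices.items() if requirement.admits(p))
```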
- the inventors propose a method of using a data storage system.
- the data storage system comprises at least a first physical storage device and a second physical storage device.
- the first physical storage device has a first storage property and the second physical storage device has a second storage property.
- the first storage property differs from the second storage property.
- the data storage system is configured for logically storing one or more respective ones of multiple electronic files of an end-user at a respective one of a plurality of file folders, and for physically storing one or more respective ones of the electronic files at a respective one of the first physical storage device and the second physical storage device.
- the method comprises receiving a request for storing a particular one of the electronic files in a particular one of the file folders.
- the method further comprises, in response to the request, performing the following actions: logically storing the particular electronic file in the particular one of the file folders; determining an attribute of the particular electronic file; determining a storage policy of the end-user that depends on the first storage property and the second storage property; selecting, under combined control of the attribute and the storage policy, a particular one of the first physical storage device and the second physical storage device; and physically storing the particular electronic file at the particular physical storage device.
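- the five actions of the method may be sketched as a toy in-memory model in Python; the class `DataStorageSystem`, the policy function `size_policy`, the device labels and the file names are all hypothetical, and a real implementation would sit in or near the file system rather than in application code:

```python
import os


class DataStorageSystem:
    """Toy model: each physical storage device is a dict keyed by (folder, name)."""

    def __init__(self, device_names, policy):
        self.devices = {name: {} for name in device_names}
        self.policy = policy   # end-user's policy: attribute dict -> device name
        self.folders = {}      # folder -> {file name -> device name}

    def store(self, folder, name, data):
        # 1. Logically store the file in the requested folder.
        entries = self.folders.setdefault(folder, {})
        entries[name] = None   # device not yet chosen
        # 2. Determine an attribute of the file (here: size and extension).
        attribute = {"size": len(data),
                     "extension": os.path.splitext(name)[1].lower()}
        # 3. Determine the end-user's storage policy (fixed at construction here).
        policy = self.policy
        # 4. Select a device under combined control of attribute and policy.
        device = policy(attribute)
        # 5. Physically store the file at the selected device.
        self.devices[device][(folder, name)] = data
        entries[name] = device
        return device

    def load(self, folder, name):
        device = self.folders[folder][name]
        return self.devices[device][(folder, name)]


def size_policy(attribute):
    # Assumed policy: files above 1 MB go to the cheaper, slower device.
    return "cheap_slow" if attribute["size"] > 1_000_000 else "fast_expensive"


system = DataStorageSystem(["fast_expensive", "cheap_slow"], size_policy)
small_device = system.store("My Media", "note.txt", b"hello")
large_device = system.store("My Media", "film.mp4", b"\0" * 2_000_000)
```

In this sketch both files appear in the same file folder, yet each is held by a different device, and retrieval consults the per-file record rather than the folder.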
- the electronic files of the end-user are logically organized in one or more file folders, and different ones of the electronic files logically stored in the same particular file folder are physically stored at different ones of multiple physical storage devices under combined control of the attributes of the electronic files and a storage policy that is specific to the end-user.
- the invention therefore enables each specific end-user of electronic files to exploit the different storage properties available at his/her data storage system so as to optimize his/her usage of the data storage system with regard to, e.g., speed, costs, latency, etc., under control of the attributes of the individual electronic files.
- Two different end-users may each request to store a respective copy of the same electronic file while applying different storage policies or different storage preferences, resulting in storage of the respective copy at respective physical storage devices having different storage properties.
- the invention operates in a layer between the operating system and the file system.
- the file system manages each specific one of the file folders as a specific logical storage container.
- a particular logical storage container is implemented physically by means of multiple physical storage devices with different operational characteristics (e.g., speed, capacity, costs, etc).
- a particular electronic file logically stored in a particular file folder is physically stored at a specific one of the physical storage devices physically implementing the particular file folder.
- the specific physical storage device is selected under control of the operational characteristics of the specific physical storage device, optionally with respect to the operational characteristics of the other physical storage devices that physically implement the particular file folder.
- the request for storing the particular electronic file is initiated by, or on behalf of, the end-user of the particular electronic file.
- the file system receives the request from the operating system.
- the operating system runs on, e.g., a desktop PC, a Smartphone, a picture camera, etc.
- the request for storing the particular electronic file is generated as, e.g., a manual input from the end-user via a user-interface (e.g., alphanumeric keyboard) of a data processing system, of which the data storage system forms a functional component.
- the request is created automatically, e.g., via a pre-determined script, in response to the data processing system detecting an electronic file having been downloaded.
- the data processing system controls the GUI to generate a pop-up window with a question: "on what kind of physical storage device would you like to store this particular electronic file", and with a menu of the options available for physically storing the particular electronic file currently considered.
- the menu has a first option “fast”, a second option “medium”, and a third option “slow”.
- the first option “fast” represents a physical storage device that is capable of storing data and/or retrieving data at high data transfer rate (e.g., solid-state memory).
- the second option “medium” represents another physical storage device that is capable of storing data and/or retrieving data at a maximum data transfer rate that is substantially lower than that represented by the first option “fast” (e.g., a local hard-disk drive).
- the third option “slow” represents yet another physical storage device that is capable of storing data and/or retrieving data at a maximum data transfer rate that is substantially lower than that represented by the second option “medium” (e.g., a writable optical disc or a network-attached storage device (NAS)).
- a first physical storage device comprises a hard disk drive local to the user's PC
- a second physical storage device comprises a NAS accessible from the user's PC via a data network.
- the local hard disk drive is represented in the menu as the option “fast”
- the NAS is represented in the menu as the option “slow”.
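- the menu labels are relative to the devices actually available, which is why a local hard disk drive is “medium” in one example and “fast” in another; a short Python sketch of such relative labeling follows, in which the function `build_menu` and the transfer rates are hypothetical:

```python
def build_menu(rates_mbit_s):
    """Map menu labels to devices, ranked by descending data transfer rate.

    Only two or three devices are handled in this sketch.
    """
    labels_by_count = {2: ["fast", "slow"], 3: ["fast", "medium", "slow"]}
    ranked = sorted(rates_mbit_s, key=rates_mbit_s.get, reverse=True)
    labels = labels_by_count[len(ranked)]
    return dict(zip(labels, ranked))


# Made-up rates for illustration only.
two_devices = build_menu({"local_hdd": 1200, "nas": 300})
three_devices = build_menu({"local_hdd": 1200, "nas": 300, "ssd": 4000})
```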
- “My Media”: a logical directory structure with a file folder named “My Media”.
- the menu has a first option: “expensive - high performance” and a second option: “cheap - low performance”.
- the first option “expensive - high performance” represents a physical storage device, capable of operating at a high data transfer rate but also, alas, expensive per unit of data stored.
- the second option “cheap - low performance” represents another physical storage device that is only capable of operating at a data transfer rate lower than that of the physical storage device in the first option but that is, however, inexpensive per unit of data stored.
- the user may contemplate future usage of the particular file to be stored and adopt a storage option that best fits his/her contemplated usage.
- an audio file may therefore as well be stored at a physical storage device with a low performance so as to not unduly occupy storage space of the high-performance physical storage device and so as to reduce the costs per audio file stored.
- a high-performance physical storage device, e.g., a solid-state memory
- the menu may indicate that the storage of the particular file will cost the user, say, $0.05 per year if stored at a specific physical storage device, and will cost the user, say, $0.15 per year if stored at another physical storage device.
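- how such per-file cost indications could be computed for the menu is sketched below in Python; the prices per gigabyte per year, the device labels and the 5 GB file size are assumptions chosen so that the figures match the example amounts above:

```python
from decimal import Decimal


def yearly_cost(size_gb, price_per_gb_year):
    """Cost in dollars per year, rounded to whole cents."""
    cost = Decimal(size_gb) * Decimal(price_per_gb_year)
    return cost.quantize(Decimal("0.01"))


# Hypothetical price list in dollars per gigabyte per year.
prices = {"cloud archive": "0.01", "local SSD": "0.03"}


def menu_lines(size_gb):
    """One menu line per storage option, with its cost indication."""
    return [f"{device}: ${yearly_cost(size_gb, price)} per year"
            for device, price in prices.items()]
```

Decimal arithmetic is used here so that amounts in cents are not distorted by binary floating-point rounding.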
- the one or more rules, specific to a particular end-user may be pre-determined and fixed or, alternatively, may be made to change dynamically in dependence on a change in
- cloud storage is a storage option available to the particular end-user, and is physically represented by a particular one of the first physical storage device and the second physical storage device.
- the costs per unit storage, as charged by the cloud storage provider per unit of data, may change over time.
- the change of costs may cause a change in the storage policy adopted by this particular end-user.
- the costs per unit data are reduced so that the policy is adjusted towards storing more electronic files in the cloud and fewer locally at the home network of this particular end-user.
- the particular end-user adds a further physical storage device to his/her data storage system. Then, the storage policy for this particular end-user is changed so as to also take into account the storage property of this further physical storage device.
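- a storage policy that adapts to such changes can be sketched by evaluating it against the current device table each time a file is stored; in the Python sketch below the policy `cheapest_device`, the device labels and all cost figures are hypothetical:

```python
def cheapest_device(cost_per_unit):
    """Assumed policy: pick the device with the lowest cost per unit of data."""
    return min(cost_per_unit, key=cost_per_unit.get)


# Made-up costs per unit of data for the devices available to one end-user.
costs = {"home_nas": 0.03, "cloud": 0.05}
choice_before = cheapest_device(costs)

costs["cloud"] = 0.01        # the cloud storage provider lowers its price
choice_after_price_cut = cheapest_device(costs)

costs["new_hdd"] = 0.005     # the end-user adds a further physical storage device
choice_after_new_device = cheapest_device(costs)
```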
- a first rule specifies that the particular file always be stored on the fastest physical storage device available.
- a second rule specifies that the physical storage device available for storing an electronic file larger than 1 MB is the least expensive one available.
- a third rule specifies that the physical storage device available for storing an electronic file with a file extension relating to streaming of audio or of video is the one available that has a performance that matches the streaming requirements. Accordingly, the physical storage device available for storing a streaming audio file of a size smaller than 1 MB is the fastest one among the least expensive ones available that are capable of audio streaming.
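- one possible reading of how the three rules combine is sketched below in Python: the streaming rule narrows the candidates, the size rule narrows them further to the least expensive, and the speed rule breaks any remaining tie. The order of precedence, the set of streaming extensions, the minimum streaming rate and the device figures are all assumptions, since the text does not fix them:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    speed_mbit_s: float
    cost_per_mb: float


STREAMING_EXTENSIONS = {".mp3", ".mp4"}   # assumed set of streaming formats
STREAMING_MIN_MBIT_S = 100.0              # assumed streaming requirement
SIZE_THRESHOLD = 1_000_000                # 1 MB, as in the second rule


def select_device(devices, size_bytes, extension):
    candidates = list(devices)
    # Third rule: streaming files need a device that can sustain streaming.
    if extension in STREAMING_EXTENSIONS:
        candidates = [d for d in candidates
                      if d.speed_mbit_s >= STREAMING_MIN_MBIT_S]
    if not candidates:
        raise LookupError("no device meets the streaming requirement")
    # Second rule: files larger than 1 MB go to the least expensive candidate.
    if size_bytes > SIZE_THRESHOLD:
        lowest = min(d.cost_per_mb for d in candidates)
        candidates = [d for d in candidates if d.cost_per_mb == lowest]
    # First rule: among what is left, take the fastest device.
    return max(candidates, key=lambda d: d.speed_mbit_s).name


# Hypothetical devices with made-up figures.
DEVICES = [
    Device("ssd", 4000.0, 0.10),
    Device("hdd", 1200.0, 0.03),
    Device("tape", 50.0, 0.01),
]
```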
- Storage policies may be created, e.g., by a service provider or another supplier, for different sets of physical storage devices, each respective one thereof with a respective storage property.
- a specific storage policy is optimized for a specific parameter (e.g., speed, latency, costs, capacity, etc.), or for a specific combination of parameters (e.g., speed and costs, latency and costs, capacity and costs and latency, etc.).
- the end-user of the electronic files may then obtain from the service provider a specific storage policy that matches the specific variety in storage properties of the set of physical storage devices of his/her data storage system.
- the end-user uploads to the service provider information representative of his/her set of physical storage devices, as well as further information about his/her typical usage and about what parameters or qualities of the physical storage devices are deemed relevant by this end-user.
- the service provider then creates a storage policy based on the information received from the end-user and programs the end-user's data storage system via, e.g., the Internet, so as to configure the data storage system for complying with the storage policy created.
- the service provider obtains a profile from this end-user with regard to the usage and storage of electronic files, and the service provider can use this profile to customize or otherwise adjust a service provided to this end-user.
- the attribute of the particular electronic file may be determined by metadata descriptive of the particular electronic file and available from, e.g., a header of the particular electronic file.
- the metadata may be representative of, e.g., a size of the particular electronic file, a file format of the particular electronic file, a title or other semantic indication of the particular electronic file, such as an author, a date of creation of the particular electronic file, a topic or semantic class of the particular electronic file, whether or not the particular electronic file is subjected to an auditing policy (so as to determine whether or not to physically store the particular electronic file at a WORM storage device), etc.
- the attribute is determined by a user-specific context of the request for storing, e.g., a time of the day or the day of the week of receiving the request for storing; an identity of the end-user who, or on behalf of whom, the request for storing is created, a history of recent further requests for storing or a profile of the end-user determined in advance, etc.
- the attribute is indicative of a respective cost indication for storing the particular file at a respective one of the first physical storage device and the second physical storage device.
- the attribute is determined by explicit input from the end-user via the user-interface of the data processing system, of which the data storage system forms a functional component.
- the data storage system has been configured in advance for enabling the user to specify, at the start of the storing procedure of the particular electronic file, a parameter value indicative of whether or not the end-user is going to need this particular electronic file in the near future. If the end-user specifies he/she is going to need the particular electronic file in the near future, the data storage system is configured for physically storing the particular electronic file locally at the end-user's data processing system (e.g., at the HDD of the end-user's PC). If the end-user specifies that he/she is not going to need this particular electronic file in the foreseeable future, the data storage system is configured for physically storing the particular electronic file at a server of a cloud storage provider.
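- the different sources of an attribute (metadata, context of the request, explicit user input) and the near-future example can be sketched together in Python; the function names, the dictionary keys and the device labels `local_hdd` and `cloud` are hypothetical:

```python
import os
from datetime import datetime


def determine_attributes(name, data, received_at, user_id, needed_soon):
    """Collect attributes from metadata, request context and explicit input."""
    return {
        # Metadata descriptive of the file itself.
        "size": len(data),
        "file_format": os.path.splitext(name)[1].lstrip(".").lower(),
        # User-specific context of the request for storing.
        "hour_of_day": received_at.hour,
        "day_of_week": received_at.weekday(),   # Monday is 0
        "user": user_id,
        # Explicit input from the end-user at the start of the storing procedure.
        "needed_soon": needed_soon,
    }


def place(attributes):
    """Near-future need keeps the file local; otherwise it goes to the cloud."""
    return "local_hdd" if attributes["needed_soon"] else "cloud"


attrs = determine_attributes(
    "Report.PDF", b"%PDF-", datetime(2024, 1, 1, 9, 30), "user-1", needed_soon=True
)
```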
- end-user may refer to an individual consumer, but also to an individual member of an organization such as a department of a commercial enterprise, a governmental department, a university, or to a group of such individual members.
- the end-user is a group of individuals.
- the storage policy for storing electronic files for this group of individuals is under combined control of the one or more attributes of each specific one of the electronic files and one or more rule bases. That is, the storage policy may be uniform for all individuals in the group or, alternatively, may depend on the specific identity of each specific individual of the group, for example, as indicated by the specific individual's job title or role in the group.
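- a group policy with a uniform default rule base and an optional role-specific rule base may be sketched as follows in Python; the role name `archivist`, the attribute key `audited` and the device labels are hypothetical:

```python
def default_rules(attribute):
    # Uniform rule base: audited files go to WORM storage, others stay local.
    return "worm" if attribute.get("audited") else "local_hdd"


def archivist_rules(attribute):
    # Role-specific rule base: non-audited files go to the cloud instead.
    return "worm" if attribute.get("audited") else "cloud"


# Roles without an entry fall back to the uniform rule base.
ROLE_RULES = {"archivist": archivist_rules}


def select_for_member(role, attribute):
    """Select a device from the file attribute and the member's rule base."""
    rules = ROLE_RULES.get(role, default_rules)
    return rules(attribute)
```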
- the data storage system forms a functional part of a data processing system of the end-user.
- the first physical storage device comprises remote storage connected via a data network to the data processing system; and the second physical storage device is local to the data processing system.
- the first physical storage device and the second physical storage device are managed by a cloud service provider.
- “cloud computing” and “cloud service” refer to the delivery of hosted services over the Internet.
- “cloud” in the expressions “cloud computing” and “cloud service” refers to the Internet and was reportedly inspired by the cloud-like symbol that has typically been used to represent the Internet in diagrams and flowcharts.
- a hosted service can be provided by a server or by a conglomerate of multiple servers located anywhere in the cloud.
- a cloud service is available to the public and to enterprises, is accessed through the Internet or via a wide area network (WAN), and is purchased on an as-needed basis.
- WAN: wide area network
- a cloud service is fully managed by the cloud service provider.
- a cloud service is elastic in the sense that a user can have as much or as little of the cloud service, e.g., compute power or data storage capacity, as he/she wants at any given time.
- An example of a cloud service is cloud storage.
- Cloud storage refers to data storage capacity that is accessed by the end-user through the Internet or via a WAN.
- Cloud storage uses an infrastructure of a plurality of storage systems that may be distributed among different geographic locations or regions. Cloud storage has become increasingly relevant, especially to enterprises and governmental agencies.
- the first physical storage device and the second physical storage device in the above further embodiment are managed by the cloud service provider.
- the first physical storage device and the second physical storage device have different storage properties. Accordingly, the cloud service provider can offer a customized storage policy to the individual end-user, based on the preferences of the individual end-user, and charge accordingly.
- the invention further relates to a data storage system, wherein the data storage system comprises at least a first physical storage device and a second physical storage device.
- the first physical storage device has a first storage property and the second physical storage device has a second storage property different from the first storage property.
- the data storage system is operative to logically store one or more respective ones of multiple electronic files of an end-user at a respective one of a plurality of file folders, and to physically store one or more respective ones of the electronic files at a respective one of the first physical storage device and the second physical storage device.
- the data storage system comprises a file system and an input for receiving a request for storing a particular one of the electronic files in a particular one of the file folders.
- the file system is configured for, in response to the request: logically storing the particular electronic file in the particular one of the file folders; determining an attribute of the particular electronic file; determining a storage policy of the end-user that depends on the first storage property and the second storage property; selecting, under combined control of the attribute and the storage policy, a particular one of the first physical storage device and the second physical storage device; and physically storing the particular electronic file at the particular physical storage device.
- the attribute is determined by at least one of: metadata in the particular electronic file that is descriptive of the particular electronic file; explicit user input from the end-user; a user-specific context of the request for storing; a history of recent further requests for storing; and a pre-determined profile of the end-user.
- the attribute is indicative of at least one of: a size of the particular electronic file; a time of creation or availability of the particular electronic file; a file format of the particular electronic file; a semantic aspect of a content of the particular electronic file; whether or not the particular electronic file is subjected to an auditing policy; whether or not the end-user is going to need the particular electronic file in the near future; a time of the day at which the request for storing is received; a day of the week on which the request for storing is received; a respective cost indication for storing the particular file at a respective one of the first physical storage device and the second physical storage device; and a history of recent further requests for storing.
- the data storage system forms a functional part of a data processing system of the end-user.
- the first physical storage device comprises remote storage connected via a data network to the data processing system; and the second physical storage device is local to the data processing system.
- the invention also relates to control software for configuring a data storage system.
- the control software may be commercially supplied as recorded on a computer-readable medium such as a magnetic disk, an optical disc, a solid-state memory, etc., or may be made available for being downloaded via a data network such as the Internet.
- the data storage system comprises at least a first physical storage device and a second physical storage device.
- the first physical storage device has a first storage property and the second physical storage device has a second storage property different from the first storage property.
- the data storage system is operative to logically store one or more respective ones of multiple electronic files of an end-user at a respective one of a plurality of file folders, and to physically store one or more respective ones of the electronic files at a respective one of the first physical storage device and the second physical storage device.
- the data storage system comprises an input for receiving a request for storing a particular one of the electronic files in a particular one of the file folders.
- the control software comprises: first instructions for, in response to the request, logically storing the particular electronic file in the particular one of the file folders; second instructions for, in response to the request, determining an attribute of the particular electronic file; third instructions for, in response to the request, determining a storage policy of the end-user that depends on the first storage property and the second storage property; fourth instructions for, in response to the request and under combined control of the attribute and the storage policy, selecting a particular one of the first physical storage device and the second physical storage device; and fifth instructions for, in response to the request, physically storing the particular electronic file at the particular physical storage device.
- the attribute is determined by at least one of: metadata in the particular electronic file that is descriptive of the particular electronic file;
- the attribute is indicative of at least one of: a size of the particular electronic file; a time of creation or availability of the particular electronic file; a file format of the particular electronic file; a semantic aspect of a content of the particular electronic file; whether or not the particular electronic file is subjected to an auditing policy; whether or not the end-user is going to need the particular electronic file in the near future; a time of the day at which the request for storing is received; a day of the week on which the request for storing is received; a respective cost indication for storing the particular file at a respective one of the first physical storage device and the second physical storage device; and a history of recent further requests for storing.
- Fig.1 is a block diagram for illustrating a data processing system comprising a data storage system of the invention.
- Fig.2 is a process diagram for illustrating a method in the invention.
- the invention relates to a data processing system that has a data storage system for storing electronic files.
- the data storage system enables the end-user to logically store different electronic files in the same file folder and to physically store different electronic files allocated to the same file folder at different ones of multiple physical storage devices.
- This approach supports optimizing a storage usage for this particular end-user of the electronic files, based on the storage properties of individual ones of the physical storage devices, and under combined control of the attributes of the individual electronic files and a storage policy specific to the end-user.
- Fig.1 is a block diagram of a data processing system 100.
- the data processing system 100 comprises, for example, any of an individual end-user's PC, a laptop PC, a tablet PC, a Smartphone, an individual end-user's home network, etc., or a computer network of an organization such as a commercial enterprise, a governmental department, a university, etc.
- the data processing system 100 has a user interface 102, and also has a data storage system 104 according to the invention.
- the data storage system 104 is configured for logically organizing a plurality of electronic files in one or more file folders.
- the user interface 102 has a display monitor 106 for graphically representing to a user of the data processing system 100 a logical organization of the file folders stored at the data storage system 104.
- the file folders may form a hierarchical folder configuration.
- the hierarchical folder configuration has a first level of first-level file folders 108.
- a specific one of the first-level file folders 108 may itself contain one or more second-level file folders.
- the specific one of the first-level file folders 108 is indicated with a reference numeral 110, and is shown to comprise a second level of second-level file folders 112.
- a particular one of the second-level file folders 112 may itself contain one or more third-level file folders.
- the particular one of the second-level file folders 112 is indicated with a reference numeral 114, and is shown to comprise a third level of third-level file folders 116, and so on.
- a file folder in the logical organization may be empty, or may contain one or more further file folders, and/or may contain one or more electronic files.
- FIG.1 shows the logical organization of the file folders as a branching configuration rendered on the display monitor 106.
- Other examples (not shown) of visually representing the logical organization of the file folders on a display monitor are feasible. See, e.g., international application publication WO0244935 filed for Greg Roelofs for "GUI has library metaphor based on non-euclidean geometry" and incorporated herein by reference.
- International application publication WO0244935 describes a data processing system with a GUI that enables the user to interact with a virtual environment.
- the environment has a graphical representation of a storage based on a library metaphor.
- the storage is being used to graphically archive information items.
- the virtual environment has a path-dependent geometry. This allows modification of the storage to add additional items without visually disrupting the organization of the items stored previously.
- the data storage system 104 comprises a plurality of physical storage devices, e.g., a first physical storage device 118, a second physical storage device 120, and a third physical storage device 122.
- the data storage system 104 is also configured for physically organizing the plurality of electronic files by means of allocating each particular one of the electronic files to one of the first physical storage device 118, the second physical storage device 120 and the third physical storage device 122.
- the data storage system 104 comprises a file system 126 operative to manage the logical organization of the electronic files as well as the physical organization of the electronic files.
- the first physical storage device 118 and the second physical storage device 120 are local to the data processing system 100, whereas the third physical storage device 122 is remote to the data processing system 100 and is accessed from the data processing system 100 via a data network 124, e.g., the Internet.
- the data processing system 100 comprises, for example, an end-user's PC
- the first physical storage device 118 comprises, e.g., a local hard disk drive (HDD)
- the second physical storage device 120 comprises, e.g., a solid-state drive (SSD) accommodated at the PC.
- HDD hard disk drive
- SSD solid-state drive
- the data processing system 100 comprises the end-user's home network
- the first physical storage device 118 comprises, e.g., a local HDD of the user's desktop PC
- the second physical storage device 120 comprises, e.g., the end-user's direct-attached storage (DAS).
- DAS direct-attached storage
- a typical DAS comprises a number of HDDs connected to the PC through a host bus adapter (HBA).
- HBA host bus adapter
- the data processing system 100 comprises the end-user's home network
- the first physical storage device 118 comprises, e.g., a HDD of the user's desktop PC
- the second physical storage device 120 comprises, e.g., an SSD of the end-user's Smartphone temporarily connected to the home network.
- the data processing system 100 comprises the end-user's home network
- the first physical storage device 118 comprises, e.g., a local HDD of the user's desktop PC
- the second physical storage device 120 comprises, e.g., network-attached storage (NAS) connected to the desktop PC via the end-user's local area network (LAN).
- NAS network-attached storage
- the data processing system 100 comprises a corporate computer network of a commercial enterprise.
- the first physical storage device 118 comprises read/write storage managed by the commercial enterprise, and the second physical storage device 120 comprises write-once-read-many (WORM) storage, managed by the commercial enterprise.
- WORM write-once-read-many
- the third physical storage device 122 is remote to the data processing system 100, and is connected to the data processing system 100 via a data network 124, e.g., the Internet or another wide area network (WAN).
- the third physical storage device 122 is then formed by one or more servers that provide an online storage facility to the end-user, e.g., through a cloud service provider.
- cloud computing and “cloud service”, as used within the field of data processing, refer to the delivery of hosted services over the Internet.
- cloud in the expressions “cloud computing” and “cloud service” refers to the Internet and was reportedly inspired by the cloud-like symbol that has typically been used to represent the Internet in diagrams and flowcharts.
- a hosted service can be provided by a server or by a conglomerate of multiple servers located anywhere in the cloud.
- a cloud service is available to the public and to enterprises, is accessed through the Internet or via a WAN, and is purchased on an as-needed basis.
- a cloud service is fully managed by the cloud service provider.
- a cloud service is elastic in the sense that a user can have as much or as little of the cloud service, e.g., compute power or data storage capacity, as he/she wants at any given time.
- the operation of a file system is such that there is a one-to-one relationship between, on the one hand, a particular file folder (a unit of logical organization) and, on the other hand, a particular physical storage device (a unit of physical organization). That is, the logical organization of electronic files in file folders as chosen by the end-user determines the physical organization of the electronic files at the physical storage devices.
- the first physical storage device 118, the second physical storage device 120 and the third physical storage device 122 have different storage properties such as, e.g., different speeds of storage or retrieval, different storage capacities, different costs, etc.
- the end-user cannot optimize, per individual electronic file, his/her usage of the data storage system 104 on the basis of the storage properties of the physical storage devices available, as the conventional data storage system does not allow logically storing different electronic files in the same file folder and at the same time physically storing these different electronic files of the same file folder at different ones of the first physical storage device 118, the second physical storage device 120 and the third physical storage device 122.
- the performance of storage and retrieval of each individual electronic file that is logically stored in the same file folder is determined by the very characteristics of the specific physical storage device, to which the file folder has been allocated, regardless of a manner wherein the end-user wishes to engage in using the individual electronic file.
- an SSD has very low access time and very low latency compared to a HDD, is more costly per unit of storage capacity than a HDD, and typically supports a number of write operations in operational use that is much lower than the number of write operations supported by a HDD.
- an end-user of a data processing system who has created on his/her data processing system a specific topical file folder for logically storing electronic files on a particular topic, e.g., electric guitars.
- the electronic files stored comprise, e.g., word processor documents (text of emails and of letters sent and received in the mail), documents in PDF format (scanned-in copies of owner manuals and repair manuals, sheet music), still pictures (pictures taken at concerts, pictures of the end-user's guitars), audio clips (e.g., of favorite music downloaded from the web), video clips (videos of concerts, videos of the end-user performing on guitar), web documents (web pages as saved), etc.
- the end-user finds it convenient to have a storage policy based on the size of the electronic files, i.e., on the number of bytes of digital information per individual electronic file. The end-user may then want to physically keep the electronic files, which are logically allocated to this specific topical file folder and have a size lower than a pre-determined magnitude, in local storage, e.g., at the end-user's PC or Smartphone, and the other electronic files, which are logically allocated to the same topical file folder and have a size equal to or larger than the pre-determined magnitude, in remote storage, e.g., in cloud storage.
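The size-based storage policy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `FileEntry`, `choose_device` and `SIZE_THRESHOLD_BYTES`, the threshold value, and the example file names are hypothetical and do not appear in the source, which speaks only of "a pre-determined magnitude".

```python
from dataclasses import dataclass

# Hypothetical threshold; the text only refers to "a pre-determined magnitude".
SIZE_THRESHOLD_BYTES = 50 * 1024 * 1024


@dataclass
class FileEntry:
    name: str
    folder: str      # logical location chosen by the end-user
    size_bytes: int  # the attribute that this policy looks at


def choose_device(entry: FileEntry, threshold: int = SIZE_THRESHOLD_BYTES) -> str:
    """Return 'local' for files below the threshold and 'cloud' otherwise."""
    return "local" if entry.size_bytes < threshold else "cloud"


# Two files in the same logical folder end up at different physical locations.
files = [
    FileEntry("setlist.docx", "electric guitars", 40_000),
    FileEntry("concert.mp4", "electric guitars", 900_000_000),
]
placement = {f.name: choose_device(f) for f in files}
print(placement)
```

Note that the folder name plays no role in `choose_device`: the logical classification and the physical placement are decided independently, which is the point of the example.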
- an end-user who downloads movies (motion pictures) from the Internet and who classifies the downloaded movies by logically storing the downloaded movies in a particular file folder at his/her data processing system.
- the particular file folder comprises multiple subsidiary file folders so as to be able to logically organize the movies and cluster the movies in a specific subsidiary file based on a specific semantic aspect (e.g., genre, era, film director, featuring actor, etc.).
- a first movie e.g., "The New Haircut of the Museum" of A.T.
- this end-user may decide to watch the first movie in the near future; and while downloading a second movie ("Goya's women" of Mr Mann- Emperor), this end-user may decide to archive this second movie so as to have the second movie available if and when he/she decides to watch it. It would be convenient for this end-user to initially have the first movie physically stored locally, e.g., at the end-user's home network, and to initially have the second movie physically stored remotely, e.g., at a server in the cloud.
- a research institute that uses the data storage system 104 for storing information from their projects.
- the information is logically organized in directories and file folders.
- Information pertaining to a specific project is logically stored in a specific one of the file folders.
- Each specific file folder has one or more sub-folders for storing electronic files related to the specific project.
- a specific file folder has a first sub-folder for storing electronic files of library literature, including scientific publications in the field of the specific project, drafts of scientific publications being prepared by members of the staff of the specific project, handbooks, etc., a second sub-folder for storing electronic files of confidential test results relating to the specific project, a third sub-folder for storing electronic files with information about the individuals in the specific project's staff, such as contact information, personal data, salary, etc.
- the first physical storage device 118 comprises an SAN and the second physical storage device 120 comprises a WORM-drive.
- the first physical storage device 118 and the second physical storage device 120 are managed by the research institute.
- the third physical storage device 122 is formed by one or more servers of a cloud storage service provider, and is accessible via the Internet.
- the file system 126 of the data storage system 104 is configured for storing the electronic files of the first sub-folder as follows.
- the electronic files of the scientific publications and the handbooks are stored at the third physical storage device 122, i.e., in the cloud, whereas the electronic files relating to drafts of scientific publications are stored locally at the first physical storage device 118 for reasons of security or confidentiality.
- the file system 126 discriminates between the electronic files on the basis of their attributes, e.g., metadata manually added to the drafts of the scientific publications by their authors.
- the file system 126 of the data storage system 104 is configured for storing the electronic files of the second sub-folder, i.e., the test results, at the WORM-drive of the second physical storage device 120.
- the file system 126 is configured for automatically copying the test results that have been on the WORM-drive for, say, 6 months, to the third physical storage device 122 of the cloud.
- a reason for this is, for example, that a physical WORM-drive unit, e.g., a CD-R or a DVD-R, has finite storage capacity and that, when full, is functionally disconnected from the data storage system 104 and physically transported to a physical array from which it can be retrieved and functionally re-connected to the data storage system 104, if and when needed.
- the WORM-drive system may impose too long delays on the retrieval of old test data. It may therefore be more efficient to automatically copy the data on the WORM-drive to the cloud after a certain time period, so as to have the old data available on-line.
- the research institute may not want the test data stored at servers outside the jurisdiction of the particular territory, wherein the research institute is operating. Accordingly, the research institute has specified to the cloud service provider that the test data uploaded to the cloud be stored at servers within the particular territory.
- the file system 126 allocates the electronic files of the test results to the WORM-drive 120 under control of a dedicated attribute of the electronic files of the test results, either added manually by the staff or generated automatically by the test equipment having been configured in advance to do so.
- the file system 126 copies the older data at the WORM-drive to the cloud under control of a time-stamp added at the time of recording at the WORM-drive.
- the cloud service provider processes the test data uploaded by the research institute under control of a file attribute signifying to the cloud service provider that the uploaded data be stored at a server within the particular territory.
- the file system 126 of the data storage system 104 is configured for storing the electronic files of the third sub-folder (information about the individuals in the specific project's staff) at the SAN of the first physical storage device 118, i.e., locally for security and confidentiality reasons.
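The time-stamp-controlled copying of older test results from the WORM-drive to the cloud, as described above, could look roughly like the following sketch. The function name `due_for_cloud_copy`, the `COPY_AFTER` period of 182 days (standing in for "say, 6 months") and the example index are assumptions made for illustration only.

```python
from datetime import datetime, timedelta

# Assumed retention period; the text says "say, 6 months".
COPY_AFTER = timedelta(days=182)


def due_for_cloud_copy(recorded_at: dict[str, datetime], now: datetime) -> list[str]:
    """Return the names of WORM-stored files whose recording time-stamp
    is at least COPY_AFTER in the past."""
    return sorted(name for name, ts in recorded_at.items() if now - ts >= COPY_AFTER)


# Hypothetical index of time-stamps added at the time of recording.
worm_index = {
    "test_run_01.csv": datetime(2023, 1, 10),
    "test_run_02.csv": datetime(2023, 9, 1),
}
to_copy = due_for_cloud_copy(worm_index, now=datetime(2023, 10, 1))
print(to_copy)
```

The sketch only selects the files; the actual copy operation, and the territory attribute passed to the cloud service provider, are left out.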
- the data storage system 104 is configured for enabling the user to logically store different electronic files in the same file folder and to physically store different electronic files of the same file folder in different ones of the first physical storage device 118, the second physical storage device 120 and the third physical storage device 122.
- This approach supports optimizing the usage of the data storage system 104, by considering the storage properties of individual ones of the first physical storage device 118, the second physical storage device 120 and the third physical storage device 122 within the context of the usage of the electronic files as contemplated by the end-user.
- the data storage system 104 in the invention is configured for logically storing the electronic files in a plurality of file folders, e.g., one or more specific ones of the first-level file folders 108, and/or one or more specific ones of the second-level file folders 112, and/or one or more specific ones of the third-level file folders 116.
- the end-user may specify per individual electronic file in which particular one of the file folders the individual electronic file is to be logically stored, i.e., logically classified.
- the data storage system 104 determines a semantic aspect of the individual electronic file, e.g., as manually specified by the end-user at the start of the storing operation, and/or as derived from the metadata associated with the individual electronic file, and automatically classifies the individual electronic file in a particular one of the file folders under control of the semantic aspect.
- semantic aspects are, e.g., one or more tags manually added to the individual electronic file by the end-user at the start of the storage operation that are representative of the subjective, i.e., personal, meaning of the individual electronic file to this end-user.
- the allocation of individual electronic files to individual ones of the file folders i.e., the logical organization, represents a systematic, user-specific, classification that has typically been determined in advance by the end-user.
- the classification is meant to assist the user in managing his/her electronic files, e.g., for identifying or retrieving a particular one of the electronic files, on the basis of their semantic meaning to this end-user.
- Examples of a semantic aspect include, e.g., file meta-data such as "author of the electronic file", "title of the electronic file", "source of the electronic file", such as the website from which the electronic file is downloaded, "geotag information" (geographical identification metadata added to videos, pictures, SMS messages so as to add location-specific information), etc.
- the end-user may manually overrule the automatic classification.
- the data storage system 104 logically stores this particular electronic file in the particular one of the second-level file folders 112 as specified by the end-user or under control of a semantic aspect derived from the particular electronic file.
- the data storage system 104 in the invention is also configured for physically storing the individual electronic file at a specific one of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122.
- the data storage system 104 determines an attribute of the particular electronic file.
- the term "attribute" as used in this text indicates a functional property of the particular electronic file with regard to an intended usage of the particular electronic file by the end-user or with regard to storage of the particular electronic file according to a preference of the end-user. Examples of an attribute are the size of the particular electronic file, the format of the particular electronic file, the time of day of downloading or storing the particular electronic file, a tag added by the end-user and signifying an intended use of the particular electronic file, etc.
- the attribute of the particular electronic file is relevant to the process of determining on which one of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122, the particular electronic file is to be physically stored.
- the data storage system 104 is therefore configured to select a particular one of the first physical storage device 118, the second physical storage device 120 and the third physical storage device 122 under control of the attribute of the particular electronic file, and to physically store the particular electronic file at the selected one of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122.
- the file system 126 of the data storage system 104 in the invention comprises an attribute determinant 128 that is operative to determine one or more attributes of a specific electronic file that is to be stored physically at a specific one of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122.
- the attribute determinant 128 extracts the attribute from, e.g., the meta-data of the specific electronic file to be stored.
- the attribute determinant 128 receives the attribute as a specific user-input manually entered into the data processing system 100 by the end-user at the start of the storing operation via a suitable component of the user interface 102, e.g., an alphanumeric keyboard (not shown). Examples of the attribute have been discussed above.
- the file system 126 also comprises a storage policy, e.g., in the form of a rule base 130.
- the rule base 130 comprises one or more rules that specify at which one of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122, the specific electronic file is to be stored physically, given the one or more attributes of the specific electronic file as determined by the attribute determinant 128.
- the rules have been specified in advance, e.g., by the end-user, or by a cloud service provider, or by the developer of the control-software running on the data processing system 100.
- a set of rules is determined, here: in advance, to optimize the usage of the data storage system 104 for this end-user, e.g., with regard to costs, latency at storage or retrieval, mobility of the end-user, etc.
- the end-user specifies in which of the file folders a specific electronic file is to be classified, and the file system 126 takes care of logically storing the specific electronic file in the file folder specified, and of physically storing the specific electronic file at a specific one of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122, as selected under control of an attribute of the specific electronic file, the rule base 130 and in dependence on the storage properties of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122.
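One possible way to picture the interplay of attributes, rule base 130 and storage properties is the sketch below. It is an assumption-laden illustration, not the implementation of the invention: the device names, the property values (`local`, `cost_per_gb`), the two example rules and the "cheapest matching device" tie-break are all hypothetical choices.

```python
# Hypothetical storage properties of the three physical storage devices.
DEVICES = {
    "device_118": {"local": True, "cost_per_gb": 0.03},
    "device_120": {"local": True, "cost_per_gb": 0.10},
    "device_122": {"local": False, "cost_per_gb": 0.02},
}

# Each rule pairs a condition on the file's attributes with a requirement
# on a device's storage properties. Rules are tried in order (an assumption).
RULE_BASE = [
    (lambda a: a.get("tag") == "confidential", lambda p: p["local"]),
    (lambda a: a.get("size_bytes", 0) >= 1_000_000_000, lambda p: not p["local"]),
]


def select_device(attributes: dict) -> str:
    """Pick the cheapest device whose properties satisfy the first matching rule.

    If no rule matches, every device is a candidate.
    """
    for condition, requirement in RULE_BASE:
        if condition(attributes):
            candidates = [name for name, props in DEVICES.items() if requirement(props)]
            break
    else:
        candidates = list(DEVICES)
    return min(candidates, key=lambda name: DEVICES[name]["cost_per_gb"])


print(select_device({"tag": "confidential"}))
print(select_device({"size_bytes": 2_000_000_000}))
```

Keeping the rules as data separate from `select_device` mirrors the text's remark that the rules may be specified in advance by the end-user, a cloud service provider, or the software developer.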
- Fig.2 is a process diagram for illustrating a method 200 in the invention as executed in the data processing system 100 of Fig.1.
- the data storage system 104 receives a request for logically storing a particular electronic file in a particular file folder.
- the file system 126 of the data storage system 104 logically stores the particular electronic file in the particular file folder as requested.
- the attribute determinant 128 determines an attribute of the particular electronic file.
- the file system 126 consults the rule base 130 in order to determine one or more rules under control of the attribute, so as to be able to select a particular one of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122 in dependence on the storage properties of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122.
- in a fifth step 210, the particular one of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122 is selected.
- the file system 126 physically stores the particular electronic file at the selected one of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122.
- the method 200 proceeds to a seventh step 214 for awaiting a next request for logically storing a next electronic file. Upon receipt of the next request, the method returns to the second step 202, now applied to the next electronic file.
- the second step 204, on the one hand, and the sequence of the third step 206, the fourth step 208, the fifth step 210 and the sixth step 212, on the other hand, have been drawn as separate threads of the method 200.
- the separate threads may be executed in parallel or one after the other.
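The flow of the method 200 can be summarized in a minimal Python sketch, here executing the two threads one after the other. The class name `FileSystemSketch`, the two dictionaries standing in for the logical and physical organization, and the byte-count policy are hypothetical; only the step numbers in the comments come from the text.

```python
from collections import defaultdict


class FileSystemSketch:
    """Toy stand-in for file system 126: keeps the logical and the
    physical organization in two independent mappings."""

    def __init__(self, select_device):
        self.select_device = select_device  # stands in for rule base 130
        self.folders = defaultdict(set)     # logical: folder -> file names
        self.devices = defaultdict(dict)    # physical: device -> {name: data}

    def store(self, folder: str, name: str, data: bytes) -> str:
        # Step 204: logically store the file in the requested folder.
        self.folders[folder].add(name)
        # Step 206: determine an attribute (here simply the size in bytes).
        attributes = {"size_bytes": len(data)}
        # Steps 208 and 210: consult the policy and select a device.
        device = self.select_device(attributes)
        # Step 212: physically store the file at the selected device.
        self.devices[device][name] = data
        return device


# Assumed policy: files under 10 bytes stay local, the rest go to the cloud.
fs = FileSystemSketch(lambda a: "local" if a["size_bytes"] < 10 else "cloud")
fs.store("guitars", "notes.txt", b"tuning")
fs.store("guitars", "concert.mp4", b"x" * 64)

print(sorted(fs.folders["guitars"]))
print({device: sorted(files) for device, files in fs.devices.items()})
```

Both files appear in the same logical folder while residing at different devices, which is the behaviour the method is meant to enable.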
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A data processing system has a data storage system for storing electronic files. The data storage system enables the end-user to logically store different electronic files in the same file folder and to physically store different electronic files allocated to the same file folder at different ones of multiple physical storage devices. This approach supports optimizing a storage policy for this particular end-user of the electronic files, based on the storage properties of individual ones of the physical storage devices, within the context of the usage of the electronic files as contemplated by the end-user.
Description
Logically and end-user-specific physically storing an electronic file
FIELD OF THE INVENTION
The invention relates to a method of using a data storage system, to a data storage system, and to control software for executing the method.
BACKGROUND ART
A data processing system such as a personal computer (PC), a mobile telephone or Smartphone, a laptop PC, a tablet PC, etc., has a data storage system configured for storing electronic files, e.g., electronic documents, video files, audio files, e-mails, still pictures, etc., for later retrieval. In a graphical user-interface (GUI) of the data storage system, the electronic files are typically presented as logically organized in file folders. That is, the user of the data storage system determines in advance the logical organization, or: grouping, of his/her collection of electronic files. The logical organization is represented in the GUI as a set of multiple file folders. The set may be a hierarchical set: a particular file folder at a certain level in the hierarchy may contain one or more further file folders at a next lower level in the hierarchy. Each particular one of the electronic files can then be accommodated in one or more specific ones of the multiple file folders, depending on a semantic aspect or on a plurality of semantic aspects of the particular electronic file. The expression "semantic aspect" refers to a characteristic of the particular electronic file that is meaningful to the user for the purpose of logically organizing the electronic files based on, e.g., a respective topical aspect of the contents of a respective one of the electronic files (e.g., a title of the respective electronic file), a respective source or respective author of the respective electronic file, a respective file format of the respective electronic file, etc. The characteristic enables the user to discriminate between different electronic files that, from the point of view of this individual user, belong to different file folders. In a first scenario, the user may have created individual file folders for each individual file format of the electronic files. 
For example, a first file folder is created for holding only video files, a second file folder is created for holding only audio files, a third file folder is created for holding only text documents, a fourth file folder is created for holding only still pictures, etc. That is, in the first scenario, the organization of file folders is created on the basis of the file format or the rendering type of the electronic file. In a second scenario, the user may have created individual file folders based on semantic topics. For example, a fifth file folder is created for holding electronic files relating to a first topic, e.g., classic cars; a sixth file folder is created for holding electronic files relating to a second topic, e.g., science; a seventh file folder is created for holding electronic files relating to a third topic, e.g., performing arts; an eighth file folder is created for holding electronic files relating to a fourth topic, e.g., world history, etc.
Accordingly, the logical organization of electronic files in the file folders represents a systematic classification of the electronic files. The classification is typically determined in advance and is specific to the individual end-user. The classification is meant to assist the end-user in managing his/her electronic files, e.g., for ease of identifying or of retrieving a particular one of the electronic files. The distribution of the electronic files among the file folders is therefore typically driven by the classification scheme that the individual end-user has adopted for his/her purposes and, therefore, is subjective.
The data storage system is also configured for physically storing the electronic files at one or more physical storage devices of the data processing system. Typical examples of such physical storage devices are a hard disk drive (HDD), a solid-state drive (SSD), a network-attached storage (NAS) device and a server of a cloud service provider.
The data storage system forms a functional component of a data processing system.
Operation of the data storage system is typically controlled by a file system. The file system forms a part of the operating system of the data processing system. The file system interacts with a respective one of the physical storage devices via a respective device driver. Each respective device driver translates commands from the operating system into commands that are specific to the hardware of the respective physical storage device, typically using SCSI (Small Computer System Interface). SCSI is a set of standards for physically connecting a computer to one or more peripheral devices and for transferring data between the computer and the peripheral device. The operation of a conventional file system is typically such that there is a one-to-one relationship between a particular file folder and a particular physical storage device: all files allocated to a particular file folder are stored at a single physical storage device. There are solutions commercially available that are configured for transferring files, which belong to the same namespace, between different physical storage devices. As known, a namespace is an abstract container or environment that is created to hold a logical grouping of unique identifiers, here: file names. An identifier defined in a particular namespace is associated only with that particular namespace. The same identifier can be independently defined in multiple namespaces.
Accordingly, the meaning associated with an identifier defined in one namespace may or may not be the same as the meaning of the same identifier defined in another namespace. Data storage devices support namespaces. Data storage devices use directories (or folders) as namespaces.
One of such solutions is referred to in the art as "dynamic tiering" or "dynamic storage tiering". The dynamic tiering is enabled by controllers at the physical storage devices. Dynamic storage tiering allows root users, i.e., system administrators or processes pre-defined by system administrators, to move files among different volumes, allocate files to different volumes at file creation time, and independently recover volumes, without altering the namespace of the file system. The term "volume", as used within the context of computer operating systems, refers to a single accessible storage area within a single file system, typically resident on a single partition of a hard disk.
Another such solution relates to a storage area network (SAN) technology. A SAN is a dedicated network that provides access to consolidated, block-level data storage. SANs are primarily used to make physical storage devices, such as disk arrays, tape libraries, and optical jukeboxes, accessible to servers, so that the physical storage devices appear as devices locally attached to the operating system. A SAN typically has its own network of physical storage devices that are generally not accessible through the local area network by other devices. A standard connection type for SAN in enterprise storage is Fibre Channel. Fibre Channel is a gigabit-speed network technology primarily used for storage networking. System administrators can move files among different volumes, and can allocate files to different volumes at Fibre Channel level. However, for the file system there is still a one-to-one relationship between the file folder and the physical storage device. The commands, however, are redirected at the level of the Fibre Channel controller.
SUMMARY OF THE INVENTION
As discussed above, typical file systems maintain a one-to-one relationship between, on the one hand, a particular file folder logically storing one or more electronic files and, on the other hand, a physical storage device physically storing the electronic files of the particular file folder. Dynamic storage tiering and the combination of SAN and Fibre Channel are examples of storage scenarios, wherein the logical storage and physical storage can be decoupled. However, the decoupling is under control of the system administrator, and not under control of the end-user of the electronic files.
In the invention, control over determining which particular one of the electronic files allocated to the same file folder is stored at which particular one of multiple physical storage devices is given to the end-user of the electronic files before the particular electronic file is actually stored.
Accordingly, when the end-user of a particular file is in the process of logically allocating the particular electronic file to a particular file folder, a particular one of the physical storage devices is selected under combined control of a storage policy specific to this end-user, and one or more attributes of the particular electronic file. The storage policy specifies the requirements of this end-user with regard to the storing of his/her electronic files in, and/or the retrieval of his/her electronic files from, the data storage system. The requirements are specified in terms of one or more storage properties of each individual one of the physical storage devices available to this end-user, and in terms of one or more attributes that an electronic file can have and that are relevant to this end-user. Accordingly, the storage policy determines for each electronic file at which one of the physical storage devices the electronic file is going to be stored, given the one or more attributes of the electronic file and given the storage properties of the physical storage devices. The applicable storage policy may be determined in advance, or may be determined solely from an explicit user-input at the time of logically allocating a particular electronic file to the particular file folder, or may be determined dynamically as a result of a history of this end-user interacting with the data storage system, or the applicable storage policy may be determined by a combination of two or more of these determining factors. As a result, the data storage system is configured for physically storing a specific electronic file of a specific file folder at a specific one of the physical storage devices, independent of at which one of the physical storage devices another electronic file of the same file folder has been stored.
The expression "storage property" as used in this text refers to one or more characteristics of the functioning of the relevant physical storage device in operational use, such as, for example, speed of storing or of retrieving of data (e.g., expressed in Mbit/sec), storage capacity available, cost per unit of storage capacity (e.g., cost per megabit of storage), latency on a data connection via which data is stored at, or retrieved from, a relevant one of the first physical storage device and the second physical storage device, whether the relevant one of the first physical storage device and the second physical storage device is a Write-Once-Read-Many (WORM) storage device, geographic location of the relevant physical storage device, etc. The storage property of a physical storage device can be quantified by a value of a parameter indicative of a specific characteristic, or by multiple values, each respective one of the values being of a respective one of multiple parameters and being indicative of a respective one of multiple characteristics. Alternatively, the storage property of a specific physical storage device is characterized by a specific region in a one-dimensional parameter space or in a multi-dimensional parameter space, wherein each particular dimension is indicative of a particular characteristic or of a particular combination of characteristics, of the kinds specified above. A role of the quantity "storage property" as discussed above, is to be able to discriminate between different types of physical storage devices and to be able to determine which one or which ones suit the purposes of the end-user best. The storage properties of the physical storage devices may therefore also be specified in qualitative terms so long as a comparison is possible between different ones of the physical storage devices on the basis of their storage properties to determine a suitable one of the physical storage devices given the end-user's requirements laid down in the storage policy.
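By way of illustration only, a storage property quantified by multiple parameter values can be sketched as a small record, and an end-user requirement as a region in that parameter space. The class name, field names, device names and numeric values below are hypothetical and are not taken from this text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageProperty:
    """Hypothetical record of a device's operational characteristics."""
    speed_mbit_s: float      # speed of storing or retrieving data
    free_capacity_mb: float  # storage capacity available
    cost_per_mb: float       # cost per unit of storage capacity
    latency_ms: float        # latency on the data connection
    worm: bool               # True for a Write-Once-Read-Many device


def meets(prop, min_speed=0.0, max_cost=float("inf"), max_latency=float("inf")):
    """Return True if the property lies inside the required region."""
    return (prop.speed_mbit_s >= min_speed
            and prop.cost_per_mb <= max_cost
            and prop.latency_ms <= max_latency)


# Illustrative values only; real figures depend on the actual devices.
devices = {
    "local_hdd": StorageProperty(800.0, 500_000.0, 0.0004, 10.0, False),
    "cloud": StorageProperty(50.0, 10_000_000.0, 0.0001, 120.0, False),
}

# Devices that satisfy a speed requirement of at least 100 Mbit/sec.
suitable = sorted(name for name, p in devices.items() if meets(p, min_speed=100.0))
```

A qualitative specification (e.g., "fast" versus "slow") would work equally well, provided the devices remain comparable on the basis of their storage properties.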
More specifically, the inventors propose a method of using a data storage system. The data storage system comprises at least a first physical storage device and a second physical storage device. The first storage device has a first storage property and the second storage device has a second storage property. The first storage property differs from the second storage property. The data storage system is configured for logically storing one or more respective ones of multiple electronic files of an end-user at a respective one of a plurality of file folders, and for physically storing one or more respective ones of the electronic files at a respective one of the first physical storage device and the second physical storage device. The method comprises receiving a request for storing a particular one of the electronic files in a particular one of the file folders. The method further comprises: in response to the request performing the following actions: logically storing the particular electronic file in the particular one of the file folders; determining an attribute of the particular electronic file; determining a storage policy of the end-user that depends on the first storage property and the second storage property; under combined control of the attribute and the storage policy selecting a particular one of the first physical storage device and the second physical storage device; and physically storing the particular electronic file at the particular physical storage device.
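The sequence of actions recited above may be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the class names, the `store` method, the attribute dictionary and the example policy (video formats go to the fastest device, everything else to the slowest) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    speed_mbit_s: float                          # hypothetical storage property
    stored: dict = field(default_factory=dict)   # (folder, file name) -> data


class DataStorageSystem:
    def __init__(self, devices):
        self.devices = devices
        self.folders = {}    # logical view: folder -> set of file names
        self.location = {}   # physical view: (folder, file name) -> device name

    def store(self, folder, file_name, data, policy):
        # Logically store the file in the requested folder.
        self.folders.setdefault(folder, set()).add(file_name)
        # Determine an attribute of the file (here: size and format).
        attribute = {"size": len(data),
                     "format": file_name.rsplit(".", 1)[-1].lower()}
        # Select a device under combined control of attribute and policy.
        device = policy(attribute, self.devices)
        # Physically store the file at the selected device.
        device.stored[(folder, file_name)] = data
        self.location[(folder, file_name)] = device.name
        return device.name


def example_policy(attribute, devices):
    """Assumed end-user policy: video on the fastest device, rest on the slowest."""
    if attribute["format"] in {"mp4", "mkv"}:
        return max(devices, key=lambda d: d.speed_mbit_s)
    return min(devices, key=lambda d: d.speed_mbit_s)


system = DataStorageSystem([Device("ssd", 4000.0), Device("nas", 100.0)])
first = system.store("My Media", "solaris.mp4", b"x" * 10, example_policy)
second = system.store("My Media", "notes.txt", b"hi", example_policy)
```

Both files appear in the same file folder, while `system.location` records that they reside at different physical storage devices.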
Accordingly, the electronic files of the end-user are logically organized in one or more file folders, and different ones of the electronic files logically stored in the same particular file folder are physically stored at different ones of multiple physical storage devices under combined control of the attributes of the electronic files and a storage policy that is specific to the end-user. The invention therefore enables each specific end-user of electronic files to exploit the different storage properties available at his/her data storage system so as to optimize his/her usage of the data storage system with regard to, e.g., speed, costs, latency, etc., under control of the attributes of the individual electronic files. Two different end-users may each request to store a respective copy of the same electronic file while applying different storage policies or different storage preferences, resulting in storage of the respective copy at respective physical storage devices having different storage properties.
In terms of computer resources, the invention operates in a layer between the operating system and the file system. The file system manages each specific one of the file folders as a specific logical storage container. In the invention, a particular logical storage container is implemented physically by means of multiple physical storage devices with different operational characteristics (e.g., speed, capacity, costs, etc). A particular electronic file logically stored in a particular file folder is physically stored at a specific one of the physical storage devices physically implementing the particular file folder. The specific physical storage device is selected under control of the operational characteristics of the specific physical storage device, optionally with respect to the operational characteristics of the other physical storage devices that physically implement the particular file folder. The request for storing the particular electronic file is initiated by, or on behalf of, the end-user of the particular electronic file. The file system receives the request from the operating system. The operating system runs on, e.g., a desktop PC, a Smartphone, a picture camera, etc.
The request for storing the particular electronic file is generated as, e.g., a manual input from the end-user via a user-interface (e.g., alphanumeric keyboard) of a data processing system, of which the data storage system forms a functional component. As another example, the request is created automatically, e.g., via a pre-determined script, in response to the data processing system detecting an electronic file having been downloaded. As yet another example, consider an application scenario, wherein the user interacts with the data storage system via a drag-and-drop graphical user-interface (GUI) of the data processing system, and wherein the data processing system is configured in advance for carrying out the following operations. When the user grabs a graphical representation of the particular electronic file, and drags the graphical representation to another graphical representation of the particular file folder, the data processing system controls the GUI to generate a pop-up window with a question: "on what kind of physical storage device would you like to store this particular electronic file", and with a menu of the options available for physically storing the particular electronic file currently considered.
For example, the menu has a first option "fast", a second option "medium", and a third option "slow". The first option "fast" represents a physical storage device that is capable of storing data and/or retrieving data at high data transfer rate (e.g., solid-state memory). The second option "medium" represents another physical storage device that is capable of storing data and/or retrieving data at a maximum data transfer rate that is substantially lower than that represented by the first option "fast" (e.g., a local hard-disk drive). The third option "slow" represents yet another physical storage device that is capable of storing data and/or retrieving data at a maximum data transfer rate that is substantially lower than that represented by the second option "medium" (e.g., a writable optical disc or a network storage device (NAS)). Accordingly, assume that a first physical storage device comprises a hard disk drive local to the user's PC, and that a second physical storage device comprises a NAS accessible from the user's PC via a data network. The local hard disk drive is represented in the menu as the option "fast", and the NAS is represented in the menu as the option "slow". Assume that the user has a logical directory structure with a file folder named "My Media". If the user selects above first option "fast" when logically storing an electronic file of the movie "Solaris" in the file folder "My Media", the movie "Solaris" will be physically stored at the local hard disk drive. In the overview of the directories and file folders, the user sees this movie as stored in the file folder "My Media". If the user selects above third option "slow" when logically storing an electronic file of the movie "Solaris" in the file folder "My Media", the movie "Solaris" will be physically stored at the NAS. In the overview of the directories and file folders, the user again sees this movie as stored in the file folder "My Media".
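The "My Media" example can be sketched as two separate mappings: one for what the user sees in the folder overview and one for where the data is physically kept. The identifiers below, and the second file name "Stalker", are hypothetical and serve only to show two files in the same folder.

```python
# Menu option -> physical storage device, as in the two-device example above.
DEVICE_BY_OPTION = {"fast": "local_hdd", "slow": "nas"}

folder_view = {}        # what the GUI shows: folder -> list of file names
physical_location = {}  # where the data resides: (folder, file) -> device


def store_with_choice(folder, file_name, option):
    """Store a file logically in `folder` and physically per the menu choice."""
    folder_view.setdefault(folder, []).append(file_name)
    physical_location[(folder, file_name)] = DEVICE_BY_OPTION[option]


store_with_choice("My Media", "Solaris", "fast")
store_with_choice("My Media", "Stalker", "slow")
```

The folder overview lists both movies under "My Media" regardless of which option was chosen for each.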
As another example, the menu has a first option: "expensive - high performance" and a second option: "cheap - low performance". The first option "expensive - high performance" represents a physical storage device, capable of operating at a high data transfer rate but also, alas, expensive per unit of data stored. The second option "cheap - low performance" represents another physical storage device, only capable of operating at a data transfer rate that is lower than that of the physical storage device in the first option, that is, however, inexpensive per unit of data stored. In both examples, the user may contemplate future usage of the particular file to be stored and adopt a storage option that best fits his/her contemplated usage. It is not necessary for the storage and retrieval of an audio file to employ a high-performance physical storage device (e.g., a solid-state memory) that is designed for storing and retrieval of video files. An audio file may therefore as well be stored at a physical storage device with a low performance so as to not unduly occupy storage space of the high-performance physical storage device and so as to reduce the costs per audio file stored.
As yet another example, the menu may indicate that the storage of the particular file will cost the user, say, $0.05 per year if stored at a specific physical storage device, and will cost the user, say, $0.15 per year if stored at another physical storage device.
The one or more rules, specific to a particular end-user, may be pre-determined and fixed or, alternatively, may be made to change dynamically in dependence on a change in circumstances. For example, consider a scenario, wherein cloud storage is a storage option available to the particular end-user, and is physically represented by a particular one of the first physical storage device and the second physical storage device. The costs per unit storage, as charged by the cloud storage provider per unit of data, may change over time. The change of costs may cause a change in the storage policy adopted by this particular end-user. For example, the costs per unit data are reduced so that the policy is adjusted towards storing more electronic files in the cloud and fewer locally at the home network of this particular end-user. Consider another example, wherein the particular end-user adds a further physical storage device to his/her data storage system. Then, the storage policy for this particular end-user is changed so as to also take into account the storage property of this further physical storage device.
Consider as yet another example, a scenario wherein storage costs determine at which specific one of the physical storage devices the particular file is to be stored. In an example discussed above, the storage of the particular file will cost the user, say, $0.05 per year if stored at a specific physical storage device, and will cost the user, say, $0.15 per year if stored at another physical storage device. An example of an applicable rule is then that storage should not cost more than $0.10 per year per unit of data stored. Then, according to this rule, the specific storage device is the only option and the storing proceeds automatically without the displaying of the menu. If, however, two or more physical storage devices are available to this user that comply with the rule that the costs per unit of data stored do not exceed $0.10, then the GUI presents to the user a pop-up window with a menu of the two or more selectable options available.
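The cost rule of this example can be sketched as a small decision function. The function name, the returned labels and the device names are hypothetical; the case in which no device complies is not addressed in the text and is handled here by an assumed "no_device" outcome.

```python
def choose_by_cost(yearly_cost_by_device, max_cost=0.10):
    """Apply the rule that storage should not cost more than max_cost per year.

    Returns a pair (action, candidate devices):
    - ("store", [device])     exactly one device complies; store automatically
    - ("ask_user", [devices]) two or more comply; present the menu
    - ("no_device", [])       none complies (assumed behavior, not from the text)
    """
    compliant = sorted(device for device, cost in yearly_cost_by_device.items()
                       if cost <= max_cost)
    if not compliant:
        return ("no_device", [])
    if len(compliant) == 1:
        return ("store", compliant)
    return ("ask_user", compliant)


# Costs from the example: $0.05 versus $0.15 per year.
automatic = choose_by_cost({"device_a": 0.05, "device_b": 0.15})
# A hypothetical third situation in which two devices comply.
menu = choose_by_cost({"device_a": 0.05, "device_c": 0.08})
```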
Consider still another example, wherein the storage policy is determined by multiple rules. A first rule specifies that the particular file always be stored on the fastest physical storage device available. A second rule specifies that the physical storage device available for storing an electronic file larger than 1 MB is the least expensive one available. A third rule specifies that the physical storage device available for storing an electronic file with a file extension relating to streaming of audio or of video is the one available that has a performance that matches the streaming requirements. Accordingly, the physical storage device available for storing a streaming audio file of a size smaller than 1 MB is the fastest one among the least expensive ones available that are capable of audio streaming.
Storage policies may be created, e.g., by a service provider or another supplier, for different sets of physical storage devices, each respective one thereof with a respective storage property. A specific storage policy is optimized for a specific parameter (e.g., speed, latency, costs, capacity, etc.), or for a specific combination of parameters (e.g., speed and costs, latency and costs, capacity and costs and latency, etc.). The end-user of the electronic files may then obtain from the service provider a specific storage policy that matches the specific variety in storage properties of the set of physical storage devices of his/her data storage system. For example, the end-user uploads to the service provider information representative of his/her set of physical storage devices, further information about his/her typical usage and about what parameters or qualities of the physical storage devices are deemed relevant by this end-user. The service provider then creates a storage policy based on the information received from the end-user and programs the end-user's data storage system via, e.g., the Internet, so as to configure the data storage system for complying with the storage policy created. In this manner, the service provider obtains a profile from this end-user with regard to the usage and storage of electronic files, and the service provider can use this profile to customize or otherwise adjust a service provided to this end-user.
The attribute of the particular electronic file may be determined by metadata descriptive of the particular electronic file and available from, e.g., a header of the particular electronic file. The metadata may be representative of, e.g., a size of the particular electronic file, a file format of the particular electronic file, a title or other semantic indication of the particular electronic file, such as an author, a date of creation of the particular electronic file, a topic or semantic class of the particular electronic file, whether or not the particular electronic file is subjected to an auditing policy (so as to determine whether or not to physically store the particular electronic file at a WORM storage device), etc. Alternatively, or in addition, the attribute is determined by a user-specific context of the request for storing, e.g., a time of the day or the day of the week of receiving the request for storing; an identity of the end-user who, or on behalf of whom, the request for storing is created, a history of recent further requests for storing or a profile of the end-user determined in advance, etc. Alternatively, or in addition, the attribute is indicative of a respective cost indication for storing the particular file at a respective one of the first physical storage device and the second physical storage device. Alternatively, or in addition, the attribute is determined by explicit input from the end-user via the user-interface of the data processing system, of which the data storage system forms a functional component. For example, the data storage system has been configured in advance for enabling the user to specify, at the start of the storing procedure of the particular electronic file, a parameter value indicative of whether or not the end-user is going to need this particular electronic file in the near future. 
If the end-user specifies he/she is going to need the particular electronic file in the near future, the data storage system is configured for physically storing the particular electronic file locally at the end-user's data processing system (e.g., at the HDD of the end-user's PC). If the end-user specifies that he/she is not going to need this particular electronic file in the foreseeable future, the data storage system is configured for physically storing the particular electronic file at a server of a cloud storage provider.
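As a hedged illustration of attribute-driven selection, the sketch below combines two of the attributes mentioned above: whether the file is subjected to an auditing policy, and whether the end-user expects to need the file in the near future. The function name, the device labels and the precedence of the audit check over the near-future check are assumptions made for this sketch.

```python
def select_device(needed_soon, audited=False):
    """Pick a storage location from two explicit attributes of the file.

    Assumption: an audited file goes to a WORM device regardless of
    whether it is needed soon; the text does not fix this precedence.
    """
    if audited:
        return "worm_device"
    if needed_soon:
        return "local_hdd"      # local to the end-user's data processing system
    return "cloud_server"       # server of a cloud storage provider


soon = select_device(needed_soon=True)
later = select_device(needed_soon=False)
archived = select_device(needed_soon=True, audited=True)
```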
The expression "end-user", as used throughout this text, may refer to an individual consumer, but also to an individual member of an organization such as a department of a commercial enterprise, a governmental department, a university, or to a group of such individual members. Consider the example wherein the end-user is a group of individuals. The storage policy for storing electronic files for this group of individuals is under combined control of the one or more attributes of each specific one of the electronic files and one or more rule bases. That is, the storage policy may be uniform for all individuals in the group or, alternatively, may depend on the specific identity of each specific individual of the group, for example, as indicated by the specific individual's job title or role in the group.
In an embodiment of the method, the data storage system forms a functional part of a data processing system of the end-user. The first physical storage device comprises remote storage connected via a data network to the data processing system; and the second physical storage device is local to the data processing system.
In a further embodiment of the method, the first physical storage device and the second physical storage device are managed by a cloud service provider.
Within the field of data processing, the expressions "cloud computing" and "cloud service" refer to the delivery of hosted services over the Internet. The term "cloud" in the expressions "cloud computing" and "cloud service" refers to the Internet and was reportedly inspired by the cloud-like symbol that has typically been used to represent the Internet in diagrams and flowcharts. A hosted service can be provided by a server or by a conglomerate of multiple servers located anywhere in the cloud. A cloud service is available to the public and to enterprises, is accessed through the Internet or via a wide area network (WAN), and is purchased on an as-needed basis. A cloud service is fully managed by the cloud service provider. A cloud service is elastic in the sense that a user can have as much or as little of the cloud service, e.g., compute power or data storage capacity, as he/she wants at any given time. An example of a cloud service is cloud storage. Cloud storage refers to data storage capacity that is accessed by the end-user through the Internet or via a WAN. Cloud storage uses an infrastructure of a plurality of storage systems that may be distributed among different geographic locations or regions. Cloud storage has become increasingly relevant, especially to enterprises and governmental agencies.
The first physical storage device and the second physical storage device in the above further embodiment are managed by the cloud service provider. The first physical storage device and the second physical storage device have different storage properties. Accordingly, the cloud service provider can offer a customized storage policy to the individual end-user, based on the preferences of the individual end-user, and charge accordingly.
The invention further relates to a data storage system, wherein the data storage system comprises at least a first physical storage device and a second physical storage device. The first physical storage device has a first storage property and the second physical storage device has a second storage property different from the first storage property. The data storage system is operative to logically store one or more respective ones of multiple electronic files of an end-user at a respective one of a plurality of file folders, and to physically store one or more respective ones of the electronic files at a respective one of the first physical storage device and the second physical storage device. The data storage system comprises an input for receiving a request for storing a particular one of the electronic files in a particular one of the file folders and a file system. The file system is configured for, in response to the request: logically storing the particular electronic file in the particular one of the file folders; determining an attribute of the particular electronic file; determining a storage policy of the end-user that depends on the first storage property and the second storage property; under combined control of the attribute and the storage policy selecting a particular one of the first physical storage device and the second physical storage device; and physically storing the particular electronic file at the particular physical storage device.
In an embodiment of the data storage system, the attribute is determined by at least one of: metadata in the particular electronic file that is descriptive of the particular electronic file; explicit user input from the end-user; a user-specific context of the request for storing; a history of recent further requests for storing; and a pre-determined profile of the end-user.
In a further embodiment of the data storage system, the attribute is indicative of at least one of: a size of the particular electronic file; a time of creation or availability of the particular electronic file; a file format of the particular electronic file; a semantic aspect of a content of the particular electronic file; whether or not the particular electronic file is subjected to an auditing policy; whether or not the end-user is going to need the particular electronic file in the near future; a time of the day at which the request for storing is received; a day of the week on which the request for storing is received; a respective cost indication for storing the particular file at a respective one of the first physical storage device and the second physical storage device; and a history of recent further requests for storing.
In a further embodiment of the data storage system, the data storage system forms a functional part of a data processing system of the end-user. The first physical storage device comprises remote storage connected via a data network to the data processing system; and the second physical storage device is local to the data processing system.
The invention also relates to control software for configuring a data storage system. The control software may be commercially supplied as recorded on a computer-readable medium such as a magnetic disk, an optical disc, a solid-state memory, etc., or may be made available for being downloaded via a data network such as the Internet. The data storage system comprises at least a first physical storage device and a second physical storage device. The first physical storage device has a first storage property and the second physical storage device has a second storage property different from the first storage property. The data storage system is operative to logically store one or more respective ones of multiple electronic files of an end-user at a respective one of a plurality of file folders, and to physically store one or more respective ones of the electronic files at a respective one of the first physical storage device and the second physical storage device. The data storage system comprises an input for receiving a request for storing a particular one of the electronic files in a particular one of the file folders.
The control software comprises: first instructions for, in response to the request, logically storing the particular electronic file in the particular one of the file folders; second instructions for, in response to the request, determining an attribute of the particular electronic file; third instructions for, in response to the request, determining a storage policy of the end-user that depends on the first storage property and the second storage property; fourth instructions for, in response to the request and under combined control of the attribute and the storage policy, selecting a particular one of the first physical storage device and the second physical storage device; and fifth instructions for, in response to the request, physically storing the particular electronic file at the particular physical storage device.
In an embodiment of the control software, the attribute is determined by at least one of: metadata in the particular electronic file that is descriptive of the particular electronic file; explicit user input from the end-user; a user-specific context of the request for storing; a history of recent further requests for storing; and a pre-determined profile of the end-user.
In a further embodiment of the control software, the attribute is indicative of at least one of: a size of the particular electronic file; a time of creation or availability of the particular electronic file; a file format of the particular electronic file; a semantic aspect of a content of the particular electronic file; whether or not the particular electronic file is subjected to an auditing policy; whether or not the end-user is going to need the particular electronic file in the near future; a time of the day at which the request for storing is received; a day of the week on which the request for storing is received; a respective cost indication for storing the particular file at a respective one of the first physical storage device and the second physical storage device; and a history of recent further requests for storing.
BRIEF DESCRIPTION OF THE DRAWING
The invention is explained in further detail, by way of example and with reference to the accompanying drawing, wherein:
Fig.1 is a block diagram for illustrating a data processing system comprising a data storage system of the invention; and
Fig.2 is a process diagram for illustrating a method in the invention.
Throughout the drawing, similar or corresponding features are indicated by the same reference numerals.
DETAILED EMBODIMENTS
As discussed above, the invention relates to a data processing system that has a data storage system for storing electronic files. The data storage system enables the end-user to logically store different electronic files in the same file folder and to physically store different electronic files allocated to the same file folder at different ones of multiple physical storage devices. This approach supports optimizing a storage usage for this particular end-user of the electronic files, based on the storage properties of individual ones of the physical storage devices, and under combined control of the attributes of the individual electronic files and a storage policy specific to the end-user.
Fig.1 is a block diagram of a data processing system 100. The data processing system 100 comprises, for example, any of an individual end-user's PC, a laptop PC, a tablet PC, a Smartphone, an individual end-user's home network, etc., or a computer network of an organization such as a commercial enterprise, a governmental department, a university, etc.
The data processing system 100 has a user interface 102, and also has a data storage system 104 according to the invention. The data storage system 104 is configured for logically organizing a plurality of electronic files in one or more file folders. The user interface 102 has a display monitor 106 for graphically representing to a user of the data processing system 100 a logical organization of the file folders stored at the data storage system 104. As shown in the example of Fig.1, the file folders may form a hierarchical folder configuration. The hierarchical folder configuration has a first level of first-level file folders 108. A specific one of the first-level file folders 108 may itself contain one or more second-level file folders. For example, the specific one of the first-level file folders 108 is indicated with a reference numeral 110, and is shown to comprise a second level of second-level file folders 112. Likewise, a particular one of the second-level file folders 112 may itself contain one or more third-level file folders. For example, the particular one of the second-level file folders 112 is indicated with a reference numeral 114, and is shown to comprise a third level of third-level file folders 116, and so on. Accordingly, a file folder in the logical organization may be empty, or may contain one or more further file folders, and/or may contain one or more electronic files.
The example in the diagram of Fig.1 shows the logical organization of the file folders as a branching configuration rendered on the display monitor 106. Other examples (not shown) of visually representing the logical organization of the file folders on a display monitor are feasible. See, e.g., international application publication WO0244935 filed for Greg Roelofs for "GUI has library metaphor based on non-euclidean geometry" and incorporated herein by reference.
International application publication WO0244935 describes a data processing system with a GUI that enables the user to interact with a virtual environment. The environment has a graphical representation of a storage based on a library metaphor. The storage is being used to graphically archive information items. The virtual environment has a path-dependent geometry. This allows modification of the storage to add additional items without visually disrupting the organization of the items stored previously.
The data storage system 104 comprises a plurality of physical storage devices, e.g., a first physical storage device 118, a second physical storage device 120, and a third physical storage device 122. In addition to being configured for logically organizing the plurality of electronic files, as discussed above, the data storage system 104 is also configured for physically organizing the plurality of electronic files by means of allocating each particular one of the electronic files to one of the first physical storage device 118, the second physical storage device 120 and the third physical storage device 122.
The data storage system 104 comprises a file system 126 operative to manage the logical organization of the electronic files as well as the physical organization of the electronic files.
In the example of Fig.1, the first physical storage device 118 and the second physical storage device 120 are local to the data processing system 100, whereas the third physical storage device 122 is remote to the data processing system 100 and is accessed from the data processing system 100 via a data network 124, e.g., the Internet.
In a first example scenario, the data processing system 100 comprises, for example, an end-user's PC, and the first physical storage device 118 comprises, e.g., a local hard disk drive (HDD), and the second physical storage device 120 comprises, e.g., a solid-state drive (SSD) accommodated at the PC.
In a second example scenario, the data processing system 100 comprises the end-user's home network, and the first physical storage device 118 comprises, e.g., a local HDD of the user's desktop PC and the second physical storage device 120 comprises, e.g., the end-user's direct-attached storage (DAS). A typical DAS comprises a number of HDDs connected to the PC through a host bus adapter (HBA).
In a third scenario, the data processing system 100 comprises the end-user's home network, and the first physical storage device 118 comprises, e.g., a HDD of the user's desktop PC and the second physical storage device 120 comprises, e.g., an SSD of the end-user's Smartphone temporarily connected to the home network.
In a fourth scenario, the data processing system 100 comprises the end-user's home network, and the first physical storage device 118 comprises, e.g., a local HDD of the user's desktop PC and the second physical storage device 120 comprises, e.g., network-attached storage (NAS) connected to the desktop PC via the end-user's local area network (LAN).
In a fifth scenario, the data processing system 100 comprises a corporate computer network of a commercial enterprise. The first physical storage device 118 comprises read/write storage managed by the commercial enterprise, and the second physical storage device 120 comprises write-once-read-many (WORM) storage, managed by the commercial enterprise.
The third physical storage device 122 is remote to the data processing system 100, and is connected to the data processing system 100 via a data network 124, e.g., the Internet or another wide area network (WAN). The third physical storage device 122 is then formed by one or more servers that provide an online storage facility to the end-user, e.g., through a cloud service provider. As known, the expressions "cloud computing" and "cloud service", as used within the field of data processing, refer to the delivery of hosted services over the Internet. The term "cloud" in the expressions "cloud computing" and "cloud service" refers to the Internet and was reportedly inspired by the cloud-like symbol that has typically been used to represent the Internet in diagrams and flowcharts. A hosted service can be provided by a server or by a conglomerate of multiple servers located anywhere in the cloud. A cloud service is available to the public and to enterprises, is accessed through the Internet or via a WAN, and is purchased on an as-needed basis. A cloud service is fully managed by the cloud service provider. Furthermore, a cloud service is elastic in the sense that a user can have as much or as little of the cloud service, e.g., compute power or data storage capacity, as he/she wants at any given time.
Conventionally, the operation of a file system is such that there is a one-to-one relationship between, on the one hand, a particular file folder (a unit of logical organization) and, on the other hand, a particular physical storage device (a unit of physical organization). That is, the logical organization of electronic files in file folders as chosen by the end-user determines the physical organization of the electronic files at the physical storage devices.
Consider the examples of each of the first physical storage device 118, the second physical storage device 120 and the third physical storage device 122, as given above. Different ones of the first physical storage device 118, the second physical storage device 120 and the third physical storage device 122 have different storage properties such as, e.g., different speeds of storage or retrieval, different storage capacities, different costs, etc. In a conventional data storage system, the end-user cannot optimize, per individual electronic file, his/her usage of the data storage system 104 on the basis of the storage properties of the physical storage devices available, as the conventional data storage system does not allow logically storing different electronic files in the same file folder and at the same time physically storing these different electronic files of the same file folder at different ones of the first physical storage device 118, the second physical storage device 120 and the third physical storage device 122. As a result, the performance of storage and retrieval of each individual electronic file that is logically stored in the same file folder, is determined by the very characteristics of the specific physical storage device, to which the file folder has been allocated, regardless of a manner wherein the end-user wishes to engage in using the individual electronic file. As known, an SSD has very low access time and very low latency compared to a HDD, is more costly per unit of storage capacity than a HDD, and typically supports a number of write operations in operational use that is much lower than the number of write operations supported by a HDD.
Consider, for example, an end-user of a data processing system, who has created on his/her data processing system a specific topical file folder for logically storing electronic files on a particular topic, e.g., electric guitars. The electronic files stored comprise, e.g., word processor documents (text of emails and of letters sent and received in the mail), documents in pdf-format (scanned-in copies of owner manuals and repair manuals, sheet music), still pictures (pictures taken at concerts, pictures of the end-user's guitars), audio clips (e.g., of favorite music downloaded from the web), video clips (videos of concerts, videos of the end-user performing on guitar), web documents (web pages as saved), etc. The end-user finds it convenient to have a storage policy based on the size of the electronic files, i.e., on the number of bytes of digital information per individual electronic file. The end-user may then want to physically keep the electronic files, which are logically allocated to this specific topical file folder and have a size lower than a pre-determined magnitude, in local storage, e.g., at the end-user's PC or Smartphone, and the other electronic files, which are logically allocated to the same topical file folder and have a size equal to or larger than the pre-determined magnitude, in remote storage, e.g., in cloud storage.
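The size-based policy of this example can be sketched in a few lines. This is only an illustration under stated assumptions: the names SIZE_THRESHOLD_BYTES and choose_location, the threshold value, and the file sizes are all hypothetical; the text itself only speaks of "a pre-determined magnitude".

```python
# Hypothetical threshold; the description only mentions "a pre-determined magnitude".
SIZE_THRESHOLD_BYTES = 50 * 1024 * 1024


def choose_location(size_bytes: int, threshold: int = SIZE_THRESHOLD_BYTES) -> str:
    """Return 'local' for files smaller than the threshold, otherwise 'remote'."""
    return "local" if size_bytes < threshold else "remote"


# Both files belong to the same logical file folder ("electric guitars"),
# yet they may be placed on different physical storage.
folder = {
    "owner_manual.pdf": 2_000_000,      # illustrative sizes in bytes
    "concert_video.mp4": 900_000_000,
}
placement = {name: choose_location(size) for name, size in folder.items()}
```

A file whose size equals the threshold goes to remote storage, matching the "equal to or larger than" wording above.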
Consider, as another example, an end-user who downloads movies (motion pictures) from the Internet and who classifies the downloaded movies by logically storing the downloaded movies in a particular file folder at his/her data processing system. The particular file folder comprises multiple subsidiary file folders so as to be able to logically organize the movies and cluster the movies in a specific subsidiary file folder based on a specific semantic aspect (e.g., genre, era, film director, featuring actor, etc.). While downloading a first movie (e.g., "The New Haircut of the Jedi" of A.T. des Pierre-Montagnes), this end-user may decide to watch the first movie in the near future; and while downloading a second movie ("Goya's women" of Mr Mann-Emperor), this end-user may decide to archive this second movie so as to have the second movie available if and when he/she decides to watch it. It would be convenient for this end-user to initially have the first movie physically stored locally, e.g., at the end-user's home network, and to initially have the second movie physically stored remotely, e.g., at a server in the cloud.
Consider, as yet another example, a research institute that uses the data storage system 104 for storing information from their projects. The information is logically organized in directories and file folders. Information pertaining to a specific project is logically stored in a specific one of the file folders. Each specific file folder has one or more sub-folders for storing electronic files related to the specific project. For example, a specific file folder has a first sub-folder for storing electronic files of library literature, including scientific publications in the field of the specific project, drafts of scientific publications being prepared by members of the staff of the specific project, handbooks, etc., a second sub-folder for storing electronic files of confidential test results relating to the specific project, a third sub-folder for storing electronic files with information about the individuals in the specific project's staff, such as contact information, personal data, salary, etc.
The first physical storage device 118 comprises a SAN and the second physical storage device 120 comprises a WORM-drive. The first physical storage device 118 and the second physical storage device 120 are managed by the research institute. The third physical storage device 122 is formed by one or more servers of a cloud storage service provider, and is accessible via the Internet.
The file system 126 of the data storage system 104 is configured for storing the electronic files of the first sub-folder as follows. The electronic files of the scientific publications and the handbooks are stored at the third physical storage device 122, i.e., in the cloud, whereas the electronic files relating to drafts of scientific publications are stored locally at the first physical storage device 118 for reasons of security or confidentiality. The file system 126 discriminates between the electronic files on the basis of their attributes, e.g., metadata manually added to the drafts of the scientific publications by their authors.
The file system 126 of the data storage system 104 is configured for storing the electronic files of the second sub-folder, i.e., the test results, at the WORM-drive of the second physical storage device 120. The file system 126 is configured for automatically copying the test results that have been on the WORM-drive for, say, 6 months, to the third physical storage device 122 of the cloud. A reason for this is, for example, that a physical WORM-drive unit, e.g., a CD-R or a DVD-R, has finite storage capacity and that, when full, is functionally disconnected from the data storage system 104 and physically transported to a physical array from which it can be retrieved and functionally re-connected to the data storage system 104, if and when needed. This means that the WORM-drive system may impose too long delays on the retrieval of old test data. It may therefore be more efficient to automatically copy the data on the WORM-drive to the cloud after a certain time period, so as to have the old data available on-line. However, the research institute may not want the test data stored at servers outside the jurisdiction of the particular territory, wherein the research institute is operating. Accordingly, the research institute has specified to the cloud service provider that the test data uploaded to the cloud be stored at servers within the particular territory.
The file system 126 allocates the electronic files of the test results to the WORM-drive 120 under control of a dedicated attribute of the electronic files of the test results, either added manually by the staff or generated automatically by the test equipment having been configured in advance to do so. The file system 126 copies the older data at the WORM-drive to the cloud under control of a time-stamp added at the time of recording at the WORM-drive. The cloud service provider processes the test data uploaded by the research institute under control of a file attribute signifying to the cloud service provider that the uploaded data be stored at a server within the particular territory.
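The time-stamp-controlled copying described above could look roughly as follows. This is a minimal sketch, not the implementation of the file system 126: the names MAX_AGE_ON_WORM, files_due_for_cloud_copy and cloud_upload_request, the 183-day period (standing in for "say, 6 months") and the attribute key store_within_territory are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical period standing in for "say, 6 months".
MAX_AGE_ON_WORM = timedelta(days=183)


def files_due_for_cloud_copy(recorded_at: dict, now: datetime) -> list:
    """Return the names of files whose WORM recording time-stamp is at least
    MAX_AGE_ON_WORM old, sorted by name."""
    return sorted(
        name for name, stamp in recorded_at.items() if now - stamp >= MAX_AGE_ON_WORM
    )


def cloud_upload_request(name: str, territory: str) -> dict:
    """Build an upload request carrying a (hypothetical) file attribute that tells
    the cloud service provider to keep the data within the given territory."""
    return {"file": name, "attributes": {"store_within_territory": territory}}
```

For example, with time-stamps of 1 December 2023 and 1 June 2024 and a current date of 1 July 2024, only the first file is selected for copying.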
The file system 126 of the data storage system 104 is configured for storing the electronic files of the third sub-folder (information about the individuals in the specific project's staff) at the SAN of the first physical storage device 118, i.e., locally for security and confidentiality reasons.
In the invention, the data storage system 104 is configured for enabling the user to logically store different electronic files in the same file folder and to physically store different electronic files of the same file folder in different ones of the first physical storage device 118, the second physical storage device 120 and the third physical storage device 122. This approach supports optimizing the usage of the data storage system 104, by considering the storage properties of individual ones of the first physical storage device 118, the second physical storage device 120 and the third physical storage device 122 within the context of the usage of the electronic files as contemplated by the end-user.
Accordingly, the data storage system 104 in the invention is configured for logically storing the electronic files in a plurality of file folders, e.g., one or more specific ones of the first-level file folders 108, and/or one or more specific ones of the second-level file folders 112, and/or one or more specific ones of the third-level file folders 116. The end-user may specify per individual electronic file in which particular one of the file folders the individual electronic file is to be logically stored, i.e., logically classified. Alternatively, or in addition, the data storage system 104 determines a semantic aspect of the individual electronic file, e.g., as manually specified by the end-user at the start of the storing operation, and/or as derived from the metadata associated with the individual electronic file, and automatically classifies the individual electronic file in a particular one of the file folders under control of the semantic aspect.
Examples of such semantic aspects are, e.g., one or more tags manually added to the individual electronic file by the end-user at the start of the storage operation that are representative of the subjective, i.e., personal, meaning of the individual electronic file to this end-user. As mentioned above, the allocation of individual electronic files to individual ones of the file folders, i.e., the logical organization, represents a systematic, user-specific, classification that has typically been determined in advance by the end-user. The classification is meant to assist the user in managing his/her electronic files, e.g., for identifying or retrieving a particular one of the electronic files, on the basis of their semantic meaning to this end-user. Examples of a semantic aspect include, e.g., file meta-data such as "author of the electronic file", "title of the electronic file", "source of the electronic file", such as the website from which the electronic file is downloaded, "geotag information" (geographical identification metadata added to videos, pictures, SMS messages so as to add location-specific information), etc. Optionally, the end-user may manually overrule the automatic classification. Accordingly, in response to a request for logically storing the individual electronic file in a particular one of the file folders, e.g., a particular one of the second-level file folders 112, the data storage system 104 logically stores this particular electronic file in the particular one of the second-level file folders 112 as specified by the end-user or under control of a semantic aspect derived from the particular electronic file.
On the other hand, the data storage system 104 in the invention is also configured for physically storing the individual electronic file at a specific one of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122. To this end, the data storage system 104 determines an attribute of the particular electronic file. The term "attribute" as used in this text indicates a functional property of the particular electronic file with regard to an intended usage of the particular electronic file by the end-user or with regard to storage of the particular electronic file according to a preference of the end-user. Examples of an attribute are the size of the particular electronic file, the format of the particular electronic file, the time of day of downloading or storing the particular electronic file, a tag added by the end-user and signifying an intended use of the particular electronic file, etc. The attribute of the particular electronic file is relevant to the process of determining on which one of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122, the particular electronic file is to be physically stored. The data storage system 104 is therefore configured to select a particular one of the first physical storage device 118, the second physical storage device 120 and the third physical storage device 122 under control of the attribute of the particular electronic file, and to physically store the particular electronic file at the selected one of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122.
The file system 126 of the data storage system 104 in the invention comprises an attribute determinant 128 that is operative to determine one or more attributes of a specific electronic file that is to be stored physically at a specific one of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122. The attribute determinant 128 extracts the attribute from, e.g., the meta-data of the specific electronic file to be stored. Alternatively, or in addition, the attribute determinant 128 receives the attribute as a specific user-input manually entered into the data processing system 100 by the end-user at the start of the storing operation via a suitable component of the user interface 102, e.g., an alphanumeric keyboard (not shown). Examples of the attribute have been discussed above.
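One way to picture the attribute determinant 128 is as a function that merges attributes taken from the file's meta-data with attributes entered by the end-user. The sketch below is an assumption-laden illustration: the function name determine_attributes, the attribute keys, and the choice to let explicit user input override meta-data are all hypothetical and not prescribed by the text.

```python
import os
from typing import Optional


def determine_attributes(path: str, metadata: dict,
                         user_input: Optional[dict] = None) -> dict:
    """Collect attributes of a file from its meta-data and from explicit user input.

    Assumption of this sketch: explicit user input overrides meta-data.
    """
    attributes = {
        "size": metadata.get("size"),
        # File format derived from the extension, e.g. "manual.PDF" -> "pdf".
        "format": os.path.splitext(path)[1].lstrip(".").lower(),
    }
    # Carry over any further meta-data entries (author, source, ...).
    attributes.update({k: v for k, v in metadata.items() if k not in attributes})
    if user_input:
        attributes.update(user_input)
    return attributes
```

A call such as determine_attributes("manual.PDF", {"size": 10, "author": "x"}, {"intended_use": "watch_soon"}) yields a single dictionary that a storage policy can then evaluate.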
The file system 126 also comprises a storage policy, e.g., in the form of a rule base 130.
The rule base 130 comprises one or more rules that specify at which one of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122, the specific electronic file is to be stored physically, given the one or more attributes of the specific electronic file as determined by the attribute determinant 128. The rules have been specified in advance, e.g., by the end-user, or by a cloud service provider, or by the developer of the control-software running on the data processing system 100. For example, given the collection of physical storage devices available to the data processing system 100 (here: the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122), and given the profile or preferences of the end-user with regard to his/her intended usage of the data storage system 104, a set of rules is determined, here: in advance, to optimize the usage of the data storage system 104 for this end-user, e.g., with regard to costs, latency at storage or retrieval, mobility of the end-user, etc.
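A rule base of this kind can be sketched as an ordered list of rules, each pairing a condition on the file's attributes with a condition on a device's storage properties. Everything named here is hypothetical: the identifiers device_118, device_120 and device_122 merely echo the reference numerals, and the property keys, attribute keys, thresholds and rule order are assumptions made for illustration.

```python
# Hypothetical storage properties of the three devices of Fig.1.
DEVICES = {
    "device_118": {"local": True, "worm": False},
    "device_120": {"local": True, "worm": True},
    "device_122": {"local": False, "worm": False},
}

# Each rule: (condition on file attributes, condition on device properties).
# Rules are evaluated in order; the first applicable rule with a matching device wins.
RULE_BASE = [
    (lambda a: a.get("audited", False), lambda p: p["worm"]),
    (lambda a: a.get("confidential", False), lambda p: p["local"] and not p["worm"]),
    (lambda a: a.get("size", 0) >= 100_000_000, lambda p: not p["local"]),
]


def select_device(attributes: dict, devices: dict = DEVICES,
                  rules: list = RULE_BASE, default: str = "device_118") -> str:
    """Select a device under combined control of the attributes and the rules."""
    for file_matches, device_matches in rules:
        if file_matches(attributes):
            for name, properties in devices.items():
                if device_matches(properties):
                    return name
    return default
```

Expressing the device side of each rule in terms of storage properties, rather than device names, keeps the rules valid when a device is added or replaced.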
As a result, the end-user specifies in which of the file folders a specific electronic file is to be classified, and the file system 126 takes care of logically storing the specific electronic file in the file folder specified, and of physically storing the specific electronic file at a specific one of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122, as selected under control of an attribute of the specific electronic file, the rule base 130 and in dependence on the storage properties of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122.
Fig.2 is a process diagram for illustrating a method 200 in the invention as executed in the data processing system 100 of Fig.1.
In a first step 202, the data storage system 104 receives a request for logically storing a particular electronic file in a particular file folder.
In a second step 204, the file system 126 of the data storage system 104 logically stores the particular electronic file in the particular file folder as requested.
In a third step 206, the attribute determinant 128 determines an attribute of the particular electronic file.
In a fourth step 208, the file system 126 consults the rule base 130 in order to determine one or more rules under control of the attribute, so as to be able to select a particular one of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122 in dependence on the storage properties of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122.
In a fifth step 210, the particular one of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122 is selected.
In a sixth step 212, the file system 126 physically stores the particular electronic file at the selected one of the first physical storage device 118, the second physical storage device 120, and the third physical storage device 122.
After the second step 204 and the sixth step 212 have been executed, the method 200 proceeds to a seventh step 214 for awaiting a next request for logically storing a next electronic file. Upon receipt of the next request, the method returns to the first step 202, now applied to the next electronic file.
The second step 204, on the one hand, and the sequence of the third step 206, the fourth step 208, the fifth step 210 and the sixth step 212, on the other hand, have been drawn as separate threads of the method 200. The separate threads may be executed in parallel or one after the other.
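The two threads of the method 200 can be illustrated with a small sketch in which the logical storing (step 204) and the physical storing (steps 206 to 212) run in parallel. This is an assumption-based illustration only: the class FileSystemSketch, its in-memory dictionaries, and the use of the file size as the sole attribute are hypothetical simplifications and do not represent the file system 126 itself.

```python
from concurrent.futures import ThreadPoolExecutor


class FileSystemSketch:
    """In-memory stand-in for a file system with separate logical and physical stores."""

    def __init__(self, select_device):
        self.folders = {}   # logical organization: folder -> set of file names
        self.devices = {}   # physical organization: device -> {file name: content}
        self.select_device = select_device  # storage policy supplied by the caller

    def store_logically(self, name, folder):             # step 204
        self.folders.setdefault(folder, set()).add(name)

    def store_physically(self, name, content):
        attributes = {"size": len(content)}               # step 206 (simplified)
        device = self.select_device(attributes)           # steps 208 and 210
        self.devices.setdefault(device, {})[name] = content  # step 212
        return device

    def handle_request(self, name, folder, content):     # step 202
        # The two threads touch different dictionaries, so they may run in parallel.
        with ThreadPoolExecutor(max_workers=2) as pool:
            logical = pool.submit(self.store_logically, name, folder)
            physical = pool.submit(self.store_physically, name, content)
            logical.result()
            return physical.result()
```

Running the two submissions one after the other instead of in parallel would give the same end state, which mirrors the statement that the threads may be executed in either manner.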
Claims
1. A method of using a data storage system (104), wherein:
the data storage system comprises at least a first physical storage device (118) and a second physical storage device (120);
the first physical storage device has a first storage property and the second physical storage device has a second storage property different from the first storage property;
the data storage system is operative to logically store one or more respective ones of multiple electronic files of an end-user at a respective one of a plurality of file folders (108, 112, 116), and to physically store one or more respective ones of the electronic files at a respective one of the first physical storage device and the second physical storage device;
the method comprises:
receiving (202) a request for storing a particular one of the electronic files in a particular one of the file folders; and
in response to the request:
logically storing (204) the particular electronic file in the particular one of the file folders;
determining (206) an attribute of the particular electronic file;
determining a storage policy of the end-user that depends on the first storage property and the second storage property;
under combined control of the attribute and the storage policy selecting (210) a particular one of the first physical storage device and the second physical storage device; and
physically storing (212) the particular electronic file at the particular physical storage device.
2. The method of claim 1, wherein the attribute is determined by at least one of:
metadata in the particular electronic file that is descriptive of the particular electronic file;
explicit user input from the end-user;
a user-specific context of the request for storing;
a history of recent further requests for storing; and
a pre-determined profile of the end-user.
3. The method of claim 2, wherein the attribute is indicative of at least one of:
a size of the particular electronic file;
a time of creation or availability of the particular electronic file;
a file format of the particular electronic file;
a semantic aspect of a content of the particular electronic file;
whether or not the particular electronic file is subjected to an auditing policy;
whether or not the end-user is going to need the particular electronic file in the near future;
a time of the day at which the request for storing is received;
a day of the week on which the request for storing is received;
a respective cost indication for storing the particular file at a respective one of the first physical storage device and the second physical storage device; and
a history of recent further requests for storing.
4. The method of claim 1, wherein a relevant one of the first storage property and the second storage property is indicative of at least one of:
a speed of storing data at, or of retrieving data from, the relevant one of the first physical storage device and the second physical storage device;
a storage capacity available at the relevant one of the first physical storage device and the second physical storage device;
cost per unit of the storage capacity of the relevant one of the first physical storage device and the second physical storage device;
latency on a data connection via which data is stored at or retrieved from the relevant one of the first physical storage device and the second physical storage device;
whether or not the relevant one of the first physical storage device and the second physical storage device is of a Write-Once-Read-Many type; and
a geographic location of the relevant one of the first physical storage device and the second physical storage device.
5. The method of claim 1, wherein:
the data storage system forms a functional part of a data processing system (100) of the end-user;
the first physical storage device comprises remote storage connected via a data network (124) to the data processing system; and
the second physical storage device is local to the data processing system.
6. The method of claim 1, wherein the first physical storage device and the second physical storage device are managed by a cloud service provider.
7. A data storage system (104), wherein:
the data storage system comprises at least a first physical storage device (118) and a second physical storage device (120);
the first storage device has a first storage property and the second storage device has a second storage property different from the first storage property;
the data storage system is operative to logically store one or more respective ones of multiple electronic files of an end-user at a respective one of a plurality of file folders (108, 112, 116), and to physically store one or more respective ones of the electronic files at a respective one of the first physical storage device and the second physical storage device;
the data storage system comprises:
an input for receiving (202) a request for storing a particular one of the electronic files in a particular one of the file folders; and
a file system (126) configured for, in response to the request:
logically storing (204) the particular electronic file in the particular one of the file folders;
determining (206) an attribute of the particular electronic file;
determining a storage policy of the end-user that depends on the first storage property and the second storage property;
under combined control of the attribute and the storage policy, selecting (210) a particular one of the first physical storage device and the second physical storage device; and
physically storing (212) the particular electronic file at the particular physical storage device.
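The sequence of steps recited in claim 7 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed file system (126): the names `Device`, `FileSystem`, `Policy` and `size_policy`, the dictionary keys, the 1,000,000-byte threshold and the property values are all hypothetical, and the only attribute considered is the file size.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class Device:
    name: str
    properties: Dict[str, float]
    blobs: Dict[Tuple[str, str], bytes] = field(default_factory=dict)


# A policy maps (attribute, first properties, second properties) to "first" or "second".
Policy = Callable[[Dict[str, int], Dict[str, float], Dict[str, float]], str]


@dataclass
class FileSystem:
    first: Device
    second: Device
    policy: Policy
    folders: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def store(self, folder: str, filename: str, data: bytes) -> str:
        # Logically storing (204): record the file under the chosen folder.
        self.folders.setdefault(folder, {})[filename] = ""
        # Determining (206) an attribute; here only the file size.
        attribute = {"size": len(data)}
        # Selecting (210) under combined control of attribute and policy.
        choice = self.policy(attribute, self.first.properties, self.second.properties)
        device = self.first if choice == "first" else self.second
        # Physically storing (212) at the selected device.
        device.blobs[(folder, filename)] = data
        self.folders[folder][filename] = device.name
        return device.name


def size_policy(attribute, first_props, second_props):
    """Assumed end-user policy: large files go to the cheaper device,
    small files to the device with the lower latency."""
    key = "cost_per_gb" if attribute["size"] > 1_000_000 else "latency_ms"
    return "first" if first_props[key] <= second_props[key] else "second"


fs = FileSystem(
    first=Device("remote", {"cost_per_gb": 0.02, "latency_ms": 40.0}),
    second=Device("local", {"cost_per_gb": 0.10, "latency_ms": 0.1}),
    policy=size_policy,
)
```

In this sketch the folder an end-user sees is independent of the device that holds the bytes: two files stored in the same folder may end up on different devices, depending on the attribute and the policy.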
8. The data storage system of claim 7, wherein the attribute is determined by at least one of:
metadata in the particular electronic file that is descriptive of the particular electronic file;
explicit user input from the end-user;
a user-specific context of the request for storing;
a history of recent further requests for storing; and
a pre-determined profile of the end-user.
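One possible way to combine the sources listed in claim 8 is a simple lookup with a fixed precedence. The claim itself does not prescribe any ordering; the order used below (explicit user input, then metadata, then context, then the most recent history entry, then the profile) and the function name `determine_attribute` are assumptions made for illustration.

```python
from typing import Any, Mapping, Optional, Sequence


def determine_attribute(
    key: str,
    user_input: Optional[Mapping[str, Any]] = None,
    metadata: Optional[Mapping[str, Any]] = None,
    context: Optional[Mapping[str, Any]] = None,
    history: Sequence[Mapping[str, Any]] = (),
    profile: Optional[Mapping[str, Any]] = None,
) -> Any:
    """Return the first value found for `key`, or None if no source has it.

    Assumed precedence: user input, metadata, context, history entries
    from newest to oldest (history is ordered oldest first), profile.
    """
    sources = [user_input, metadata, context, *reversed(history), profile]
    for source in sources:
        if source is not None and key in source:
            return source[key]
    return None
```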
9. The data storage system of claim 8, wherein the attribute is indicative of at least one of:
a size of the particular electronic file;
a time of creation or availability of the particular electronic file;
a file format of the particular electronic file;
a semantic aspect of a content of the particular electronic file;
whether or not the particular electronic file is subjected to an auditing policy;
whether or not the end-user is going to need the particular electronic file in the near future;
a time of the day at which the request for storing is received;
a day of the week on which the request for storing is received;
a respective cost indication for storing the particular file at a respective one of the first physical storage device and the second physical storage device; and
a history of recent further requests for storing.
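Some of the attribute values listed in claim 9 can be derived directly from the request itself. The sketch below covers only four of them (size, file format, time of day, day of the week); the names `FileAttribute` and `describe` are hypothetical, and reading the file format from the filename extension is a simplifying assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from pathlib import PurePath


@dataclass(frozen=True)
class FileAttribute:
    size_bytes: int      # size of the file
    file_format: str     # file format, taken here from the name's extension
    hour_of_day: int     # time of day at which the request is received
    day_of_week: int     # day of the week, Monday = 0 ... Sunday = 6


def describe(filename: str, data: bytes, received_at: datetime) -> FileAttribute:
    """Build an attribute record for a storing request."""
    return FileAttribute(
        size_bytes=len(data),
        file_format=PurePath(filename).suffix.lower().lstrip("."),
        hour_of_day=received_at.hour,
        day_of_week=received_at.weekday(),
    )
```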
10. The data storage system of claim 7, wherein a relevant one of the first storage property and the second storage property is indicative of at least one of:
a speed of storing data at, or of retrieving data from, the relevant one of the first physical storage device and the second physical storage device;
a storage capacity available at the relevant one of the first physical storage device and the second physical storage device;
cost per unit of the storage capacity of the relevant one of the first physical storage device and the second physical storage device;
latency on a data connection via which data is stored at or retrieved from the relevant one of the first physical storage device and the second physical storage device;
whether or not the relevant one of the first physical storage device and the second physical storage device is of a Write-Once-Read-Many type; and
a geographic location of the relevant one of the first physical storage device and the second physical storage device.
11. The data storage system of claim 7, wherein:
the data storage system forms a functional part of a data processing system (100) of the end-user;
the first physical storage device comprises remote storage connected via a data network (124) to the data processing system; and
the second physical storage device is local to the data processing system.
12. Control software for configuring a data storage system (104), wherein:
the data storage system comprises at least a first physical storage device (118) and a second physical storage device (120);
the first physical storage device has a first storage property and the second physical storage device has a second storage property different from the first storage property;
the data storage system is operative to logically store (204) one or more respective ones of multiple electronic files of an end-user at a respective one of a plurality of file folders (108, 112, 116), and to physically store (212) one or more respective ones of the electronic files at a respective one of the first physical storage device and the second physical storage device;
the data storage system comprises an input for receiving (202) a request for storing a particular one of the electronic files in a particular one of the file folders; and
the control software comprises:
first instructions for, in response to the request, logically storing (204) the particular electronic file in the particular one of the file folders;
second instructions for, in response to the request, determining (206) an attribute of the particular electronic file;
third instructions for, in response to the request, determining a storage policy of the end-user that depends on the first storage property and the second storage property;
fourth instructions for, in response to the request and under combined control of the attribute and the storage policy, selecting (210) a particular one of the first physical storage device and the second physical storage device; and
fifth instructions for, in response to the request, physically storing (212) the particular electronic file at the particular physical storage device.
13. The control software of claim 12, wherein the attribute is determined by at least one of:
metadata in the particular electronic file that is descriptive of the particular electronic file;
explicit user input from the end-user;
a user-specific context of the request for storing;
a history of recent further requests for storing; and
a pre-determined profile of the end-user.
14. The control software of claim 13, wherein the attribute is indicative of at least one of:
a size of the particular electronic file;
a time of creation or availability of the particular electronic file;
a file format of the particular electronic file;
a semantic aspect of a content of the particular electronic file;
whether or not the particular electronic file is subjected to an auditing policy;
whether or not the end-user is going to need the particular electronic file in the near future;
a time of the day at which the request for storing is received;
a day of the week on which the request for storing is received;
a respective cost indication for storing the particular file at a respective one of the first physical storage device and the second physical storage device; and
a history of recent further requests for storing.
15. The control software of claim 12, wherein a relevant one of the first storage property and the second storage property is indicative of at least one of:
a speed of storing data at, or of retrieving data from, the relevant one of the first physical storage device and the second physical storage device;
a storage capacity available at the relevant one of the first physical storage device and the second physical storage device;
cost per unit of the storage capacity of the relevant one of the first physical storage device and the second physical storage device;
latency on a data connection via which data is stored at or retrieved from the relevant one of the first physical storage device and the second physical storage device;
whether or not the relevant one of the first physical storage device and the second physical storage device is of a Write-Once-Read-Many type; and
a geographic location of the relevant one of the first physical storage device and the second physical storage device.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP12784586.5A EP2776952A2 (en) | 2011-11-10 | 2012-11-09 | Logically and end-user-specific physically storing an electronic file |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP11188601 | 2011-11-10 | ||
PCT/EP2012/072264 WO2013068530A2 (en) | 2011-11-10 | 2012-11-09 | Logically and end-user-specific physically storing an electronic file |
EP12784586.5A EP2776952A2 (en) | 2011-11-10 | 2012-11-09 | Logically and end-user-specific physically storing an electronic file |
Publications (1)
Publication Number | Publication Date |
---|---|
EP2776952A2 (en) | 2014-09-17 |
Family
ID=47178011
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP12784586.5A Withdrawn EP2776952A2 (en) | 2011-11-10 | 2012-11-09 | Logically and end-user-specific physically storing an electronic file |
Country Status (2)
Country | Link |
---|---|
EP (1) | EP2776952A2 (en) |
WO (1) | WO2013068530A2 (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9280683B1 (en) | 2014-09-22 | 2016-03-08 | International Business Machines Corporation | Multi-service cloud storage decision optimization process |
JP2018502385A (en) | 2014-12-08 | 2018-01-25 | Umbra Technologies Ltd. | System and method for content retrieval from a remote network region |
US11711346B2 (en) | 2015-01-06 | 2023-07-25 | Umbra Technologies Ltd. | System and method for neutral application programming interface |
JP2018507639A (en) | 2015-01-28 | 2018-03-15 | Umbra Technologies Ltd. | System and method for global virtual network |
JP2018519688A (en) | 2015-04-07 | 2018-07-19 | Umbra Technologies Ltd. | Multi-perimeter firewall in the cloud |
WO2017098326A1 (en) | 2015-12-11 | 2017-06-15 | Umbra Technologies Ltd. | System and method for information slingshot over a network tapestry and granularity of a tick |
US11743332B2 (en) * | 2016-04-26 | 2023-08-29 | Umbra Technologies Ltd. | Systems and methods for routing data to a parallel file system |
CN110377854A (en) * | 2019-05-31 | 2019-10-25 | 平安科技(深圳)有限公司 | User access activity information monitoring method and device, computer equipment |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2004515003A (en) | 2000-11-28 | 2004-05-20 | Koninklijke Philips Electronics N.V. | GUI with library metaphor based on non-Euclidean geometry |
US7454446B2 (en) * | 2001-08-31 | 2008-11-18 | Rocket Software, Inc. | Techniques for storing data based upon storage policies |
US8417678B2 (en) * | 2002-07-30 | 2013-04-09 | Storediq, Inc. | System, method and apparatus for enterprise policy management |
2012
- 2012-11-09: WO PCT/EP2012/072264 (WO2013068530A2), active, Application Filing
- 2012-11-09: EP EP12784586.5A (EP2776952A2), not active, Withdrawn
Non-Patent Citations (1)
Title |
---|
See references of WO2013068530A2 * |
Also Published As
Publication number | Publication date |
---|---|
WO2013068530A2 (en) | 2013-05-16 |
WO2013068530A3 (en) | 2013-12-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PUAI | Public reference made under Article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
| | 17P | Request for examination filed | Effective date: 20140620 |
| | AK | Designated contracting states | Kind code of ref document: A2. Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
| | STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN |
| | 18W | Application withdrawn | Effective date: 20141219 |