METHOD AND APPARATUS FOR STORING AND ACCESSING DATA IN COMPUTER SYSTEMS
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to computing systems and, more particularly, to storing and accessing data in computing systems.
2. Description of the Related Art
In a computing environment, data may be scattered over an address space in memory or on a persistent storage device (e.g., disk). Fig. 1A depicts a portion of an address space 100 having various data objects scattered inside. As depicted in Fig. 1A, a data object 102 can reference a group of data objects 104, 106 and 108. Each of the data objects 104, 106 and 108 contains data which can, for example, represent a particular type of data (e.g., integer, float, string, etc.). Accordingly, data objects of various types and sizes may be associated with each other in a group. In addition, the data in the address space 100 may represent several layers of nesting between data objects. For example, data object 102 references data object 108, which in turn references a data object 110. The data object 110 may also reference data objects 112 and 114, and so on.
For various applications, it is desirable to collect related data and store it on a data record in a space efficient manner. Furthermore, it is desirable to store data in a manner which allows relatively easy access to data (e.g., traversal of the data record in both forward and backward directions is possible). Fig. 1B depicts a conventional data record 120 suitable for storage of data in a sequential manner. The conventional data record 120 is partitioned into equally sized portions (e.g., 122, 124, and 126). The size of these equally sized portions is typically predetermined (e.g., 1 K bytes). Data can be stored in the equally sized portions of the conventional data record 120. For example, data objects 102, 104, and 106 (also shown in Fig. 1A) can be respectively stored on data record portions 122, 124, and 126. It should be noted that in order to store data objects of various sizes, the data record 120 is typically partitioned into relatively large portions. It should also be noted that since the data portions are of a predetermined size, it is relatively easy to traverse the conventional data record 120 in both forward and backward directions. Although the conventional data record 120 allows storage of data objects of various sizes, the conventional data record 120 has various drawbacks. For example, one problem is that a relatively large amount of space needs to be reserved for each portion (e.g., 122, 124, and 126) regardless of the size of the data that needs to be stored in that particular portion. Another problem is that the conventional data record 120 is not suitable for capturing relationships between data objects. For example, data objects associated with each other in a group ("peer-to-peer" relationship) cannot be adequately represented in the conventional data record 120. As another example, data objects in a nested relationship ("hierarchical" relationship) cannot be adequately represented by the conventional data record 120.
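By way of illustration only, the space overhead and the simple traversal of the conventional record can be worked through in a few lines. The sketch below assumes the 1 K byte portion size mentioned above; the object sizes and the function names are hypothetical and are not taken from the figures.

```python
PORTION_SIZE = 1024  # predetermined portion size (the 1 K byte example above)


def portion_offset(index, portion_size=PORTION_SIZE):
    """Start of the portion at a given index; fixed sizes make traversal trivial."""
    return index * portion_size


def fixed_record_usage(object_sizes, portion_size=PORTION_SIZE):
    """Return (bytes_reserved, bytes_used) when each object gets one portion."""
    for size in object_sizes:
        if size > portion_size:
            raise ValueError("object does not fit in one portion")
    reserved = portion_size * len(object_sizes)
    used = sum(object_sizes)
    return reserved, used


# Hypothetical sizes: a 4-byte integer, an 8-byte float, a 300-byte string.
reserved, used = fixed_record_usage([4, 8, 300])
wasted = reserved - used
```

Under these assumed sizes, most of the reserved space holds no data, which is the drawback described above.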
In view of the foregoing, there is a need for improved methods for storing and accessing data in computing systems.
SUMMARY OF THE INVENTION
Broadly speaking, the invention relates to techniques for storing and accessing data in computer systems. In accordance with one aspect of the invention, a data packing module suitable for packing data and storing it on a data record is disclosed. The data packing module can store data objects of various types and sizes on the data record. In addition to data associated with each data object, various relationships between data objects can be stored on the data record by the data packing module. In accordance with another aspect of the invention, a data unpacking module suitable for unpacking data stored on a data record is disclosed. The data unpacking module can be used to access data associated with data objects as well as various other information stored on the data record. In one embodiment, a data accessing interface is provided to enable access to data stored on the data record. For example, the data accessing interface provides the ability to read data, as well as the ability to traverse the data record in both forward and backward directions.
The invention can be implemented in numerous ways, including a system, an apparatus, a data record, a method, or a computer readable medium. Several embodiments of the invention are discussed below.
As a method of storing information capable of being represented as one or more objects which can have associated data, one embodiment of the invention includes at least the acts of: storing a forward-skip-indicator for an object; storing data associated with the object; and storing a back-skip-indicator for the data object.
As a data record capable of sequentially representing one or more objects capable of having associated data with a data size that is not predetermined, one embodiment of the invention comprises: a forward-skip-indicator field capable of having a forward-skip value which can be used to determine a next-position in the data record which can represent the position of a next sequentially stored object on the data record; a data field suitable for storing data associated with one or more objects; and a back-skip-indicator field capable of having a back-skip value which can be used to determine a previous-position in the data record which can represent the position of a previous sequentially stored object on the data record.
As a computer system, one embodiment of the invention comprises: one or more data objects capable of respectively having associated data that does not have a predetermined size; a data record capable of holding data; and a data packing module operating to sequentially store the one or more data objects on the data record. The data packing module operates to store at least two indicators on the data record to enable traversal of the data record in forward and backward directions.
As a computer system, another embodiment of the invention comprises: a data record having two or more data objects being sequentially stored on it, each of the two or more data objects capable of respectively having associated data that does not have a predetermined size; and a data unpacking module operating to access the data record to access one or more data objects on the data record, the unpacking module capable of accessing data associated with the one or more data objects, and the unpacking module capable of traversing the data record in forward and backward directions.
As a method of accessing data on a data record having two or more data objects sequentially stored with each of the two or more data objects capable of respectively having associated data that does not have a predetermined size, one embodiment of the invention includes at least the acts of: reading a back-skip or a forward-skip value stored on the data record; and moving a reference to the data record by the back-skip or the forward-skip value.
As a computer readable medium including computer program code for storing information capable of being represented as one or more objects with associated data, one embodiment of the invention comprises: computer program code for storing a forward-skip-indicator for an object; computer program code for storing data associated with the object; and computer program code for storing a back-skip-indicator for the data object.
The advantages of the invention are numerous. Different embodiments or implementations may have one or more of the following advantages. One advantage is that the invention allows data objects of various sizes to be stored in a space efficient manner. Another advantage is that the invention allows various relationships between data objects to be captured and stored in a data record. Yet another advantage is that the invention provides for accessing the stored data in an efficient manner.
Other aspects and advantages of the invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrating by way of example the principles of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will be readily understood by the following detailed description in conjunction with the accompanying drawings, wherein like reference numerals designate like structural elements, and in which:
Fig. 1 A depicts a portion of an address space having various data objects scattered inside.
Fig. 1 B depicts a conventional data record suitable for storage of data in a sequential manner.
Fig. 2A illustrates an exemplary computing environment including a data packing module in accordance with one aspect of the present invention.
Fig. 2B illustrates an exemplary computing environment including a data unpacking module in accordance with another aspect of the present invention.
Figs. 3A, 3B, 3C, and 3D illustrate data record portions in accordance with various embodiments of the invention.
Fig. 4 illustrates an exemplary data storing method for storing data objects on a data record in accordance with one embodiment of the invention.
Figs. 5A and 5B illustrate a method for storing one or more data objects as a group object on a data record in accordance with one embodiment of the invention.
Figs. 6A and 6B illustrate an exemplary method for accessing one or more data objects stored as a group object in a data record.
Fig. 7 illustrates an exemplary method for advancing a current position reference over data objects stored on a data record in the forward direction.
Fig. 8 illustrates an exemplary method for moving a current position reference over data objects stored on a data record in the backward direction.
DETAILED DESCRIPTION OF THE INVENTION
The invention relates to techniques for storing and accessing data in computer systems. In accordance with one aspect of the invention, a data packing module suitable for packing data and storing it on a data record is disclosed. The data packing module can store data objects of various types and sizes on the data record. In addition to data associated with each data object, various relationships between data objects can be stored on the data record by the data packing module. In accordance with another aspect of the invention, a data unpacking module suitable for unpacking data stored on a data record is disclosed. The data unpacking module can be used to access data associated with data objects as well as various other information stored on the data record. In one embodiment, a data accessing interface is provided to enable access to data stored on the data record. For example, the data accessing interface provides the ability to read data, as well as the ability to traverse the data record in both forward and backward directions.
Embodiments of the invention are discussed below with reference to Figs. 2-8. However, those skilled in the art will readily appreciate that the detailed description given herein with respect to these figures is for explanatory purposes as the invention extends beyond these limited embodiments.
Fig. 2A illustrates an exemplary computing environment 200 including a data packing module 202 in accordance with one aspect of the present invention. The data packing module 202 can package data resident in the computing environment 200 and store it on the data record 204. For example, the data that is packaged by the data packing module 202 can be resident in a memory 204 and/or a persistent storage device 206 (e.g., a disk). The data packing module 202 may access the data directly or through an operating system (not shown in Fig. 2A) in the computing environment 200.
The data packaged by the data packing module 202 can be scattered across an address space (e.g., in non-contiguous memory or disk locations). In addition, the size of the data that is packaged by the data packing module 202 need not be predetermined. As will be appreciated by those skilled in the art, the data packing module 202 can package and store data objects of various sizes in a sequential format on the data record 204. Moreover, data objects of various sizes can be stored on the data record 204 without wasting a significant amount of storage space. Furthermore, various relationships between data objects can be captured and stored on the data record 204. For example, these relationships include "peer-to-peer" relationships between data objects associated with each other in a "group", as well as "hierarchical" relationships between data objects "nested" in multiple layers.
Fig. 2B illustrates an exemplary computing environment 250 including a data unpacking module 252 in accordance with another aspect of the present invention. The data unpacking module 252 can be used to access (e.g., read) the data stored on a data record 254. For example, the data stored on the data record 254 could have been stored by the data packing module 202 of Fig. 2A. For example, an application 254 may utilize the data unpacking module 252 to access the data through a data accessing interface 256. As will be discussed below, various functions (methods) can be provided by the data accessing interface 256 to access data and/or navigate through the data record 254 (e.g., read(), next(), and previous() functions that can be used to read and navigate through the data record in forward and backward directions).
Fig. 3A illustrates a data record portion 300 in accordance with one embodiment of the invention. For example, the data record portion 300 can represent a portion of the data record 204 generated by the data packing module 202 of Fig. 2A. The data record portion 300 includes a forward-skip-indicator field 302, a variable-length-data field 304, and a back-skip-indicator field 306. The size of the variable-length-data field 304 is not predetermined. Thus, data objects with various sizes can be stored in the variable-length-data field 304. The forward-skip-indicator field 302 and back-skip-indicator field 306 are of predetermined sizes (e.g., n bytes of data). These fields can respectively hold values which can be used to forward-skip and back-skip to the next and previous data record portions in a data record. For example, in one particular embodiment, the value indicated by the forward-skip-indicator field 302 (or back-skip-indicator field 306) is the size of the data record portion 300, in other words, the sum of the sizes of all fields in the data record portion 300. For example, if the forward-skip-indicator field 302, variable-length-data field 304, and back-skip-indicator field 306 respectively have the sizes of s1, s2, and s3 bytes, the forward-skip-indicator field 302 can have the value S which is the sum of the sizes of all the fields (s1 + s2 + s3). Accordingly, the value S can be used to forward-skip over the data record portion 300 from a first bit 308 to a last bit 310 of the data record portion 300. Similarly, the back-skip-indicator field 306 can be used to back-skip over the data record portion 300 from the last bit 310 to the first bit 308 of the data record portion 300. In this manner, data of various sizes can be stored in a data record portion in conjunction with information which facilitates traversal of a data record in forward and backward directions.
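By way of illustration only, the layout of the data record portion 300 can be sketched as follows. The sketch assumes that each skip-indicator field is a 4-byte big-endian unsigned integer (i.e., n = 4) and that both fields hold the total size S; the field width, the byte order, and the name pack_portion are assumptions made for the example.

```python
import struct

# Assumption: each skip-indicator field is a 4-byte big-endian unsigned integer.
SKIP = struct.Struct(">I")


def pack_portion(data: bytes) -> bytes:
    """Lay out forward-skip field, variable-length data, back-skip field."""
    total = SKIP.size + len(data) + SKIP.size  # S = s1 + s2 + s3
    return SKIP.pack(total) + data + SKIP.pack(total)


portion = pack_portion(b"hello")
forward_skip = SKIP.unpack_from(portion, 0)[0]
back_skip = SKIP.unpack_from(portion, len(portion) - SKIP.size)[0]
```

With s1 = 4, s2 = 5, and s3 = 4, both fields hold S = 13, so a reader at either end of the portion can step over it without knowing the data size in advance.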
Fig. 3B illustrates a data record portion 320 in accordance with another embodiment of the invention. In addition to the fields described above, the data record portion 320 includes a type-indicator field 322 which can be used to indicate the type of data object included in the variable-length-data field 324. For example, the type-indicator field 322 may be used to indicate that the data object is of a particular type (e.g., integer, string, real, etc.).
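By way of illustration only, a typed data record portion along the lines of Fig. 3B might be produced as sketched below. The one-byte type-indicator, the numeric type codes, the 4-byte skip fields, and the chosen encodings for integer, real, and string values are all assumptions made for the example.

```python
import struct

SKIP = struct.Struct(">I")  # assumed 4-byte skip-indicator fields

# Hypothetical one-byte type codes; the text names the types but not their codes.
TYPE_INTEGER, TYPE_REAL, TYPE_STRING = 1, 2, 3


def encode(value):
    """Return (type_code, payload bytes) for a supported Python value."""
    if isinstance(value, bool):
        raise TypeError("unsupported type: bool")
    if isinstance(value, int):
        return TYPE_INTEGER, struct.pack(">q", value)
    if isinstance(value, float):
        return TYPE_REAL, struct.pack(">d", value)
    if isinstance(value, str):
        return TYPE_STRING, value.encode("utf-8")
    raise TypeError("unsupported type: " + type(value).__name__)


def pack_typed(value) -> bytes:
    """Forward-skip, type-indicator, variable-length data, back-skip."""
    code, payload = encode(value)
    total = SKIP.size + 1 + len(payload) + SKIP.size
    return SKIP.pack(total) + bytes([code]) + payload + SKIP.pack(total)


string_portion = pack_typed("abc")
integer_portion = pack_typed(7)
real_portion = pack_typed(1.5)
```

The type-indicator lets a reader decide how to interpret the variable-length data, while the skip values continue to cover the whole portion.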
Fig. 3C illustrates a data record portion 340 in accordance with yet another embodiment of the invention. The data record portion 340 includes a type-indicator field 342 which indicates that the data included in the variable-length-data field 344 is of the type "Group". The type-indicator "Group" indicates that the data object contained in the variable-length-data field 344 is a group object which can represent one or more data objects. The one or more data objects are associated with each other and collectively can be represented as a group. It should be noted that each of the one or more data objects in the variable-length-data field 344 can have a variable-length-data field, as well as other fields (e.g., forward-skip, back-skip fields). Accordingly, the variable-length-data field 344 may include one or more fields of variable sizes (e.g., variable-length-data field 304 or 324) with corresponding forward-skip-indicator and back-skip-indicator fields, as well as additional fields (e.g., type-indicators 322 or 342). It should be noted that the data objects in the group can have various sizes. Furthermore, these sizes need not be predetermined. For example, variables such as an integer I, a real, and a string S, which are of various sizes, can be grouped together and represented as a group. It should also be noted that a data object in a group can itself represent another group (one or more data objects). It should further be noted that a group object may be a group without any data objects (i.e., an empty group without any members). Accordingly, data record portions 300, 320, and 340 may be utilized to represent various relationships between data objects. These relationships include "peer-to-peer" relationships between data objects associated with each other in a group, as well as "hierarchical" relationships between data objects nested in multiple layers.
Fig. 3D illustrates a data record portion 360 in accordance with still another embodiment of the invention. As the type-indicator field 362 indicates, the data record portion 360 represents a group of data objects contained in a variable-length-data field 364. The forward-skip-indicator and back-skip-indicator fields 368 and 370 can be used to traverse the data record portion 360. A number-of-objects indicator field can represent the number of objects in the group. The variable-length-data field 364 includes one or more data record portions (320A...320B) which are stored sequentially. It should be noted that each one of the data objects in the group can be represented by a data record portion (320A...320B) which includes the fields shown in data record portion 320 of Fig. 3B. Thus, the forward-skip-indicator and back-skip-indicator fields associated with each data object represented in the variable-length-data field 364 can respectively be used to skip over each data object in the group in the forward and backward directions.
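By way of illustration only, the following sketch builds a group portion resembling Fig. 3D and then locates its members by hopping over the members' own forward-skip and back-skip values. The 4-byte skip fields, the one-byte type-indicator, the 2-byte number-of-objects field, the type codes, and all function names are assumptions made for the example.

```python
import struct

SKIP = struct.Struct(">I")    # assumed 4-byte skip-indicator fields
COUNT = struct.Struct(">H")   # assumed 2-byte number-of-objects field
TYPE_GROUP, TYPE_STRING = 0, 3  # hypothetical one-byte type codes


def portion(type_code, body):
    """Forward-skip, type-indicator, body, back-skip."""
    total = SKIP.size + 1 + len(body) + SKIP.size
    return SKIP.pack(total) + bytes([type_code]) + body + SKIP.pack(total)


def group(members):
    """A group portion: the count field followed by the member portions."""
    return portion(TYPE_GROUP, COUNT.pack(len(members)) + b"".join(members))


def member_offsets_forward(buf, start=0):
    """Offsets of the members, found by following each forward-skip value."""
    count = COUNT.unpack_from(buf, start + SKIP.size + 1)[0]
    pos = start + SKIP.size + 1 + COUNT.size
    offsets = []
    for _ in range(count):
        offsets.append(pos)
        pos += SKIP.unpack_from(buf, pos)[0]
    return offsets


def member_offsets_backward(buf, start=0):
    """Offsets of the members, last first, found via each back-skip value."""
    count = COUNT.unpack_from(buf, start + SKIP.size + 1)[0]
    # Position just before the group's own back-skip field.
    end = start + SKIP.unpack_from(buf, start)[0] - SKIP.size
    offsets = []
    for _ in range(count):
        end -= SKIP.unpack_from(buf, end - SKIP.size)[0]
        offsets.append(end)
    return offsets


g = group([portion(TYPE_STRING, b"ab"), portion(TYPE_STRING, b"cdef")])
forward = member_offsets_forward(g)
backward = member_offsets_backward(g)
```

Because every member carries both skip values, the same member boundaries are found in either direction even though the members differ in size.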
Fig. 4 illustrates an exemplary data storing method 400 for storing data objects on a data record in accordance with one embodiment of the invention. Initially, at operation 402, a reference to a data object is received. Next, at operation 404, a forward-skip-indicator field for the data object is stored on the data record. At operation 406, a type-indicator field for the data object is stored on the data record. The data associated with the data object is stored on the data record at operation 408. Finally, at operation 410, a back-skip-indicator field is stored on the data record.
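By way of illustration only, the operations of method 400 can be sketched as appending four fields to a growing byte buffer. The 4-byte skip fields, the one-byte type-indicator, the type code, and the name store_object are assumptions made for the example; the operation numbers in the comments follow the description above.

```python
import struct

SKIP = struct.Struct(">I")  # assumed 4-byte skip-indicator fields
TYPE_STRING = 3             # hypothetical one-byte type code


def store_object(record: bytearray, type_code: int, data: bytes) -> int:
    """Append one data object to the record; return the offset where it starts."""
    start = len(record)                      # operation 402: object received
    total = SKIP.size + 1 + len(data) + SKIP.size
    record += SKIP.pack(total)               # operation 404: forward-skip-indicator
    record.append(type_code)                 # operation 406: type-indicator
    record += data                           # operation 408: associated data
    record += SKIP.pack(total)               # operation 410: back-skip-indicator
    return start


record = bytearray()
first = store_object(record, TYPE_STRING, b"alpha")
second = store_object(record, TYPE_STRING, b"be")
```

The two objects occupy 14 and 11 bytes respectively, so no space is reserved beyond the data and the fixed-size fields.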
Figs. 5A and 5B illustrate a method 500 for storing one or more data objects as a group object on a data record in accordance with one embodiment of the invention. For example, the method 500 can be utilized by the data packing module 202 to store data on the data record 204 of Fig. 2A. The group object (or group) can represent one or more data objects associated with each other. For simplicity, it is assumed that the group data object does not itself include another group data object. Furthermore, it is assumed that a group data object includes at least one data object. However, it should be noted that, in general, the group data object may include one or more other group data objects. It should also be noted that a group data object may be a group without any data objects (i.e., an empty group without any members).
At operation 502, references to one or more data objects which are to be represented as a group on the data record are received. Initially, the reference points to the first data object in the group. Next, at operation 504, a forward-skip-indicator field is stored on the data record. This field can be of a predetermined size. Accordingly, the appropriate size (e.g., number of bytes) can be reserved and initialized to zero. At operation 506, a type-indicator field which can have a predetermined size is stored on the data record. The type-indicator field is set to "Group" to indicate the group object type. Next, at operation 508, a number-of-objects indicator field which can have a predetermined size is stored on the data record. The number-of-objects indicator field can be set to a value which indicates the number of objects in the group.
At operation 510, a reference to the next data object in the group is obtained. Initially, this reference would be a reference to the first data object in the group. Next, at operation 512, a forward-skip-indicator field is stored for the first data object in the group. As noted above, this field can be used to forward-skip over the first data object (e.g., to the next data object in the group, the next group, etc.). At operation 514, a type-indicator field is stored for the first data object in the group. This field can be set to indicate the type of data for the first data object in the group. It should be noted that data objects not having any data associated with them may be represented, for example, as the type "Null". Accordingly, at operation 516, a determination is made as to whether the first data object in the group is of type "Null". If it is determined at operation 516 that the data object is not of type "Null", the method 500 proceeds to operation 518 where the data associated with the first data object is stored on the data record. On the other hand, when it is determined at operation 516 that the first data object is of type "Null", the method 500 skips operation 518 and proceeds directly to operation 520. Following operation 518, the method 500 proceeds to operation 520 where a back-skip-indicator field associated with the first data object is stored and set to the appropriate value. Next, at operation 522, the value indicated by the forward-skip-indicator field of the first object in the group (stored at operation 512) is added to the value indicated by the forward-skip-indicator field of the group (initially set to zero at operation 504). At operation 524, a determination is made as to whether there are more data objects in the group. If it is determined at operation 524 that there are one or more data objects remaining in the group, the method 500 proceeds back to operation 510 where a reference to the next object (e.g., the second data object in the group) can be obtained. The next object in the group can be processed in accordance with operations 512-522 as discussed above. On the other hand, if it is determined at operation 524 that there are no more data objects in the group, the method 500 proceeds to operation 526 where a back-skip-indicator field for the group is stored on the data record and set to the appropriate value. Following operation 526, the method 500 ends.
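By way of illustration only, the flow of method 500 can be sketched as shown below. The field widths, the type codes, and the name store_group are assumptions made for the example. In addition, the sketch assumes that the group's skip values should span the entire group portion, so the sizes of the group's own fixed-size fields are added to the accumulated member sizes before the group's forward-skip-indicator is finalized; the description above states only that the member values are accumulated.

```python
import struct

SKIP = struct.Struct(">I")    # assumed 4-byte skip-indicator fields
COUNT = struct.Struct(">H")   # assumed 2-byte number-of-objects field
TYPE_GROUP, TYPE_STRING, TYPE_NULL = 0, 3, 4  # hypothetical type codes


def store_group(record: bytearray, objects) -> int:
    """Append a group of (type_code, data) pairs; return the group's offset."""
    start = len(record)
    record += SKIP.pack(0)                  # operation 504: reserve, set to zero
    record.append(TYPE_GROUP)               # operation 506: type "Group"
    record += COUNT.pack(len(objects))      # operation 508: number of objects
    group_skip = 0
    for type_code, data in objects:         # operations 510 and 524: each object
        payload = b"" if type_code == TYPE_NULL else data  # operation 516
        size = SKIP.size + 1 + len(payload) + SKIP.size
        record += SKIP.pack(size)           # operation 512: member forward-skip
        record.append(type_code)            # operation 514: member type
        record += payload                   # operation 518 (empty for "Null")
        record += SKIP.pack(size)           # operation 520: member back-skip
        group_skip += size                  # operation 522: accumulate
    # Assumption: include the group's own fields so the value spans the group.
    group_skip += SKIP.size + 1 + COUNT.size + SKIP.size
    SKIP.pack_into(record, start, group_skip)  # finalize the reserved field
    record += SKIP.pack(group_skip)         # operation 526: group back-skip
    return start


record = bytearray()
offset = store_group(record, [(TYPE_STRING, b"hi"), (TYPE_NULL, b"")])
```

Reserving the group's forward-skip-indicator first and filling it in afterwards allows the group to be written in a single pass, since the total size is not known until the last member has been stored.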
Figs. 6A and 6B illustrate an exemplary method 600 for accessing one or more data objects stored as a group object in a data record. For example, the method 600 can be utilized by the data unpacking module 252 to access the data record 254 of Fig. 2B. The method 600 can represent one or more functions (methods) provided by the data accessing interface 256 of Fig. 2B (e.g., a read function). For example, the method 600 can utilize a data record stored in accordance with the embodiment depicted in Fig. 3D (data record portion 360). For simplicity, it is assumed that the data objects in the group do not represent another group of data objects. Furthermore, it is assumed that the group contains at least one data object having data associated with it.
Initially, at operation 602, a reference to the current position (reference) in the data record is obtained. It should be noted that the current reference position points to the beginning of a data portion in the data record (e.g., the left-most bit of the forward-skip-indicator field 368 of Fig. 3D). Next, at operation 604, the current reference position is advanced over the forward-skip-indicator field. At operation 606, the type-indicator field is read. As shown in Fig. 3D, the type-indicator field follows the forward-skip-indicator field. The current reference position is advanced over the type-indicator field at operation 608. Next, at operation 610, the number-of-objects indicator field is read. As depicted in Fig. 3D, the number-of-objects indicator field follows the group type-indicator field. Next, at operation 612, the current reference position is advanced over the number-of-objects indicator field. It should be noted that at this point the current reference points to the first data object in the group (e.g., the left-most bit of the data record portion 320A). At operation 614, a group-index is set to zero. Next, at operation 616, the current reference position is advanced over the forward-skip-indicator field of the first data object in the group. It should be noted that each data object in the group can include the fields shown in data record portion 320 of Fig. 3B. At operation 618, the type-indicator field of the first data object is read. The current reference position is advanced over the type-indicator field (e.g., the type-indicator field 322 of Fig. 3B) at operation 620. Next, at operation 622, the data in the variable-length-data field is read. At operation 624, the current reference position is advanced over the back-skip-indicator field of that data object. At operation 626, the group-index is incremented by one. Next, at operation 628, a determination is made as to whether the group-index is equal to the number of objects in the group as indicated by the number-of-objects indicator field of the group (read at operation 610). If it is determined at operation 628 that the group-index is not equal to the number of objects in the group, the method 600 proceeds to operation 616 where the next data object (e.g., the second data object) in the group can be processed.
On the other hand, if it is determined at operation 628 that the group-index is equal to the number of objects in the group, the method 600 proceeds to operation 630 where the current reference position is advanced over the back-skip-indicator field for the group (e.g., the back-skip-indicator field 370 of Fig. 3D). The method 600 ends following the operation 630.
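By way of illustration only, the read flow of method 600 can be sketched as follows. The field widths, the type codes, and the names portion and read_group are assumptions made for the example. The sketch also assumes that the length of each member's data is derived from that member's forward-skip value, which is therefore read before the reference is advanced over it; the description above does not state how the data length is obtained.

```python
import struct

SKIP = struct.Struct(">I")    # assumed 4-byte skip-indicator fields
COUNT = struct.Struct(">H")   # assumed 2-byte number-of-objects field
TYPE_GROUP, TYPE_STRING = 0, 3  # hypothetical one-byte type codes


def portion(type_code, body):
    """Helper used only to build sample input for the reader below."""
    total = SKIP.size + 1 + len(body) + SKIP.size
    return SKIP.pack(total) + bytes([type_code]) + body + SKIP.pack(total)


def read_group(buf, pos=0):
    """Return ([(type_code, data), ...], position just past the group)."""
    pos += SKIP.size                          # operation 604
    if buf[pos] != TYPE_GROUP:                # operation 606
        raise ValueError("not a group portion")
    pos += 1                                  # operation 608
    count = COUNT.unpack_from(buf, pos)[0]    # operation 610
    pos += COUNT.size                         # operation 612
    members = []
    index = 0                                 # operation 614
    while index != count:                     # operation 628
        size = SKIP.unpack_from(buf, pos)[0]  # assumption: read to learn length
        pos += SKIP.size                      # operation 616
        member_type = buf[pos]                # operation 618
        pos += 1                              # operation 620
        length = size - (2 * SKIP.size + 1)
        data = bytes(buf[pos:pos + length])   # operation 622
        pos += length
        pos += SKIP.size                      # operation 624
        members.append((member_type, data))
        index += 1                            # operation 626
    pos += SKIP.size                          # operation 630
    return members, pos


buf = portion(TYPE_GROUP, COUNT.pack(2)
              + portion(TYPE_STRING, b"ab") + portion(TYPE_STRING, b"cdef"))
members, end = read_group(buf)
```

After the group's back-skip-indicator has been passed, the returned position is the start of whatever follows the group on the data record.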
Fig. 7 illustrates an exemplary method 700 for advancing a current position reference over data objects stored on a data record in the forward direction (e.g., from left to right of data record portion 360 of Fig. 3D). For example, the method 700 can be utilized by the data unpacking module 252 to access the data record 254 of Fig. 2B. The method 700 can represent one or more functions (methods) provided by the data accessing interface 256 of Fig. 2B (e.g., a next function for traversing the data record in the forward direction). The method 700 can utilize a data record stored in accordance with the embodiment depicted in Fig. 3D (data record portion 360).
Initially, at operation 702, a reference to the current position (reference) in the data record is obtained. It should be noted that the current reference position points to the beginning of a data portion in the data record (e.g., the left-most bit of the forward-skip-indicator field 368 of Fig. 3D). Next, at operation 704, the forward-skip-indicator field is read. At operation 706, the current reference position is advanced over the forward-skip-indicator field. At operation 708, the type-indicator field is read. As shown in Fig. 3D, the type-indicator field follows the forward-skip-indicator field. Next, at operation 710, a determination is made as to whether the type-indicator field indicates the "Group" type. If it is determined at operation 710 that the type-indicator field does not indicate the "Group" type, the method 700 proceeds to operation 712 where the current position reference is moved back over the forward-skip-indicator field. Following the operation 712, at operation 714, the current reference position is advanced by the value indicated by the forward-skip-indicator field (read at operation 704). The method 700 ends following the operation 714.
On the other hand, if it is determined at operation 710 that the type-indicator field indicates the "Group" type, the method 700 proceeds to operation 716 where the current reference position, which was already advanced over the forward-skip-indicator field at operation 706, is advanced over the type-indicator field. Following the operation 716, the current reference position is advanced over the number-of-objects indicator field at operation 718. It should be noted that upon completion of the operation 718, the current reference position points to the first data object in the group. The method 700 ends following operation 718.
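By way of illustration only, a next function following method 700 can be sketched as shown below. The field widths, the type codes, and the names portion and next_position are assumptions made for the example. Because the reference has already passed the forward-skip-indicator field at operation 706, the sketch steps over the one-byte type-indicator at operation 716 so that, after operation 718, the reference lands on the first data object in the group.

```python
import struct

SKIP = struct.Struct(">I")    # assumed 4-byte skip-indicator fields
COUNT = struct.Struct(">H")   # assumed 2-byte number-of-objects field
TYPE_GROUP, TYPE_STRING = 0, 3  # hypothetical one-byte type codes


def portion(type_code, body):
    """Helper used only to build sample input."""
    total = SKIP.size + 1 + len(body) + SKIP.size
    return SKIP.pack(total) + bytes([type_code]) + body + SKIP.pack(total)


def next_position(buf, pos):
    """Advance from the start of a portion to the next object to visit."""
    skip = SKIP.unpack_from(buf, pos)[0]   # operation 704: read forward-skip
    pos += SKIP.size                       # operation 706
    type_code = buf[pos]                   # operation 708
    if type_code != TYPE_GROUP:            # operation 710
        pos -= SKIP.size                   # operation 712: back to the start
        return pos + skip                  # operation 714: skip the whole portion
    pos += 1                               # operation 716: past the type-indicator
    pos += COUNT.size                      # operation 718: past number-of-objects
    return pos                             # now at the first object in the group


grouped = portion(TYPE_GROUP, COUNT.pack(2)
                  + portion(TYPE_STRING, b"ab") + portion(TYPE_STRING, b"cdef"))
plain = portion(TYPE_STRING, b"xyz") + portion(TYPE_STRING, b"q")

into_group = next_position(grouped, 0)
second_member = next_position(grouped, into_group)
after_plain = next_position(plain, 0)
```

A non-group portion is thus skipped as a whole, whereas a group is entered so that its members can be visited one at a time.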
Fig. 8 illustrates an exemplary method 800 for moving a current position reference (reference) over data objects stored on a data record in the backward direction (e.g., from right to left of data record portion 360 of Fig. 3D). For example, the method 800 can be utilized by the data unpacking module 252 to access the data record 254 of Fig. 2B. The method 800 can represent one or more functions (methods) provided by the data accessing interface 256 of Fig. 2B (e.g., a previous function to locate the previous data object stored on the data record). The method 800 can utilize a data record stored in accordance with the embodiment depicted in Fig. 3D (the data record portion 360).
Initially, at operation 802, a reference to the current position (reference) in the data record is obtained. It should be noted that the current reference position points to the end of a data portion in the data record (e.g., the right-most bit of the back-skip-indicator field 370 of Fig. 3D). Next, at operation 804, the current position reference is moved back over the back-skip-indicator field. The back-skip-indicator field is read at operation 806. Finally, at operation 808, the current position reference is moved back by the value (e.g., number of bytes) indicated by the back-skip-indicator field (read at operation 806).
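By way of illustration only, a previous function following method 800 can be sketched as shown below. The field widths, the type code, and the names portion and previous_position are assumptions made for the example. The sketch further assumes, as in the embodiment of Fig. 3A, that the back-skip value equals the total size of the portion, and therefore measures the final move from the end of the portion rather than from the start of the back-skip-indicator field.

```python
import struct

SKIP = struct.Struct(">I")  # assumed 4-byte skip-indicator fields
TYPE_STRING = 3             # hypothetical one-byte type code


def portion(type_code, body):
    """Helper used only to build sample input."""
    total = SKIP.size + 1 + len(body) + SKIP.size
    return SKIP.pack(total) + bytes([type_code]) + body + SKIP.pack(total)


def previous_position(buf, pos):
    """Given a position just past a portion, return the start of that portion."""
    field_start = pos - SKIP.size                 # operation 804: step back over field
    skip = SKIP.unpack_from(buf, field_start)[0]  # operation 806: read back-skip
    # Operation 808, under the assumption that the value is the whole portion
    # size measured from the end of the portion.
    return pos - skip


buf = portion(TYPE_STRING, b"alpha") + portion(TYPE_STRING, b"be")
start_of_second = previous_position(buf, len(buf))
start_of_first = previous_position(buf, start_of_second)
```

Starting from the end of the data record, repeated calls visit each portion in reverse order without any knowledge of the individual data sizes.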
The invention can use a combination of hardware and software components. The software can be embodied as computer readable code (or computer program code) on a computer readable medium.
The computer readable medium is any data storage device that can store data which can thereafter be read by a computer system. Examples of the computer readable medium include read-only memory, random-access memory, CD-ROMs, magnetic tape, and optical data storage devices. The computer readable medium can also be distributed over a network of coupled computer systems so that the computer readable code is stored and executed in a distributed fashion.
The advantages of the invention are numerous. Different embodiments or implementations may have one or more of the following advantages. One advantage is that the invention allows data objects of various sizes to be stored in a space efficient manner. Another advantage is that the invention allows various relationships between data objects to be captured and stored in a data record. Yet another advantage is that the invention provides for accessing the stored data in an efficient manner.
The many features and advantages of the present invention are apparent from the written description, and thus, it is intended by the appended claims to cover all such features and advantages of the invention. Further, since numerous modifications and changes will readily occur to those skilled in the art, it is not desired to limit the invention to the exact construction and operation as illustrated and described. Hence, all suitable modifications and equivalents may be resorted to as falling within the scope of the invention.
What is claimed is: