EP3129902A1 - Apparatus and method for management of bitemporal objects - Google Patents

Apparatus and method for management of bitemporal objects

Info

Publication number
EP3129902A1
Authority
EP
European Patent Office
Prior art keywords
processor
machine
time
system time
collection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP15777222.9A
Other languages
German (de)
French (fr)
Other versions
EP3129902A4 (en)
Inventor
Fei Xue
Gajanan Chinchwadkar
Christopher Lindblad
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mark Logic Corp
Original Assignee
Mark Logic Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mark Logic Corp
Publication of EP3129902A1
Publication of EP3129902A4

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2477 Temporal data queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2308 Concurrency control
    • G06F16/2315 Optimistic concurrency control
    • G06F16/2322 Optimistic concurrency control using timestamps

Definitions

  • temporal.documentInsert("kool", "koolorder.json", root); The last line of code inserts the document.
  • The temporal collection is "kool".
  • The URI is "koolorder.json".
  • The root is the content of the document.
  • John looks at the trading pattern of the stock over the last week and notices that it always dips during the last minute of the trading day. At 11:30:00, John changes his order to buy the stock at the closing price (15:59:59). The change is recorded as another document in the broker's database at 11:30:01.
  • The object collection may now be queried.
  • The following query searches the temporal documents, using the cts:period-range-query function to locate the documents that were in the database between 11:10 and 11:15.
  • The following query searches the temporal documents, using the cts:period-range-query function to locate the documents that have a valid time period that starts after 10:30 and ends at 15:59.
  • ALN_FINISHES is one of the comparison operators described in "Allen Operators" discussed in detail below.
  • The following query searches the temporal documents, using the cts:period-compare-query function to locate the documents that were in the database when the valid time period is within the system time period.
  • ISO_CONTAINS is one of the comparison operators described below in connection with "ISO SQL 2011 Operators".
  • The following query uses the cts:and-query to AND two cts:collection-query functions to return the temporal document that is in the URI collection, koolorder.xml, and the latest collection.
  • Doc 3 meets the search criteria, as shown in Figure 9.
  • The system is configured to allow one to manually set the system start time when inserting or updating a document in a collection. This feature is useful when one needs to maintain a "master" system time across multiple clients that are concurrently inserting and updating bitemporal documents, without the need for the clients to communicate with one another in order to coordinate their system times.
  • The system start times for document versions with the same URI must progress along the system time axis, so that an update to a document cannot have a system start time that is earlier than that of the document that chronicles its last update.
  • It is necessary to ensure that the system time progresses at the same rate for every document insert and update.
  • A special timestamp called the Last Stable Query Time (LSQT) can be enabled on a temporal collection to manage system start times across documents with different URIs.
  • A temporal document with a system start time before the LSQT can only be queried and a document with a system start time after the LSQT can be updated or ingested, but not queried.
  • This approach is illustrated in Figure 10.
  • The LSQT value starts at 0 (lowest timestamp).
  • Document reads and writes are queued until the LSQT is reset to the maximum system start time in the database. For example, the following query first checks to make sure the application time (simulated by the current time) is greater than the LSQT:
  • Deleting a temporal document maintains the document and all of its versions in the URI collection and updates the deleted document and all of its versions that have a system end time of infinity to the time of the delete. Deleting a temporal document removes the document from the latest collection. So the latest collection is the source of all of the documents that are currently valid and the URI collections are the source of the history of each document. Should one insert a document using the same URI as a deleted document, the deleted document, and all of its previous versions remain in the same URI collection as the "newly" inserted document. The newly inserted document is then added to the latest collection.
  • The resulting collection is shown in Figure 12.
  • The stock hits $12.50 and John's order is filled, which results in the collection in Figure 13.
  • The broker's policy is to honor the valid times for all orders.
  • The order fulfillment application reviews the valid and system times recorded in the cancellation document, determines that John in fact cancelled his order before it was filled, and does not debit his account for the stock purchase.
  • The broker deletes the order, which results in the collection shown in Figure 14.
  • The query processor 176 may be configured to support Allen interval algebra operators.
  • Allen interval algebra is a calculus for temporal reasoning. The calculus defines possible relations between time intervals and provides a composition table that can be used as a basis for reasoning about temporal descriptions of events.
  • The left side of Figure 15 illustrates Allen operators. To the right of each Allen operator are the corresponding time intervals. SQL provides similar operators that may be used in accordance with embodiments of the invention.
  • The foregoing examples reference two-dimensional object splitting.
  • The disclosed system also supports multi-dimensional object splitting.
  • Vp3 is between vp1 and vp2, while vs3 is between vs1 and vs2. This update results in four cubes after the object split.
  • Object "#1" has content V2 and is specified by (vp1, vp3), (vs1, vs3), (t2, INF).
  • Object "#2" has content V1 and is specified by (vp1, vp2), (vs1, vs2), (t1, t2).
  • Object "#3" has content V1 and is specified by (vp1, vp2), (vs2, vs3), (t2, INF).
  • Object "#4" has content V1 and is specified by (vp3, vp2), (vs1, vs2), (t2, INF).
  • Specified time ranges may be used to migrate selected objects to tiered storage. For example, older objects may be migrated to cheaper, slower storage resources (e.g., magnetic tape).
  • The query processor 176 may be configured to cache query results in system time segments to support range queries on system time. In this way, query results can be cached and utilized without being evicted from the cache.
  • The object collection curator 172 may be configured to form a replicated object from an object in the object collection 174.
  • A replica may be from a segment of a master object.
  • The object collection curator 172 includes a safety switch that precludes the alteration of the history of an object.
  • The object collection curator 172 may be configured to disable edits to time field values.
  • An embodiment of the present invention relates to a computer storage product with a non-transitory computer readable storage medium having computer code thereon for performing various computer-implemented operations.
  • The media and computer code may be those specially designed and constructed for the purposes of the present invention, or they may be of the kind well known and available to those having skill in the computer software arts.
  • Examples of computer-readable media include, but are not limited to: magnetic media, optical media, magneto-optical media and hardware devices that are specially configured to store and execute program code, such as application-specific integrated circuits ("ASICs"), programmable logic devices ("PLDs") and ROM and RAM devices.
  • Examples of computer code include machine code, such as produced by a compiler, and files containing higher-level code that are executed by a computer using an interpreter.
  • An embodiment of the invention may be implemented using JAVA®, C++, or other object-oriented
  • Another embodiment of the invention may be implemented in hardwired circuitry in place of, or in combination with, machine-executable software instructions.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A machine has a processor and a memory connected to the processor. The memory stores instructions executed by the processor to construct an object collection where each object in the object collection has a common identifier, a valid time start field, a valid time end field, a system time start field and a system time end field. The object collection includes split objects with a legacy object and an updated object with the system time start field set to the system time that the split objects are formed.

Description

APPARATUS AND METHOD FOR MANAGEMENT OF BITEMPORAL OBJECTS
CROSS-REFERENCE TO RELATED APPLICATION
This application claims priority to U.S. Provisional Patent Application Serial Number 61/976,378, filed April 7, 2014, the contents of which are incorporated herein by reference.
FIELD OF THE INVENTION
This invention relates generally to data processing in computer networks. More particularly, this invention relates to techniques for management of objects with valid time stamps and system time stamps (bitemporal objects).
BACKGROUND OF THE INVENTION
Bitemporal objects are associated with both a valid time that marks when a thing is known in the real world and a system time that marks when the thing is available for discovery in a server. Bitemporal data is necessary whenever there is a requirement to maintain snapshots of a transaction across various time dimensions. For example, financial and insurance industries use bitemporal data to track changes to contracts, policies, and events in a manner that adheres to strict regulation and compliance requirements.
There is a need for improved techniques for managing bitemporal objects.
SUMMARY OF THE INVENTION
A machine has a processor and a memory connected to the processor. The memory stores instructions executed by the processor to construct an object collection where each object in the object collection has a common identifier, a valid time start field, a valid time end field, a system time start field and a system time end field. The object collection includes split objects with a legacy object and an updated object with the system time start field set to the system time that the split objects are formed.
BRIEF DESCRIPTION OF THE FIGURES
The invention is more fully appreciated in connection with the following detailed description taken in conjunction with the accompanying drawings, in which:
FIGURE 1 illustrates a system configured in accordance with an embodiment of the invention.
FIGURE 2 illustrates processing operations associated with an embodiment of the invention.
FIGURE 3 illustrates temporal collections constructed in accordance with an embodiment of the invention.
FIGURE 4 illustrates an example of an object split in accordance with an embodiment of the invention.
FIGURE 5 illustrates search results obtained utilizing a query processor configured in accordance with an embodiment of the invention.
FIGURE 6 illustrates search results obtained utilizing a query processor configured in accordance with an embodiment of the invention.
FIGURE 7 illustrates search results obtained utilizing a query processor configured in accordance with an embodiment of the invention.
FIGURE 8 illustrates search results obtained utilizing a query processor configured in accordance with an embodiment of the invention.
FIGURE 9 illustrates a temporal collection constructed in accordance with a disclosed example.
FIGURE 10 illustrates Last Stable Query Time (LSQT) processing performed in accordance with an embodiment of the invention.
FIGURE 11 illustrates document split operations performed in accordance with disclosed processing associated with an embodiment of the invention.
FIGURE 12 illustrates document split operations performed in accordance with disclosed processing associated with an embodiment of the invention.
FIGURE 13 illustrates document split operations performed in accordance with disclosed processing associated with an embodiment of the invention.
FIGURE 14 illustrates a rollback operation performed in accordance with an embodiment of the invention.
FIGURE 15 illustrates Allen operators utilized in accordance with an embodiment of the invention.
FIGURE 16 illustrates multi-temporal object insertion supported in accordance with an embodiment of the invention.
Like reference numerals refer to corresponding parts throughout the several views of the drawings.
DETAILED DESCRIPTION OF THE INVENTION
Figure 1 illustrates a system 100 configured in accordance with an embodiment of the invention. The system 100 includes a set of client devices 102_1 through 102_N connected to a server 104 via a network 106, which may be any combination of wired and wireless networks.
Each client device 102 includes standard components, such as a central processing unit connected to input/output devices 112 via a bus 114. The input/output devices 112 may include a keyboard, mouse, touch display and the like. A network interface circuit 116 is also connected to the bus 114 to provide connectivity to network 106. A memory 120 is also connected to the bus 114. The memory 120 stores a data source 122 configured for uploading to server 104.
The server 104 also includes standard components, such as a central processing unit 160, a bus 162, input/output devices 164 and a network interface circuit 166. A memory 170 is connected to bus 162. The memory 170 stores instructions executed by the central processing unit 160 to implement operations of the invention. In one embodiment, the memory stores an object collection curator 172 with instructions to implement the operations shown in connection with Figure 2. The object collection curator 172 constructs and curates an object collection 174, examples of which are provided below. A query processor 176 processes queries against the object collection 174. Exemplary query processor 176 operations are discussed below.
Figure 2 illustrates processing operations associated with an embodiment of the invention. The object collection curator 172 tests whether an object is available 200. For example, the object may be received over network 106 from data source 122. If an object is available (200 - Yes), it is determined whether an object collection exists for this type of object 202. Each object may be specified with a Uniform Resource Identifier (URI). If such a collection does not exist (202 - No), a collection of that object type is started 204. If such a collection already exists (202 - Yes), the object is associated with its corresponding collection 206. The object is then split 208. In particular, the object is split between a legacy object with a valid time end field value and an updated object with a valid time start field value that is a single minimal incremental time unit (e.g., 1 second) greater than the valid time end field value of the legacy object. The minimal incremental time unit value is contingent upon the time resolution offered by the system. As demonstrated below, this operation results in two versions of an object at a given system time, where each version has different valid time values. The final operation of Figure 2 is to assign timestamps 210. In particular, timestamp values are assigned to each version of the split object. The timestamps include a valid time start value, a valid time end value, a system time start value and a system time end value, examples of which are provided below.
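The split operation 208 can be illustrated with a minimal sketch in plain JavaScript. This is not the disclosed implementation: the function name `splitObject`, the `TICK` constant, the field names and the sample values are hypothetical, and times are modeled as whole seconds so that the minimal incremental time unit is 1. System timestamps are left out here for brevity.

```javascript
// Assumption: times are whole seconds, so the minimal incremental unit is 1.
const TICK = 1;

// Split `current` at `updateValidStart`, the valid time at which the new content applies.
function splitObject(current, newContent, updateValidStart) {
  if (updateValidStart <= current.validStart || updateValidStart > current.validEnd) {
    throw new RangeError("update must begin inside the current valid period");
  }
  // Legacy object: old content, valid period cut off one tick before the update.
  const legacy = { ...current, validEnd: updateValidStart - TICK };
  // Updated object: new content, valid period begins one tick after the legacy end.
  const updated = { ...current, content: newContent, validStart: legacy.validEnd + TICK };
  return { legacy, updated };
}

// Hypothetical sample object with a valid period of 100 through 900.
const order = { uri: "order.json", content: "buy at 12.50", validStart: 100, validEnd: 900 };
const { legacy, updated } = splitObject(order, "buy at close", 500);
console.log(legacy.validEnd, updated.validStart); // 499 500
```

With a finer time resolution only `TICK` would change; the relationship between the legacy valid end and the updated valid start stays the same.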
The timestamp assignment operation 210 accommodates a user-specified system start time. That is, the system allows one to set a system time start field to a value earlier than the current system time. Thus, an object inserted into a collection can effectively be placed backwards in system time. Most systems only allow objects to be inserted at or after the current system time. This feature is significant because client machines 102_1 through 102_N may be operating in slightly different time frames. An enterprise controlling the client machines may need to observe these distinct time domains.
If there is no user-specified system start time, the current system time is used as the system start time. The system time end value is typically set to infinity upon object insertion. Alternatively, the system time end value may be set to system time or a user input value.
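The defaulting rules for system timestamps can be sketched as follows. The helper `assignSystemTimes`, its `options` parameter and the sample values are hypothetical names chosen for illustration; the check that the end follows the start is an added assumption, not something the text states.

```javascript
// Hypothetical helper: `now` is the server's current system time in epoch milliseconds.
function assignSystemTimes(obj, now, options = {}) {
  // Use the user-specified start when given, otherwise the current system time.
  const systemStart = options.systemStart ?? now;
  // The system end is typically infinity on insertion unless a value is supplied.
  const systemEnd = options.systemEnd ?? Infinity;
  if (systemEnd <= systemStart) {
    throw new RangeError("system end must be later than system start");
  }
  return { ...obj, systemStart, systemEnd };
}

const serverNow = Date.parse("2014-04-07T11:30:05Z");
const orderDoc = { uri: "order.json", content: "buy at close" };

// No user-specified start: the current system time is used.
const stamped = assignSystemTimes(orderDoc, serverNow);
// User-specified start five seconds in the past: the object is placed backwards in system time.
const backdated = assignSystemTimes(orderDoc, serverNow, {
  systemStart: Date.parse("2014-04-07T11:30:00Z"),
});
console.log(stamped.systemStart === serverNow, stamped.systemEnd, backdated.systemStart < serverNow);
// true Infinity true
```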
The foregoing is more fully appreciated with reference to various examples. Terms used in connection with the examples include:
• Temporal refers to bitemporal objects, documents and collections.
• Instant is an instant of time (such as "12/31/2012, 01:00:00 am").
• Period is an anchored duration of time (e.g., December 01, 1999 through December 31, 2000, or the fall semester).
• Axis is a named pair of range indexes that is a container for periods. Temporal objects have both a valid axis and a system axis.
• User-defined Time is a time value that a user provides in replacement of system start time.
• Last Stable Query Time (LSQT) is a point in system time: an object with a system start time before this point can be queried, and an object with a system start time after this point can be updated and ingested.
• A split refers to the creation of a new object that corresponds to a previous object, but has different valid timestamps and a different system end time.
• Valid Time is when the information was true in the real world. Valid time may also be called application time. Valid time is provided by the user or application (e.g., data source 122 of client 102). The valid end time is updated by the system when an object is split.
• System Time is when the information was stored in the object collection 174. System time may also be called transaction time. System time is managed by the server 104, except in cases when the system start time is set by the application as discussed below.
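Several of these terms can be made concrete with a small plain-JavaScript model. Everything named here (`validAxis`, `systemAxis`, `periodOn`, `isQueryable`, `canIngestAt`, the sample values) is hypothetical. The text does not say how a system start time exactly equal to the LSQT is treated, so treating it as queryable is an assumption of this sketch.

```javascript
// An axis names the pair of fields that hold a period's start and end.
const validAxis = { name: "valid", startField: "validStart", endField: "validEnd" };
const systemAxis = { name: "system", startField: "systemStart", endField: "systemEnd" };

// Read the period that a document carries on a given axis.
function periodOn(axis, doc) {
  return { start: doc[axis.startField], end: doc[axis.endField] };
}

// LSQT rule: documents that started at or before the LSQT can be queried.
// Assumption: equality counts as queryable.
function isQueryable(doc, lsqt) {
  return doc.systemStart <= lsqt;
}

// LSQT rule: a new insert or update must carry a system start after the LSQT.
function canIngestAt(systemStart, lsqt) {
  return systemStart > lsqt;
}

const sampleDoc = { validStart: 100, validEnd: 900, systemStart: 1000, systemEnd: Infinity };
const lsqt = 1500;
console.log(periodOn(validAxis, sampleDoc).end, periodOn(systemAxis, sampleDoc).start); // 900 1000
console.log(isQueryable(sampleDoc, lsqt), canIngestAt(1400, lsqt), canIngestAt(1600, lsqt));
// true false true
```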
A bitemporal object is managed as a series of versioned objects in a collection. The 'original' object inserted into the object collection is kept and never changes. Updates to the object are inserted as new objects with different valid and system times. A delete of the object is also inserted as a new object. In this way, a bitemporal object can be "rolled back" to review, at any point in time, when the information was known in the real world and when it was recorded in the object collection.
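The "rolled back" view amounts to picking the version whose valid period covers a real-world time and whose system period covers a recording time. The sketch below uses invented sample data and a hypothetical `asOf` function. Two conventions are assumptions of the sketch rather than statements from the text: valid periods are treated as inclusive at both ends, and the system period of the first recorded version is closed at the system time of the split.

```javascript
// Invented sample data: one order, split once at system time 2000.
const versions = [
  { content: "buy at 12.50", validStart: 100, validEnd: 900, systemStart: 1000, systemEnd: 2000 },
  { content: "buy at 12.50", validStart: 100, validEnd: 499, systemStart: 2000, systemEnd: Infinity },
  { content: "buy at close", validStart: 500, validEnd: 900, systemStart: 2000, systemEnd: Infinity },
];

// Return the version that was true at `validTime` as recorded at `systemTime`.
function asOf(versionList, validTime, systemTime) {
  return versionList.find(
    (v) =>
      v.validStart <= validTime && validTime <= v.validEnd &&
      v.systemStart <= systemTime && systemTime < v.systemEnd
  );
}

console.log(asOf(versions, 600, 1500).content); // buy at 12.50
console.log(asOf(versions, 600, 2500).content); // buy at close
console.log(asOf(versions, 600, 500));          // undefined
```

The first two calls ask about the same real-world moment but get different answers, because the collection recorded different information at system times 1500 and 2500.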
Bitemporality is defined on a collection, sometimes referred to as a temporal collection. A temporal collection is a logical grouping of temporal objects that share the same axes with timestamps defined by the same range indexes. One can create additional temporal collections for objects that require a different schema for the timestamps. An object can be in any number of forms, including a Resource Description Framework (RDF) object, an Extensible Markup Language (XML) object, a JavaScript Object Notation (JSON) object and a text object. Objects are sometimes referred to herein as documents.
When a document is inserted into a temporal collection, a URI collection is created for that document. When the document is updated, a new document representing the update is inserted into the document's URI collection. Any new document inserted into the temporal collection has its own unique URI collection that holds all of the versions of that document. The latest version of each document resides in a latest collection.
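The bookkeeping described above can be sketched with two maps. This is a simplified in-memory model in plain JavaScript, not server code: the class `TemporalCollection`, its fields, the `version` counter and the URI "otherorder.json" are hypothetical, and the sketch deliberately ignores timestamps and splits.

```javascript
// Sketch only: models collection membership with Maps, not real server calls.
class TemporalCollection {
  constructor(name) {
    this.name = name;
    this.uriCollections = new Map(); // URI -> array of all versions
    this.latest = new Map();         // URI -> most recent version
  }

  // Insert a new document or an update under the given URI.
  insert(uri, content) {
    if (!this.uriCollections.has(uri)) {
      this.uriCollections.set(uri, []); // first insert creates the URI collection
    }
    const history = this.uriCollections.get(uri);
    const version = { uri, version: history.length + 1, content };
    history.push(version);         // every version is retained
    this.latest.set(uri, version); // only the newest is "latest"
    return version;
  }
}

const kool = new TemporalCollection("kool");
kool.insert("koolorder.json", { price: 12 });
kool.insert("koolorder.json", { price: "Closing Price" });
kool.insert("otherorder.json", { price: 7 });

console.log(kool.uriCollections.get("koolorder.json").length); // 2
console.log(kool.latest.get("koolorder.json").version);        // 2
console.log(kool.uriCollections.size);                         // 2
```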
Figure 3 illustrates this schema. In particular, the figure illustrates different collections 300_1, 300_2, 300_3. Each document in each collection has the same URI. Objects may be segregated by earlier versions 304 and latest version 306.
The valid and system axes each make use of dateTime range indexes that define the start and end times. For example, the following code creates element range indexes to be used to create the valid and system axes. The following code is in JavaScript®. XQuery® or other languages may also be used in accordance with embodiments of the invention.
var admin = require("/MarkLogic/admin.xqy");
var config = admin.getConfiguration();
var dbid = xdmp.database("Documents");
var validStart = admin.databaseRangeElementIndex(
    "dateTime", "", "validStart", "", fn.false());
var validEnd = admin.databaseRangeElementIndex(
    "dateTime", "", "validEnd", "", fn.false());
var systemStart = admin.databaseRangeElementIndex(
    "dateTime", "", "systemStart", "", fn.false());
var systemEnd = admin.databaseRangeElementIndex(
    "dateTime", "", "systemEnd", "", fn.false());
config = admin.databaseAddRangeElementIndex(config, dbid, validStart);
config = admin.databaseAddRangeElementIndex(config, dbid, validEnd);
config = admin.databaseAddRangeElementIndex(config, dbid, systemStart);
config = admin.databaseAddRangeElementIndex(config, dbid, systemEnd);
admin.saveConfiguration(config);
System and valid axes may be formed using the following JavaScript® code.
var temporal = require("/MarkLogic/temporal.xqy");
var validResult = temporal.axisCreate("valid",
    cts.elementReference(fn.QName("", "validStart")),
    cts.elementReference(fn.QName("", "validEnd")));
var systemResult = temporal.axisCreate("system",
    cts.elementReference(fn.QName("", "systemStart")),
    cts.elementReference(fn.QName("", "systemEnd")));
An object collection or temporal collection named "kool" may be created using the previously created system and valid axes. The following code accomplishes this.
var temporal = require("/MarkLogic/temporal.xqy");
var collectionResult =
    temporal.collectionCreate("kool", "system", "valid");
Consider the example of a stock trader, John, who places an order to buy some stock. The record of the trade is stored as a bitemporal object. The stock of KoolCo is trading around $12.65. John places a limit order to buy 100 shares of the stock for $12 at 11:00:00 on 3-Apr-2014 (this is the valid time). The document for the transaction is recorded in the broker's database at 11:00:01 on 3-Apr-2014 (this is the system time).
var temporal = require("/MarkLogic/temporal.xqy");
var root =
  { "tempdoc": {
      "systemStart": null,
      "systemEnd": null,
      "validStart": "2014-04-03T11:00:00",
      "validEnd": "2014-04-03T16:00:00",
      "trader": "John",
      "price": 12
    }
  };
declareUpdate();
temporal.documentInsert("kool", "koolorder.json", root);
The last line of code inserts the document. The temporal collection is "kool", the URI is "koolorder.json" and the root is the content of the document.
John looks at the trading pattern of the stock over the last week and notices that it always dips during the last minute of the trading day. At 11:30:00, John changes his order to buy the stock at the closing price (15:59:59). The change is recorded as another document in the broker's database at 11:30:01.
var temporal = require("/MarkLogic/temporal.xqy");
var root =
  { "tempdoc": {
      "systemStart": null,
      "systemEnd": null,
      "validStart": "2014-04-03T15:59:59",
      "validEnd": "2014-04-03T15:59:59",
      "trader": "John",
      // placeholder value, quoted so that the object literal is valid
      "price": "Closing Price"
    }
  };
declareUpdate();
temporal.documentInsert("kool", "koolorder.json", root);
This results in three documents with valid and system times as shown in Figure 4. Note that the action resulted in a split on Doc 1 that resulted in Doc 2, as well as Doc 3 that contains the new content.
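To make the split concrete, the following sketch shows one possible rule for applying an update to a set of versions. It is an assumption-laden illustration in plain JavaScript, not the disclosed implementation: timestamps are plain numbers, periods are treated as half-open, and `applyUpdate`, `doc1` and the sample values are hypothetical. With these inputs the update yields three versions: the legacy version with its system period closed, a split version that carries the old content for the part of the valid period the update does not cover, and a version holding the new content.

```javascript
// Sketch only: one possible split rule, using numbers as timestamps.
const INF = Infinity;

// Apply an update at system time `now` to a list of versions.
// Periods are half-open: [validStart, validEnd), [systemStart, systemEnd).
function applyUpdate(versions, update, now) {
  const result = [];
  for (const v of versions) {
    const open = v.systemEnd === INF;
    const overlaps =
      v.validStart < update.validEnd && update.validStart < v.validEnd;
    if (!open || !overlaps) {
      result.push(v); // untouched: already closed, or valid periods disjoint
      continue;
    }
    // Keep the legacy version, but close its system period.
    result.push({ ...v, systemEnd: now });
    // Split: same content for the part of the valid period before the update...
    if (v.validStart < update.validStart) {
      result.push({ ...v, validEnd: update.validStart, systemStart: now });
    }
    // ...and for the part after it.
    if (update.validEnd < v.validEnd) {
      result.push({ ...v, validStart: update.validEnd, systemStart: now });
    }
  }
  // The new content becomes its own version.
  result.push({ ...update, systemStart: now, systemEnd: INF });
  return result;
}

const doc1 = { content: "limit 12", validStart: 0, validEnd: 100,
               systemStart: 1, systemEnd: INF };
const after = applyUpdate(
  [doc1], { content: "closing price", validStart: 60, validEnd: 100 }, 31);
console.log(after.length); // 3
```

The input version is copied rather than mutated, mirroring the statement that earlier versions are kept as recorded.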
The object collection may now be queried. The following query searches the temporal documents, using the cts:period-range-query function to locate the documents that were in the database between 11:10 and 11:15.
cts.search(cts.periodRangeQuery(
    "system",
    "ISO_CONTAINS",
    cts.period(xs.dateTime("2014-04-03T11:10:00"),
               xs.dateTime("2014-04-03T11:15:00"))));
In this example, only Doc 1 meets the search criteria. This is shown pictorially in Figure 5.
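The result can be checked by hand with a small in-memory filter. The sketch below is plain JavaScript, not a server query. The system periods are inferred from the narrative above (Doc 1 recorded at 11:00:01 and superseded at 11:30:01, Docs 2 and 3 recorded at 11:30:01), the far-future `INF` timestamp is an assumed stand-in for infinity, and `contains` reflects an assumed reading of the "contains" comparison in which the stored period must enclose the whole query period.

```javascript
// Sketch only: system periods inferred from the narrative; "infinity"
// is represented by an assumed far-future timestamp.
const INF = "9999-12-31T23:59:59";
const docs = [
  { name: "Doc 1", systemStart: "2014-04-03T11:00:01", systemEnd: "2014-04-03T11:30:01" },
  { name: "Doc 2", systemStart: "2014-04-03T11:30:01", systemEnd: INF },
  { name: "Doc 3", systemStart: "2014-04-03T11:30:01", systemEnd: INF },
];

// Assumed reading of "contains": the stored period encloses the
// whole query period. ISO 8601 strings compare chronologically.
function contains(start, end, queryStart, queryEnd) {
  return start <= queryStart && queryEnd <= end;
}

const hits = docs
  .filter(d => contains(d.systemStart, d.systemEnd,
                        "2014-04-03T11:10:00", "2014-04-03T11:15:00"))
  .map(d => d.name);
console.log(hits); // [ 'Doc 1' ]
```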
The following query searches the temporal documents, using the cts:period-range-query function to locate the documents that have a valid time period that starts after 10:30 and ends at 15:59. ALN_FINISHES is one of the comparison operators described in "Allen Operators" discussed in detail below.
cts.search(cts.periodRangeQuery(
    "valid",
    "ALN_FINISHES",
    cts.period(xs.dateTime("2014-04-03T10:30:00"),
               xs.dateTime("2014-04-03T15:59:59"))));
In this example, only Doc 2 meets the search criteria, as shown in Figure 6.
The following query searches the temporal documents, using the cts:period-range-query function to locate the documents that were in the database after 11:20. ALN_AFTER is one of the comparison operators described below in connection with "Allen Operators".
cts.search(cts.periodRangeQuery(
    "system",
    "ALN_AFTER",
    cts.period(xs.dateTime("2014-04-03T11:00:00"),
               xs.dateTime("2014-04-03T11:20:00"))));
In this example, both Doc 2 and Doc 3 meet the search criteria, as shown in Figure 7.
The following query searches the temporal documents, using the cts:period-compare-query function to locate the documents that were in the database when the valid time period is within the system time period. ISO_CONTAINS is one of the comparison operators described below in connection with "ISO SQL 2011 Operators".
cts.search(cts.periodCompareQuery(
    "system",
    "ISO_CONTAINS",
    "valid"));
In this example, only Doc 3 meets the search criteria, as shown in Figure 8.
The following query uses the cts:and-query to AND two cts:collection-query functions to return the temporal document that is in both the URI collection, koolorder.json, and the latest collection.
cts.search(cts.andQuery([
    cts.collectionQuery("koolorder.json"),
    cts.collectionQuery("latest")]));
In this example, Doc 3 meets the search criteria, as shown in Figure 9.
The system is configured to allow one to manually set the system start time when inserting or updating a document in a collection. This feature is useful when one needs to maintain a "master" system time across multiple clients that are concurrently inserting and updating bitemporal documents, without the need for the clients to communicate with one another in order to coordinate their system times. The system start times for document versions with the same URI must progress along the system time axis, so that an update to a document cannot have a system start time that is earlier than that of the document that chronicles its last update. However, when managing documents with different URIs in a temporal collection, it is necessary to ensure that the system time progresses at the same rate for every document insert and update.
A special timestamp, called the Last Stable Query Time (LSQT) can be enabled on a temporal collection to manage system start times across documents with different URIs. A temporal document with a system start time before the LSQT can only be queried and a document with a system start time after the LSQT can be updated / ingested, but not queried. This approach is illustrated in Figure 10. One can advance the LSQT either manually or automatically. This allows one to manage which documents are available to be queried and which documents can be updated.
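The gating rule can be sketched as a small class. This is an illustrative model in plain JavaScript, not the server's API: `LsqtGate`, its methods and the sample documents are hypothetical, numbers stand in for timestamps, the starting value of 0 is used as the lowest timestamp, and the rule that the LSQT only moves forward is an assumption of the sketch.

```javascript
// Sketch only: numbers stand in for timestamps; names are hypothetical.
class LsqtGate {
  constructor() {
    this.lsqt = 0; // assumed starting point: the lowest timestamp
  }

  // Documents with a system start time before the LSQT may be queried.
  canQuery(doc) {
    return doc.systemStart < this.lsqt;
  }

  // Documents with a system start time after the LSQT may be updated or ingested.
  canUpdate(doc) {
    return doc.systemStart > this.lsqt;
  }

  // Advance the LSQT; in this sketch it is only allowed to move forward.
  advance(to) {
    if (to < this.lsqt) {
      throw new RangeError("LSQT cannot move backwards");
    }
    this.lsqt = to;
  }
}

const gate = new LsqtGate();
gate.advance(100);
const settled = { uri: "a.json", systemStart: 90 };
const incoming = { uri: "b.json", systemStart: 120 };
console.log(gate.canQuery(settled), gate.canUpdate(settled));   // true false
console.log(gate.canQuery(incoming), gate.canUpdate(incoming)); // false true
```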
When LSQT is enabled on a temporal collection, the LSQT value starts at 0 (lowest timestamp). When advanced, document reads and writes are queued until the LSQT is reset to the maximum system start time in the database. For example, the following query first checks to make sure the application time (simulated by the current time) is greater than the LSQT:
xquery version "1.0-ml";
import module namespace temporal = "http://marklogic.com/xdmp/temporal"
    at "/MarkLogic/temporal.xqy";
let $appTime := fn:current-dateTime()
let $LSQT := temporal:get-LSQT("temporalcollection")
let $root :=
  <tempdoc>
    <systemStart/>
    <systemEnd/>
    <validStart>2014-06-03T14:13:05</validStart>
    <validEnd>9999-12-31T23:59:59.99Z</validEnd>
    <content>v1-content here</content>
  </tempdoc>
let $systemTime :=
  if ($appTime > $LSQT)
  then (temporal:statement-set-system-time(xs:dateTime($appTime)))
  else ()
return (
  temporal:document-insert("temporalcollection", "doc.xml", $root),
  $systemTime)
One can use the temporal:document-delete function to delete temporal documents. Deleting a temporal document maintains the document and all of its versions in the URI collection and updates the deleted document and all of its versions that have a system end time of infinity to the time of the delete. Deleting a temporal document removes the document from the latest collection. The latest collection is therefore the source of all of the documents that are currently valid, and the URI collections are the source of the history of each document. Should one insert a document using the same URI as a deleted document, the deleted document and all of its previous versions remain in the same URI collection as the "newly" inserted document. The newly inserted document is then added to the latest collection.
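The effect of a delete on the two kinds of collections can be sketched as follows. This is an in-memory model in plain JavaScript, not a call into the server: `temporalDelete`, `history` and `latest` are hypothetical names, numbers stand in for timestamps, and `Infinity` represents an open system end time.

```javascript
// Sketch only: an in-memory model of the delete behavior described above.
const INF = Infinity;

// history: all versions in one URI collection; latest: Set of versions
// currently in the latest collection; now: system time of the delete.
function temporalDelete(history, latest, now) {
  for (const v of history) {
    if (v.systemEnd === INF) {
      v.systemEnd = now; // close every version that was still open
    }
    latest.delete(v);    // nothing for this URI stays in "latest"
  }
  return history;        // all versions remain in the URI collection
}

const v1 = { content: "first", systemStart: 1, systemEnd: 5 };
const v2 = { content: "second", systemStart: 5, systemEnd: INF };
const history = [v1, v2];
const latest = new Set([v2]);

temporalDelete(history, latest, 9);
console.log(history.length, v1.systemEnd, v2.systemEnd, latest.size); // 2 5 9 0
```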
Returning to the example discussed in connection with Figure 4, at 12:10:00, John changes his mind again and decides to change his order to a limit order to buy at $12.50. The following code reflects this change.
var temporal = require("/MarkLogic/temporal.xqy");
var root =
  { "tempdoc": {
      "systemStart": null,
      "systemEnd": null,
      "validStart": "2014-04-03T12:10:00",
      "validEnd": "2014-04-03T16:00:00",
      "content": "12.50"
    }
  };
declareUpdate();
temporal.documentInsert("kool", "koolorder.json", root);
This transaction is recorded as another document with a valid time of 12:10:00, but due to heavy trading, the change is not recorded in the broker's database until 12:10:12. The resulting collection is shown in Figure 11. Doc 3 is not split because the new Doc 5 contains the same period as Doc 3.
At 13:00:00, the purchase order has not been filled and John decides he no longer wants to buy the stock, so he cancels his order. This cancellation is recorded as another document with a valid time of 13:00:00 and recorded in the broker's database at 13:00:02. The following code reflects this activity.
var temporal = require("/MarkLogic/temporal.xqy");
var root =
  { "tempdoc": {
      "systemStart": null,
      "systemEnd": null,
      "validStart": "2014-04-03T13:00:00",
      "validEnd": "2014-04-03T16:00:00",
      "content": "0"
    }
  };
declareUpdate();
temporal.documentInsert("kool", "koolorder.json", root);
The resulting collection is shown in Figure 12. At 13:00:01, the stock hits $12.50 and John's order is filled, which results in the collection in Figure 13. The broker's policy is to honor the valid times for all orders. At 13:00:03, the order fulfillment application reviews the valid and system times recorded in the cancellation document, determines that John in fact cancelled his order before it was filled, and does not debit his account for the stock purchase. At 16:00:00, the broker deletes the order, which results in the collection shown in Figure 14.
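The fulfillment decision reduces to comparing the right timestamp. The sketch below, in plain JavaScript, uses the times from the narrative; the object shapes and the function names `cancelledBeforeFill` and `recordedBeforeFill` are hypothetical and are not part of the disclosed system.

```javascript
// Sketch only: times come from the narrative; field names are hypothetical.
const cancellation = {
  validStart: "2014-04-03T13:00:00",  // when John cancelled
  systemStart: "2014-04-03T13:00:02", // when the broker recorded it
};
const fill = { time: "2014-04-03T13:00:01" }; // when the order was filled

// Honor valid time: the order counts as cancelled from validStart onward,
// even if the cancellation reached the database later.
function cancelledBeforeFill(cancel, fillEvent) {
  return cancel.validStart <= fillEvent.time;
}

// A system-time-only view would reach the opposite conclusion.
function recordedBeforeFill(cancel, fillEvent) {
  return cancel.systemStart <= fillEvent.time;
}

console.log(cancelledBeforeFill(cancellation, fill)); // true
console.log(recordedBeforeFill(cancellation, fill));  // false
```

Keeping both timestamps is what lets the application distinguish "cancelled late" from "recorded late".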
The query processor 176 may be configured to support Allen interval algebra operators. Allen interval algebra is a calculus for temporal reasoning. The calculus defines possible relations between time intervals and provides a composition table that can be used as a basis for reasoning about temporal descriptions of events. The left side of Figure 15 illustrates Allen operators. To the right of each Allen operator are the corresponding time intervals. SQL provides similar operators that may be used in accordance with embodiments of the invention.
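For orientation, the textbook Allen relations can be written as simple predicates on periods. The sketch below is plain JavaScript with hypothetical names (`allen`, `relation`); it follows the standard definitions of the interval algebra and is not claimed to match the exact semantics of the ALN_ or ISO_ operators used in the queries above.

```javascript
// Sketch only: textbook Allen relations on periods { start, end } with
// start < end. These are not claimed to match any product's operators.
const allen = {
  before:   (a, b) => a.end < b.start,
  meets:    (a, b) => a.end === b.start,
  overlaps: (a, b) => a.start < b.start && b.start < a.end && a.end < b.end,
  starts:   (a, b) => a.start === b.start && a.end < b.end,
  during:   (a, b) => a.start > b.start && a.end < b.end,
  finishes: (a, b) => a.end === b.end && a.start > b.start,
  equals:   (a, b) => a.start === b.start && a.end === b.end,
};

// The remaining six relations are the inverses of the first six.
function relation(a, b) {
  for (const [name, test] of Object.entries(allen)) {
    if (test(a, b)) return name;
    if (name !== "equals" && test(b, a)) return name + " (inverse)";
  }
  return undefined; // not reached for well-formed periods
}

console.log(relation({ start: 1, end: 3 }, { start: 3, end: 6 })); // meets
console.log(relation({ start: 4, end: 6 }, { start: 1, end: 6 })); // finishes
console.log(relation({ start: 1, end: 9 }, { start: 2, end: 5 })); // during (inverse)
```

Exactly one of the thirteen relations holds for any pair of well-formed periods, which is why `relation` can return on the first match.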
The foregoing examples reference two-dimensional object splitting. The disclosed system also supports multi-dimensional object splitting. Consider the multi-temporal time frame illustrated in Figure 16. There is a system time axis 1600, a vp axis 1602 and a vs axis 1604. At time t1, content V1 is inserted between axis positions (vp1, vp2) and (vs1, vs2). This results in a three-dimensional object occupying all of the disclosed space in Figure 16. This can be expressed as (vp1, vp2), (vs1, vs2) and (t1, INF).
Next, there is an update at t2 with content V2. Position vp3 is between vp1 and vp2, while vs3 is between vs1 and vs2. This update results in four cubes after the object split. Object "#1" has content V2 and is specified by (vp1, vp3), (vs1, vs3), (t2, INF). Object "#2" has content V1 and is specified by (vp1, vp2), (vs1, vs2), (t1, t2). Object "#3" has content V1 and is specified by (vp1, vp2), (vs2, vs3), (t2, INF). Finally, object "#4" has content V1 and is specified by (vp3, vp2), (vs1, vs2), (t2, INF).
Those skilled in the art will appreciate that the disclosed techniques facilitate a number of computer system enhancements. Specified time ranges may be used to migrate selected objects to tiered storage. For example, older objects may be migrated to cheaper, slower storage resources (e.g., magnetic tape).
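A time-range-based migration policy could be sketched as a simple partition. The following plain JavaScript is illustrative only: `partitionByAge`, the cutoff and the sample versions are hypothetical, numbers stand in for timestamps, and the rule that only versions with a closed system period are eligible is an assumption of this sketch rather than something the description specifies.

```javascript
// Sketch only: the selection rule and names are assumptions.
const INF = Infinity;

// Select versions whose system period ended before the cutoff. They are
// treated here as settled history and become candidates for a cheaper
// tier. Open versions (systemEnd === INF) always stay where they are.
function partitionByAge(versions, cutoff) {
  const archive = [];
  const keep = [];
  for (const v of versions) {
    (v.systemEnd < cutoff ? archive : keep).push(v);
  }
  return { archive, keep };
}

const versions = [
  { id: "doc1", systemStart: 1, systemEnd: 10 },
  { id: "doc2", systemStart: 10, systemEnd: 40 },
  { id: "doc3", systemStart: 40, systemEnd: INF },
];
const { archive, keep } = partitionByAge(versions, 20);
console.log(archive.map(v => v.id), keep.map(v => v.id)); // [ 'doc1' ] [ 'doc2', 'doc3' ]
```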
The query processor 176 may be configured to cache query results in system time segments to support range queries on system time. In this way, query results can be cached and utilized without being evicted from the cache.
The object collection curator 172 may be configured to form a replicated object from an object in the object collection 174. For example, a replica may be from a segment of a master object. In one embodiment, the object collection curator 172 includes a safety switch that precludes the alteration of the history of an object. For example, the object collection curator 172 may be configured to disable edits to time field values.
An embodiment of the present invention relates to a computer storage product with a non-transitory computer readable storage medium having computer code thereon for performing various computer-implemented operations. The media and computer code may be those specially designed and constructed for the purposes of the present invention, or they may be of the kind well known and available to those having skill in the computer software arts. Examples of computer-readable media include, but are not limited to: magnetic media, optical media, magneto-optical media and hardware devices that are specially configured to store and execute program code, such as application-specific integrated circuits ("ASICs"), programmable logic devices ("PLDs") and ROM and RAM devices. Examples of computer code include machine code, such as produced by a compiler, and files containing higher-level code that are executed by a computer using an interpreter. For example, an embodiment of the invention may be implemented using JAVA®, C++, or other object-oriented programming language and development tools. Another embodiment of the invention may be implemented in hardwired circuitry in place of, or in combination with, machine-executable software instructions.
The foregoing description, for purposes of explanation, used specific nomenclature to provide a thorough understanding of the invention. However, it will be apparent to one skilled in the art that specific details are not required in order to practice the invention. Thus, the foregoing descriptions of specific embodiments of the invention are presented for purposes of illustration and description. They are not intended to be exhaustive or to limit the invention to the precise forms disclosed; obviously, many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications; they thereby enable others skilled in the art to best utilize the invention and various embodiments with various modifications as are suited to the particular use contemplated. It is intended that the following claims and their equivalents define the scope of the invention.

Claims

In the claims:
1. A machine, comprising:
a processor; and
a memory connected to the processor, the memory storing instructions executed by the processor to:
construct an object collection wherein each object in the object collection has a common identifier, a valid time start field, a valid time end field, a system time start field and a system time end field;
wherein the object collection includes split objects with
a legacy object, and
an updated object with the system time start field set to the system time that the split objects are formed.
2. The machine of claim 1 wherein the split objects further comprise an additional updated object with the common identifier, a valid time start field, a valid time end field, a system time start field, a system time end field, a start axis field value and an end axis field value characterizing a multi-temporal object.
3. The machine of claim 1 wherein the memory stores instructions executed by the processor to establish a last stable query time value on a system time axis.
4. The machine of claim 3 wherein the memory stores instructions executed by the processor to query objects with system time end field values less than the last stable query time value.
5. The machine of claim 3 wherein the memory stores instructions executed by the processor to update objects with a system time start field value greater than the last stable query time value.
6. The machine of claim 1 wherein the memory stores instructions executed by the processor to migrate selected objects of the object collection to tiered storage in accordance with a specified time range.
7. The machine of claim 1 wherein the memory stores instructions executed by the processor to cache query results in system time segments to support range queries on system time.
8. The machine of claim 1 wherein the memory stores instructions executed by the processor to form a replicated object from an object within the object collection.
9. The machine of claim 1 wherein the memory stores instructions executed by the processor to disable edits to time field values.
10. The machine of claim 1 wherein the object collection designates early version object instances and a last object instance.
11. The machine of claim 1 wherein the memory stores instructions executed by the processor to apply Allen operators to the object collection.
12. The machine of claim 1 wherein the memory stores instructions executed by the processor to apply Structured Query Language operators to the object collection.
13. The machine of claim 1 wherein each object is selected from a Resource Description Framework (RDF) object, an Extensible Markup Language (XML) object, a JavaScript Object Notation (JSON) object and a text object.
14. The machine of claim 1 wherein the memory stores instructions executed by the processor to set a system time start field to a value earlier than a current system time.
EP15777222.9A 2014-04-07 2015-04-02 Apparatus and method for management of bitemporal objects Withdrawn EP3129902A4 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201461976378P 2014-04-07 2014-04-07
PCT/US2015/024106 WO2015157086A1 (en) 2014-04-07 2015-04-02 Apparatus and method for management of bitemporal objects

Publications (2)

Publication Number Publication Date
EP3129902A1 true EP3129902A1 (en) 2017-02-15
EP3129902A4 EP3129902A4 (en) 2017-08-30

Family

ID=54209924

Family Applications (1)

Application Number Title Priority Date Filing Date
EP15777222.9A Withdrawn EP3129902A4 (en) 2014-04-07 2015-04-02 Apparatus and method for management of bitemporal objects

Country Status (5)

Country Link
US (1) US20150286688A1 (en)
EP (1) EP3129902A4 (en)
AU (1) AU2015244230A1 (en)
CA (1) CA2941713A1 (en)
WO (1) WO2015157086A1 (en)

Also Published As

Publication number Publication date
AU2015244230A1 (en) 2016-09-22
US20150286688A1 (en) 2015-10-08
WO2015157086A1 (en) 2015-10-15
CA2941713A1 (en) 2015-10-15
EP3129902A4 (en) 2017-08-30

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20160907

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)
A4 Supplementary search report drawn up and despatched

Effective date: 20170727

RIC1 Information provided on ipc code assigned before grant

Ipc: G06F 17/30 20060101AFI20170721BHEP

Ipc: G06F 7/00 20060101ALI20170721BHEP

REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1232627

Country of ref document: HK

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20191101

REG Reference to a national code

Ref country code: HK

Ref legal event code: WD

Ref document number: 1232627

Country of ref document: HK