US10621163B2 - Tracking and reusing function results - Google Patents
- Publication number
- US10621163B2 (application US15/835,331)
- Authority
- US
- United States
- Prior art keywords
- function
- invocation
- result
- timeframe
- version
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related, expires
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2365—Ensuring data consistency and integrity
Definitions
- function calls are part of client-side processing
- function executing can require transferring code for that function to the client, or function calls can be across a network, with results sent back to the calling system (e.g. in a callback). These transfers take up valuable network bandwidth and delay processing response time until the data or code can be delivered.
- FIG. 1 is a block diagram illustrating an overview of devices on which some implementations can operate.
- FIG. 2 is a block diagram illustrating an overview of an environment in which some implementations can operate.
- FIG. 3 is a block diagram illustrating components which, in some implementations, can be used in a system employing the disclosed technology.
- FIG. 4 is a flow diagram illustrating a process used in some implementations for tracking and reusing stored function results.
- FIG. 5 is a flow diagram illustrating a process used in some implementations for an optimization that uses placeholders to prevent duplicate checking for valid stored function results.
- FIG. 6 is a flow diagram illustrating a process used in some implementations for recursively adding invalidity items to the linked lists for nodes in a trace.
- FIG. 7 is a conceptual diagram illustrating an example of a function invocation that checks whether there are valid stored results to use instead of executing the function.
- Embodiments for tracking and reusing stored function results are described.
- When a computing system executes software, it often makes calls to various “functions” (also known as methods, sub-routines, etc.) to carry out operations.
- Functions allow the same code to be employed multiple times, with the same or different arguments.
- the result of a function can be stored so that when the function is invoked with the same arguments as a previous execution of the function, the stored results can be used instead of re-executing the function.
- When a function is re-executed, even with the same arguments as a previous execution, it can produce a different result. This is because, when software functions are executed, results of the function can rely on “source data,” e.g.
- executing a function entails processing the instructions that make up the function, e.g. without using stored results of the function to produce the function result.
- invoking a function entails making a call to the function to obtain a valid result of the function, whether or not that result is obtained by executing the function or by obtaining a stored result of a previous execution of the function.
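The distinction between executing and invoking can be illustrated with a minimal sketch. It assumes a plain dictionary as the result store and ignores changes to source data; the names `stored_results`, `execute`, `invoke`, and `square` are hypothetical and do not come from the patent.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical store of results keyed by (function name, arguments).
stored_results: Dict[Tuple[str, tuple], Any] = {}
execution_count = 0


def execute(fn: Callable[..., Any], args: tuple) -> Any:
    """Execute: run the function's instructions without consulting stored results."""
    global execution_count
    execution_count += 1
    return fn(*args)


def invoke(fn: Callable[..., Any], *args: Any) -> Any:
    """Invoke: obtain a result, from the store if present, otherwise by executing."""
    key = (fn.__name__, args)
    if key not in stored_results:
        stored_results[key] = execute(fn, args)
    return stored_results[key]


def square(x: int) -> int:
    return x * x


first = invoke(square, 4)   # executes square(4)
second = invoke(square, 4)  # reuses the stored result
```

Both invocations return the same value, but only the first one executes the function body.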
- a function that uses other source data is referred to as “depending” on that source data. For example, a function that calls a source function is referred to as depending on that source function.
- Some systems can track what source data a function uses in a trace. For each function invocation, with a specified set of zero or more arguments, an invocation node can be created with a trace comprising an ordered list of dependencies (“edges”) to source nodes, each representing source functions or other source data, directly or indirectly used during execution of the invoked function. Thus, a sequence of dependencies can exist between various nodes, where the dependencies are defined by trace edges for one invocation node leading to source nodes, which in turn are invocation nodes in one or more other traces. In some implementations, the system can also track reverse pointers to easily determine, for any given source node, which invocations depend on that given source node.
- Each node can represent a group of one or more functions (with corresponding arguments) or can represent a mutable data source such as a database row or a global variable. While nodes are discussed below as corresponding to a single function or data source, in some implementations, a node can correspond to a chain of multiple functions that all get called during an execution. Thus, when creating a function trace (referred to herein as “memoizing” the function) a function n( ) may invoke a non-memoized function m( ) (i.e. a function which does not have a corresponding trace).
- Instead of creating an n( ) → m( ) edge, the node for n( ) can inherently include the call to m( ).
- If m( ) went on to call memoized function p( ),
- an n( ) → p( ) edge is added to n( )'s trace to record this dependency.
- a set of nodes, interdependent through a set of traces, can be used when evaluating a function invocation to determine if there is a valid stored result. In some implementations, this is accomplished by having nodes associated with a validity bit, which will be set to invalid when source data the node uses changes in value. These changes can propagate up the trace dependencies, such that when a value for a source node changes, and its validity bit is set to invalid, the stored results of any functions dependent on that source node (i.e. any invocation nodes that have traces with an edge to that source node) are also set as invalid, and so on up the various traces.
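The validity-bit propagation can be sketched as a recursive walk over reverse pointers. The node names and the `dependents` map below are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical reverse pointers: source node -> invocation nodes that depend on it.
dependents: Dict[str, List[str]] = {
    "row": ["b()"],
    "b()": ["a()"],
    "c()": ["a()"],
    "a()": [],
}
valid: Dict[str, bool] = {node: True for node in dependents}


def invalidate(node: str) -> None:
    """Clear the validity bit for node and for every node that depends on it."""
    if not valid[node]:
        return  # already invalid, so its ancestors were handled earlier
    valid[node] = False
    for parent in dependents[node]:
        invalidate(parent)


invalidate("row")
```

After the call, `row`, `b()`, and `a()` are marked invalid, while `c()` keeps its validity bit because it does not depend on the changed source.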
- a first function can be invoked for “version 1.”
- a “version” defines a state of available data such that each time a data change is made (e.g. a database value is updated) the version number can be incremented.
- the invoked first function can use a stored result if it was valid for version 1 or if all the source data, that the invoked first function uses, is unchanged since version 1.
- an item of source data for the first function e.g.
- a stored result for a second function can change. If the system only tracks whether a stored result is currently valid or not, setting the stored result of the second function to invalid can cause evaluation of the first function to determine that its stored result is invalid as at least some of its source data has changed, and thus the first function needs to be re-executed. However, since the stored result of the second function was valid for version 1 corresponding to the version for which the first function was invoked, the stored result for the first function is actually a useable result for the version 1 invocation. Thus, when a function is invoked, it can use a stored result if the stored result for the invocation version is valid or if all of the source data that the function uses for that version is valid, even if that stored function result subsequently becomes invalid.
- the function result tracking system disclosed herein can correlate, with function executions, one or more “timeframes” for which results of that function execution are valid.
- a timeframe is a range of versions.
- each node referenced by a series of trace dependencies can represent a function execution performed with a particular set of zero or more arguments.
- Each node can be associated with a data structure, such as a linked list (“LL”), that keeps track of results of function executions, and timeframes for when those results are valid.
- a node can be an object with a linked list member variable. While the data structure for tracking valid/invalid function result timeframes is discussed herein in relation to a linked list, other data structures (e.g.
- All or some of the most recent result items in the linked list can also be associated with edges in the trace, indicating which functions or other source data was used to generate the result stored in that linked list item.
- Each item in the linked list can have a result and can specify a version or a timeframe.
- a linked list item can indicate that starting at the indicated version, up to but not including the version indicated by the next item in the linked list, the corresponding result is valid.
- an item on a linked list can indicate a version but not a result value or can have a special “invalid” result value.
- This type of linked list item can indicate that, starting at the indicated version, there is no valid stored result for the function.
- a linked list item can specify both a start and end version for which the result for that timeframe is valid.
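The timeframe lookup can be sketched as follows. This sketch uses a sorted Python list with `bisect` in place of a linked list, which is a substitution made for brevity; the `ResultTimeline` class and the `INVALID` marker are hypothetical names.

```python
import bisect
from typing import Any, List, Optional, Tuple

INVALID = object()  # hypothetical marker for "no valid stored result"


class ResultTimeline:
    """Items kept in ascending version order; each holds until the next item starts."""

    def __init__(self) -> None:
        self.versions: List[int] = []
        self.results: List[Any] = []

    def add(self, version: int, result: Any) -> None:
        i = bisect.bisect_left(self.versions, version)
        if i < len(self.versions) and self.versions[i] == version:
            self.results[i] = result  # replace the item starting at this version
        else:
            self.versions.insert(i, version)
            self.results.insert(i, result)

    def lookup(self, version: int) -> Tuple[bool, Optional[Any]]:
        """Return (True, result) if a valid result covers version, else (False, None)."""
        i = bisect.bisect_right(self.versions, version) - 1
        if i < 0 or self.results[i] is INVALID:
            return (False, None)
        return (True, self.results[i])


timeline = ResultTimeline()
timeline.add(1, "r1")      # valid for versions 1 through 4
timeline.add(5, INVALID)   # no valid result from version 5
timeline.add(8, "r2")      # valid from version 8 onward
```

An invocation for version 3 finds `"r1"`, an invocation for version 6 finds no valid stored result, and any invocation at version 8 or later finds `"r2"` until another item is added.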
- Execution of this function can call source functions g(2) and then h( ).
- a resulting trace can have two ordered edges, the first specifying an invocation node for f(2, ‘a’) pointing to source node g(2) and the second also specifying the f(2, ‘a’) invocation node with source node h( ).
- only the most recent linked list item is associated with trace edges.
- a source data value that the execution of h( ) depended on can change. This causes an item to be added to the linked list corresponding to h( ), to each of the linked lists for the invocation nodes that have a trace edge to h( ) as a source node, and iteratively to each further “ancestor” node up the dependencies between interrelated traces.
- an item is inserted in the linked lists corresponding to h( ) and f(2, ‘a’).
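The f(2, ‘a’) example can be sketched with versioned items instead of a single validity bit. The result values (10, 20, 30), the version numbers, and the helper names `add_invalidity` and `result_at` are assumptions made for illustration only.

```python
from typing import Any, Dict, List, Tuple

INVALID = object()  # hypothetical marker for "no valid stored result"

# Hypothetical per-node item lists of (start_version, result), in ascending order.
items: Dict[str, List[Tuple[int, Any]]] = {
    "g(2)": [(1, 10)],
    "h()": [(1, 20)],
    "f(2,'a')": [(1, 30)],
}
# Reverse pointers derived from the trace edges f -> g(2) and f -> h().
dependents: Dict[str, List[str]] = {
    "g(2)": ["f(2,'a')"],
    "h()": ["f(2,'a')"],
    "f(2,'a')": [],
}


def add_invalidity(node: str, version: int) -> None:
    """Add an invalidity item at version to node and, recursively, to its ancestors."""
    last_version, last_result = items[node][-1]
    if last_result is INVALID and last_version <= version:
        return  # already invalid for this version onward
    items[node].append((version, INVALID))
    for parent in dependents[node]:
        add_invalidity(parent, version)


def result_at(node: str, version: int) -> Any:
    """Return the result of the latest item starting at or before version."""
    found: Any = INVALID
    for start, result in items[node]:
        if start <= version:
            found = result
    return found


add_invalidity("h()", 2)  # source data used by h() changed at version 2
```

After the change, an invocation of f(2, ‘a’) for version 1 can still use the stored result, while an invocation for version 2 finds no valid stored result; g(2) is untouched.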
- the system needs to obtain a valid result.
- Function result tracking and utilization can include sophisticated technical algorithms and data structures for determining dependencies, tracking result validity, and efficiently re-computing invalid results.
- Functions can be associated with both data dependency structures and timeframe validity data structures. Navigating these data structures can include deciding whether to select a valid stored result, determining if a result marked invalid should be updated to be valid, or re-executing the function completely. Correctly invoking such a function requires significant technical detail. However, the disclosed technology does so in a manner that provides significant improvements over prior art systems in both processing speed and network bandwidth utilization. These benefits are the direct result of the disclosed technology requiring less code execution to produce a correct result.
- FIG. 1 is a block diagram illustrating an overview of devices on which some implementations of the disclosed technology can operate.
- the devices can comprise hardware components of a device 100 that can track and use stored function results.
- Device 100 can include one or more input devices 120 that provide input to the CPU(s) (processor) 110 , notifying it of actions. The actions can be mediated by a hardware controller that interprets the signals received from the input device and communicates the information to the CPU 110 using a communication protocol.
- Input devices 120 include, for example, a mouse, a keyboard, a touchscreen, an infrared sensor, a touchpad, a wearable input device, a camera- or image-based input device, a microphone, or other user input devices.
- CPU 110 can be a single processing unit or multiple processing units in a device or distributed across multiple devices.
- CPU 110 can be coupled to other hardware devices, for example, with the use of a bus, such as a PCI bus or SCSI bus.
- the CPU 110 can communicate with a hardware controller for devices, such as for a display 130 .
- Display 130 can be used to display text and graphics.
- display 130 provides graphical and textual visual feedback to a user.
- display 130 includes the input device as part of the display, such as when the input device is a touchscreen or is equipped with an eye direction monitoring system. In some implementations, the display is separate from the input device.
- Examples of display devices are: an LCD display screen, an LED display screen, a projected, holographic, or augmented reality display (such as a heads-up display device or a head-mounted device), and so on.
- Other I/O devices 140 can also be coupled to the processor, such as a network card, video card, audio card, USB, firewire or other external device, camera, printer, speakers, CD-ROM drive, DVD drive, disk drive, or Blu-Ray device.
- the device 100 also includes a communication device capable of communicating wirelessly or wire-based with a network node.
- the communication device can communicate with another device or a server through a network using, for example, TCP/IP protocols.
- Device 100 can utilize the communication device to distribute operations across multiple network devices.
- the CPU 110 can have access to a memory 150 in a device or distributed across multiple devices.
- a memory includes one or more of various hardware devices for volatile and non-volatile storage, and can include both read-only and writable memory.
- a memory can comprise random access memory (RAM), CPU registers, read-only memory (ROM), and writable non-volatile memory, such as flash memory, hard drives, floppy disks, CDs, DVDs, magnetic storage devices, tape drives, device buffers, and so forth.
- a memory is not a propagating signal divorced from underlying hardware; a memory is thus non-transitory.
- Memory 150 can include program memory 160 that stores programs and software, such as an operating system 162 , function result tracking system 164 , and other application programs 166 .
- Memory 150 can also include data memory 170 that can include traces, function result and validity data structures, a version counter, configuration data, settings, user options or preferences, etc., which can be provided to the program memory 160 or any element of the device 100 .
- Some implementations can be operational with numerous other general purpose or special purpose computing system environments or configurations.
- Examples of well-known computing systems, environments, and/or configurations that may be suitable for use with the technology include, but are not limited to, personal computers, server computers, handheld or laptop devices, cellular telephones, wearable electronics, gaming consoles, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, or the like.
- FIG. 2 is a block diagram illustrating an overview of an environment 200 in which some implementations of the disclosed technology can operate.
- Environment 200 can include one or more client computing devices 205 A-D, examples of which can include device 100 .
- Client computing devices 205 can operate in a networked environment using logical connections 210 through network 230 to one or more remote computers, such as a server computing device.
- server 210 can be an edge server which receives client requests and coordinates fulfillment of those requests through other servers, such as servers 220 A-C.
- Server computing devices 210 and 220 can comprise computing systems, such as device 100 . Though each server computing device 210 and 220 is displayed logically as a single server, server computing devices can each be a distributed computing environment encompassing multiple computing devices located at the same or at geographically disparate physical locations. In some implementations, each server 220 corresponds to a group of servers.
- Client computing devices 205 and server computing devices 210 and 220 can each act as a server or client to other server/client devices.
- Server 210 can connect to a database 215 .
- Servers 220 A-C can each connect to a corresponding database 225 A-C.
- each server 220 can correspond to a group of servers, and each of these servers can share a database or can have their own database.
- Databases 215 and 225 can warehouse (e.g. store) information such as function results, dependency data, function code, processing statistics, etc. Though databases 215 and 225 are displayed logically as single units, databases 215 and 225 can each be a distributed computing environment encompassing multiple computing devices, can be located within their corresponding server, or can be located at the same or at geographically disparate physical locations.
- Network 230 can be a local area network (LAN) or a wide area network (WAN), but can also be other wired or wireless networks.
- Network 230 may be the Internet or some other public or private network.
- Client computing devices 205 can be connected to network 230 through a network interface, such as by wired or wireless communication. While the connections between server 210 and servers 220 are shown as separate connections, these connections can be any kind of local, wide area, wired, or wireless network, including network 230 or a separate public or private network.
- FIG. 3 is a block diagram illustrating components 300 which, in some implementations, can be used in a system employing the disclosed technology.
- the components 300 include hardware 302 , general software 320 , and specialized components 340 .
- a system implementing the disclosed technology can use various hardware including processing units 304 (e.g. CPUs, GPUs, APUs, etc.), working memory 306 , storage memory 308 (local storage or as an interface to remote storage, such as storage 215 or 225 ), and input and output devices 310 .
- storage memory 308 can be one or more of: local devices, interfaces to remote storage devices, or combinations thereof.
- storage memory 308 can be a set of one or more hard drives (e.g.
- RAID redundant array of independent disks
- NAS network accessible storage
- Components 300 can be implemented in a client computing device such as client computing devices 205 or on a server computing device, such as server computing device 210 or 220 .
- General software 320 can include various applications including an operating system 322 , local programs 324 , and a basic input output system (BIOS) 326 .
- Specialized components 340 can be subcomponents of a general software application 320 , such as local programs 324 .
- Specialized components 340 can include invocation interceptor 344 , trace operator 346 , invalidity resolver 348 , placeholder optimizer 350 , and components which can be used for transferring data and controlling the specialized components, such as interface 342 .
- components 300 can be in a computing system that is distributed across multiple computing devices or can be an interface to a server-based application executing one or more of specialized components 340 .
- Invocation interceptor 344 can obtain an indication that a function has been invoked. Invocation interceptor 344 can be part of a runtime framework that is responsible for responding to function calls. In various implementations, invocation interceptor 344 can be an explicit function called as part of executing another function (e.g. a checkForStoredResult( ) automatically added to the beginning of other function calls) or invocation interceptor 344 can monitor and handle function calls in the background of execution (e.g. as an initial step coded to be performed before retrieving the function code and establishing space on the memory stack for function execution). Invocation interceptor 344 can determine what function was invoked, what arguments were passed to the function, and a version value associated with the invocation.
- the version value can be a counter (e.g. global variable) that is incremented for each data change, in which case the time value establishes the relative ordering between various states of available data.
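A version counter of this kind could look like the following minimal sketch, which assumes a single in-memory data source; the `VersionedStore` class and the `price` key are hypothetical.

```python
from typing import Any, Dict


class VersionedStore:
    """Hypothetical mutable data source that bumps a version counter on every change."""

    def __init__(self) -> None:
        self.version = 0
        self._data: Dict[str, Any] = {}

    def write(self, key: str, value: Any) -> int:
        self._data[key] = value
        self.version += 1  # each data change defines a new state of available data
        return self.version

    def read(self, key: str) -> Any:
        return self._data[key]


store = VersionedStore()
v1 = store.write("price", 100)
invocation_version = store.version  # an invocation made now is tied to this version
v2 = store.write("price", 120)
```

The counter only establishes an ordering between data states; it carries no wall-clock meaning.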
- Invocation interceptor 344 can pass the invocation information to trace operator 346 , e.g. directly or through interface 342 .
- Trace operator 346 can determine, for a particular function invocation, whether there is a linked list corresponding to the invoked function with the corresponding arguments, and if so, whether that linked list has a valid result value for the version associated with the invocation. Trace operator 346 can also update trace dependencies as functions are executed. Trace operator 346 can further keep updated the validity data in linked lists corresponding to nodes, e.g. as data relied upon by a function changes, to propagate invalidity states up interrelated trace dependencies, and in response to new validity data being determined.
- When trace operator 346 receives, from invocation interceptor 344 or interface 342 , an indication that a function has been invoked, with the details of any arguments used in the function invocation and a version number for the invocation, trace operator 346 can obtain a linked list corresponding to that function invocation with the specified arguments. In some implementations, trace operator 346 can maintain multiple traces and corresponding linked lists, corresponding to various unique function invocations, where a unique function invocation is a function called with a particular set of zero or more arguments. Changing either the function called or an argument provided constitutes a different unique function invocation.
- Each node can be associated with a linked list or other data structure that stores pairs of A) a version value or time range and B) corresponding function results (or a blank or special indicator if the entry signifies a timeframe for which there are no valid stored function results or placeholder items, as discussed below).
- items on the linked list can specify a timeframe, starting with the version indicated in the item and continuing until directly before a version indicated by the next item in the linked list, for which the corresponding stored function result is valid (or invalid if the corresponding stored function result is blank or uses an invalidity indicator).
- the timeframe for that entry can start at the indicated time and go to infinity, until a new item is added to the head of the linked list.
- invalidity resolver 348 can determine a valid function result for the invoked function for the invocation version number and signal updates that trace operator 346 should implement to the trace as a result of the processing of invalidity resolver 348 .
- trace operator 346 can monitor which other functions are called or other source data is used to determine which source data the executing function is dependent upon, in what order, and what value was provided by each data source.
- Other called functions that are memoized are other source data nodes referenced by edges in the resulting trace; the dependencies are stored as edges in a trace with an associated order and return value.
- a trace can be stored for a specified number of the most recent linked list items (e.g. 1, 2, 5, etc.).
- invalidity resolver 348 can use these traces to resolve invalidities without having to re-execute the invoked function or all of the invoked function's source functions.
- Trace operator 346 can further store and update validity data (to mark a timeframe as having a valid result or to not have a valid result) by adding items to the linked lists corresponding to invocations. These linked list items can be added, e.g., to mark a timeframe as invalid as data relied upon by a function changes or to mark a new function result for a timeframe as valid.
- the dependencies between traces can be traversed, adding an invalidity item to the linked list corresponding to each of the ancestor nodes of the changed source node.
- an “ancestor” of a particular node is any node that is connected to the particular node, through a series of one or more trace edge traversals from a current node as the invocation node to one of that current node's source nodes as a new current node, starting at the ancestor node.
- Invalidity resolver 348 can receive, from trace operator 346 , an indication that a function invoked for a particular version did not have a stored valid result for that version and can also receive the corresponding linked list and trace(s). Invalidity resolver 348 can determine, for the invoked function, whether, in the linked list, the most recent valid linked list item prior to the invocation version is associated with a trace indicating which data sources were used in the function invocation that created that valid linked list item.
- invalidity resolver 348 causes the invoked function to be re-executed.
- Results of the re-execution can be provided to trace operator 346 , which will add a valid item to the linked list for that invocation. This item can be added to the linked list at a location to keep the linked list items in order of the invocation version numbers associated with each linked list item.
- invalidity resolver 348 can determine whether all the source data has a valid result for the invocation version. If any source data from a stored function result is invalid for the invocation version, invalidity resolver 348 can invoke those functions for the invocation version in the order indicated by the trace, which will be intercepted by invocation interceptor 344 , and produce a valid result for the invocation version. Next, invalidity resolver 348 can determine if either A) the invalid source data is from another data source (e.g.
- invalidity resolver 348 causes the invoked function to be re-executed and results of the re-execution can be provided to trace operator 346 , as discussed above.
- the previous stored function result is still valid; the linked list can be updated to indicate that, for the intersection of the timeframes of the used source nodes, the previous stored result is valid. That previous stored result can also be returned in response to the invocation.
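Computing the intersection of the source nodes' timeframes can be sketched as below. The sketch assumes each timeframe is a half-open range of versions, with infinity standing for "until another item is added"; the `Timeframe` alias, the `intersect` helper, and the sample ranges are hypothetical.

```python
import math
from typing import Iterable, Optional, Tuple

Timeframe = Tuple[float, float]  # hypothetical [start, end) range of versions


def intersect(timeframes: Iterable[Timeframe]) -> Optional[Timeframe]:
    """Return the range of versions common to all timeframes, or None if empty."""
    start, end = -math.inf, math.inf
    for s, e in timeframes:
        start, end = max(start, s), min(end, e)
    return (start, end) if start < end else None


# Timeframes for which each source node's current result is valid.
source_timeframes = [(3, 9), (5, math.inf), (4, 12)]
reusable = intersect(source_timeframes)
```

Here the previous stored result could be marked valid for versions 5 up to but not including 9, since that is the only range in which every source node's result holds.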
- Placeholder optimizer 350 can cause some function invocations, when there is not a stored valid result for a timeframe including the invocation version, to wait for another invocation of the same function to finish before determining how to proceed with the invocation. Placeholder optimizer 350 can take over after trace operator 346 determines that, for a particular function invocation, the corresponding node is not associated with a valid result value for the invocation version. Placeholder optimizer 350 can determine whether, in the linked list for the node corresponding to the invocation, there is already a placeholder item set with a timeframe covering the invocation version.
- a placeholder item can be a special linked list item that indicates another invocation has begun with a particular invocation version which will produce a valid result for at least that earlier invocation version and possibly additional versions.
- placeholder optimizer 350 can pause the received invocation to wait for the result corresponding to the placeholder item.
- placeholder optimizer 350 can determine if the result is valid for the invocation version. If so, that result can be used as a result for the invocation.
- placeholder optimizer 350 can add a placeholder item to the linked list and continue the invocation of the function. When a result of the invocation is produced providing a valid result for a timeframe, the placeholder item can be removed from the linked list.
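The placeholder behavior can be sketched with threads, using a `threading.Event` as the placeholder item. This is a simplified assumption: it keys results on the exact invocation version rather than on a timeframe, does not handle an execution that raises, and all names (`results`, `placeholders`, `slow_function`, `invoke`) are hypothetical.

```python
import threading
import time
from typing import Dict, List, Tuple

lock = threading.Lock()
results: Dict[Tuple[str, int], int] = {}                 # (function, version) -> result
placeholders: Dict[Tuple[str, int], threading.Event] = {}
executions = 0


def slow_function() -> int:
    global executions
    executions += 1
    time.sleep(0.05)  # stand-in for an expensive execution
    return 42


def invoke(version: int) -> int:
    key = ("slow_function", version)
    with lock:
        if key in results:
            return results[key]
        pending = placeholders.get(key)
        if pending is None:
            # No invocation in flight: add a placeholder and execute ourselves.
            placeholders[key] = threading.Event()
    if pending is not None:
        pending.wait()  # pause until the earlier invocation finishes
        return results[key]
    value = slow_function()
    with lock:
        results[key] = value
        placeholders.pop(key).set()  # remove the placeholder and wake the waiters
    return value


outputs: List[int] = []
threads = [threading.Thread(target=lambda: outputs.append(invoke(7))) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All five invocations receive the same result, but the function body runs only once; the other four either wait on the placeholder or find the stored result.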
- FIGS. 1-3 may be altered in a variety of ways. For example, the order of the logic may be rearranged, substeps may be performed in parallel, illustrated logic may be omitted, other logic may be included, etc. In some implementations, one or more of the components described above can execute one or more of the processes described below.
- FIG. 4 is a flow diagram illustrating a process 400 used in some implementations for tracking and reuse of stored function results.
- Process 400 begins at block 402 and continues to block 404 .
- process 400 can receive an indication that a function f( ) was invoked at time t with arguments P; f( ) can be any function, P can be a set of zero or more arguments, and t can be any indicator of data update ordering (e.g. a version number).
- process 400 can obtain a linked list corresponding to f(P).
- Other data structures can be used, such as an array of <timeStart, timeEnd, value> items. Some items in a data structure can indicate a timeframe for which there is no stored valid function result.
- One or more of the items in a data structure can be associated with trace data indicating what source data (e.g. global variables, other function results, database values, etc.) were used during the execution that produced the result value stored by that item. When a source data value changes (e.g.
- the invocation nodes that are connected by trace edges to nodes for those source data items can also become invalid by adding corresponding invalidating items to their associated data structures.
- process 400 can determine whether the linked list item, for the timeframe covering the invocation version t, is associated with a valid stored result for f(P). If so, process 400 continues to block 410 , where the stored result is returned instead of executing f(P). Process 400 then continues to block 412 , where it ends.
- process 400 continues to block 414 .
- process 400 can implement process 600 at this point instead of proceeding to block 414 .
- process 400 can determine whether an item m in the linked list that is directly before the item for the timespan covering version t is associated with a trace specifying which source data was used to compute that value. If that trace is not available, process 400 can continue to block 428 ; if that trace is available, process 400 can continue to block 416 .
- process 400 can use the trace to obtain the source nodes d, indicating the source data that the execution of f(P) was dependent upon in the execution corresponding to the linked list item m that has the previous valid timeframe identified at block 414 .
- process 400 can determine, in order of edges defined by the trace, if, in the linked lists corresponding to each of the d nodes, any of the linked list items that have a timeframe that covers the t invocation version have no valid stored result. Though not shown, process 400 can also determine if any of the source nodes with an invalid result for the t invocation version are from a source other than a function call e.g.
- process 400 would continue to block 428 . If so, process 400 continues to block 420 , if not (meaning all the source nodes d have a valid result for the t invocation version) process 400 continues to block 422 .
- process 400 can invoke some or all of the functions corresponding to the nodes with no valid stored result, found at block 418 .
- These invocations can be in the order defined by the trace obtained at block 414 , i.e. the order in which the corresponding functions were called by f(P).
- Each of these invocations can initiate a new instance of process 400; however, each of those instances will use invocation version t as its invocation version. These invocations will produce a valid result for each function for at least version t.
- process 400 can get the valid result for each of the functions corresponding to the d nodes.
- Each of these results can be accompanied by an indication of a timeframe for which that result is valid, where each of these timeframes will include at least version t.
- process 400 can iterate through the results obtained at block 422 , checking each against a corresponding result produced by that source data in the execution of f(P) corresponding to item m. These iterations can occur in an order of edges defined by the trace for f(P). If any of these source data results are different than they were in the execution for item m, then f(P) needs to be re-executed with the new data, so process 400 continues to block 428 . In some implementations, these comparisons can occur immediately after each result is obtained at block 420 and if any result is different from the previous result for that function, process 400 can continue from block 420 to block 428 .
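- The comparison step can be sketched as follows, assuming a trace is represented as an ordered list of (source, previous value) pairs and that a caller-supplied function returns each source's current value for version t. The names `first_changed_source` and `result_for` and this representation are hypothetical; the sketch also takes the variant in which each source is compared as soon as its value is obtained.

```python
from typing import Any, Callable, List, Optional, Tuple

Trace = List[Tuple[str, Any]]   # (source name, value seen in the earlier run)


def first_changed_source(trace: Trace,
                         current_value: Callable[[str], Any]) -> Optional[str]:
    """Walk the trace in order and return the first source whose value differs."""
    for source, previous in trace:
        if current_value(source) != previous:
            return source        # stop early: later sources are not fetched
    return None


def result_for(trace: Optional[Trace], previous_result: Any,
               current_value: Callable[[str], Any],
               execute: Callable[[], Any]) -> Any:
    """Reuse previous_result when every traced source is unchanged."""
    if trace is not None and first_changed_source(trace, current_value) is None:
        return previous_result
    return execute()             # no trace, or a source changed: re-execute
```

- Stopping at the first differing source avoids invoking later source functions whose results would not be used, since a re-execution is already required.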
- process 400 can re-execute f(P) for version t, not relying on any stored function results, to determine the result of f(P).
- process 400 can add an item to the f(P) node's linked list, where the new item includes the computed result of f(P), with a corresponding version t. This item can be added to the linked list at a location such that the items in the linked list are always in ascending order of their version number. Thus, if t is greater than the last linked list version number, this item is added to the head of the linked list and the timeframe for this item will be from t to infinity, until another item is added to the linked list.
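- A minimal sketch of this insertion, assuming items are plain `(start, valid, value)` tuples held in a Python list instead of a linked list, is shown here. The helper names `add_item` and `timeframes` are hypothetical; `timeframes` derives each item's end version from the start of the item that follows it, so the newest item runs to infinity.

```python
from bisect import bisect_right
from typing import Any, List, Tuple

Item = Tuple[int, bool, Any]   # (start version, valid flag, stored value)


def add_item(items: List[Item], start: int, valid: bool, value: Any = None) -> None:
    """Insert an item so that the list stays in ascending order of version."""
    starts = [item[0] for item in items]
    items.insert(bisect_right(starts, start), (start, valid, value))


def timeframes(items: List[Item]) -> List[Tuple[int, float, bool, Any]]:
    """Expand each item to (start, end, valid, value); the last end is infinity."""
    expanded = []
    for index, (start, valid, value) in enumerate(items):
        is_last = index + 1 == len(items)
        end = float("inf") if is_last else items[index + 1][0] - 1
        expanded.append((start, end, valid, value))
    return expanded
```

- Adding an item at a later version implicitly closes the timeframe of the item before it, so no existing item needs to be rewritten.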
- Process 400 can then continue to block 434 where the result of f(P) computed at block 428 is returned as a response to the invocation. Process 400 then continues to block 436 , where it ends.
- FIG. 5 is a flow diagram illustrating a process 500 used in some implementations for an optimization using placeholders to prevent duplicate checking for valid stored function results.
- process 500 can be initiated between the blocks 408 and 414 of process 400 .
- Process 500 begins at block 502 and continues to block 504 .
- process 500 can receive an identification of a function (referred to here as f( )), invoked for a particular data version (referred to here as t), with particular arguments (referred to here as P).
- Process 500 can also receive a linked list, corresponding to f(P) invocation, of stored function results with validity timeframes.
- process 500 can determine whether the received linked list has a placeholder item set for a version with a timeframe that covers t. If so, this indicates that another invocation of the f(P) function is in process, and that other instances of processes 400 and 500 are determining a valid result for an earlier version. Because that valid result may also be valid for version t, process 500 continues to block 508, where it waits for the result of the other invocation corresponding to the placeholder in the linked list. Once that result corresponding to the placeholder is obtained, process 500, at block 510, can determine whether the result that was valid for the earlier version is also valid for version t.
- process 500 continues to block 512 , where it uses the result corresponding to the placeholder, determined to be valid for version t, as a response to the invocation of f(P).
- process 500 is called between blocks 408 and 414 , using the result corresponding to the placeholder can be a modification to process 400 , causing it to go to block 410 and return the result corresponding to the placeholder, instead of going to block 414 .
- Process 500 can then continue to block 520 , where it ends.
- process 500 determines that the received linked list does not have a placeholder item set with a timeframe that encompasses version t, or if, at block 510 , process 500 determines that the result of any placeholder process is not indicated as valid at least for version t, process 500 continues to block 514 .
- process 500 adds a placeholder item, with a version t, to the received linked list.
- process 500 determines a result for the invoked f(P) function, for version t.
- process 500 is called between blocks 408 and 414 , this can be done by resuming execution of process 400 at block 414 , and obtaining the result from either block 410 or 434 . This is the result that will be passed to any other versions of process 500 waiting at block 508 .
- Process 500 then continues to block 518 , where the placeholder item set at block 514 is removed from the linked list.
- Process 500 then continues to block 520 , where it ends.
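- The placeholder optimization can be sketched with a lock-protected table of pending results, assuming threads stand in for concurrent invocations. The class `Placeholders`, its `result` method, and the `compute(t)` callback returning `(value, first_valid_version, last_valid_version)` are hypothetical names chosen for illustration. As a simplification, one placeholder is kept per invocation key instead of per linked list item, and a waiter whose version is not covered simply computes on its own.

```python
import threading
from concurrent.futures import Future
from typing import Any, Callable, Dict, Hashable, Tuple

# compute(t) returns (value, first_valid_version, last_valid_version)
Compute = Callable[[int], Tuple[Any, int, float]]


class Placeholders:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending: Dict[Hashable, Future] = {}

    def result(self, key: Hashable, t: int, compute: Compute) -> Any:
        with self._lock:
            pending = self._pending.get(key)
            if pending is None:
                mine: Future = Future()
                self._pending[key] = mine        # set the placeholder
        if pending is not None:
            value, start, end = pending.result()  # wait for the other invocation
            if start <= t <= end:
                return value                      # its result also covers t
            return compute(t)[0]                  # simplification: compute directly
        try:
            outcome = compute(t)
            mine.set_result(outcome)              # hand the result to any waiters
            return outcome[0]
        except BaseException as exc:
            mine.set_exception(exc)               # waiters see the same failure
            raise
        finally:
            with self._lock:
                del self._pending[key]            # remove the placeholder
```

- A caller would invoke `result` with a key identifying f(P), the invocation version t, and a callback that runs the remaining validation or execution steps.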
- FIG. 6 is a flow diagram illustrating a process 600 used in some implementations for recursively adding invalidity items to the linked lists for nodes across a series of dependencies.
- process 600 can be initiated when source data, corresponding to a node identified in a trace, changes. For example, this can be a change to a global variable or data in a database.
- Process 600 begins at block 602 and continues to block 604 .
- process 600 can receive an indication of a source node (referred to as n) that has been determined to be invalid at a particular data version (referred to as t).
- version t can be the version of the data change, e.g. the current version.
- process 600 can recurse, calling process 600 for each node where a trace edge specifies node n as the source node, using the invocation node for that edge as the new node n in the new call to process 600 .
- process 600 can skip calling a new instance of process 600 for nodes that are already in an invalid state for version t, i.e. process 600 may not need to recurse because that node and its ancestor nodes were previously set to invalid for version t.
- process 600 can add an item to the linked list associated with node n.
- This new linked list item can indicate that there is no valid stored result for the function corresponding to node n starting at version t and continuing into infinity.
- the version counter can be incremented, making the version with the invalidations “visible.” Thus, until this version update, there is no concern that only some nodes have the new changes while others do not, because they can't be “seen” yet.
- Process 600 can then continue to block 610 , where it ends.
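- The recursive invalidation can be sketched as follows, assuming trace edges are available as a mapping from each source node to the invocation nodes that depend on it, and that t is not earlier than the last version already in each list. The function name `invalidate`, the `dependents` mapping, and the tuple layout are hypothetical. This sketch marks a node before recursing, which differs from the order of blocks described above, so that a node reached along two paths is only marked once.

```python
from typing import Any, Dict, List, Tuple

Item = Tuple[int, bool, Any]   # (start version, valid flag, stored value)


def invalidate(node: str, t: int,
               dependents: Dict[str, List[str]],
               lists: Dict[str, List[Item]]) -> None:
    """Mark node, and every node that depends on it, invalid from version t."""
    items = lists.setdefault(node, [])
    if items and not items[-1][1] and items[-1][0] <= t:
        return                       # already invalid for version t: skip
    items.append((t, False, None))   # invalid from t to infinity
    for parent in dependents.get(node, []):
        invalidate(parent, t, dependents, lists)
```

- With a dependency graph in which two functions read the same source and a third function calls both, a single call on the source node marks all three function nodes invalid.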
- FIG. 7 is a conceptual diagram illustrating an example 700 of a function invocation that checks whether there are valid stored results to use instead of executing the function.
- Example 700 shows a series of three interdependent traces referring to nodes corresponding to function calls, including node 702 with corresponding linked list 704 , node 710 with corresponding linked list 712 , and node 722 with corresponding linked list 724 .
- a first trace is the trace for f(3, 9), referring to node 702 as the invocation node; ordered edges 706 and 708 (order not shown) refer to source nodes 710 and 722, respectively.
- a second trace is the trace for g(“A”, 14), referring to nodes 710 (the invocation node), 718 , and 720 , having corresponding ordered edges 714 and 716 (order not shown).
- a third trace is the trace for h( ), referring to nodes 722 (the invocation node), 720 , and 730 , with corresponding ordered edges 726 and 728 (order not shown).
- the edges in each trace show dependencies that are associated with the most recent valid item in each corresponding linked list, i.e. edges 706 and 708 correspond to item 2 of linked list 704, edges 714 and 716 correspond to item 2 of linked list 712, and edges 726 and 728 correspond to item 2 of linked list 724.
- each edge is shown with a corresponding value at the base of the edge indicating the previous value provided from that source data item.
- example 700 can look at the linked list 704 corresponding to the f(3, 9) node 702 , and determine that, for the timeframe between 60 and 79, a stored valid result is 25, as indicated by item 1 in linked list 704 . Thus, 25 can be returned as a result for the invocation.
- example 700 can look at the linked list 704 corresponding to the f(3, 9) node 702 , and determine that, for the timeframe between 83 and infinity, linked list 704 does not include a stored valid result, as indicated by item 3 in linked list 704 .
- example 700 can look at the linked list 704 corresponding to the f(3, 9) node 702 , and determine that, for the timeframe between 90 (added in the previous portion of this example) and infinity, linked list 704 does not include a stored valid result, as indicated by item 3 in linked list 704 .
- example 700 can determine that there are stored trace edges for the previous valid result, i.e. edges 706 and 708 . Using these edges, example 700 can identify nodes 710 and 722 , which can be examined in an order defined by the trace.
- each linked list item can store very large data objects as function results. In some cases, the benefit obtained from some of these possible results may be outweighed by the resources necessary to store them.
- certain data sources e.g. certain database values
- This determination can be based on factors such as an analysis of various types of data, a log of historical data uses for individual data items or data item types or categories, or a trained classification engine to predict usage times for data items.
- the trace(s), referenced nodes, and corresponding linked lists can be converted to smaller versions that don't store the results for intermediate nodes referenced in the traces.
- this conversion can include removing intermediate nodes from being referenced by a sequence of traces. For example, if a sequence of traces includes a( ) → b( ); b( ) → $i; and $i is determined to be unlikely to change, the sequence can be replaced with a( ) → $i and with an indication that intermediate nodes have been removed.
- the linked list for b( ) can be deleted. However, if $i does change, a( ) may have to be re-executed because actual dependencies are not kept.
- the conversion can include simply replacing all the items in the linked list corresponding to intermediate node function invocations with an invalid marker for the timeframe 0 to infinity. For example, if a sequence of traces includes n( ) → p( ); p( ) → $k; and $k is determined to be unlikely to change, the sequence of traces can be kept, but the linked list for p( ) can be cleared and an invalid item with a timeframe 0 to infinity can be added instead. The result will be that, if the result of that intermediate function is needed, it must be re-executed. However, this eliminates the need to store the linked list result values.
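- The first conversion (removing intermediate nodes) can be sketched as follows, assuming traces are stored as a mapping from an invocation node to the ordered list of sources it used, with no cycles, and that no other trace still refers to the removed intermediate nodes. The names `leaves` and `collapse` and the `stable` set of sources judged unlikely to change are hypothetical.

```python
from typing import Dict, List, Set


def leaves(traces: Dict[str, List[str]], node: str) -> List[str]:
    """Return the non-function sources that node ultimately depends on."""
    if node not in traces:
        return [node]                 # a leaf such as a global or database value
    found: List[str] = []
    for source in traces[node]:
        for leaf in leaves(traces, source):
            if leaf not in found:
                found.append(leaf)
    return found


def collapse(traces: Dict[str, List[str]], root: str, stable: Set[str]) -> bool:
    """Point root directly at its leaves when all of them are stable."""
    found = leaves(traces, root)
    if not all(leaf in stable for leaf in found):
        return False                  # leave the traces untouched
    intermediates: Set[str] = set()
    stack = list(traces[root])
    while stack:
        node = stack.pop()
        if node in traces and node not in intermediates:
            intermediates.add(node)
            stack.extend(traces[node])
    for node in intermediates:
        del traces[node]              # the stored lists for these nodes can go too
    traces[root] = found
    return True
```

- After a successful `collapse`, a change to any remaining leaf forces the root function to be re-executed in full, since the intermediate dependencies are no longer recorded.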
- the computing devices on which the described technology may be implemented can include one or more central processing units, memory, input devices (e.g., keyboard and pointing devices), output devices (e.g., display devices), storage devices (e.g., disk drives), and network devices (e.g., network interfaces).
- the memory and storage devices are computer-readable storage media that can store instructions that implement at least portions of the described technology.
- the data structures and message structures can be stored or transmitted via a data transmission medium, such as a signal on a communications link.
- Various communications links can be used, such as the Internet, a local area network, a wide area network, or a point-to-point dial-up connection.
- computer-readable media can comprise computer-readable storage media (e.g., “non-transitory” media) and computer-readable transmission media.
- being above a threshold means that a value for an item under comparison is above a specified other value, that an item under comparison is among a certain specified number of items with the largest value, or that an item under comparison has a value within a specified top percentage value.
- being below a threshold means that a value for an item under comparison is below a specified other value, that an item under comparison is among a certain specified number of items with the smallest value, or that an item under comparison has a value within a specified bottom percentage value.
- being within a threshold means that a value for an item under comparison is between two specified other values, that an item under comparison is among a middle specified number of items, or that an item under comparison has a value within a middle specified percentage range.
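- The three readings of "above a threshold" can be sketched as simple predicates; the "below" and "within" cases follow the same pattern with the comparisons reversed or bounded on both sides. The function names and the choice to round the percentage up to at least one item are assumptions made for illustration.

```python
import math
from typing import Sequence


def above_value(x: float, limit: float) -> bool:
    """The value is above a specified other value."""
    return x > limit


def among_largest(x: float, values: Sequence[float], n: int) -> bool:
    """The value is among the n largest values."""
    return x in sorted(values, reverse=True)[:n]


def in_top_percent(x: float, values: Sequence[float], percent: float) -> bool:
    """The value falls within the top percent of values."""
    n = max(1, math.ceil(len(values) * percent / 100))
    return among_largest(x, values, n)
```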
- Relative terms such as high or unimportant, when not otherwise defined, can be understood as assigning a value and determining how that value compares to an established threshold.
- selecting a fast connection can be understood to mean selecting a connection that has a value assigned corresponding to its connection speed that is above a threshold.
- the word “or” refers to any possible permutation of a set of items.
- the phrase “A, B, or C” refers to at least one of A, B, C, or any combination thereof, such as any of: A; B; C; A and B; A and C; B and C; A, B, and C; or multiple of any item such as A and A; B, B, and C; A, A, B, C, and C; etc.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US15/835,331 US10621163B2 (en) | 2017-12-07 | 2017-12-07 | Tracking and reusing function results |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20190179932A1 (en) | 2019-06-13 |
| US10621163B2 (en) | 2020-04-14 |
Family
ID=66696859
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US15/835,331 (US10621163B2, Expired - Fee Related) | 2017-12-07 | 2017-12-07 | Tracking and reusing function results |
Country Status (1)
| Country | Link |
|---|---|
| US (1) | US10621163B2 (en) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP3598315B1 (en) * | 2018-07-19 | 2022-12-28 | STMicroelectronics (Grenoble 2) SAS | Direct memory access |
| EP3699771A1 (en) * | 2019-02-21 | 2020-08-26 | CoreMedia AG | Method and apparatus for managing data in a content management system |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20170171075A1 (en) * | 2015-12-10 | 2017-06-15 | Cisco Technology, Inc. | Co-existence of routable and non-routable rdma solutions on the same network interface |
Non-Patent Citations (7)
| Title |
|---|
| Acar, U., et al., "Adaptive Functional Programming," Carnegie Mellon University, May 2006, 46 pages. |
| Acar, U., et al., "An Experimental Analysis of Self-Adjusting Computation," ACM Transactions on Programming Languages and Systems, vol. 32, No. 1, Article 3, Oct. 2009, 53 pages. |
| Acar, U., et al., "Self-Adjusting Computation," Carnegie Mellon University, May 2005, 299 pages. |
| Acar, U., et al., "Traceable Data Types for Self-Adjusting Computation," PLDI, Jun. 5-10, 2010, 14 pages. |
| Burckhardt, S. et al., "Two for the Price of One: A Model for Parallel and Incremental Computation," OOPSLA, Oct. 22-27, 2011, 18 pages. |
| Hammer, M. et al., "ADAPTON: Composable, Demand-Driven Incremental Computation," PLDI, Jun. 9-11, 2014, 11 pages. |
| Ley-Wild, R. et al., "Compiling Self-Adjusting Programs with Continuations," ICFP, Sep. 22-24, 2008, 13 pages. |
Also Published As
| Publication number | Publication date |
|---|---|
| US20190179932A1 (en) | 2019-06-13 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | AS | Assignment | Owner name: FACEBOOK, INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:HOSTETTER, MATHEW JAMES;HOSMER, BASIL CLARK;ORENSTEIN, AARON;REEL/FRAME:044338/0088 Effective date: 20171208 |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | AS | Assignment | Owner name: META PLATFORMS, INC., CALIFORNIA Free format text: CHANGE OF NAME;ASSIGNOR:FACEBOOK, INC.;REEL/FRAME:058871/0336 Effective date: 20211028 |
| | FEPP | Fee payment procedure | Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | LAPS | Lapse for failure to pay maintenance fees | Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
| | FP | Lapsed due to failure to pay maintenance fee | Effective date: 20240414 |