US20170090807A1 - Technologies for managing connected data on persistent memory-based systems
- Publication number
- US20170090807A1 (application US 14/866,941)
- Authority
- US
- United States
- Prior art keywords
- edges
- nodes
- memory device
- persistent memory
- computing device
- Prior art date
- Legal status: Abandoned
Classifications
- G06F3/0631—Configuration or reconfiguration of storage systems by allocating resources to storage systems
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/9024—Graphs; Linked lists
- G06F3/0604—Improving or facilitating administration, e.g. storage management
- G06F3/0683—Plurality of storage devices
- G06F12/0871—Allocation or management of cache space
- G06F12/0897—Caches characterised by their organisation or structure with two or more cache hierarchy levels
- G06F2212/163—Server or database system
- G06F2212/214—Solid state disk
- G06F2212/466—Metadata, control data
Definitions
- the API 402 provides an interface through which the graph data store 404 can be searched.
- the API 402 is configured to return an iterator object in response to a search request on the graph data store 404 .
- An illustrative iterator object 500 is shown in FIG. 5 .
- the iterator object 500 includes a reference to data in the graph data store 404 .
- the iterator object 500 could include a first pointer 502 that points to the first node in the graph data store 404 that matches the search query. If the user requests additional data that satisfies the query, the API 402 could be used to advance the iterator object 500 to the next node matching the query.
- a second pointer 504 illustrates the iterator object 500 having been advanced by the API 402 to the next pointer, which points to the next node matching the query.
- a third pointer 506 illustrates the iterator object 500 having been advanced, yet again, to the next node matching the query. This process would continue with the iterator object 500 advancing to the next pointer as the user requests additional matches in the graph data store 404 using the API 402 .
- the iterator object 500 is stored in volatile memory 110 and maintains its volatile state, such as the next pointer, for speed of access, but points to the actual graph data (such as nodes or properties) of the graph data store 404 in persistent memory 112 , which is accessed as the iterator object 500 progresses.
- Although FIG. 5 illustratively shows pointers 502 of the iterator object 500 pointing to data in a node table 508, the pointers could point to any data in the graph data store 404 in other embodiments.
- a program searching the graph data store 404 through the API 402 will be returned an iterator object 500 that can be examined by the program one element at a time (next pointer to next pointer). In that sense, the queries are evaluated lazily, following the "next" pointers only when the user requests them, with each request causing a data fetch.
- Since the entire graph data store 404 is stored in directly-accessible persistent memory 112, serializing and de-serializing data can be avoided, unlike existing graph stores, which use disks for making data persistent, and managed languages that do not provide a way to access data from buffer caches without creating temporary objects that reside in memory.
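- As an illustration of this lazily evaluated access pattern, the following minimal sketch (not taken from the patent; the class and its members are invented) shows an iterator that lives entirely in volatile memory, keeps only a cursor and the query tag as its state, and on each call follows one more reference into a node table standing in for the persistent graph data store:

```cpp
// Sketch of the lazily evaluated iterator described above (illustrative; the
// class and its members are invented). The iterator lives in volatile memory
// and holds only a cursor plus the query tag; each next() call follows one more
// reference into the node table that stands in for the persistent graph store.
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

struct NodeRecord {              // stand-in for a fixed-size entry in persistent memory
    std::uint64_t id;
    std::string   tag;
};

class NodeIterator {             // volatile state only
public:
    NodeIterator(const std::vector<NodeRecord>& table, std::string tag)
        : table_(&table), tag_(std::move(tag)) {}

    // Advance to the next node whose tag matches; return nullptr when exhausted.
    const NodeRecord* next() {
        while (pos_ < table_->size()) {
            const NodeRecord& rec = (*table_)[pos_++];
            if (rec.tag == tag_) return &rec;       // a pointer into the node table
        }
        return nullptr;
    }

private:
    const std::vector<NodeRecord>* table_;          // the table in persistent memory
    std::string tag_;                               // the query being evaluated lazily
    std::size_t pos_ = 0;                           // volatile cursor
};

int main() {
    std::vector<NodeRecord> node_table = {          // pretend this sits in persistent memory
        {202, "person"}, {204, "email"}, {206, "keyword"}, {208, "person"}, {210, "keyword"}};

    NodeIterator it(node_table, "person");          // as returned by a search via the API
    while (const NodeRecord* n = it.next())         // each call fetches one more match
        std::cout << "match: node " << n->id << "\n";
    return 0;
}
```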
- the computing device 100 may execute a method 600 to create a new entity, such as a node and/or edge in the graph data store 404 .
- the method 600 may also be used to create, delete and/or update other entities in the graph data store 404 .
- the node being created is associated with properties; however, nodes and edges need not necessarily be associated with properties and/or tags.
- the method 600 begins with block 602 in which the computing device 100 determines whether a request has been received, such as through the API 402 , to create a new node.
- the method 600 advances to block 604 in which the computing device 100 creates a new node object of a fixed size in persistent memory 112 . Subsequently, in block 606 , the computing device 100 evaluates the size of the properties and/or tags (and any other data) associated with the new node to determine whether this data will fit within the fixed size of the node object. If the size of the properties and/or tags associated with the new node will fit within the new node object, the method 600 advances to block 608 in which the properties and/or tags are stored within the new node object.
- Otherwise, the method 600 advances to block 610 in which the computing device 100 allocates a chunk in persistent memory. Subsequently, in block 612, the computing device 100 stores the portions of the properties and/or tags that will fit in the fixed-size node object. Additionally, in block 614, the computing device 100 stores the remainder of the properties and/or tags in the allocated chunk of persistent memory. The method 600 will next advance to block 616 in which the allocation status is updated since the operation was completed successfully. This is consistent with ACID principles since a failure would not result in inconsistent data, but would be reverted without an update to the allocation status.
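- The create-node flow of blocks 602-616 can be pictured with the following illustrative sketch (the inline capacity, field names, and serialization are assumptions, not the patent's implementation); the point being demonstrated is that the allocation status is written only after all property data is in place, so a failure beforehand leaves at worst an unreferenced object rather than inconsistent data:

```cpp
// Sketch of the create-node flow of blocks 602-616 (illustrative; the inline
// capacity, field names, and serialization are assumptions). The allocation
// status is flipped only after all data is in place.
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

constexpr std::size_t kInlineCapacity = 48;    // spare bytes inside the node object (assumed)

struct NewNodeRequest {
    std::string tag;
    std::string encoded_properties;            // tag and properties already serialized
};

struct PersistentNode {
    std::string inline_data;                   // stand-in for the fixed-size object
    std::optional<std::size_t> overflow_chunk; // index of the allocated chunk, if any
    bool allocated = false;                    // "allocation status", written last
};

PersistentNode create_node(const NewNodeRequest& req,
                           std::vector<std::string>& chunk_area) {
    PersistentNode node;                                        // block 604: new fixed-size object
    const std::string payload = req.tag + '\0' + req.encoded_properties;
    if (payload.size() <= kInlineCapacity) {                    // blocks 606/608: it fits
        node.inline_data = payload;
    } else {                                                    // blocks 610-614: spill over
        node.inline_data = payload.substr(0, kInlineCapacity);  // the part that fits stays inline
        chunk_area.push_back(payload.substr(kInlineCapacity));  // remainder goes to a new chunk
        node.overflow_chunk = chunk_area.size() - 1;
    }
    node.allocated = true;                                      // block 616: publish last
    return node;
}

int main() {
    std::vector<std::string> chunks;
    PersistentNode n = create_node({"email", "id=7482;subject=invoice payment"}, chunks);
    return n.allocated ? 0 : 1;
}
```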
- the computing device 100 may also execute a method 700 for performing a search on the graph data store 404 .
- the method 700 begins with block 702 in which the computing device 100 determines whether a search query has been received.
- the search query may be received via the API 402 from a requesting computing device, service, program, and/or other source. If a search request query has been received, the method 700 advances to block 704 in which the computing device 100 evaluates the search query. For example, the computing device 100 may parse the search query to prepare for searching of the graph data store 404 . Subsequently, in block 706 , the computing device 100 traverses the graph data store 404 to find data responsive to the search query.
- In response to locating data responsive to the search query, the computing device 100 generates and returns an iterator object, stored in volatile memory 110, that includes a reference to the first item in the graph data store 404 in persistent memory 112 that matches the query. As such, in block 710, the computing device 100 may receive a user request for additional graph data that matches the query. If the user requests this additional graph data, the method 700 advances to block 712 in which the computing device 100 fetches the graph data associated with the next reference in the iterator object. Subsequently, in block 714, the computing device 100 determines whether the last data responsive to the query has been reached.
- If not, the method loops back to block 710 in which the computing device 100 determines whether another user request for graph data associated with the next pointer in the iterator object has been received. If, however, the last data responsive to the query has been reached, the method 700 advances to block 716.
- any one or more of the methods described herein may be embodied as various instructions stored on a computer-readable media, which may be executed by the processor 102 , a peripheral device 116 , and/or other components of the computing device 100 to cause the computing device 100 to perform the corresponding method.
- the computer-readable media may be embodied as any type of media capable of being read by the computing device 100 including, but not limited to, the memory 110 , 112 , the external storage 114 , a local memory or cache 106 of the processor 102 , other memory or data storage devices of the computing device 100 , portable media readable by a peripheral device 116 of the computing device 100 , and/or other media.
- An embodiment of the technologies disclosed herein may include any one or more, and any combination of, the examples described below.
- Example 1 includes a computing device comprising at least one processor; at least one memory controller to access a volatile memory device and a persistent memory device on a memory bus, the persistent memory device having stored therein a graph data store including a plurality of nodes relationally arranged with a plurality of edges, each of the plurality of edges defining a relationship between at least two of the plurality of nodes, the volatile memory device having stored therein a plurality of instructions that, when executed by the processor, cause the processor to, in response to an operation on the graph data store, partition data between the volatile memory device and the persistent memory device to minimize writes on the persistent memory device.
- Example 2 includes the subject matter of Example 1, and wherein at least a portion of the nodes are associated with at least one tag representing a classification of the node.
- Example 3 includes the subject matter of any of Example 1 or 2, and wherein at least a portion of the edges are associated with at least one tag representing a classification of the edge.
- Example 4 includes the subject matter of any of Examples 1-3, and wherein the graph data store includes a tag sorted edge set to collate edges and associated nodes with identical tags.
- Example 5 includes the subject matter of any of Examples 1-4, and wherein at least a portion of the nodes are associated with at least one property in the form of a key-value pair.
- Example 6 includes the subject matter of any of Examples 1-5, and wherein at least a portion of the edges are associated with at least one property in the form of a key-value pair.
- Example 7 includes the subject matter of any of Examples 1-6, and wherein the plurality of instructions further cause the processor to organize the nodes and/or edges of the graph data store in the persistent memory device as fixed size objects.
- Example 8 includes the subject matter of any of Examples 1-7, and wherein the plurality of instructions further cause the processor to store at least one property and/or tag associated with a node and/or edge in-line in the fixed-size object representing the node and/or edge.
- Example 9 includes the subject matter of any of Examples 1-8, and wherein the plurality of instructions further cause the processor to allocate, in response to the property and/or tag associated with the node and/or edge exceeding the size of the fixed-size object, a chunk of the persistent memory device separate from the fixed-size object.
- Example 10 includes the subject matter of any of Examples 1-9, and wherein the plurality of instructions further cause the processor, in response to a search request query, to generate an iterator object stored on the volatile memory device that includes a reference to one or more nodes and/or edges in the graph data store on the persistent memory device.
- Example 11 includes the subject matter of any of Examples 1-10, and wherein the plurality of instructions further cause the processor to advance the iterator object to directly access nodes and/or edges of the graph data store in response to a request for an additional match to the search query.
- Example 12 includes the subject matter of any of Examples 1-11, and wherein the processor stores an allocator on the volatile memory device, the allocator comprising one or more memory addresses of the graph data store in the persistent memory device.
- Example 13 includes the subject matter of any of Examples 1-12, and wherein the processor stores a portion of a transaction object on volatile memory and a portion of the transaction object on persistent memory such that writes to persistent memory are minimized while still maintaining the atomicity, consistency, isolation, durability (“ACID”) properties of the graph data store.
- Example 14 includes a method for managing a graph data store on a persistent memory device.
- the method includes storing, on a persistent memory device, a graph data store comprising a plurality of nodes and a plurality of edges, each of the plurality of edges defining a relationship between at least two of the plurality of nodes; and managing an operation on the graph data store by storing a first portion of resulting data on a volatile memory device and a second portion of the resulting data on the persistent memory device to minimize writes on the persistent memory device.
- Example 15 includes the subject matter of Example 14, further including allocating, by a computing device, a fixed size object on a persistent memory device to each of the plurality of nodes and edges.
- Example 16 includes the subject matter of any of Example 14 or 15, and further including evaluating, by a computing device, a search request query on the graph data store; and generating, by a computing device, an iterator object including a reference to one or more nodes and/or edges in the graph data store in response to the search request query, wherein the iterator object is stored on a volatile memory device.
- Example 17 includes the subject matter of any of Examples 14-16, and wherein the computing device manages the operation by partitioning the first portion and the second portion of resulting data to minimize writes to the persistent memory device.
- Example 18 includes the subject matter of any of Examples 14-17, and further including storing at least one property and/or tag associated with a node and/or edge in-line in a fixed-size object.
- Example 19 includes the subject matter of any of Examples 14-18, and wherein responsive to the property and/or tag associated with the node and/or edge exceeding the size of the fixed-size object, allocating a chunk of the persistent memory device separate from the fixed-size object.
- Example 20 includes the subject matter of any of Examples 14-19, and wherein responsive to a search request query on the graph data store, further comprising storing an iterator that is an output to the search request query in a volatile memory device, the iterator including a reference to one or more nodes and/or edges in the graph data store on the persistent memory device.
- Example 21 includes the subject matter of any of Examples 14-20, and further including advancing the iterator to directly access nodes and/or edges of the graph data store in response to a request for an additional match to the search query.
- Example 22 includes the subject matter of any of Examples 14-21, and wherein at least a portion of the nodes are associated with at least one tag representing a classification of the node.
- Example 23 includes the subject matter of any of Examples 14-22, and wherein at least a portion of the edges are associated with at least one tag representing a classification of the edge.
- Example 24 includes the subject matter of any of Examples 14-23, and wherein the graph data store includes a tag sorted edge set to collate edges with identical tags to allow efficient iteration over related edges.
- Example 25 includes the subject matter of any of Examples 14-24, and, wherein at least a portion of the nodes are associated with at least one property in the form of a key-value pair.
- Example 26 includes the subject matter of any of Examples 14-25, and wherein at least a portion of the edges are associated with at least one property in the form of a key-value pair.
- Example 27 includes one or more machine readable storage media comprising a plurality of instructions stored thereon that in response to being executed result in a computing device performing the method of any of Examples 14-26.
- Example 28 includes a computing device comprising means for storing, on a persistent memory device, a graph data store comprising a plurality of nodes and a plurality of edges, each of the plurality of edges defining a relationship between at least two of the plurality of nodes; and means for managing an operation on the graph data store by storing a first portion of resulting data on a volatile memory device and a second portion of the resulting data on the persistent memory device to minimize writes on the persistent memory device.
- Example 29 includes the subject matter of Example 28, and further including means for allocating a fixed size object on a persistent memory device to each of the plurality of nodes and edges.
- Example 30 includes the subject matter of Examples 28 or 29, and further including means for evaluating a search request query on the graph data store; and means for generating an iterator object including a reference to one or more nodes and/or edges in the graph data store in response to the search request query, wherein the iterator object is stored on a volatile memory device.
- Example 31 includes the subject matter of any of Examples 28-30, and further including means for managing the operation by partitioning the first portion and the second portion of resulting data to minimize writes to the persistent memory device.
- Example 32 includes the subject matter of any of Examples 28-31, and further including means for storing at least one property and/or tag associated with a node and/or edge in-line in a fixed-size object.
- Example 33 includes the subject matter of any of Examples 28-32, and further including means for allocating, responsive to the property and/or tag associated with the node and/or edge exceeding the size of the fixed-size object, a chunk of the persistent memory device separate from the fixed-size object.
- Example 34 includes the subject matter of any of Examples 28-33, and further including means for storing, responsive to a search request query on the graph data store, an iterator that is an output to the search request query in a volatile memory device, the iterator including a reference to one or more nodes and/or edges in the graph data store on the persistent memory device.
- Example 35 includes the subject matter of any of Examples 28-34, and further including means for advancing the iterator to directly access nodes and/or edges of the graph data store in response to a request for an additional match to the search query.
- Example 36 includes the subject matter of any of Examples 28-35, and wherein at least a portion of the nodes are associated with at least one tag representing a classification of the node.
- Example 37 includes the subject matter of any of Examples 28-36, and wherein at least a portion of the edges are associated with at least one tag representing a classification of the edge.
- Example 38 includes the subject matter of any of Examples 28-37, and wherein the graph data store includes a tag sorted edge set to collate edges with identical tags to allow efficient iteration over related edges.
- Example 39 includes the subject matter of any of Examples 28-38, and wherein at least a portion of the nodes are associated with at least one property in the form of a key-value pair.
- Example 40 includes the subject matter of any of Examples 28-39, and wherein at least a portion of the edges are associated with at least one property in the form of a key-value pair.
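- The split of data between volatile and persistent memory recited in Examples 1, 12, and 13 can be illustrated with a small sketch (an assumed structure, not the claimed implementation): statistics that change on every allocation stay in DRAM, and only an occasional checkpoint touches a small header in persistent memory, keeping persistent-memory writes to a minimum.

```cpp
// Sketch of the volatile/persistent split from Examples 1, 12, and 13
// (illustrative; the structures and checkpoint interval are assumptions).
// Frequently updated allocation statistics live in DRAM; only an occasional
// checkpoint of them is written to a small header in persistent memory.
#include <cstdint>
#include <iostream>

struct AllocatorHeader {          // resides in persistent memory
    std::uint64_t checkpointed_used_bytes = 0;
    std::uint64_t checkpointed_allocations = 0;
};

struct AllocatorStats {           // resides in volatile memory, updated on every call
    std::uint64_t used_bytes = 0;
    std::uint64_t allocations = 0;
};

class Allocator {
public:
    explicit Allocator(AllocatorHeader& header) : header_(header) {
        stats_.used_bytes  = header.checkpointed_used_bytes;   // recover from checkpoint
        stats_.allocations = header.checkpointed_allocations;
    }
    void allocate(std::uint64_t bytes) {                       // hot path: DRAM only
        stats_.used_bytes += bytes;
        ++stats_.allocations;
        if (stats_.allocations % kCheckpointEvery == 0) checkpoint();
    }
    void checkpoint() {                                        // rare persistent-memory write
        header_.checkpointed_used_bytes  = stats_.used_bytes;
        header_.checkpointed_allocations = stats_.allocations;
        // a real system would flush these two fields to persistent memory here
    }
private:
    static constexpr std::uint64_t kCheckpointEvery = 1024;
    AllocatorHeader& header_;
    AllocatorStats   stats_;
};

int main() {
    AllocatorHeader header;                   // pretend this lives in persistent memory
    Allocator allocator(header);
    for (int i = 0; i < 5000; ++i) allocator.allocate(64);
    std::cout << "checkpointed allocations: " << header.checkpointed_allocations << "\n";
    return 0;
}
```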
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Human Computer Interaction (AREA)
Abstract
Description
- Queries for neighbors, friends-of-friends connections, paths between nodes, or other interesting patterns have grown tremendously important on today's ever-evolving datasets. Graph-based databases (or data stores) have the potential to bring the important ACID (Atomicity, Consistency, Isolation, Durability) properties associated with transactions to a data organization that treats relationships as a first-class concept. For example, unknown or non-obvious relationships between nodes can be identified.
- New persistent memory technologies, such as memristors and phase change memory, offer a byte-addressable interface and memory access latencies that are comparable to those of volatile memory, such as dynamic random-access memory (DRAM). These persistent memory technologies may have a profound influence on organized data storage due to the availability of faster persistent storage and larger main memories. However, none of the existing graph-based databases support a completely in-memory database model.
- The concepts described herein are illustrated by way of example and not by way of limitation in the accompanying figures. For simplicity and clarity of illustration, elements illustrated in the figures are not necessarily drawn to scale. Where considered appropriate, reference labels have been repeated among the figures to indicate corresponding or analogous elements.
- FIG. 1 is a simplified block diagram of at least one embodiment of a computing device for managing connected data;
- FIGS. 2 and 3 are simplified block diagrams of an illustrative graph data store that may be generated and/or managed by the computing device of FIG. 1;
- FIG. 4 is a simplified block diagram of at least one embodiment of an environment of the computing device of FIG. 1;
- FIG. 5 is a simplified flow diagram of at least one embodiment in which an iterator object includes pointers to a graph data store;
- FIG. 6 is a simplified flow diagram of at least one embodiment of a method for generating and/or modifying a graph data store that may be executed by the computing device of FIGS. 1 and 4; and
- FIG. 7 is a simplified flow diagram of at least one embodiment of a method for responding to a search request that may be executed by the computer system of FIGS. 1 and 4.
- While the concepts of the present disclosure are susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawings and will be described herein in detail. It should be understood, however, that there is no intent to limit the concepts of the present disclosure to the particular forms disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives consistent with the present disclosure and the appended claims.
- References in the specification to “one embodiment,” “an embodiment,” “an illustrative embodiment,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may or may not necessarily include that particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described. Additionally, it should be appreciated that items included in a list in the form of “at least one A, B, and C” can mean (A); (B); (C); (A and B); (A and C); (B and C); or (A, B, and C). Similarly, items listed in the form of “at least one of A, B, or C” can mean (A); (B); (C); (A and B); (A and C); (B and C); or (A, B, and C).
- The disclosed embodiments may be implemented, in some cases, in hardware, firmware, software, or any combination thereof. The disclosed embodiments may also be implemented as instructions carried by or stored on a transitory or non-transitory machine-readable (e.g., computer-readable) storage medium, which may be read and executed by one or more processors. A machine-readable storage medium may be embodied as any storage device, mechanism, or other physical structure for storing or transmitting information in a form readable by a machine (e.g., a volatile or non-volatile memory, a media disc, or other media device).
- In the drawings, some structural or method features may be shown in specific arrangements and/or orderings. However, it should be appreciated that such specific arrangements and/or orderings may not be required. Rather, in some embodiments, such features may be arranged in a different manner and/or order than shown in the illustrative figures. Additionally, the inclusion of a structural or method feature in a particular figure is not meant to imply that such feature is required in all embodiments and, in some embodiments, may not be included or may be combined with other features.
- Referring now to FIG. 1, an illustrative computing device 100 for managing connected data, such as a graph data store, includes at least one processor 102, an I/O subsystem 104, at least one on-die cache 106, and a memory controller 108 to control a volatile memory 110 and a persistent memory 112. In use, as described below, an entire graph data store is stored in the persistent memory 112. The graph data store includes a plurality of nodes and edges that define relationships between the nodes. In some embodiments, the nodes and/or edges each include a unique identifier. The computing device 100 is configured to add, delete, and/or read nodes and/or edges (and other data) in the graph data store. The computing device 100 is also configured to generate an iterator object in response to a search request query. The iterator object includes pointers (or references) to nodes and/or edges in the graph data store that are responsive to the search request. The iterator object is stored in the volatile memory 110 and accesses the graph data store in the persistent memory 112 based on the pointers of the iterator object. The terms "pointers" and "references" are broadly intended to encompass any reference to a value in memory.
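- As a rough, assumption-laden illustration of this arrangement (not the patent's code), the following program memory-maps a file that would reside on a DAX-capable persistent-memory device and places a graph root structure at the start of the region, so that nodes and edges can later be reached with ordinary loads and stores and no serialization step; the path, magic value, and field names are hypothetical.

```cpp
// Minimal sketch (not the patent's implementation): the whole graph lives in a
// byte-addressable persistent-memory region, simulated here by memory-mapping a
// file assumed to sit on a DAX-mounted device. Offsets rather than raw pointers
// are stored so the mapping address may change across restarts.
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

struct GraphRoot {                 // first bytes of the persistent region
    std::uint64_t magic;           // marks an initialized graph data store
    std::uint64_t node_table_off;  // offset of the node table within the region
    std::uint64_t edge_table_off;  // offset of the edge table within the region
    std::uint64_t node_count;
    std::uint64_t edge_count;
};

int main() {
    const char*         path   = "/mnt/pmem/graph.img";  // hypothetical DAX-mounted file
    const std::size_t   size   = 1ull << 30;             // 1 GiB region for the example
    const std::uint64_t kMagic = 0x47524150484442ull;

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) { std::perror("open"); return 1; }
    if (ftruncate(fd, static_cast<off_t>(size)) != 0) { std::perror("ftruncate"); return 1; }

    // Real persistent memory would be mapped with MAP_SHARED_VALIDATE | MAP_SYNC;
    // plain MAP_SHARED keeps the sketch portable.
    void* base = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) { std::perror("mmap"); return 1; }

    auto* root = static_cast<GraphRoot*>(base);
    if (root->magic != kMagic) {                      // first use: lay out the store
        root->node_table_off = sizeof(GraphRoot);
        root->edge_table_off = sizeof(GraphRoot) + (1ull << 20);
        root->node_count = root->edge_count = 0;
        root->magic = kMagic;                         // published last
        msync(base, sizeof(GraphRoot), MS_SYNC);      // stand-in for a persist barrier
    }
    std::printf("graph store: %llu nodes, %llu edges\n",
                static_cast<unsigned long long>(root->node_count),
                static_cast<unsigned long long>(root->edge_count));
    munmap(base, size);
    close(fd);
    return 0;
}
```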
- The computing device 100 may be embodied as any type of device capable of performing the functions described herein. For example, the computing device 100 may be embodied as, without limitation, a computer, a workstation, a server computer, a laptop computer, a notebook computer, a tablet computer, a smartphone, a mobile computing device, a desktop computer, a distributed computing system, a multiprocessor system, a consumer electronic device, a smart appliance, and/or any other computing device capable of executing software code segments. As shown in FIG. 1, the illustrative computing device 100 includes the processor 102, the I/O subsystem 104, the on-die cache 106, and the memory controller 108 to control volatile memory 110 and persistent memory 112. Of course, the computing device 100 may include other or additional components, such as those commonly found in a workstation (e.g., various input/output devices), in other embodiments. For example, the computing device 100 may include an external storage 114, peripherals 116, and/or a network adapter 118. Additionally, in some embodiments, one or more of the illustrative components may be incorporated in, or otherwise form a portion of, another component. For example, the memory 110, 112, or portions thereof, may be incorporated in the processor 102 in some embodiments.
- The processor 102 may be embodied as any type of processor capable of performing the functions described herein. For example, the processor may be embodied as a single or multi-core processor(s), digital signal processor, microcontroller, or other processor or processing/controlling circuit. The volatile memory 110 and persistent memory 112 may be embodied as any type of volatile memory and persistent memory, respectively, capable of performing the functions described herein. Volatile memory 110 contrasts with persistent memory 112 in that the persistent memory 112 does not lose content when power is lost. In operation, the volatile memory 110 and persistent memory 112 may store various data and software used during operation of the computing device 100 such as operating systems, applications, programs, libraries, and drivers. The memory 110, 112 is communicatively coupled to the processor 102 via the memory bus using memory control(s) 108, which may be embodied as circuitry and/or components to facilitate input/output operations with the processor 102, the memory 110, 112, and other components of the computing device 100.
- The I/O subsystem 104 may be embodied as, or otherwise include, memory controller hubs, input/output control hubs, firmware devices, communication links (i.e., point-to-point links, bus links, wires, cables, light guides, printed circuit board traces, etc.) and/or other components and subsystems to facilitate the input/output operations. In some embodiments, the I/O subsystem 104 may form a portion of a system-on-a-chip (SoC) and be incorporated, along with the processor 102, the memory 110, 112, and other components of the computing device 100, on a single integrated circuit chip.
- An external storage device 114 may be coupled to the processor 102 with the I/O subsystem 104. The external storage device 114 may be embodied as any type of device or devices configured for short-term or long-term storage of data such as, for example, memory devices and circuits, memory cards, hard disk drives, solid-state drives, or other data storage devices. Unlike existing systems, however, one or more embodiments contemplate that computing device 100 would not include any external storage 114 and that a graph database and all other data needed by computing device 100 would be stored on the persistent memory 112 on the memory bus instead of the external storage 114.
- The computing device 100 may also include peripherals 116. The peripherals 116 may include any number of additional input/output devices, interface devices, and/or other peripheral devices. By way of example only, the peripheral 116 may include a display that could be embodied as any type of display capable of displaying digital information such as a liquid crystal display (LCD), a light emitting diode (LED), a plasma display, a cathode ray tube (CRT), or other type of display device.
- The computing device 100 illustratively includes a network adapter 118, which may be embodied as any communication circuit, device, or collection thereof, capable of enabling communications between the computing device 100 and other remote devices over a computer network (not shown). The network adapter 118 may be configured to use any one or more communication technology (e.g., wired or wireless communications) and associated protocols (e.g., Ethernet, Bluetooth®, Wi-Fi®, WiMAX, etc.) to effect such communication.
- Referring to FIG. 2, an illustrative embodiment is shown with an example graph data store 200, which may be stored in persistent memory 112. The graph data store 200 includes a plurality of nodes (also called vertices) that are connected with a plurality of edges that establish relationships between the nodes. The graph data store 200 is shown for purposes of example only. Although the graph data store 200 shown includes five nodes and four edges for purposes of example, one skilled in the art should understand that more or fewer nodes and/or edges could be used depending on the data set. In some circumstances, for example, millions of nodes and/or edges (or more) could be provided in a graph data store.
- As shown, the example graph data store 200 includes a first node 202, a second node 204, a third node 206, a fourth node 208, and a fifth node 210 and a first edge 212, a second edge 214, a third edge 216, and a fourth edge 218. Each node and edge of the graph data store 200 has an associated tag that can be used for classification. For example, the classification could identify entity types, document types, or other attribute types of a node and/or edge. In the example shown, the first and fourth nodes 202, 208 are associated with the tag "person," the second node 204 with the tag "email," and the third and fifth nodes 206, 210 with the tag "keyword." Likewise, the first edge 212 is associated with the tag "sent," the second and fourth edges 214, 218 with the tag "keyword," and the third edge 216 with the tag "person." In cases where usage does not require a tag, a default tag could be used. In some embodiments, the tag may be a short string.
- In addition to and distinct from tags, one or more properties may be associated with the nodes and/or edges of the graph data store 200. A property is illustratively embodied as a key-value pair, in which the key is a short string and the value is one of a plurality of pre-defined types. By way of example only, the pre-defined property types may include Booleans, integers, floats, strings, times, and/or blobs (i.e., arbitrary strings of bits). In some embodiments, all pre-defined types, with the exception of blobs, are orderable. In the example shown, the first node 202 is associated with the property key "address" and has a value of "brad.jones@live.com." As shown, the second node 204 is associated with two properties. The first property has a key of "id" and a value of "7482," and the second property has a key of "subject" and a value of "invoice payment." Likewise, the edges of the graph data store 200 may include one or more properties. For example, the first edge 212 includes a property key of "sent" and a value of "23/12/2001." By way of another example, the third edge 216 includes the property key "received" and a value of "23/12/2001."
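- The tag and property model just described can be sketched as follows (an illustrative, volatile-memory approximation with invented type names; the persistent layout the patent actually describes is discussed with FIG. 3 below):

```cpp
// Sketch of the tag/property model described above (illustrative only): every
// node and edge carries a short-string tag for classification plus key-value
// properties whose values are one of a small set of pre-defined types. The type
// names here are invented; the patent does not prescribe this exact layout.
#include <chrono>
#include <cstdint>
#include <iostream>
#include <string>
#include <unordered_map>
#include <variant>
#include <vector>

using PropertyValue = std::variant<bool, std::int64_t, double, std::string,
                                   std::chrono::system_clock::time_point,
                                   std::vector<std::uint8_t>>;   // blob

struct Node {
    std::uint64_t id;
    std::string   tag;                                           // e.g. "person", "email"
    std::unordered_map<std::string, PropertyValue> properties;   // key-value pairs
    std::vector<std::uint64_t> out_edges, in_edges;              // edge identifiers
};

struct Edge {
    std::uint64_t id, src, dst;                                  // directed: src -> dst
    std::string   tag;                                           // e.g. "sent", "keyword"
    std::unordered_map<std::string, PropertyValue> properties;
};

int main() {
    Node brad{202, "person", {{"address", std::string("brad.jones@live.com")}}, {}, {}};
    Node mail{204, "email",
              {{"id", std::int64_t{7482}}, {"subject", std::string("invoice payment")}},
              {}, {}};
    Edge sent{212, brad.id, mail.id, "sent", {{"sent", std::string("23/12/2001")}}};
    brad.out_edges.push_back(sent.id);
    mail.in_edges.push_back(sent.id);

    std::cout << "node " << brad.id << " (" << brad.tag << ") --" << sent.tag
              << "--> node " << mail.id << " (" << mail.tag << ")\n";
    return 0;
}
```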
- FIG. 3 illustratively shows an example layout of the graph data store 200 in persistent memory 112. In this example, the tags and properties associated with nodes 202, 204, 206, 208 are stored in node objects within the node table. The nodes 202, 204, 206, 208 include an in and/or out edge list that points to an edge list 300; however, in some embodiments, the edges could be undirected. For example, node 202 includes an out edge list pointer to an edge list 302 associated with edge 212. By way of another example, node 208 includes an in edge list pointer to an edge list 304 associated with edge 216. As shown, node 204 includes both an in edge list pointer and an out edge list pointer. The in edge list pointer points to an edge list 306 associated with edge 212. The out edge list pointer points to an edge list 308 associated with edge 216. Node 206, in the example shown, includes an in edge list pointer that points to an edge list 310 associated with edge 214. Additionally, node 210 includes an in edge list pointer that points to an edge list 312 associated with edge 218. The edge lists 302, 304, 306, 308, 310, 312 include pointers to edge tags. In the example shown, edge lists 310, 312 include a pointer to the edge tag "keyword." Edge lists 304, 308 include pointers to the edge tag "to." Additionally, edge lists 302, 306 include pointers to the edge tag "sent." In some embodiments, an intermediate data structure, called a tag-sorted edgeset 314, could be used to speed up neighbor lookups in particular and graph traversal in general. Tag-sorted edgesets collate edge information with identical tags to allow efficient iteration over related edges. In addition, the edge-sets may contain source/destination node information to allow efficient accessing of neighbor nodes.
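- A simplified sketch of the tag-sorted edgeset idea (illustrative only; the container choice and identifiers are assumptions loosely modeled on the FIG. 2 example) shows why grouping edges by tag makes neighbor lookups cheap: a traversal reads one tag group instead of scanning every incident edge.

```cpp
// Sketch of a tag-sorted edgeset (illustrative; std::map stands in for the
// contiguous, tag-sorted layout the text describes, and the identifiers are
// loosely modeled on the FIG. 2 example).
#include <cstdint>
#include <iostream>
#include <map>
#include <string>
#include <vector>

struct EdgeRef {
    std::uint64_t edge_id;
    std::uint64_t neighbor;   // node on the far side of the edge
};

// Edges incident to one node, grouped by tag so related edges sit together.
using TagSortedEdgeSet = std::map<std::string, std::vector<EdgeRef>>;

std::vector<std::uint64_t> neighbors_by_tag(const TagSortedEdgeSet& edges,
                                            const std::string& tag) {
    std::vector<std::uint64_t> out;
    auto it = edges.find(tag);                 // one group lookup instead of a full scan
    if (it != edges.end())
        for (const EdgeRef& e : it->second) out.push_back(e.neighbor);
    return out;
}

int main() {
    TagSortedEdgeSet email_node_edges;                 // hypothetical edgeset for node 204
    email_node_edges["to"].push_back({216, 208});
    email_node_edges["keyword"].push_back({214, 206});
    email_node_edges["keyword"].push_back({218, 210});

    for (std::uint64_t n : neighbors_by_tag(email_node_edges, "keyword"))
        std::cout << "keyword neighbor: node " << n << "\n";
    return 0;
}
```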
- Referring now to FIG. 4, in the illustrative embodiment, the computing device 100 establishes an environment 400 during operation. The illustrative environment 400 includes an API 402, a graph data store 404, a transaction management module 406, a transaction log 408, and an allocator module 410. The graph data store 404 illustratively includes a plurality of nodes 412, a plurality of edges 414, tag-based and/or property-based indices 416, a string table 418, and a plurality of properties 420 associated with the nodes 412 and/or edges 414. The various modules of the environment 400 may be embodied as hardware, firmware, software, or a combination thereof. As such, in some embodiments, one or more of the modules of the environment 400 may be embodied as circuitry or a collection of electrical devices (e.g., transaction management circuitry 406 and/or allocator circuitry 410). It should be appreciated that, in such embodiments, one or more of the transaction management circuitry 406 and/or the allocator circuitry 410 may form a portion of one or more of the processor 102, the I/O subsystem 104, the memory, the external storage 114, the network adapter 118, and/or other components of the computing device 100. Additionally, in some embodiments, one or more of the illustrative modules may form a portion of another module and/or one or more of the illustrative modules may be independent of one another. - As shown, the
environment 400 includes an API 402 through which programs may interact with the graph data store 404. For example, the API 402 may be used as an interface by a program to add (or create), read, remove (or delete), and/or modify the nodes 412, edges 414, indices 416, string table 418, and/or properties 420 of the graph data store 404. The transaction management module 406 manages transactions with the graph data store 404 and updates the transaction log 408 so that the state of the data store is consistent at transaction boundaries and can be recovered if a failure occurs within a transaction.
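Because the API 402 is not specified in detail here, the following is only a hypothetical C++ interface sketch; the class and method names (GraphStoreApi, add_node, begin_transaction, and so on) are assumptions meant to illustrate create/read/modify/remove operations bracketed by transaction boundaries.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical interface sketch; names are assumptions, not the actual API 402.
using NodeId = std::uint64_t;
using EdgeId = std::uint64_t;

class GraphStoreApi {
public:
    virtual ~GraphStoreApi() = default;

    // Entity management: add, read, modify, and remove nodes and edges.
    virtual NodeId add_node(const std::string& tag) = 0;
    virtual EdgeId add_edge(NodeId from, NodeId to, const std::string& tag) = 0;
    virtual void   set_property(NodeId n, const std::string& key,
                                const std::string& value) = 0;
    virtual std::optional<std::string> get_property(NodeId n,
                                                    const std::string& key) = 0;
    virtual void   remove_node(NodeId n) = 0;

    // Transaction boundaries: the transaction log keeps the store consistent
    // at these boundaries and allows recovery if a failure occurs mid-transaction.
    virtual void begin_transaction() = 0;
    virtual void commit() = 0;
    virtual void abort() = 0;
};
```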
- The allocator module 410 is configured to manage allocations or partitions in the persistent memory 112 for the various entities (e.g., nodes, edges, properties). The allocator module 410 chooses data structure sizes and layouts that are cache-efficient, organizes data and logs to be streaming- and prefetch-friendly, and avoids unnecessary writes to persistent memory 112 because its write bandwidth is lower than that of volatile memory 110. In some embodiments, the allocator module 410 stores the nodes 412 and edges 414 in one or more tables of fixed-size objects. To maximize storage utilization inside each node or edge element, the properties of these entities are stored in-line in a best-fit manner. For properties that exceed the amount of space available within a node or edge object, the allocator module 410 allocates separate chunks that are also filled in a best-fit manner. Despite the space-efficient layout, these properties are accessible directly, without the need to "deserialize" them into an accessible format, as is the case with other compact data storage options or disk-based storage options.
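As a rough illustration of best-fit placement (not the actual allocator module 410), the sketch below picks, among the free slots of an object or chunk, the smallest slot that still holds an encoded property; the FreeSlot type and best_fit signature are assumptions made for this sketch.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical free-slot record inside a node/edge object or overflow chunk.
struct FreeSlot {
    std::size_t offset;
    std::size_t length;
};

// Best-fit placement: among all free slots large enough for the encoded
// property, choose the one that leaves the least unused space. Returns the
// index of the chosen slot, or the maximum std::size_t value if nothing fits
// (in which case the allocator would fall back to a separately allocated chunk).
std::size_t best_fit(const std::vector<FreeSlot>& slots, std::size_t needed) {
    std::size_t best       = std::numeric_limits<std::size_t>::max();
    std::size_t best_waste = std::numeric_limits<std::size_t>::max();
    for (std::size_t i = 0; i < slots.size(); ++i) {
        if (slots[i].length >= needed && slots[i].length - needed < best_waste) {
            best       = i;
            best_waste = slots[i].length - needed;
        }
    }
    return best;
}
```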
- Based on current projections, the persistent memory 112 will be slower than volatile memory 110 for reads and writes, and will have limited write endurance, meaning that the probability of failure increases after some large number of writes. In one embodiment, data structures such as iterator objects, allocator objects, and transaction objects can be split between volatile memory 110 and persistent memory 112 to optimize wear and access times. For example, the allocator module 410 could include both statistics and the actual persistent memory areas that it manages. Because the statistics are updated frequently (with each allocation or de-allocation) and are used primarily internally rather than as part of the user data, they can be stored in volatile memory 110, and, if required, a checkpoint of this data could be stored in the allocator header in persistent memory 112.
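A minimal sketch of that split, assuming hypothetical structure names, is shown below: per-operation statistics live in volatile memory, while a compact persistent header receives only an occasional checkpoint, so the persistent device absorbs far fewer writes.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical split of allocator state: hot, frequently written statistics
// live in volatile memory; a small header (with an optional checkpoint of
// those statistics) lives in persistent memory to limit persistent writes.
struct AllocatorStatsVolatile {        // updated on every alloc/free (DRAM)
    std::uint64_t allocations   = 0;
    std::uint64_t deallocations = 0;
    std::size_t   bytes_in_use  = 0;
};

struct AllocatorHeaderPersistent {     // written rarely (persistent memory)
    std::uint64_t region_size       = 0;
    std::uint64_t checkpoint_allocs = 0;  // last checkpointed statistics
    std::uint64_t checkpoint_bytes  = 0;
};

// Checkpoint only occasionally (e.g. every N operations) rather than on every
// allocation, trading a little statistics staleness for fewer persistent writes.
void maybe_checkpoint(const AllocatorStatsVolatile& stats,
                      AllocatorHeaderPersistent& header,
                      std::uint64_t checkpoint_interval = 1024) {
    if (stats.allocations % checkpoint_interval == 0) {
        header.checkpoint_allocs = stats.allocations;
        header.checkpoint_bytes  = stats.bytes_in_use;
        // In a real system, a persistent flush/fence would follow here.
    }
}
```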
- In some embodiments, the API 402 provides an interface through which the graph data store 404 can be searched. In the illustrative embodiment, the API 402 is configured to return an iterator object in response to a search request on the graph data store 404. An illustrative iterator object 500 is shown in FIG. 5. The iterator object 500 includes a reference to data in the graph data store 404. For example, the iterator object 500 could include a first pointer 502 that points to the first node in the graph data store 404 that matches the search query. If the user requests additional data that satisfies the query, the API 402 could be used to advance the iterator object 500 to the next node matching the query. As shown, a second pointer 504 illustrates the iterator object 500 having been advanced by the API 402 to the next pointer, which points to the next node matching the query. A third pointer 506 illustrates the iterator object 500 having been advanced, yet again, to the next node matching the query. This process continues, with the iterator object 500 advancing to the next pointer as the user requests additional matches in the graph data store 404 using the API 402. The iterator object 500 is stored in volatile memory 110 and maintains its volatile state, such as the next pointer, for speed of access, but points to the actual graph data (such as nodes or properties) of the graph data store 404 in persistent memory 112, which is accessed as the iterator object 500 progresses. Although FIG. 5 illustratively shows pointers 502 of the iterator object 500 pointing to data in a node table 508, the pointers could point to any data in the graph data store 404 in other embodiments. As shown, a program searching the graph data store 404 through the API 402 is returned an iterator object 500 that can be examined by the program one element at a time (next pointer to next pointer). In that sense, queries are evaluated lazily, following the "next" pointers only when requested by the user, with each request causing a data fetch. Because the entire graph data store 404 is stored in directly accessible persistent memory 112, serializing and de-serializing data can be avoided, unlike existing graph stores, which use disks to make data persistent, and unlike managed languages, which do not provide a way to access data from buffer caches without creating temporary objects that reside in memory.
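The following simplified C++ sketch (all names assumed, with an ordinary vector standing in for the persistent node table) illustrates the lazy-evaluation idea: the iterator's position is volatile state, and graph data is touched only when the caller asks for the next match.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Node {
    std::string tag;   // e.g. "person", "email", "keyword"
};

// Hypothetical lazy iterator: the iterator state (current position, predicate)
// is volatile; the node table it points into stands in for graph data that
// would reside in persistent memory and is touched only on demand.
class TagMatchIterator {
public:
    TagMatchIterator(const std::vector<Node>& node_table, std::string tag)
        : table_(node_table), tag_(std::move(tag)) {}

    // Advance to the next matching node and return it, or nullptr when done.
    // Nothing is copied or deserialized; the caller gets a direct reference.
    const Node* next() {
        while (pos_ < table_.size()) {
            const Node& n = table_[pos_++];
            if (n.tag == tag_) return &n;
        }
        return nullptr;
    }

private:
    const std::vector<Node>& table_;  // stands in for the persistent node table
    std::string tag_;
    std::size_t pos_ = 0;             // volatile "next" state
};

int main() {
    std::vector<Node> table = {{"person"}, {"email"}, {"person"}, {"keyword"}};
    TagMatchIterator it(table, "person");
    int matches = 0;
    while (it.next() != nullptr) ++matches;   // evaluated lazily, on request
    return matches == 2 ? 0 : 1;
}
```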
- Referring now to FIG. 6, in use, the computing device 100 may execute a method 600 to create a new entity, such as a node and/or edge, in the graph data store 404. Although the method 600 is directed to the creation of a new node, the method 600 may also be used to create, delete, and/or update other entities in the graph data store 404. In the example shown, the node being created is associated with properties; however, nodes and edges need not necessarily be associated with properties and/or tags. In the example embodiment shown, the method 600 begins with block 602, in which the computing device 100 determines whether a request has been received, such as through the API 402, to create a new node. If a new request is received, the method 600 advances to block 604, in which the computing device 100 creates a new node object of a fixed size in persistent memory 112. Subsequently, in block 606, the computing device 100 evaluates the size of the properties and/or tags (and any other data) associated with the new node to determine whether this data will fit within the fixed size of the node object. If the size of the properties and/or tags associated with the new node will fit within the new node object, the method 600 advances to block 608, in which the properties and/or tags are stored within the new node object. Alternatively, if the size of the properties and/or tags associated with the new node exceeds the size of the node object, the method 600 advances to block 610, in which the computing device 100 allocates a chunk in persistent memory. Subsequently, in block 612, the computing device 100 stores the portions of the properties and/or tags that will fit in the fixed-size node object. Additionally, in block 614, the computing device 100 stores the remainder of the properties and/or tags in the allocated chunk of persistent memory. The method 600 then advances to block 616, in which the allocation status is updated because the operation completed successfully. This is consistent with ACID principles: a failure would not result in inconsistent data but would be reverted, without an update to the allocation status.
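One way to express the flow of blocks 604 through 616, again as an illustrative sketch rather than the claimed implementation, is shown below; note that the allocation status is updated only after the node object and any overflow chunk have been written, so an interrupted operation is simply not reflected in the store.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sizes and records used only for this sketch.
constexpr std::size_t kNodeObjectBytes = 64;   // fixed-size node object

struct NewNodeRequest {
    std::vector<std::uint8_t> tags_and_properties;  // encoded tag/property bytes
};

struct PersistentStore {
    std::vector<std::vector<std::uint8_t>> node_objects;     // fixed-size objects
    std::vector<std::vector<std::uint8_t>> overflow_chunks;  // separately allocated
    std::size_t committed_nodes = 0;   // "allocation status", updated last
};

void create_node(PersistentStore& store, const NewNodeRequest& req) {
    // Block 604: create a new fixed-size node object.
    std::vector<std::uint8_t> node(kNodeObjectBytes, 0);

    // Block 606: does the associated data fit within the fixed-size object?
    if (req.tags_and_properties.size() <= kNodeObjectBytes) {
        // Block 608: store tags/properties entirely within the node object.
        std::copy(req.tags_and_properties.begin(), req.tags_and_properties.end(),
                  node.begin());
    } else {
        // Blocks 610-614: store what fits inline, spill the rest to a chunk.
        std::copy(req.tags_and_properties.begin(),
                  req.tags_and_properties.begin() + kNodeObjectBytes, node.begin());
        store.overflow_chunks.emplace_back(
            req.tags_and_properties.begin() + kNodeObjectBytes,
            req.tags_and_properties.end());
    }
    store.node_objects.push_back(std::move(node));

    // Block 616: only now update the allocation status; a failure before this
    // point leaves the previous state intact rather than a half-written node.
    store.committed_nodes += 1;
}
```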
- Referring now to FIG. 7, in use, the computing device 100 may also execute a method 700 for performing a search on the graph data store 404. The method 700 begins with block 702, in which the computing device 100 determines whether a search query has been received. The search query may be received via the API 402 from a requesting computing device, service, program, and/or other source. If a search request query has been received, the method 700 advances to block 704, in which the computing device 100 evaluates the search query. For example, the computing device 100 may parse the search query to prepare for searching the graph data store 404. Subsequently, in block 706, the computing device 100 traverses the graph data store 404 to find data responsive to the search query. In block 708, in response to locating data responsive to the search query, the computing device 100 generates and returns an iterator object, stored in volatile memory 110, that includes a reference to the first item in the graph data store 404 in persistent memory 112 that matches the query. As such, in block 710, the computing device 100 may receive a user request for additional graph data that matches the query. If the user requests this additional graph data, the method 700 advances to block 712, in which the computing device 100 fetches the graph data associated with the next reference in the iterator object. Subsequently, in block 714, the computing device 100 determines whether the last data responsive to the query has been reached. If not, the method loops back to block 710, in which the computing device 100 determines whether another user request for graph data associated with the next pointer in the iterator object has been received. If, however, the last data responsive to the query has been reached, the method 700 advances to block 716.
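A compact sketch of the same request/response flow, with assumed names and an in-memory table standing in for the persistent graph data, follows; each call to fetch_next corresponds to a user request in blocks 710 through 714.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-ins for graph data in persistent memory and the volatile iterator state.
struct NodeRecord { std::string tag; std::string value; };

struct SearchIterator {                              // volatile state (block 708)
    const std::vector<NodeRecord>* table = nullptr;  // reference into graph data
    std::string tag;
    std::size_t next = 0;
};

// Blocks 704-708: evaluate the query, traverse, and return an iterator
// referencing the first match rather than copying result data out.
SearchIterator start_search(const std::vector<NodeRecord>& table,
                            const std::string& tag_query) {
    SearchIterator it;
    it.table = &table;
    it.tag = tag_query;
    while (it.next < table.size() && table[it.next].tag != tag_query) ++it.next;
    return it;
}

// Blocks 710-714: each user request fetches the data for the current reference
// and advances to the following match; returns nullptr once results are exhausted.
const NodeRecord* fetch_next(SearchIterator& it) {
    if (it.table == nullptr || it.next >= it.table->size()) return nullptr;
    const NodeRecord* result = &(*it.table)[it.next++];
    while (it.next < it.table->size() && (*it.table)[it.next].tag != it.tag) ++it.next;
    return result;
}
```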
- It should be appreciated that, in some embodiments, any one or more of the methods described herein may be embodied as various instructions stored on computer-readable media, which may be executed by the processor 102, a peripheral device 116, and/or other components of the computing device 100 to cause the computing device 100 to perform the corresponding method. The computer-readable media may be embodied as any type of media capable of being read by the computing device 100, including, but not limited to, the memory, the external storage 114, a local memory or cache 106 of the processor 102, other memory or data storage devices of the computing device 100, portable media readable by a peripheral device 116 of the computing device 100, and/or other media. - Illustrative examples of the technologies disclosed herein are provided below. An embodiment of the technologies may include any one or more, and any combination of, the examples described below.
- Example 1 includes a computing device comprising at least one processor; at least one memory controller to access a volatile memory device and a persistent memory device on a memory bus, the persistent memory device having stored therein a graph data store including a plurality of nodes relationally arranged with a plurality of edges, each of the plurality of edges defining a relationship between at least two of the plurality of nodes; and the volatile memory device having stored therein a plurality of instructions that, when executed by the processor, cause the processor to, in response to an operation on the graph data store, partition data between the volatile memory device and the persistent memory device to minimize writes on the persistent memory device.
- Example 2 includes the subject matter of Example 1, and wherein at least a portion of the nodes are associated with at least one tag representing a classification of the node.
- Example 3 includes the subject matter of any of Example 1 or 2, and wherein at least a portion of the edges are associated with at least one tag representing a classification of the edge.
- Example 4 includes the subject matter of any of Examples 1-3, and wherein the graph data store includes a tag sorted edge set to collate edges and associated nodes with identical tags.
- Example 5 includes the subject matter of any of Examples 1-4, and wherein at least a portion of the nodes are associated with at least one property in the form of a key-value pair.
- Example 6 includes the subject matter of any of Examples 1-5, and wherein at least a portion of the edges are associated with at least one property in the form of a key-value pair.
- Example 7 includes the subject matter of any of Examples 1-6, and wherein the plurality of instructions further cause the processor to organize the nodes and/or edges of the graph data store in the persistent memory device as fixed size objects.
- Example 8 includes the subject matter of any of Examples 1-7, and wherein the plurality of instructions further cause the processor to store at least one property and/or tag associated with a node and/or edge in-line in the fixed-size object representing the node and/or edge.
- Example 9 includes the subject matter of any of Examples 1-8, and wherein the plurality of instructions further cause the processor to allocate, in response to the property and/or tag associated with the node and/or edge exceeding the size of the fixed-size object, a chunk of the persistent memory device separate from the fixed-size object.
- Example 10 includes the subject matter of any of Examples 1-9, and wherein the plurality of instructions further cause the processor, in response to a search request query, to generate an iterator object stored on the volatile memory device that includes a reference to one or more nodes and/or edges in the graph data store on the persistent memory device.
- Example 11 includes the subject matter of any of Examples 1-10, and wherein the plurality of instructions further cause the processor to advance the iterator object to directly access nodes and/or edges of the graph data store in response to a request for an additional match to the search query.
- Example 12 includes the subject matter of any of Examples 1-11, and wherein the processor stores an allocator on the volatile memory device, the allocator comprising one or more memory addresses of the graph data store in the persistent memory device.
- Example 13 includes the subject matter of any of Examples 1-12, and wherein the processor stores a portion of a transaction object on volatile memory and a portion of the transaction object on persistent memory such that writes to persistent memory are minimized while still maintaining the atomicity, consistency, isolation, durability (“ACID”) properties of the graph data store.
- Example 14 includes a method for managing a graph data store on a persistent memory device. The method includes storing, on a persistent memory device, a graph data store comprising a plurality of nodes and a plurality of edges, each of the plurality of edges defining a relationship between at least two of the plurality of nodes; and managing an operation on the graph data store by storing a first portion of resulting data on a volatile memory device and a second portion of the resulting data on the persistent memory device to minimize writes on the persistent memory device.
- Example 15 includes the subject matter of Example 14, further including allocating, by a computing device, a fixed size object on a persistent memory device to each of the plurality of nodes and edges.
- Example 16 includes the subject matter of any of Examples 14 or 15, and further including evaluating, by a computing device, a search request query on the graph data store; and generating, by a computing device, an iterator object including a reference to one or more nodes and/or edges in the graph data store in response to the search request query, wherein the iterator object is stored on a volatile memory device.
- Example 17 includes the subject matter of any of Examples 14-16, and wherein the computing device manages the operation by partitioning the first portion and the second portion of resulting data to minimize writes to the persistent memory device.
- Example 18 includes the subject matter of any of Examples 14-17, and further including storing at least one property and/or tag associated with a node and/or edge in-line in a fixed-size object.
- Example 19 includes the subject matter of any of Examples 14-18, and further including allocating, responsive to the property and/or tag associated with the node and/or edge exceeding the size of the fixed-size object, a chunk of the persistent memory device separate from the fixed-size object.
- Example 20 includes the subject matter of any of Examples 14-19, and further including storing, responsive to a search request query on the graph data store, an iterator that is an output of the search request query in a volatile memory device, the iterator including a reference to one or more nodes and/or edges in the graph data store on the persistent memory device.
- Example 21 includes the subject matter of any of Examples 14-20, and further including advancing the iterator to directly access nodes and/or edges of the graph data store in response to a request for an additional match to the search query.
- Example 22 includes the subject matter of any of Examples 14-21, and wherein at least a portion of the nodes are associated with at least one tag representing a classification of the node.
- Example 23 includes the subject matter of any of Examples 14-22, and wherein at least a portion of the edges are associated with at least one tag representing a classification of the edge.
- Example 24 includes the subject matter of any of Examples 14-23, and wherein the graph data store includes a tag sorted edge set to collate edges with identical tags to allow efficient iteration over related edges.
- Example 25 includes the subject matter of any of Examples 14-24, and wherein at least a portion of the nodes are associated with at least one property in the form of a key-value pair.
- Example 26 includes the subject matter of any of Examples 14-25, and wherein at least a portion of the edges are associated with at least one property in the form of a key-value pair.
- Example 27 includes one or more machine readable storage media comprising a plurality of instructions stored thereon that in response to being executed result in a computing device performing the method of any of Examples 14-26.
- Example 28 includes a computing device comprising means for storing, on a persistent memory device, a graph data store comprising a plurality of nodes and a plurality of edges, each of the plurality of edges defining a relationship between at least two of the plurality of nodes; and means for managing an operation on the graph data store by storing a first portion of resulting data on a volatile memory device and a second portion of the resulting data on the persistent memory device to minimize writes on the persistent memory device.
- Example 29 includes the subject matter of Example 28, and further including means for allocating a fixed size object on a persistent memory device to each of the plurality of nodes and edges.
- Example 30 includes the subject matter of Examples 28 or 29, and further including means for evaluating a search request query on the graph data store; and means for generating an iterator object including a reference to one or more nodes and/or edges in the graph data store in response to the search request query, wherein the iterator object is stored on a volatile memory device.
- Example 31 includes the subject matter of any of Examples 28-30, and further including means for managing the operation by partitioning the first portion and the second portion of resulting data to minimize writes to the persistent memory device.
- Example 32 includes the subject matter of any of Examples 28-31, and further including means for storing at least one property and/or tag associated with a node and/or edge in-line in a fixed-size object.
- Example 33 includes the subject matter of any of Examples 28-32, and further including means for allocating, responsive to the property and/or tag associated with the node and/or edge exceeding the size of the fixed-size object, a chunk of the persistent memory device separate from the fixed-size object.
- Example 34 includes the subject matter of any of Examples 28-33, and further including means for storing, responsive to a search request query on the graph data store, an iterator that is an output to the search request query in a volatile memory device, the iterator including a reference to one or more nodes and/or edges in the graph data store on the persistent memory device.
- Example 35 includes the subject matter of any of Examples 28-34, and further including means for advancing the iterator to directly access nodes and/or edges of the graph data store in response to a request for an additional match to the search query.
- Example 36 includes the subject matter of any of Examples 28-35, and wherein at least a portion of the nodes are associated with at least one tag representing a classification of the node.
- Example 37 includes the subject matter of any of Examples 28-36, and wherein at least a portion of the edges are associated with at least one tag representing a classification of the edge.
- Example 38 includes the subject matter of any of Examples 28-37, and wherein the graph data store includes a tag sorted edge set to collate edges with identical tags to allow efficient iteration over related edges.
- Example 39 includes the subject matter of any of Examples 28-38, and wherein at least a portion of the nodes are associated with at least one property in the form of a key-value pair.
- Example 40 includes the subject matter of any of Examples 28-39, and wherein at least a portion of the edges are associated with at least one property in the form of a key-value pair.
Claims (25)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/866,941 US20170090807A1 (en) | 2015-09-26 | 2015-09-26 | Technologies for managing connected data on persistent memory-based systems |
US17/134,306 US11681754B2 (en) | 2015-09-26 | 2020-12-26 | Technologies for managing connected data on persistent memory-based systems |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/866,941 US20170090807A1 (en) | 2015-09-26 | 2015-09-26 | Technologies for managing connected data on persistent memory-based systems |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/134,306 Continuation US11681754B2 (en) | 2015-09-26 | 2020-12-26 | Technologies for managing connected data on persistent memory-based systems |
Publications (1)
Publication Number | Publication Date |
---|---|
US20170090807A1 true US20170090807A1 (en) | 2017-03-30 |
Family
ID=58407137
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/866,941 Abandoned US20170090807A1 (en) | 2015-09-26 | 2015-09-26 | Technologies for managing connected data on persistent memory-based systems |
US17/134,306 Active 2035-10-28 US11681754B2 (en) | 2015-09-26 | 2020-12-26 | Technologies for managing connected data on persistent memory-based systems |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/134,306 Active 2035-10-28 US11681754B2 (en) | 2015-09-26 | 2020-12-26 | Technologies for managing connected data on persistent memory-based systems |
Country Status (1)
Country | Link |
---|---|
US (2) | US20170090807A1 (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180357330A1 (en) * | 2017-06-09 | 2018-12-13 | Linkedin Corporation | Compound indexes for graph databases |
US10430463B2 (en) | 2017-03-16 | 2019-10-01 | Raytheon Company | Systems and methods for generating a weighted property graph data model representing a system architecture |
US10430462B2 (en) | 2017-03-16 | 2019-10-01 | Raytheon Company | Systems and methods for generating a property graph data model representing a system architecture |
US10459929B2 (en) | 2017-03-16 | 2019-10-29 | Raytheon Company | Quantifying robustness of a system architecture by analyzing a property graph data model representing the system architecture |
US10496704B2 (en) * | 2017-03-16 | 2019-12-03 | Raytheon Company | Quantifying consistency of a system architecture by comparing analyses of property graph data models representing different versions of the system architecture |
CN110688055A (en) * | 2018-07-04 | 2020-01-14 | 清华大学 | Data access method and system in large graph calculation |
US10671671B2 (en) | 2017-06-09 | 2020-06-02 | Microsoft Technology Licensing, Llc | Supporting tuples in log-based representations of graph databases |
CN113961754A (en) * | 2021-09-08 | 2022-01-21 | 南湖实验室 | Graph database system based on persistent memory |
US20230109463A1 (en) * | 2021-09-20 | 2023-04-06 | Oracle International Corporation | Practical method for fast graph traversal iterators on delta-logged graphs |
US11681754B2 (en) | 2015-09-26 | 2023-06-20 | Intel Corporation | Technologies for managing connected data on persistent memory-based systems |
US11928097B2 (en) | 2021-09-20 | 2024-03-12 | Oracle International Corporation | Deterministic semantic for graph property update queries and its efficient implementation |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2526598B (en) * | 2014-05-29 | 2018-11-28 | Imagination Tech Ltd | Allocation of primitives to primitive blocks |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120323970A1 (en) * | 2011-06-18 | 2012-12-20 | Microsoft Corporation | Dynamic lock-free hash tables |
US9128845B2 (en) * | 2012-07-30 | 2015-09-08 | Hewlett-Packard Development Company, L.P. | Dynamically partition a volatile memory for a cache and a memory partition |
US20160314220A1 (en) * | 2015-04-27 | 2016-10-27 | Linkedin Corporation | Fast querying of social network data |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8180804B1 (en) * | 2010-04-19 | 2012-05-15 | Facebook, Inc. | Dynamically generating recommendations based on social graph information |
US20170090807A1 (en) | 2015-09-26 | 2017-03-30 | Vishakha Gupta | Technologies for managing connected data on persistent memory-based systems |
- 2015-09-26: US US14/866,941 patent/US20170090807A1/en not_active Abandoned
- 2020-12-26: US US17/134,306 patent/US11681754B2/en active Active
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11681754B2 (en) | 2015-09-26 | 2023-06-20 | Intel Corporation | Technologies for managing connected data on persistent memory-based systems |
US10430463B2 (en) | 2017-03-16 | 2019-10-01 | Raytheon Company | Systems and methods for generating a weighted property graph data model representing a system architecture |
US10430462B2 (en) | 2017-03-16 | 2019-10-01 | Raytheon Company | Systems and methods for generating a property graph data model representing a system architecture |
US10459929B2 (en) | 2017-03-16 | 2019-10-29 | Raytheon Company | Quantifying robustness of a system architecture by analyzing a property graph data model representing the system architecture |
US10496704B2 (en) * | 2017-03-16 | 2019-12-03 | Raytheon Company | Quantifying consistency of a system architecture by comparing analyses of property graph data models representing different versions of the system architecture |
US20180357330A1 (en) * | 2017-06-09 | 2018-12-13 | Linkedin Corporation | Compound indexes for graph databases |
US10445370B2 (en) * | 2017-06-09 | 2019-10-15 | Microsoft Technology Licensing, Llc | Compound indexes for graph databases |
US10671671B2 (en) | 2017-06-09 | 2020-06-02 | Microsoft Technology Licensing, Llc | Supporting tuples in log-based representations of graph databases |
CN110688055A (en) * | 2018-07-04 | 2020-01-14 | 清华大学 | Data access method and system in large graph calculation |
CN113961754A (en) * | 2021-09-08 | 2022-01-21 | 南湖实验室 | Graph database system based on persistent memory |
US20230109463A1 (en) * | 2021-09-20 | 2023-04-06 | Oracle International Corporation | Practical method for fast graph traversal iterators on delta-logged graphs |
US11928097B2 (en) | 2021-09-20 | 2024-03-12 | Oracle International Corporation | Deterministic semantic for graph property update queries and its efficient implementation |
Also Published As
Publication number | Publication date |
---|---|
US20210117473A1 (en) | 2021-04-22 |
US11681754B2 (en) | 2023-06-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11681754B2 (en) | Technologies for managing connected data on persistent memory-based systems | |
US10176092B2 (en) | System and method for executing data processing tasks using resilient distributed datasets (RDDs) in a storage device | |
US8819335B1 (en) | System and method for executing map-reduce tasks in a storage device | |
US9092321B2 (en) | System and method for performing efficient searches and queries in a storage node | |
US9021189B2 (en) | System and method for performing efficient processing of data stored in a storage node | |
US10114908B2 (en) | Hybrid table implementation by using buffer pool as permanent in-memory storage for memory-resident data | |
US20200387495A1 (en) | Hybrid data storage and load system with rowid lookup | |
US10248346B2 (en) | Modular architecture for extreme-scale distributed processing applications | |
US11392571B2 (en) | Key-value storage device and method of operating the same | |
US20180300249A1 (en) | Data caching using local and remote memory | |
US9990281B1 (en) | Multi-level memory mapping | |
US8799611B2 (en) | Managing allocation of memory pages | |
US9448934B2 (en) | Affinity group access to global data | |
US10416900B2 (en) | Technologies for addressing data in a memory | |
CN107408132B (en) | Method and system for moving hierarchical data objects across multiple types of storage | |
US10747773B2 (en) | Database management system, computer, and database management method | |
US20200348871A1 (en) | Memory system, operating method thereof and computing system for classifying data according to read and write counts and storing the classified data in a plurality of types of memory devices | |
US11093169B1 (en) | Lockless metadata binary tree access | |
US10339052B2 (en) | Massive access request for out-of-core textures by a parallel processor with limited memory | |
CN104715349A (en) | Method and system for calculating e-commerce freight | |
US11016666B2 (en) | Memory system and operating method thereof | |
US20160335321A1 (en) | Database management system, computer, and database management method | |
US9251100B2 (en) | Bitmap locking using a nodal lock | |
US11507611B2 (en) | Personalizing unstructured data according to user permissions | |
US9298622B2 (en) | Affinity group access to global data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: INTEL CORPORATION, CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:GUPTA, VISHAKHA;KAGI, ALAIN;LANTZ, PHILIP;AND OTHERS;SIGNING DATES FROM 20151130 TO 20161215;REEL/FRAME:041956/0831 |
 | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
 | STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
 | STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
 | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
 | STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |