CN111309750A - Data updating method and device for graph database - Google Patents

Data updating method and device for graph database

Info

Publication number
CN111309750A
Authority
CN
China
Prior art keywords
vertex
data
message
edge
updating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010241791.2A
Other languages
Chinese (zh)
Inventor
邓崇鑫
蔡苗
陈震宇
刘国华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Postal Savings Bank of China Ltd
Original Assignee
Postal Savings Bank of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Postal Savings Bank of China Ltd filed Critical Postal Savings Bank of China Ltd
Priority to CN202010241791.2A priority Critical patent/CN111309750A/en
Publication of CN111309750A publication Critical patent/CN111309750A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • G06F16/2365Ensuring data consistency and integrity
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • G06F16/9024Graphs; Linked lists

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a data updating method and device for a graph database, wherein the method comprises the following steps: determining the incremental data and operation type of an update operation; locking the data number corresponding to the incremental data with a distributed lock; calling the query interface of the graph database and querying whether historical data corresponding to the data number exists; updating the corresponding data in the graph database according to the operation type and the query result; and unlocking the data number after the update operation is completed. The scheme of the application supports distributed parallel transmission and processing of data, including multi-process and multi-thread processing; without modifying the existing systems (the data production and transmission systems and the graph database), it achieves distribution, real-time performance, time ordering and idempotence simultaneously in the cross-system processing from source data to graph database.

Description

Data updating method and device for graph database
Technical Field
The application relates to the technical field of databases, in particular to a data updating method and device of a graph database.
Background
With the development of internet and internet-of-things technology, data is growing ever faster. At the same time, graph data is being applied more and more widely, and many of these applications have strict real-time requirements. Graph databases therefore need to process massive amounts of data in real time, and real-time incremental updating of massive data in a graph database is a problem that has to be solved.
A graph database stores entity information and the relationship information between entities, corresponding to points (also called nodes or vertices) and edges (also called arcs or lines) in graph theory. For example, relationships between people can be stored in a graph database: each person is entity information and corresponds to a vertex, and a relationship between people is relationship information between entities and corresponds to an edge. Many graph databases exist, such as Neo4j, ArangoDB and OrientDB.
In the related art, different organizations, enterprises and applications implement their data production and transmission systems and graph databases with different technologies. Modifying these systems one by one to add the required functions or features would incur high labor and time costs. Without modifying the existing systems (the data production and transmission systems and the graph database), there is currently no technical scheme that simultaneously addresses the distribution, real-time performance, time ordering and idempotence of the process from source data to graph database.
Disclosure of Invention
To overcome, at least to some extent, the problems in the related art, the present application provides a method and apparatus for updating data in a graph database.
According to a first aspect of embodiments of the present application, there is provided a data updating method for a graph database, including:
determining incremental data and operation types of the updating operation;
locking the data number corresponding to the incremental data by using a distributed lock;
calling a query interface of the graph database, and querying whether historical data corresponding to the data number exists;
updating the corresponding data in the graph database according to the operation type and the query result;
and after the updating operation is completed, unlocking the data number.
Further, locking the data number corresponding to the incremental data includes:
when the incremental data is a vertex, locking the vertex id.
Further, updating the corresponding data in the graph database according to the operation type and the query result includes:
when the operation type is add or overwrite, entering the add-vertex sub-flow if the query result shows that the historical data does not exist, or the overwrite-vertex sub-flow if it does;
and when the operation type is delete, entering the delete-vertex sub-flow and updating the vertex log if the query result shows that the historical data exists.
Further, the add-vertex sub-flow includes:
setting the vertex id of the vertex to the vertex id of the message;
setting the available field of the vertex to true;
traversing the properties of the message to generate the corresponding properties of the vertex;
updating the vertex log;
and adding the vertex data to the graph database.
Further, the overwrite-vertex sub-flow includes:
comparing the vertex update message id of the vertex with the message id of the message, and ending the flow if the two message ids are the same;
traversing the properties of the message and searching the properties of the vertex for a ${property name} equal to the name of the message property;
overwriting the property of the vertex if it exists, or adding the property to the vertex if it does not;
traversing the properties of the vertex and comparing the ${property name} update message id of the vertex with the message id of the message;
if the two message ids are different, comparing the timestamp of the message with the ${property name} update datetime of the vertex;
and if the timestamp of the message is later, updating the related information.
Further, locking the data number corresponding to the incremental data includes:
when the incremental data is an edge, sorting the start vertex id and the end vertex id of the edge and locking them in that order.
Further, updating the corresponding data in the graph database according to the operation type and the query result includes:
when the operation type is add or overwrite, entering the add-edge sub-flow if the query result shows that the historical data does not exist, or the overwrite-edge sub-flow if it does;
and when the operation type is delete, deleting the corresponding edge data according to the additional conditions.
Further, the add-edge sub-flow includes:
setting the from vertex id of the edge to the from vertex id of the message;
setting the to vertex id of the edge to the to vertex id of the message;
setting the relationship of the edge to the relationship of the message;
setting the available field of the edge to true;
traversing the properties of the message to generate the corresponding properties of the edge;
updating the edge log;
and adding the edge data to the graph database.
Further, deleting the corresponding edge data according to the additional conditions includes:
if the additional conditions are from vertex id, to vertex id and relationship, deleting the single corresponding edge;
if the additional conditions are from vertex id and to vertex id, deleting the multiple corresponding edges;
and if the additional condition is from vertex id or to vertex id, deleting the multiple corresponding edges.
According to a second aspect of embodiments of the present application, there is provided a data updating apparatus for a graph database, comprising:
the determining module is used for determining the incremental data and the operation type of the updating operation;
the locking module is used for locking the data number corresponding to the incremental data by using a distributed lock;
the query module is used for calling a query interface of the graph database and querying whether the data number has corresponding historical data;
the updating module is used for updating the corresponding data in the graph database according to the operation type and the query result;
and the unlocking module is used for unlocking the data number after the updating operation is finished.
The technical scheme provided by the embodiment of the application has the following beneficial effects:
the scheme of the application supports distributed parallel transmission and processing of data, and supports multiple processes and multiple threads; on the premise of not modifying the existing system (data production and transmission system and graph database), the distributed type, the real-time performance, the time sequence and the idempotent performance can be simultaneously realized in the cross-system processing process from source data to the graph database.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
FIG. 1 is a flow chart illustrating a method for data updating of a graph database according to an exemplary embodiment.
FIG. 2 is a system block diagram illustrating a distributed graph database according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of methods and apparatus consistent with certain aspects of the present application, as detailed in the appended claims.
FIG. 1 is a flow chart illustrating a method for data updating of a graph database according to an exemplary embodiment. The method can be applied to a distributed graph database, and specifically comprises the following steps:
step S1: determining incremental data and operation types of the updating operation;
step S2: locking the data number corresponding to the incremental data by using a distributed lock;
step S3: calling a query interface of the graph database, and querying whether historical data corresponding to the data number exists;
step S4: updating the corresponding data in the graph database according to the operation type and the query result;
step S5: and after the updating operation is completed, unlocking the data number.
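For illustration only, the five steps above can be sketched as follows. An in-memory dictionary stands in for the graph database, a process-local lock stands in for the distributed lock, and every name used here is an assumption made for the sketch rather than the claimed implementation.

import threading

graph = {}                        # data number -> stored record (stand-in for the graph database)
locks = {}                        # data number -> lock (stand-in for distributed locks)
registry_guard = threading.Lock()

def lock_for(key):
    # One lock per data number; the guard keeps the lock registry itself consistent.
    with registry_guard:
        return locks.setdefault(key, threading.Lock())

def update_graph(message):
    data, operation = message["data"], message["operation"]      # S1: incremental data and operation type
    key = data["vertex id"]                                       # the data number to lock
    lock = lock_for(key)
    lock.acquire()                                                # S2: lock the data number
    try:
        existing = graph.get(key)                                 # S3: query for historical data
        if operation in ("add", "replace"):                       # S4: update by type and query result
            graph[key] = data if existing is None else {**existing, **data}
        elif operation == "delete" and existing is not None:
            graph[key] = {**existing, "vertex available": False}  # logical deletion
    finally:
        lock.release()                                            # S5: unlock after the update completes

update_graph({"operation": "add",
              "data": {"vertex id": 506, "name value": "Zhang San"}})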
The scheme of the application supports distributed parallel transmission and processing of data, including multi-process and multi-thread processing; without modifying the existing systems (the data production and transmission systems and the graph database), it achieves distribution, real-time performance, time ordering and idempotence simultaneously in the cross-system processing from source data to graph database.
The scheme achieves distribution, real-time performance, time ordering and idempotence for the incremental update process from source data to graph database. It supports distributed parallel transmission and processing of data (with multiple processes and multiple threads): regardless of the order in which data is transmitted and processed, and regardless of whether transmission or processing is repeated, as long as every piece of related data is successfully transmitted and processed at least once, the final result is guaranteed to be completely consistent with the result of executing the data serially in time order.
To further detail the technical solution of the present application, first, a description is made of related concepts of a distributed architecture and source data.
A distributed architecture is a common solution for high concurrency. The source-data production systems that feed a graph database often adopt it, and the transmission and processing of source data from the production system to the graph database also involve distribution. The core design idea of a distributed architecture is parallel splitting and horizontal scaling, which offers the following advantages and is being adopted by more and more systems. Computing and storage capacity can be scaled out on general-purpose hardware, raising the processing capacity of the system to meet continuously growing business demands. Parallel processing breaks through the efficiency bottleneck of traditional serial processing. System reliability is improved, avoiding loss of system functions caused by single points of failure. And the same processing capacity can be obtained at lower cost than with a traditional architecture, based on relatively inexpensive general-purpose computing and storage devices.
Most source-data production systems adopt a distributed architecture, and to guarantee real-time processing of massive source data, the link from source data to graph database also needs to be distributed. In general, source data generated by a distributed parallel system cannot be transmitted in a guaranteed global time order. The same processing runs in multiple processes or threads at the same time, the time consumed is uncertain, and some nodes may fail or produce errors during processing. When a batch of ordered data is processed in a distributed, parallel fashion, the time order of that batch during processing cannot be guaranteed, and if the time order is not guaranteed, processing can lead to logical errors in the data updates. For example, suppose the value of field a of a piece of data should be updated to a0 and then to a1; if the update order is reversed, the field is updated to a1 first and then to a0, and its final value is a0 instead of a1. To guarantee time order during transmission, a cache is usually used: the system waits until every node has finished sending its data, sorts the data in the cache, and then transmits it serially. This delays data transmission and reduces timeliness; moreover, if a node cannot transmit data normally for a long time because of a fault, the delay becomes severe, or the data loses its timeliness altogether. To guarantee ordering during processing, serial processing is usually used, but as the volume of source data keeps growing, serial processing eventually reaches the processing limit of a single node; and if a distributed architecture is adopted instead, the global time order of source-data processing cannot be guaranteed. In other words, under a distributed architecture it is difficult to guarantee real-time performance and time ordering simultaneously for the transmission and processing of source data.
Since errors cannot be eliminated, error handling must be considered in any data transmission or processing procedure. It is easy to guarantee that data transmission succeeds at least once, but when data is transmitted and processed across systems it is hard to guarantee that it is processed exactly once. For example, if the receiver processes the data first and then sends a data-reception acknowledgement (ACK), an error occurring after processing but before the acknowledgement means the sender never receives the ACK, considers the transmission to have failed, and sends the data again, so the data is processed repeatedly. If instead the receiver sends the ACK first and then processes the data, an error during processing may cause data loss. If the process of persisting data to the graph database is idempotent, then the easily achieved at-least-once transmission guarantee is sufficient: source data can be transmitted, resolved and updated into the graph database with the complete process effectively succeeding exactly once.
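As a hedged illustration of this point, the sketch below combines "process first, acknowledge afterwards" with a handler that deduplicates on message id; the persist and ack callables and the in-memory id set are assumptions for the sketch, not part of the disclosed systems.

processed_ids = set()   # ids of messages already applied (idempotence check)

def handle(message, persist, ack):
    if message["message id"] in processed_ids:
        ack(message)                         # duplicate delivery: already applied, just confirm
        return
    persist(message)                         # idempotent write towards the graph database
    processed_ids.add(message["message id"])
    ack(message)                             # ACK only after successful processing; a crash before
                                             # this line causes a redelivery, which the check above absorbs

handle({"message id": "187374"}, persist=lambda m: None, ack=lambda m: None)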
The following describes the scheme of the present application in an expanded manner with reference to specific application scenarios, and introduces a system architecture, a related data structure, and a graph data processing procedure, respectively.
In the first section, a system architecture is introduced.
Referring to FIG. 2, the system architecture of a distributed graph database can be divided into three parts: source data production, graph data processing, and graph database operations.
The source data for the graph can come from any business system, and the source data needs to be arranged into the associated message data structure when transmitted. It can be sent directly by the business system or transmitted through a message queue, so a distributed parallel architecture is supported. During transmission, neither the business system nor the message queue is required to guarantee the time order of the data; it is only required that the clocks of all transmitting nodes be consistent.
Graph data processing supports a distributed architecture, and multiple graph-data processing nodes can be deployed for parallel processing. The processing flow is divided into three stages: data reception, data processing, and reception acknowledgement. A temporary error in any stage can be handled by retrying, and the final result is guaranteed to be unaffected. Data reception supports parallel batch reception, and data processing can use a thread pool for multi-thread parallel processing, with each thread handling one piece of data.
During data processing, graph data can be queried and persisted through the graph database API. The scheme does not depend on any special function of the graph database; the related operations use only the most basic graph database API: querying a vertex by vertex id; querying an edge by from vertex id, to vertex id and relationship; querying multiple edges by from vertex id and to vertex id; querying multiple edges by from vertex id or to vertex id; adding or updating a vertex; and adding or updating an edge. There is no particular requirement on how data is stored in the graph database; only the most basic fields of the graph data need to be stored. The vertex data includes a vertex id field and the vertex property fields (in the vertex data structure, information other than the vertex id can be stored in any fields; the vertex log information can be merged into one field, and each ${property name} log can be merged into one field). The edge data includes an edge id field, a from vertex id field, a to vertex id field, a relationship field and the edge property fields (in the edge data structure, information other than the edge id, from vertex id, to vertex id and relationship can be stored in any fields; the edge log information can be merged into one field, and each ${property name} log can be merged into one field). Various graph databases may be used, including but not limited to Neo4j, ArangoDB and OrientDB. After one batch of data is processed, a data-reception acknowledgement (ACK) is sent, and then the next batch of data is received for the next round of processing.
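The minimal API surface listed above can be summarised, purely as an assumed sketch that does not mirror the actual interface of Neo4j, ArangoDB, OrientDB or any other product, as:

from typing import Any, Dict, List, Optional, Protocol

class GraphAPI(Protocol):
    # query a vertex by vertex id
    def query_vertex(self, vertex_id: str) -> Optional[Dict[str, Any]]: ...
    # query one edge by from vertex id, to vertex id and relationship
    def query_edge(self, from_id: str, to_id: str, relationship: str) -> Optional[Dict[str, Any]]: ...
    # query multiple edges by from vertex id and to vertex id
    def query_edges_by_both(self, from_id: str, to_id: str) -> List[Dict[str, Any]]: ...
    # query multiple edges by from vertex id or to vertex id
    def query_edges_by_either(self, vertex_id: str) -> List[Dict[str, Any]]: ...
    # add or update one vertex / one edge
    def upsert_vertex(self, vertex: Dict[str, Any]) -> None: ...
    def upsert_edge(self, edge: Dict[str, Any]) -> None: ...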
In the second section, a related data structure is introduced.
The data structures include three parts: the message data structure used when transmitting source data, and the vertex and edge data structures in the graph database.
2.1. Message data structure
This is the data structure used when transmitting source data; it includes the following parts.
message id: the id of the message, which is globally unique (a UUID can be used); it is used to detect and handle repeated data.
timestamp: the time at which the message occurred. Time ordering in this scheme means ordering this field from earliest to latest.
operation: the data processing type; there are 8 types in total.
data: the graph data involved in the processing; different processing types involve different specific data structures. The data structure corresponding to this field is described in detail together with each processing type below.
Data structure of the properties part of data:
properties:
property:
name: attribute name-1
value: attribute value-1
property:
name: attribute name-2
value: attribute value-2
...
property:
name: attribute name-n
value: attribute value-n
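By way of illustration, a message carrying the parts above might look like the following sketch; the concrete spellings and values are assumptions based on the examples given later in this description.

message = {
    "message id": "5894309905",              # globally unique, e.g. a UUID
    "timestamp": "2019-08-17 13:53:09.970",  # time at which the message occurred
    "operation": "add or replace vertex",    # one of the 8 processing types
    "data": {
        "vertex id": "506",
        "properties": [
            {"name": "name", "value": "Zhang San"},
            {"name": "education", "value": "University"},
        ],
    },
}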
2.2. Vertex data structure
The data structure of a vertex in the graph database. Illustratively, ${property name} represents the name of a vertex property, e.g. name or age. There may be multiple sets of ${property name} value, ${property name} available and ${property name} log fields.
vertex id: the id of the vertex.
vertex available: whether the vertex is deleted; false means deleted, true means present.
vertex log: the processing log information of the vertex, including log information related to first creation, latest update, deletion and overwriting.
vertex create datetime: the timestamp of the message when the vertex was added.
vertex create message id: the message id of the message when the vertex was added. This value is only set when the vertex is added for the first time and is not modified thereafter, even if the vertex is added again after deletion.
vertex update datetime: the timestamp of the message when the vertex was last updated. The initial value is vertex create datetime.
vertex update message id: the message id of the message when the vertex was last updated. The initial value is vertex create message id.
vertex delete datetime: the timestamp of the message when the vertex was deleted. This field may not be present.
vertex delete message id: the message id of the message when the vertex was deleted. This field may not be present.
vertex replace datetime: the timestamp of the message when the vertex was overwritten. This field may not be present.
vertex replace message id: the message id of the message when the vertex was overwritten. This field may not be present.
${property name} value: the value of property ${property name}.
${property name} available: whether property ${property name} is deleted; false means deleted, true means present.
${property name} log: the processing log information of property ${property name}, including log information related to first creation, latest update, deletion and overwriting.
${property name} create datetime: the timestamp of the message when ${property name} was added.
${property name} create message id: the message id of the message when ${property name} was added. This value is only set when ${property name} is added for the first time and is not modified thereafter, even if it is added again after deletion.
${property name} update datetime: the timestamp of the message when ${property name} was last updated. The initial value is ${property name} create datetime.
${property name} update message id: the message id of the message when ${property name} was last updated. The initial value is ${property name} create message id.
${property name} delete datetime: the timestamp of the message when ${property name} was deleted. This field may not be present.
${property name} delete message id: the message id of the message when ${property name} was deleted. This field may not be present.
${property name} replace datetime: the timestamp of the message when ${property name} was overwritten. This field may not be present.
${property name} replace message id: the message id of the message when ${property name} was overwritten. This field may not be present.
For example, one vertex representing a person:
vertex id: 506
vertex available: true
vertex log:
vertex create datetime: 2001-02-28 10:05:23.613
vertex create message id: 5432898950
vertex update datetime: 2019-08-17 13:53:09.97
vertex update message id: 5894309905
name value: Zhang San
name available: true
name log:
name create datetime: 2001-02-28 10:05:23.613
name create message id: 5432898950
name update datetime: 2001-02-28 10:05:23.613
name update message id: 5432898950
education value: University
education available: true
education log:
education create datetime: 2001-02-28 10:05:23.613
education create message id: 5432898950
education update datetime: 2019-08-17 13:53:09.97
education update message id: 5894309905
Another example, a vertex representing a course:
vertex id: 7960
vertex available: true
vertex log:
vertex create datetime: 2005-12-28 15:11:37.889
vertex create message id: 5589049053
name value: College English
name available: true
name log:
name create datetime: 2005-12-28 15:11:37.889
name create message id: 5589049053
2.3. Edge data structure
The data structure of an edge in the graph database. Illustratively, ${property name} represents the name of an edge property, e.g. score or rank. There may be multiple sets of ${property name} value, ${property name} available and ${property name} log fields.
edge id: the id of the edge. A given combination of from vertex id, to vertex id and relationship corresponds to exactly one edge id, and an edge id can be obtained by concatenating them as strings, for example from vertex id + separator + to vertex id + separator + relationship, where the separator is a character that cannot appear in the from vertex id, to vertex id or relationship (a sketch of this construction follows the example at the end of this section).
from vertex id: the id of the edge's start vertex.
to vertex id: the id of the edge's end vertex.
relationship: the relationship between the start vertex and the end vertex.
edge available: whether the edge is deleted; false means deleted, true means present.
edge log: the processing log information of the edge, including log information related to first creation, latest update, deletion and overwriting.
edge create datetime: the timestamp of the message when the edge was added.
edge create message id: the message id of the message when the edge was added. This value is only set when the edge is added for the first time and is not modified thereafter, even if the edge is added again after deletion.
edge update datetime: the timestamp of the message when the edge was last updated. The initial value is edge create datetime.
edge update message id: the message id of the message when the edge was last updated. The initial value is edge create message id.
edge delete datetime: the timestamp of the message when the edge was deleted. This field may not be present.
edge delete message id: the message id of the message when the edge was deleted. This field may not be present.
edge replace datetime: the timestamp of the message when the edge was overwritten. This field may not be present.
edge replace message id: the message id of the message when the edge was overwritten. This field may not be present.
${property name} value: the value of property ${property name}.
${property name} available: whether property ${property name} is deleted; false means deleted, true means present.
${property name} log: the processing log information of property ${property name}, including log information related to first creation, latest update, deletion and overwriting.
${property name} create datetime: the timestamp of the message when ${property name} was added.
${property name} create message id: the message id of the message when ${property name} was added. This value is only set when ${property name} is added for the first time and is not modified thereafter, even if it is added again after deletion.
${property name} update datetime: the timestamp of the message when ${property name} was last updated. The initial value is ${property name} create datetime.
${property name} update message id: the message id of the message when ${property name} was last updated. The initial value is ${property name} create message id.
${property name} delete datetime: the timestamp of the message when ${property name} was deleted. This field may not be present.
${property name} delete message id: the message id of the message when ${property name} was deleted. This field may not be present.
${property name} replace datetime: the timestamp of the message when ${property name} was overwritten. This field may not be present.
${property name} replace message id: the message id of the message when ${property name} was overwritten. This field may not be present.
For example, an edge representing a person studying a course:
edge id: 26171
from vertex id: 506
to vertex id: 7960
relationship: study
edge available: true
edge log:
edge create datetime: 2019-06-15 17:00:00.59
edge create message id: 7099812
edge update datetime: 2019-09-05 15:00:01.237
edge update message id: 7234456
score value: 86
score available: true
score log:
score create datetime: 2019-06-15 17:00:00.59
score create message id: 7099812
score update datetime: 2019-09-05 15:00:01.237
score update message id: 7234456
score delete datetime: 2019-07-15 14:22:51.9
score delete message id: 7134112
score replace datetime: 2019-09-05 15:00:01.237
score replace message id: 7234456
rank value: 13
rank available: true
rank log:
rank create datetime: 2019-06-15 17:00:00.59
rank create message id: 7099812
rank update datetime: 2019-09-05 15:00:01.237
rank update message id: 7234456
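As referenced in the edge id field above, the construction of an edge id from from vertex id, to vertex id and relationship can be sketched as follows; the "|" separator is an assumption and must be a character that cannot occur in any of the three parts.

def edge_id(from_vertex_id: str, to_vertex_id: str, relationship: str, sep: str = "|") -> str:
    # One (from vertex id, to vertex id, relationship) combination maps to exactly one edge id.
    return sep.join((from_vertex_id, to_vertex_id, relationship))

assert edge_id("506", "7960", "study") == "506|7960|study"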
In the third section, the graph data processing procedure is described.
The data processing types fall into two categories: vertex-related processing and edge-related processing. For parallel processing, the vertex is taken as the minimum unit: different vertices are allowed to be processed simultaneously, and processing an edge is treated as processing its two vertices simultaneously. Before processing, the vertex id is locked. If the processing involves multiple vertices, those vertices are sorted by vertex id and then locked in that order; locking in this way avoids deadlock under concurrent locking. Distributed locks are used, and they can be implemented in Redis, for example as sketched below.
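A minimal sketch of this locking discipline is given below, assuming Redis "SET key value NX EX ttl" as the distributed-lock primitive; the key naming, spin-wait and time-to-live are illustrative choices, and a production implementation would typically use a Lua script or a ready-made lock recipe instead of the non-atomic check in unlock_vertices.

import time
import uuid

import redis  # third-party client, e.g. pip install redis

r = redis.Redis()

def lock_vertices(vertex_ids, ttl=30):
    token = str(uuid.uuid4())
    for vid in sorted(vertex_ids):                      # fixed global order avoids deadlock
        while not r.set(f"lock:vertex:{vid}", token, nx=True, ex=ttl):
            time.sleep(0.01)                            # simple spin-wait; real code would back off
    return token

def unlock_vertices(vertex_ids, token):
    for vid in sorted(vertex_ids, reverse=True):
        key = f"lock:vertex:{vid}"
        if r.get(key) == token.encode():                # only release a lock this caller still owns
            r.delete(key)                               # not atomic; shown only as a sketch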
In some embodiments, locking the data number corresponding to the incremental data includes:
when the incremental data is a vertex, locking the vertex id.
In some embodiments, updating the corresponding data in the graph database according to the operation type and the query result includes:
when the operation type is add or overwrite, entering the add-vertex sub-flow if the query result shows that the historical data does not exist, or the overwrite-vertex sub-flow if it does;
and when the operation type is delete, entering the delete-vertex sub-flow and updating the vertex log if the query result shows that the historical data exists.
3.1. Data processing types
3.1.1. Vertex-related data processing:
Add or overwrite a vertex (add or replace vertex). If this occurs after the vertex was deleted, the vertex is re-added by means of overwriting.
Delete a vertex (delete vertex). This is a logical deletion.
Add or overwrite a vertex property (add or replace vertex property). If this occurs after the vertex property was deleted, the property is re-added by means of overwriting.
Delete a vertex property (delete vertex property). This is a logical deletion.
3.1.2. Edge-related data processing:
Add or overwrite an edge (add or replace edge). If this occurs after the edge was deleted, the edge is re-added by means of overwriting.
Delete an edge (delete edge). This is a logical deletion.
Delete one edge by from vertex id, to vertex id and relationship (delete edge by from vertex id, to vertex id and relationship).
Delete multiple edges by from vertex id and to vertex id (delete edges by from vertex id and to vertex id).
Delete multiple edges by from vertex id or to vertex id (delete edges by from vertex id or to vertex id).
Add or overwrite an edge property (add or replace edge property). If this occurs after the edge property was deleted, the property is re-added by means of overwriting.
Delete an edge property (delete edge property). This is a logical deletion.
3.2. Flow for adding or overwriting a vertex
Data structure:
vertex id: vertex id.
properties: see the data structure of the properties part of data in the message data structure.
The specific flow is as follows:
1. Lock the vertex id using a distributed lock.
2. Call the graph database API and query the vertex by vertex id.
3. Determine whether the vertex already exists.
3.1. If the vertex does not exist, enter the add-vertex sub-flow.
3.2. If the vertex already exists, enter the overwrite-vertex sub-flow.
4. Unlock the vertex id using the distributed lock.
In some embodiments, the add-vertex sub-flow includes:
setting the vertex id of the vertex to the vertex id of the message;
setting the available field of the vertex to true;
traversing the properties of the message to generate the corresponding properties of the vertex;
updating the vertex log;
and adding the vertex data to the graph database.
3.2.1. Add-vertex sub-flow
1. Generate the vertex data.
1.1. Set the vertex id of this vertex to the vertex id of the message.
1.2. Set the available field of this vertex to true.
1.3. Traverse the properties of the message to generate the corresponding properties of the vertex.
1.3.1. Set the ${property name} of this vertex to the name of the message property.
1.3.2. Set the ${property name} value of this vertex to the value of the message property.
1.3.3. Enter the property log update sub-flow for property addition.
1.4. Enter the vertex log update sub-flow for vertex addition.
2. Call the graph database API and add the vertex data to the graph database.
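The sub-flow above can be sketched as a function that builds the vertex record from the message; the field spellings follow the data structures in section 2 but are assumptions as written here, and persisting via the graph database API is left out.

def build_new_vertex(message):
    vertex = {
        "vertex id": message["data"]["vertex id"],            # 1.1
        "vertex available": True,                             # 1.2
        "vertex create datetime": message["timestamp"],       # 1.4: vertex log for addition
        "vertex create message id": message["message id"],
        "vertex update datetime": message["timestamp"],
        "vertex update message id": message["message id"],
    }
    for prop in message["data"]["properties"]:                # 1.3: properties of the message
        name = prop["name"]
        vertex[f"{name} value"] = prop["value"]               # 1.3.1 / 1.3.2
        vertex[f"{name} available"] = True
        vertex[f"{name} create datetime"] = message["timestamp"]    # 1.3.3: property log for addition
        vertex[f"{name} create message id"] = message["message id"]
        vertex[f"{name} update datetime"] = message["timestamp"]
        vertex[f"{name} update message id"] = message["message id"]
    return vertex                                             # step 2 would persist this via the API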
In some embodiments, the overwrite-vertex sub-flow includes:
comparing the vertex update message id of the vertex with the message id of the message, and ending the flow if the two message ids are the same;
traversing the properties of the message and searching the properties of the vertex for a ${property name} equal to the name of the message property;
overwriting the property of the vertex if it exists, or adding the property to the vertex if it does not;
traversing the properties of the vertex and comparing the ${property name} update message id of the vertex with the message id of the message;
if the two message ids are different, comparing the timestamp of the message with the ${property name} update datetime of the vertex;
and if the timestamp of the message is later, updating the related information.
3.2.2. Universal sub-flow for overwriting a vertex/edge
The following modifications to the vertex/edge data are made on the basis of the existing vertex/edge data.
1. Compare the vertex/edge update message id of the vertex/edge with the message id of the message.
1.1. If the two message ids are the same, the message is a repeat and the flow ends.
2. Traverse the properties of the message.
2.1. Search the properties of the vertex/edge for a ${property name} equal to the name of the message property.
2.1.1. If it exists, overwrite the property of the vertex/edge.
2.1.1.1. Compare the timestamp of the message with the ${property name} update datetime of this vertex/edge.
2.1.1.1.1. If the timestamp of the message is later, the message is in normal time order; update the related information.
2.1.1.1.1.1. Set the ${property name} value of this vertex/edge to the value of the message property.
2.1.1.1.2. If the timestamp of the message is earlier, the message is out of time order and the related information does not need to be updated.
2.1.1.2. Enter the property log update sub-flow for property overwriting.
2.1.2. If it does not exist, add the property to this vertex/edge.
2.1.2.1. Set the ${property name} of this vertex/edge to the name of the message property.
2.1.2.2. Set the ${property name} value of this vertex/edge to the value of the message property.
2.1.2.3. Enter the property log update sub-flow for property addition.
3. Traverse the properties of the vertex/edge.
3.1. Compare the ${property name} update message id of the vertex/edge with the message id of the message.
3.1.1. If the two message ids are the same, the property of the vertex/edge has already been handled by the previous step.
3.1.2. If the two message ids are different:
3.1.2.1. Compare the timestamp of the message with the ${property name} update datetime of this vertex/edge.
3.1.2.1.1. If the timestamp of the message is later, the message is in normal time order; update the related information.
3.1.2.1.1.1. Set the ${property name} value of this vertex/edge to null.
3.1.2.1.2. If the timestamp of the message is earlier, the message is out of time order and the related information does not need to be updated.
3.1.2.2. Enter the property log update sub-flow for property deletion.
4. Enter the vertex log/edge log update sub-flow for vertex/edge overwriting.
5. Call the graph database API and update the vertex/edge data in the graph database.
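Steps 1 and 2 of this sub-flow can be sketched as follows; step 3 (setting to null the stored properties absent from the message) and the log sub-flows are omitted, timestamps are assumed to be fixed-width strings so that string comparison matches time order, and all field spellings are assumptions.

def overwrite_properties(stored, message):
    if stored.get("vertex update message id") == message["message id"]:
        return stored                                         # step 1.1: repeated message, nothing to do
    for prop in message["data"]["properties"]:                # step 2: traverse the message properties
        name, ts = prop["name"], message["timestamp"]
        if f"{name} value" not in stored:                     # step 2.1.2: property missing, add it
            stored[f"{name} value"] = prop["value"]
            stored[f"{name} update datetime"] = ts
            stored[f"{name} update message id"] = message["message id"]
        elif ts > stored[f"{name} update datetime"]:          # step 2.1.1: later timestamp, overwrite
            stored[f"{name} value"] = prop["value"]
            stored[f"{name} update datetime"] = ts
            stored[f"{name} update message id"] = message["message id"]
        # earlier timestamp: out-of-order message, the stored value is left untouched
    return stored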
3.3. Flow for deleting a vertex
This flow does not affect the property information of the vertex.
Data structure:
vertex id: vertex id.
The specific flow is as follows:
1. Lock the vertex id using a distributed lock.
2. Call the graph database API and query the vertex by vertex id.
3. Determine whether the vertex already exists.
3.1. If the vertex already exists:
3.1.1. Enter the vertex log update sub-flow for vertex deletion.
4. Unlock the vertex id using the distributed lock.
3.4. Flow for adding or overwriting an edge
Data structure:
from vertex id: the id of the edge's start vertex.
to vertex id: the id of the edge's end vertex.
relationship: the relationship between the start vertex and the end vertex.
properties: see the data structure of the properties part of data in the message data structure.
In some embodiments, locking the data number corresponding to the incremental data includes:
when the incremental data is an edge, sorting the start vertex id and the end vertex id of the edge and locking them in that order.
In some embodiments, updating the corresponding data in the graph database according to the operation type and the query result includes:
when the operation type is add or overwrite, entering the add-edge sub-flow if the query result shows that the historical data does not exist, or the overwrite-edge sub-flow if it does;
and when the operation type is delete, deleting the corresponding edge data according to the additional conditions.
The specific flow is as follows:
1. Sort the start vertex id and the end vertex id of the edge and lock the vertex ids in that order using distributed locks.
2. Call the graph database API and query the edge by from vertex id, to vertex id and relationship.
3. Determine whether the edge exists.
3.1. If the edge does not exist, enter the add-edge sub-flow.
3.2. If the edge already exists, enter the overwrite-edge sub-flow.
4. Unlock the vertex ids using the distributed locks.
In some embodiments, the add-edge sub-flow includes:
setting the from vertex id of the edge to the from vertex id of the message;
setting the to vertex id of the edge to the to vertex id of the message;
setting the relationship of the edge to the relationship of the message;
setting the available field of the edge to true;
traversing the properties of the message to generate the corresponding properties of the edge;
updating the edge log;
and adding the edge data to the graph database.
3.4.1. Add-edge sub-flow
1. Generate the edge data.
1.1. Set the from vertex id of this edge to the from vertex id of the message.
1.2. Set the to vertex id of this edge to the to vertex id of the message.
1.3. Set the relationship of this edge to the relationship of the message.
1.4. Set the available field of this edge to true.
1.5. Traverse the properties of the message to generate the corresponding properties of the edge.
1.5.1. Set the ${property name} of this edge to the name of the message property.
1.5.2. Set the ${property name} value of this edge to the value of the message property.
1.5.3. Enter the property log update sub-flow for property addition.
1.6. Enter the edge log update sub-flow for edge addition.
2. Call the graph database API and add the edge data to the graph database.
3.5. Flows for deleting edges
This processing does not affect the property information of the vertices. Edge deletion is divided into three sub-types: deleting one edge by from vertex id, to vertex id and relationship; deleting multiple edges by from vertex id and to vertex id; and deleting multiple edges by from vertex id or to vertex id. Note that when multiple edges are deleted, each edge must be queried again from the graph database after locking to avoid dirty writes (that is, between querying the multiple edges and locking the two vertices of one of them, another process may update that edge and change its information, so the edge information must be re-read).
In some embodiments, deleting the corresponding edge data according to the additional conditions includes:
if the additional conditions are from vertex id, to vertex id and relationship, deleting the single corresponding edge;
if the additional conditions are from vertex id and to vertex id, deleting the multiple corresponding edges;
and if the additional condition is from vertex id or to vertex id, deleting the multiple corresponding edges.
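The three cases can be sketched as a single dispatch on which additional conditions are present; the query method names reused here are the assumed ones from the GraphAPI sketch in the first section, not a real product API.

def delete_edges(api, from_id=None, to_id=None, relationship=None):
    if from_id and to_id and relationship:
        edges = [api.query_edge(from_id, to_id, relationship)]    # exactly one edge
    elif from_id and to_id:
        edges = api.query_edges_by_both(from_id, to_id)            # several edges
    else:
        edges = api.query_edges_by_either(from_id or to_id)        # several edges
    # each returned edge is then locked, logically deleted and its edge log updated
    return [e for e in edges if e is not None]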
3.5.1. Flow for deleting one edge by from vertex id, to vertex id and relationship
Data structure:
from vertex id: the id of the edge's start vertex.
to vertex id: the id of the edge's end vertex.
relationship: the relationship between the start vertex and the end vertex.
The flow is as follows:
1. Sort the start vertex id and the end vertex id of the edge and lock the vertex ids in that order using distributed locks.
2. Call the graph database API and query the edge by start vertex id, end vertex id and relationship.
3. Determine whether the edge exists.
3.1. If the edge already exists:
3.1.1. Sort the vertex ids of the edge's start vertex and end vertex and lock the vertex ids in that order using distributed locks.
3.1.2. Enter the edge log update sub-flow for edge deletion.
3.1.3. Unlock the vertex ids using the distributed locks.
4. Unlock the vertex ids using the distributed locks.
3.5.2. Flow for deleting multiple edges by from vertex id and to vertex id
Data structure:
from vertex id: the id of the edge's start vertex.
to vertex id: the id of the edge's end vertex.
The flow is as follows:
1. Call the graph database API and query the related edges by start vertex id and end vertex id.
2. Traverse the queried edges.
2.1. Sort the vertex ids of the edge's start vertex and end vertex and lock the vertex ids in that order using distributed locks.
2.2. Enter the edge log update sub-flow for edge deletion.
2.3. Unlock the vertex ids using the distributed locks.
3.5.3. Flow for deleting multiple edges by from vertex id or to vertex id
Data structure:
from vertex id: the id of the edge's start vertex.
or
to vertex id: the id of the edge's end vertex.
The flow is as follows:
1. Call the graph database API and query the related edges by start vertex id or end vertex id.
2. Traverse the queried edges.
2.1. Sort the vertex ids of the edge's start vertex and end vertex and lock the vertex ids in that order using distributed locks.
2.2. Enter the edge log update sub-flow for edge deletion.
2.3. Unlock the vertex ids using the distributed locks.
3.6. Universal sub-flows for the vertex log/edge log/property log
These sub-flows include updating the vertex/edge/${property name} available field.
3.6.1. Log update sub-flow when a vertex/edge/property is added
1. Update the vertex/edge/${property name} create message id to the message id of the message.
2. Update the vertex/edge/${property name} create datetime to the timestamp of the message.
3. Update the vertex/edge/${property name} update message id to the message id of the message.
4. Update the vertex/edge/${property name} update datetime to the timestamp of the message.
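A sketch of this sub-flow, with the prefix standing for "vertex", "edge" or a property name and all field spellings assumed:

def init_log_fields(record, prefix, message):
    record[f"{prefix} create message id"] = message["message id"]   # step 1
    record[f"{prefix} create datetime"] = message["timestamp"]      # step 2
    record[f"{prefix} update message id"] = message["message id"]   # step 3
    record[f"{prefix} update datetime"] = message["timestamp"]      # step 4
    return record

vertex = init_log_fields({"vertex id": "506"}, "vertex",
                         {"message id": "5432898950",
                          "timestamp": "2001-02-28 10:05:23.613"})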
3.6.2. Log update sub-flow when a vertex/edge/property is overwritten
1. Compare the timestamp of the message with the vertex/edge/${property name} create datetime.
1.1. If the timestamp of the message is later, the message is in normal time order and the related information does not need to be updated.
1.2. If the timestamp of the message is earlier, the message is out of time order; update the related information.
1.2.1. Update the vertex/edge/${property name} create datetime to the timestamp of the message.
1.2.2. Update the vertex/edge/${property name} create message id to the message id of the message.
2. Compare the timestamp of the message with the vertex/edge/${property name} update datetime.
2.1. If the timestamp of the message is later, the message is in normal time order; update the related information.
2.1.1. Update the vertex/edge/${property name} update datetime to the timestamp of the message.
2.1.2. Update the vertex/edge/${property name} update message id to the message id of the message.
2.2. If the timestamp of the message is earlier, the message is out of time order and the related information does not need to be updated.
3. Compare the timestamp of the message with the vertex/edge/${property name} replace datetime.
3.1. If the vertex/edge/${property name} replace datetime does not exist, or the timestamp of the message is later, the time order is normal; update the related information.
3.1.1. Update the vertex/edge/${property name} replace message id to the message id of the message.
3.1.2. Update the vertex/edge/${property name} replace datetime to the timestamp of the message.
3.2. If the timestamp of the message is earlier, the message is out of time order and the related information does not need to be updated.
4. Compare the timestamp of the message with the vertex/edge/${property name} delete datetime.
4.1. If the vertex/edge/${property name} delete datetime exists and the timestamp of the message is later, the vertex/edge/${property name} was deleted and is now being added again.
4.1.1. Update the vertex/edge/${property name} available to true.
3.6.3. Log update sub-flow when a vertex/edge/property is deleted
1. Compare the timestamp of the message with the vertex/edge/${property name} update datetime.
1.1. If the timestamp of the message is later, the message is in normal time order; update the related information.
1.1.1. Update the vertex/edge/${property name} update datetime to the timestamp of the message.
1.1.2. Update the vertex/edge/${property name} update message id to the message id of the message.
1.2. If the timestamp of the message is earlier, the message is out of time order and the related information does not need to be updated.
2. Compare the timestamp of the message with the vertex/edge/${property name} replace datetime.
2.1. If the vertex/edge/${property name} replace datetime does not exist, or the timestamp of the message is later, the time order is normal; update the related information.
2.1.1. Update the vertex/edge/${property name} available to false.
2.1.2. Update the vertex/edge/${property name} delete message id to the message id of the message.
2.1.3. Update the vertex/edge/${property name} delete datetime to the timestamp of the message.
2.2. If the timestamp of the message is earlier, the message is out of time order and the related information does not need to be updated.
3. Compare the timestamp of the message with the vertex/edge/${property name} delete datetime.
3.1. If the vertex/edge/${property name} delete datetime exists and the timestamp of the message is later, the vertex/edge/${property name} is being deleted again.
3.1.1. Update the vertex/edge/${property name} delete message id to the message id of the message.
3.1.2. Update the vertex/edge/${property name} delete datetime to the timestamp of the message.
In this scheme, the vertex/edge/property log records the creation time, latest modification time, deletion time, overwrite time and the corresponding message ids of the vertex/edge/property. The message id is used to determine whether the same message has already been processed. The creation, latest modification, deletion and overwrite times are used together to solve the time-ordering problem: the scheme first judges whether the time order of the message currently being processed is correct and then applies the corresponding flow, so that even when messages arrive out of order, the final result is the same as it would be for the normal order.
For example, suppose there are two messages (message-1 and message-2) that update the same ${property name} value (score value) of the same vertex (vertex id 1200) but with different values (80.5 and 60.9). In the normal order, message-1 is processed first and message-2 second, and score value ends up as 60.9. In the abnormal order, message-2 is processed first and message-1 second. After message-2 is processed, score update datetime has been updated to 2019-10-03 13:19:26.152. When message-1 is then processed, the designed flow first compares the timestamp of message-1 with score update datetime, finds that the timestamp of message-1 is earlier and the time order is abnormal, and therefore does not update score value. The final results of the normal and abnormal orders are thus identical, including the score log, achieving both time ordering and idempotence.
message-1:
message id: 187374
timestamp: 2019-10-03 09:30:29.374
operation: add or replace vertex property
data:
vertex id: 1200
properties:
property:
name: score
value: 80.5
message-2:
message id: 187481
timestamp: 2019-10-03 13:19:26.152
operation: add or replace vertex property
data:
vertex id: 1200
properties:
property:
name: score
value: 60.9
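Replaying these two messages in the wrong order with the same timestamp rule as the overwrite sub-flow gives the same final state; the small apply function below is an assumed stand-in for that sub-flow, and the timestamps are fixed-width strings so string comparison matches time order.

def apply(vertex, msg):
    ts, val = msg["timestamp"], msg["data"]["properties"][0]["value"]
    if "score value" not in vertex or ts > vertex["score update datetime"]:
        vertex["score value"] = val                   # normal time order: overwrite
        vertex["score update datetime"] = ts
        vertex["score update message id"] = msg["message id"]
    # otherwise the message is out of time order and score value is left alone

message_1 = {"message id": "187374", "timestamp": "2019-10-03 09:30:29.374",
             "data": {"vertex id": "1200",
                      "properties": [{"name": "score", "value": 80.5}]}}
message_2 = {"message id": "187481", "timestamp": "2019-10-03 13:19:26.152",
             "data": {"vertex id": "1200",
                      "properties": [{"name": "score", "value": 60.9}]}}

vertex = {"vertex id": "1200"}
for msg in (message_2, message_1):        # abnormal order: message-2 first
    apply(vertex, msg)

assert vertex["score value"] == 60.9      # identical to the normal-order result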
With the vertex as the minimum unit, different vertices can be processed simultaneously, and processing an edge is treated as processing its two vertices simultaneously. Locking the vertex id avoids dirty reads and dirty writes when the same vertex or edge is processed concurrently. Handling concurrency at the granularity of a single vertex keeps conflicts as small as possible and effectively increases the actual parallelism of distributed parallel processing. In general, as the volume of graph data processing grows, the data in the graph database also grows, i.e. the numbers of vertices and edges increase. When the processing volume grows and the processing parallelism is increased accordingly, the probability of conflicts does not grow linearly. The processing load per node can therefore be reduced by scaling out, which lowers latency and guarantees real-time performance.
In conclusion, the scheme simultaneously achieves distribution, real-time performance, time ordering and idempotence for the incremental update process from source data to graph database. It supports distributed parallel transmission and processing of data (with multiple processes and multiple threads): regardless of the order in which data is transmitted and processed, and regardless of whether transmission or processing is repeated, as long as every piece of related data is successfully transmitted and processed at least once, the final result is guaranteed to be completely consistent with the result of executing the data serially in time order.
The present application further provides the following embodiments:
a data updating apparatus for a graph database, comprising:
the determining module is used for determining the incremental data and the operation type of the updating operation;
the locking module is used for locking the data number corresponding to the incremental data by using a distributed lock;
the query module is used for calling a query interface of the graph database and querying whether the data number has corresponding historical data;
the judging module is used for updating corresponding data in the graph database according to the operation type and the judging result;
and the unlocking module is used for unlocking the data number after the updating operation is completed.
With regard to the apparatus in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method, and is not elaborated here.
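Purely for illustration, the cooperation of the five modules for a vertex message might be sketched as below. The GraphClient and LockManager classes, their method names, and the message layout are hypothetical stand-ins rather than the interface of any particular graph database or lock service, and the timestamp and message id checks shown earlier are omitted for brevity.

import threading
from contextlib import contextmanager

class LockManager:
    """In-process stand-in for the distributed lock used by the locking module."""
    def __init__(self):
        self._locks, self._guard = {}, threading.Lock()
    @contextmanager
    def lock(self, data_number):
        with self._guard:
            lk = self._locks.setdefault(data_number, threading.Lock())
        with lk:
            yield

class GraphClient:
    """Hypothetical wrapper around the graph database's query interface."""
    def __init__(self):
        self._vertices = {}
    def query_vertex(self, vertex_id):
        return self._vertices.get(vertex_id)
    def add_vertex(self, vertex):
        self._vertices[vertex["vertex id"]] = vertex
    def delete_vertex(self, vertex_id):
        self._vertices.pop(vertex_id, None)

def update_graph(message, graph, locks):
    # determining module: incremental data (a vertex here) and operation type
    vertex_id, operation = message["vertex id"], message["operation"]
    # locking module: lock the data number (the vertex id) with the distributed lock
    with locks.lock(("vertex", vertex_id)):
        # query module: does historical data exist for this data number?
        existing = graph.query_vertex(vertex_id)
        # judging module: update according to operation type and query result
        if operation == "add or replace":
            if existing is None:
                graph.add_vertex({"vertex id": vertex_id, "available": True,
                                  "properties": dict(message["properties"])})
            else:
                existing["properties"].update(message["properties"])
        elif operation == "delete" and existing is not None:
            graph.delete_vertex(vertex_id)
    # unlocking module: the lock is released when the with-block exits

graph, locks = GraphClient(), LockManager()
update_graph({"vertex id": 1200, "operation": "add or replace",
              "properties": {"score": 80.5}}, graph, locks)
print(graph.query_vertex(1200)["properties"])   # {'score': 80.5}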
It should be understood that the same or similar parts of the above embodiments may be referred to one another, and content not described in detail in one embodiment may refer to the same or similar description in other embodiments.
It should be noted that, in the description of the present application, the terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Further, in the description of the present application, "a plurality" means at least two unless otherwise specified.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present application also includes implementations in which functions are executed out of the order shown or discussed, including substantially concurrently or in reverse order depending on the functionality involved, as would be understood by those skilled in the art to which the present application pertains.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (10)

1. A method for updating data in a graph database, comprising:
determining incremental data and operation types of the updating operation;
locking the data number corresponding to the incremental data by using a distributed lock;
calling a query interface of a graph database, and querying whether corresponding historical data exist in the data number;
updating corresponding data in the graph database according to the operation type and the judgment result;
and after the updating operation is completed, unlocking the data number.
2. The method of claim 1, wherein locking the data number corresponding to the incremental data comprises:
when the incremental data is a vertex, the vertex id is locked.
3. The method according to claim 2, wherein said updating the corresponding data in the graph database according to the operation type and the determination result comprises:
when the operation type is adding or covering, if the judgment result is that the corresponding historical data does not exist, entering a sub-process of adding a vertex; if the judgment result is that the corresponding historical data exists, entering a sub-process of covering the vertex;
and when the operation type is deletion, if the judgment result is that the corresponding historical data exists, entering a sub-process of deleting the vertex and updating the vertex log.
4. The method of claim 3, wherein the sub-process of adding vertex comprises:
setting the vertex id of the vertex as the vertex id of the message;
setting available of vertex as true;
traversing properties of the message to generate properties corresponding to vertex;
updating the vertex log;
and adding the vertex data to the graph database.
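A minimal sketch of the add-vertex sub-process of claim 4, under the assumption that the vertex log and the per-property logs are stored as ordinary fields of the vertex record (the field names are illustrative):

from datetime import datetime

def build_new_vertex(message):
    """Build the vertex record from the message before writing it to the graph database."""
    update_date = datetime.strptime(message["timestamp"], "%Y-%m-%d %H:%M:%S.%f")
    vertex = {
        "vertex id": message["vertex id"],        # vertex id taken from the message
        "available": True,                        # a newly added vertex is available
        "properties": {},
        # vertex-level log recording which message produced this state
        "vertex update message id": message["message id"],
        "vertex update date": update_date,
    }
    # traverse the message properties and generate the vertex properties,
    # each carrying its own per-property update log
    for name, value in message["properties"].items():
        vertex["properties"][name] = {
            "value": value,
            "update message id": message["message id"],
            "update date": update_date,
        }
    return vertex

msg = {"message id": 187374, "timestamp": "2019-10-03 09:30:29.374",
       "vertex id": 1200, "properties": {"score": 80.5}}
print(build_new_vertex(msg)["properties"]["score"]["value"])   # 80.5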
5. The method of claim 3, wherein the sub-flow of the overlay vertex comprises:
comparing the vertex update message id of the vertex with the message id of the message, and if the two message ids are the same, ending the process;
traversing the properties of the message, and searching the properties of the vertex for a ${property name} that is the same as the name of the property of the message;
if such a property exists, overriding the property of the vertex; if not, adding the property to the vertex;
traversing the properties of the vertex, and comparing the ${property name} update message id of the vertex with the message id of the message;
if the two message ids are not the same, comparing the timestamp of the message with the ${property name} update date of the vertex;
and if the timestamp of the message is later, updating the related information.
6. The method of claim 2, wherein locking the data number corresponding to the incremental data comprises:
and when the incremental data is an edge, sorting the starting vertex id and the end vertex id of the edge, and locking the starting vertex id and the end vertex id in the sorted order.
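One way to realize the ordered locking of claim 6 is sketched below. The in-process lock registry stands in for a distributed lock service; sorting the two vertex ids before acquisition is what prevents two workers that touch the same pair of vertices from deadlocking.

import threading
from contextlib import contextmanager

_locks = {}
_registry_guard = threading.Lock()

def _lock_for(vertex_id):
    # stand-in for obtaining a distributed lock handle keyed by the vertex id
    with _registry_guard:
        return _locks.setdefault(vertex_id, threading.Lock())

@contextmanager
def edge_lock(start_vertex_id, end_vertex_id):
    """Lock both endpoint vertex ids of an edge, always in sorted order."""
    first, second = sorted((start_vertex_id, end_vertex_id))
    with _lock_for(first):
        if second != first:            # a self-loop edge needs only one lock
            with _lock_for(second):
                yield
        else:
            yield

# Updates to edge 1200->900 and edge 900->1200 both lock 900 before 1200,
# so two concurrent workers cannot each hold one lock and wait for the other.
with edge_lock(1200, 900):
    pass   # query / add / cover / delete the edge data here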
7. The method according to claim 6, wherein said updating the corresponding data in the graph database according to the operation type and the determination result comprises:
when the operation type is adding or covering, if the judgment result is that the corresponding historical data does not exist, entering a sub-process of adding an edge; if the judgment result is that the corresponding historical data exists, entering a sub-process of covering the edge;
and when the operation type is deletion, deleting the corresponding edge data according to the additional condition.
8. The method of claim 7, wherein the adding edge sub-flow comprises:
setting the from vertex id of the edge as the from vertex id of the message;
setting the to vertex id of the edge as the to vertex id of the message;
setting the relationship of the edge as the relationship of the message;
setting available of the edge as true;
traversing properties of the message to generate properties corresponding to the edge;
updating the edge log;
adding edge data to the graph database.
9. The method of claim 6, wherein deleting the corresponding edge data according to the additional condition comprises:
if the additional conditions are from vertex id, to vertex id and relationship, deleting the corresponding 1 edge data;
if the additional conditions are from vertex id and to vertex id, deleting a plurality of corresponding edge data;
and if the additional condition is from vertex id or to vertex id, deleting a plurality of corresponding edge data.
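The additional conditions of claim 9 can be read as progressively narrower filters. The sketch below selects the edge data that a given combination of conditions would delete; the in-memory edge list and field names are illustrative stand-ins for a graph-database query.

def matching_edges(edges, from_vertex_id=None, to_vertex_id=None, relationship=None):
    """Return the edges matching the supplied additional conditions; the more
    conditions given, the narrower the match."""
    result = []
    for edge in edges:
        if from_vertex_id is not None and edge["from vertex id"] != from_vertex_id:
            continue
        if to_vertex_id is not None and edge["to vertex id"] != to_vertex_id:
            continue
        if relationship is not None and edge["relationship"] != relationship:
            continue
        result.append(edge)
    return result

edges = [
    {"from vertex id": 1200, "to vertex id": 900, "relationship": "transfer"},
    {"from vertex id": 1200, "to vertex id": 900, "relationship": "owns"},
    {"from vertex id": 900,  "to vertex id": 700, "relationship": "transfer"},
]
# All three conditions: exactly one edge matches and would be deleted.
print(len(matching_edges(edges, 1200, 900, "transfer")))   # 1
# Only from vertex id: every outgoing edge of vertex 1200 matches.
print(len(matching_edges(edges, from_vertex_id=1200)))     # 2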
10. An apparatus for updating data in a graph database, comprising:
the determining module is used for determining the incremental data and the operation type of the updating operation;
the locking module is used for locking the data number corresponding to the incremental data by using a distributed lock;
the query module is used for calling a query interface of the graph database and querying whether the data number has corresponding historical data;
the judging module is used for updating corresponding data in the graph database according to the operation type and the judging result;
and the unlocking module is used for unlocking the data number after the updating operation is completed.
CN202010241791.2A 2020-03-31 2020-03-31 Data updating method and device for graph database Pending CN111309750A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010241791.2A CN111309750A (en) 2020-03-31 2020-03-31 Data updating method and device for graph database

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010241791.2A CN111309750A (en) 2020-03-31 2020-03-31 Data updating method and device for graph database

Publications (1)

Publication Number Publication Date
CN111309750A true CN111309750A (en) 2020-06-19

Family

ID=71146053

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010241791.2A Pending CN111309750A (en) 2020-03-31 2020-03-31 Data updating method and device for graph database

Country Status (1)

Country Link
CN (1) CN111309750A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112015819A (en) * 2020-08-31 2020-12-01 杭州欧若数网科技有限公司 Data updating method, device, equipment and medium for distributed graph database
CN112860953A (en) * 2021-01-27 2021-05-28 国家计算机网络与信息安全管理中心 Data importing method, device, equipment and storage medium of graph database
CN116028651A (en) * 2023-03-28 2023-04-28 南京万得资讯科技有限公司 Knowledge graph construction system and method supporting ontology and data increment updating

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101726309A (en) * 2009-12-18 2010-06-09 吉林大学 Navigation electronic map dynamic topology rebuilding system method based on road data increment updating
CN102693324A (en) * 2012-01-09 2012-09-26 西安电子科技大学 Distributed database synchronization system, synchronization method and node management method
US20130110766A1 (en) * 2009-10-13 2013-05-02 Open Text Software Gmbh Method for performing transactions on data and a transactional database
CN106354729A (en) * 2015-07-16 2017-01-25 阿里巴巴集团控股有限公司 Graph data handling method, device and system
CN107967279A (en) * 2016-10-19 2018-04-27 北京国双科技有限公司 The data-updating method and device of distributed data base
US20180144060A1 (en) * 2016-11-23 2018-05-24 Linkedin Corporation Processing deleted edges in graph databases
CN108415835A (en) * 2018-02-22 2018-08-17 北京百度网讯科技有限公司 Distributed data library test method, device, equipment and computer-readable medium
CN108595251A (en) * 2018-05-10 2018-09-28 腾讯科技(深圳)有限公司 Dynamic Graph update method, device, storage engines interface and program medium
CN109033234A (en) * 2018-07-04 2018-12-18 中国科学院软件研究所 It is a kind of to update the streaming figure calculation method and system propagated based on state
CN109582831A (en) * 2018-10-16 2019-04-05 中国科学院计算机网络信息中心 A kind of chart database management system for supporting unstructured data storage and inquiry
CN109670089A (en) * 2018-12-29 2019-04-23 颖投信息科技(上海)有限公司 Knowledge mapping system and its figure server
CN110609904A (en) * 2019-09-11 2019-12-24 深圳众赢维融科技有限公司 Graph database data processing method and device, electronic equipment and storage medium
CN110866024A (en) * 2019-11-06 2020-03-06 山东省国土测绘院 Vector database increment updating method and system


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200619