CA3176758A1 - Method and apparatus for introducing data to a graph database - Google Patents
Method and apparatus for introducing data to a graph database
Info
- Publication number
- CA3176758A1
- Authority
- CA
- Canada
- Prior art keywords
- spark
- graph database
- data
- udf
- introducing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/51—Indexing; Data structures therefor; Storage structures
- G06F16/53—Querying
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
A graph database data import method and apparatus, the method comprising: registering a custom spark udf function to a graph database program, so that the graph database establishes a connection with a spark resource by means of the spark udf function (S1); creating a node attribute index in the graph database (S2); using the spark resource to query a hive database, and acquiring queried data (S3); after re-partitioning, registering the queried data to a temporary data table (S4); and, by means of the spark udf function and the node attribute index, importing the temporary data table to the graph database (S5). Real-time import of data can be implemented by means of using the combination of spark and a graph database, without the need to export data in a csv format; the use of spark technology facilitates spark performance optimisation and data import speed adjustment; and, by means of using the concurrency feature of spark, data import speed can be increased without data loss.
Description
METHOD AND APPARATUS FOR INTRODUCING DATA TO A GRAPH DATABASE
BACKGROUND OF THE INVENTION
Technical Field
[0001] The present invention relates to the technical field of data processing, and more particularly to a method and an apparatus for introducing data to a graph database.
Description of Related Art
[0002] Spark is a cluster-based, in-memory data processing technology. It can process massive data sets across multiple machines and can be integrated with graph computing frameworks to compute data. Spark can be integrated in different ways, and it can also pre-process data (through aggregation, filtering, and conversion) and introduce the pre-processed data into graph databases.
[0003] In the prior art, there are several ways to introduce data into graph databases, including writing CREATE statements, using LOAD CSV statements, and using a batch inserter, batch import, or neo4j-import. Except for CREATE statements, all of these require the source file to be converted into the csv format, which is troublesome in real-world production and development environments. For example, data in a production environment are usually confidential, so most companies cannot feasibly export them into a csv file, and this approach does not support real-time insertion. Exporting is also particularly impractical for massive data. The latter three are incapable of real-time introduction: they require the neo4j server (neo4j being a type of graph database) to be shut down while data are introduced, and thus are inherently not real-time.
[0004] Hence, how to introduce data into a graph database with increased speed is crucial to construction of graphs, and is a pressing issue to address in the art.
Date Reçue/Date Received 2022-09-23
SUMMARY OF THE INVENTION
[0005] For addressing the issues of the prior art, embodiments of the present invention provide a method and an apparatus for introducing data to a graph database, which overcome the problems of the prior art such as the necessity of converting data into the csv format before the data can be introduced into a graph database and the incapability of adjusting the speed of data introduction.
[0006] To solve the foregoing one or more technical problems, the present invention adopts the following technical schemes.
[0007] In one aspect, the present invention provides a method for introducing data to a graph database. The method comprises the following steps:
[0008] registering a user-defined spark udf to a graph database program, so that the graph database is connected to a spark resource through the spark udf;
[0009] creating node attribute indexes in the graph database;
[0010] using the spark resource to make enquiry to a hive database to acquire enquiry-generated data;
[0011] re-partitioning the enquiry-generated data and registering them as a temporary data table; and
[0012] introducing the temporary data table to the graph database using the spark udf and the node attribute indexes.
[0013] Further, before the step of registering a user-defined spark udf to a graph database program, the method further comprises:
[0014] setting up a driver in the spark udf for connecting the graph database and writing it in a static method.
[0015] Further, before the step of registering a user-defined spark udf to a graph database program the method further comprises:
[0016] defining parameters for exporting and importing the spark udf.
[0017] Further, after the step of introducing the temporary data table to the graph database the method further comprises:
[0018] turning off the driver of the graph database and the spark resource.
[0019] Further, the step of using the spark resource to make enquiry to a hive database comprises:
[0020] using a reduce operator of the spark resource to perform corresponding computation on the enquiry-generated data.
[0021] In another aspect, the present invention provides an apparatus for introducing data to a graph database. The apparatus comprises:
[0022] a connecting module, for registering a user-defined spark udf to a graph database program, so that the graph database is connected to a spark resource through the spark udf;
[0023] a creating module, for creating node attribute indexes in the graph database;
[0024] an enquiring module, for using the spark resource to make enquiry to a hive database to acquire enquiry-generated data;
[0025] a partitioning module, for re-partitioning the enquiry-generated data and registering them as a temporary data table; and
[0026] an introducing module, for introducing the temporary data table to the graph database using the spark udf and the node attribute indexes.
[0027] Further, the apparatus further comprises:
[0028] a driver connecting module, for setting up a driver in the spark udf for connecting the graph database and writing it in a static method.
[0029] Further, the apparatus further comprises:
[0030] a configuring module, for defining parameters for exporting and importing the spark udf.
[0031] Further, the apparatus further comprises:
[0032] a deactivating module, for turning off the driver of the graph database and the spark resource.
[0033] Further, the step of using the spark resource to make enquiry to a hive database comprises:
[0034] using a reduce operator of the spark resource to perform corresponding computation on the enquiry-generated data.
[0035] The technical schemes of the embodiments of the present invention provide the following beneficial effects:
[0036] 1. The method and apparatus for introducing data to a graph database of the embodiments of the present invention use the combination of spark and graph databases to realize real-time introduction of data while eliminating the need of exporting data into the csv format;
[0037] 2. The method and apparatus for introducing data to a graph database of the embodiments of the present invention use the spark technology to achieve easy optimization of spark performance and adjustment of speed of data introduction; and
[0038] 3. The method and apparatus for introducing data to a graph database of the embodiments of the present invention use the feature of spark about concurrency to accelerate introduction of data without data loss.
BRIEF DESCRIPTION OF THE DRAWINGS
[0039] To better illustrate the technical schemes as disclosed in the embodiments of the present invention, accompanying drawings referred in the description of the embodiments below are introduced briefly. It is apparent that the accompanying drawings as recited in the following description merely provide a part of possible embodiments of the present invention, and people of ordinary skill in the art would be able to obtain more drawings according to those provided herein without paying creative efforts, wherein:
[0040] FIG. 1 is a flowchart of a method for introducing data to a graph database according to one exemplificative embodiment; and
[0041] FIG. 2 is a structural diagram of an apparatus for introducing data to a graph database according to one exemplificative embodiment.
DETAILED DESCRIPTION OF THE INVENTION
[0042] To make the foregoing objectives, features, and advantages of the present invention clearer and more understandable, the following description will be directed to some embodiments as depicted in the accompanying drawings to detail the technical schemes disclosed in these embodiments. It is, however, to be understood that the embodiments referred herein are only a part of all possible embodiments and thus not exhaustive. Based on the embodiments of the present invention, all the other embodiments can be conceived without creative labor by people of ordinary skill in the art, and all these and other embodiments shall be encompassed in the scope of the present invention.
[0043] FIG. 1 is a flowchart of a method for introducing data to a graph database according to one exemplificative embodiment. As shown, the method comprises the following steps.
[0044] Step S1 involves registering a user-defined spark udf to a graph database program, so that the graph database is connected to a spark resource through the spark udf.
[0045] Specifically, in the embodiment of the present invention, the user-defined spark udf combines the graph database and the spark resource (i.e., it establishes a connection between the graph database and the spark resource), so that data can be introduced into the graph database in real time without having to export the data into the csv format. The user-defined spark udf can be written in the Java language or in other computer-programming languages.
Additionally, the written user-defined spark udf must first be registered in the graph database program, because in Java-based methods only a registered user-defined udf can be used. It is to be noted that, in the embodiment of the present invention, the use of the spark udf facilitates optimization of spark performance by, for example, deciding into how many partitions the data queried from hive are re-partitioned, how to set spark concurrency, how many computing nodes (executors) to assign to a spark task, how much memory to assign to each executor, how many cores to set for one executor, and how much memory to assign to the driver.
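The tuning decisions listed above correspond to standard Spark configuration properties. The following is a minimal sketch in plain Java with no Spark dependency; the class name `ImportTuning` and all values are hypothetical and are not taken from the source. It only collects the settings and derives how many tasks could run at the same moment, assuming one core per task.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: gathers the Spark settings named in the text.
// The values passed in are illustrative assumptions, not recommendations.
final class ImportTuning {
    static Map<String, String> sparkSettings(int executors, int coresPerExecutor,
                                             String executorMemory, String driverMemory) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("spark.executor.instances", Integer.toString(executors));
        conf.put("spark.executor.cores", Integer.toString(coresPerExecutor));
        conf.put("spark.executor.memory", executorMemory);
        conf.put("spark.driver.memory", driverMemory);
        return conf;
    }

    // Upper bound on tasks running at once, assuming each task uses one core.
    static int concurrentTaskSlots(Map<String, String> conf) {
        return Integer.parseInt(conf.get("spark.executor.instances"))
             * Integer.parseInt(conf.get("spark.executor.cores"));
    }

    public static void main(String[] args) {
        Map<String, String> conf = sparkSettings(4, 2, "4g", "2g");
        conf.forEach((k, v) -> System.out.println("--conf " + k + "=" + v));
        System.out.println("concurrent task slots: " + concurrentTaskSlots(conf));
    }
}
```

Under these assumptions, a partition count at least as large as the number of task slots keeps every core busy during the import.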
[0046] It is to be noted that a spark udf exists so that, when the functions provided by spark cannot satisfy user needs, a user-defined function can be used to implement the user's own business logic. Following is an example:
[0047] import java.io.Serializable;
import org.apache.spark.sql.api.java.UDF5;

public class CreateACCTOBANKCARD2 implements UDF5<String, String, Integer, String, String, String>, Serializable {
    @Override
    public String call(final String pay_acct_no, final String ttl_amt, final Integer ttl_times, final String latest_pay_time, final String rcvr_user) throws Exception {
[0048]     // This udf has five input parameters and one output parameter
        // The udf's own logic goes here
        return "1";
    }
}
[0049] S2 is about creating node attribute indexes in the graph database.
[0050] Specifically, for introduction of massive data, in order to prevent duplicate nodes and relationships and to ensure fast search, in the embodiment of the present invention, node attribute indexes are created in the graph database to provide every node in the graph database with an attribute index. Without the attribute indexes, data insertion can slow down significantly.
[0051] For example:
[0052] private static Driver driver = null;
static {
    driver = OperateNeo4j.connectNeo4j(Constants.url, Constants.neo4jUser, Constants.neo4jPassword);
}
Session session = driver.session();
// The identity card number is taken as the index
session.run("create index on :IDNTY_NMBR(Idnty_Nmbr)");
// The account number is taken as the index
session.run("create index on :ACCT_NMBR(Acct_No)");
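Why the index matters can be illustrated without a database. The sketch below is plain Java with hypothetical names (`IndexedNodeStore`, `mergeNode`); a `HashMap` stands in for the attribute index, so each insert first looks up the key, which keeps the lookup cheap and prevents duplicate nodes. It is an analogy only and does not reflect how the graph database implements its indexes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a node store with one indexed attribute.
final class IndexedNodeStore {
    private final List<String> nodes = new ArrayList<>();        // stored account numbers
    private final Map<String, Integer> index = new HashMap<>();  // attribute value -> node position

    // Returns the position of the node with this key, creating it only if absent.
    int mergeNode(String acctNo) {
        Integer existing = index.get(acctNo);
        if (existing != null) {
            return existing;
        }
        nodes.add(acctNo);
        index.put(acctNo, nodes.size() - 1);
        return nodes.size() - 1;
    }

    int size() {
        return nodes.size();
    }

    public static void main(String[] args) {
        IndexedNodeStore store = new IndexedNodeStore();
        store.mergeNode("6222-0001");
        store.mergeNode("6222-0002");
        store.mergeNode("6222-0001"); // repeated account number: no new node
        System.out.println(store.size()); // prints 2
    }
}
```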
[0053] S3 involves using the spark resource to make enquiry to a hive database to acquire enquiry-generated data.
[0054] Specifically, the embodiment of the present invention is about introducing the data of a hive table into the graph database. To do so, the hive database can first be queried using the spark resource (specifically, by using a written spark sql statement to retrieve the data from the hive database), so as to acquire the enquiry-generated data.
[0055] S4 involves re-partitioning the enquiry-generated data and registering them as a temporary data table.
[0056] Specifically, in the embodiment of the present invention, the data obtained from the hive database using spark sql are re-partitioned. When computing on a resilient distributed dataset (RDD), spark launches one task for every partition, so the number of partitions of the RDD determines the total number of tasks. Accordingly, as part of spark performance optimization, the total number of tasks can be set by setting the number of partitions of the RDD. By also setting the number of computing nodes (executors) and the number of cores in every computing node, these tasks can be executed concurrently, so as to accelerate introduction of data into the graph database. It is to be noted that an RDD is the basic data abstraction in Spark. It represents an immutable, partitionable collection whose elements can be computed in parallel. An RDD has the features of a data flow model, including automatic fault tolerance, location-aware scheduling, and scalability. An RDD allows users to explicitly cache data in memory across multiple queries, so that subsequent queries can reuse these data, which significantly improves query speed. In addition, since Spark features concurrent computing, each task processes its own part of the whole data, and no data are lost.
[0057] It is to be noted that, in the embodiment of the present invention, the reason for setting the partitions is that, without re-partitioning, the number of partitions of the query results would be the same as the number of partitions in the hive table, so the concurrency level, and in turn the number of tasks executed concurrently, could not be raised. With the re-partitioning step, the number of tasks executed concurrently can be increased, thereby accelerating execution.
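The relationship between partitions, tasks, and concurrency can be sketched in plain Java. This is an analogy under stated assumptions, not Spark code: `repartition` here is a hypothetical round-robin split, a fixed thread pool stands in for the executor cores, and each task merely counts its rows where a real task would insert them. The total shows that every row is handled exactly once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PartitionedImport {
    // Splits rows round-robin into numPartitions lists (loosely analogous to re-partitioning).
    static <T> List<List<T>> repartition(List<T> rows, int numPartitions) {
        List<List<T>> parts = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            parts.add(new ArrayList<>());
        }
        for (int i = 0; i < rows.size(); i++) {
            parts.get(i % numPartitions).add(rows.get(i));
        }
        return parts;
    }

    // Runs one task per partition on a pool with `slots` threads; returns the rows processed.
    static <T> int processAll(List<List<T>> parts, int slots) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(slots);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (List<T> part : parts) {
                Callable<Integer> task = part::size; // stand-in for inserting this partition's rows
                futures.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            rows.add(i);
        }
        List<List<Integer>> parts = repartition(rows, 8);
        System.out.println(parts.size() + " tasks, " + processAll(parts, 4) + " rows processed");
    }
}
```

With 8 partitions and 4 slots, the 8 tasks run in roughly two waves; raising the partition count alone does not help once it exceeds what the available slots can absorb.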
[0058] S5 involves introducing the temporary data table to the graph database using the spark udf and the node attribute indexes.
[0059] Specifically, by using the foregoing user-defined spark udf together with the node attribute indexes created in the graph database, the present invention can introduce the temporary data table into the graph database. In the embodiment of the present invention, since the hive database is connected through Spark, data can be introduced into the graph database directly, without having to export the data into a csv file, and real-time insertion can be achieved.
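One way a udf could turn a single row of the temporary table into a graph write is a parameterised Cypher `MERGE` statement that matches on the indexed attribute. The sketch below only builds the statement and its parameters in plain Java. The class name `RowToCypher`, the `PAY_TO` relationship type, the parameter names, and the choice to label the receiver as `ACCT_NMBR` are all assumptions for illustration; the source does not show the udf body.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RowToCypher {
    // MERGE matches on the indexed Acct_No property, so re-running a row does not duplicate nodes.
    // PAY_TO and the $parameter names are hypothetical.
    static final String STATEMENT =
        "MERGE (p:ACCT_NMBR {Acct_No: $payAcctNo}) "
      + "MERGE (r:ACCT_NMBR {Acct_No: $rcvrUser}) "
      + "MERGE (p)-[t:PAY_TO]->(r) "
      + "SET t.ttl_amt = $ttlAmt, t.ttl_times = $ttlTimes, t.latest_pay_time = $latestPayTime";

    // Collects the five udf inputs into the parameter map the statement expects.
    static Map<String, Object> params(String payAcctNo, String ttlAmt, Integer ttlTimes,
                                      String latestPayTime, String rcvrUser) {
        Map<String, Object> p = new LinkedHashMap<>();
        p.put("payAcctNo", payAcctNo);
        p.put("ttlAmt", ttlAmt);
        p.put("ttlTimes", ttlTimes);
        p.put("latestPayTime", latestPayTime);
        p.put("rcvrUser", rcvrUser);
        return p;
    }

    public static void main(String[] args) {
        Map<String, Object> p = params("6222-0001", "150.00", 3, "2019-04-01 10:00:00", "6222-0002");
        // With a real driver session, the statement and map would be passed to the session together.
        System.out.println(STATEMENT);
        System.out.println(p);
    }
}
```

Passing values as parameters rather than concatenating them into the statement text avoids quoting problems in account numbers and timestamps.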
[0060] As a preferred implementation, in the embodiment of the present invention, before the step of registering a user-defined spark udf to a graph database program, the method further comprises:
[0061] setting up a driver in the spark udf for connecting the graph database and writing it in a static method.
[0062] Specifically, in the embodiment of the present invention, the driver in the user-defined spark udf for connecting the graph database (e.g., neo4j) must be set up in a static method. This reduces the number of connections made between the spark udf and the graph database, thereby reducing resource consumption.
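The effect of holding the driver in static state can be shown with a small plain-Java sketch. `FakeDriver` and `SharedDriverUdf` are hypothetical stand-ins, not the real graph database driver: the static field is initialised once when the class is loaded, so every per-row call reuses the same connection instead of opening a new one.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class SharedDriverUdf {
    // Counts opened connections; declared first so it exists before DRIVER is created.
    static final AtomicInteger connectionsOpened = new AtomicInteger();

    // Hypothetical stand-in for a graph database driver.
    static final class FakeDriver {
        FakeDriver() {
            connectionsOpened.incrementAndGet();
        }

        String insert(String acctNo) {
            return "1"; // pretend the row was written
        }
    }

    // Initialised once per class load, not once per row.
    private static final FakeDriver DRIVER = new FakeDriver();

    // Per-row entry point, loosely mirroring a udf's call method.
    static String call(String acctNo) {
        return DRIVER.insert(acctNo);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            call("acct-" + i);
        }
        System.out.println("connections opened: " + connectionsOpened.get()); // prints 1
    }
}
```

If the driver were instead created inside `call`, the same loop would open 1000 connections.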
[0063] As a preferred implementation, in the embodiment of the present invention, before the step of registering a user-defined spark udf to a graph database program the method further comprises:
[0064] defining parameters for exporting and importing the spark udf.
[0065] Specifically, in the embodiment of the present invention, the input and output parameters of the spark udf must be defined. In other words, the number and types of the input parameters and the type of the output parameter must be well defined, and the return value must not be null.
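The parameter contract described above (a fixed number of typed inputs, one typed output, never null) can be sketched in plain Java. The nested interface `FiveArgUdf` is a hypothetical stand-in that mirrors the five-input shape of the udf in the earlier example, and the `nonNull` wrapper is an illustrative assumption, not part of the source.

```java
import java.util.Objects;

final class UdfContract {
    // Hypothetical five-input, one-output function type.
    @FunctionalInterface
    interface FiveArgUdf<A, B, C, D, E, R> {
        R call(A a, B b, C c, D d, E e) throws Exception;
    }

    // Wraps a udf so that a null result fails fast instead of leaking into the query result.
    static <A, B, C, D, E, R> FiveArgUdf<A, B, C, D, E, R> nonNull(FiveArgUdf<A, B, C, D, E, R> udf) {
        return (a, b, c, d, e) -> Objects.requireNonNull(udf.call(a, b, c, d, e), "udf returned null");
    }

    public static void main(String[] args) throws Exception {
        FiveArgUdf<String, String, Integer, String, String, String> raw =
            (payAcctNo, ttlAmt, ttlTimes, latestPayTime, rcvrUser) -> "1";
        FiveArgUdf<String, String, Integer, String, String, String> insert = nonNull(raw);
        System.out.println(insert.call("A1", "10.50", 3, "2019-04-01", "B2")); // prints 1
    }
}
```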
[0066] As a preferred implementation, in the embodiment of the present invention, after the step of introducing the temporary data table to the graph database the method further comprises:
[0067] turning off the driver of the graph database and the spark resource.
[0068] Specifically, after the temporary data table is introduced into the graph database, the driver of the graph database and the spark resource need to be turned off to save resources.
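A minimal plain-Java sketch of the shutdown step follows, assuming both handles expose a `close` method. `Resource` is a hypothetical stand-in for the graph driver and the spark resource; try-with-resources guarantees that both are closed, in reverse order of opening, even if the import throws.

```java
import java.util.ArrayList;
import java.util.List;

final class ShutdownOrder {
    // Records the order in which resources were closed.
    static final List<String> closed = new ArrayList<>();

    // Hypothetical stand-in for a closeable handle.
    static final class Resource implements AutoCloseable {
        private final String name;

        Resource(String name) {
            this.name = name;
        }

        @Override
        public void close() {
            closed.add(name);
        }
    }

    static void runImport() {
        try (Resource spark = new Resource("spark resource");
             Resource driver = new Resource("graph driver")) {
            // the import work would happen here
        }
    }

    public static void main(String[] args) {
        runImport();
        System.out.println(closed); // prints [graph driver, spark resource]
    }
}
```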
[0069] As a preferred implementation, in the embodiment of the present invention, the step of using the spark resource to make enquiry to a hive database comprises:
[0070] using a reduce operator of the spark resource to perform corresponding computation on the enquiry-generated data.
[0071] Specifically, an action operator is required to trigger execution in spark, because only action operators cause computation to run. As to selection of the operator, in the embodiment of the present invention, the reduce operator is used rather than other action operators such as collect and show. This is because operators like collect and show can have a negative impact on performance, and the show operator does not compute all of the data. In other words, in the embodiment of the present invention, the reduce operator is used to trigger execution of spark, which accelerates data introduction while eliminating the risk of data loss.
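The idea that nothing runs until an action is invoked, and that a reduce visits every row while returning only a small result, can be illustrated with Java streams. This is an analogy only: Java's lazy intermediate operations and terminal `reduce` stand in for Spark transformations and the reduce action, and `insertRow` is a hypothetical stand-in for the udf that returns "1" per row.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

final class LazyUntilReduce {
    static final AtomicInteger udfCalls = new AtomicInteger();

    // Stand-in for the udf: pretends to insert a row and reports "1".
    static String insertRow(String row) {
        udfCalls.incrementAndGet();
        return "1";
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("r1", "r2", "r3");
        // Declaring the mapping does not run it.
        Stream<String> pending = rows.stream().map(LazyUntilReduce::insertRow);
        System.out.println("calls before reduce: " + udfCalls.get()); // prints 0
        // The terminal reduce forces every row through the udf and returns a single number.
        int inserted = pending.map(Integer::parseInt).reduce(0, Integer::sum);
        System.out.println("calls after reduce: " + udfCalls.get() + ", inserted: " + inserted);
    }
}
```

Summing the per-row "1" results gives a count of processed rows, which is a cheap way to confirm that all rows were visited without gathering them in one place.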
[0072] FIG. 2 is a structural diagram of an apparatus for introducing data to a graph database according to one exemplificative embodiment. As shown, the apparatus comprises:
[0073] a connecting module, for registering a user-defined spark udf to a graph database program, so that the graph database is connected to a spark resource through the spark udf;
[0074] a creating module, for creating node attribute indexes in the graph database;
[0075] an enquiring module, for using the spark resource to make enquiry to a hive database to acquire enquiry-generated data;
[0076] a partitioning module, for re-partitioning the enquiry-generated data and registering them as a temporary data table; and
[0077] an introducing module, for introducing the temporary data table to the graph database using the spark udf and the node attribute indexes.
[0078] As a preferred implementation, in the embodiment of the present invention, the apparatus further comprises:
[0079] a driver connecting module, for setting up a driver in the spark udf for connecting the graph database and writing it in a static method.
[0080] As a preferred implementation, in the embodiment of the present invention, the apparatus further comprises:
[0081] a configuring module, for defining parameters for exporting and importing the spark udf.
[0082] As a preferred implementation, in the embodiment of the present invention, the apparatus further comprises:
[0083] a deactivating module, for turning off the driver of the graph database and the resource.
[0084] As a preferred implementation, in the embodiment of the present invention, the step of using the spark resource to make enquiry to a hive database comprises:
[0085] using a reduce operator of the spark resource to perform corresponding computation on the enquiry-generated data.
[0086] To sum up, the technical schemes of the embodiments of the present invention provide the following beneficial effects:
[0087] 1. The method and apparatus for introducing data to a graph database of the embodiments of the present invention use the combination of spark and graph databases to realize real-time introduction of data while eliminating the need of exporting data into the csv format;
[0088] 2. The method and apparatus for introducing data to a graph database of the embodiments of the present invention use the spark technology to achieve easy optimization of spark performance and adjustment of speed of data introduction; and
[0089] 3. The method and apparatus for introducing data to a graph database of the embodiments of the present invention use the feature of spark about concurrency to accelerate introduction of data without data loss.
[0090] It is to be noted that the work division among the foregoing functional modules of the apparatus for introducing data to a graph database of the present embodiment is merely exemplary. In practical implementations, the work may be divided among different functional modules. In other words, the internal architecture of the apparatus may be reconfigured with different functional modules to perform all or a part of the functions described previously. In addition, since the apparatus of the present embodiment and the method of the previous embodiment stem from the same conception, the details of its implementation can be learned from the description of the method, and are not repeated herein.
[0091] As will be appreciated by people of ordinary skill in the art, all or a part of the steps of the method of the present invention as described previously may be implemented by a program that instructs related hardware components. The program may be stored in a computer-readable storage medium and, when executed, performs the individual steps of the methods described in the foregoing embodiments. The storage medium may be a ROM/RAM, a hard drive, an optical disk, or the like.
[0092] The preferred embodiments of the present invention described previously are not intended to limit the present invention. Any modification, equivalent replacement, and improvement made under the spirit and principle of the present invention shall be included in the scope of the present invention.
Claims (10)
1. A method for introducing data to a graph database, the method comprising:
registering a user-defined spark udf to a graph database program, so that the graph database is connected to a spark resource through the spark udf;
creating node attribute indexes in the graph database;
using the spark resource to make enquiry to a hive database to acquire enquiry-generated data;
re-partitioning the enquiry-generated data and registering them as a temporary data table; and
introducing the temporary data table to the graph database using the spark udf and the node attribute indexes.
2. The method for introducing data to a graph database of claim 1, wherein before the step of registering a user-defined spark udf to a graph database program, the method further comprises:
setting up a driver in the spark udf for connecting the graph database and writing it in a static method.
3. The method for introducing data to a graph database of claim 1 or 2, wherein before the step of registering a user-defined spark udf to a graph database program, the method further comprises:
defining parameters for exporting and importing the spark udf.
4. The method for introducing data to a graph database of claim 2, wherein after the step of introducing the temporary data table to the graph database, the method further comprises:
turning off the driver of the graph database and the spark resource.
5. The method for introducing data to a graph database of claim 1 or 2, wherein the step of using the spark resource to make enquiry to a hive database comprises:
using a reduce operator of the spark resource to perform corresponding computation on the enquiry-generated data.
6. An apparatus for introducing data to a graph database, the apparatus comprising:
a connecting module, for registering a user-defined spark udf to a graph database program, so that the graph database is connected to a spark resource through the spark udf;
a creating module, for creating node attribute indexes in the graph database;
an enquiring module, for using the spark resource to make enquiry to a hive database to acquire enquiry-generated data;
a partitioning module, for re-partitioning the enquiry-generated data and registering them as a temporary data table; and
an introducing module, for introducing the temporary data table to the graph database using the spark udf and the node attribute indexes.
7. The apparatus for introducing data to a graph database of claim 6, further comprising:
a driver connecting module, for setting up a driver in the spark udf for connecting the graph database and writing it in a static method.
8. The apparatus for introducing data to a graph database of claim 6 or 7, further comprising:
a configuring module, for defining parameters for exporting and importing the spark udf.
9. The apparatus for introducing data to a graph database of claim 7, further comprising:
a deactivating module, for turning off the driver of the graph database and the spark resource.
10. The apparatus for introducing data to a graph database of claim 6 or 7, wherein using the spark resource to make enquiry to a hive database is achieved by:
using a reduce operator of the spark resource to perform corresponding computation on the enquiry-generated data.
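The steps recited in the claims above can be illustrated with a minimal, Spark-free sketch in plain Python. Every name in it (`GraphDriver`, `ImportUdf`, `run_import`, the `id` key attribute) is a hypothetical stand-in and not taken from the patent: in-memory lists take the place of the hive query result and the spark partitions, and a dictionary takes the place of the graph database. It is meant only to show the order of operations (static driver, index creation, reduce, re-partitioning, import, shutdown), not an actual implementation.

```python
from functools import reduce


class GraphDriver:
    """Hypothetical in-memory stand-in for a graph database driver."""

    def __init__(self):
        self.nodes = {}       # key value -> node properties
        self.indexes = set()  # attribute names that have an index
        self.closed = False

    def create_index(self, attribute):
        self.indexes.add(attribute)

    def merge_node(self, key_attr, props):
        # Upsert a node, looking it up through an indexed attribute.
        if key_attr not in self.indexes:
            raise ValueError(f"no index on attribute {key_attr!r}")
        self.nodes.setdefault(props[key_attr], {}).update(props)

    def close(self):
        self.closed = True


class ImportUdf:
    """Stand-in for the user-defined function; the driver is held statically."""

    _driver = None  # shared by all calls, created at most once

    @staticmethod
    def driver():
        if ImportUdf._driver is None:
            ImportUdf._driver = GraphDriver()
        return ImportUdf._driver

    @staticmethod
    def import_row(row):
        ImportUdf.driver().merge_node("id", row)
        return 1

    @staticmethod
    def shutdown():
        if ImportUdf._driver is not None:
            ImportUdf._driver.close()
            ImportUdf._driver = None


def reduce_by_id(rows):
    """Reduce step: merge rows that share the same id into one row."""
    def step(acc, row):
        acc.setdefault(row["id"], {}).update(row)
        return acc
    return list(reduce(step, rows, {}).values())


def repartition(rows, num_partitions):
    """Distribute rows round-robin over a fixed number of partitions."""
    parts = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        parts[i % num_partitions].append(row)
    return parts


def run_import(query_rows, num_partitions=2):
    driver = ImportUdf.driver()
    driver.create_index("id")                        # node attribute index
    rows = reduce_by_id(query_rows)                  # reduce operator
    temp_table = repartition(rows, num_partitions)   # "temporary data table"
    imported = sum(ImportUdf.import_row(r) for part in temp_table for r in part)
    nodes = dict(driver.nodes)
    ImportUdf.shutdown()                             # turn off the driver
    return imported, nodes
```

Holding the driver in a static member means each call to the import function reuses one connection instead of opening a new one per row, which is the usual motivation for this pattern.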
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910282923.3A CN110110108B (en) | 2019-04-09 | 2019-04-09 | Data importing method and device of graph database |
CN201910282923.3 | 2019-04-09 | ||
PCT/CN2019/109096 WO2020206952A1 (en) | 2019-04-09 | 2019-09-29 | Graph database data import method and apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
CA3176758A1 (en) | 2020-10-15 |
Family
ID=67485283
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA3176758A Pending CA3176758A1 (en) | 2019-04-09 | 2019-09-29 | Method and apparatus for introducing data to a graph database |
Country Status (3)
Country | Link |
---|---|
CN (1) | CN110110108B (en) |
CA (1) | CA3176758A1 (en) |
WO (1) | WO2020206952A1 (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108596561B (en) * | 2018-03-29 | 2021-06-01 | 时时同云科技(成都)有限责任公司 | Human-effect service system and method based on big data architecture |
CN110110108B (en) * | 2019-04-09 | 2021-03-30 | 苏宁易购集团股份有限公司 | Data importing method and device of graph database |
CN112905854A (en) * | 2021-03-05 | 2021-06-04 | 北京中经惠众科技有限公司 | Data processing method and device, computing equipment and storage medium |
CN112925952A (en) * | 2021-03-05 | 2021-06-08 | 北京中经惠众科技有限公司 | Data query method and device, computing equipment and storage medium |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105528367B (en) * | 2014-09-30 | 2019-06-14 | 华东师范大学 | Storage and near real-time querying method based on open source big data to time sensitive data |
CN104391957A (en) * | 2014-12-01 | 2015-03-04 | 浪潮电子信息产业股份有限公司 | Data interaction analysis method for hybrid big data processing system |
CN105468671B (en) * | 2015-11-12 | 2019-04-02 | 杭州中奥科技有限公司 | The method of realization personnel's relationship modeling |
US10409782B2 (en) * | 2016-06-15 | 2019-09-10 | Chen Zhang | Platform, system, process for distributed graph databases and computing |
CN106528773B (en) * | 2016-11-07 | 2020-06-26 | 山东联友通信科技发展有限公司 | Map computing system and method based on Spark platform supporting spatial data management |
CN106815353B (en) * | 2017-01-20 | 2020-02-21 | 星环信息科技(上海)有限公司 | Data query method and equipment |
CN109344268A (en) * | 2018-08-14 | 2019-02-15 | 北京奇虎科技有限公司 | Method, electronic equipment and the computer readable storage medium of graphic data base write-in |
CN109460416B (en) * | 2018-12-12 | 2020-02-04 | 成都四方伟业软件股份有限公司 | Data processing method and device, electronic equipment and storage medium |
CN110110108B (en) * | 2019-04-09 | 2021-03-30 | 苏宁易购集团股份有限公司 | Data importing method and device of graph database |
- 2019
  - 2019-04-09 CN CN201910282923.3A patent/CN110110108B/en active Active
  - 2019-09-29 CA CA3176758A patent/CA3176758A1/en active Pending
  - 2019-09-29 WO PCT/CN2019/109096 patent/WO2020206952A1/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
WO2020206952A1 (en) | 2020-10-15 |
CN110110108A (en) | 2019-08-09 |
CN110110108B (en) | 2021-03-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11392586B2 (en) | Data protection method and device and storage medium | |
CA3176758A1 (en) | Method and apparatus for introducing data to a graph database | |
US7984043B1 (en) | System and method for distributed query processing using configuration-independent query plans | |
US7974967B2 (en) | Hybrid database system using runtime reconfigurable hardware | |
CA2518902C (en) | System and method for query planning and execution | |
WO2016123920A1 (en) | Method and system for achieving integration interface supporting operations of multiple types of databases | |
CN109614413B (en) | Memory flow type computing platform system | |
US10261888B2 (en) | Emulating an environment of a target database system | |
US20160239544A1 (en) | Collaborative planning for accelerating analytic queries | |
US11514009B2 (en) | Method and systems for mapping object oriented/functional languages to database languages | |
US10120886B2 (en) | Database integration of originally decoupled components | |
CN105164674A (en) | Queries involving multiple databases and execution engines | |
WO2015152868A1 (en) | Parallelizing sql on distributed file systems | |
CN107977446A (en) | A kind of memory grid data load method based on data partition | |
CN106020847A (en) | Method and device for configuring SQL for persistent layer development framework | |
Chandramouli et al. | Quill: Efficient, transferable, and rich analytics at scale | |
CN114443015A (en) | Method for generating adding, deleting, modifying and checking service interface based on database metadata | |
US10140335B2 (en) | Calculation scenarios with extended semantic nodes | |
CN105653334B (en) | MIS system rapid development framework based on SAAS mode | |
CN115080663A (en) | Distributed database synchronization method, system, device and medium | |
CN114817311B (en) | Parallel computing method applied to GaussDB database storage process | |
CN113127441B (en) | Method for dynamically selecting database components and self-assembled database management system | |
Chandramouli et al. | The Quill Distributed Analytics Library and Platform | |
CN116910082A (en) | SQL sentence processing method, device, server and medium | |
PALLAVI et al. | A Query Support for Multiple Data Stores using REST Based ODBAPI in Cloud |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | EEER | Examination request | Effective date: 20220923 |