CN111523000B - Method, apparatus, device and storage medium for importing data - Google Patents
Method, apparatus, device and storage medium for importing data Download PDFInfo
- Publication number
- CN111523000B (application CN202010325575.6A)
- Authority
- CN
- China
- Prior art keywords
- target
- import
- target node
- directed edge
- identification information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9024—Graphs; Linked lists
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/90335—Query processing
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application discloses a method, an apparatus, a device and a storage medium for importing data, and relates to the field of knowledge graphs. The specific implementation scheme is as follows: acquiring at least one target node and directed edges between the at least one target node; acquiring original identification information and internal identification information of the at least one target node; determining a target import order of each directed edge; and sequentially importing the internal identification information of the at least one target node according to the original identification information of the two target nodes connected by each directed edge and the target import order. Because data is imported according to the target import order, data import efficiency is improved.
Description
Technical Field
The embodiments of the application relate to the technical field of computers, further relate to the field of knowledge graphs, and in particular to a method, an apparatus, a device and a storage medium for importing data.
Background
A graph database is a database that stores and queries data in a graph data structure. In a graph database, all data is typically stored in the form of nodes and edges.
Data import performance is an important index for evaluating a graph database. When the data volume reaches a certain scale, limitations of system resources such as memory and disk affect data import performance and, in turn, the query performance of the graph database.
Disclosure of Invention
Provided are a method, apparatus, device, and storage medium for importing data.
According to a first aspect, there is provided a method for importing data, the method comprising: acquiring at least one target node and directed edges between the at least one target node; acquiring original identification information and internal identification information of the at least one target node; determining a target import order of each directed edge; and sequentially importing the internal identification information of the at least one target node according to the original identification information of the two target nodes connected by each directed edge and the target import order.
According to a second aspect, there is provided an apparatus for importing data, the apparatus comprising: a first acquisition unit configured to acquire at least one target node and a directed edge between the at least one target node; a second acquisition unit configured to acquire original identification information and internal identification information of at least one target node; a determining unit configured to determine a target import order of each directed edge; and an importing unit configured to import internal identification information of at least one target node in sequence according to original identification information of two target nodes connected by the directed edge and a target import order.
According to a third aspect, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described in the first aspect.
According to a fourth aspect, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method as described in the first aspect.
According to this technology, the problems that existing data import methods occupy a large disk volume and have low import efficiency are solved: data import is performed according to the target import order, which improves data import efficiency.
It should be understood that the description of this section is not intended to identify key or critical features of the embodiments of the application or to delineate the scope of the application. Other features of the present application will become apparent from the description that follows.
Drawings
The drawings are for better understanding of the present solution and do not constitute a limitation of the present application. Wherein:
FIG. 1 is an exemplary system architecture diagram in which an embodiment of the present application may be applied;
FIG. 2 is a flow chart of one embodiment of a method for importing data according to the present application;
FIG. 3 is a schematic diagram of a directed edge between at least one target node and at least one target node in the embodiment of FIG. 2;
FIG. 4 is a schematic illustration of one application scenario of a method for importing data according to the present application;
FIG. 5 is a flow chart of yet another embodiment of a method for importing data according to the present application;
FIG. 6 is a schematic diagram of an embodiment of an apparatus for importing data according to the present application;
fig. 7 is a schematic structural diagram of an electronic device suitable for use in implementing embodiments of the present application.
Detailed Description
Exemplary embodiments of the present application are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 illustrates an exemplary architecture 100 to which the methods for importing data or apparatuses for importing data of the present application may be applied.
As shown in fig. 1, system architecture 100 may include a server 101, a network 102, a source database server 103, and a target database server 104. Network 102 is the medium used to provide the communication links between server 101 and source database server 103 and target database server 104. Network 102 may include various connection types such as wired, wireless communication links, or fiber optic cables, among others.
The source database server 103 may deploy a source database that provides data support to the server 101. The source database may be a storage device storing at least one target node and directed edges between the at least one target node; the target nodes may represent target entities, and the directed edges may represent associations between the target entities. Server 101 may obtain the at least one target node and the directed edges between the at least one target node from the source database server 103 via the network 102.
The server 101 may sequentially import internal identification information of at least one target node according to original identification information of two target nodes connected by a directed edge.
The target database server 104 may deploy a target database. The server 101 may store the internal identification information of the two target nodes of the connection of the directed edge and the directed edge in the target database through the network 102. The target database may be a graph database, and the server 101 may also generate a knowledge graph from data stored in the graph database.
It should be noted that, the method for importing data provided in the embodiments of the present application is generally performed by the server 101, and accordingly, the device for importing data is generally disposed in the server 101.
It should be understood that the numbers of servers, networks, source database servers, and target database servers in fig. 1 are merely illustrative. There may be any number of servers, networks, source database servers, and target database servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for importing data according to the present application is shown. The method for importing data comprises the following steps:
In this embodiment, the execution body of the method for importing data (e.g., the server 101 shown in fig. 1) may obtain at least one target node and the directed edges between the at least one target node from the source database. Here, the target nodes may be used to represent target entities, such as people, animals, objects, or places, and the directed edges may be used to represent associations between target entities. It will be appreciated that the target nodes and the directed edges between them may be data in a source database (e.g., the source database deployed by the source database server 103 shown in fig. 1) to be imported into a target graph database (e.g., the target database deployed by the target database server 104 shown in fig. 1).
In this embodiment, the execution body may acquire the original identification information and internal identification information of the at least one target node from the identification mapping relationship of the at least one target node. Here, the original identification information may be the identification information under which the target entity corresponding to a target node is stored in the source database. The internal identification information may be the identification information of the target entity when stored in the target database. The internal identification information may be fixed-length identification information, such as the numerical identifiers 1, 2, 3, 4, to facilitate subsequent queries. The internal identifiers may be generated in advance by the execution body, or by another electronic device, according to the original identification information of each target node. The internal identification information of the target nodes may follow a certain pattern; for example, the internal identifiers may form an incrementing sequence.
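As an illustrative sketch (function and variable names here are hypothetical, not from the patent), the identification mapping relationship described above can be built by assigning incrementing fixed-length numeric internal identifiers to the original identifiers:

```python
def build_id_mapping(original_ids):
    """Assign incrementing numeric internal identifiers (1, 2, 3, ...)
    to original identifiers, forming the identification mapping
    relationship between original and internal identification info."""
    mapping = {}
    for orig in original_ids:
        if orig not in mapping:  # keep the first assignment on duplicates
            mapping[orig] = len(mapping) + 1
    return mapping

id_map = build_id_mapping(["T1", "T2", "T3", "T4"])
```

The resulting internal identifiers form the incrementing sequence 1, 2, 3, 4 mentioned above.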
In this embodiment, the execution body may determine the import order of the directed edges in various ways. For example, it may arrange the directed edges randomly and take that arrangement as the target import order. When importing data, the execution body needs to read the data to be imported into memory and then write it from memory into the target database. It will be appreciated that when memory capacity is limited and multiple directed edges need to be imported, different import orders may incur different memory read costs. Therefore, the present embodiment determines a target import order.
In some alternative implementations of the present embodiment, step 203 may be specifically implemented through the following steps: determining the number of directed edges connected to each target node; and determining the import order of the directed edges according to the number of directed edges connected to each target node and the directed edges connected to each target node.
In this alternative implementation, the execution body may first determine the number of directed edges connected to each target node, then arrange the target nodes in descending order of that number, and then sequentially arrange the directed edges connected to each target node according to the arrangement order of the target nodes, thereby determining the import order of the directed edges.
FIG. 3 is a schematic diagram of at least one target node and the directed edges between the at least one target node in the embodiment of FIG. 2. In fig. 3, the target nodes may be T1, T2, T3, and T4. Target nodes T1 and T2 may be connected by the directed edge L1, target nodes T1 and T4 by the directed edge L2, target nodes T1 and T3 by the directed edge L3, and target nodes T3 and T4 by the directed edge L4. The number of directed edges connected to T1 is 3 (L1, L2, and L3), to T2 is 1 (L1), to T3 is 2 (L3 and L4), and to T4 is 2 (L2 and L4). The arrangement order of the target nodes may therefore be T1, T3, T4, T2 or T1, T4, T3, T2, and the import order of the directed edges may be L1, L2, L3, L4 or L3, L1, L2, L4, etc.
With the target import order determined by this implementation, target nodes with more associations are imported preferentially, and memory does not need to be read frequently when importing them, which improves data import efficiency.
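The degree-based ordering of this implementation can be sketched as follows (a hedged illustration; names and the tie-breaking rule are assumptions, since the patent leaves ties between equal-degree nodes unspecified):

```python
from collections import defaultdict

def order_edges_by_degree(edges):
    """Arrange target nodes by descending number of connected directed
    edges, then take each node's not-yet-placed edges in turn to build
    the import order (ties broken by node name -- an assumption).
    `edges` maps an edge name to its (node_a, node_b) endpoints."""
    degree = defaultdict(int)
    for a, b in edges.values():
        degree[a] += 1
        degree[b] += 1
    nodes = sorted(degree, key=lambda n: (-degree[n], n))
    ordered, seen = [], set()
    for node in nodes:
        for name, (a, b) in edges.items():
            if name not in seen and node in (a, b):
                ordered.append(name)
                seen.add(name)
    return ordered

# The FIG. 3 example: T1 has degree 3, T3 and T4 degree 2, T2 degree 1.
edges = {"L1": ("T1", "T2"), "L2": ("T1", "T4"),
         "L3": ("T1", "T3"), "L4": ("T3", "T4")}
order = order_edges_by_degree(edges)
```

On the FIG. 3 graph this yields T1's edges first, matching one of the valid import orders listed above.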
In this embodiment, the execution body may determine the corresponding directed edge according to the original identification information of the two target nodes connected by the directed edge, and then, in the determined target import order, sequentially replace the original identification information of the two target nodes connected by each directed edge with their internal identification information.
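A minimal sketch of this replacement step, under the assumption that the edges and the identification mapping are held in plain dictionaries (all names here are hypothetical):

```python
def import_edges(order, edges, id_map):
    """Walk the directed edges in the target import order, replacing
    each endpoint's original identifier with its internal identifier,
    and yield rows ready to be written to the target graph database."""
    return [(name, id_map[edges[name][0]], id_map[edges[name][1]])
            for name in order]

rows = import_edges(
    ["L1", "L2"],
    {"L1": ("T1", "T2"), "L2": ("T1", "T3")},
    {"T1": 1, "T2": 2, "T3": 3},
)
```

Each row pairs the edge with the fixed-length internal identifiers of its two endpoints, as stored in the target database.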
In some optional implementations of the present embodiment, the method may further include the following step, not shown in fig. 2: storing the at least one target node and each directed edge according to the entity types of the two target nodes connected by each directed edge.
In this alternative implementation, the execution body may first determine the entity type of each of the at least one target node, and then store each directed edge according to the entity types of the two target nodes it connects. For example, suppose the entity type of the target node T1 connected by the directed edge L1 is person and the entity type of the target node T2 is book, while the directed edge L2 connects the target node T1 (person) and the target node T3, whose entity type is animal. The execution body may load the original identification information and internal identification information of the target nodes T1 and T2 into memory and import the directed edge L1; after the import of the directed edge L1 is completed, it may load the original identification information and internal identification information of the target node T3 into memory and import the directed edge L2.
Some existing data import methods need to store the identification mapping relationships of all nodes to be imported in the memory of the execution body. When the number of nodes to be imported is large, memory shortage is likely to occur and external storage is required. Because the read performance of external storage is much lower than that of memory, the execution body spends more time reading nodes. Other existing data import methods store all nodes to be imported and the edges between them on disk and perform a mixed sort according to the identifiers, which makes the execution body heavily dependent on the read-write performance of the disk and the sorting performance of main memory. When the number of nodes to be imported is large, import efficiency also decreases.
By this implementation, the target nodes and directed edges are stored by type. When importing data, only the identification mapping relationships of the target nodes corresponding to two entity types are loaded into memory at a time; after that batch of imports is completed, the identification mapping relationships corresponding to the next entity type are loaded. Therefore, the identification mapping relationships of all target nodes do not need to be loaded into memory at once, and data can be imported with limited internal storage and without external storage.
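One way to realize the type-based grouping described above might look like this (a sketch with hypothetical names; the patent does not prescribe a concrete data layout):

```python
def group_edges_by_type_pair(edges, node_type):
    """Group directed edges by the (sorted) entity-type pair of their
    endpoints, so that only two types' identification mappings need to
    be resident in memory while one group of edges is imported."""
    groups = {}
    for name, (a, b) in edges.items():
        key = tuple(sorted((node_type[a], node_type[b])))
        groups.setdefault(key, []).append(name)
    return groups

# The example from the description: L1 links a person to a book,
# L2 links a person to an animal.
node_type = {"T1": "person", "T2": "book", "T3": "animal"}
edges = {"L1": ("T1", "T2"), "L2": ("T1", "T3")}
groups = group_edges_by_type_pair(edges, node_type)
```

Importing one group at a time means the (person, book) mappings can be evicted before the (person, animal) mappings are loaded.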
With continued reference to fig. 4, fig. 4 is a schematic diagram of an application scenario of the method for importing data according to the present application. In the application scenario of fig. 4, the execution body 401 may acquire at least one target node (T1, T2, T3, T4), the directed edges (L1, L2, L3, L4) between the at least one target node, and the mapping relationship between the original identification information (T1, T2, T3, T4) and the internal identification information (1, 2, 3, 4) of the target nodes (T1 corresponds to 1, T2 to 2, T3 to 3, T4 to 4) sent by the source database server 402. 403 represents the two target nodes connected by each directed edge. The execution body 401 determines a target import order 404 of the directed edges, and then sequentially imports the internal identification information of the at least one target node according to the target import order 404. 405 is the result of the data import.
The method for importing data provided in the foregoing embodiment of the present disclosure acquires at least one target node and the directed edges between the at least one target node, together with the original identification information and internal identification information of the at least one target node, and then sequentially imports the internal identification information of the at least one target node according to the original identification information of the two target nodes connected by each directed edge and the target import order. Because data is imported according to the target import order, data import efficiency is improved.
With continued reference to fig. 5, a flow 500 of yet another embodiment of a method for importing data according to the present application is shown. The method for importing data comprises the following steps:
Steps 501 and 502 are identical to steps 201 and 202 in the foregoing embodiment; the descriptions of steps 201 and 202 also apply to steps 501 and 502 and are not repeated herein.
Step 503: perform a full ordering of the directed edges to determine candidate import orders.
In this embodiment, the execution body may perform a full ordering of the directed edges to determine the candidate import orders. For example, if the directed edges are L1, L2, and L3, the candidate import orders are the six permutations: L1, L2, L3; L1, L3, L2; L2, L1, L3; L2, L3, L1; L3, L1, L2; and L3, L2, L1.
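The full ordering into candidate import orders can be sketched with the standard-library permutation generator (an illustration, not the patent's implementation; names are hypothetical):

```python
from itertools import permutations

def candidate_import_orders(edge_names):
    """Step 503: fully order the directed edges; every permutation of
    the edges is one candidate import order (n! candidates in total)."""
    return [list(p) for p in permutations(edge_names)]

candidates = candidate_import_orders(["L1", "L2", "L3"])
```

For three directed edges this produces exactly the six candidate orders listed above; note the factorial growth, which is why a cost function is needed to choose among them.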
In this embodiment, the execution body may determine the memory read cost corresponding to each candidate import order. Here, the memory read cost may refer to the length of time that memory is occupied, or the amount of memory used.
In some optional implementations of the present embodiment, the memory read cost corresponding to each candidate import order may also be determined as follows: take the two target nodes connected by each directed edge as the node set corresponding to that directed edge; and, for each candidate import order, determine the corresponding memory read cost according to the difference sets of the node sets corresponding to adjacent directed edges in that candidate import order.

In this optional implementation, the execution body may take the two target nodes connected by each directed edge as the node set corresponding to that directed edge, then determine the difference set of the node sets corresponding to each pair of adjacent directed edges in a candidate import order, and then determine the memory read cost corresponding to that candidate import order based on those difference sets.
For example, suppose the candidate import order is L1, L2, L3, L4, the node set corresponding to the directed edge L1 is (T1, T2), and the node set corresponding to the directed edge L2 is (T2, T3); then the difference set of the node sets corresponding to the adjacent directed edges L1 and L2 is (T1, T3). The execution body may preset a calculation formula for the memory read cost corresponding to an import order of the directed edges:

W = W1 + Σ_{i=1}^{n−1} m · |E_{i+1} − E_i|

wherein n represents the number of node sets corresponding to the directed edges, W represents the memory read cost, W1 represents the memory read cost corresponding to the first directed edge in the candidate import order (generally 2m, where m represents the average memory read cost of one target node), E_i represents the node set corresponding to directed edge i, and |E_{i+1} − E_i| represents the number of target nodes in the difference set.
The implementation mode can rapidly determine the memory read cost corresponding to each candidate import sequence based on the calculation formula of the memory read cost.
In step 505, the target import order is determined from the candidate import orders according to the memory read cost.
In this embodiment, after determining the memory read cost corresponding to each candidate import order, the execution body may determine the target import order. For example, the execution body may take the candidate import order with the lowest memory read cost as the target import order. Alternatively, the execution body may take a candidate import order whose memory read cost is smaller than a preset threshold as the target import order.
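Steps 504 and 505 can be sketched together as follows, assuming plain set difference for |E_{i+1} − E_i| (the patent's "difference set" example could also be read as a symmetric difference; all names here are hypothetical):

```python
def memory_read_cost(order, node_sets, m=1):
    """W = W1 + sum_i m * |E_{i+1} - E_i|, with W1 = 2*m for loading the
    two endpoint nodes of the first directed edge in the order. The
    difference is taken here as plain set difference (the nodes newly
    loaded for the next edge) -- an assumption, see the lead-in."""
    cost = 2 * m  # W1: load both endpoints of the first edge
    for prev, cur in zip(order, order[1:]):
        cost += m * len(node_sets[cur] - node_sets[prev])
    return cost

def pick_target_order(candidates, node_sets, m=1):
    """Step 505: choose the candidate order with the lowest cost."""
    return min(candidates, key=lambda o: memory_read_cost(o, node_sets, m))

node_sets = {"L1": {"T1", "T2"}, "L2": {"T2", "T3"}, "L3": {"T3", "T4"}}
```

Here the chain L1, L2, L3 shares one node between each adjacent pair, so each step after the first loads only one new node.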
The step 506 corresponds to the step 204 in the foregoing embodiment, and the description of the step 204 is also applicable to the step 506, which is not repeated herein.
In the flow 500 of the method for importing data in this embodiment, another way of determining the target import order of the directed edges is described: by determining the memory read cost corresponding to each candidate import order and selecting the candidate import order with the lowest memory read cost as the target import order, data import efficiency is improved.
With further reference to fig. 6, as an implementation of the method shown in the foregoing figures, the present application provides an embodiment of an apparatus for importing data, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be specifically applied to various electronic apparatuses.
As shown in fig. 6, the apparatus 600 for importing data provided in this embodiment includes a first acquiring unit 601, a second acquiring unit 602, a determining unit 603, and an importing unit 604. Wherein the first obtaining unit 601 is configured to obtain at least one target node and a directed edge between the at least one target node; a second acquiring unit 602 configured to acquire original identification information and internal identification information of at least one target node; a determining unit 603 configured to determine a target import order of each directed edge; an importing unit 604, configured to import internal identification information of at least one target node in sequence according to original identification information of two target nodes connected by a directed edge and a target import order.
In this embodiment, in the apparatus 600 for importing data: the specific processes of the first acquiring unit 601, the second acquiring unit 602, the determining unit 603, and the importing unit 604 and the technical effects thereof may refer to the descriptions related to step 201, step 202, step 203, and step 204 in the corresponding embodiment of fig. 2, and are not repeated herein.
In some optional implementations of this embodiment, the determining unit 603 is further configured to determine the target import order of the directed edges through: a first determining module configured to determine the number of directed edges connected to each target node; and a second determining module configured to determine the import order of the directed edges according to the number of directed edges connected to each target node and the directed edges connected to each target node.
In some optional implementations of this embodiment, the determining unit 603 is further configured to determine the target import order of the directed edges through: a third determining module configured to fully order the directed edges and determine candidate import orders; a fourth determining module configured to determine the memory read cost corresponding to each candidate import order; and a fifth determining module configured to determine the target import order from the candidate import orders according to the memory read costs.
In some optional implementations of this embodiment, the fourth determination module is further configured to: taking two target nodes connected with each directed edge as a node set corresponding to the directed edge; and for each candidate import sequence, determining the memory read cost corresponding to each candidate import sequence according to the difference set of the node sets corresponding to the two adjacent directed edges in the candidate import sequence.
In some optional implementations of this embodiment, the apparatus further includes: a storage unit (not shown in the figure) configured to store at least one target node and each directed edge according to an entity type of the two target nodes to which each directed edge is connected.
The device provided in the foregoing embodiment of the present application acquires, through the first acquiring unit 601, at least one target node and a directed edge between the at least one target node, then acquires, through the second acquiring unit 602, original identification information and internal identification information of the at least one target node, the determining unit 603 determines a target import order of each directed edge, and the importing unit 604 sequentially imports the internal identification information of the at least one target node according to the original identification information and the target import order of the two target nodes connected by the directed edge. The device can conduct data import according to the target import sequence, and improves the data import efficiency.
According to embodiments of the present application, an electronic device and a readable storage medium are also provided.
As shown in fig. 7, a block diagram of an electronic device is provided for the method for importing data according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the application described and/or claimed herein.
As shown in fig. 7, the electronic device includes: one or more processors 701, memory 702, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to the interface. In other embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Also, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 701 is illustrated in fig. 7.
The memory 702, as a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the method for importing data in the embodiment of the present application (e.g., the first acquiring unit 601, the second acquiring unit 602, the determining unit 603, and the importing unit 604 shown in fig. 6). By running the non-transitory software programs, instructions, and modules stored in the memory 702, the processor 701 executes the various functional applications and data processing of the server, that is, implements the method for importing data in the above method embodiment.
The electronic device for the method for importing data may further include an input device 703 and an output device 704. The processor 701, the memory 702, the input device 703, and the output device 704 may be connected by a bus or in other ways; connection by a bus is taken as an example in fig. 7.
The input device 703 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for importing data; examples include a touch screen, a keypad, a mouse, a trackpad, a touchpad, a pointing stick, one or more mouse buttons, a trackball, and a joystick. The output device 704 may include a display device, auxiliary lighting devices (e.g., LEDs), haptic feedback devices (e.g., vibration motors), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light-emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application-specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, and which may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), a middleware component (e.g., an application server), or a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the Internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical solution of the embodiments of the present application, data import efficiency can be improved.
It should be appreciated that steps may be reordered, added, or deleted in the various flows shown above. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application are intended to be included within the scope of the present application.
Claims (10)
1. A method for importing data, comprising:
acquiring at least one target node and a directed edge between the at least one target node;
acquiring original identification information and internal identification information of the at least one target node;
determining a target import order of the directed edges, comprising: performing a full permutation of the directed edges to determine candidate import orders; determining a memory read cost corresponding to each candidate import order; and determining the target import order from the candidate import orders according to the memory read costs;
and importing the internal identification information of the at least one target node sequentially according to the original identification information of the two target nodes connected by each directed edge and the target import order.
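The determining step recited in claim 1 — fully permuting the directed edges, costing each candidate order, and keeping the cheapest — can be sketched in Python as follows. The toy cost function here is an assumption for illustration (claim 3 recites the actual difference-set cost), and brute-force permutation is practical only for small edge counts:

```python
from itertools import permutations

def target_import_order(edges, cost):
    """Enumerate every ordering of the directed edges (full permutation),
    evaluate the memory read cost of each candidate order, and return the
    candidate order with the smallest cost."""
    best = min(permutations(range(len(edges))), key=cost)
    return list(best)

edges = [("A", "B"), ("B", "C"), ("A", "C")]

def toy_cost(order):
    # Toy stand-in for a memory read cost: penalize adjacent edges in the
    # candidate order that share no endpoint node.
    return sum(1 for a, b in zip(order, order[1:])
               if not set(edges[a]) & set(edges[b]))

print(target_import_order(edges, toy_cost))  # prints [0, 1, 2]
```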
2. The method of claim 1, wherein said determining a target import order for each of said directed edges comprises:
determining the number of directed edges connected to each target node;
and determining the import order of the directed edges according to the number of directed edges connected to each target node and the directed edges connected to each target node.
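One plausible reading of claim 2, sketched in Python: count how many directed edges touch each target node, then order the edges by the combined degree of their two endpoints. The descending sort and the stable tie-breaking are assumptions made for the example, not part of the claim:

```python
from collections import Counter

def order_by_degree(edges):
    """Order directed edges by the number of directed edges connected to
    their endpoint nodes, highest combined endpoint degree first."""
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    return sorted(range(len(edges)),
                  key=lambda i: degree[edges[i][0]] + degree[edges[i][1]],
                  reverse=True)

edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
print(order_by_degree(edges))  # prints [1, 2, 0, 3]
```

Edges 1 and 2 come first because they touch node C, which has the highest degree (three connected edges); Python's stable sort preserves the original order among ties.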
3. The method of claim 1, wherein the determining the memory read cost corresponding to each candidate import order comprises:
taking the two target nodes connected by each directed edge as the node set corresponding to the directed edge;
and for each candidate import order, determining the corresponding memory read cost according to the difference sets of the node sets corresponding to adjacent directed edges in the candidate import order.
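The memory read cost of claim 3 can be sketched as follows: for each pair of adjacent directed edges in a candidate import order, the cost contribution is the size of the difference set — the endpoint nodes of the later edge that the earlier edge has not already brought into memory. A minimal Python sketch, with the edge representation assumed for illustration:

```python
def memory_read_cost(edges, order):
    """Sum, over adjacent edge pairs in the candidate import order, the
    number of endpoint nodes of the later edge that are not endpoints of
    the earlier edge (the difference set of the two node sets)."""
    return sum(len(set(edges[b]) - set(edges[a]))
               for a, b in zip(order, order[1:]))

edges = [("A", "B"), ("B", "C"), ("C", "D")]
print(memory_read_cost(edges, [0, 1, 2]))  # prints 2: each step reads one new node
print(memory_read_cost(edges, [0, 2, 1]))  # prints 3: A-B to C-D reads two new nodes
```

Orders that keep node-sharing edges adjacent therefore score lower, which is why choosing the minimum-cost order reduces memory reads during import.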
4. The method of claim 1, wherein the method further comprises:
and storing the at least one target node and each directed edge according to the entity type of the two target nodes connected by each directed edge.
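The storage of claim 4 can be sketched as grouping each directed edge under the pair of entity types of its two endpoint nodes. This is a minimal Python sketch; the type mapping and bucket layout are assumptions for illustration, not the claimed storage format:

```python
from collections import defaultdict

def store_by_entity_type(edges, entity_type):
    """Group the directed edges by the entity types of the two target
    nodes each edge connects, e.g. ("user", "movie")."""
    buckets = defaultdict(list)
    for src, dst in edges:
        buckets[(entity_type[src], entity_type[dst])].append((src, dst))
    return dict(buckets)

edges = [("u1", "m1"), ("u2", "m1"), ("m1", "g1")]
types = {"u1": "user", "u2": "user", "m1": "movie", "g1": "genre"}
print(store_by_entity_type(edges, types))
# prints {('user', 'movie'): [('u1', 'm1'), ('u2', 'm1')], ('movie', 'genre'): [('m1', 'g1')]}
```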
5. An apparatus for importing data, comprising:
a first acquisition unit configured to acquire at least one target node and a directed edge between the at least one target node;
a second acquisition unit configured to acquire original identification information and internal identification information of the at least one target node;
a determining unit configured to determine a target import order of each of the directed edges;
an importing unit configured to import internal identification information of the at least one target node in sequence according to original identification information of the two target nodes connected by the directed edge and the target import order;
wherein the determining unit is further configured to determine the target import order of the directed edges by: a third determining module configured to perform a full permutation of the directed edges to determine candidate import orders; a fourth determining module configured to determine a memory read cost corresponding to each candidate import order; and a fifth determining module configured to determine the target import order from the candidate import orders according to the memory read costs.
6. The apparatus of claim 5, wherein the determining unit is further configured to determine the target import order of the directed edges by:
a first determining module configured to determine the number of directed edges connected to each target node;
and a second determining module configured to determine the import order of the directed edges according to the number of directed edges connected to each target node and the directed edges connected to each target node.
7. The apparatus of claim 5, wherein the fourth determination module is further configured to:
taking the two target nodes connected by each directed edge as the node set corresponding to the directed edge;
and for each candidate import order, determining the corresponding memory read cost according to the difference sets of the node sets corresponding to adjacent directed edges in the candidate import order.
8. The apparatus of claim 5, wherein the apparatus further comprises:
and the storage unit is configured to store the at least one target node and each directed edge according to the entity types of the two target nodes connected by each directed edge.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-4.
10. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010325575.6A CN111523000B (en) | 2020-04-23 | 2020-04-23 | Method, apparatus, device and storage medium for importing data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010325575.6A CN111523000B (en) | 2020-04-23 | 2020-04-23 | Method, apparatus, device and storage medium for importing data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111523000A CN111523000A (en) | 2020-08-11 |
CN111523000B true CN111523000B (en) | 2023-06-23 |
Family
ID=71903197
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010325575.6A Active CN111523000B (en) | 2020-04-23 | 2020-04-23 | Method, apparatus, device and storage medium for importing data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111523000B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112527870B (en) * | 2020-12-03 | 2023-09-12 | 北京百度网讯科技有限公司 | Electronic report generation method, device, electronic equipment and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108446359A (en) * | 2018-03-12 | 2018-08-24 | 百度在线网络技术(北京)有限公司 | Information recommendation method and device |
CN108776684A (en) * | 2018-05-25 | 2018-11-09 | 华东师范大学 | Optimization method, device, medium, equipment and the system of side right weight in knowledge mapping |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8682933B2 (en) * | 2012-04-05 | 2014-03-25 | Fujitsu Limited | Traversal based directed graph compaction |
US9928287B2 (en) * | 2013-02-24 | 2018-03-27 | Technion Research & Development Foundation Limited | Processing query to graph database |
CN104899156B (en) * | 2015-05-07 | 2017-11-14 | 中国科学院信息工程研究所 | A kind of diagram data storage and querying method towards extensive social networks |
US10579680B2 (en) * | 2016-05-13 | 2020-03-03 | Tibco Software Inc. | Using a B-tree to store graph information in a database |
CN108549731A (en) * | 2018-07-11 | 2018-09-18 | 中国电子科技集团公司第二十八研究所 | A kind of knowledge mapping construction method based on ontology model |
CN109446362B (en) * | 2018-09-05 | 2021-07-23 | 深圳神图科技有限公司 | Graph database structure based on external memory, graph data storage method and device |
CN109522428B (en) * | 2018-09-17 | 2020-11-24 | 华中科技大学 | External memory access method of graph computing system based on index positioning |
CN109726305A (en) * | 2018-12-30 | 2019-05-07 | 中国电子科技集团公司信息科学研究院 | A kind of complex_relation data storage and search method based on graph structure |
CN110825743B (en) * | 2019-10-31 | 2022-03-01 | 北京百度网讯科技有限公司 | Data importing method and device of graph database, electronic equipment and medium |
- 2020-04-23: CN202010325575.6A granted as patent CN111523000B (status: active)
Non-Patent Citations (1)
Title |
---|
Knowledge Graph Representation Learning Fusing Multiple Types of Information; Guo, Shu; China Doctoral Dissertations Full-text Database; full text * |
Also Published As
Publication number | Publication date |
---|---|
CN111523000A (en) | 2020-08-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111522967B (en) | Knowledge graph construction method, device, equipment and storage medium | |
CN108369663B (en) | Determining an execution order of a neural network | |
EP3869403A2 (en) | Image recognition method, apparatus, electronic device, storage medium and program product | |
EP4369180A2 (en) | Callpath finder | |
CN111667057B (en) | Method and apparatus for searching model structures | |
CN111582454B (en) | Method and device for generating neural network model | |
CN111913998B (en) | Data processing method, device, equipment and storage medium | |
CN111488492B (en) | Method and device for searching graph database | |
CN110706147B (en) | Image processing environment determination method, device, electronic equipment and storage medium | |
CN111259107B (en) | Determinant text storage method and device and electronic equipment | |
CN111756832B (en) | Method and device for pushing information, electronic equipment and computer readable storage medium | |
CN111461343A (en) | Model parameter updating method and related equipment thereof | |
CN111652354B (en) | Method, apparatus, device and storage medium for training super network | |
CN112559522A (en) | Data storage method and device, query method, electronic device and readable medium | |
EP3721354A1 (en) | Systems and methods for querying databases using interactive search paths | |
CN111523000B (en) | Method, apparatus, device and storage medium for importing data | |
CN111782633B (en) | Data processing method and device and electronic equipment | |
CN111563591B (en) | Super network training method and device | |
CN111177479A (en) | Method and device for acquiring feature vectors of nodes in relational network graph | |
CN112507100B (en) | Update processing method and device of question-answering system | |
CN111522837B (en) | Method and apparatus for determining time consumption of deep neural network | |
CN111680508B (en) | Text processing method and device | |
CN112101447B (en) | Quality evaluation method, device, equipment and storage medium for data set | |
CN111523036B (en) | Search behavior mining method and device and electronic equipment | |
CN112328807A (en) | Anti-cheating method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||