WO2024046013A1 - Information acquisition method and apparatus based on shortest path in knowledge graph

Information acquisition method and apparatus based on shortest path in knowledge graph

Info

Publication number
WO2024046013A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
nodes
path length
shortest path
determined
Prior art date
Application number
PCT/CN2023/110611
Other languages
French (fr)
Chinese (zh)
Inventor
王举范
Original Assignee
王举范
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 王举范
Publication of WO2024046013A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/12 Shortest path evaluation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Definitions

  • This article relates to the field of computers, and in particular to an information acquisition method and device based on the shortest path of a knowledge graph.
  • Knowledge graphs use graph models to describe knowledge, model the relationships between things, and use organizational principles to enable users or computer systems to perform knowledge inferences based on underlying data.
  • Nodes in the knowledge graph correspond to entities, and directed edges correspond to relationships between entities. Entities are connected to each other through relationships, and understanding the relationships between entities is the basis for knowledge graph analysis.
  • The shortest path algorithm can be used to measure how closely entities are related and to obtain a shortest path tree or a single shortest path. This gives a more intuitive and in-depth understanding of the relationships between entities, offers insight into hidden facts and regularities, and provides high-quality data.
  • In the existing technology, the shortest path algorithms commonly used for mining information in knowledge graphs are mainly based on Dijkstra's algorithm, which relies on relaxation operations.
  • Dijkstra's algorithm aims to minimize the weight value of the path. For the node u most recently added to S, it examines each node v that is connected to u and not in S: when dist[u]+length(u,v) < dist[v], the weight value dist[v] of v is replaced by dist[u]+length(u,v), and the path predecessor of v, pre[v], is set to point to u. Based on this, Dijkstra's algorithm has the following flaws:
  • Dijkstra's algorithm uses relaxation operations to keep selecting, for the nodes in Q, the path with the smaller weight value among multiple paths, and it repeatedly updates the path information of certain nodes in Q. This shows in two ways: the path information of a node in Q may be updated across multiple iterations, and the degree of perturbation to dist in each iteration is positively related to the out-degree of u.
  • These two points greatly increase the cost of maintaining Q in order to generate each new u.
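The repeated updates described above can be made visible with a minimal sketch of relaxation-based Dijkstra. This is an illustration, not the patent's code: the function name `dijkstra_with_update_count`, the `updates` counter and the use of a binary heap for Q are assumptions added here, while `dist`, `pre` and the settled set S follow the names in the text.

```python
import heapq

def dijkstra_with_update_count(graph, source):
    """graph: dict node -> list of (neighbor, weight), weights non-negative.
    Returns (dist, pre, updates); updates[v] counts how often dist[v] was rewritten."""
    dist = {node: float("inf") for node in graph}
    pre = {node: None for node in graph}
    updates = {node: 0 for node in graph}   # hypothetical counter, for illustration only
    dist[source] = 0
    settled = set()                         # the set S from the text
    heap = [(0, source)]                    # stands in for Q
    while heap:
        _, u = heapq.heappop(heap)
        if u in settled:
            continue                        # stale entry left behind by an earlier update
        settled.add(u)
        for v, w in graph[u]:
            # Relaxation: one check per outgoing edge of u, so the work per
            # iteration grows with the out-degree of u.
            if v not in settled and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                pre[v] = u
                updates[v] += 1
                heapq.heappush(heap, (dist[v], v))
    return dist, pre, updates
```

On a graph where A reaches B directly with weight 4 and also through C with total weight 3, `dist["B"]` is written twice before B is settled, which is the kind of repeated update the text refers to.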
  • This article is used to solve the problem of low computational efficiency in the shortest path determination process based on knowledge graphs in the existing technology.
  • this article provides an information acquisition method based on the shortest path of the knowledge graph.
  • the knowledge graph includes multiple nodes and adjacent edges between nodes.
  • the weight of the adjacent edges between nodes represents the distance between nodes.
  • the method includes:
  • S12: take the nodes whose shortest path has been determined and that have adjacent nodes whose shortest path has not been determined as first nodes, forming a first node set; from the adjacent nodes of the first nodes in the first node set whose shortest path has not been determined, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set and the determined path length of the newly added first node;
  • S13: repeat step S12 until all expected shortest paths are found, where the expected shortest paths depend on the shortest path search strategy;
  • this article provides a device for determining the shortest path of a knowledge graph.
  • the knowledge graph includes multiple nodes and adjacent edges between nodes.
  • the weight of the adjacent edges between nodes represents the distance between nodes.
  • the device includes:
  • the initialization unit is used to initially set the iteration starting point and determine the shortest path based on the shortest path search strategy and the nodes in the user request;
  • the shortest path determination unit is configured to take the nodes whose shortest path has been determined and that have adjacent nodes whose shortest path has not been determined as first nodes, forming a first node set, and, from the adjacent nodes of the first nodes in the first node set whose shortest path has not been determined, to select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set;
  • a loop control unit configured to repeatedly start the shortest path determination unit until all expected shortest paths are found, wherein the expected shortest path is related to the shortest path search strategy;
  • An information acquisition unit is used to acquire information from the knowledge graph according to the shortest path of each node and respond to the user request according to the acquired information.
  • the above two embodiments select, in each iteration, the adjacent node closest to the iteration starting point to determine its shortest path (referred to as the small step algorithm), so that the shortest path of one or more nodes is determined in each iteration while steadily approaching the shortest paths of the other nodes.
  • the number of adjacent edges of a node will not directly affect the time complexity and can improve the efficiency of determining the shortest path.
  • this article provides a method for determining the shortest path of the knowledge graph.
  • the knowledge graph includes multiple nodes and adjacent edges between nodes.
  • the weight of the adjacent edges between nodes represents the distance between nodes.
  • the methods include:
  • the edge nodes include: nodes in the first node set, whose shortest path has been determined and which have adjacent nodes whose shortest path has not been determined; nodes in the candidate node set, whose shortest path has not been determined and which can be added to the first node set; and nodes in the relaxation operation set, whose path was obtained through a relaxation operation but whose shortest path has not been determined; initially, the first node set includes the iteration starting point with its shortest path determined, and the candidate node set and the relaxation operation set are empty; the determined path length of the iteration starting point is initially set to zero, and the determined path lengths of the remaining nodes are set to infinity;
  • from the first node set and the candidate node set, the node pB with the smallest determined path length among the nodes with the largest heuristic path length is screened out; from the relaxation operation set, the node pA with the largest determined path length among the nodes with the smallest heuristic path length is screened out;
  • if the relaxation operation set is empty, or the heuristic path length of node pB is less than the heuristic path length of node pA, or the heuristic path length of node pB is equal to the heuristic path length of node pA and the determined path length of node pB is greater than or equal to that of node pA, the following small-step algorithm is executed: from the adjacent nodes of the first nodes in the first node set whose shortest path has not been determined, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set and the determined path length of the newly added first node;
  • the excellent node is the node pA2 with the largest determined path length among the nodes with the smallest heuristic path length in the relaxation operation set;
  • the nodes exceeding the second predetermined value are filtered out from the first node set and the candidate node set and moved to the relaxation operation set;
  • S26: repeat steps S22 to S25 until the shortest path from the source node to the target node is found and the heuristic path length of every edge node is greater than the length of the shortest path from the source node to the target node;
  • S27 Obtain information from the knowledge graph according to the shortest path of each node and respond to the user request according to the obtained information.
  • this article provides a device for determining the shortest path of a knowledge graph.
  • the knowledge graph includes multiple nodes and adjacent edges between nodes.
  • the weight of the adjacent edges between nodes represents the distance between nodes.
  • the device includes:
  • the initialization unit is used to initially set the iteration starting point and the edge node according to the shortest path search strategy and the source node and target node in the user request.
  • the edge nodes include first nodes, whose shortest path has been determined and which have an adjacent node whose shortest path has not been determined.
  • the first node set is initially set to include the iteration starting point with its shortest path determined, and the candidate node set and the relaxation operation set are empty; the determined path length of the iteration starting point is initially set to zero, and the determined path lengths of the remaining nodes to infinity;
  • the calculation unit is used to calculate the heuristic path length of the edge node, where the heuristic path length of the edge node is the path length from the iteration starting point to the iteration end point passing through the edge node;
  • a screening unit configured to screen out, based on the heuristic path length and the determined path length of the edge nodes, the node pB with the smallest determined path length among the nodes with the largest heuristic path length from the first node set and the candidate node set, and the node pA with the largest determined path length among the nodes with the smallest heuristic path length from the relaxation operation set;
  • an algorithm selection unit configured so that, if the relaxation operation set is empty, or the heuristic path length of node pB is less than the heuristic path length of node pA, or the heuristic path length of node pB is equal to the heuristic path length of node pA and the determined path length of node pB is greater than or equal to the determined path length of node pA, the following small-step algorithm is executed: from the adjacent nodes of the first nodes in the first node set whose shortest path has not been determined, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set and the determined path length of the newly added first node;
  • the node moving unit is used to select excellent nodes from the relaxation operation set and move them to the candidate node set.
  • the excellent node is the node pA2 with the largest determined path length among the nodes with the smallest heuristic path length in the relaxation operation set;
  • the nodes exceeding the second predetermined value are filtered out from the first node set and the candidate node set and moved to the relaxation operation set;
  • the loop control unit is used to repeatedly start the calculation unit, the screening unit, the algorithm selection unit and the node moving unit until the shortest path from the source node to the target node is found and the heuristic path length of every edge node is greater than the length of the shortest path from the source node to the target node;
  • the information acquisition unit is used to obtain information from the knowledge graph according to the shortest path of each node and respond to user requests based on the obtained information.
  • the above two embodiments use heuristic information as target guidance to combine the small step algorithm with the relaxation operation algorithm. This reduces the blind searching that the small step algorithm suffers when it knows nothing about the iteration end point. The excellent nodes screened out by the relaxation operation serve as candidates for first nodes in the small step algorithm, which narrows the search range of the small step algorithm and yields the shortest path from the iteration starting point to the iteration end point more quickly.
  • this article also provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor.
  • When the processor executes the computer program, it implements the method described in the aforementioned embodiments.
  • this article also provides a computer storage medium on which a computer program is stored.
  • When the computer program is run by a processor of a computer device, the instructions of the method according to the foregoing embodiments are executed.
  • Figure 1 shows the first flow chart of the information acquisition method based on the shortest path of the knowledge graph in the embodiment of this article
  • Figure 2 shows the first flow chart of the process of selecting the adjacent node closest to the iteration starting point to determine the shortest path and updating the first node set in the embodiment of this article;
  • Figure 3 shows a schematic diagram of a knowledge graph abstract graph according to Embodiment 1 of this article
  • Figure 4 shows the second flow chart of the information acquisition method based on the shortest path of the knowledge graph in the embodiment of this article
  • Figure 5 shows the second flow chart of the process of selecting the adjacent node closest to the iteration starting point to determine the shortest path and updating the first node set in the embodiment of this article;
  • Figure 6 shows a schematic diagram of another knowledge graph abstract graph according to the embodiment of this article.
  • Figure 7 shows a structural diagram of the information acquisition device based on the shortest path of the knowledge graph in the embodiment of this article
  • Figure 8 shows another structural diagram of the information acquisition device based on the shortest path of the knowledge graph according to the embodiment of this article
  • Figure 9 shows a structural diagram of a computer device according to an embodiment of this article.
  • Knowledge graphs can efficiently organize, manage and utilize massive amounts of information. They are widely used in many fields such as social networks, human resources and recruitment, finance, insurance, retail, advertising, communications, IT, manufacturing, media, medical care, e-commerce and logistics. For example, a knowledge graph can turn the Web from links between web pages into links between concepts (supporting retrieval by topic) and so achieve true semantic retrieval. As another example, a search engine based on knowledge graphs can feed structured knowledge back to users graphically, allowing users to locate and acquire knowledge accurately and in depth without having to browse a large number of web pages.
  • the information acquisition method and device based on the shortest path of the knowledge graph described in this article can be applied to any field in which information is acquired based on the shortest path of a knowledge graph, such as information retrieval and path planning (including robot navigation, vehicle navigation, etc.).
  • This article does not limit the specific application fields of the information acquisition method and device based on the shortest path of the knowledge graph.
  • the knowledge graph described in this article includes multiple nodes and adjacent edges between nodes.
  • the weight of the adjacent edges between nodes represents the length distance between nodes.
  • the specific content included in the knowledge graph depends on the application field.
  • the length distance between nodes is an abstract concept, which represents the cost, time, etc. of converting or obtaining information between nodes.
  • an information acquisition method based on the shortest path of the knowledge graph is provided to solve the problem of low computational efficiency in the shortest path determination process based on the knowledge graph in the existing technology.
  • Information acquisition methods based on the shortest path of the knowledge graph include:
  • Step S11 According to the shortest path search strategy and the nodes in the user request, initially set the iteration starting point and determine the shortest path.
  • Step S12 Take the nodes whose shortest path has been determined and that have adjacent nodes whose shortest path has not been determined as first nodes, forming a first node set; from the adjacent nodes of the first nodes in the first node set whose shortest path has not been determined, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set.
  • Step S13 Repeat the above step S12 until all expected shortest paths are found, where the expected shortest paths are related to the shortest path search strategy.
  • Step S14 Obtain information from the knowledge graph according to the shortest path of each node and respond to the user request according to the acquired information.
  • the client or server stores the knowledge graph information (the nodes and the connection relationships between nodes, where nodes reflect entities and the connection relationships between nodes reflect the association relationships between entities), and the information obtained from the knowledge graph is displayed on the client.
  • the client described in this article includes but is not limited to computer equipment, mobile terminals, etc., and can also be software installed on computer equipment, mobile terminals, etc.
  • This embodiment avoids relaxation operations and, in each iteration, selects the adjacent node closest to the iteration starting point to determine its shortest path (referred to as the small step algorithm), so that the shortest path of one or more nodes is determined in each iteration while steadily approaching the shortest paths of the other nodes. The number of adjacent edges of a node does not directly affect the time complexity, which improves the efficiency of determining the shortest path.
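Steps S11 to S13 for a single source search can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `small_step_shortest_paths`, the helper `push_pending` and the use of a heap to pick the first node with the smallest expected length are choices made here. What it takes from the text is that each first node keeps its adjacent edges in ascending order of weight, offers only its next pending edge, and that a node's path length is written once, when it is determined, rather than relaxed.

```python
import heapq

def small_step_shortest_paths(graph, start):
    """graph: dict node -> list of (neighbor, weight), weights non-negative.
    Returns (dist, pred) for every node reachable from start."""
    # Adjacent-edge lists in ascending order of weight.
    adj = {u: sorted(edges, key=lambda e: e[1]) for u, edges in graph.items()}
    dist = {start: 0}       # determined path lengths
    pred = {start: None}    # predecessor nodes
    # One entry per first node: (expected length of its second node,
    # first node, index of its pending edge in the sorted list).
    heap = []

    def push_pending(u, i):
        edges = adj.get(u, [])
        # Skip edges whose far end already has a determined shortest path.
        while i < len(edges) and edges[i][0] in dist:
            i += 1
        if i < len(edges):
            heapq.heappush(heap, (dist[u] + edges[i][1], u, i))
        # Otherwise u has no undetermined neighbour and leaves the first node set.

    push_pending(start, 0)
    while heap:
        expected, u, i = heapq.heappop(heap)
        v = adj[u][i][0]
        if v not in dist:
            dist[v] = expected      # written once, never updated afterwards
            pred[v] = u
            push_pending(v, 0)      # v becomes a first node
        push_pending(u, i + 1)      # advance u's pointer to its next pending edge
    return dist, pred
```

Because each first node's edges are sorted, its pending edge is always its cheapest remaining one, so the smallest expected length over all first nodes identifies the undetermined node closest to the starting point.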
  • The user request is received through a client operated by the user.
  • the user request at least includes the starting node, so that information related to the starting node can be mined through the above steps S11 to S14.
  • the shortest path search strategy at least includes single source search, single source single target forward search, single source single target reverse search, and single source single target bidirectional search.
  • the process of initially setting the iteration starting point includes:
  • If the shortest path search strategy is a single source search, set the source node in the user request as the iteration starting point, and execute steps S12 and S13 starting from the source node.
  • In this case, all expected shortest paths are considered found when no node with a determined shortest path has an adjacent node for which the shortest path has not been determined.
  • If the shortest path search strategy is a single source single target forward search, set the source node in the user request as the iteration starting point, and execute steps S12 and S13 in the forward search direction from the source node to the target node.
  • If the shortest path search strategy is a single source single target reverse search, set the target node in the user request as the iteration starting point, and execute steps S12 and S13 in the reverse search direction from the target node to the source node.
  • If the shortest path search strategy is a single source single target bidirectional search, set both the source node and the target node in the user request as iteration starting points, and perform steps S12 and S13 separately in the forward search direction from the source node to the target node and in the reverse search direction from the target node to the source node.
  • the determination condition for all expected shortest paths is that there are nodes associated with both the source node and the target node.
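The mapping from search strategy to iteration starting point described above can be written as a small lookup. The names `Strategy` and `iteration_starting_points` and the string labels for the directions are hypothetical, introduced only for this sketch.

```python
from enum import Enum

class Strategy(Enum):
    SINGLE_SOURCE = "single source"
    FORWARD = "single source single target forward"
    REVERSE = "single source single target reverse"
    BIDIRECTIONAL = "single source single target bidirectional"

def iteration_starting_points(strategy, source, target=None):
    """Return a list of (iteration starting point, search direction) pairs."""
    if strategy is Strategy.SINGLE_SOURCE:
        return [(source, "forward")]
    if target is None:
        raise ValueError("a target node is required for single-target strategies")
    if strategy is Strategy.FORWARD:
        return [(source, "forward")]
    if strategy is Strategy.REVERSE:
        return [(target, "reverse")]
    if strategy is Strategy.BIDIRECTIONAL:
        # Two starting points, one search per direction.
        return [(source, "forward"), (target, "reverse")]
    raise ValueError(f"unknown strategy: {strategy!r}")
```

For source node A and target node G, the bidirectional strategy yields two searches, one starting at A in the forward direction and one starting at G in the reverse direction.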
  • the source node is node A
  • the target node is node G
  • the search direction is from node A to node G
  • the intermediate nodes include node B.
  • To associate the source node and the target node with itself is to record node A in the association information of the source node. Node G is recorded in the association information of the target node, and node A is recorded in the association information of node B.
  • the determined path is the shortest path.
  • The directed adjacent edge <from,to>_f is used for forward search and denotes the directed edge from from to to; from corresponds to the first node, to is the adjacent point of from, and to corresponds to the second node.
  • The directed adjacent edge <from,to>_b is used for reverse search and denotes the directed edge from from to to; to corresponds to the first node, from is the adjacent point of to, and from corresponds to the second node.
  • For the adjacent edge <from,to>_b of the reverse search, the roles are opposite to those of the adjacent edge <from,to>_f of the forward search: in the reverse search the second node from points to the first node to, whereas in the forward search the first node from points to the second node to; in both cases the direction is consistent with that of the adjacent edge <from,to>.
  • the to node in the linear list of adjacent edges of the first node to is the same node, and the to node corresponds to multiple second nodes from.
  • the from node in the linear list of adjacent edges of the first node from is the same node, and the from node corresponds to multiple second nodes to.
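The two linear lists of adjacent edges can be built from one edge list. This is a sketch under assumptions: the function name `build_adjacency` and the tuple layout are hypothetical, and the ascending sort by weight follows the ordering the text uses when determining the adjacent edge to be processed.

```python
from collections import defaultdict

def build_adjacency(edges):
    """edges: iterable of (from_node, to_node, weight) directed edges.
    Returns (forward, reverse): dicts mapping a first node to its list of
    (second node, weight) pairs, sorted by ascending weight."""
    forward = defaultdict(list)   # first node = from, second nodes = to
    reverse = defaultdict(list)   # first node = to, second nodes = from
    for frm, to, weight in edges:
        forward[frm].append((to, weight))
        reverse[to].append((frm, weight))
    for table in (forward, reverse):
        for node in table:
            table[node].sort(key=lambda e: e[1])
    return dict(forward), dict(reverse)
```

A forward search reads `forward[first_node]`, while a reverse search reads `reverse[first_node]`; the stored edges themselves keep their original <from,to> direction.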
  • single-source single-target bidirectional search such as whether to use a unified second node to predict the shortest path length, and whether to use multi-threading.
  • the following method uses an independent second node's expected shortest path length for forward search and reverse search, and in the same thread, independently calculates the second node's expected shortest path length for each iteration direction.
  • the shortest path of a node described in this article includes: the node's predecessor node and the node's determined path length.
  • the determined path length of a node is equal to the determined path length of the node's predecessor node plus the weight of the adjacent edge between the node and its predecessor node.
  • A node may have zero, one or more predecessor nodes.
  • the determined path length of a node represents the sum of the weights of the adjacent edges traversed from the iteration starting point to the node. In the initial state, the determined path length of the iteration starting point is zero, and the determined path lengths of the remaining nodes are infinite. These values are updated during the path search process.
  • the predecessor node of the iteration starting point is empty, and the determined path length of the iteration starting point is zero.
  • the predecessor node of node C is node B
  • the predecessor node of node B is source node A
  • the weight of adjacent edge <A,B> is 3
  • the weight of adjacent edge <B,C> is 2
  • the determined path length of node C is 5.
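The example above can be checked by walking the predecessor records back to the iteration starting point. The function name `determined_path_length` and the dictionary layout are assumptions made for this sketch.

```python
def determined_path_length(node, pred, weight):
    """pred: node -> predecessor node (None for the iteration starting point).
    weight: (predecessor, node) -> weight of the adjacent edge."""
    length = 0
    while pred[node] is not None:
        length += weight[(pred[node], node)]
        node = pred[node]
    return length

# Values from the example: A -> B has weight 3, B -> C has weight 2.
pred = {"A": None, "B": "A", "C": "B"}
weight = {("A", "B"): 3, ("B", "C"): 2}
print(determined_path_length("C", pred, weight))  # prints 5
```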
  • the shortest path of a node may also include the adjacent edge weight of the node, the predecessor node record number, the current node, and the current node record number.
  • the path information from the second node to the first node is recorded through the predecessor node and the predecessor node, and the adjacent edge information is recorded through the path length, predecessor node and current node.
  • the direction of the forward search is from the node from to the node to, the predecessor node of the node to is the from node
  • the direction of the reverse search is from the node to to the node from
  • the predecessor node of the node from is the to node
  • the shortest path p1->p2->p3 consists of two directed edges <p1, p2> and <p2, p3>: the path pointers recorded by the forward search are p1<-p2 and p2<-p3, which is opposite to the path direction.
  • the path directions of the reverse search are p2->p3 and p1->p2, which are consistent with the path direction.
  • step S12 from the adjacent nodes of the first node in the first node set for which the shortest path has not been determined, selecting the adjacent node closest to the iteration starting point to determine the shortest path and updating the first node set includes:
  • Step S121 Determine relevant information for the first node in the first node set for which no relevant information has been determined and remove the first node without relevant information from the first node set.
  • the relevant information of the first node includes: the second node, the adjacent edge to be processed, and the estimated shortest path length of the second node.
  • the adjacent edge to be processed is the adjacent edge that satisfies the following conditions among all adjacent edges between the first node and its adjacent nodes for which the shortest path has not been determined: its weight is the minimum among those adjacent edges, and the sum of its weight and the determined path length of the first node is greater than the cumulative movement step associated with the first node set.
  • the second node is the non-first node of the adjacent edge to be processed.
  • the expected shortest path length of the second node of the first node is equal to the weight of the adjacent edge to be processed plus the determined path length of the first node.
  • the cumulative movement step associated with the first node set is equal to the shortest path length of the node, among those with a determined shortest path, that is farthest from the iteration starting point; it reflects the path weight length that has been covered from the iteration starting point.
  • In the initial state, the cumulative movement step associated with the first node set is zero. In the same iteration, this method determines in turn the shortest paths of all second nodes on adjacent edges of the first node that have the same weight value.
  • the adjacent edge to be processed can be determined based on the linear list of adjacent edges arranged in ascending order of the first node.
  • the linear list of adjacent edges points to the first adjacent edge of the unconfirmed shortest path through a pointer.
  • the determination process includes:
  • Step S122 Select the first node with the smallest expected shortest path length of the second node from the first node set.
  • Step S123 The second node related to the filtered first node is the adjacent node closest to the iteration starting point.
  • the filtered first node is used as the predecessor node of its related second node, and the estimated shortest path length of the second node associated with the first node is the determined path length of the second node.
  • Step S124 Update the cumulative movement step length associated with the first node set to the estimated shortest path length of the second node associated with the filtered first node.
  • Step S125 perform the following judgment on the second node related to each filtered first node, the adjacent edge to be processed, and the estimated shortest path length of the second node:
  • Step S1251 When the estimated shortest path length of the second node is equal to the determined path length of the second node, the second node is regarded as the successor node, a shortest path is added to the successor node according to the first node and its successor node, and the relevant information of the first node is determined; if there is no relevant information, the first node is removed from the first node set;
  • the estimated shortest path length of the second node that is, the sum of the determined path length of the first node and the weight of the adjacent edge to be processed, is equal to the determined path length of the second node, indicating that the second node has joined the first node set, At this time, it is only necessary to determine the relevant information of the first node. When the first node has no relevant information, the first node has no adjacent nodes for which the shortest path has not been determined.
  • Step S1252 When the estimated shortest path length of the second node is less than the determined path length of the second node, the second node is regarded as the successor node, a shortest path is added to the successor node according to the first node and its successor node, and the relevant information of the first node is determined; if there is no relevant information, the first node is removed from the first node set.
  • The second node is then added to the first node set as a first node, and the relevant information of the newly added first node is determined; if there is no relevant information, the newly added first node is removed from the first node set.
  • the estimated shortest path length of the second node of the first node is the sum of the determined path length of the first node and the weight of the adjacent edge to be processed. If this length is less than the determined path length of the second node, the second node has not yet joined the first node set.
  • the information acquisition method based on the shortest path of the knowledge graph also includes:
  • the third node is regarded as the successor node, and a shortest path is added to the successor node according to the first node and its successor node; otherwise,
  • the third node is regarded as the successor node, a shortest path is added to the successor node according to the first node and its successor node, the third node is added to the first node set as a first node, and the relevant information of the newly added first node is determined; if there is no relevant information, the newly added first node is removed from the first node set.
  • The first nodes are node 5, node 3 and node 4.
  • The predecessor node of node 5 is node 2. Assume that the determined path length of node 5 is 10, the determined path length of node 3 is 2, the determined path length of node 4 is 10, and the weight of the adjacent edge <3,5> is 8. Node 4 and node 5 do not have a third node.
  • The third node of node 3 is node 5. Because 2 + 8 = 10 equals the determined path length of node 5, the shortest path of the third node can be updated with an equal-length path through node 3.
  • The predecessor nodes of the third node then include node 2 and node 3.
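  • The arithmetic of this example can be checked with a short sketch. The dictionary names `g` and `preds` are illustrative assumptions and do not appear in this document; only the numbers come from the example.

```python
# Determined path lengths and predecessors taken from the example above.
g = {3: 2, 4: 10, 5: 10}
preds = {5: [2]}            # node 5 was first reached through node 2
weight_3_5 = 8              # weight of the adjacent edge <3,5>

cost = g[3] + weight_3_5    # length of the path to node 5 through node 3
if cost == g[5]:
    # Equal-length path: keep the old predecessor and record node 3 as well.
    preds[5].append(3)
```

  • After the update, `preds[5]` holds both node 2 and node 3, matching the predecessor list stated in the example.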
  • determining the relevant information of the first node also includes:
  • the relevant information of the first node with the smallest estimated shortest path length of the second node is retained, and the relevant information is re-determined for other first nodes.
  • This embodiment can ensure that unnecessary interference lines are removed during the shortest path search process and avoid invalid iterations.
  • a method for determining the shortest path of a knowledge graph is also provided to solve the problem of low computational efficiency in the shortest path determination process based on the knowledge graph in the prior art.
  • The knowledge graph includes multiple nodes and the adjacent edges between nodes, and the weight of an adjacent edge represents the distance between the nodes it connects. As shown in Figure 4, the method includes:
  • Step S21: Initially set the iteration starting point and the edge nodes according to the shortest path search strategy and the source node and target node in the user request.
  • The edge nodes include the nodes in the first node set, the nodes in the candidate node set and the nodes in the relaxation operation set.
  • Initially, the first node set includes the iteration starting point, whose shortest path is determined, and the candidate node set and the relaxation operation set are empty; the determined path length of the iteration starting point is set to zero, and the determined path lengths of the remaining nodes are set to infinity.
  • The nodes in the first node set are nodes whose shortest path has been determined and which have adjacent nodes with undetermined shortest paths;
  • the nodes in the candidate node set are nodes with undetermined shortest paths that can be added to the first node set;
  • the nodes in the relaxation operation set are nodes for which a path has been obtained through the relaxation operation but whose shortest path has not been determined.
  • the shortest path search strategy includes: single source single target forward search, single source single target reverse search, and single source single target bidirectional search.
  • The shortest path of a node includes the node's predecessor node and the node's determined path length; the predecessor node of the iteration starting point is empty, and the determined path length of the iteration starting point is zero.
  • the shortest path of a node also includes node information in the shortest path between the iteration starting point and the node.
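  • The initial state described in step S21 can be sketched as follows. This is a minimal illustration, not the patented implementation; the function name `init_search` and the variable names are assumptions introduced here.

```python
import math

def init_search(nodes, start):
    """Build the initial state of step S21 (all names are illustrative)."""
    g = {n: math.inf for n in nodes}   # determined path length of every node
    g[start] = 0                       # the iteration starting point has length zero
    preds = {n: [] for n in nodes}     # predecessor list; empty for the starting point
    first_set = {start}                # starting point, shortest path already determined
    candidate_set = set()              # empty at initialization
    relaxation_set = set()             # empty at initialization
    return g, preds, first_set, candidate_set, relaxation_set

g, preds, first_set, candidate_set, relaxation_set = init_search(range(9), 0)
```

  • Keeping the predecessors as a list leaves room for recording several equal-length shortest paths for the same node.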
  • Step S22 Calculate the heuristic path length of the edge node, where the heuristic path length of the edge node is the path length from the iteration starting point to the iteration end point passing through the edge node.
  • f(n) = g(n) + h(n);
  • where n is the edge node; f(n) is the heuristic path length from the iteration starting point to the iteration end point through edge node n, that is, the estimated cost from the iteration starting point to the iteration end point through edge node n; h(n) is the predicted path length from edge node n to the iteration end point, that is, the cost still expected to get from the edge node to the iteration end point; g(n) is the determined path length from the iteration starting point to edge node n, that is, the cost already determined from the iteration starting point to edge node n; h is an estimation function, which can be a Euclidean distance function, a Manhattan distance function, a Chebyshev distance function, etc., and this document does not limit the specific type of h. The predicted path length from any node to the iteration end point must be less than or equal to the length of the shortest path from that node to the iteration end point.
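  • As an illustration of f(n) = g(n) + h(n), the sketch below uses the Manhattan distance as h. The node coordinates, the dictionary names and the function names are hypothetical; whether a given distance function satisfies the upper-bound condition on h depends on the edge weights of the actual graph.

```python
def manhattan(a, b):
    """Manhattan distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def heuristic_length(n, g, coords, end):
    """f(n) = g(n) + h(n), with h taken as the Manhattan distance to the end point."""
    return g[n] + manhattan(coords[n], coords[end])

# Hypothetical coordinates and determined path lengths, for illustration only.
coords = {"s": (0, 0), "a": (1, 2), "t": (4, 3)}
g = {"s": 0, "a": 3}
f_a = heuristic_length("a", g, coords, "t")   # 3 + (3 + 1)
```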
  • Step S23: According to the heuristic path lengths and the determined path lengths of the edge nodes, the node pB with the smallest determined path length among the nodes with the largest heuristic path length is screened out from the first node set and the candidate node set, and the node pA with the largest determined path length among the nodes with the smallest heuristic path length is screened out from the relaxation operation set.
  • Among the nodes with the largest heuristic path length in the first node set and the candidate node set, the node pB with the smallest determined path length is the worst node. Among the nodes with the smallest heuristic path length in the relaxation operation set, the node pA with the largest determined path length is the best node. The specific reasons are:
  • The node with the largest heuristic path length f: the actual length of the path through this node to the iteration end point is more likely to be greater than the actual length of the paths through other nodes to the iteration end point.
  • Among these nodes, the node pB with the smallest determined path length g is selected: its predicted path length h to the iteration end point is the largest, and a larger h value carries greater uncertainty when it is converted to g, so the g value from this node to the iteration end point is more likely to exceed that of other nodes, that is, a longer, poor-quality path is more likely to be generated. Therefore, node pB is the worst node.
  • The node with the smallest heuristic path length f: the shortest path from the iteration starting point to this node is highly likely to lie on the shortest path from the iteration starting point to the iteration end point.
  • Among these nodes, the node pA with the largest determined path length g is selected: its predicted path length h to the iteration end point is the smallest.
  • A smaller h value carries less uncertainty when it is converted to g, that is, it generates good-quality paths.
  • Selecting the node with the largest g value also helps ensure that the condition of being not less than the cumulative movement step is met.
  • It also reduces how often nodes move between the sets processed by the small-step mechanism and the relaxation operation set. Therefore, node pA is the best node.
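  • The screening of pB and pA in step S23 can be sketched with tuple keys. The function names and the sample values of f and g are assumptions made for illustration.

```python
def select_pb(first_set, candidate_set, f, g):
    """Worst node: largest f, ties broken by smallest g. None if both sets are empty."""
    pool = first_set | candidate_set
    return max(pool, key=lambda n: (f[n], -g[n])) if pool else None

def select_pa(relaxation_set, f, g):
    """Best node: smallest f, ties broken by largest g. None if the set is empty."""
    if not relaxation_set:
        return None
    return min(relaxation_set, key=lambda n: (f[n], -g[n]))

# Hypothetical heuristic and determined path lengths, for illustration only.
f = {1: 9, 2: 12, 3: 12, 4: 8, 7: 8}
g = {1: 4, 2: 5, 3: 3, 4: 2, 7: 6}
pB = select_pb({2, 3}, {1}, f, g)   # nodes 2 and 3 share the largest f; node 3 has the smaller g
pA = select_pa({4, 7}, f, g)        # nodes 4 and 7 share the smallest f; node 7 has the larger g
```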
  • Step S24: Select the small-step algorithm or the relaxation operation algorithm according to nodes pB and pA, specifically including:
  • If the relaxation operation set is empty, or the heuristic path length of node pB is less than that of node pA, or the heuristic path length of node pB is equal to that of node pA and the determined path length of node pB is greater than or equal to the determined path length of node pA, the following small-step algorithm is executed: from the adjacent nodes, with undetermined shortest paths, of the first nodes in the first node set, the adjacent node closest to the iteration starting point is selected, its shortest path is determined, and the first node set is updated.
  • Otherwise, the paths of all adjacent nodes of node pA in the relaxation operation set are calculated based on the relaxation operation, and the relaxation operation set is updated.
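  • The selection rule of step S24 reduces to a single predicate, sketched below. The function name is hypothetical, and the handling of a missing pB (first node set and candidate node set both empty) is an assumption, since this case is not spelled out here.

```python
def use_small_step(pB, pA, f, g):
    """Return True to run the small-step algorithm, False to run the relaxation operation.

    pA is None when the relaxation operation set is empty.
    """
    if pA is None:
        return True                  # relaxation operation set is empty
    if pB is None:
        return False                 # assumption: nothing for the small-step algorithm to use
    if f[pB] < f[pA]:
        return True
    return f[pB] == f[pA] and g[pB] >= g[pA]
```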
  • Calculating the paths of all adjacent nodes of node pA in the relaxation operation set based on the relaxation operation and updating the relaxation operation set includes:
  • the path length of node pNeighbor is cost, and the predecessor node is pA;
  • if cost is equal to the originally determined path length g(pNeighbor): the path information of the adjacent node pNeighbor is recorded according to node pA and cost, that is, an equal-length path is added to pNeighbor;
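  • A minimal sketch of the relaxation of the adjacent edges of pA follows. The adjacency-list layout `adj`, the function name and the sample values are assumptions; the strictly-shorter branch is also an assumption, because only the equal-length case is stated explicitly above.

```python
import math

def relax_neighbors(pA, adj, g, preds, relaxation_set):
    """Relax every adjacent edge of pA. adj[u] is a list of (neighbor, weight) pairs."""
    for p_neighbor, weight in adj.get(pA, []):
        cost = g[pA] + weight
        if cost < g[p_neighbor]:
            # Assumed case: a strictly shorter path replaces the old one.
            g[p_neighbor] = cost
            preds[p_neighbor] = [pA]
            relaxation_set.add(p_neighbor)
        elif cost == g[p_neighbor] and pA not in preds[p_neighbor]:
            # Equal-length path: keep the old predecessors and add pA.
            preds[p_neighbor].append(pA)

# Hypothetical graph fragment, for illustration only.
adj = {"a": [("b", 2), ("c", 5)]}
g = {"a": 1, "b": math.inf, "c": 6}
preds = {"a": [], "b": [], "c": ["x"]}
relaxation_set = {"a"}
relax_neighbors("a", adj, g, preds, relaxation_set)
```

  • In this run node b receives its first path through a, while node c keeps its existing predecessor and gains a as a second, equal-length one.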
  • Step S25: Move nodes among the relaxation operation set, the candidate node set and the first node set, specifically including:
  • the excellent node is the node pA2 with the largest determined path length among the nodes with the smallest heuristic path length in the relaxation operation set;
  • nodes exceeding the second predetermined value are filtered out from the first node set and the candidate node set and moved to the relaxation operation set.
  • Step S26 Repeat the above process from step S22 to step S25 until the shortest path from the source node to the target node is found and the heuristic path length of the edge node is greater than the length of the shortest path from the source node to the target node.
  • Step S27 Obtain information from the knowledge graph according to the shortest path of each node and respond to the user request according to the acquired information.
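  • The stop condition of step S26 can be read as the predicate sketched below. The function name is hypothetical, and applying the comparison to every remaining edge node is an assumed reading of the condition.

```python
import math

def search_finished(g, f, target, edge_nodes):
    """Stop test of step S26: the target has a path and every edge node has f above g(target)."""
    if g[target] == math.inf:
        return False                 # no path to the target node has been found yet
    return all(f[n] > g[target] for n in edge_nodes)
```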
  • This embodiment uses heuristic information as target guidance to combine the small-step algorithm with the relaxation operation algorithm, which reduces the blindness of the small-step search caused by knowing nothing about the iteration end point. Excellent nodes are screened out through the relaxation operation and serve as candidates for the first nodes of the small-step algorithm. When the number of nodes in the first node set and the candidate node set is greater than the predetermined value, some of those nodes are moved to the relaxation operation set, which narrows the search range of the small-step algorithm and obtains the shortest path from the iteration starting point to the iteration end point more quickly.
  • In step S21, if the shortest path search strategy is single source single target forward search, the iteration starting point is the source node. Steps S22 to S25 are then executed in the forward search direction from the source node to the target node. In step S22, the heuristic path length from the iteration starting point through the edge node is the heuristic path length from the source node through the edge node to the target node.
  • If the shortest path search strategy is single source single target reverse search, the iteration starting point is the target node, and steps S22 to S25 are executed in the reverse search direction from the target node to the source node.
  • In step S22, the heuristic path length from the iteration starting point through the edge node is then the heuristic path length from the target node through the edge node to the source node.
  • If the shortest path search strategy is single source single target bidirectional search, the iteration starting points are the source node and the target node, and steps S22 to S25 are executed separately in the forward search direction from the source node to the target node and in the reverse search direction from the target node to the source node.
  • In the forward search direction, the heuristic path length from the iteration starting point through the edge node in step S22 is the heuristic path length from the source node through the edge node to the target node.
  • In the reverse search direction, the heuristic path length from the iteration starting point through the edge node in step S22 is the heuristic path length from the target node through the edge node to the source node.
  • The single source single target bidirectional search performs a forward search with the iteration starting point s as the search starting point and a reverse search with the iteration end point t as the search starting point. When the two searches reach the same node, a shortest path from s to t is found.
  • The directed adjacent edge <from, to>_f is used for forward search and denotes the directed edge from from to to; from corresponds to the first node, to is the adjacent node of from, and to corresponds to the second node;
  • the directed adjacent edge <from, to>_b is used for reverse search and denotes the directed edge from from to to; to corresponds to the first node, from is the adjacent node of to, and from corresponds to the second node;
  • the direction in which the reverse search traverses the adjacent edge <from, to>_b is opposite to the direction in which the forward search traverses the adjacent edge <from, to>_f: in the reverse search the second node from points to the first node to, while in the forward search the first node from points to the second node to, which is consistent with the direction of the adjacent edge <from, to>;
  • the reverse search the to node in the linear list of adjacent edges of the first node to is the same node, and the to node Corresponds to
  • In step S25, filtering out nodes exceeding the second predetermined value from the first node set and the candidate node set and moving them to the relaxation operation set includes:
  • The second predetermined value is a tuning parameter. If it is too small, nodes move continually among the first node set, the candidate node set and the relaxation operation set. If it is too large, the number of nodes in the first node set and the candidate node set grows, which increases the amount of calculation and makes the heuristic information lose its effect. A reasonable choice of the second predetermined value therefore reduces the number of node movements among the first node set, the candidate node set and the relaxation operation set and improves the efficiency of determining the shortest path.
  • the second predetermined value may be determined by one of the following three methods:
  • The second predetermined value can be initialized to a value between the lower limit and the upper limit and then adjusted dynamically according to the following strategy: in every K0 consecutive iterations, if the number of nodes moved from the relaxation operation set to the candidate node set is less than K1, the second predetermined value is reduced by M; if that number is greater than K2, the second predetermined value is increased by M, where K0, K1, K2 and M are positive integers that can be set according to the actual situation.
  • Application scenarios include information retrieval, path planning, etc., depending on the specific application field of the knowledge graph.
  • The K0 iterations mentioned herein refer to the number of times steps S22 to S24 above are performed.
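  • The dynamic adjustment strategy can be sketched as a function called once per window of K0 iterations. The function and parameter names are hypothetical, and clamping the result to the lower and upper limits is an assumption that goes beyond the initialization rule stated above; the sketch also assumes K1 is not greater than K2.

```python
def adjust_threshold(value, moved, k1, k2, m, lower, upper):
    """Adjust the second predetermined value after one window of K0 iterations.

    moved is the number of nodes moved from the relaxation operation set to the
    candidate node set during the window.
    """
    if moved < k1:
        value -= m                   # few promotions: reduce the threshold by M
    elif moved > k2:
        value += m                   # many promotions: increase the threshold by M
    return max(lower, min(upper, value))   # assumed clamp to [lower, upper]
```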
  • the information acquisition method based on the shortest path of the knowledge graph also includes:
  • the first intersection node is removed from the relaxation operation set and the candidate node set, wherein the first intersection node is a node in the intersection of the first node set with the relaxation operation set and the candidate node set.
  • a second intersection node is removed from the first node set and the candidate node set, wherein the second intersection node is a node in the intersection of the relaxation operation set with the first node set and the candidate node set.
  • the cumulative movement step associated with the first node set is also set to zero.
  • When the first node set is updated in step S24, the cumulative movement step associated with the first node set is also updated.
  • In step S25, the conditions for screening out an excellent node from the relaxation operation set and moving it to the candidate node set include:
  • the first node set and the candidate node set are empty; or the heuristic path length of the excellent node is less than the maximum heuristic path length of the nodes in the first node set and the candidate node set, and the determined path length of the excellent node is greater than or equal to the cumulative movement step associated with the first node set.
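  • The execution condition above can be written as one predicate. The function and parameter names are illustrative assumptions.

```python
def may_promote(excellent, first_set, candidate_set, f, g, cumulative_step):
    """Return True if the excellent node may move from the relaxation set to the candidate set."""
    pool = first_set | candidate_set
    if not pool:
        return True                  # first node set and candidate node set are both empty
    max_f = max(f[n] for n in pool)  # maximum heuristic path length in the two sets
    return f[excellent] < max_f and g[excellent] >= cumulative_step
```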
  • In step S24, in the small-step algorithm, selecting the adjacent node closest to the iteration starting point from the adjacent nodes, with undetermined shortest paths, of the first nodes in the first node set, determining its shortest path and updating the first node set includes:
  • Step S241: When the first node set is empty, move the node with the smallest determined path length in the candidate node set to the first node set and set the cumulative movement step associated with the first node set to the determined path length of the moved node.
  • Step S242 Determine relevant information for the first node in the first node set for which no relevant information has been determined and remove the first node without relevant information from the first node set.
  • the relevant information of the first node includes: the second node, the adjacent edge to be processed and the estimated shortest path length of the second node;
  • the adjacent edge to be processed is the adjacent edge, among all adjacent edges between the first node and its adjacent nodes with undetermined shortest paths, that meets the following conditions: its weight is the minimum among those adjacent edges, and the sum of its weight and the determined path length of the first node is greater than the cumulative movement step associated with the first node set;
  • the second node is a non-first node of the adjacent edge to be processed;
  • the estimated shortest path length of the second node of the first node is equal to the weight of the adjacent edge to be processed plus the determined path length of the first node.
  • Step S243: Screen out, from the first node set, the first node with the smallest estimated shortest path length of the second node, and select a node from the candidate node set to move to the first node set based on the screened-out first node.
  • In this step, when a node moves from the candidate node set to the first node set, step S242 needs to be re-executed to determine relevant information for the moved node, and the first node with the smallest estimated shortest path length of the second node in the first node set needs to be re-determined.
  • selecting a node from the candidate node set to move to the first node set includes:
  • Determine whether a node in the candidate node set was moved to the first node set in (2). If so, perform step S243 again to screen out, from the first node set, the first node with the smallest estimated shortest path length of the second node; otherwise, perform step (1) again, until the minimum determined path length of the nodes in the candidate node set is greater than the estimated shortest path length of the second node of the screened-out first node.
  • In this step, by moving the node closest to the iteration starting point in the candidate node set to the first node set, it is ensured that the second node corresponding to the estimated shortest path length of the second node of the screened-out first node is the node closest to the iteration starting point among the nodes with undetermined shortest paths. When nodes are moved from the relaxation operation set toward the first node set, the nodes in the candidate node set that are closest to the iteration starting point are given priority to be moved to the first node set or removed from the candidate node set.
  • Step S244: The second node related to the screened-out first node is the adjacent node closest to the iteration starting point.
  • Step S245 Update the cumulative movement step length associated with the first node set to the estimated shortest path length of the second node associated with the filtered first node.
  • Step S246 Perform the following processing on the second node related to each filtered first node, the adjacent edge to be processed, and the estimated shortest path length of the second node:
  • Step S2461: When the estimated shortest path length of the second node of the first node is equal to the determined path length of the second node, the second node is taken as the successor node, a shortest path is added to the successor node according to the first node and its successor node, and the relevant information of the first node is re-determined; if there is no relevant information, the first node is removed from the first node set.
  • Step S2462: When the estimated shortest path length of the second node of the first node is less than the determined path length of the second node, the second node is taken as the successor node, a shortest path is added to the successor node according to the first node and its successor node, and the relevant information of the first node is re-determined; if there is no relevant information, the first node is removed from the first node set. The second node is then added to the first node set as a first node, and the relevant information of the newly added first node is determined; if there is no relevant information, the newly added first node is removed from the first node set.
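  • The two cases of step S246 can be sketched as follows. The function name and the data layout are assumptions, the re-determination of relevant information is deliberately left out, and replacing the predecessor list in the second case is an assumed way of recording the new shortest path.

```python
import math

def process_second_node(first, second, estimate, g, preds, first_set):
    """Apply step S2461 or S2462 to one screened-out first node.

    estimate is g[first] plus the weight of the adjacent edge to be processed.
    """
    if estimate == g[second]:
        # S2461: the second node already has a path of this length; add another one.
        preds[second].append(first)
    elif estimate < g[second]:
        # S2462: record the shorter path and make the second node a first node.
        g[second] = estimate
        preds[second] = [first]
        first_set.add(second)

# Hypothetical state, for illustration only.
g = {"u": 2, "v": 5, "w": math.inf}
preds = {"v": ["x"], "w": []}
first_set = {"u"}
process_second_node("u", "v", 5, g, preds, first_set)   # equal length
process_second_node("u", "w", 6, g, preds, first_set)   # shorter than infinity
```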
  • The knowledge graph of this embodiment contains 9 nodes.
  • The number of each node and the weights between the nodes are shown in Figure 6; the source node is node 0 and the end node is node 8.
  • The heuristic path length f(n), determined path length g(n) and predicted path length h(n) of each node are respectively:
  • Information acquisition methods based on the shortest path of the knowledge graph include:
  • The shortest path length of the iteration starting point 0 is 0.
  • The iteration starting point is initialized into the first node set.
  • The relaxation operation set and the candidate node set are empty, recorded as relaxation operation set {}, candidate node set {} and first node set {0}; at this time the only edge node is node 0.
  • First loop: calculate the heuristic path length of the edge nodes.
  • The small-step algorithm is executed, so there is no need to screen out nodes from the relaxation operation set and move them to the candidate node set.
  • Second loop: calculate the heuristic path length of the edge nodes.
  • After the move, the relaxation operation set includes node 1, the candidate node set is empty, and the first node set includes nodes 2 and 3.
  • Third loop: calculate the heuristic path length of the edge nodes.
  • The edge nodes include node 1, node 2 and node 3.
  • Fourth loop: calculate the heuristic path length of the edge nodes.
  • The edge nodes include node 1, node 2 and node 4.
  • The step size is 3. Therefore, node 6 is added to the candidate node set.
  • The candidate node set is {6} and the first node set is {2,4}.
  • The number of nodes in the first node set plus the candidate node set is greater than N. Therefore, node 4, which has the largest heuristic path length in the first node set and the candidate node set, needs to be moved to the relaxation operation set.
  • The relaxation operation set is {7, 4}, the candidate node set is {6}, and the first node set is {2}.
  • Fifth loop: calculate the heuristic path length of the edge nodes.
  • The edge nodes include node 2, node 6, node 4 and node 7.
  • Sixth loop: calculate the heuristic path length of the edge nodes.
  • The edge nodes include node 2, node 7 and node 4.
  • In step S242, after the relevant information of the first node is determined, the method also includes:
  • the relevant information of the first node with the smallest estimated shortest path length of the second node is retained, and the relevant information is re-determined for other first nodes.
  • In order to avoid missing shortest paths, the small-step algorithm also includes:
  • the third node is taken as the successor node, and a shortest path is added to the successor node according to the first node and its successor node; otherwise,
  • the third node is taken as the successor node, a shortest path is added to the successor node according to the first node and its successor node, the third node is added to the first node set as a first node, and the relevant information of the newly added first node is determined; if there is no relevant information, the newly added first node is removed from the first node set.
  • In order to improve the efficiency of shortest path determination, a task-parallel distributed system can use multiple cluster devices to simultaneously process multiple single-target-node shortest path determination tasks and single-source-node shortest path determination tasks, and can implement the different computing processes in each task using cluster or parallel computing methods.
  • This document also provides an information acquisition device based on the shortest path of the knowledge graph, as described in the following embodiments. Since the problem-solving principle of the device is similar to that of the information acquisition method based on the shortest path of the knowledge graph, the implementation of the device can refer to the implementation of the method, and repeated parts are not described again.
  • an information acquisition device based on the shortest path of the knowledge graph includes:
  • the initialization unit 701 is used to initially set the iteration starting point and determine its shortest path according to the shortest path search strategy and the nodes in the user request;
  • the shortest path determination unit 702 is configured to take nodes that have determined shortest paths and have adjacent nodes with undetermined shortest paths as first nodes to form a first node set, select, from the adjacent nodes of the first nodes with undetermined shortest paths, the adjacent node closest to the iteration starting point, determine its shortest path and update the first node set;
  • the loop control unit 703 is configured to repeatedly start the shortest path determination unit 702 until all expected shortest paths are found, wherein the expected shortest path is related to the shortest path search strategy;
  • the information acquisition unit 704 is configured to acquire information from the knowledge graph according to the shortest path of each node and respond to the user request according to the acquired information.
  • the information acquisition device based on the shortest path of the knowledge graph includes:
  • the initialization unit 801 is used to initially set the iteration starting point and edge nodes according to the shortest path search strategy and the source node and target node in the user request.
  • The edge nodes include nodes that have determined shortest paths and have adjacent nodes with undetermined shortest paths.
  • The first node set is initially set to include the iteration starting point, whose shortest path is determined, and the candidate node set and the relaxation operation set are initially empty; the determined path length of the iteration starting point is initially set to zero, and the determined path lengths of the remaining nodes are set to infinity;
  • the calculation unit 802 is used to calculate the heuristic path length of the edge node, where the heuristic path length of the edge node is the path length from the iteration starting point to the iteration end point passing through the edge node;
  • the screening unit 803 is configured to screen out, according to the heuristic path lengths and the determined path lengths of the edge nodes, the node pB with the smallest determined path length among the nodes with the largest heuristic path length from the first node set and the candidate node set, and the node pA with the largest determined path length among the nodes with the smallest heuristic path length from the relaxation operation set;
  • the algorithm selection unit 804 is configured to execute the following small-step algorithm if the relaxation operation set is empty, or the heuristic path length of node pB is less than that of node pA, or the heuristic path length of node pB is equal to that of node pA and the determined path length of node pB is greater than or equal to the determined path length of node pA: from the adjacent nodes, with undetermined shortest paths, of the first nodes in the first node set, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set and the determined path length of the newly added first node;
  • the node moving unit 805 is used to screen out excellent nodes from the relaxation operation set and move them to the candidate node set, wherein the excellent node is the node pA2 with the largest determined path length among the nodes with the smallest heuristic path length in the relaxation operation set;
  • the nodes exceeding the second predetermined value are filtered out from the first node set and the candidate node set and moved to the relaxation operation set;
  • the loop control unit 806 is used to repeatedly start the calculation unit, the screening unit, the algorithm selection unit and the node moving unit until the shortest path from the source node to the target node is found and the heuristic path length of the edge node is greater than the length of the shortest path from the source node to the target node;
  • the information acquisition unit 807 is used to acquire information from the knowledge graph according to the shortest path of each node and respond to user requests according to the acquired information.
  • a computer device is also provided.
  • The computer device 902 includes a memory 906, a processor 904, and a computer program stored in the memory 906 and executable on the processor 904. The processor 904 implements the method described in any of the foregoing embodiments when executing the computer program.
  • The processor 904 is, for example, one or more central processing units (CPUs), and each processing unit can implement one or more hardware threads.
  • Memory 906 is used to store any kind of information such as code, settings, data, etc.
  • the memory 906 may include any one or more combinations of the following: any type of RAM, any type of ROM, flash memory device, hard disk, optical disk, etc. More generally, any memory can use any technology to store information.
  • any memory can provide volatile or non-volatile retention of information.
  • any memory may represent a fixed or removable component of computer device 902.
  • When processor 904 executes associated instructions stored in any memory or combination of memories, computer device 902 may perform any operation of the associated instructions.
  • Computer device 902 also includes one or more drive mechanisms 908 for interacting with any memory, such as a hard disk drive, an optical disk drive, and the like.
  • Computer device 902 may also include an input/output module 910 (I/O) for receiving various inputs (via input device 912) and for providing various outputs (via output device 914).
  • One particular output mechanism may include a presentation device 916 and an associated graphical user interface 918 (GUI).
  • The input/output module 910 (I/O), the input device 912 and the output device 914 may also be omitted, in which case the device serves only as a computer device in the network.
  • Computer device 902 may also include one or more network interfaces 920 for exchanging data with other devices via one or more communication links 922 .
  • One or more communication buses 924 couple together the components described above.
  • Communication link 922 may be implemented in any manner, such as through a local area network, a wide area network (e.g., the Internet), a point-to-point connection, etc., or any combination thereof.
  • Communication link 922 may include any combination of hardwired links, wireless links, routers, gateway functions, name servers, etc. governed by any protocol or combination of protocols.
  • embodiments of this article also provide a computer-readable storage medium.
  • the computer-readable storage medium stores a computer program, and when the computer program is run by a processor, the steps of the above method are performed.
  • Embodiments of this document also provide computer-readable instructions, wherein when a processor executes the instructions, the program therein causes the processor to perform the methods shown in FIGS. 1 to 2 and 4 to 5.
  • the disclosed systems, devices and methods can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the coupling or direct coupling or communication connection between each other shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, or may be electrical, mechanical or other forms of connection.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or they may be distributed to multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiments of this article.
  • each functional unit in each embodiment of this article can be integrated into one processing unit, each unit can exist physically alone, or two or more units can be integrated into one unit.
  • the above integrated units can be implemented in the form of hardware or software functional units.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • the technical solution of this document, in essence or in the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions that cause a computer device (which can be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of this document.
  • the aforementioned storage media include USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks and other media that can store program code.

Abstract

Provided in the present application are an information acquisition method and apparatus based on the shortest path in a knowledge graph. The method comprises: according to a shortest-path searching strategy and nodes in a user request, initially setting an iteration starting point and determining the shortest path of same; taking nodes, the shortest paths of which have been determined and which have adjacency nodes, the shortest paths of which are not determined, as first nodes to form a set of first nodes, selecting an adjacency node, which is the closest to the iteration starting point, from among the adjacency nodes, the shortest paths of which are not determined, of the first nodes in the set of first nodes, determining the shortest path of the adjacency node, and updating the set of first nodes; repeating the above steps until all desired shortest paths are found, wherein the desired shortest paths are related to the shortest-path searching strategy; and acquiring information from a knowledge graph according to the shortest paths of the nodes, and responding to the user request according to the acquired information. By means of the present application, the shortest paths of one or more nodes can be calculated in each iteration, and the shortest paths of other nodes are approached stably.

Description

Information acquisition method and apparatus based on the shortest path in a knowledge graph
This application claims priority to the Chinese patent application filed on August 31, 2022, with application number 202211058393.2 and the invention title "An information acquisition method and device based on the shortest path of the knowledge graph", the entire contents of which are incorporated herein by reference.
Technical field
This document relates to the field of computers, and in particular to an information acquisition method and apparatus based on the shortest path in a knowledge graph.
Background
A knowledge graph describes knowledge with a graph model, models the associations between things, and applies organizing principles so that users or computer systems can perform knowledge reasoning over the underlying data.
Nodes in a knowledge graph correspond to entities, and directed edges correspond to relationships between entities. Entities are linked to one another through relationships, and understanding those relationships is the basis of knowledge graph analysis. Once entity relationships have been quantified, a shortest path algorithm can be used to compute how closely entities are related, yielding a shortest path tree or a single shortest path. This gives a more intuitive and deeper understanding of the relationships between entities and provides high-quality data for discovering hidden facts and patterns.
In the prior art, the shortest path algorithms commonly used for mining information from knowledge graphs are mainly built on Dijkstra's algorithm, which is based on the relaxation operation.
In Dijkstra's algorithm, let S be the set of nodes whose shortest paths have been found and Q the set of nodes whose shortest paths have not yet been found. For the node u most recently added to S, Dijkstra's algorithm, aiming to minimize the path weight, examines every node v that is connected to u and not in S: when dist[u]+length(u,v)<dist[v], the weight dist[v] of v is replaced by dist[u]+length(u,v), and the path predecessor pre[v] of v is set to u. As a result, Dijkstra's algorithm has the following drawbacks:
(1) Dijkstra's algorithm uses the relaxation operation to keep choosing, among multiple paths, the path with the smaller weight for nodes in Q, and therefore repeatedly updates the path information of some nodes in Q. Specifically, the path information of a node in Q may be updated across multiple iterations, and the degree to which dist is disturbed in each iteration is positively correlated with the out-degree of u. These two points greatly increase the maintenance cost, within Q, of the newly generated u.
(2) Processing the adjacent edges of node u requires many conditional checks and relies heavily on the arithmetic logic unit of the CPU, so parallel computing cannot be used efficiently to speed up the shortest path algorithm on a computer system.
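For illustration only, the relaxation operation described above can be sketched in Python as follows. The function name `dijkstra_with_relaxation`, the nested-dictionary graph layout, the binary heap with lazy deletion, and the `updates` counter are assumptions introduced for this sketch and are not part of the prior art description; the counter merely records how often dist[v] is rewritten.

```python
import heapq
import math


def dijkstra_with_relaxation(graph, source):
    """Dijkstra's algorithm on graph[u][v] = length(u, v), lengths >= 0.

    Returns (dist, pre, updates); updates[v] counts how many times
    dist[v] was written by the relaxation operation (illustrative only).
    """
    dist = {source: 0}
    pre = {source: None}
    updates = {}
    settled = set()          # the set S of nodes with final shortest paths
    heap = [(0, source)]     # tentative entries for nodes still in Q
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue         # stale heap entry left by an earlier update
        settled.add(u)
        for v, length in graph.get(u, {}).items():
            if v in settled:
                continue
            # relaxation: dist[u] + length(u, v) < dist[v]
            if d + length < dist.get(v, math.inf):
                dist[v] = d + length
                pre[v] = u
                updates[v] = updates.get(v, 0) + 1
                heapq.heappush(heap, (dist[v], v))
    return dist, pre, updates
```

On a hypothetical graph with edges a→b (1), a→c (4) and b→c (2), dist[c] is first set to 4 and then lowered to 3, which is the kind of repeated update referred to in drawback (1).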
Summary of the invention
This document addresses the low computational efficiency of the knowledge-graph-based shortest path determination process in the prior art.
To solve the above technical problem, one aspect of this document provides an information acquisition method based on the shortest path in a knowledge graph. The knowledge graph includes multiple nodes and adjacent edges between nodes, and the weight of an adjacent edge represents the distance between the nodes. The method includes:
S11: according to the shortest path search strategy and the nodes in the user request, initially set the iteration starting point and determine its shortest path;
S12: take nodes whose shortest paths have been determined and which have adjacent nodes whose shortest paths have not been determined as first nodes, forming a first node set; from the adjacent nodes, with undetermined shortest paths, of the first nodes in the first node set, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set and the determined path length of the newly added first node;
S13: repeat step S12 until all desired shortest paths have been found, where the desired shortest paths depend on the shortest path search strategy;
S14: according to the shortest path of each node, acquire information from the knowledge graph and respond to the user request according to the acquired information.
Another aspect of this document provides an apparatus for determining the shortest path in a knowledge graph. The knowledge graph includes multiple nodes and adjacent edges between nodes, and the weight of an adjacent edge represents the distance between the nodes. The apparatus includes:
an initialization unit, configured to initially set the iteration starting point and determine its shortest path according to the shortest path search strategy and the nodes in the user request;
a shortest path determination unit, configured to take nodes whose shortest paths have been determined and which have adjacent nodes whose shortest paths have not been determined as first nodes, forming a first node set, and, from the adjacent nodes, with undetermined shortest paths, of the first nodes in the first node set, to select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set;
a loop control unit, configured to repeatedly start the shortest path determination unit until all desired shortest paths have been found, where the desired shortest paths depend on the shortest path search strategy;
an information acquisition unit, configured to acquire information from the knowledge graph according to the shortest path of each node and to respond to the user request according to the acquired information.
By avoiding the relaxation operation and instead selecting, each time, the adjacent node closest to the iteration starting point and determining its shortest path (referred to as the small-step algorithm), the above two embodiments compute the shortest path of one or more nodes in every iteration and steadily approach the shortest paths of the other nodes. The number of adjacent edges of a node does not directly affect the time complexity, which improves the efficiency of shortest path determination.
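The selection rule of the small-step algorithm can be sketched as follows. This is a deliberately naive Python illustration under stated assumptions: the function name `small_step_shortest_paths`, the nested-dictionary graph layout and the non-negative edge weights are hypothetical, and the sketch rescans every first node in each iteration, so it shows only the selection logic and makes no attempt at the efficiency described above.

```python
def small_step_shortest_paths(graph, start):
    """Naive sketch of the small-step rule on graph[u][v] = weight >= 0.

    Returns (dist, pred): determined path lengths and path predecessors.
    No tentative distances are stored for undetermined nodes.
    """
    dist = {start: 0}      # nodes whose shortest paths are determined
    pred = {start: None}

    def has_undetermined_neighbor(node):
        return any(v not in dist for v in graph.get(node, {}))

    # First node set: determined nodes that still have undetermined neighbors.
    first_nodes = {n for n in dist if has_undetermined_neighbor(n)}
    while first_nodes:
        best = None        # (path length, first node u, adjacent node v)
        for u in first_nodes:
            for v, weight in graph[u].items():
                if v in dist:
                    continue
                candidate = dist[u] + weight
                if best is None or candidate < best[0]:
                    best = (candidate, u, v)
        length, u, v = best
        # The closest undetermined adjacent node gets its final path length.
        dist[v] = length
        pred[v] = u
        # Update the first node set after the new node has been determined.
        first_nodes = {n for n in dist if has_undetermined_neighbor(n)}
    return dist, pred
```

Each pass through the loop fixes the final path length of exactly one node in this sketch; a node's length is written once and never revised, in contrast to the relaxation-based update of dist[v].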
Another aspect of this document provides a method for determining the shortest path in a knowledge graph. The knowledge graph includes multiple nodes and adjacent edges between nodes, and the weight of an adjacent edge represents the distance between the nodes. The method includes:
S21: according to the shortest path search strategy and the source node and target node in the user request, initially set the iteration starting point and the edge nodes. The edge nodes include: nodes in the first node set, whose shortest paths have been determined and which have adjacent nodes whose shortest paths have not been determined; nodes in the candidate node set, whose shortest paths have not been determined and which may be added to the first node set; and nodes in the relaxation operation set, for which a path has been obtained by the relaxation operation but whose shortest paths have not been determined. Initially, the first node set contains the iteration starting point, whose shortest path is determined, and the candidate node set and the relaxation operation set are empty; the determined path length of the iteration starting point is initially zero, and the determined path lengths of all other nodes are infinite;
S22: compute the heuristic path length of each edge node, where the heuristic path length of an edge node is the length of a path from the iteration starting point to the iteration end point that passes through the edge node;
S23: according to the heuristic path lengths and the determined path lengths of the edge nodes, select from the first node set and the candidate node set the node pB that has the smallest determined path length among the nodes with the largest heuristic path length, and select from the relaxation operation set the node pA that has the largest determined path length among the nodes with the smallest heuristic path length;
S24: if the relaxation operation set is empty, or the heuristic path length of node pB is less than that of node pA, or the heuristic path length of node pB equals that of node pA and the determined path length of node pB is greater than or equal to that of node pA, execute the following small-step algorithm: from the adjacent nodes, with undetermined shortest paths, of the first nodes in the first node set, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set and the determined path length of the newly added first node;
otherwise, based on the relaxation operation, compute the paths of all adjacent nodes of node pA in the relaxation operation set, and update the relaxation operation set and the determined path lengths of the newly added relaxation nodes;
S25: select an excellent node from the relaxation operation set and move it to the candidate node set, where the excellent node is the node pA2 that has the largest determined path length among the nodes with the smallest heuristic path length in the relaxation operation set;
when the number of nodes in the first node set and the candidate node set exceeds a second predetermined value, select the nodes in excess of the second predetermined value from the first node set and the candidate node set and move them to the relaxation operation set;
S26: repeat steps S22 to S25 until the shortest path from the source node to the target node has been found and the heuristic path lengths of all edge nodes are greater than the length of that shortest path;
S27: according to the shortest path of each node, acquire information from the knowledge graph and respond to the user request according to the acquired information.
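The selection of pB and pA in step S23 and the branch condition of step S24 can be sketched as follows. This Python fragment is an illustration under stated assumptions: the function names, the string return values and the dictionaries `heuristic` and `determined` (mapping each edge node to its heuristic path length and determined path length) are hypothetical, the heuristic path lengths are taken as already computed in step S22, and the first node set together with the candidate node set is assumed to be non-empty.

```python
def select_pB(nodes, heuristic, determined):
    """Among the nodes with the largest heuristic path length,
    return the one with the smallest determined path length."""
    return min(nodes, key=lambda n: (-heuristic[n], determined[n]))


def select_pA(nodes, heuristic, determined):
    """Among the nodes with the smallest heuristic path length,
    return the one with the largest determined path length."""
    return min(nodes, key=lambda n: (heuristic[n], -determined[n]))


def choose_step(first_and_candidate_nodes, relaxation_nodes,
                heuristic, determined):
    """Return "small_step" or "relaxation" following the S24 condition."""
    if not relaxation_nodes:
        return "small_step"
    p_b = select_pB(first_and_candidate_nodes, heuristic, determined)
    p_a = select_pA(relaxation_nodes, heuristic, determined)
    h_b, h_a = heuristic[p_b], heuristic[p_a]
    if h_b < h_a:
        return "small_step"
    if h_b == h_a and determined[p_b] >= determined[p_a]:
        return "small_step"
    return "relaxation"
```

The sketch compares heuristic path lengths with exact equality, which is adequate for integer weights; with floating-point weights a tolerance would presumably be needed.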
Another aspect of this document provides an apparatus for determining the shortest path in a knowledge graph. The knowledge graph includes multiple nodes and adjacent edges between nodes, and the weight of an adjacent edge represents the distance between the nodes. The apparatus includes:
an initialization unit, configured to initially set the iteration starting point and the edge nodes according to the shortest path search strategy and the source node and target node in the user request. The edge nodes include: nodes in the first node set, whose shortest paths have been determined and which have adjacent nodes whose shortest paths have not been determined; nodes in the candidate node set, whose shortest paths have not been determined and which may be added to the first node set; and nodes in the relaxation operation set, for which a path has been obtained by the relaxation operation but whose shortest paths have not been determined. Initially, the first node set contains the iteration starting point, whose shortest path is determined, and the candidate node set and the relaxation operation set are empty; the determined path length of the iteration starting point is initially zero, and the determined path lengths of all other nodes are infinite;
a calculation unit, configured to compute the heuristic path length of each edge node, where the heuristic path length of an edge node is the length of a path from the iteration starting point to the iteration end point that passes through the edge node;
a screening unit, configured to select, according to the heuristic path lengths and the determined path lengths of the edge nodes, the node pB that has the smallest determined path length among the nodes with the largest heuristic path length in the first node set and the candidate node set, and the node pA that has the largest determined path length among the nodes with the smallest heuristic path length in the relaxation operation set;
an algorithm selection unit, configured to execute the following small-step algorithm if the relaxation operation set is empty, or the heuristic path length of node pB is less than that of node pA, or the heuristic path length of node pB equals that of node pA and the determined path length of node pB is greater than or equal to that of node pA: from the adjacent nodes, with undetermined shortest paths, of the first nodes in the first node set, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set and the determined path length of the newly added first node;
otherwise, based on the relaxation operation, to compute the paths of all adjacent nodes of node pA in the relaxation operation set and update the relaxation operation set;
a node moving unit, configured to select an excellent node from the relaxation operation set and move it to the candidate node set, where the excellent node is the node pA2 that has the largest determined path length among the nodes with the smallest heuristic path length in the relaxation operation set;
and, when the number of nodes in the first node set and the candidate node set exceeds a second predetermined value, to select the nodes in excess of the second predetermined value from the first node set and the candidate node set and move them to the relaxation operation set;
a loop control unit, configured to repeatedly start the calculation unit, the screening unit, the algorithm selection unit and the node moving unit until the shortest path from the source node to the target node has been found and the heuristic path lengths of all edge nodes are greater than the length of that shortest path;
an information acquisition unit, configured to acquire information from the knowledge graph according to the shortest path of each node and to respond to the user request according to the acquired information.
By using heuristic information as goal-directing information to combine the small-step algorithm with the relaxation-based algorithm, the above two embodiments reduce the blindness of the small-step search, which otherwise knows nothing about the iteration end point. The relaxation operation selects excellent nodes as candidates for the first nodes of the small-step algorithm, which narrows the search range of the small-step algorithm and yields the shortest path from the iteration starting point to the iteration end point more quickly.
Another aspect of this document provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the method described in the foregoing embodiments when it executes the computer program.
Another aspect of this document provides a computer storage medium on which a computer program is stored, where the computer program, when run by a processor of a computer device, executes the instructions of the method described in the foregoing embodiments.
To make the above and other objects, features and advantages of this document more apparent and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Description of drawings
To illustrate the embodiments of this document or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of this document; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 shows a first flow chart of the information acquisition method based on the shortest path in a knowledge graph according to an embodiment of this document;
Figure 2 shows a first flow chart of the process of selecting the adjacent node closest to the iteration starting point, determining its shortest path and updating the first node set according to an embodiment of this document;
Figure 3 shows a schematic diagram of an abstract knowledge graph according to an embodiment of this document;
Figure 4 shows a second flow chart of the information acquisition method based on the shortest path in a knowledge graph according to an embodiment of this document;
Figure 5 shows a second flow chart of the process of selecting the adjacent node closest to the iteration starting point, determining its shortest path and updating the first node set according to an embodiment of this document;
Figure 6 shows a schematic diagram of another abstract knowledge graph according to an embodiment of this document;
Figure 7 shows one structural diagram of the information acquisition apparatus based on the shortest path in a knowledge graph according to an embodiment of this document;
Figure 8 shows another structural diagram of the information acquisition apparatus based on the shortest path in a knowledge graph according to an embodiment of this document;
Figure 9 shows a structural diagram of a computer device according to an embodiment of this document.
Explanation of drawing symbols:
701. Initialization unit;
702. Shortest path determination unit;
703. Loop control unit;
704. Information acquisition unit;
801. Initialization unit;
802. Calculation unit;
803. Screening unit;
804. Algorithm selection unit;
805. Node moving unit;
806. Loop control unit;
807. Information acquisition unit;
902. Computer device;
904. Processor;
906. Memory;
908. Drive mechanism;
910. Input/output module;
912. Input device;
914. Output device;
916. Presentation device;
918. Graphical user interface;
920. Network interface;
922. Communication link;
924. Communication bus.
Detailed description
The technical solutions in the embodiments of this document are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of this document. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this document without creative effort fall within the scope of protection of this document.
It should be noted that the terms "first", "second", etc. in the description, the claims and the above drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described herein can be implemented in orders other than those illustrated or described herein. In addition, the terms "including" and "having" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, apparatus, product or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product or device.
This specification provides method operation steps as described in the embodiments or flow charts, but more or fewer operation steps may be included on the basis of routine or non-inventive work. The order of steps listed in the embodiments is only one of many possible execution orders and does not represent the only one. When an actual system or apparatus product executes, the steps may be executed sequentially or in parallel according to the methods shown in the embodiments or drawings.
Knowledge graphs can efficiently organize, manage and use massive amounts of information, and are widely applied in fields such as social networks, human resources and recruitment, finance, insurance, retail, advertising, communications, IT, manufacturing, media, medical care, e-commerce and logistics. For example, a knowledge graph can turn the Web from links between web pages into links between concepts (supporting retrieval by topic), achieving true semantic retrieval. As another example, a search engine based on a knowledge graph can return structured knowledge to users in graphical form, so that users can locate and acquire knowledge accurately and in depth without browsing a large number of web pages.
The information acquisition method and apparatus based on the shortest path in a knowledge graph described in this document can be applied to any field in which information is acquired on the basis of the shortest path in a knowledge graph, such as information retrieval and path planning (including robot navigation, vehicle navigation, etc.). This document does not limit the specific application field of the method and apparatus.
The knowledge graph described in this document includes multiple nodes and adjacent edges between nodes, and the weight of an adjacent edge represents the length (distance) between the nodes. The specific content of the knowledge graph depends on the application field. The length between nodes is an abstract concept that represents, for example, the cost or time of moving between nodes or of acquiring information.
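As a minimal sketch of one way to hold such a graph in memory, the Python fragment below builds a weighted adjacency structure from (head, relation, tail, weight) tuples. The function name `build_graph`, the tuple layout, the rule of keeping the cheapest edge between a pair of nodes, and the requirement that weights be non-negative are assumptions made for illustration and are not prescribed by this document.

```python
def build_graph(triples):
    """Build graph[head][tail] = weight and relations[(head, tail)] = label.

    `triples` is an iterable of (head, relation, tail, weight) tuples;
    weights stand for the abstract length (cost, time, ...) of an edge.
    """
    graph = {}
    relations = {}
    for head, relation, tail, weight in triples:
        if weight < 0:
            # Assumption of this sketch: lengths are never negative.
            raise ValueError(f"negative weight on {head} -> {tail}")
        graph.setdefault(head, {})
        graph.setdefault(tail, {})   # nodes without outgoing edges still appear
        # If several relations link the same pair, keep the cheapest one.
        if tail not in graph[head] or weight < graph[head][tail]:
            graph[head][tail] = weight
            relations[(head, tail)] = relation
    return graph, relations
```

The edges are stored as directed, matching the directed relationships between entities; an undirected relationship would be entered once in each direction.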
本文一实施例中,提供一种基于知识图谱最短路径的信息获取方法,用于解决现有技术中基于知识图谱的最短路径确定过程存在计算效率低的问题,具体的,如图1所示,基于知识图谱最短路径的信息获取方法包括:In an embodiment of this article, an information acquisition method based on the shortest path of the knowledge graph is provided to solve the problem of low computational efficiency in the shortest path determination process based on the knowledge graph in the existing technology. Specifically, as shown in Figure 1, Information acquisition methods based on the shortest path of the knowledge graph include:
Step S11: according to the shortest-path search strategy and the nodes in the user request, initially set the iteration starting point and determine its shortest path.
Step S12: take each node whose shortest path has been determined and which has adjacent nodes whose shortest paths have not been determined as a first node, forming a first node set; from the adjacent nodes of the first nodes in the first node set whose shortest paths are undetermined, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set.
Step S13: repeat step S12 until all expected shortest paths are found, where the expected shortest paths depend on the shortest-path search strategy.
Step S14: according to the shortest path of each node, obtain information from the knowledge graph and respond to the user request based on the obtained information.
This embodiment can be applied on a client or a server that stores the knowledge graph information (nodes and the connections between them, where nodes represent entities and connections represent associations between nodes). Information obtained from the knowledge graph is displayed on the client. The client described herein includes, but is not limited to, computer devices and mobile terminals, and may also be software installed on such devices.
By avoiding relaxation operations and instead selecting, at each step, the adjacent node closest to the iteration starting point to determine its shortest path (referred to as the small-step algorithm), this embodiment determines the shortest path of one or more nodes in every iteration and steadily approaches the shortest paths of the remaining nodes. The number of adjacency edges of a node does not directly affect the time complexity, which improves the efficiency of shortest-path determination.
Before this embodiment is carried out, a user request must be received through the user's operation of the client. The user request includes at least a starting node, so that information related to the starting node can be mined through steps S11 to S14.
In step S11, the shortest-path search strategies include at least single-source search, single-source single-target forward search, single-source single-target reverse search, and single-source single-target bidirectional search. The iteration starting point is initially set as follows:
(1) If the strategy is single-source search, the source node in the user request is set as the iteration starting point, and steps S12 and S13 are executed starting from the source node. All expected shortest paths have been found when no node with a determined shortest path has an adjacent node whose shortest path is undetermined.
(2) If the strategy is single-source single-target forward search, the source node in the user request is set as the iteration starting point, and steps S12 and S13 are executed in the forward search direction, from the source node toward the target node.
(3) If the strategy is single-source single-target reverse search, the target node in the user request is set as the iteration starting point, and steps S12 and S13 are executed in the reverse search direction, from the target node toward the source node.
(4) If the strategy is single-source single-target bidirectional search, both the source node and the target node in the user request are set as iteration starting points, and steps S12 and S13 are executed separately in the forward search direction (source node to target node) and in the reverse search direction (target node to source node).
For the single-source single-target forward, reverse, and bidirectional searches, each node whose shortest path is determined in a given search direction is associated with the iteration starting point of that direction, and the source node and the target node are each associated with themselves. All expected shortest paths have been found when some node is associated with both the source node and the target node. For example, suppose the source node is node A, the target node is node G, the search direction is from node A to node G, and the intermediate nodes include node B. Associating the source node and the target node with themselves means recording node A in the association information of the source node and node G in the association information of the target node; node A is recorded in the association information of node B.
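As an illustration only, the mapping from search strategy to iteration starting points and initial association records might be sketched as follows. The function name init_search, the strategy strings, and the dictionary layout are assumptions made for this sketch and do not appear in the original text.

```python
def init_search(strategy, source, target=None):
    """Return (iteration starting points, initial association records).

    Hypothetical helper: the strategy names and the dict-of-sets layout
    are illustrative assumptions, not part of the described method.
    """
    if strategy == "single_source":
        # Only the source is needed; it is associated with itself.
        return [source], {source: {source}}
    if target is None:
        raise ValueError("a target node is required for this strategy")
    # Source and target are each associated with themselves.
    association = {source: {source}, target: {target}}
    if strategy == "forward":
        return [source], association
    if strategy == "reverse":
        return [target], association
    if strategy == "bidirectional":
        return [source, target], association
    raise ValueError(f"unknown strategy: {strategy}")


def search_finished(association, source, target):
    """True once some node is associated with both source and target."""
    return any({source, target} <= origins for origins in association.values())
```

In this sketch, a forward search from A to G would finish when G's record, which starts as {G}, also receives A.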
For the single-source single-target forward search and reverse search, the path that is determined is the shortest path.
For the single-source single-target bidirectional search, two path segments are obtained: the path from iteration starting point s to the iteration midpoint p, and the path from the iteration midpoint p to iteration starting point t. Together they form the complete path from the start s to the end t. During path finding, adjacency edges can be expressed as follows:
(1) The directed adjacency edge <from,to>f is used for forward search and denotes a directed edge pointing from node from to node to. Node from corresponds to the first node; node to is an adjacent node of node from and corresponds to the second node.
(2) The directed adjacency edge <from,to>b is used for reverse search and denotes a directed edge pointing from node from to node to. Node to corresponds to the first node; node from is an adjacent node of node to and corresponds to the second node.
(3) In terms of node roles, the reverse-search adjacency edge <from,to>b runs opposite to the forward-search adjacency edge <from,to>f: in <from,to>b the edge points from the second node (from) to the first node (to), whereas in <from,to>f it points from the first node (from) to the second node (to). In both cases the direction agrees with the direction of the adjacency edge <from,to>.
(4) In reverse search, every entry in the adjacency-edge linear list of a first node to has the same node to, and that node to corresponds to multiple second nodes from.
(5) In forward search, every entry in the adjacency-edge linear list of a first node from has the same node from, and that node from corresponds to multiple second nodes to.
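The two adjacency-edge lists described above can be illustrated with a minimal sketch. It assumes edges are given as (from, to, weight) tuples and stores each list sorted by weight in ascending order; the function name and the tuple layout are hypothetical choices for this example.

```python
def build_adjacency_lists(edges):
    """Build forward and reverse adjacency-edge lists from directed edges.

    edges: iterable of (src, dst, weight) tuples (assumed layout).
    forward[src] lists (dst, weight): first node src, second nodes dst.
    backward[dst] lists (src, weight): first node dst, second nodes src.
    """
    forward, backward = {}, {}
    for src, dst, weight in edges:
        forward.setdefault(src, []).append((dst, weight))
        backward.setdefault(dst, []).append((src, weight))
    # Keep every list in ascending order of edge weight.
    for table in (forward, backward):
        for entries in table.values():
            entries.sort(key=lambda entry: entry[1])
    return forward, backward
```

A forward search would then read forward[node], while a reverse search over the same edges would read backward[node].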
Single-source single-target bidirectional search can be implemented in several ways, differing, for example, in whether a unified estimated shortest path length of the second node is used and whether multithreading is used. In the method below, the forward search and the reverse search use independent estimated shortest path lengths of the second node, and the estimate for each iteration direction is computed independently within a single thread.
The shortest path of a node described herein includes the node's predecessor node and the node's determined path length. The determined path length of a node equals the determined path length of its predecessor node plus the weight of the adjacency edge between the node and that predecessor. A node may have zero, one, or more predecessor nodes. The determined path length of a node is the sum of the weights of the adjacency edges traversed in moving from the iteration starting point to the node. Initially, the determined path length of the iteration starting point is zero and that of every other node is infinity; these values are updated during the path search. For the iteration starting point, the predecessor node is empty and the determined path length is zero. As another example, if the predecessor of node C is node B, the predecessor of node B is the source node A, the weight of adjacency edge <A,B> is 3, and the weight of adjacency edge <B,C> is 2, then the determined path length of node C is 5. In a specific implementation, the shortest path of a node may also include the weight of the node's adjacency edge, the predecessor node record number, the current node, and the current node record number. The predecessor node and the predecessor node record number give the path information pointing from the second node to the first node; the path length, predecessor node, and current node record the adjacency edge information.
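The A, B, C example can be checked with a short sketch that walks predecessor records back to the iteration starting point. The dictionary layout and the function name trace_path are assumptions made for illustration, and the sketch follows a single predecessor per node.

```python
def trace_path(node, predecessor, edge_weight):
    """Follow predecessor records back to the iteration starting point.

    predecessor: {node: predecessor or None} (None marks the start).
    edge_weight: {(from_node, to_node): weight}.
    Returns (path from start to node, determined path length).
    """
    path = [node]
    length = 0
    while predecessor[node] is not None:
        prev = predecessor[node]
        length += edge_weight[(prev, node)]
        path.append(prev)
        node = prev
    path.reverse()
    return path, length


predecessor = {"A": None, "B": "A", "C": "B"}
edge_weight = {("A", "B"): 3, ("B", "C"): 2}
print(trace_path("C", predecessor, edge_weight))  # (['A', 'B', 'C'], 5)
```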
For an adjacency edge <from,to>, forward search proceeds from node from to node to, so the predecessor of node to is node from; reverse search proceeds from node to to node from, so the predecessor of node from is node to. For example, the shortest path p1->p2->p3 consists of the two directed edges <p1,p2> and <p2,p3>. In forward search the path pointers are p1<-p2 and p2<-p3, opposite to the path direction; in reverse search the path pointers are p2->p3 and p1->p2, consistent with the path direction.
As shown in Figure 2, in step S12, selecting the adjacent node closest to the iteration starting point from the adjacent nodes of the first nodes in the first node set whose shortest paths are undetermined, determining its shortest path, and updating the first node set includes:
Step S121: determine the related information for each first node in the first node set whose related information has not been determined, and remove from the first node set any first node that has no related information.
The related information of a first node includes a second node, a pending adjacency edge, and the estimated shortest path length of the second node. The pending adjacency edge is the edge, among all adjacency edges between the first node and its adjacent nodes whose shortest paths are undetermined, that satisfies both of the following conditions: its weight is the minimum among those edges, and the sum of its weight and the determined path length of the first node is greater than the cumulative movement step associated with the first node set. The second node is the endpoint of the pending adjacency edge other than the first node. The estimated shortest path length of the second node equals the weight of the pending adjacency edge plus the determined path length of the first node.
More precisely, the cumulative movement step associated with the first node set equals the shortest path length of the node that is farthest from the iteration starting point among the nodes whose shortest paths have been determined. It reflects the weighted path length already traversed from the iteration starting point. Initially, the cumulative movement step is zero. Within a single iteration, the method determines in turn all shortest paths of the second nodes of those adjacency edges of the first node that share the same weight.
In this step, the pending adjacency edge can be determined from the first node's adjacency-edge linear list, which is sorted in ascending order of weight. The list uses a pointer to mark the first adjacency edge whose shortest path is unconfirmed. The determination proceeds as follows:
Check whether the sum of the determined path length of the first node and the weight of the current first unconfirmed adjacency edge is greater than the cumulative movement step associated with the first node set. If not, advance the pointer to the next adjacency edge in the list. If so, that edge becomes the next pending adjacency edge, and its endpoint other than the first node is the second node.
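The pointer scan just described might look like the following minimal sketch. It assumes the sorted adjacency-edge list holds (neighbor, weight) tuples; the function name and return shape are hypothetical.

```python
def next_pending_edge(sorted_edges, pointer, g_first, cumulative_step):
    """Advance the pointer to the next pending adjacency edge.

    sorted_edges: [(neighbor, weight), ...] in ascending weight order.
    pointer: index of the first unconfirmed adjacency edge.
    g_first: determined path length of the first node.
    Returns (new pointer, related information or None), where the related
    information is (second node, edge weight, estimated shortest path length).
    """
    while pointer < len(sorted_edges):
        neighbor, weight = sorted_edges[pointer]
        estimate = g_first + weight
        if estimate > cumulative_step:
            # This edge qualifies as the pending adjacency edge.
            return pointer, (neighbor, weight, estimate)
        pointer += 1  # the sum does not exceed the cumulative step
    # No edge qualifies: the first node has no related information.
    return pointer, None
```

Because the list is sorted and the pointer only moves forward, each adjacency edge is examined at most once per first node in this sketch.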
Step S122: from the first node set, select the first node whose second node has the smallest estimated shortest path length.
Step S123: the second node related to the selected first node is the adjacent node closest to the iteration starting point.
When this step is carried out, the selected first node becomes the predecessor node of its related second node, and the estimated shortest path length of the second node associated with that first node becomes the determined path length of the second node.
Step S124: update the cumulative movement step associated with the first node set to the estimated shortest path length of the second node associated with the selected first node.
Step S125: for the second node, pending adjacency edge, and estimated shortest path length of the second node related to each selected first node, perform the following checks:
Step S1251: when the estimated shortest path length of the second node equals the determined path length of the second node, take the second node as a successor node and, based on the first node and the successor node, add a shortest path for the successor node; then determine the related information of the first node and, if there is none, remove the first node from the first node set;
When the estimated shortest path length of the second node (the sum of the determined path length of the first node and the weight of the pending adjacency edge) equals the determined path length of the second node, the second node has already been added to the first node set, and only the related information of the first node needs to be determined. A first node with no related information has no remaining adjacent nodes whose shortest paths are undetermined.
Step S1252: when the estimated shortest path length of the second node is less than the determined path length of the second node, take the second node as a successor node and, based on the first node and the successor node, add a shortest path for the successor node; determine the related information of the first node and, if there is none, remove the first node from the first node set; add the second node to the first node set as a new first node; and determine the related information of the new first node and, if there is none, remove the new first node from the first node set.
When the estimated shortest path length of the second node (the sum of the determined path length of the first node and the weight of the pending adjacency edge) is less than the path length of the second node, the second node has not yet been added to the first node set.
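To make steps S121 to S125 concrete, here is a deliberately simplified single-source sketch. It is not the full method: it records only one predecessor per node, scans the first node set linearly, and skips already-determined neighbors instead of comparing against the cumulative movement step. The graph layout ({node: [(neighbor, weight), ...]} with non-negative weights) and all names are assumptions.

```python
def small_step_sssp(graph, source):
    """Simplified small-step search from a single source.

    graph: {node: [(neighbor, weight), ...]}, non-negative weights assumed.
    Returns (determined path lengths, predecessor nodes).
    """
    # Adjacency-edge lists sorted in ascending order of weight.
    adj = {u: sorted(edges, key=lambda e: e[1]) for u, edges in graph.items()}
    dist = {source: 0}     # determined path lengths
    pred = {source: None}  # predecessor nodes
    pointer = {source: 0}  # per-first-node cursor into its sorted list
    first_set = {source}
    while first_set:
        best = None  # (estimated length, first node, second node)
        for u in list(first_set):
            edges = adj.get(u, [])
            i = pointer[u]
            # Skip edges whose far end already has a determined path.
            while i < len(edges) and edges[i][0] in dist:
                i += 1
            pointer[u] = i
            if i == len(edges):
                first_set.discard(u)  # no related information remains
                continue
            v, w = edges[i]
            estimate = dist[u] + w
            if best is None or estimate < best[0]:
                best = (estimate, u, v)
        if best is None:
            break
        estimate, u, v = best
        dist[v] = estimate  # the estimate becomes the determined length
        pred[v] = u
        pointer[v] = 0
        first_set.add(v)    # the second node becomes a new first node
    return dist, pred
```

Each pass determines the closest undetermined node without relaxing every neighbor, which is the property the text attributes to the small-step algorithm; a production version would likely replace the linear scan with a priority structure.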
In one embodiment herein, the following situation can arise. As shown in Figure 3, suppose the current first nodes are node 3 and node 2, the second nodes are node 5 and node 4, the pending adjacency edges are <2,5> and <3,4>, and the paths 0->2->5, 0->3->5, and 0->3->4 all have weight 10. Suppose that after some iteration node 5 is reached via node 2, so that node 5 becomes a first node. By the logic above, node 3 will not select <3,5> as its next pending adjacency edge. Yet path 0->3->5 has the same weight as path 0->2->5, so 0->3->5 is also an optimal path. In the special scenario of Figure 3, the method above therefore fails to find all optimal paths. To solve this problem, the information acquisition method based on the shortest path in a knowledge graph further includes:
Determine a third node from the ordered adjacency-edge linear list of each first node, where a third node is a node whose shortest path is being confirmed or has been confirmed;
If the sum of the determined path length of the first node and the weight of the adjacency edge between the first node and its third node equals the determined path length of the third node, take the third node as a successor node and, based on the first node and the successor node, add a shortest path for the successor node; otherwise,
If the sum of the determined path length of the first node and the weight of the adjacency edge between the first node and its third node equals the cumulative movement step associated with the first node set, take the third node as a successor node and, based on the first node and the successor node, add a shortest path for the successor node; add the third node to the first node set as a new first node; and determine the related information of the new first node and, if there is none, remove the new first node from the first node set.
Taking Figure 3 as an example, the first nodes are node 5, node 3, and node 4, and the predecessor of node 5 is node 2. Suppose the determined path length of node 5 is 10, that of node 3 is 2, and that of node 4 is 10, and the weight of adjacency edge <3,5> is 8. Nodes 4 and 5 have no third node, and the third node of node 3 is node 5. Updating the shortest path of the third node updates its predecessor nodes to include both node 2 and node 3.
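The Figure 3 numbers can be replayed in a small sketch of the first branch above (the case where the sum equals the determined path length of the third node). The function name and the dictionaries are illustrative assumptions; the second branch, which compares against the cumulative movement step, is not shown.

```python
def add_equal_length_predecessor(g, preds, first, third, weight):
    """Record `first` as an additional predecessor of `third` when the
    path through `first` is exactly as long as third's determined path.

    g: {node: determined path length}; preds: {node: [predecessors]}.
    Returns True when an equal-length shortest path was added.
    """
    if g[first] + weight == g[third] and first not in preds[third]:
        preds[third].append(first)
        return True
    return False


# Values taken from the Figure 3 example in the text.
g = {5: 10, 3: 2, 4: 10}
preds = {5: [2]}
add_equal_length_predecessor(g, preds, first=3, third=5, weight=8)
print(preds[5])  # [2, 3]
```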
In one embodiment herein, to avoid invalid iterations that would slow the search for the shortest path, the following is also performed after the related information of the first nodes is determined:
Check whether the related information of at least two first nodes contains the same second node;
If so, compare the estimated shortest path lengths of the second node held by the first nodes that share that second node;
If those estimated shortest path lengths differ, keep only the related information of the first node with the smallest estimated shortest path length of the second node, and re-determine the related information of the other first nodes.
This embodiment ensures that unnecessary interfering routes are removed during the shortest-path search, avoiding invalid iterations.
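One possible way to express this check is sketched below. It assumes the related information is held as {first node: (second node, estimated length)} and returns the first nodes whose related information would need to be re-determined; the names and layout are hypothetical.

```python
def first_nodes_to_redetermine(related):
    """Find first nodes that lose a conflict over a shared second node.

    related: {first node: (second node, estimated shortest path length)}.
    First nodes that tie for the smallest estimate are all kept.
    """
    smallest = {}
    for second, estimate in related.values():
        if second not in smallest or estimate < smallest[second]:
            smallest[second] = estimate
    return sorted(
        first
        for first, (second, estimate) in related.items()
        if estimate > smallest[second]
    )
```

For instance, if first nodes 2 and 3 both point at second node 5 with estimates 10 and 12, only node 3 would be asked to pick a new pending adjacency edge.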
In one embodiment herein, a method for determining the shortest path in a knowledge graph is also provided to address the low computational efficiency of shortest-path determination in knowledge graphs in the prior art. The knowledge graph includes multiple nodes and adjacency edges between nodes, and the weight of an adjacency edge represents the distance between nodes. As shown in Figure 4, the method includes:
Step S21: according to the shortest-path search strategy and the source node and target node in the user request, initially set the iteration starting point and the edge nodes. The edge nodes comprise the nodes in the first node set, the nodes in the candidate node set, and the nodes in the relaxation operation set. Initially, the first node set contains the iteration starting point, whose shortest path is determined, and the candidate node set and the relaxation operation set are empty. The determined path length of the iteration starting point is initially set to zero, and that of every other node to infinity.
The nodes in the first node set are nodes whose shortest paths have been determined and which have adjacent nodes whose shortest paths are undetermined. The nodes in the candidate node set are nodes with undetermined shortest paths that can be added to the first node set. The nodes in the relaxation operation set are nodes for which a path has been obtained through a relaxation operation but whose shortest path has not yet been determined.
The shortest-path search strategies include single-source single-target forward search, single-source single-target reverse search, and single-source single-target bidirectional search.
In some implementations, the shortest path of a node includes the node's predecessor node and the node's determined path length; correspondingly, the predecessor node of the iteration starting point is empty and its determined path length is zero. In other implementations, the shortest path of a node also includes information about the nodes on the shortest path from the iteration starting point to that node.
When this step is carried out, the relaxation operation set may also be set to include the iteration starting point.
Step S22: calculate the heuristic path length of each edge node, where the heuristic path length of an edge node is the length of the path from the iteration starting point to the iteration end point that passes through the edge node.
In this step, the heuristic path length of an edge node can be calculated with the following formula:
f(n) = g(n) + h(n);
Here, f(n) is the heuristic path length from the iteration starting point to the iteration end point through edge node n, that is, the estimated cost of reaching the iteration end point from the iteration starting point via edge node n; n is an edge node; h(n) is the predicted path length from edge node n to the iteration end point, that is, how much more cost is needed to get from the edge node to the iteration end point; g(n) is the determined path length of edge node n from the iteration starting point, that is, the cost already determined from the iteration starting point to edge node n; and h is an estimation function, which may be the Euclidean distance, the Manhattan distance, the Chebyshev distance, or the like. This document does not limit the specific type of h. The predicted path length from any node to the iteration end point must be less than or equal to the length of the actual shortest path from that node to the iteration end point, so that h never overestimates.
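As a hedged illustration of f(n) = g(n) + h(n), the sketch below uses the Euclidean distance for h. It assumes each node carries 2D coordinates, which the text does not state; in a knowledge graph without coordinates, another estimation function that does not overestimate would be needed. All names are hypothetical.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def heuristic_path_length(g_n, position_n, position_end):
    """f(n) = g(n) + h(n), with h taken as the Euclidean distance.

    g_n: determined path length of edge node n from the iteration start.
    position_n, position_end: assumed (x, y) coordinates of n and of the
    iteration end point.
    """
    return g_n + euclidean(position_n, position_end)


print(heuristic_path_length(5.0, (0, 0), (3, 4)))  # 10.0
```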
Step S23: according to the heuristic path lengths and the determined path lengths of the edge nodes, select from the first node set and the candidate node set the node pB that has the smallest determined path length among the nodes with the largest heuristic path length, and select from the relaxation operation set the node pA that has the largest determined path length among the nodes with the smallest heuristic path length.
In this step, node pB, which has the smallest determined path length among the nodes with the largest heuristic path length in the first node set and the candidate node set, is the worst node in those two sets. Node pA, which has the largest determined path length among the nodes with the smallest heuristic path length in the relaxation operation set, is the best node. The reasons are as follows:
The node with the largest heuristic path length f: the actual length of the path to the iteration end point through this node is very likely greater than the actual length of the paths to the iteration end point through other nodes.
Among the nodes with the largest heuristic path length f, selecting the node pB with the smallest determined path length g means that the predicted path length h from that node to the iteration end point is the largest. A larger h carries greater uncertainty as it is converted into g, so the g value from n to the iteration end point is more likely to exceed that of other nodes; that is, a longer, lower-quality path is more likely to result. Node pB is therefore the worst node.
The node with the smallest heuristic path length f: the shortest path from the iteration starting point to this node is very likely to lie on the shortest path from the iteration starting point to the iteration end point.
Among the nodes with the smallest heuristic path length f, selecting the node pA with the largest determined path length g means that the predicted path length h from that node to the iteration end point is the smallest. First, a smaller h carries less uncertainty as it is converted into g, so a good-quality path is more likely to result. Second, selecting the node with the largest g ensures that the condition of being no less than the cumulative movement step is met, and it also reduces how often nodes move between the sets handled by the small-step mechanism and the relaxation operation set. Node pA is therefore the best node.
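The selection of pB and pA in step S23 can be sketched with two composite sort keys. The function names and the dictionaries f and g (heuristic and determined path lengths per node) are assumptions for illustration; ties beyond the two criteria are broken arbitrarily here.

```python
def select_pB(nodes, f, g):
    """Worst node: largest heuristic length f, then smallest determined g.

    nodes: nodes of the first node set and the candidate node set.
    Returns None when `nodes` is empty.
    """
    return min(nodes, key=lambda n: (-f[n], g[n]), default=None)


def select_pA(nodes, f, g):
    """Best node: smallest heuristic length f, then largest determined g.

    nodes: nodes of the relaxation operation set.
    Returns None when `nodes` is empty.
    """
    return min(nodes, key=lambda n: (f[n], -g[n]), default=None)
```

With f = {1: 10, 2: 12, 3: 12} and g = {1: 4, 2: 7, 3: 5}, select_pB([1, 2, 3], f, g) picks node 3: it ties node 2 for the largest f and has the smaller g.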
Step S24: choose between the small-step algorithm and the relaxation operation algorithm according to nodes pB and pA. Specifically:
If the relaxation operation set is empty, or the heuristic path length of node pB is less than that of node pA, or the heuristic path length of node pB equals that of node pA and the determined path length of node pB is greater than or equal to that of node pA, execute the small-step algorithm: from the adjacent nodes of the first nodes in the first node set whose shortest paths are undetermined, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set.
Otherwise, use the relaxation operation to compute the paths of all adjacent nodes of node pA in the relaxation operation set, and update the relaxation operation set.
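The decision rule of step S24 translates almost directly into a conditional. The sketch below assumes the f and g values of pB and pA have already been computed; the function name and the returned strings are hypothetical.

```python
def choose_algorithm(relaxation_set, f_pB, g_pB, f_pA, g_pA):
    """Return "small_step" or "relaxation" following the rule in step S24.

    f_*: heuristic path lengths; g_*: determined path lengths.
    When the relaxation operation set is empty, the pA values are unused.
    """
    if not relaxation_set:
        return "small_step"
    if f_pB < f_pA:
        return "small_step"
    if f_pB == f_pA and g_pB >= g_pA:
        return "small_step"
    return "relaxation"
```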
In this step, using the relaxation operation to compute the paths of all adjacent nodes of node pA in the relaxation operation set and updating the relaxation operation set includes:
确定节点pA已经找到最短路径,将节点pA从松弛操作集中移除,并对节点pA的任一邻接点pNeighbor,计算从迭代起始s经节点pA到邻节点pNeighbor的路径长度cost为节点pA的已确定最短路径长度g(pA)加上节点pA与节点pNeighbor间邻接边的权重,执行如下步骤: Determine that node pA has found the shortest path, remove node pA from the relaxation operation set, and for any adjacent node pNeighbor of node pA, calculate the path length cost from the iteration starting point s via node pA to the adjacent node pNeighbor as node pA's The shortest path length g(pA) plus the weight of the adjacent edge between node pA and node pNeighbor has been determined, and the following steps are performed:
如果cost小于原有已确定路径长度g(pNeighbor)且pNeighbor不在松弛操作集中:把节点pNeighbor加入到松弛操作集中;If cost is less than the original determined path length g(pNeighbor) and pNeighbor is not in the relaxation operation set: add node pNeighbor to the relaxation operation set;
如果cost小于原有已确定路径长度g(pNeighbor):删除为pNeighbor记录的路径信息,并根据节点pA和cost记录邻接点pNeighbor的路径信息,节点pNeighbor的路径长度为cost,前驱节点为pA;If cost is less than the original determined path length g(pNeighbor): delete the path information recorded for pNeighbor, and record the path information of the adjacent point pNeighbor based on node pA and cost. The path length of node pNeighbor is cost, and the predecessor node is pA;
如果cost等于原有已确定路径长度g(pNeighbor):根据节点pA和cost记录邻接点pNeighbor的路径信息,即为pNeighbor增加一条等长路径;If cost is equal to the original determined path length g(pNeighbor): record the path information of adjacent point pNeighbor according to node pA and cost, that is, add an equal-length path to pNeighbor;
如果cost大于原有已确定路径长度g(pNeighbor):不做任何操作。If the cost is greater than the original determined path length g(pNeighbor): no operation is performed.
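The four cost cases above can be sketched as follows. This is an illustrative sketch under stated assumptions: the adjacency-list layout, the names `relax_node` and `preds`, and the treatment of unseen nodes as infinitely far away are not taken from the source, and the edge weights in the usage note are hypothetical.

```python
import math
from typing import Dict, List, Set, Tuple

Graph = Dict[int, List[Tuple[int, int]]]  # node -> [(adjacent node, edge weight)]


def relax_node(p_a: int, graph: Graph, g: Dict[int, int],
               preds: Dict[int, List[int]], relax_set: Set[int]) -> None:
    """Relax every adjacent edge of p_a, following the four cost cases."""
    relax_set.discard(p_a)  # p_a's shortest path is now considered found
    for neighbor, weight in graph.get(p_a, []):
        cost = g[p_a] + weight
        known = g.get(neighbor, math.inf)  # assumption: unseen nodes are infinitely far
        if cost < known:
            relax_set.add(neighbor)   # no effect if neighbor is already in the set
            g[neighbor] = cost        # replace the recorded path length
            preds[neighbor] = [p_a]   # drop old path info; p_a becomes the predecessor
        elif cost == known:
            if p_a not in preds.setdefault(neighbor, []):
                preds[neighbor].append(p_a)  # record an additional equal-length path
        # cost > known: do nothing
```

For instance, with hypothetical edges `graph = {1: [(6, 3), (7, 4)]}` and `g = {1: 1}`, calling `relax_node(1, graph, g, {}, {1})` records `g[6] = 4` and `g[7] = 5`.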
Step S25: move nodes among the relaxation operation set, the candidate node set, and the first node set, specifically:
Select the excellent node from the relaxation operation set and move it to the candidate node set; the excellent node is the node pA2 with the largest determined path length among the nodes with the smallest heuristic path length in the relaxation operation set;
When the number of nodes in the first node set and the candidate node set is greater than the second predetermined value, select the nodes in excess of the second predetermined value from the first node set and the candidate node set and move them to the relaxation operation set.
Step S26: repeat steps S22 to S25 until the shortest path from the source node to the target node has been found and the heuristic path lengths of all edge nodes are greater than the length of that shortest path.
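The stopping condition of step S26 can be sketched as a small predicate. This is only an illustration; the function name `search_finished` and its arguments are assumptions, not names used by the described method.

```python
from typing import Iterable, Optional


def search_finished(target_g: Optional[float], edge_node_f: Iterable[float]) -> bool:
    """True once a source-to-target path exists and no edge node can improve on it.

    target_g is the determined path length of the target node, or None if the
    target has not been reached yet; edge_node_f holds the heuristic path
    lengths f of the current edge nodes.
    """
    if target_g is None:
        return False
    return all(f > target_g for f in edge_node_f)
```

With g(8)=6 and edge-node f values 4, 8, and 14, as in the worked example later in the text, the search is not yet finished because 4 is not greater than 6.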
Step S27: according to the shortest path of each node, obtain information from the knowledge graph and respond to the user request according to the obtained information.
When this step is implemented, the information can be obtained according to the meanings represented by the nodes of the knowledge graph and the adjacent edges between them.
This embodiment uses heuristic information as goal-guiding information to combine the small-step algorithm with the relaxation operation algorithm, which reduces the blindness of the small-step search caused by its knowing nothing about the iteration end point. The relaxation operation selects excellent nodes as candidates for the first nodes of the small-step algorithm. When the number of nodes in the first node set and the candidate node set is greater than the predetermined value, some of their nodes are moved to the relaxation operation set, which narrows the search range of the small-step algorithm and yields the shortest path from the iteration starting point to the iteration end point more quickly.
In an embodiment herein, in step S21 above, if the shortest-path search strategy is a single-source single-target forward search, the iteration starting point is the source node. Steps S22 to S25 are then executed in the forward search direction from the source node to the target node. In step S22, the heuristic path length from the iteration starting point through an edge node is the heuristic path length from the source node through the edge node to the target node.
If the shortest-path search strategy is a single-source single-target reverse search, the iteration starting point is the target node, and steps S22 to S25 are executed in the reverse search direction, that is, from the target node toward the source node. In step S22, the heuristic path length from the iteration starting point through an edge node is the heuristic path length from the target node through the edge node to the source node.
If the shortest-path search strategy is a single-source single-target bidirectional search, the iteration starting points are the source node and the target node, and steps S22 to S25 are executed separately in the forward search direction from the source node to the target node and in the reverse search direction from the target node to the source node. In the forward search direction, the heuristic path length in step S22 from the iteration starting point through an edge node is the heuristic path length from the source node through the edge node to the target node; in the reverse search direction, it is the heuristic path length from the target node through the edge node to the source node.
The nodes whose shortest paths have been determined in each search direction are associated with the iteration starting point of that search direction, and the source node and the target node are each associated with themselves. The shortest path from the source node to the target node is determined when there is a node associated with both the source node and the target node.
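The meeting condition of the bidirectional search can be sketched by tagging each determined node with the iteration starting points it has been reached from. The mapping `origins` and the function name `meeting_nodes` are assumptions introduced for this illustration.

```python
from typing import Dict, List, Set


def meeting_nodes(origins: Dict[int, Set[int]], source: int, target: int) -> List[int]:
    """Return the nodes associated with both the source and the target.

    origins maps each node with a determined shortest path to the set of
    iteration starting points it is associated with.
    """
    return [node for node, tags in origins.items()
            if source in tags and target in tags]
```

A non-empty result indicates that the forward and reverse searches have met, so a path from the source node to the target node can be assembled through any returned node.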
In a specific implementation, the single-source single-target bidirectional search performs a forward search starting from the iteration starting point s and a reverse search starting from the iteration end point t; when both searches reach the same node, a shortest path from s to t has been found. Here: (1) the directed adjacent edge <from,to>f is used for the forward search and denotes the directed edge pointing from from to to, where from corresponds to the first node, to is an adjacent node of from, and to corresponds to the second node; (2) the directed adjacent edge <from,to>b is used for the reverse search and denotes the directed edge pointing from from to to, where to corresponds to the first node, from is an adjacent node of to, and from corresponds to the second node; (3) the direction of the adjacent edge <from,to>b in the reverse search is opposite to that of the adjacent edge <from,to>f in the forward search: in the reverse search the second node from points to the first node to, whereas in the forward search the first node from points to the second node to, which agrees with the direction of the adjacent edge <from,to>; (4) in the reverse search, the to node in the adjacent-edge linear list of a first node to is one and the same node, and that to node corresponds to multiple second nodes from; (5) in the forward search, the from node in the adjacent-edge linear list of a first node from is one and the same node, and that from node corresponds to multiple second nodes to; (6) <from,to> may denote an adjacent edge in either search direction: in the forward search it denotes the adjacent edge <from,to>f, and in the reverse search it denotes the adjacent edge <from,to>b.
In an embodiment herein, selecting in step S25 the nodes in excess of the second predetermined value from the first node set and the candidate node set and moving them to the relaxation operation set includes:
Determine the excellent nodes of the first node set and the candidate node set as follows: ordering by heuristic path length from smallest to largest and then by determined path length from largest to smallest, select at most the top-ranked nodes, up to the second predetermined value, whose shortest paths have been determined; then move the non-excellent nodes of the first node set and the candidate node set to the relaxation operation set.
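The ranking described above amounts to a sort with a two-part key. The sketch below is illustrative; the function name `split_excellent` and the dictionary inputs are assumptions, and `limit` stands for the second predetermined value.

```python
from typing import Dict, Iterable, List, Tuple


def split_excellent(nodes: Iterable[int], f: Dict[int, int], g: Dict[int, int],
                    limit: int) -> Tuple[List[int], List[int]]:
    """Split nodes into (excellent, demoted).

    Nodes are ranked by heuristic path length f ascending, then by determined
    path length g descending; the first `limit` nodes are kept and the rest
    are to be moved to the relaxation operation set.
    """
    ranked = sorted(nodes, key=lambda n: (f[n], -g[n]))
    return ranked[:limit], ranked[limit:]
```

Using the values of the worked example later in the text, `split_excellent([2, 3, 1], {1: 5, 2: 4, 3: 4}, {1: 1, 2: 2, 3: 2}, 2)` keeps nodes 2 and 3 and demotes node 1.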
The second predetermined value is a tuning parameter. If it is too small, nodes keep moving back and forth between the first node set and candidate node set on one side and the relaxation operation set on the other. If it is too large, the number of nodes in the first node set and the candidate node set grows, which increases the amount of computation and deprives the heuristic information of its effect. A reasonable choice of the second predetermined value therefore reduces the number of node movements between these sets and improves the efficiency of determining the shortest path. In a specific implementation, the second predetermined value may be determined by one of the following three methods:
(1) Set a fixed second predetermined value according to part or all of the knowledge graph data.
(2) Set a fixed second predetermined value according to the particular iteration starting point and iteration end point.
(3) Design an upper limit and a lower limit for the second predetermined value according to the knowledge graph data and the application scenario. The second predetermined value may be initialized to a value between the lower limit and the upper limit and then adjusted dynamically according to the following strategy: in each run of K0 consecutive iterations, if the number of nodes moved from the relaxation operation set to the candidate node set is less than K1, the second predetermined value is decreased by M; if that number is greater than K2, the second predetermined value is increased by M. Here K0, K1, K2, and M are positive integers that can be set according to the actual situation. The application scenario is, for example, information retrieval or path planning, depending on the specific application field of the knowledge graph. The K0 iterations referred to herein are counted as the number of times steps S22 to S24 above are executed.
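The dynamic adjustment of method (3) can be sketched as below. The function name `adjust_limit` and its parameters are assumptions; in particular, clamping the result to the lower and upper limits is one possible reading of how the limits are meant to be used, not something the text states explicitly.

```python
def adjust_limit(limit: int, moved: int, k1: int, k2: int, step: int,
                 lower: int, upper: int) -> int:
    """Return the new second predetermined value after a window of K0 iterations.

    moved is the number of nodes moved from the relaxation operation set to the
    candidate node set during the window; step corresponds to M.
    """
    if moved < k1:
        limit -= step
    elif moved > k2:
        limit += step
    # Assumption: keep the value inside the designed [lower, upper] range.
    return max(lower, min(upper, limit))
```

A caller would count the moves during each window of K0 iterations, call `adjust_limit` once at the end of the window, and reset the counter.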
In an embodiment herein, to avoid wasting computing power on nodes duplicated across the first node set, the candidate node set, and the relaxation operation set, the information acquisition method based on the shortest path in the knowledge graph further includes:
When the first node set changes, remove the first intersection nodes from the relaxation operation set and the candidate node set, where the first intersection nodes are the nodes the first node set has in common with the relaxation operation set and the candidate node set.
In this step, when a new node, that is, a node that was not moved from the candidate node set to the first node set, is added to the first node set, and that node is in the candidate node set or the relaxation operation set, the node is deleted from the candidate node set and the relaxation operation set.
When the relaxation operation set changes, remove the second intersection nodes from the first node set and the candidate node set, where the second intersection nodes are the nodes the relaxation operation set has in common with the first node set and the candidate node set.
In this step, when a new node, that is, a node that was not moved from the first node set to the relaxation operation set, is added to the relaxation operation set, and that node is in the candidate node set or the first node set, the node is deleted from the candidate node set and the first node set.
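Keeping the three sets disjoint can be sketched with two small helpers. The helper names and the use of Python sets are assumptions made for illustration only.

```python
from typing import Set


def add_to_first_set(node: int, first: Set[int], candidates: Set[int],
                     relax: Set[int]) -> None:
    """Add a new node to the first node set and drop it from the other two sets."""
    first.add(node)
    candidates.discard(node)  # discard is a no-op when the node is absent
    relax.discard(node)


def add_to_relax_set(node: int, first: Set[int], candidates: Set[int],
                     relax: Set[int]) -> None:
    """Add a new node to the relaxation operation set and drop it elsewhere."""
    relax.add(node)
    first.discard(node)
    candidates.discard(node)
```

After either call, the node appears in exactly one of the three sets, so no node is processed twice in the same iteration.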
In an embodiment herein, when the first node set is initially set to include the iteration starting point in step S21, the cumulative movement step associated with the first node set is also set to zero.
When the first node set is updated in step S24, the cumulative movement step associated with the first node set is also updated.
In step S25, the conditions for selecting the excellent node from the relaxation operation set and moving it to the candidate node set include:
The first node set and the candidate node set are empty; or the heuristic path length of the excellent node is less than the largest heuristic path length of the nodes in the first node set and the candidate node set, and the determined path length of the excellent node is greater than or equal to the cumulative movement step associated with the first node set.
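The execution condition above can be sketched as a predicate. The function name `may_promote` and its arguments are assumptions introduced for the example.

```python
from typing import Dict, Set


def may_promote(excellent: int, first: Set[int], candidates: Set[int],
                f: Dict[int, int], g: Dict[int, int], cumulative_step: int) -> bool:
    """Decide whether the excellent node may move to the candidate node set."""
    pool = first | candidates
    if not pool:
        return True  # both sets are empty
    largest_f = max(f[n] for n in pool)
    return f[excellent] < largest_f and g[excellent] >= cumulative_step
```

With the values of the worked example later in the text (excellent node 6 with f=5 and g=4, first node set {2,4} with f values 4 and 8, cumulative movement step 3), the condition holds and node 6 is promoted.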
In an embodiment herein, as shown in Figure 5, in the small-step algorithm of step S24, selecting the adjacent node closest to the iteration starting point from the adjacent nodes of the first nodes in the first node set whose shortest paths have not been determined, determining its shortest path, and updating the first node set includes:
Step S241: when the first node set is empty, move the node with the smallest determined path length in the candidate node set to the first node set, and set the cumulative movement step associated with the first node set to the determined path length of the moved node.
Step S242: determine related information for each first node in the first node set whose related information has not been determined, and remove from the first node set any first node that has no related information.
The related information of a first node includes a second node, a pending adjacent edge, and the estimated shortest path length of the second node. A pending adjacent edge is an adjacent edge, among all adjacent edges between the first node and its adjacent nodes whose shortest paths have not been determined, that satisfies the following conditions: its weight is the minimum among all these adjacent edges, and the sum of its weight and the determined path length of the first node is greater than the cumulative movement step associated with the first node set. The second node is the endpoint of the pending adjacent edge other than the first node. The estimated shortest path length of the second node of a first node equals the weight of the pending adjacent edge plus the determined path length of the first node.
Step S243: select from the first node set the first node whose second node has the smallest estimated shortest path length and, according to the selected first node, select nodes from the candidate node set to move to the first node set.
In this step, when a node moves from the candidate node set to the first node set, step S242 must be executed again to determine related information for the moved node, and the first node in the first node set whose second node has the smallest estimated shortest path length must be determined again.
When this step is implemented, selecting nodes from the candidate node set to move to the first node set according to the selected first node includes:
(1) When the estimated shortest path length of the second node of the selected first node is greater than or equal to the minimum determined path length of the nodes in the candidate node set, move nodes from the candidate node set to the first node set according to steps (2) and (3) below:
(2) For each node pX in the candidate node set that has the minimum determined path length, judge whether the second nodes associated with the selected first node include node pX;
if not, move node pX from the candidate node set to the first node set and determine related information for node pX, where the estimated shortest path length of the second node of node pX = the determined path length of node pX + the weight of the pending adjacent edge of node pX;
if so, remove node pX from the candidate node set;
(3) Judge whether any node of the candidate node set was moved to the first node set in (2). If so, execute step S243 again to reselect from the first node set the first node whose second node has the smallest estimated shortest path length; otherwise, execute step (1) again, until the minimum determined path length of the nodes in the candidate node set is greater than the estimated shortest path length of the second node of the selected first node, at which point step S244 is executed.
In this step, moving the nodes of the candidate node set that are closest to the iteration starting point into the first node set ensures that the second node corresponding to the estimated shortest path length of the selected first node is the node closest to the iteration starting point whose shortest path has not been determined. This completes the movement of nodes from the relaxation operation set to the first node set, with the nodes of the candidate node set closest to the iteration starting point being the first to be moved to the first node set or removed from the candidate node set.
Step S244: the second node associated with the selected first node is taken as the adjacent node closest to the iteration starting point.
Step S245: update the cumulative movement step associated with the first node set to the estimated shortest path length of the second node associated with the selected first node.
Step S246: for the second node, the pending adjacent edge, and the estimated shortest path length of the second node associated with each selected first node, perform the following processing:
Step S2461: when the estimated shortest path length of the second node of the first node equals the determined path length of that second node, take the second node as a successor node, add a new shortest path for the successor node according to the first node and the successor node, and determine the related information of the first node; if there is no related information, remove the first node from the first node set.
Step S2462: when the estimated shortest path length of the second node of the first node is less than the determined path length of that second node, take the second node as a successor node, add a new shortest path for the successor node according to the first node and the successor node, and determine the related information of the first node; if there is no related information, remove the first node from the first node set. Then add the second node to the first node set as a new first node and determine the related information of the new first node; if there is no related information, remove the new first node from the first node set.
To explain the technical solution of the embodiments shown in Figures 4 and 5 more clearly, a specific example is described in detail below. The knowledge graph in this example contains 9 nodes; the node numbers and the edge weights between nodes are shown in Figure 6. The source node is node 0 and the end node is node 8. The heuristic path length f(n), determined path length g(n), and predicted path length h(n) of the nodes are:
g(0)=0, h(0)=5, f(0)=5;
g(1)=1, h(1)=4, f(1)=5;
g(2)=2, h(2)=2, f(2)=4;
g(3)=2, h(3)=2, f(3)=4;
g(4)=3, h(4)=5, f(4)=8;
g(6)=4, h(6)=1, f(6)=5;
g(7)=5, h(7)=9, f(7)=14;
g(8)=6, h(8)=0, f(8)=6.
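Each f value listed above follows from f(n)=g(n)+h(n). The short check below recomputes them from the g and h values given in this example; the dictionary names are illustrative only.

```python
# Determined path lengths g(n) and predicted path lengths h(n) from the example.
g = {0: 0, 1: 1, 2: 2, 3: 2, 4: 3, 6: 4, 7: 5, 8: 6}
h = {0: 5, 1: 4, 2: 2, 3: 2, 4: 5, 6: 1, 7: 9, 8: 0}

# Heuristic path length f(n) = g(n) + h(n) for every listed node.
f = {n: g[n] + h[n] for n in g}

for n in sorted(f):
    print(f"f({n}) = {g[n]} + {h[n]} = {f[n]}")
```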
The information acquisition method based on the shortest path in the knowledge graph includes:
Initialization: the shortest path length of iteration starting point 0 is 0. The iteration starting point is placed in the first node set, and the relaxation operation set and the candidate node set are empty, written as relaxation operation set {}, candidate node set {}, first node set {0}. At this point node 0 is the only edge node.
First loop: compute the heuristic path lengths of the edge nodes.
Node 0 is the only edge node, so the heuristic path length is f(0)=g(0)+h(0)=0+5=5.
Node pB=0 is selected from the first node set and the candidate node set; no node pA can be selected from the relaxation operation set.
Because the relaxation operation set is empty, the small-step algorithm is executed. Node 1 is selected as the adjacent node that is closest to the iteration starting point and whose shortest path has not been determined; the shortest path of node 1 is determined, with determined path length g(1)=1, and the cumulative movement step is 1. The first node set is updated to {1,0}.
The small-step algorithm is executed in this loop, so no node needs to be selected from the relaxation operation set and moved to the candidate node set. In addition, the first node set and the candidate node set together contain 2 nodes, which equals N=2, so no node needs to be moved to the relaxation operation set.
Second loop: compute the heuristic path lengths of the edge nodes.
The edge nodes are node 0 and node 1. The heuristic path length of node 0 is f(0)=g(0)+h(0)=0+5=5, and that of node 1 is f(1)=g(1)+h(1)=1+4=5.
Node pB=0 is selected from the first node set and the candidate node set; no node pA can be selected from the relaxation operation set. Nodes 2 and 3 are selected as the adjacent nodes that are closest to the iteration starting point and whose shortest paths have not been determined; their shortest paths are determined, with determined path lengths g(2)=g(3)=2, and the cumulative movement step is 2. The first node set is updated to {2,3,1}.
The number of nodes in the first node set is greater than N=2, so the node with the largest heuristic path length and the smallest determined path length in the first node set must be moved to the relaxation operation set. Since f(1)=g(1)+h(1)=1+4=5, f(2)=g(2)+h(2)=2+2=4, and f(3)=g(3)+h(3)=2+2=4, node 1 in the first node set is moved to the relaxation operation set. At this point the relaxation operation set contains node 1, the candidate node set is empty, and the first node set contains nodes 2 and 3.
Third loop: compute the heuristic path lengths of the edge nodes.
The edge nodes are node 1, node 2, and node 3. The heuristic path length of node 1 is f(1)=g(1)+h(1)=1+4=5, and those of nodes 2 and 3 are f(2)=g(2)+h(2)=2+2=4 and f(3)=g(3)+h(3)=2+2=4.
Node pB=3 is selected from the first node set and the candidate node set, and node pA=1 is selected from the relaxation operation set.
Since f(pB)<f(pA), the small-step algorithm is selected. Node 4 is the adjacent node that is closest to the iteration starting point and whose shortest path has not been determined; the shortest path of node 4 is determined, with determined path length g(4)=3, the cumulative movement step is 3, and the first node set is updated to {2,4}.
The first node set and the candidate node set together contain 2 nodes, which does not exceed N=2, so no node needs to be moved to the relaxation operation set.
Fourth loop: compute the heuristic path lengths of the edge nodes.
The edge nodes are node 1, node 2, and node 4. The heuristic path length of node 1 is f(1)=g(1)+h(1)=1+4=5, and those of nodes 2 and 4 are f(2)=g(2)+h(2)=2+2=4 and f(4)=g(4)+h(4)=3+5=8.
Node pB=4 is selected from the first node set and the candidate node set, and node pA=1 is selected from the relaxation operation set.
Since f(pB)>f(pA), the relaxation operation algorithm is selected. The paths of nodes 6 and 7 are determined, with determined path lengths g(6)=4 and g(7)=5 respectively, and the relaxation operation set is updated to {6,7}.
The heuristic path lengths of the nodes in the relaxation operation set are f(6)=g(6)+h(6)=4+1=5 and f(7)=g(7)+h(7)=5+9=14, so the excellent node in the relaxation operation set is node 6. Since f(6)<f(pB) and g(6)=4 is greater than or equal to the cumulative movement step 3, node 6 is added to the candidate node set; at this point the candidate node set is {6} and the first node set is {2,4}. The first node set and the candidate node set together now contain more than N nodes, so node 4, which has the largest heuristic path length among them, is moved to the relaxation operation set. At this point the relaxation operation set is {7,4}, the candidate node set is {6}, and the first node set is {2}.
第五次循环:计算边缘节点的启发路径长度。The fifth loop: Calculate the heuristic path length of the edge node.
边缘节点包括节点2、节点6、节点4及节点7,节点2的启发路径长度为f(2)=g(2)+h(2)=2+2=4,节点6的启发路径长度为f(6)=g(6)+h(6)=4+1=5,节点4的启发路径长度为f(4)=g(4)+h(4)=3+5=8,节点7的启发路径长度为f(7)=g(7)+h(7)=5+9=14。The edge nodes include node 2, node 6, node 4 and node 7. The heuristic path length of node 2 is f(2)=g(2)+h(2)=2+2=4, and the heuristic path length of node 6 is f(6)=g(6)+h(6)=4+1=5, the heuristic path length of node 4 is f(4)=g(4)+h(4)=3+5=8, node The heuristic path length of 7 is f(7)=g(7)+h(7)=5+9=14.
从第一节点集及待选节点集中选择节点pB=6,从松弛操作集中无法筛选出节点pA=4。f(pB)<f(pA),选择小步算法,节点8为距离迭代起点最近且未确定最短路径的邻接节点,确定节点8的最短路径,其已确定路径长度为g(8)=6,更新第一节点集为{2}。Node pB=6 is selected from the first node set and the candidate node set, and node pA=4 cannot be filtered out from the relaxation operation set. f(pB)<f(pA), select the small step algorithm, node 8 is the adjacent node closest to the iteration starting point and the shortest path has not been determined, determine the shortest path of node 8, and its determined path length is g(8)=6 , update the first node set to {2}.
移动至迭代终点为8,此时找到了一条到达迭代终点的最短路径。因为f(2)=4<=g(8),还需要继续执行下一次循环。Move to the iteration end point 8, and now find the shortest path to the iteration end point. Because f(2)=4<=g(8), it is necessary to continue executing the next cycle.
Sixth loop: calculate the heuristic path lengths of the edge nodes.
The edge nodes are node 2, node 7 and node 4. Their heuristic path lengths are f(2)=g(2)+h(2)=2+2=4, f(4)=g(4)+h(4)=3+5=8 and f(7)=g(7)+h(7)=5+9=14.
Iteration continues in the same manner until the f values of all nodes in the candidate node set, the relaxation operation set and the first node set are greater than g(8).
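The stopping condition described above can be sketched as a single check. This is an illustrative reading only; the function name `search_finished` and its arguments are assumptions, and the values are those of the sixth loop.

```python
def search_finished(edge_f_values, g_target):
    """Return True once every edge node's f value strictly exceeds the
    determined path length of the iteration end point (hypothetical helper)."""
    return all(f_value > g_target for f_value in edge_f_values.values())


# Sixth loop of the worked example: g(8) = 6 and node 2 still has f(2) = 4.
edge_f = {2: 4, 4: 8, 7: 14}
print(search_finished(edge_f, 6))  # False, so another loop is required
```

Once node 2 has been processed and only nodes with f values above 6 remain, the same check returns True and the loop ends.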
In an embodiment herein, to avoid invalid iterations that would slow down the shortest-path search, after the related information of the first node is determined in step S242, the method further includes:
determining whether the related information of at least two first nodes contains the same second node;
if so, comparing the estimated shortest path lengths of the second node held by the first nodes that share that second node;
if those estimated shortest path lengths differ, retaining the related information only for the first node with the smallest estimated shortest path length of the second node, and re-determining the related information for the other first nodes.
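The three steps above can be sketched as follows. This is a minimal illustration under assumptions: the related information is modeled as a hypothetical mapping from each first node to a `(second_node, estimated_length)` pair, the node labels are invented for the example, and re-determining the related information is left to the caller.

```python
from collections import defaultdict


def prune_shared_second_nodes(related_info):
    """Keep, for each shared second node, only the first nodes with the
    smallest estimated length; return the others for re-determination."""
    by_second = defaultdict(list)
    for first, (second, estimate) in related_info.items():
        by_second[second].append((estimate, first))

    kept, to_redetermine = {}, []
    for second, entries in by_second.items():
        best = min(estimate for estimate, _ in entries)
        for estimate, first in entries:
            if estimate == best:
                kept[first] = (second, estimate)  # equal estimates are all kept
            else:
                to_redetermine.append(first)
    return kept, sorted(to_redetermine)


# Hypothetical data: first nodes 1 and 3 both point at second node 5.
info = {1: (5, 7), 3: (5, 6), 4: (9, 8)}
kept, redo = prune_shared_second_nodes(info)
print(kept, redo)
```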
In an embodiment herein, to avoid missing a shortest path, the small-step algorithm further includes:
determining a third node according to the ordered linear list of adjacent edges of each first node, where the third node is a node whose shortest path is being determined or has already been determined;
if the determined path length of the first node plus the weight of the adjacent edge between the first node and the third node equals the determined path length of the third node, taking the third node as a successor node and adding a shortest path for the successor node according to the first node and the successor node; otherwise,
if the determined path length of the first node plus the weight of the adjacent edge between the first node and the third node equals the cumulative movement step associated with the first node set, taking the third node as a successor node, adding a shortest path for the successor node according to the first node and the successor node, adding the third node to the first node set as a new first node, and determining the related information of the new first node; if the new first node has no related information, it is removed from the first node set.
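The first of the two conditions above, which records an additional shortest path of equal length, can be sketched as below. The sketch covers only that first branch; the second branch (comparison with the cumulative movement step) is omitted. The names `g`, `preds` and `try_add_equal_path` and the sample values are assumptions, and edge weights are assumed to be exactly comparable (for example, integers).

```python
import math


def try_add_equal_path(first, third, weight, g, preds):
    """If g(first) + weight equals g(third), record `first` as one more
    predecessor of `third`, i.e. one more shortest path to it."""
    if g[first] + weight == g.get(third, math.inf):
        preds.setdefault(third, []).append(first)
        return True
    return False


# Hypothetical state: node 5 was already reached through node 2 with length 4.
g = {1: 0, 2: 2, 3: 2, 5: 4}
preds = {5: [2]}
print(try_add_equal_path(3, 5, 2, g, preds))  # True: 2 + 2 == 4
print(preds[5])                               # [2, 3]
```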
In an embodiment herein, to improve the efficiency of shortest-path determination, a task-parallel distributed system is also provided. In this system, multiple cluster devices can simultaneously process multiple single-target-node shortest-path determination tasks and single-source-node shortest-path determination tasks, and the different computation processes within each task are implemented by cluster or parallel computing.
Based on the same inventive concept, this document also provides an information acquisition apparatus based on the shortest path in a knowledge graph, as described in the following embodiments. Because the apparatus solves the problem on the same principle as the information acquisition method based on the shortest path in a knowledge graph, its implementation can refer to that of the method, and repeated content is not described again.
Specifically, as shown in Figure 7, an information acquisition apparatus based on the shortest path in a knowledge graph includes:
an initialization unit 701, configured to initially set an iteration starting point and determine its shortest path according to a shortest path search strategy and the nodes in a user request;
a shortest path determination unit 702, configured to take nodes whose shortest path has been determined and which have adjacent nodes whose shortest path has not been determined as first nodes to form a first node set, and, from the adjacent nodes of the first nodes in the first node set whose shortest path has not been determined, to select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set;
a loop control unit 703, configured to repeatedly invoke the shortest path determination unit 702 until all expected shortest paths are found, where the expected shortest paths depend on the shortest path search strategy;
an information acquisition unit 704, configured to acquire information from the knowledge graph according to the shortest path of each node and to respond to the user request according to the acquired information.
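The cooperation of units 701 to 703 can be pictured with a deliberately simplified sketch: starting from the iteration starting point, it repeatedly fixes the undetermined neighbor that is closest to the start. This is not the disclosed small-step algorithm itself, since the cumulative movement step, the related information of first nodes and multiple predecessors are all left out. The graph, the function name and the variable names are assumptions, and non-negative edge weights are assumed.

```python
def simplified_small_step(graph, start):
    """Fix one node per round: the undetermined neighbor of any determined
    node whose path length from `start` is smallest."""
    dist = {start: 0}     # determined path lengths
    pred = {start: None}  # predecessor on the shortest path
    while True:
        best = None
        for u, du in dist.items():
            for v, w in graph[u].items():
                if v not in dist and (best is None or du + w < best[0]):
                    best = (du + w, u, v)
        if best is None:  # no determined node has an undetermined neighbor
            return dist, pred
        length, u, v = best
        dist[v], pred[v] = length, u


# Hypothetical undirected graph given as adjacency dictionaries.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
dist, pred = simplified_small_step(graph, "A")
print(dist)  # {'A': 0, 'B': 1, 'C': 3, 'D': 4}
```

The returned `pred` mapping plays the role of the predecessor node from which a path could then be read back for the information acquisition step.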
As shown in Figure 8, the information acquisition apparatus based on the shortest path in a knowledge graph includes:
an initialization unit 801, configured to initially set an iteration starting point and edge nodes according to a shortest path search strategy and the source node and target node in a user request, where the edge nodes include: nodes in the first node set, whose shortest path has been determined and which have adjacent nodes whose shortest path has not been determined; nodes in the candidate node set, whose shortest path has not been determined and which may be added to the first node set; and nodes in the relaxation operation set, for which a path has been obtained by a relaxation operation but whose shortest path has not been determined; the first node set is initially set to contain the iteration starting point, whose shortest path is determined, and the candidate node set and the relaxation operation set are initially empty; the determined path length of the iteration starting point is initially set to zero and that of all other nodes to infinity;
a calculation unit 802, configured to calculate the heuristic path length of each edge node, where the heuristic path length of an edge node is the length of a path from the iteration starting point to the iteration end point passing through that edge node;
a screening unit 803, configured to select, according to the heuristic path lengths and the determined path lengths of the edge nodes, node pB from the first node set and the candidate node set, pB being the node with the smallest determined path length among the nodes with the largest heuristic path length, and to select node pA from the relaxation operation set, pA being the node with the largest determined path length among the nodes with the smallest heuristic path length;
an algorithm selection unit 804, configured to execute the following small-step algorithm if the relaxation operation set is empty, or if the heuristic path length of node pB is less than that of node pA, or if the heuristic path length of node pB equals that of node pA and the determined path length of node pB is greater than or equal to that of node pA: from the adjacent nodes of the first nodes in the first node set whose shortest path has not been determined, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set and the determined path length of the newly added first node;
otherwise, to compute, by relaxation operations, the paths of all adjacent nodes of node pA in the relaxation operation set, and to update the relaxation operation set and the determined path lengths of the newly added relaxation nodes;
a node moving unit 805, configured to select an excellent node from the relaxation operation set and move it to the candidate node set, the excellent node being node pA2, the node with the largest determined path length among the nodes with the smallest heuristic path length in the relaxation operation set;
and, when the number of nodes in the first node set and the candidate node set exceeds a second predetermined value, to select the nodes in excess of the second predetermined value from the first node set and the candidate node set and move them to the relaxation operation set;
a loop control unit 806, configured to repeatedly invoke the calculation unit, the screening unit, the algorithm selection unit and the node moving unit until the shortest path from the source node to the target node is found and the heuristic path lengths of all edge nodes are greater than the length of that shortest path;
an information acquisition unit 807, configured to acquire information from the knowledge graph according to the shortest path of each node and to respond to the user request according to the acquired information.
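The selection rules of units 803 to 805 can be illustrated with the values from the worked example in the description. This is a sketch under assumptions, not the disclosed implementation: the function names are invented here, the relaxation step itself is not shown, the limit of 2 stands in for the second predetermined value (the text calls it N without giving a number), and the node moved out on overflow is taken to be the one with the largest heuristic path length, as in the example.

```python
def select_pB(nodes, f, g):
    """Largest heuristic path length; ties broken by smallest determined length."""
    return max(sorted(nodes), key=lambda n: (f[n], -g[n]))


def select_pA(relax_set, f, g):
    """Smallest heuristic path length; ties broken by largest determined length."""
    return min(sorted(relax_set), key=lambda n: (f[n], -g[n])) if relax_set else None


def use_small_step(pB, pA, f, g):
    """Decision rule of the algorithm selection unit."""
    if pA is None or f[pB] < f[pA]:
        return True
    return f[pB] == f[pA] and g[pB] >= g[pA]


def enforce_capacity(first_set, candidate_set, relax_set, f, limit):
    """Move the largest-f nodes to the relaxation set while over the limit."""
    while len(first_set) + len(candidate_set) > limit:
        worst = max(sorted(first_set | candidate_set), key=lambda n: f[n])
        first_set.discard(worst)
        candidate_set.discard(worst)
        relax_set.add(worst)


# Values from the worked example (end of the fourth loop, start of the fifth).
g = {2: 2, 6: 4, 4: 3, 7: 5}
f = {2: 4, 6: 5, 4: 8, 7: 14}
first_set, candidate_set, relax_set = {2, 4}, {6}, {7}

enforce_capacity(first_set, candidate_set, relax_set, f, limit=2)  # limit assumed
pB = select_pB(first_set | candidate_set, f, g)
pA = select_pA(relax_set, f, g)
print(first_set, candidate_set, relax_set, pB, pA, use_small_step(pB, pA, f, g))
```

With these inputs node 4 moves to the relaxation operation set, pB is node 6, pA is node 4, and the small-step algorithm is chosen, which matches the fifth loop of the example.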
In an embodiment herein, a computer device is also provided. As shown in Figure 9, the computer device 902 includes a memory 906, a processor 904, and a computer program stored in the memory 906 and executable on the processor 904; when the processor 904 executes the computer program, the method of any of the foregoing embodiments is implemented. The processor 904 may be, for example, one or more central processing units (CPUs), each of which can implement one or more hardware threads. The memory 906 stores any kind of information, such as code, settings and data. By way of non-limiting example, the memory 906 may include any one or a combination of the following: any type of RAM, any type of ROM, flash memory devices, hard disks, optical disks, and so on. More generally, any memory may use any technology to store information. Further, any memory may provide volatile or non-volatile retention of information, and any memory may represent a fixed or removable component of the computer device 902. In one case, when the processor 904 executes associated instructions stored in any memory or combination of memories, the computer device 902 can perform any operation of those instructions. The computer device 902 also includes one or more drive mechanisms 908 for interacting with any memory, such as a hard disk drive mechanism or an optical disk drive mechanism.
The computer device 902 may also include an input/output module 910 (I/O) for receiving various inputs (via input devices 912) and for providing various outputs (via output devices 914). One particular output mechanism may include a presentation device 916 and an associated graphical user interface 918 (GUI). In other embodiments, the input/output module 910 (I/O), the input devices 912 and the output devices 914 may be omitted, with the computer device 902 serving only as one computer device in a network. The computer device 902 may also include one or more network interfaces 920 for exchanging data with other devices via one or more communication links 922. One or more communication buses 924 couple the components described above together.
The communication link 922 may be implemented in any manner, for example through a local area network, a wide area network (e.g., the Internet), a point-to-point connection, or any combination thereof. The communication link 922 may include any combination of hardwired links, wireless links, routers, gateway functions, name servers and the like, governed by any protocol or combination of protocols.
Corresponding to the methods in Figures 1 to 2 and Figures 4 to 5, the embodiments herein also provide a computer-readable storage medium storing a computer program which, when run by a processor, performs the steps of the above methods.
The embodiments herein also provide computer-readable instructions which, when executed by a processor, cause the processor to perform the methods shown in Figures 1 to 2 and Figures 4 to 5.
It should be understood that, in the various embodiments herein, the sequence numbers of the above processes do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and the numbering should not limit the implementation of the embodiments herein in any way.
It should also be understood that, in the embodiments herein, the term "and/or" merely describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may implement the described functions in different ways for each particular application, but such implementations should not be considered to go beyond the scope of this document.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided herein, it should be understood that the disclosed systems, apparatuses and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. Multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses or units, and may be electrical, mechanical or of other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments herein.
In addition, the functional units in the embodiments herein may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. On this understanding, the technical solution herein, in essence or in the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments herein. The aforementioned storage media include various media that can store program code, such as USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks and optical disks.
Specific embodiments have been used herein to explain the principles and implementations of this document. The description of the above embodiments is intended only to help in understanding the method and its core idea. At the same time, those of ordinary skill in the art may, following the idea of this document, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting this document.

Claims (24)

  1. An information acquisition method based on the shortest path in a knowledge graph, characterized in that the knowledge graph includes a plurality of nodes and adjacent edges between nodes, the weight of an adjacent edge representing the distance between the nodes it connects, the method comprising:
    S11, initially setting an iteration starting point and determining its shortest path according to a shortest path search strategy and the nodes in a user request;
    S12, taking nodes whose shortest path has been determined and which have adjacent nodes whose shortest path has not been determined as first nodes to form a first node set, and, from the adjacent nodes of the first nodes in the first node set whose shortest path has not been determined, selecting the adjacent node closest to the iteration starting point, determining its shortest path, and updating the first node set;
    S13, repeating step S12 until all expected shortest paths are found, where the expected shortest paths depend on the shortest path search strategy;
    S14, acquiring information from the knowledge graph according to the shortest path of each node and responding to the user request according to the acquired information.
  2. The method of claim 1, characterized in that, if the shortest path search strategy is single-source search, the source node in the user request is set as the iteration starting point and steps S12 and S13 are executed starting from the source node, all expected shortest paths being deemed found when no node whose shortest path has been determined has an adjacent node whose shortest path has not been determined;
    if the shortest path search strategy is single-source single-target forward search, the source node in the user request is set as the iteration starting point, and steps S12 and S13 are executed in the forward search direction from the source node to the target node;
    if the shortest path search strategy is single-source single-target reverse search, the target node in the user request is set as the iteration starting point, and steps S12 and S13 are executed in the reverse search direction, from the target node to the source node;
    if the shortest path search strategy is single-source single-target bidirectional search, the source node and the target node in the user request are both set as iteration starting points, and steps S12 and S13 are executed separately in the forward search direction from the source node to the target node and in the reverse search direction from the target node to the source node;
    for single-source single-target forward search, reverse search and bidirectional search, each node whose shortest path has been determined in a given search direction is associated with the iteration starting point of that direction, the source node and the target node are each associated with themselves, and all expected shortest paths are deemed found when there exists a node associated with both the source node and the target node.
  3. The method of claim 1, characterized in that the shortest path of a node includes: the predecessor node of the node and the determined path length of the node;
    the determined path length of the node equals the determined path length of its predecessor node plus the weight of the adjacent edge between the node and that predecessor node, and the node has at least one predecessor node;
    initially setting the iteration starting point and determining its shortest path in S11 includes: setting the predecessor node of the iteration starting point to empty, the determined path length of the iteration starting point to zero, and the determined path lengths of all other nodes to infinity.
  4. The method of claim 3, characterized in that, in step S12, selecting the adjacent node closest to the iteration starting point from the adjacent nodes of the first nodes in the first node set whose shortest path has not been determined, determining its shortest path and updating the first node set includes:
    S121, determining related information for each first node in the first node set whose related information has not been determined, and removing from the first node set any first node that has no related information, where the related information of a first node includes: a second node, a pending adjacent edge, and an estimated shortest path length of the second node; the pending adjacent edge is the adjacent edge, among all adjacent edges between the first node and its adjacent nodes whose shortest path has not been determined, that satisfies the following conditions: its weight is the minimum among all those adjacent edges, and the sum of its weight and the determined path length of the first node is greater than the cumulative movement step associated with the first node set; the second node is the endpoint of the pending adjacent edge other than the first node; the estimated shortest path length of the second node equals the weight of the pending adjacent edge plus the determined path length of the first node;
    S122, selecting from the first node set the first node whose second node has the smallest estimated shortest path length;
    S123, taking the second node related to the selected first node as the adjacent node closest to the iteration starting point;
    S124, updating the cumulative movement step associated with the first node set to the estimated shortest path length of the second node associated with the selected first node;
    S125, for the second node, the pending adjacent edge and the estimated shortest path length of the second node related to each selected first node, performing the following processing:
    S1251, when the estimated shortest path length of the second node equals the determined path length of the second node, taking the second node as a successor node, adding a shortest path for the successor node according to the first node and the successor node, and determining the related information of the first node; if the first node has no related information, removing it from the first node set;
    S1252, when the estimated shortest path length of the second node is less than the determined path length of the second node, taking the second node as a successor node, adding a shortest path for the successor node according to the first node and the successor node, and determining the related information of the first node; if the first node has no related information, removing it from the first node set; adding the second node to the first node set as a new first node and determining the related information of the new first node; if the new first node has no related information, removing it from the first node set.
  5. The method of claim 4, characterized in that the method further includes:
    determining a third node according to the ordered linear list of adjacent edges of each first node;
    if the determined path length of the first node plus the weight of the adjacent edge between the first node and the third node equals the determined path length of the third node, taking the third node as a successor node and adding a shortest path for the successor node according to the first node and the successor node; otherwise,
    if the determined path length of the first node plus the weight of the adjacent edge between the first node and the third node equals the cumulative movement step associated with the first node set, taking the third node as a successor node, adding a shortest path for the successor node according to the first node and the successor node, adding the third node to the first node set as a new first node, and determining the related information of the new first node; if the new first node has no related information, removing it from the first node set.
  6. The method of claim 4 or 5, characterized in that adding a shortest path for the successor node according to the first node and the successor node includes:
    the predecessor node of the added shortest path of the successor node is the first node;
    the length of the added shortest path of the successor node is the weight of the adjacent edge between the first node and the successor node plus the determined path length of the first node.
  7. 如权利要求4或5所述的方法,其特征在于,确定第一节点的相关信息之后还包括:The method according to claim 4 or 5, characterized in that after determining the relevant information of the first node, it further includes:
    判断是否至少两个第一节点的相关信息中存在相同的第二节点;Determine whether the same second node exists in the relevant information of at least two first nodes;
    若是,则比较具有相同第二节点的第一节点的第二节点预计最短路径长度;If so, compare the estimated shortest path lengths of the second node held by the first nodes that share that second node;
    若第一节点的第二节点预计最短路径长度不同,则仅保留最小第二节点预计最短路径长度的第一节点的相关信息,为其它第一节点重新确定相关信息。If those estimated shortest path lengths differ, keep the relevant information only for the first node with the smallest estimated shortest path length of the second node, and re-determine relevant information for the other first nodes.
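The conflict check in claim 7 can be illustrated with a short Python sketch. The function name and the dictionary layout (first node mapped to a pair of second node and estimated length) are assumptions made for illustration; the claim only requires that first nodes with a larger estimate for a shared second node have their relevant information re-determined.

```python
def resolve_shared_second_nodes(info):
    """Split first nodes into those whose relevant information is kept and
    those that must have it re-determined.

    `info` maps first node -> (second node, estimated shortest path length).
    First nodes that tie on the smallest estimate are all kept.
    """
    # Smallest estimate seen for each second node.
    best = {}
    for second, est in info.values():
        if second not in best or est < best[second]:
            best[second] = est

    kept, redo = {}, []
    for first, (second, est) in info.items():
        if est == best[second]:
            kept[first] = (second, est)
        else:
            redo.append(first)
    return kept, redo


info = {"a": ("x", 5), "b": ("x", 3), "c": ("y", 4), "d": ("x", 3)}
kept, redo = resolve_shared_second_nodes(info)
```

Here `a`, `b`, and `d` share the second node `x`; only `a` has a larger estimate, so it alone is returned for re-determination.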
  8. 一种基于知识图谱最短路径的信息获取方法,其特征在于,所述知识图谱中包括多个节点及节点间邻接边,所述节点间邻接边权重表示节点间距离,所述方法包括:An information acquisition method based on the shortest path of a knowledge graph, characterized in that the knowledge graph includes multiple nodes and adjacent edges between nodes, and the weight of the adjacent edges between nodes represents the distance between nodes. The method includes:
    S21,根据最短路径查找策略及用户请求中的源节点和目标节点,初始地设定迭代起点及边缘节点,边缘节点包括已确定最短路径且有未确定最短路径的邻接节点的第一节点集中节点、可加入第一节点集的未确定最短路径的待选节点集中节点、通过松弛操作得到路径但未确定最短路径的松弛操作集中节点;初始地设定第一节点集包括迭代起点并确定其最短路径,待选节点集及松弛操作集为空;初始地设定迭代起点的已确定路径长度为零,其余节点的已确定路径长度为无穷大;S21: according to the shortest path search strategy and the source node and target node in the user request, initially set the iteration starting point and the edge nodes. The edge nodes include: nodes in the first node set, which have a determined shortest path and at least one adjacent node whose shortest path is undetermined; nodes in the candidate node set, whose shortest path is undetermined and which may be added to the first node set; and nodes in the relaxation operation set, for which a path has been obtained through a relaxation operation but whose shortest path is undetermined. Initially, the first node set contains the iteration starting point, whose shortest path is determined, and the candidate node set and the relaxation operation set are empty; the determined path length of the iteration starting point is initially set to zero, and the determined path lengths of all other nodes are set to infinity;
    S22,计算边缘节点的启发路径长度,其中,边缘节点的启发路径长度为从迭代起点到迭代终点经过边缘节点的路径长度;S22, calculate the heuristic path length of the edge node, where the heuristic path length of the edge node is the path length from the iteration starting point to the iteration end point passing through the edge node;
    S23,根据所述启发路径长度和边缘节点已确定路径长度,从第一节点集及待选节点集中筛选出启发路径长度最大的节点中已确定路径长度最小的节点pB,从松弛操作集中筛选出启发路径长度最小的节点中已确定路径长度最大的节点pA;S23: according to the heuristic path lengths and the determined path lengths of the edge nodes, select from the first node set and the candidate node set the node pB that has the smallest determined path length among the nodes with the largest heuristic path length, and select from the relaxation operation set the node pA that has the largest determined path length among the nodes with the smallest heuristic path length;
    S24,如果松弛操作集为空或节点pB的启发路径长度小于节点pA的启发路径长度,或节点pB的启发路径长度等于节点pA的启发路径长度且节点pB的已确定路径长度大于等于节点pA的已确定路径长度,则执行如下小步算法:从第一节点集中第一节点的未确定最短路径的邻接节点中,选择距离迭代起点最近的邻接节点确定最短路径并更新第一节点集;S24: if the relaxation operation set is empty, or the heuristic path length of node pB is less than that of node pA, or the heuristic path length of node pB equals that of node pA and the determined path length of node pB is greater than or equal to that of node pA, execute the following small-step algorithm: from the adjacent nodes of the first nodes in the first node set whose shortest paths are undetermined, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set;
    否则,基于松弛操作计算所述松弛操作集中节点pA的所有邻接节点的路径并更新所述松弛操作集;Otherwise, calculate the paths of all adjacent nodes of the node pA in the relaxation operation set based on the relaxation operation and update the relaxation operation set;
    S25,从松弛操作集中筛选出优秀节点移动至待选节点集中,所述优秀节点为松弛操作集中启发路径长度最小的节点中已确定路径长度最大的节点pA2;S25: select an excellent node from the relaxation operation set and move it to the candidate node set, the excellent node being the node pA2 that has the largest determined path length among the nodes with the smallest heuristic path length in the relaxation operation set;
    当所述第一节点集及待选节点集中节点数量大于第二预定值时,从所述第一节点集及待选节点集中筛选出超出第二预定值的节点移动至松弛操作集中;When the number of nodes in the first node set and the candidate node set is greater than a second predetermined value, the nodes exceeding the second predetermined value are filtered out from the first node set and the candidate node set and moved to the relaxation operation set;
    S26,重复以上步骤S22至步骤S25的过程,直至找到从源节点到目标节点的最短路径且边缘节点的启发路径长度都大于从源节点到目标节点的最短路径的长度;S26, repeat the above process from step S22 to step S25 until the shortest path from the source node to the target node is found and the heuristic path length of the edge node is greater than the length of the shortest path from the source node to the target node;
    S27,根据各节点的最短路径,从知识图谱获取信息并根据获取的信息响应用户请求。S27: Obtain information from the knowledge graph according to the shortest path of each node and respond to the user request according to the obtained information.
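The selection in step S23 and the branch condition in step S24 can be sketched as below. This is an illustrative reading, not the claimed implementation: the function names are hypothetical, `f` and `g` are assumed to be dictionaries holding the heuristic and determined path lengths, `frontier` stands for the union of the first node set and the candidate node set, and the behaviour for an empty frontier is an assumption because the claim does not address that case.

```python
def pick_pB(frontier, f, g):
    """Largest heuristic path length first, then smallest determined length."""
    return max(frontier, key=lambda n: (f[n], -g[n]))


def pick_pA(relaxation, f, g):
    """Smallest heuristic path length first, then largest determined length."""
    return min(relaxation, key=lambda n: (f[n], -g[n]))


def use_small_step(frontier, relaxation, f, g):
    """Return True to run the small-step algorithm, False to relax pA."""
    if not relaxation:
        return True
    if not frontier:
        # Assumption: with nothing to expand by small steps, fall back to
        # the relaxation branch. The claim does not cover this case.
        return False
    pB = pick_pB(frontier, f, g)
    pA = pick_pA(relaxation, f, g)
    if f[pB] < f[pA]:
        return True
    return f[pB] == f[pA] and g[pB] >= g[pA]


f = {"a": 5, "b": 7, "c": 7, "x": 7, "y": 6}
g = {"a": 1, "b": 4, "c": 2, "x": 3, "y": 5}
branch = use_small_step({"a", "b", "c"}, {"x", "y"}, f, g)
```

In this example pB is `c` (heuristic length 7) and pA is `y` (heuristic length 6), so the condition fails and the relaxation branch is taken.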
  9. 如权利要求8所述的方法,其特征在于,若所述最短路径查找策略为单源单目标正向查找,迭代起点为源节点,按照源节点至目标节点的正向查找方向执行步骤S22至步骤S25;The method of claim 8, wherein if the shortest path search strategy is a single-source single-target forward search, the iteration starting point is the source node, and steps S22 to S25 are performed in the forward search direction from the source node to the target node;
    若所述最短路径查找策略为单源单目标反向查找,迭代起点为目标节点,按照源节点至目标节点的反向查找方向执行步骤S22至步骤S25; If the shortest path search strategy is a single source single target reverse search, the starting point of the iteration is the target node, and steps S22 to S25 are executed according to the reverse search direction from the source node to the target node;
    若所述最短路径查找策略为单源单目标双向查找,迭代起点为源节点和目标节点,按照源节点至目标节点的正向查找方向及按照目标节点至源节点的反向查找方向分别执行步骤S22至步骤S25;If the shortest path search strategy is a single-source single-target bidirectional search, the iteration starting points are the source node and the target node, and steps S22 to S25 are performed separately in the forward search direction from the source node to the target node and in the reverse search direction from the target node to the source node;
    为各查找方向的已确定最短路径的节点关联各查找方向的迭代起点,为源节点和目标节点关联自身,从源节点到目标节点的最短路径的确定条件为存在节点同时关联源节点和目标节点。Each node whose shortest path has been determined in a search direction is associated with the iteration starting point of that direction, the source node and the target node are each associated with themselves, and the shortest path from the source node to the target node is deemed determined when some node is associated with both the source node and the target node.
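The meeting condition for the bidirectional search in claim 9 can be sketched as follows. The data layout (a mapping from each node to the set of iteration starting points associated with it) and the function names are assumptions chosen for illustration.

```python
from collections import defaultdict


def associate(origins, node, start):
    """Associate a node whose shortest path is determined with the iteration
    starting point of the search direction that determined it."""
    origins[node].add(start)


def path_found(origins, source, target):
    """True once some node is associated with both source and target."""
    return any({source, target} <= starts for starts in origins.values())


origins = defaultdict(set)
associate(origins, "s", "s")   # the source is associated with itself
associate(origins, "t", "t")   # the target is associated with itself
associate(origins, "a", "s")   # settled by the forward search
associate(origins, "b", "t")   # settled by the reverse search
```

At this point no node carries both associations; the condition becomes true as soon as, for example, the reverse search also settles `a`.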
  10. 如权利要求8所述的方法,其特征在于,步骤S25中从所述第一节点集及待选节点集中筛选出超出第二预定值的节点移动至松弛操作集中包括:The method of claim 8, wherein in step S25, filtering out nodes exceeding the second predetermined value from the first node set and the candidate node set and moving them to the relaxation operation set includes:
    按照如下方式确定第一节点集及待选节点集中的优秀节点:按照启发路径长度从小到大、已确定路径长度从大到小的顺序筛选出至多排名前第二预定值的已确定最短路径的节点;Determine the excellent nodes in the first node set and the candidate node set as follows: rank the nodes by heuristic path length in ascending order and then by determined path length in descending order, and select the top-ranked nodes with determined shortest paths, up to at most the second predetermined value;
    将第一节点集及待选节点集中的非优秀节点移动至松弛操作集中。Move the non-excellent nodes in the first node set and the candidate node set to the relaxation operation set.
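The trimming described in claim 10 resembles a bounded frontier. A minimal Python sketch, assuming `frontier` is the union of the first node set and the candidate node set held as one set, `f` and `g` are dictionaries of heuristic and determined path lengths, and `limit` is the second predetermined value (all names are hypothetical):

```python
def trim_frontier(frontier, relaxation, f, g, limit):
    """Keep at most `limit` excellent nodes in the frontier and move the
    remaining nodes to the relaxation operation set.

    Ranking: heuristic path length ascending, then determined path length
    descending, as stated in claim 10.
    """
    ranked = sorted(frontier, key=lambda n: (f[n], -g[n]))
    overflow = ranked[limit:]           # empty when the frontier fits
    frontier.difference_update(overflow)
    relaxation.update(overflow)
    return overflow


f = {"a": 3, "b": 3, "c": 5, "d": 4}
g = {"a": 1, "b": 2, "c": 0, "d": 0}
frontier, relaxation = {"a", "b", "c", "d"}, set()
moved = trim_frontier(frontier, relaxation, f, g, limit=2)
```

With a limit of 2, nodes `b` and `a` rank first and stay in the frontier, while `d` and `c` are moved to the relaxation operation set.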
  11. 如权利要求10所述的方法,其特征在于,所述第二预定值确定策略包括:The method of claim 10, wherein the second predetermined value determination strategy includes:
    根据局部或全部知识图谱数据,设定固定的第二预定值;或Set a fixed second predetermined value based on partial or all knowledge graph data; or
    根据迭代起点和迭代终点的不同,设定固定的第二预定值;或Set a fixed second predetermined value based on the difference between the iteration starting point and the iteration end point; or
    根据知识图谱数据与应用场景设计第二预定值的上限和下限,第二预定值初始化时可设为下限和上限之间的一个值,根据下述策略动态调整第二预定值:在每个连续的K0次迭代中,如果松弛操作集向待选节点集移动的节点数量小于K1,第二预定值降低M,如果松弛操作集向待选节点集移动的节点数量大于K2,第二预定值增加M,其中,K0、K1、M为正整数。Set an upper limit and a lower limit for the second predetermined value based on the knowledge graph data and the application scenario; the second predetermined value may be initialized to a value between the lower limit and the upper limit and is then adjusted dynamically according to the following strategy: in every K0 consecutive iterations, if the number of nodes moved from the relaxation operation set to the candidate node set is less than K1, the second predetermined value is decreased by M; if that number is greater than K2, the second predetermined value is increased by M, where K0, K1, and M are positive integers.
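The dynamic strategy in claim 11 can be sketched as a small update rule applied once per window of K0 iterations. The function name and parameter names are hypothetical. Clamping the result to the lower and upper limits, and the assumption that K1 does not exceed K2, are readings of the claim rather than statements made in it.

```python
def adjust_limit(limit, moved_count, k1, k2, m, lower, upper):
    """Return the new second predetermined value after one window.

    `moved_count` is the number of nodes moved from the relaxation operation
    set to the candidate node set during the last K0 iterations.
    """
    if moved_count < k1:
        limit -= m
    elif moved_count > k2:
        limit += m
    # Assumption: the value is kept within the configured bounds.
    return max(lower, min(upper, limit))


limit = adjust_limit(10, moved_count=1, k1=2, k2=6, m=3, lower=4, upper=12)
```

With few promotions in the window (1 < K1 = 2), the value drops from 10 to 7.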
  12. 如权利要求10所述的方法,其特征在于,还包括:The method of claim 10, further comprising:
    当所述第一节点集发生变化时,从松弛操作集和待选节点集中移出第一交集节点,其中,所述第一交集节点为第一节点集与松弛操作集和待选节点集的交集节点;When the first node set changes, the first intersection node is removed from the relaxation operation set and the candidate node set, wherein the first intersection node is the intersection of the first node set, the relaxation operation set and the candidate node set node;
    当所述松弛操作集发生变化时,从第一节点集和待选节点集中移出第二交集节点,其中,所述第二交集节点为松弛操作集与第一节点集和待选节点集的交集节点。When the relaxation operation set changes, a second intersection node is removed from the first node set and the candidate node set, wherein the second intersection node is the intersection of the relaxation operation set and the first node set and the candidate node set node.
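Claim 12 keeps the three node sets from overlapping after a change. A minimal sketch using Python sets, with hypothetical function names:

```python
def on_first_set_changed(first, candidates, relaxation):
    """Remove first-set members from the other two sets."""
    candidates.difference_update(first)
    relaxation.difference_update(first)


def on_relaxation_changed(first, candidates, relaxation):
    """Remove relaxation-set members from the other two sets."""
    first.difference_update(relaxation)
    candidates.difference_update(relaxation)


first, candidates, relaxation = {"a", "b"}, {"b", "c"}, {"a", "d"}
on_first_set_changed(first, candidates, relaxation)
```

After the call, `b` has left the candidate node set and `a` has left the relaxation operation set, so neither set shares a node with the first node set.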
  13. 如权利要求8所述的方法,其特征在于,节点的最短路径包括:节点的前驱节点及节点的已确定路径长度; The method of claim 8, wherein the shortest path of a node includes: a predecessor node of the node and the determined path length of the node;
    迭代起点的前驱节点为空;The predecessor node of the iteration starting point is empty;
    所述节点已确定路径长度等于所述节点的前驱节点已确定路径长度加上所述节点与所述节点的前驱节点间邻接边权重之和,所述节点至少包括一个前驱节点。The determined path length of the node is equal to the determined path length of the node's predecessor node plus the sum of the adjacent edge weights between the node and the node's predecessor node, and the node includes at least one predecessor node.
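Because claim 13 stores each shortest path as predecessor nodes plus a determined path length, full paths can be rebuilt by walking predecessors back to the iteration starting point, whose predecessor list is empty. The sketch below is illustrative only; the function name and the `preds` dictionary are hypothetical, and it assumes the predecessor relation contains no cycles (which holds when all edge weights are positive).

```python
def all_shortest_paths(preds, node):
    """Enumerate every shortest path ending at `node`.

    `preds` maps node -> list of predecessor nodes; the iteration starting
    point maps to an empty list.
    """
    if not preds.get(node):
        return [[node]]
    paths = []
    for p in preds[node]:
        for path in all_shortest_paths(preds, p):
            paths.append(path + [node])
    return paths


preds = {"s": [], "a": ["s"], "b": ["s"], "t": ["a", "b"]}
paths = all_shortest_paths(preds, "t")
```

Here `t` has two predecessors, so two equally short paths are returned: `s, a, t` and `s, b, t`.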
  14. 如权利要求8所述的方法,其特征在于,所述边缘节点的启发路径长度利用如下公式计算:The method of claim 8, wherein the heuristic path length of the edge node is calculated using the following formula:
    f(n)=g(n)+h(n);
    其中,f(n)为边缘节点n的启发路径长度,n为边缘节点,h(n)为边缘节点n到达迭代终点的预测路径长度,h为估计函数,g(n)为边缘节点n距离迭代起点的已确定路径长度。where n is an edge node, f(n) is the heuristic path length of edge node n, h(n) is the predicted path length from edge node n to the iteration end point, h is the estimation function, and g(n) is the determined path length from the iteration starting point to edge node n.
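The formula f(n)=g(n)+h(n) can be evaluated as in the following sketch. The claim does not specify the estimation function h; the straight-line distance over node coordinates used here is purely an assumption for illustration, and the names `straight_line`, `heuristic_path_length`, and `coords` are hypothetical.

```python
import math


def straight_line(coords, goal):
    """Build an estimation function h from 2-D coordinates (assumed data)."""
    def h(n):
        (x1, y1), (x2, y2) = coords[n], coords[goal]
        return math.hypot(x2 - x1, y2 - y1)
    return h


def heuristic_path_length(n, g, h):
    """f(n) = g(n) + h(n)."""
    return g[n] + h(n)


coords = {"n": (0.0, 0.0), "t": (3.0, 4.0)}
g = {"n": 2.0}
f_n = heuristic_path_length("n", g, straight_line(coords, "t"))
```

With a determined length of 2.0 and a straight-line estimate of 5.0 to the iteration end point, the heuristic path length of `n` is 7.0.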
  15. 如权利要求8所述方法,其特征在于,步骤S21中初始地设定第一节点集包括迭代起点时,还设定第一节点集关联的累计移动步长为零;The method according to claim 8, characterized in that when initially setting the first node set to include the iteration starting point in step S21, the cumulative movement step associated with the first node set is also set to zero;
    步骤S24中还更新第一节点集关联的累计移动步长;In step S24, the cumulative movement step associated with the first node set is also updated;
    步骤S25中从松弛操作集中筛选出优秀节点移动至待选节点集中的执行条件包括:In step S25, the execution conditions for selecting excellent nodes from the slack operation set and moving them to the candidate node set include:
    所述第一节点集和待选节点集为空;或The first node set and the candidate node set are empty; or
    所述优秀节点的启发路径长度小于所述第一节点集和待选节点集中节点最大的启发路径长度且所述优秀节点的已确定路径长度大于或等于第一节点集关联的累计移动步长。The heuristic path length of the excellent node is less than the maximum heuristic path length of the nodes in the first node set and the candidate node set, and the determined path length of the excellent node is greater than or equal to the cumulative movement step associated with the first node set.
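The execution condition in claim 15 for promoting an excellent node can be written as a single predicate. In this sketch `frontier` again stands for the union of the first node set and the candidate node set, `step` is the cumulative movement step associated with the first node set, and the function name is hypothetical.

```python
def may_promote(excellent, frontier, f, g, step):
    """True when the excellent node may move to the candidate node set."""
    if not frontier:
        return True
    worst_f = max(f[n] for n in frontier)
    return f[excellent] < worst_f and g[excellent] >= step


f = {"a": 4, "b": 6, "x": 5}
g = {"a": 1, "b": 2, "x": 3}
ok = may_promote("x", {"a", "b"}, f, g, step=3)
```

Node `x` qualifies here because its heuristic path length (5) is below the largest one in the frontier (6) and its determined path length (3) is not less than the cumulative movement step (3).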
  16. 如权利要求8所述的方法,其特征在于,步骤S24的小步算法中从第一节点集中第一节点的未确定最短路径的邻接节点中,选择距离迭代起点最近的邻接节点确定最短路径并更新第一节点集包括:The method according to claim 8, characterized in that in the small step algorithm of step S24, from the adjacent nodes of the first node in the first node set for which the shortest path has not been determined, the adjacent node closest to the iteration starting point is selected to determine the shortest path and Update the first node set to include:
    S241,当所述第一节点集为空时,将待选节点集中已确定路径长度最小的节点移动至所述第一节点集并设置所述第一节点集关联的累计移动步长为移动节点的已确定路径长度;S241. When the first node set is empty, move the node with the smallest determined path length in the candidate node set to the first node set and set the cumulative movement step associated with the first node set as the mobile node. The determined path length;
    S242,为所述第一节点集中未确定相关信息的第一节点确定相关信息并从所述第一节点集中移除无相关信息的第一节点,所述第一节点的相关信息包括:第二节点、待处理邻接边及第二节点预计最短路径长度;所述待处理邻接边为所述第一节点与其未确定最短路径的邻接节点之间的所有邻接边中满足如下条件的邻接边:该邻接边权重为所述所有邻接边的最小值,且该邻接边权重与所述第一节点的已确定路径长度之和大于第一节点集关联的累计移动步长;所述第二节点为所述待处理邻接边的非第一节点;所述第二节点预计最短路径长度等于所述待处理邻接边的权重加上所述第一节点的已确定路径长度;S242: determine relevant information for each first node in the first node set whose relevant information has not been determined, and remove from the first node set any first node that has no relevant information. The relevant information of a first node includes a second node, a pending adjacent edge, and an estimated shortest path length of the second node. The pending adjacent edge is an edge, among all adjacent edges between the first node and its adjacent nodes whose shortest paths are undetermined, that satisfies the following conditions: its weight is the minimum among all of those adjacent edges, and the sum of its weight and the determined path length of the first node is greater than the cumulative movement step associated with the first node set. The second node is the endpoint of the pending adjacent edge other than the first node; the estimated shortest path length of the second node equals the weight of the pending adjacent edge plus the determined path length of the first node;
    S243,从所述第一节点集中筛选出第二节点预计最短路径长度最小的第一节点;根据所述筛选出的第一节点,从待选节点集中选择节点移动至第一节点集中;S243, select the first node with the smallest expected shortest path length of the second node from the first node set; select a node from the candidate node set to move to the first node set according to the screened out first node;
    S244,筛选出的第一节点相关的第二节点为距离迭代起点最近的邻接节点;S244, the second node related to the filtered first node is the adjacent node closest to the iteration starting point;
    S245,更新第一节点集关联的累计移动步长为所述筛选出的第一节点关联的第二节点预计最短路径长度;S245, update the cumulative movement step length associated with the first node set to the estimated shortest path length of the second node associated with the filtered first node;
    S246,对所述筛选出的每一第一节点相关的第二节点、待处理邻接边及第二节点预计最短路径长度,执行如下处理:S246: Perform the following processing on the second node, the adjacent edge to be processed and the estimated shortest path length of the second node related to each filtered out first node:
    S2461,当该第二节点预计最短路径长度等于该第二节点的已确定路径长度时,把所述第二节点作为后继节点,根据所述第一节点及其所述后继节点为所述后继节点新增一条最短路径,确定该第一节点的相关信息,若无相关信息,则从所述第一节点集中移除该第一节点;S2461: when the estimated shortest path length of the second node equals the determined path length of the second node, take the second node as a successor node, add a shortest path for the successor node according to the first node and its successor node, and determine relevant information for the first node; if it has no relevant information, remove the first node from the first node set;
    S2462,当该第二节点预计最短路径长度小于该第二节点的已确定路径长度时,把所述第二节点作为后继节点,根据所述第一节点及其所述后继节点为所述后继节点新增一条最短路径,确定该第一节点的相关信息,若无相关信息,则从所述第一节点集中移除该第一节点,将该第二节点新增为第一节点加入至所述第一节点集中,确定新增第一节点的相关信息,若无相关信息,则从所述第一节点集中移除该新增第一节点。S2462: when the estimated shortest path length of the second node is less than the determined path length of the second node, take the second node as a successor node, add a shortest path for the successor node according to the first node and its successor node, and determine relevant information for the first node; if it has no relevant information, remove the first node from the first node set. Then add the second node to the first node set as a new first node and determine relevant information for the newly added first node; if it has no relevant information, remove the newly added first node from the first node set.
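Step S242 of the small-step algorithm can be sketched as below. This follows one literal reading of the claim, in which the pending adjacent edge must be the minimum-weight edge to an undetermined neighbor and its sum with the first node's determined path length must exceed the cumulative movement step. The function name, the nested-dictionary adjacency layout, and returning all tied second nodes are assumptions for illustration.

```python
def determine_related_info(first, adj, g, settled, step):
    """Return (second_nodes, estimate) for `first`, or None when the first
    node has no relevant information and should leave the first node set.

    adj:     node -> {neighbor: edge weight}
    g:       node -> determined path length
    settled: nodes whose shortest path is already determined
    step:    cumulative movement step associated with the first node set
    """
    open_edges = [(w, v) for v, w in adj[first].items() if v not in settled]
    if not open_edges:
        return None
    w_min = min(w for w, _ in open_edges)
    estimate = g[first] + w_min          # estimated shortest path length
    if estimate <= step:
        return None
    seconds = sorted(v for w, v in open_edges if w == w_min)
    return seconds, estimate


adj = {"s": {"a": 2, "b": 2, "c": 5}}
info = determine_related_info("s", adj, {"s": 0}, settled={"s"}, step=0)
```

For the starting point `s`, the two lightest edges tie at weight 2, so both `a` and `b` are returned as second nodes with an estimated shortest path length of 2.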
  17. 如权利要求16所述的方法,其特征在于,步骤S243根据所述筛选出的第一节点,从待选节点集中选择节点移动至第一节点集中包括:The method of claim 16, wherein step S243, based on the filtered first node, selecting a node from the candidate node set to move to the first node set includes:
    当所述筛选出的第一节点相关的第二节点预计最短路径长度大于或等于待选节点集中节点的最小已确定路径长度时,按照如下步骤将待选节点集中节点移至第一节点集中:When the estimated shortest path length of the second node related to the filtered first node is greater than or equal to the minimum determined path length of the node in the candidate node set, follow the following steps to move the node in the candidate node set to the first node set:
    对待选节点集中最小已确定路径长度对应的每一个节点pX,判断所述筛选出的第一节点相关的第二节点中是否包含节点pX,For each node pX corresponding to the minimum determined path length in the set of nodes to be selected, determine whether the second node related to the filtered first node contains the node pX,
    若否,则将节点pX从待选节点集中移至第一节点集中,为节点pX确定相关信息,其中,节点pX相关的第二节点预计最短路径长度=节点pX的已确定路径长度+节点pX的待处理邻接边的权重,跳转到步骤S243再次执行该步骤, If not, move the node pX from the candidate node set to the first node set, and determine relevant information for the node pX, where the estimated shortest path length of the second node related to the node pX = the determined path length of the node pX + the node pX The weight of the adjacent edge to be processed, jump to step S243 and execute this step again,
    若是,则把节点pX从待选节点集移除,重复执行上述步骤,直到待选节点集中节点的已确定路径长度都大于所述筛选出的第一节点相关的第二节点预计最短路径长度。If so, remove the node pX from the candidate node set, and repeat the above steps until the determined path lengths of the nodes in the candidate node set are greater than the estimated shortest path length of the second node related to the filtered first node.
  18. 如权利要求16所述的方法,其特征在于,所述小步算法还包括:The method of claim 16, wherein the small step algorithm further includes:
    根据各第一节点的有序邻接边线性表,确定第三节点;Determine the third node according to the ordered adjacent edge linear list of each first node;
    如果该第一节点的已确定路径长度与该第一节点与其第三节点之间邻接边的权重之和等于该第三节点的已确定路径长度,则把所述第三节点作为后继节点,根据所述第一节点及其所述后继节点为所述后继节点新增一条最短路;否则,If the sum of the determined path length of the first node and the weight of the adjacent edge between the first node and its third node equals the determined path length of the third node, the third node is taken as a successor node, and a shortest path is added for the successor node according to the first node and its successor node; otherwise,
    如果该第一节点的已确定路径长度与该第一节点与其第三节点之间邻接边的权重之和等于第一节点集关联的累计移动步长,则把所述第三节点作为后继节点,根据所述第一节点及其所述后继节点为所述后继节点新增一条最短路,将该第三节点新增为第一节点加入至所述第一节点集中,确定新增第一节点的相关信息,若无相关信息,则从所述第一节点集中移除该新增第一节点。If the sum of the determined path length of the first node and the weight of the adjacent edge between the first node and its third node equals the cumulative movement step associated with the first node set, the third node is taken as a successor node, a shortest path is added for the successor node according to the first node and its successor node, the third node is added to the first node set as a new first node, and relevant information is determined for the newly added first node; if it has no relevant information, the newly added first node is removed from the first node set.
  19. 如权利要求16或18所述的方法,其特征在于,确定第一节点的相关信息之后还包括:The method according to claim 16 or 18, characterized in that after determining the relevant information of the first node, it further includes:
    判断是否至少两个第一节点的相关信息中存在相同的第二节点;Determine whether the same second node exists in the relevant information of at least two first nodes;
    若是,则比较具有相同第二节点的第一节点的第二节点预计最短路径长度;If so, compare the estimated shortest path lengths of the second node held by the first nodes that share that second node;
    若第一节点的第二节点预计最短路径长度不同,则仅保留最小第二节点预计最短路径长度的第一节点的相关信息,为其它第一节点重新确定相关信息。If those estimated shortest path lengths differ, keep the relevant information only for the first node with the smallest estimated shortest path length of the second node, and re-determine relevant information for the other first nodes.
  20. 如权利要求16或18所述的方法,其特征在于,根据所述第一节点及其所述后继节点为所述后继节点新增一条最短路径包括:The method of claim 16 or 18, wherein adding a shortest path to the successor node based on the first node and its successor node includes:
    所述后继节点的新增最短路径的前驱节点为所述第一节点;The predecessor node of the new shortest path of the successor node is the first node;
    所述后继节点的新增最短路径长度为所述第一节点与所述后继节点间邻接边的权重加上所述第一节点的已确定路径长度。The new shortest path length of the successor node is the weight of the adjacent edge between the first node and the successor node plus the determined path length of the first node.
  21. 一种基于知识图谱最短路径的信息获取装置,其特征在于,所述知识图谱中包括多个节点及节点间邻接边,所述节点间邻接边权重表示节点间距离,所述装置包括:An information acquisition device based on the shortest path of a knowledge graph, characterized in that the knowledge graph includes a plurality of nodes and adjacent edges between nodes, and the weight of the adjacent edges between nodes represents the distance between nodes, and the device includes:
    初始化单元,用于根据最短路径查找策略及用户请求中的节点,初始地设定迭代起点并确定其最短路径; The initialization unit is used to initially set the iteration starting point and determine the shortest path based on the shortest path search strategy and the nodes in the user request;
    最短路径确定单元,用于将已确定最短路径且有未确定最短路径的邻接节点的节点作为第一节点组成第一节点集,从第一节点集中第一节点的未确定最短路径的邻接节点中,选择距离迭代起点最近的邻接节点确定最短路径并更新第一节点集;The shortest path determination unit is configured to use nodes with determined shortest paths and adjacent nodes with undetermined shortest paths as first nodes to form a first node set, from the adjacent nodes of the first node in the first node set with undetermined shortest paths. , select the adjacent node closest to the iteration starting point to determine the shortest path and update the first node set;
    循环控制单元,用于重复启动所述最短路径确定单元,直至找到所有期望最短路径为止,其中,所述期望最短路径与最短路径查找策略相关;A loop control unit, configured to repeatedly start the shortest path determination unit until all expected shortest paths are found, wherein the expected shortest path is related to the shortest path search strategy;
    信息获取单元,用于根据各节点的最短路径,从知识图谱获取信息并根据获取的信息响应所述用户请求。An information acquisition unit is used to acquire information from the knowledge graph according to the shortest path of each node and respond to the user request according to the acquired information.
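For orientation, the four units of claim 21 (initialization, shortest path determination, loop control, information acquisition) follow the same outer shape as a conventional single-source search. The sketch below is a standard Dijkstra implementation with a binary heap, swapped in as a baseline for cross-checking results; it is not the claimed small-step method. It assumes positive edge weights and a nested-dictionary adjacency layout, and the function name is hypothetical.

```python
import heapq
import math


def dijkstra(adj, start):
    """Baseline single-source shortest paths (standard Dijkstra).

    adj maps node -> {neighbor: positive edge weight}.
    Returns (dist, preds), where preds keeps every equally short predecessor.
    """
    # Initialization: the starting point has length zero and no predecessor.
    dist = {start: 0.0}
    preds = {start: []}
    heap = [(0.0, start)]
    settled = set()

    # Loop control: repeat until no unsettled node remains reachable.
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue                      # stale heap entry
        settled.add(u)                    # shortest path of u is determined
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                preds[v] = [u]
                heapq.heappush(heap, (nd, v))
            elif nd == dist[v] and u not in preds[v]:
                preds[v].append(u)
    return dist, preds


adj = {
    "s": {"a": 1, "b": 4},
    "a": {"s": 1, "b": 3, "t": 5},
    "b": {"s": 4, "a": 3, "t": 1},
    "t": {"a": 5, "b": 1},
}
dist, preds = dijkstra(adj, "s")
```

In this graph `b` is reachable with length 4 both directly and through `a`, so it keeps two predecessors, and `t` is reached through `b` with length 5.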
  22. 一种基于知识图谱最短路径的信息获取装置,其特征在于,所述知识图谱中包括多个节点及节点间邻接边,所述节点间邻接边权重表示节点间距离,所述装置包括:An information acquisition device based on the shortest path of a knowledge graph, characterized in that the knowledge graph includes a plurality of nodes and adjacent edges between nodes, and the weight of the adjacent edges between nodes represents the distance between nodes, and the device includes:
    初始化单元,用于根据最短路径查找策略及用户请求中的源节点和目标节点,初始地设定迭代起点及边缘节点,边缘节点包括已确定最短路径且有未确定最短路径的邻接节点的第一节点集中节点、可加入第一节点集的未确定最短路径的待选节点集中节点、通过松弛操作得到路径但未确定最短路径的松弛操作集中节点;初始地设定第一节点集包括迭代起点并确定其最短路径,待选节点集及松弛操作集为空;初始地设定迭代起点的已确定路径长度为零,其余节点的已确定路径长度为无穷大;An initialization unit, configured to initially set the iteration starting point and the edge nodes according to the shortest path search strategy and the source node and target node in the user request. The edge nodes include: nodes in the first node set, which have a determined shortest path and at least one adjacent node whose shortest path is undetermined; nodes in the candidate node set, whose shortest path is undetermined and which may be added to the first node set; and nodes in the relaxation operation set, for which a path has been obtained through a relaxation operation but whose shortest path is undetermined. Initially, the first node set contains the iteration starting point, whose shortest path is determined, and the candidate node set and the relaxation operation set are empty; the determined path length of the iteration starting point is initially set to zero, and the determined path lengths of all other nodes are set to infinity;
    计算单元,用于计算边缘节点的启发路径长度,其中,边缘节点的启发路径长度为从迭代起点到迭代终点经过边缘节点的路径长度;The calculation unit is used to calculate the heuristic path length of the edge node, where the heuristic path length of the edge node is the path length from the iteration starting point to the iteration end point passing through the edge node;
    筛选单元,用于根据所述启发路径长度和边缘节点已确定路径长度,从第一节点集及待选节点集中筛选出启发路径长度最大的节点中已确定路径长度最小的节点pB,从松弛操作集中筛选出启发路径长度最小的节点中已确定路径长度最大的节点pA;A screening unit, configured to select, according to the heuristic path lengths and the determined path lengths of the edge nodes, from the first node set and the candidate node set the node pB that has the smallest determined path length among the nodes with the largest heuristic path length, and to select from the relaxation operation set the node pA that has the largest determined path length among the nodes with the smallest heuristic path length;
    算法选择单元,用于如果松弛操作集为空或节点pB的启发路径长度小于节点pA的启发路径长度,或节点pB的启发路径长度等于节点pA的启发路径长度且节点pB的已确定路径长度大于等于节点pA的已确定路径长度,则执行如下小步算法:从第一节点集中第一节点的未确定最短路径的邻接节点中,选择距离迭代起点最近的邻接节点确定最短路径并更新第一节点集;An algorithm selection unit, configured to execute the following small-step algorithm if the relaxation operation set is empty, or the heuristic path length of node pB is less than that of node pA, or the heuristic path length of node pB equals that of node pA and the determined path length of node pB is greater than or equal to that of node pA: from the adjacent nodes of the first nodes in the first node set whose shortest paths are undetermined, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set;
    否则,基于松弛操作计算所述松弛操作集中节点pA的所有邻接节点的路径并更新所述松弛操作集;Otherwise, calculate the paths of all adjacent nodes of the node pA in the relaxation operation set based on the relaxation operation and update the relaxation operation set;
    节点移动单元,用于从松弛操作集中筛选出优秀节点移动至待选节点集中,所述优秀节点为松弛操作集中启发路径长度最小的节点中已确定路径长度最大的节点pA2;A node moving unit, configured to select an excellent node from the relaxation operation set and move it to the candidate node set, the excellent node being the node pA2 that has the largest determined path length among the nodes with the smallest heuristic path length in the relaxation operation set;
    当所述第一节点集及待选节点集中节点数量大于第二预定值时,从所述第一节点集及待选节点集中筛选出超出第二预定值的节点移动至松弛操作集中;When the number of nodes in the first node set and the candidate node set is greater than a second predetermined value, the nodes exceeding the second predetermined value are filtered out from the first node set and the candidate node set and moved to the relaxation operation set;
    循环控制单元,用于重复启动计算单元、筛选单元、算法选择单元及节点移动单元,直至找到从源节点到目标节点的最短路径且边缘节点的启发路径长度都大于从源节点到目标节点的最短路径的长度;The loop control unit is used to repeatedly start the calculation unit, filtering unit, algorithm selection unit and node moving unit until the shortest path from the source node to the target node is found and the heuristic path length of the edge node is greater than the shortest path from the source node to the target node. the length of the path;
    信息获取单元,用于根据各节点的最短路径,从知识图谱获取信息并根据获取的信息响应用户请求。The information acquisition unit is used to obtain information from the knowledge graph according to the shortest path of each node and respond to user requests based on the obtained information.
  23. 一种计算机设备,包括存储器、处理器及存储在存储器上并可在处理器上运行的计算机程序,其特征在于,所述处理器执行所述计算机程序时实现权利要求1至20任一项所述方法。A computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that when the processor executes the computer program, it implements any one of claims 1 to 20. described method.
  24. 一种计算机存储介质,其上存储有计算机程序,其特征在于,所述计算机程序被计算机设备的处理器运行时,执行根据权利要求1至20任一项所述方法的指令。A computer storage medium storing a computer program, wherein the computer program, when run by a processor of a computer device, executes instructions of the method according to any one of claims 1 to 20.
PCT/CN2023/110611 2022-08-31 2023-08-01 Information acquisition method and apparatus based on shortest path in knowledge graph WO2024046013A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211058393.2 2022-08-31
CN202211058393.2A CN116319518B (en) 2022-08-31 2022-08-31 Information acquisition method and device based on shortest path of knowledge graph

Publications (1)

Publication Number Publication Date
WO2024046013A1 (en) 2024-03-07

Family

ID=86831055

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/110611 WO2024046013A1 (en) 2022-08-31 2023-08-01 Information acquisition method and apparatus based on shortest path in knowledge graph

Country Status (2)

Country Link
CN (1) CN116319518B (en)
WO (1) WO2024046013A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116319518B (en) * 2022-08-31 2024-02-20 王举范 Information acquisition method and device based on shortest path of knowledge graph
CN116562488B (en) * 2023-07-05 2024-02-13 腾讯科技(深圳)有限公司 Method, apparatus, computer device, medium and program product for generating flow guiding island

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102662974A (en) * 2012-03-12 2012-09-12 浙江大学 A network graph index method based on adjacent node trees
JP2012253541A (en) * 2011-06-02 2012-12-20 Nippon Telegr & Teleph Corp <Ntt> Route determination device and route determination method
US20160189028A1 (en) * 2014-12-31 2016-06-30 Verizon Patent And Licensing Inc. Systems and Methods of Using a Knowledge Graph to Provide a Media Content Recommendation
US20210192371A1 (en) * 2019-12-20 2021-06-24 Fujitsu Limited Computer-readable recording medium, information processing apparatus, and data generating method
CN113808424A (en) * 2021-09-28 2021-12-17 合肥工业大学 Method for acquiring K shortest paths of urban road network based on bidirectional Dijkstra
CN114896377A (en) * 2022-04-07 2022-08-12 东南大学 Knowledge graph-based answer acquisition method
CN116319518A (en) * 2022-08-31 2023-06-23 王举范 Information acquisition method and device based on shortest path of knowledge graph

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9253079B2 (en) * 2013-10-11 2016-02-02 Telefonaktiebolaget L M Ericsson (Publ) High performance LFA path algorithms
CN106096783A (en) * 2016-06-13 2016-11-09 Tcl集团股份有限公司 A kind of method for optimizing route based on Dijkstra and system thereof
CN111966912B (en) * 2020-09-02 2023-03-10 深圳壹账通智能科技有限公司 Recommendation method and device based on knowledge graph, computer equipment and storage medium
CN113781817B (en) * 2021-09-28 2022-07-05 合肥工业大学 Urban road network multisource shortest path obtaining method based on shared computation
CN114860886B (en) * 2022-05-25 2023-07-18 北京百度网讯科技有限公司 Method for generating relationship graph and method and device for determining matching relationship

Non-Patent Citations (2)

Title
DEMAINE ERIK, KU JASON, SOLOMON JUSTIN: "Lecture 11: Weighted Shortest Paths", INTRODUCTION TO ALGORITHMS: 6.006, MASSACHUSETTS INSTITUTE OF TECHNOLOGY, 1 January 2020 (2020-01-01), pages 1 - 5, XP093145015, Retrieved from the Internet <URL:https://ocw.mit.edu/courses/6-006-introduction-to-algorithms-spring-2020/aa57a9785adf925bc85c1920f53755a0_MIT6_00> [retrieved on 20240325] *
MANRIQUE RUBÉN, MARINO OLGA: "Knowledge graph-based weighting strategies for a scholarly paper recommendation scenario", KARS'18, vol. 2290, 7 October 2018 (2018-10-07), XP093145018 *

Also Published As

Publication number Publication date
CN116319518B (en) 2024-02-20
CN116319518A (en) 2023-06-23

Similar Documents

Publication Publication Date Title
WO2024046013A1 (en) Information acquisition method and apparatus based on shortest path in knowledge graph
US11416505B2 (en) Querying an archive for a data store
US11310313B2 (en) Multi-threaded processing of search responses returned by search peers
US10698777B2 (en) High availability scheduler for scheduling map-reduce searches based on a leader state
US10599308B2 (en) Executing search commands based on selections of time increments and field-value pairs
US11704341B2 (en) Search result replication management in a search head cluster
US10506084B2 (en) Timestamp-based processing of messages using message queues
US20200117674A1 (en) Creating a correlation search
Bertsekas Rollout algorithms for discrete optimization: A survey
US20160034525A1 (en) Generation of a search query to approximate replication of a cluster of events
US20160019316A1 (en) Wizard for creating a correlation search
Trovato et al. Differential a
JP5845716B2 (en) Program for determining route, information processing method and apparatus
Harde et al. Design and implementation of ACO feature selection algorithm for data stream mining
Thirugnanasambandam et al. Greedy based population seeding technique in ant colony optimization algorithm
Rutgers et al. Powerful and efficient bulk shortest-path queries: cypher language extension & Giraph implementation
Mishali et al. elinda: Explorer for linked data
CN114812593A (en) Method and device for generating vehicle path, storage medium, processor and electronic device
Çakır et al. A* Algorithm Under Single-Valued Neutrosophic Fuzzy Environment
Singla A Review: Frequent Pattern Mining Techniques in Static and Stream Data Environment

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application. Ref document number: 23859048; Country of ref document: EP; Kind code of ref document: A1