WO2024046013A1

WO2024046013A1 - Information acquisition method and apparatus based on shortest path in knowledge graph

Info

Publication number: WO2024046013A1
Application number: PCT/CN2023/110611
Authority: WO
Inventors: 王举范
Original assignee: 王举范
Priority date: 2022-08-31
Filing date: 2023-08-01
Publication date: 2024-03-07
Also published as: CN116319518B; CN116319518A

Abstract

Provided in the present application are an information acquisition method and apparatus based on the shortest path in a knowledge graph. The method comprises: according to a shortest-path searching strategy and nodes in a user request, initially setting an iteration starting point and determining the shortest path of same; taking nodes, the shortest paths of which have been determined and which have adjacency nodes, the shortest paths of which are not determined, as first nodes to form a set of first nodes, selecting an adjacency node, which is the closest to the iteration starting point, from among the adjacency nodes, the shortest paths of which are not determined, of the first nodes in the set of first nodes, determining the shortest path of the adjacency node, and updating the set of first nodes; repeating the above steps until all desired shortest paths are found, wherein the desired shortest paths are related to the shortest-path searching strategy; and acquiring information from a knowledge graph according to the shortest paths of the nodes, and responding to the user request according to the acquired information. By means of the present application, the shortest paths of one or more nodes can be calculated in each iteration, and the shortest paths of other nodes are approached stably.

Description

An information acquisition method and device based on the shortest path of knowledge graph

This application claims priority to the Chinese patent application filed on August 31, 2022, with application number 202211058393.2 and the invention title "An information acquisition method and device based on the shortest path of the knowledge graph". The entire contents of that application are incorporated herein by reference.

Technical field

This article relates to the field of computers, and in particular to an information acquisition method and device based on the shortest path of a knowledge graph.

Background

Knowledge graphs use graph models to describe knowledge, model the relationships between things, and use organizational principles to enable users or computer systems to perform knowledge inferences based on underlying data.

Nodes in the knowledge graph correspond to entities, and directed edges correspond to relationships between entities. Entities are connected to each other through relationships, and understanding the relationships between entities is the basis for knowledge graph analysis. After the relationships between entities are quantified, a shortest path algorithm can be used to calculate the closeness of the relationship between entities and to obtain a shortest path tree or a single shortest path. This gives a more intuitive and in-depth understanding of the relationships between entities, supports the discovery of hidden facts, and provides a regular supply of high-quality data.

In the existing technology, the shortest path algorithms commonly used for mining information in knowledge graphs are mainly based on Dijkstra's algorithm, which relies on relaxation operations.

In Dijkstra's algorithm, let S be the set of nodes whose shortest paths have been found, and Q be the set of nodes whose shortest paths have not yet been found. Aiming to minimize the path weight, Dijkstra's algorithm takes the node u most recently added to S and examines each node v that is connected to u and not in S: when dist[u]+length(u,v)<dist[v], the weight value dist[v] of v is replaced by dist[u]+length(u,v), and the path predecessor of v, pre[v], is set to u. Based on this, Dijkstra's algorithm has the following flaws:

(1) Dijkstra's algorithm uses relaxation operations to continuously select, for nodes in Q, the path with the smaller weight value among multiple paths, and therefore repeatedly updates the path information of certain nodes in Q. Specifically: the path information of a node in Q may be updated in multiple iterations; and the degree of perturbation to dist in each iteration is positively correlated with the out-degree of u. These two points greatly increase the cost of maintaining Q in order to produce each new u.

(2) The adjacent edges of node u require a lot of conditional judgments and rely heavily on the arithmetic logic unit of the CPU. Parallel computing cannot be efficiently used to improve the efficiency of the computer system in running the shortest path algorithm.
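For reference, the relaxation operation described above can be sketched as follows. This is a minimal textbook-style illustration, not code from the application: the function name `dijkstra`, the adjacency-list layout, and the use of a binary heap to stand in for Q are all assumptions made for the example.

```python
import heapq

def dijkstra(graph, source):
    """graph: {node: [(neighbor, weight), ...]} with non-negative weights."""
    dist = {node: float("inf") for node in graph}
    pre = {node: None for node in graph}
    dist[source] = 0
    done = set()            # S: nodes whose shortest path is final
    heap = [(0, source)]    # priority queue standing in for Q
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue        # stale entry left behind by an earlier relaxation
        done.add(u)
        for v, length in graph[u]:
            # Relaxation: a shorter path to v through u replaces dist[v].
            if v not in done and dist[u] + length < dist[v]:
                dist[v] = dist[u] + length
                pre[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, pre
```

On the graph `{"A": [("B", 3), ("C", 10)], "B": [("C", 2)], "C": []}`, the entry for C is written twice (first through A, then through B), which is the kind of repeated update that flaw (1) refers to.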

Summary of the invention

This article is used to solve the problem of low computational efficiency in the shortest path determination process based on knowledge graphs in the existing technology.

In order to solve the above technical problems, this article provides an information acquisition method based on the shortest path of the knowledge graph. The knowledge graph includes multiple nodes and adjacent edges between nodes. The weight of the adjacent edges between nodes represents the distance between nodes. The method includes:

S11, according to the shortest path search strategy and the nodes in the user request, initially set the iteration starting point and determine its shortest path;

S12, take the nodes that have determined shortest paths and have adjacent nodes with undetermined shortest paths as first nodes to form a first node set; from among the adjacent nodes, with undetermined shortest paths, of the first nodes in the first node set, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set and the determined path length of the newly added first node;

S13, repeat the above step S12 until all expected shortest paths are found, where the expected shortest paths are related to the shortest path search strategy;

S14, according to the shortest path of each node, obtain information from the knowledge graph and respond to the user request according to the obtained information.

On the other hand, this article provides a device for determining the shortest path of a knowledge graph. The knowledge graph includes multiple nodes and adjacent edges between nodes. The weight of the adjacent edges between nodes represents the distance between nodes. The device includes:

The initialization unit is used to initially set the iteration starting point and determine its shortest path based on the shortest path search strategy and the nodes in the user request;

The shortest path determination unit is configured to take the nodes that have determined shortest paths and have adjacent nodes with undetermined shortest paths as first nodes to form a first node set, and, from among the adjacent nodes, with undetermined shortest paths, of the first nodes in the first node set, to select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set;

A loop control unit, configured to repeatedly start the shortest path determination unit until all expected shortest paths are found, wherein the expected shortest path is related to the shortest path search strategy;

An information acquisition unit is used to acquire information from the knowledge graph according to the shortest path of each node and respond to the user request according to the acquired information.

By avoiding relaxation operations, the above two embodiments each time select the adjacent node closest to the iteration starting point and determine its shortest path (referred to as the small step algorithm), so that the shortest paths of one or more nodes can be calculated in each iteration while the shortest paths of other nodes are steadily approached. The number of adjacent edges of a node does not directly affect the time complexity, which improves the efficiency of determining shortest paths.

On the other hand, this article provides a method for determining the shortest path of the knowledge graph. The knowledge graph includes multiple nodes and adjacent edges between nodes. The weight of the adjacent edges between nodes represents the distance between nodes. The method includes:

S21, according to the shortest path search strategy and the source node and target node in the user request, initially set the iteration starting point and the edge nodes. The edge nodes include: nodes of the first node set, which have determined shortest paths and have adjacent nodes with undetermined shortest paths; nodes of the candidate node set, which can be added to the first node set and whose shortest paths have not been determined; and nodes of the relaxation operation set, whose paths were obtained through the relaxation operation but whose shortest paths have not been determined. Initially, the first node set includes the iteration starting point, whose shortest path is determined, and the candidate node set and the relaxation operation set are empty; the determined path length of the iteration starting point is initially set to zero, and the determined path lengths of the remaining nodes are infinite;

S22, calculate the heuristic path length of the edge node, where the heuristic path length of the edge node is the path length from the iteration starting point to the iteration end point passing through the edge node;

S23, according to the heuristic path lengths and the determined path lengths of the edge nodes, screen out, from the first node set and the candidate node set, the node pB with the smallest determined path length among the nodes with the largest heuristic path length, and screen out, from the relaxation operation set, the node pA with the largest determined path length among the nodes with the smallest heuristic path length;

S24, if the relaxation operation set is empty, or the heuristic path length of node pB is less than the heuristic path length of node pA, or the heuristic path length of node pB is equal to the heuristic path length of node pA and the determined path length of node pB is greater than or equal to the determined path length of node pA, execute the following small step algorithm: from among the adjacent nodes, with undetermined shortest paths, of the first nodes in the first node set, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set and the determined path length of the newly added first node;

Otherwise, calculate the paths of all adjacent nodes of the node pA in the relaxation operation set based on the relaxation operation, and update the relaxation operation set and the determined path lengths of the newly added relaxation nodes;

S25, select excellent nodes from the relaxation operation set and move them to the candidate node set. The excellent node is the node pA2 with the largest determined path length among the nodes with the smallest heuristic path length in the relaxation operation set;

When the number of nodes in the first node set and the candidate node set is greater than a second predetermined value, the nodes exceeding the second predetermined value are filtered out from the first node set and the candidate node set and moved to the relaxation operation set;

S26, repeat the above process from step S22 to step S25 until the shortest path from the source node to the target node is found and the heuristic path length of the edge node is greater than the length of the shortest path from the source node to the target node;

S27, obtain information from the knowledge graph according to the shortest path of each node and respond to the user request according to the obtained information.
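The screening of step S23 and the branch condition of step S24 can be sketched as a single decision function. This is only an illustration under stated assumptions: the name `choose_algorithm`, the dictionaries `heuristic` and `determined`, and the returned labels are hypothetical, and the combined first node set and candidate node set is assumed to be non-empty.

```python
def choose_algorithm(first_and_candidates, relaxation_set, heuristic, determined):
    """Return (branch, pB, pA), where branch is "small_step" or "relaxation".

    heuristic[n]  : heuristic path length of edge node n (hypothetical input)
    determined[n] : determined path length of edge node n (hypothetical input)
    """
    # pB: among the nodes with the largest heuristic path length,
    # the one with the smallest determined path length.
    pB = min(first_and_candidates, key=lambda n: (-heuristic[n], determined[n]))
    if not relaxation_set:
        return "small_step", pB, None
    # pA: among the nodes with the smallest heuristic path length,
    # the one with the largest determined path length.
    pA = min(relaxation_set, key=lambda n: (heuristic[n], -determined[n]))
    if heuristic[pB] < heuristic[pA] or (
        heuristic[pB] == heuristic[pA] and determined[pB] >= determined[pA]
    ):
        return "small_step", pB, pA
    return "relaxation", pB, pA
```

The function only decides which branch of S24 runs; the small step algorithm and the relaxation of pA's adjacent nodes would be carried out by the caller.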

The initialization unit is used to initially set the iteration starting point and the edge nodes according to the shortest path search strategy and the source node and target node in the user request. The edge nodes include: nodes of the first node set, which have determined shortest paths and have adjacent nodes with undetermined shortest paths; nodes of the candidate node set, which can be added to the first node set and whose shortest paths have not been determined; and nodes of the relaxation operation set, whose paths were obtained through the relaxation operation but whose shortest paths have not been determined. Initially, the first node set includes the iteration starting point, whose shortest path is determined, and the candidate node set and the relaxation operation set are empty; the determined path length of the iteration starting point is initially set to zero, and the determined path lengths of the remaining nodes are infinite;

The calculation unit is used to calculate the heuristic path length of the edge node, where the heuristic path length of the edge node is the path length from the iteration starting point to the iteration end point passing through the edge node;

A screening unit, configured to screen out, from the first node set and the candidate node set and based on the heuristic path lengths and the determined path lengths of the edge nodes, the node pB with the smallest determined path length among the nodes with the largest heuristic path length, and to screen out, from the relaxation operation set, the node pA with the largest determined path length among the nodes with the smallest heuristic path length;

An algorithm selection unit, configured to execute the following small step algorithm if the relaxation operation set is empty, or the heuristic path length of node pB is less than that of node pA, or the heuristic path length of node pB is equal to that of node pA and the determined path length of node pB is greater than or equal to that of node pA: from among the adjacent nodes, with undetermined shortest paths, of the first nodes in the first node set, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set and the determined path length of the newly added first node;

Otherwise, calculate the paths of all adjacent nodes of the node pA in the relaxation operation set based on the relaxation operation and update the relaxation operation set;

The node moving unit is used to select excellent nodes from the relaxation operation set and move them to the candidate node set. The excellent node is the node pA2 with the largest determined path length among the nodes with the smallest heuristic path length in the relaxation operation set;

The loop control unit is used to repeatedly start the calculation unit, screening unit, algorithm selection unit and node moving unit until the shortest path from the source node to the target node is found and the heuristic path length of the edge node is greater than the length of the shortest path from the source node to the target node;

The information acquisition unit is used to obtain information from the knowledge graph according to the shortest path of each node and respond to user requests based on the obtained information.

The above two embodiments use heuristic information as target guidance to combine the small step algorithm with the relaxation operation algorithm. This reduces the blindness of the small step algorithm's search, which arises from knowing nothing about the iteration end point during the search. In addition, the excellent nodes screened out by the relaxation operation serve as candidates for first nodes in the small step algorithm, which narrows the search range of the small step algorithm and yields the shortest path from the iteration starting point to the iteration end point more quickly.

On the other hand, this article also provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the method described in the aforementioned embodiments.

In another aspect, this article also provides a computer storage medium on which a computer program is stored. When the computer program is run by a processor of a computer device, the instructions of the method according to the foregoing embodiments are executed.

In order to make the above and other objects, features and advantages of this article more obvious and understandable, preferred embodiments are cited below and described in detail with the accompanying drawings.

Description of drawings

In order to illustrate the embodiments of this article or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of this article, and those of ordinary skill in the art can obtain other drawings based on these drawings without creative effort.

Figure 1 shows the first flow chart of the information acquisition method based on the shortest path of the knowledge graph in the embodiment of this article;

Figure 2 shows the first flow chart of the process of selecting the adjacent node closest to the iteration starting point to determine the shortest path and updating the first node set in the embodiment of this article;

Figure 3 shows a schematic diagram of a knowledge graph abstract graph according to Embodiment 1 of this article;

Figure 4 shows the second flow chart of the information acquisition method based on the shortest path of the knowledge graph in the embodiment of this article;

Figure 5 shows the second flow chart of the process of selecting the adjacent node closest to the iteration starting point to determine the shortest path and updating the first node set in the embodiment of this article;

Figure 6 shows a schematic diagram of another knowledge graph abstract graph according to the embodiment of this article;

Figure 7 shows a structural diagram of the information acquisition device based on the shortest path of the knowledge graph in the embodiment of this article;

Figure 8 shows another structural diagram of the information acquisition device based on the shortest path of the knowledge graph according to the embodiment of this article;

Figure 9 shows a structural diagram of a computer device according to an embodiment of this article.

Explanation of drawing symbols:
701. Initialization unit;
702. Shortest path determination unit;
703. Loop control unit;
704. Information acquisition unit;
801. Initialization unit;
802. Calculation unit;
803. Screening unit;
804. Algorithm selection unit;
805. Node moving unit;
806. Loop control unit;
807. Information acquisition unit;
902. Computer equipment;
904. Processor;
906. Memory;
908. Driving mechanism;
910. Input/output module;
912. Input device;
914. Output device;
916. Presentation equipment;
918. Graphical user interface;
920. Network interface;
922. Communication link;
924. Communication bus.

Detailed description

The technical solutions in the embodiments of this article will be clearly and completely described below with reference to the accompanying drawings in the embodiments of this article. Obviously, the described embodiments are only some of the embodiments of this article, rather than all of the embodiments. Based on the embodiments in this article, all other embodiments obtained by those of ordinary skill in the art without creative efforts fall within the scope of protection in this article.

It should be noted that the terms "first", "second", etc. in the description, claims and above-mentioned drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable under appropriate circumstances, so that the embodiments described herein can be practiced in sequences other than those illustrated or described herein. In addition, the terms "including" and "having" and any variations thereof are intended to cover non-exclusive inclusions. For example, a process, method, apparatus, product or equipment that includes a series of steps or units need not be limited to those steps or units explicitly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product or apparatus.

This specification provides method operation steps as described in the examples or flow charts, but more or fewer operation steps may be included based on routine or non-inventive efforts. The sequence of steps listed in the embodiments is only one of many possible execution sequences and does not represent the only execution sequence. When an actual system or device product is executed, the methods shown in the embodiments or drawings may be executed sequentially or in parallel.

Knowledge graphs can efficiently organize, manage and utilize massive amounts of information. They are widely used in many fields such as social networks, human resources and recruitment, finance, insurance, retail, advertising, communications, IT, manufacturing, media, medical care, e-commerce and logistics. For example, a knowledge graph can transform the Web from links between web pages to links between concepts (supporting retrieval by topic), thereby realizing true semantic retrieval. As another example, a search engine based on knowledge graphs can present structured knowledge to users graphically, allowing users to accurately locate and acquire in-depth knowledge without having to browse a large number of web pages.

The information acquisition method and device based on the shortest path of the knowledge graph described in this article can be applied to various fields in which information is acquired based on the shortest path of a knowledge graph, such as information retrieval and path planning (including robot navigation, vehicle navigation, etc.). This article does not limit the specific application fields of the information acquisition method and device based on the shortest path of the knowledge graph.

The knowledge graph described in this article includes multiple nodes and adjacent edges between nodes. The weight of the adjacent edges between nodes represents the length distance between nodes. The specific content included in the knowledge graph depends on the application field. The length distance between nodes is an abstract concept, which represents the cost, time, etc. of converting or obtaining information between nodes.

In an embodiment of this article, an information acquisition method based on the shortest path of the knowledge graph is provided to solve the problem of low computational efficiency in the shortest path determination process based on the knowledge graph in the existing technology. Specifically, as shown in Figure 1, the information acquisition method based on the shortest path of the knowledge graph includes:

Step S11: According to the shortest path search strategy and the nodes in the user request, initially set the iteration starting point and determine its shortest path.

Step S12: Take the nodes that have determined shortest paths and have adjacent nodes with undetermined shortest paths as first nodes to form a first node set; from among the adjacent nodes, with undetermined shortest paths, of the first nodes in the first node set, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set.

Step S13: Repeat the above step S12 until all expected shortest paths are found, where the expected shortest paths are related to the shortest path search strategy.

Step S14: Obtain information from the knowledge graph according to the shortest path of each node and respond to the user request according to the acquired information.

This embodiment can be applied to a client or a server. The client or server stores knowledge graph information (nodes and the connection relationships between nodes, where nodes represent entities and the connection relationships represent the associations between them), and the information obtained from the knowledge graph is displayed on the client. The client described in this article includes but is not limited to computer equipment, mobile terminals, etc., and can also be software installed on computer equipment, mobile terminals, etc.

This embodiment avoids relaxation operations and each time selects the adjacent node closest to the iteration starting point and determines its shortest path (referred to as the small step algorithm), so that the shortest paths of one or more nodes can be calculated in each iteration while the shortest paths of other nodes are steadily approached. The number of adjacent edges of a node does not directly affect the time complexity, which improves the efficiency of determining shortest paths.

Before this embodiment is carried out, a user request issued by the user operating a client needs to be received. The user request includes at least the starting node, so that information related to the starting node can be mined through the above steps S11 to S14.
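As a rough illustration of steps S11 to S13 for a single source search, the sketch below fixes one node per iteration and never revises a path length once it is written. It is a deliberately naive rendering, assuming non-negative edge weights: the function name `small_step_shortest_paths` and the data layout are hypothetical, and the scan over all first nodes in each iteration stands in for the more refined bookkeeping that the application describes.

```python
def small_step_shortest_paths(graph, start):
    """graph: {node: [(neighbor, weight), ...]}, weights assumed non-negative."""
    dist = {start: 0}       # determined path lengths (S11: starting point is 0)
    pre = {start: None}     # predecessor nodes
    first_nodes = {start}   # determined nodes that may still have undetermined neighbors
    while first_nodes:
        best = None         # (expected length, first node, adjacent node)
        for u in list(first_nodes):
            pending = [(dist[u] + w, u, v) for v, w in graph.get(u, []) if v not in dist]
            if not pending:
                first_nodes.discard(u)   # no undetermined adjacent node left
                continue
            candidate = min(pending, key=lambda t: t[0])
            if best is None or candidate[0] < best[0]:
                best = candidate
        if best is None:
            break           # S13: every expected shortest path has been found
        length, u, v = best
        dist[v], pre[v] = length, u      # S12: fixed once, never relaxed later
        first_nodes.add(v)
    return dist, pre
```

Nodes that cannot be reached from `start` simply never appear in `dist`, which plays the role of an infinite determined path length.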

In step S11, the shortest path search strategy at least includes single source search, single source single target forward search, single source single target reverse search, and single source single target bidirectional search. The process of initially setting the iteration starting point includes:

(1) If the shortest path search strategy is a single source search, set the source node in the user request as the iteration starting point, and execute steps S12 and S13 starting from the source node. The condition for determining that all expected shortest paths have been found is that no node with a determined shortest path has an adjacent node whose shortest path has not been determined.

(2) If the shortest path search strategy is a single source single target forward search, set the source node in the user request as the iteration starting point, and execute steps S12 and S13 according to the forward search direction from the source node to the target node.

(3) If the shortest path search strategy is a single source single target reverse search, set the target node in the user request as the iteration starting point, and execute steps S12 and S13 in the reverse search direction, from the target node to the source node.

(4) If the shortest path search strategy is a single source single target bidirectional search, set the source node and the target node in the user request as iteration starting points, and perform steps S12 and S13 separately in the forward search direction from the source node to the target node and in the reverse search direction from the target node to the source node.

For single source single target forward search, single source single target reverse search, and single source single target bidirectional search, associate the iteration starting point of each search direction with the nodes whose shortest paths have been determined in that search direction, and associate the source node and the target node with themselves. The condition for determining that all expected shortest paths have been found is that there is a node associated with both the source node and the target node. For example, the source node is node A, the target node is node G, the search direction is from node A to node G, and the intermediate nodes include node B. Associating the source node and the target node with themselves means recording node A in the association information of the source node and node G in the association information of the target node; node A is also recorded in the association information of node B.
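The four cases above amount to a small lookup from search strategy to iteration starting points and search directions. The sketch below is one possible encoding; the `Strategy` enum, the function name, and the direction labels are hypothetical names introduced only for illustration.

```python
from enum import Enum

class Strategy(Enum):
    SINGLE_SOURCE = "single source"
    FORWARD = "single source single target forward"
    REVERSE = "single source single target reverse"
    BIDIRECTIONAL = "single source single target bidirectional"

def iteration_starting_points(strategy, source, target=None):
    """Return a list of (iteration starting point, search direction) pairs."""
    if strategy in (Strategy.SINGLE_SOURCE, Strategy.FORWARD):
        return [(source, "forward")]
    if strategy is Strategy.REVERSE:
        return [(target, "reverse")]
    if strategy is Strategy.BIDIRECTIONAL:
        # Steps S12 and S13 are then run separately for each direction.
        return [(source, "forward"), (target, "reverse")]
    raise ValueError(f"unknown strategy: {strategy!r}")
```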

For single source single target forward search and single source single target reverse search, the determined path is the shortest path.

For a single source single target bidirectional search, two paths are obtained: the path from the iteration starting point s to the iteration midpoint p, and the path from the iteration midpoint p to the iteration starting point t. Combined, the two form a complete path from the starting point s to the end point t. When searching for a path, the following adjacent edge notation can be used:

(1) The directed adjacent edge <from,to>_f is used for forward search and indicates the directed edge from from to to; from corresponds to the first node, to is the adjacent point of from, and to corresponds to the second node.

(2) The directed adjacent edge <from,to>_b is used for reverse search and indicates the directed edge from from to to; to corresponds to the first node, from is the adjacent point of to, and from corresponds to the second node.

(3) The search direction of the reverse search on the adjacent edge <from,to>_b is opposite to that of the forward search on the adjacent edge <from,to>_f. In the reverse search, the second node from points to the first node to; in the forward search, the first node from points to the second node to. In both cases the edge direction is consistent with the direction of the adjacent edge <from,to>.

(4) In the reverse search, the to node in the linear list of adjacent edges of the first node to is the same node, and the to node corresponds to multiple second nodes from.

(5) In forward search, the from node in the linear list of adjacent edges of the first node from is the same node, and the from node corresponds to multiple second nodes to.
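The notation above suggests keeping two adjacency tables built from the same directed edges, one keyed by from for the forward search and one keyed by to for the reverse search. The following sketch shows one way to do this; the function name `build_adjacency`, the tuple layout, and the ascending sort by weight are assumptions made for the example.

```python
from collections import defaultdict

def build_adjacency(edges):
    """edges: iterable of (from_node, to_node, weight) directed edges."""
    forward = defaultdict(list)  # first node `from` -> [(weight, second node `to`)]
    reverse = defaultdict(list)  # first node `to`   -> [(weight, second node `from`)]
    for frm, to, weight in edges:
        forward[frm].append((weight, to))   # <from,to>_f
        reverse[to].append((weight, frm))   # <from,to>_b
    for table in (forward, reverse):
        for first in table:
            table[first].sort(key=lambda e: e[0])  # ascending by edge weight
    return dict(forward), dict(reverse)
```

With edges `("A","B",3)`, `("A","C",1)` and `("B","C",2)`, the forward list of A holds the second nodes C and B, while the reverse list of C holds the second nodes A and B.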

There are many ways to implement single source single target bidirectional search, which differ, for example, in whether a unified expected shortest path length of the second node is used and in whether multi-threading is used. The method below uses independent expected shortest path lengths of the second node for the forward search and the reverse search, and calculates the expected shortest path length of the second node independently for each iteration direction within the same thread.

The shortest path of a node described in this article includes: the node's predecessor node and the node's determined path length. The determined path length of a node is equal to the determined path length of the node's predecessor node plus the weight of the adjacent edge between the node and its predecessor node. A node may have zero, one or more predecessor nodes. The determined path length of a node represents the sum of the weights of the adjacent edges traversed from the iteration starting point to the node. In the initial state, the determined path length of the iteration starting point is zero and the determined path lengths of the remaining nodes are infinite; they are updated during the path search process. Taking the iteration starting point as an example, its predecessor node is empty and its determined path length is zero. As another example, if the predecessor node of node C is node B, the predecessor node of node B is the source node A, the weight of adjacent edge <A,B> is 3, and the weight of adjacent edge <B,C> is 2, then the determined path length of node C is 5. In a specific implementation, the shortest path of a node may also include the adjacent edge weight of the node, the predecessor node record number, the current node, and the current node record number. The path information from the second node to the first node is recorded through the predecessor node, and the adjacent edge information is recorded through the path length, the predecessor node and the current node.

For the adjacent edge <from,to>, the direction of the forward search is from the node from to the node to, and the predecessor node of the node to is the node from; the direction of the reverse search is from the node to to the node from, and the predecessor node of the node from is the node to. For example, the shortest path p1->p2->p3 consists of two directed edges <p1,p2> and <p2,p3>. The predecessor pointers of the forward search are p1<-p2 and p2<-p3, which is opposite to the path direction; the predecessor pointers of the reverse search are p2->p3 and p1->p2, which is consistent with the path direction.
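The A, B, C example can be worked through with a small record type. The sketch below keeps only the two mandatory fields (predecessor node and determined path length); the names `PathRecord` and `determine` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PathRecord:
    predecessor: Optional[str]   # None for the iteration starting point
    determined_length: float     # float("inf") until the shortest path is determined

def determine(records, node, predecessor, edge_weight):
    """Fix the shortest path of `node`, reached from `predecessor`."""
    length = records[predecessor].determined_length + edge_weight
    records[node] = PathRecord(predecessor, length)
    return length

records = {"A": PathRecord(None, 0)}   # iteration starting point
determine(records, "B", "A", 3)        # adjacent edge <A,B> has weight 3
determine(records, "C", "B", 2)        # adjacent edge <B,C> has weight 2
print(records["C"].determined_length)  # 5
```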
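Because the predecessor pointers run against the path direction in a forward search and along it in a reverse search, recovering the path p1->p2->p3 differs slightly between the two. A minimal sketch, with hypothetical function names and plain dictionaries as predecessor tables:

```python
def path_from_forward(pre, target):
    """Forward search: predecessors point back toward the source, so reverse the walk."""
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = pre[node]
    return path[::-1]

def path_from_reverse(pre, source):
    """Reverse search: predecessors point toward the target, so the walk is already in order."""
    path = []
    node = source
    while node is not None:
        path.append(node)
        node = pre[node]
    return path

forward_pre = {"p1": None, "p2": "p1", "p3": "p2"}  # p1 <- p2 <- p3
reverse_pre = {"p3": None, "p2": "p3", "p1": "p2"}  # p1 -> p2 -> p3
```

Both `path_from_forward(forward_pre, "p3")` and `path_from_reverse(reverse_pre, "p1")` yield the same node sequence from p1 to p3.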

As shown in Figure 2, in step S12, from the adjacent nodes of the first node in the first node set for which the shortest path has not been determined, selecting the adjacent node closest to the iteration starting point to determine the shortest path and updating the first node set includes:

Step S121: Determine relevant information for the first node in the first node set for which no relevant information has been determined and remove the first node without relevant information from the first node set.

Among them, the relevant information of the first node includes: the second node, the adjacent edge to be processed, and the estimated shortest path length of the second node. The adjacent edge to be processed is the adjacent edge that satisfies the following conditions among all adjacent edges between the first node and its adjacent nodes for which the shortest path has not been determined: the adjacent edge weight is the minimum value among all such adjacent edges, and the sum of the adjacent edge weight and the determined path length of the first node is greater than the cumulative movement step associated with the first node set. The second node is the non-first node of the adjacent edge to be processed. The estimated shortest path length of the second node of the first node is equal to the weight of the adjacent edge to be processed plus the determined path length of the first node.

In detail, the cumulative movement step associated with the first node set is equal to the shortest path length of the node farthest from the iteration starting point among the nodes whose shortest paths have been determined, and reflects the path weight length already covered from the iteration starting point. In the initial state, the cumulative movement step associated with the first node set is zero. In the same iteration, this method sequentially determines the shortest paths of all second nodes reached from the first node through adjacent edges of the same weight.

In this step, the adjacent edge to be processed can be determined from the first node's linear list of adjacent edges, which is arranged in ascending order. A pointer into the linear list points to the first adjacent edge whose shortest path has not been confirmed. The determination process includes:

Determine whether the sum of the determined path length of the first node and the weight of the adjacent edge currently at the pointer is greater than the cumulative movement step associated with the first node set. If not, the pointer moves to the next adjacent edge in the linear list; if so, that adjacent edge is used as the adjacent edge to be processed, and the non-first node of the adjacent edge to be processed is the second node.
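
The pointer scan described above can be sketched as follows. The function name `next_pending_edge`, its argument layout, and the assumption that the list is sorted by edge weight are illustrative choices, not part of this application.

```python
def next_pending_edge(edges, pointer, determined_len, step):
    """Advance the pointer over an ascending edge list.

    edges: list of (neighbor, weight), assumed sorted by weight in ascending order.
    pointer: index of the first adjacent edge not yet confirmed.
    determined_len: determined path length of the first node.
    step: cumulative movement step associated with the first node set.
    Returns (pointer, second_node, estimated_length), or (pointer, None, None)
    when the first node has no relevant information left.
    """
    while pointer < len(edges):
        neighbor, weight = edges[pointer]
        if determined_len + weight > step:
            return pointer, neighbor, determined_len + weight
        pointer += 1  # this edge cannot exceed the current step; move on
    return pointer, None, None

edges = [("a", 1), ("b", 4), ("c", 7)]
print(next_pending_edge(edges, 0, 2, 5))   # (1, 'b', 6)
print(next_pending_edge(edges, 0, 2, 10))  # (3, None, None)
```

Returning `None` for the second node corresponds to the case in step S121 where the first node has no relevant information and is removed from the first node set.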

Step S122: Select, from the first node set, the first node whose second node has the smallest estimated shortest path length.

Step S123: The second node related to the selected first node is the adjacent node closest to the iteration starting point.

When this step is implemented, the selected first node is used as the predecessor node of its related second node, and the estimated shortest path length of the second node associated with that first node becomes the determined path length of the second node.

Step S124: Update the cumulative movement step associated with the first node set to the estimated shortest path length of the second node associated with the selected first node.

Step S125: Perform the following judgment on the second node related to each selected first node, the adjacent edge to be processed, and the estimated shortest path length of the second node:

Step S1251: When the estimated shortest path length of the second node is equal to the determined path length of the second node, the second node is taken as the successor node, a shortest path is added to the successor node according to the first node and its successor node, and the relevant information of the first node is determined; if there is no relevant information, the first node is removed from the first node set.

If the estimated shortest path length of the second node, that is, the sum of the determined path length of the first node and the weight of the adjacent edge to be processed, is equal to the determined path length of the second node, the second node has already joined the first node set. In this case, it is only necessary to determine the relevant information of the first node. When the first node has no relevant information, the first node has no adjacent nodes for which the shortest path has not been determined.

Step S1252: When the estimated shortest path length of the second node is less than the determined path length of the second node, the second node is taken as the successor node, a shortest path is added to the successor node according to the first node and its successor node, and the relevant information of the first node is determined; if there is no relevant information, the first node is removed from the first node set. The second node is then added to the first node set as a first node, and the relevant information of the newly added first node is determined; if there is no relevant information, the newly added first node is removed from the first node set.

If the estimated shortest path length of the second node of the first node, that is, the sum of the determined path length of the first node and the weight of the adjacent edge to be processed, is less than the determined path length of the second node, the second node has not yet joined the first node set.

In an embodiment of this article, the following situation exists. As shown in Figure 3, assume that the current first nodes are node 3 and node 2, the second nodes are node 5 and node 4, the adjacent edges to be processed are <2,5> and <3,4>, and the weights of the paths 0->2->5, 0->3->5 and 0->3->4 are all 10. Assume that node 5 is reached from node 2 in a certain iteration, at which point node 5 becomes a first node. Under the preceding logic, node 3 will not select <3,5> as its next adjacent edge to be processed, although the path 0->3->5 has the same weight as 0->2->5, that is, 0->3->5 is also an optimal path. Therefore, in the special scenario shown in Figure 3, the preceding method cannot find all the optimal paths. To solve this technical problem, the information acquisition method based on the shortest path of the knowledge graph also includes:

Determine the third node according to the ordered linear list of adjacent edges of each first node, where the third node is a node whose shortest path is being confirmed or has been confirmed;

If the sum of the determined path length of the first node and the weight of the adjacent edge between the first node and its third node is equal to the determined path length of the third node, the third node is taken as the successor node, and a shortest path is added to the successor node according to the first node and its successor node; otherwise,

If the sum of the determined path length of the first node and the weight of the adjacent edge between the first node and its third node is equal to the cumulative movement step associated with the first node set, the third node is taken as the successor node, a shortest path is added to the successor node according to the first node and its successor node, the third node is added to the first node set as a first node, and the relevant information of the newly added first node is determined; if there is no relevant information, the newly added first node is removed from the first node set.

Taking Figure 3 as an example, the first nodes are node 5, node 3 and node 4, and the predecessor node of node 5 is node 2. Assume that the determined path length of node 5 is 10, the determined path length of node 3 is 2, the determined path length of node 4 is 10, and the weight of the adjacent edge <3,5> is 8. Node 4 and node 5 have no third node, and the third node of node 3 is node 5. Because 2 + 8 = 10 equals the determined path length of node 5, the shortest path of the third node can be updated, and the predecessor nodes of the third node then include node 2 and node 3.
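
To illustrate how equal-length paths such as 0->2->5 and 0->3->5 can both be recorded, the following is a simplified sketch of a small-step style search. It is not the exact procedure of steps S121 to S125 or of the third-node handling: it handles ties by processing, in one iteration, every adjacent edge whose estimated length equals the new cumulative movement step. It assumes strictly positive integer weights and that every node appears as a key of `adj`. The function name and the edge weights of 0->2, 0->3, 2->5 and 3->4 are assumptions chosen so that the path totals match the text.

```python
import math

def small_step_paths(adj, source):
    """adj maps every node to a list of (neighbor, weight); weights must be > 0."""
    edges = {u: sorted(nbrs, key=lambda e: e[1]) for u, nbrs in adj.items()}
    dist = {u: math.inf for u in edges}   # determined path lengths
    pred = {u: [] for u in edges}         # predecessor nodes (all equal-length paths)
    ptr = {u: 0 for u in edges}           # pointer into each ascending edge list
    dist[source] = 0
    first, step = {source}, 0             # first node set, cumulative movement step

    while first:
        info = {}                         # first node -> estimated length via its pending edge
        for u in list(first):
            # Skip edges whose estimated length does not exceed the cumulative step.
            while ptr[u] < len(edges[u]) and dist[u] + edges[u][ptr[u]][1] <= step:
                ptr[u] += 1
            if ptr[u] == len(edges[u]):
                first.discard(u)          # no relevant information left
            else:
                info[u] = dist[u] + edges[u][ptr[u]][1]
        if not info:
            break
        step = min(info.values())         # new cumulative movement step
        for u, est in info.items():
            if est != step:
                continue
            i = ptr[u]
            while i < len(edges[u]) and dist[u] + edges[u][i][1] == step:
                v = edges[u][i][0]
                if step < dist[v]:        # newly determined second node
                    dist[v], pred[v] = step, [u]
                    first.add(v)
                elif step == dist[v]:     # equal-length path: extra predecessor
                    pred[v].append(u)
                i += 1
    return dist, pred

# Figure 3 style graph with assumed weights.
graph = {0: [(2, 2), (3, 2)], 2: [(5, 8)], 3: [(4, 8), (5, 8)], 4: [], 5: []}
dist, pred = small_step_paths(graph, 0)
print(dist[5], sorted(pred[5]))  # 10 [2, 3]
```

In this sketch node 5 ends up with both node 2 and node 3 as predecessors, which is the outcome the third-node handling above is designed to guarantee.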

In an embodiment of this article, in order to avoid invalid iterations that would slow down the search for the shortest path, determining the relevant information of the first node also includes:

Determine whether the same second node exists in the relevant information of at least two first nodes;

If so, compare the estimated shortest path lengths of that second node across the first nodes that share it;

If the estimated shortest path lengths differ, only the relevant information of the first node with the smallest estimated shortest path length of the second node is retained, and the relevant information is re-determined for the other first nodes.

This embodiment can ensure that unnecessary interference lines are removed during the shortest path search process and avoid invalid iterations.
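
A minimal sketch of this pruning rule is given below. The function name `keep_smallest_estimate` and the dictionary layout are hypothetical; ties are kept, since the text only discards entries when the estimated lengths differ.

```python
def keep_smallest_estimate(info):
    """info: {first_node: (second_node, estimated_length)}.

    Returns (kept, redo): the entries to retain, and the first nodes whose
    relevant information must be re-determined.
    """
    best = {}
    for second, est in info.values():
        if second not in best or est < best[second]:
            best[second] = est
    kept, redo = {}, []
    for first, (second, est) in info.items():
        if est == best[second]:
            kept[first] = (second, est)
        else:
            redo.append(first)
    return kept, redo

info = {1: (5, 10), 2: (5, 12), 3: (4, 9)}
print(keep_smallest_estimate(info))  # ({1: (5, 10), 3: (4, 9)}, [2])
```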

In an embodiment of this article, a method for determining the shortest path of a knowledge graph is also provided to solve the problem of low computational efficiency in the shortest path determination process based on the knowledge graph in the prior art. The knowledge graph includes multiple nodes and adjacent edges between nodes, and the weight of an adjacent edge represents the distance between the nodes it connects. As shown in Figure 4, the method includes:

Step S21: Initially set the iteration starting point and the edge nodes according to the shortest path search strategy and the source node and target node in the user request. The edge nodes include the nodes in the first node set, the nodes in the candidate node set and the nodes in the relaxation operation set. Initially, the first node set includes the iteration starting point, whose shortest path is determined, and the candidate node set and the relaxation operation set are empty; the determined path length of the iteration starting point is initially set to zero, and the determined path lengths of the remaining nodes are infinite.

Among them, the nodes in the first node set are nodes whose shortest paths have been determined and which have adjacent nodes whose shortest paths have not been determined; the nodes in the candidate node set are nodes with undetermined shortest paths that can be added to the first node set; and the nodes in the relaxation operation set are nodes for which a path has been obtained through the relaxation operation but the shortest path has not been determined.

The shortest path search strategy includes: single source single target forward search, single source single target reverse search, and single source single target bidirectional search.

In some embodiments, the shortest path of a node includes the node's predecessor node and the node's determined path length; the predecessor node of the iteration starting point is empty, and the determined path length of the iteration starting point is zero. In other embodiments, the shortest path of a node also includes information on the nodes along the shortest path between the iteration starting point and the node.

When implementing this step, the relaxation operation set may also be set to include the iteration starting point.

Step S22: Calculate the heuristic path length of the edge node, where the heuristic path length of the edge node is the path length from the iteration starting point to the iteration end point passing through the edge node.

In this step, the following formula can be used to calculate the heuristic path length of the edge node:
f(n)=g(n)+h(n);

Among them, f(n) is the heuristic path length from the iteration starting point to the iteration end point through edge node n, that is, the estimated cost from the iteration starting point to the iteration end point via edge node n; n is the edge node; h(n) is the predicted path length from edge node n to the iteration end point, that is, the estimated cost of getting from the edge node to the iteration end point; g(n) is the determined path length from the iteration starting point to edge node n, that is, the cost already determined from the iteration starting point to edge node n; h is an estimation function, which can be a Euclidean distance function, a Manhattan distance function, a Chebyshev distance function, etc. This article does not limit the specific type of h. The predicted path length from any node to the iteration end point must be less than or equal to the length of the shortest path from that node to the iteration end point.
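
As a small illustration of f(n) = g(n) + h(n), the sketch below uses a Manhattan distance as h. It assumes, purely for illustration, that each node carries a two-dimensional coordinate; this application does not specify how h obtains its inputs, and the function names are hypothetical.

```python
def manhattan(a, b):
    """Manhattan distance between two (x, y) coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def heuristic_path_length(g_n, position, goal):
    """f(n) = g(n) + h(n), with h taken as the Manhattan distance to the goal."""
    return g_n + manhattan(position, goal)

# g(n) = 3, node at (1, 1), iteration end point at (4, 5): h(n) = 3 + 4 = 7.
print(heuristic_path_length(3, (1, 1), (4, 5)))  # 10
```

For the search to remain correct, h must not overestimate: as stated above, h(n) has to be less than or equal to the true shortest path length from n to the iteration end point.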

Step S23: According to the heuristic path lengths and the determined path lengths of the edge nodes, select from the first node set and the candidate node set the node pB with the smallest determined path length among the nodes with the largest heuristic path length, and select from the relaxation operation set the node pA with the largest determined path length among the nodes with the smallest heuristic path length.

In this step, the node pB, which has the smallest determined path length among the nodes with the largest heuristic path length in the first node set and the candidate node set, is the worst node in those two sets. The node pA, which has the largest determined path length among the nodes with the smallest heuristic path length in the relaxation operation set, is the best node. The specific reasons are:

The node with the largest heuristic path length f: the actual length of the path through this node to the iteration end point is more likely to exceed the actual length of the paths through other nodes to the iteration end point.

Among the nodes with the largest heuristic path length f, the node pB with the smallest determined path length g is selected: the predicted path length h from this node to the iteration end point is the largest, and a larger h value carries greater uncertainty when it is converted into g. The g value from this node to the iteration end point will exceed that of other nodes with greater probability, that is, a longer path of poor quality will be generated. Therefore, node pB is the worst node.

The node with the smallest heuristic path length f: the shortest path from the iteration starting point to this node will, with high probability, lie on the shortest path from the iteration starting point to the iteration end point.

Among the nodes with the smallest heuristic path length f, the node pA with the largest determined path length g is selected: the predicted path length h from this node to the iteration end point is the smallest. First, a smaller h value carries less uncertainty when it is converted into g, so paths of good quality are generated. Second, selecting the node with the largest g value can ensure that the condition of being not less than the cumulative movement step is met. Third, it reduces the frequency with which nodes move between the sets processed by the small step mechanism and the relaxation operation set. Therefore, node pA is the best node.
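
The two selections in step S23 reduce to ordering nodes by the pair (f, -g). The sketch below shows this under the assumption that f and g are available as dictionaries; the function names `worst_node` and `best_node` are illustrative.

```python
def worst_node(nodes, f, g):
    """pB: largest heuristic length f, ties broken by smallest determined length g."""
    return max(nodes, key=lambda n: (f[n], -g[n]))

def best_node(nodes, f, g):
    """pA: smallest heuristic length f, ties broken by largest determined length g."""
    return min(nodes, key=lambda n: (f[n], -g[n]))

f = {"a": 10, "b": 12, "c": 12, "d": 10}
g = {"a": 4, "b": 7, "c": 5, "d": 6}

print(worst_node(["a", "b", "c"], f, g))  # c
print(best_node(["a", "d", "b"], f, g))   # d
```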

Step S24: Select the small step algorithm or the relaxation operation algorithm according to the nodes pB and pA, specifically including:

If the relaxation operation set is empty, or the heuristic path length of node pB is less than the heuristic path length of node pA, or the heuristic path length of node pB is equal to the heuristic path length of node pA and the determined path length of node pB is greater than or equal to the determined path length of node pA, the following small step algorithm is executed: from the adjacent nodes of the first node in the first node set whose shortest path has not been determined, select the adjacent node closest to the iteration starting point to determine the shortest path and update the first node set.

Otherwise, the paths of all adjacent nodes of the node pA in the relaxation operation set are calculated based on the relaxation operation and the relaxation operation set is updated.
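
The selection rule of step S24 can be written as a single predicate. This is a sketch with a hypothetical function name and argument layout; when the relaxation operation set is empty there is no node pA, so the pA arguments are ignored.

```python
def use_small_step(relax_set, f_pB, g_pB, f_pA=None, g_pA=None):
    """Return True to run the small step algorithm, False to run the relaxation operation."""
    if not relax_set:
        return True
    if f_pB < f_pA:
        return True
    return f_pB == f_pA and g_pB >= g_pA

print(use_small_step(set(), 9, 1))            # True  (relaxation set empty)
print(use_small_step({"x"}, 9, 1, 10, 5))     # True  (f(pB) < f(pA))
print(use_small_step({"x"}, 10, 4, 10, 5))    # False (equal f, g(pB) < g(pA))
```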

In this step, calculating the paths of all adjacent nodes of node pA in the relaxation operation set based on the relaxation operation and updating the relaxation operation set include:

Determine that the shortest path of node pA has been found and remove node pA from the relaxation operation set. For any adjacent node pNeighbor of node pA, calculate the path length cost from the iteration starting point s via node pA to pNeighbor as the determined shortest path length g(pA) of node pA plus the weight of the adjacent edge between node pA and node pNeighbor, and perform the following steps:

If cost is less than the original determined path length g(pNeighbor) and pNeighbor is not in the relaxation operation set: add node pNeighbor to the relaxation operation set;

If cost is less than the original determined path length g(pNeighbor): delete the path information recorded for pNeighbor, and record the path information of the adjacent point pNeighbor based on node pA and cost. The path length of node pNeighbor is cost, and the predecessor node is pA;

If cost is equal to the original determined path length g(pNeighbor): record the path information of adjacent point pNeighbor according to node pA and cost, that is, add an equal-length path to pNeighbor;

If the cost is greater than the original determined path length g(pNeighbor): no operation is performed.
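
The four cases above can be sketched as one function. The name `relax_from` and the data layout (adjacency dictionary, `g` for determined path lengths, `pred` for predecessor lists) are assumptions for illustration.

```python
import math

def relax_from(pA, adj, g, pred, relax_set):
    """Relax all adjacent nodes of pA and update the relaxation operation set in place."""
    relax_set.discard(pA)                      # pA's shortest path is now determined
    for neighbor, weight in adj.get(pA, []):
        cost = g[pA] + weight
        if cost < g[neighbor]:
            relax_set.add(neighbor)            # no effect if already present
            g[neighbor] = cost                 # replace the recorded path
            pred[neighbor] = [pA]
        elif cost == g[neighbor]:
            pred[neighbor].append(pA)          # add an equal-length path
        # cost > g[neighbor]: no operation

adj = {"s": [("a", 2), ("b", 5)], "a": [("b", 3)]}
g = {"s": 0, "a": math.inf, "b": math.inf}
pred = {"s": [], "a": [], "b": []}
relax_set = {"s"}

relax_from("s", adj, g, pred, relax_set)
relax_from("a", adj, g, pred, relax_set)
print(g["b"], pred["b"], relax_set)  # 5 ['s', 'a'] {'b'}
```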

Step S25: Move nodes among the relaxation operation set, the candidate node set and the first node set. Specifically, this includes:

Select excellent nodes from the relaxation operation set and move them to the candidate node set. The excellent node is the node pA2 with the largest determined path length among the nodes with the smallest heuristic path length in the relaxation operation set;

When the number of nodes in the first node set and the candidate node set is greater than the second predetermined value, nodes exceeding the second predetermined value are filtered out from the first node set and the candidate node set and moved to the relaxation operation set.

Step S26: Repeat the above process from step S22 to step S25 until the shortest path from the source node to the target node is found and the heuristic path length of the edge node is greater than the length of the shortest path from the source node to the target node.

Step S27: Obtain information from the knowledge graph according to the shortest path of each node and respond to the user request according to the acquired information.

When this step is implemented, information can be obtained based on the meaning of nodes and adjacent edges between nodes in the knowledge graph.

This embodiment uses heuristic information as target guidance information to combine the small step algorithm with the relaxation operation algorithm. This reduces the blindness of the small step algorithm, which otherwise knows nothing about the iteration end point during the search. Excellent nodes selected through the relaxation operation serve as candidates for the first nodes of the small step algorithm, and when the number of nodes in the first node set and the candidate node set is greater than the predetermined value, some of those nodes are moved to the relaxation operation set. This narrows the search range of the small step algorithm and obtains the shortest path from the iteration starting point to the iteration end point more quickly.

In an embodiment of this article, in the above step S21, if the shortest path search strategy is a single source single target forward search, the iteration starting point is the source node. Further, steps S22 to S25 are executed according to the forward search direction from the source node to the target node. In step S22, the heuristic path length from the iteration starting point through the edge node is the heuristic path length from the source node through the edge node to the target node.

If the shortest path search strategy is a single source single target reverse search, the iteration starting point is the target node, and steps S22 to S25 are performed according to the reverse search direction from the target node to the source node. In step S22, the heuristic path length from the iteration starting point through the edge node is the heuristic path length from the target node through the edge node to the source node.

If the shortest path search strategy is a single source single target bidirectional search, the iteration starting points are the source node and the target node, and steps S22 to S25 are executed separately according to the forward search direction from the source node to the target node and according to the reverse search direction from the target node to the source node. In the forward search direction, the heuristic path length from the iteration starting point through the edge node in step S22 is the heuristic path length from the source node through the edge node to the target node. In the reverse search direction, the heuristic path length from the iteration starting point through the edge node in step S22 is the heuristic path length from the target node through the edge node to the source node.

Each node whose shortest path has been determined in a search direction is associated with the iteration starting point of that search direction, and the source node and the target node are each associated with themselves. The shortest path from the source node to the target node is determined when there is a node that is associated with both the source node and the target node.

During specific implementation, the single-source single-target bidirectional search performs a forward search with the iteration starting point s as the search starting point and a reverse search with the iteration end point t as the search starting point. When the two searches reach the same node, a shortest path from s to t is found. Among them: (1) the directed adjacent edge <from,to>_f is used for forward search and denotes the directed edge from from to to, where from corresponds to the first node, to is an adjacent node of from, and to corresponds to the second node; (2) the directed adjacent edge <from,to>_b is used for reverse search and denotes the directed edge from from to to, where to corresponds to the first node, from is an adjacent node of to, and from corresponds to the second node; (3) the direction in which the reverse search traverses the adjacent edge <from,to>_b is opposite to the direction in which the forward search traverses the adjacent edge <from,to>_f: in the reverse search the second node from points to the first node to, whereas in the forward search the first node from points to the second node to, which is consistent with the direction of the adjacent edge <from,to>; (4) in the reverse search, the to node is the same for every edge in the linear list of adjacent edges of the first node to, and that to node corresponds to multiple second nodes from; (5) in the forward search, the from node is the same for every edge in the linear list of adjacent edges of the first node from, and that from node corresponds to multiple second nodes to; (6) <from,to> can refer to the adjacent edge in either search direction: in the forward search it denotes the adjacent edge <from,to>_f, and in the reverse search it denotes the adjacent edge <from,to>_b.

In an embodiment of this article, the above-mentioned step S25 filters out nodes exceeding the second predetermined value from the first node set and the candidate node set and moves them to the relaxation operation set, including:

Determine the excellent nodes in the first node set and the candidate node set in the following way: order the nodes by heuristic path length from small to large and by determined path length from large to small, and select at most the top second-predetermined-value nodes with determined shortest paths as the excellent nodes; then move the non-excellent nodes in the first node set and the candidate node set to the relaxation operation set.
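
The ranking described above amounts to a sort on (f ascending, g descending) followed by a cut at the second predetermined value. The sketch below uses the hypothetical name `split_excellent` and treats the second predetermined value as the parameter `limit`.

```python
def split_excellent(nodes, f, g, limit):
    """Return (excellent, non_excellent) for the first node set plus candidate node set.

    Nodes are ranked by heuristic length f ascending, then determined length g descending;
    at most `limit` nodes are kept as excellent, the rest go to the relaxation operation set.
    """
    ranked = sorted(nodes, key=lambda n: (f[n], -g[n]))
    return ranked[:limit], ranked[limit:]

f = {"a": 10, "b": 12, "c": 12, "d": 10}
g = {"a": 4, "b": 7, "c": 5, "d": 6}
print(split_excellent(["a", "b", "c", "d"], f, g, 2))  # (['d', 'a'], ['b', 'c'])
```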

The second predetermined value is a tuning parameter. If the second predetermined value is too small, nodes will move continually among the first node set, the candidate node set and the relaxation operation set. If the second predetermined value is too large, the number of nodes in the first node set and the candidate node set will increase, which increases the amount of calculation and makes the heuristic information lose its effect. Therefore, a reasonable selection of the second predetermined value can reduce the number of node movements among the first node set, the candidate node set and the relaxation operation set and improve the efficiency of determining the shortest path. In a specific implementation, the second predetermined value may be determined by one of the following three methods:

(1) Set a fixed second predetermined value based on partial or all knowledge graph data.

(2) Set a fixed second predetermined value based on the difference between the iteration starting point and the iteration end point.

(3) Design the upper and lower limits of the second predetermined value based on the knowledge graph data and the application scenario. The second predetermined value can be initialized to a value between the lower limit and the upper limit and then adjusted dynamically according to the following strategy: in every K₀ consecutive iterations, if the number of nodes moved from the relaxation operation set to the candidate node set is less than K₁, the second predetermined value is reduced by M; if the number of nodes moved from the relaxation operation set to the candidate node set is greater than K₂, the second predetermined value is increased by M, where K₀, K₁, K₂ and M are positive integers that can be set according to the actual situation. Application scenarios include information retrieval, path planning, etc., depending on the specific application field of the knowledge graph. The K₀ iterations mentioned herein are counted as executions of the above steps S22 to S24.
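
Method (3) can be sketched as a small adjustment function called once per window of K₀ iterations. The function name, the parameter names and the clamping of the result to the lower and upper limits are assumptions; the text only states that limits are designed and that the value is decreased or increased by M.

```python
def adjust_limit(limit, moved, k1, k2, m, lower, upper):
    """Adjust the second predetermined value after a window of K0 iterations.

    moved: number of nodes moved from the relaxation operation set to the
           candidate node set during the window (assumes k1 <= k2).
    """
    if moved < k1:
        limit -= m
    elif moved > k2:
        limit += m
    return max(lower, min(upper, limit))  # assumed: keep the value within its limits

print(adjust_limit(10, 1, 3, 8, 2, 4, 16))  # 8
print(adjust_limit(10, 9, 3, 8, 2, 4, 16))  # 12
print(adjust_limit(10, 5, 3, 8, 2, 4, 16))  # 10
```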

In an embodiment of this article, in order to avoid wasting computing power due to duplication of nodes in the first node set, the candidate node set and the relaxation operation set, the information acquisition method based on the shortest path of the knowledge graph also includes:

When the first node set changes, the first intersection nodes are removed from the relaxation operation set and the candidate node set, where a first intersection node is a node in the intersection of the first node set with the relaxation operation set and the candidate node set.

In this step, when a new node is added to the first node set, that is, a node that was not moved from the candidate node set to the first node set, and the new node is in the candidate node set or the relaxation operation set, the node is deleted from the candidate node set and the relaxation operation set.

When the relaxation operation set changes, the second intersection nodes are removed from the first node set and the candidate node set, where a second intersection node is a node in the intersection of the relaxation operation set with the first node set and the candidate node set.

In this step, when a new node is added to the relaxation operation set, that is, a node that was not moved from the first node set to the relaxation operation set, and the new node is in the candidate node set or the first node set, the node is deleted from the candidate node set and the first node set.
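
The two de-duplication rules can be sketched with plain Python sets. The function names are hypothetical, and the three sets are assumed to be mutable `set` objects.

```python
def add_to_first_set(node, first_set, candidate_set, relax_set):
    """Add a node to the first node set and drop it from the other two sets."""
    first_set.add(node)
    candidate_set.discard(node)
    relax_set.discard(node)

def add_to_relax_set(node, first_set, candidate_set, relax_set):
    """Add a node to the relaxation operation set and drop it from the other two sets."""
    relax_set.add(node)
    first_set.discard(node)
    candidate_set.discard(node)

first_set, candidate_set, relax_set = {1}, {2, 3}, {3, 4}
add_to_first_set(3, first_set, candidate_set, relax_set)
add_to_relax_set(2, first_set, candidate_set, relax_set)
print(first_set, candidate_set, relax_set)  # {1, 3} set() {2, 4}
```

Using `discard` rather than `remove` keeps the helpers safe when the node is absent from a set.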

In an embodiment of this article, when the first node set is initially set to include the iteration starting point in step S21, the cumulative movement step associated with the first node set is also set to zero.

When updating the first node set in step S24, the cumulative movement step associated with the first node set is also updated.

In step S25, the execution conditions for selecting excellent nodes from the relaxation operation set and moving them to the candidate node set include:

The first node set and the candidate node set are empty; or the heuristic path length of the excellent node is less than the maximum heuristic path length of the nodes in the first node set and the candidate node set, and the determined path length of the excellent node is greater than or equal to the cumulative movement step associated with the first node set.
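
The execution condition above can be sketched as a predicate. The name `may_move_excellent` and the dictionary arguments are illustrative assumptions.

```python
def may_move_excellent(pA2, first_set, candidate_set, f, g, step):
    """Check whether excellent node pA2 may move from the relaxation set to the candidate set.

    f: heuristic path lengths, g: determined path lengths,
    step: cumulative movement step associated with the first node set.
    """
    edge = first_set | candidate_set
    if not edge:
        return True
    return f[pA2] < max(f[n] for n in edge) and g[pA2] >= step

f = {"p": 9, "u": 12, "v": 10}
g = {"p": 6, "u": 3, "v": 4}
print(may_move_excellent("p", set(), set(), f, g, 5))   # True  (both sets empty)
print(may_move_excellent("p", {"u"}, {"v"}, f, g, 5))   # True  (9 < 12 and 6 >= 5)
print(may_move_excellent("p", {"u"}, {"v"}, f, g, 7))   # False (6 < 7)
```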

In an embodiment of this article, as shown in Figure 5, in the small step algorithm of step S24, selecting the adjacent node closest to the iteration starting point from the adjacent nodes, with undetermined shortest paths, of the first nodes in the first node set, determining its shortest path, and updating the first node set includes:

Step S241: When the first node set is empty, move the node with the smallest determined path length in the candidate node set to the first node set, and set the cumulative movement step associated with the first node set to the determined path length of the moved node.

Step S242: Determine relevant information for the first node in the first node set for which no relevant information has been determined and remove the first node without relevant information from the first node set.

Among them, the relevant information of the first node includes: the second node, the adjacent edge to be processed, and the estimated shortest path length of the second node. The adjacent edge to be processed is the adjacent edge that meets the following conditions among all adjacent edges between the first node and its adjacent nodes for which the shortest path has not been determined: the adjacent edge weight is the minimum value among all such adjacent edges, and the sum of the adjacent edge weight and the determined path length of the first node is greater than the cumulative movement step associated with the first node set. The second node is the non-first node of the adjacent edge to be processed. The estimated shortest path length of the second node of the first node is equal to the weight of the adjacent edge to be processed plus the determined path length of the first node.

Step S243: Select from the first node set the first node whose second node has the smallest estimated shortest path length, and, based on the selected first node, choose nodes from the candidate node set to move to the first node set.

In this step, when a node moves from the candidate node set to the first node set, step S242 needs to be re-executed to determine relevant information for the moved node, and the first node whose second node has the smallest estimated shortest path length needs to be re-determined in the first node set.

When this step is implemented, choosing nodes from the candidate node set to move to the first node set based on the selected first node includes:

(1) When the estimated shortest path length of the second node of the selected first node is greater than or equal to the minimum determined path length of the nodes in the candidate node set, move nodes from the candidate node set to the first node set according to the following steps (2) and (3):

(2) For each node pX in the candidate node set whose determined path length equals that minimum, determine whether the second nodes associated with the selected first node include node pX,

If not, move node pX from the candidate node set to the first node set and determine relevant information for node pX, where the estimated shortest path length of the second node of node pX = the determined path length of node pX + the weight of the adjacent edge to be processed of node pX,

If so, remove node pX from the candidate node set;

(3) Determine whether any node was moved from the candidate node set to the first node set in (2). If so, perform step S243 again to select from the first node set the first node whose second node has the smallest estimated shortest path length; otherwise, perform step (1) again. Once the minimum determined path length of the nodes in the candidate node set is greater than the estimated shortest path length of the second node of the selected first node, perform step S244.

In this step, the nodes in the candidate node set that are closest to the iteration starting point are moved to the first node set first. This ensures that the second node corresponding to the smallest estimated shortest path length is the node closest to the iteration starting point among the nodes whose shortest paths have not been determined. Nodes that came from the relaxation operation set thereby reach the first node set, and the nodes in the candidate node set that are closest to the iteration starting point are given priority to be moved to the first node set or removed from the candidate node set.
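One possible reading of sub-steps (1) to (3) is sketched below in Python. It is an interpretation under stated assumptions rather than the definitive procedure: `info` maps each first node to a tuple `(second_node, weight, estimate)`, `relevant_info` is a hypothetical callback that computes that tuple for a node (or returns `None`), and one candidate is handled per pass so that the selected first node is re-evaluated after every move.

```python
from math import inf

def admit_candidates(info, candidates, g, relevant_info):
    """Move or drop candidate nodes until step S244 may proceed (assumed reading)."""
    while candidates:
        # Smallest estimate held by any first node (infinity if there is none).
        best_est = min((est for _, _, est in info.values()), default=inf)
        px = min(candidates, key=lambda c: g[c])  # candidate nearest the start
        if best_est < g[px]:
            break  # every candidate is farther than the best estimate: go to S244
        # Second nodes of the first node(s) holding the smallest estimate.
        targets = {v for v, _, est in info.values() if est == best_est}
        candidates.discard(px)
        if px in targets:
            continue  # pX is already a second node: just drop it from the candidates
        entry = relevant_info(px)  # pX becomes a first node if it has pending edges
        if entry is not None:
            info[px] = entry
    return info, candidates
```

The loop stops once the smallest estimate is strictly below every candidate's determined path length, which mirrors the exit condition stated in (3).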

Step S244: The second node associated with the selected first node is the adjacent node closest to the iteration starting point.

Step S245: Update the cumulative movement step length associated with the first node set to the estimated shortest path length of the second node associated with the selected first node.

Step S246: Perform the following processing on the second node, the adjacent edge to be processed, and the estimated shortest path length of the second node associated with each selected first node:

Step S2461: When the estimated shortest path length of the second node of the first node is equal to the determined path length of the second node, take the second node as a successor node, add a shortest path to the successor node according to the first node and its successor node, and re-determine the relevant information of the first node. If there is no relevant information, remove the first node from the first node set.

Step S2462: When the estimated shortest path length of the second node of the first node is less than the determined path length of the second node, take the second node as a successor node, add a shortest path to the successor node according to the first node and its successor node, and re-determine the relevant information of the first node. If there is no relevant information, remove the first node from the first node set. Then add the second node to the first node set as a new first node and determine the relevant information of the newly added first node. If there is no relevant information, remove the newly added first node from the first node set.
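The small-step iteration described in steps S242 to S246 can be sketched for the single-source case as follows. This is a simplified illustration, not the claimed implementation: it omits the candidate node set, assumes strictly positive edge weights and at most one edge per node pair, and groups all minimum-weight pending edges of a first node together so that equal-length paths are not missed. The function and variable names are hypothetical.

```python
def small_step_shortest_paths(adj, source):
    """adj: dict mapping node -> list of (neighbor, positive weight)."""
    g = {source: 0}      # determined path lengths (absent means infinity)
    pred = {source: []}  # predecessor lists: several shortest paths may exist
    step = 0             # cumulative movement step
    info = {}            # first node -> (estimate, [second nodes])

    def refresh(u):
        # Pending edges of u: edges to undetermined nodes whose estimate
        # exceeds the cumulative step; keep those of minimum weight.
        cands = [(w, v) for v, w in adj.get(u, []) if v not in g and g[u] + w > step]
        if not cands:
            info.pop(u, None)  # no relevant information: u leaves the first node set
            return
        w_min = min(w for w, _ in cands)
        info[u] = (g[u] + w_min, [v for w, v in cands if w == w_min])

    refresh(source)
    while info:
        step = min(est for est, _ in info.values())             # S243 and S245
        selected = [u for u, (est, _) in info.items() if est == step]
        new_nodes = []
        for u in selected:
            for v in info[u][1]:
                if v not in g:                   # estimate < infinity (S2462)
                    g[v], pred[v] = step, [u]
                    new_nodes.append(v)
                elif g[v] == step and u not in pred[v]:
                    pred[v].append(u)            # equal-length path (S2461)
                # otherwise the stored estimate is stale and is simply refreshed
        for u in selected + new_nodes:
            refresh(u)
    return g, pred
```

For example, on the hypothetical graph `{0: [(1, 1), (2, 1)], 1: [(3, 1)], 2: [(3, 1)], 3: []}` with source 0, node 3 is reached at length 2 with both node 1 and node 2 recorded as predecessors.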

To explain the technical solution of the embodiments shown in Figures 4 to 5 more clearly, a specific example is described in detail below. The knowledge graph of this embodiment contains 9 nodes. The number of each node and the weights between the nodes are shown in Figure 6. The source node is node 0 and the end node is node 8. The heuristic path length f(n), determined path length g(n) and predicted path length h(n) of each node are respectively:

g(0)=0, h(0)=5, f(0)=5;

g(1)=1, h(1)=4, f(1)=5;

g(2)=2, h(2)=2, f(2)=4;

g(3)=2, h(3)=2, f(3)=4;

g(4)=3, h(4)=5, f(4)=8;

g(6)=4, h(6)=1, f(6)=5;

g(7)=5, h(7)=9, f(7)=14;

g(8)=6, h(8)=0, f(8)=6.
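The listed values satisfy f(n)=g(n)+h(n), as used in the loops that follow. The short Python check below recomputes f from the g and h values given above; the dictionary names are chosen for illustration only, and node 5 is omitted because no values are listed for it.

```python
# Determined path lengths g(n) and predicted path lengths h(n) as listed above.
g = {0: 0, 1: 1, 2: 2, 3: 2, 4: 3, 6: 4, 7: 5, 8: 6}
h = {0: 5, 1: 4, 2: 2, 3: 2, 4: 5, 6: 1, 7: 9, 8: 0}

# Heuristic path length: f(n) = g(n) + h(n).
f = {n: g[n] + h[n] for n in g}

for n in sorted(f):
    print(f"f({n}) = {g[n]} + {h[n]} = {f[n]}")
```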

The information acquisition method based on the shortest path of the knowledge graph proceeds as follows:

Initialization: The shortest path length of the iteration starting point 0 is 0. The iteration starting point is placed in the first node set, and the relaxation operation set and the candidate node set are empty. This is recorded as relaxation operation set {}, candidate node set {}, and first node set {0}. At this time, the only edge node is node 0.

The first loop: Calculate the heuristic path length of the edge node.

The edge node only has node 0, so the heuristic path length f(0)=g(0)+h(0)=0+5=5.

Select node pB=0 from the first node set and the candidate node set, and node pA cannot be filtered out from the relaxation operation set.

Because the relaxation operation set is empty, the small-step algorithm is executed. Node 1 is selected as the adjacent node that is closest to the iteration starting point and whose shortest path has not been determined. The shortest path of node 1 is determined, its determined path length is g(1)=1, and the cumulative moving step size is 1. The first node set is updated to {1,0}.

In this cycle, the small step algorithm is executed, so there is no need to select nodes from the relaxation operation set and move them to the candidate node set. In addition, the number of nodes in the first node set and the candidate node set is 2, which is equal to N=2, and there is no need to move nodes to the relaxation operation set.

Second loop: Calculate the heuristic path length of the edge node.

The edge node includes node 0 and node 1, then the heuristic path length of node 0 is f(0)=g(0)+h(0)=0+5=5, and the heuristic path length of node 1 is f(1)=g (1)+h(1)=1+4=5.

Select node pB=0 from the first node set and the candidate node set, and node pA cannot be filtered out from the relaxation operation set. Filter out nodes 2 and 3 as adjacent nodes that are closest to the iteration starting point and have not determined the shortest path, and determine the shortest paths of nodes 2 and 3. The determined path lengths of node 2 and node 3 are g(2)=g(3)= 2, the cumulative moving step size is 2. Update the first node set to {2,3,1}.

The number of nodes in the first node set is greater than N=2. At this time, the node with the largest heuristic path length and the smallest determined path length in the first node set needs to be moved to the relaxation operation set. Because f(1)=g(1)+h(1)=1+4=5, f(2)=g(2)+h(2)=2+2=4, and f(3)=g(3)+h(3)=2+2=4, node 1 in the first node set is selected to move to the relaxation operation set. At this time, the relaxation operation set includes node 1, the candidate node set is empty, and the first node set includes nodes 2 and 3.
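The overflow rule used in this loop can be sketched as follows. The function name `spill_to_relaxation` and the use of Python sets are assumptions for illustration; the rule itself (move the node with the largest heuristic path length, preferring the smallest determined path length, while the combined size exceeds N) follows the worked example.

```python
def spill_to_relaxation(first, candidates, relaxation, f, g, n_max):
    """Move nodes to the relaxation operation set while the limit N is exceeded."""
    while len(first) + len(candidates) > n_max:
        pool = first | candidates
        # Largest heuristic length first; among those, smallest determined length.
        worst = min(pool, key=lambda n: (-f[n], g[n]))
        first.discard(worst)
        candidates.discard(worst)
        relaxation.add(worst)

# Values from the second loop of the example above.
first, candidates, relaxation = {2, 3, 1}, set(), set()
spill_to_relaxation(first, candidates, relaxation,
                    f={1: 5, 2: 4, 3: 4}, g={1: 1, 2: 2, 3: 2}, n_max=2)
print(first, relaxation)
```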

The third loop: Calculate the heuristic path length of the edge node.

The edge nodes include node 1, node 2 and node 3. The heuristic path length of node 1 is f(1)=g(1)+h(1)=1+4=5, and the heuristic path lengths of node 2 and node 3 are f(2)=g(2)+h(2)=2+2=4 and f(3)=g(3)+h(3)=2+2=4.

Node pB=3 is selected from the first node set and the candidate node set, and node pA=1 is selected from the relaxation operation set.

Since f(pB)<f(pA), the small-step algorithm is selected. Node 4 is the adjacent node that is closest to the iteration starting point and whose shortest path has not been determined. The shortest path of node 4 is determined, its determined path length is g(4)=3, the cumulative moving step size is 3, and the first node set is updated to {2,4}.

The number of nodes in the first node set and the candidate node set is 2, which is less than N=3, and there is no need to move nodes to the relaxation operation set.

The fourth loop: Calculate the heuristic path length of the edge node.

The edge nodes include node 1, node 2 and node 4. The heuristic path length of node 1 is f(1)=g(1)+h(1)=1+4=5, and the heuristic path lengths of node 2 and node 4 are f(2)=g(2)+h(2)=2+2=4 and f(4)=g(4)+h(4)=3+5=8.

Node pB=4 is selected from the first node set and the candidate node set, and node pA=1 is selected from the relaxation operation set.

Since f(pB)>f(pA), the relaxation operation algorithm is selected and the paths of nodes 6 and 7 are determined. The determined path lengths of node 6 and node 7 are g(6)=4 and g(7)=5 respectively, and the relaxation operation set is updated to {6,7}.

The heuristic path lengths of the nodes in the relaxation operation set are f(6)=g(6)+h(6)=4+1=5 and f(7)=g(7)+h(7)=5+9=14. The excellent node in the relaxation operation set is node 6: f(6)<f(pB), and g(6)=4 is greater than or equal to the cumulative movement step size of 3. Therefore, node 6 is added to the candidate node set. At this time, the candidate node set is {6} and the first node set is {2,4}. The number of nodes in the first node set plus the candidate node set is greater than N, so node 4, which has the largest heuristic path length among the first node set and the candidate node set, needs to be moved to the relaxation operation set. At this time, the relaxation operation set is {7,4}, the candidate node set is {6}, and the first node set is {2}.
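The promotion of the excellent node can be sketched in Python as below. The function name `promote_excellent` is hypothetical, and the two admission conditions (heuristic path length below f(pB), determined path length at least the cumulative step) are taken from this worked example rather than from a general statement of the rule.

```python
def promote_excellent(relaxation, candidates, f, g, f_pb, step):
    """Move the excellent node from the relaxation set to the candidate set."""
    if not relaxation:
        return None
    # Smallest heuristic length; among those, largest determined length.
    best = min(relaxation, key=lambda n: (f[n], -g[n]))
    if f[best] < f_pb and g[best] >= step:
        relaxation.remove(best)
        candidates.add(best)
        return best
    return None

# Values from the fourth loop: f(pB) = f(4) = 8, cumulative step size 3.
relaxation, candidates = {6, 7}, set()
moved = promote_excellent(relaxation, candidates,
                          f={6: 5, 7: 14}, g={6: 4, 7: 5}, f_pb=8, step=3)
print(moved, relaxation, candidates)
```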

The fifth loop: Calculate the heuristic path length of the edge node.

The edge nodes include node 2, node 6, node 4 and node 7. The heuristic path length of node 2 is f(2)=g(2)+h(2)=2+2=4, the heuristic path length of node 6 is f(6)=g(6)+h(6)=4+1=5, the heuristic path length of node 4 is f(4)=g(4)+h(4)=3+5=8, and the heuristic path length of node 7 is f(7)=g(7)+h(7)=5+9=14.

Node pB=6 is selected from the first node set and the candidate node set, and node pA=4 is selected from the relaxation operation set. Since f(pB)<f(pA), the small-step algorithm is selected. Node 8 is the adjacent node that is closest to the iteration starting point and whose shortest path has not been determined. The shortest path of node 8 is determined, its determined path length is g(8)=6, and the first node set is updated to {2}.

Node 8 is the iteration end point, so a shortest path to the iteration end point has now been found. Because f(2)=4<=g(8), the next loop still needs to be executed.

The sixth loop: Calculate the heuristic path length of the edge node.

The edge nodes include node 2, node 7 and node 4. The heuristic path length of node 2 is f(2)=g(2)+h(2)=2+2=4, the heuristic path length of node 4 is f(4)=g(4)+h(4)=3+5=8, and the heuristic path length of node 7 is f(7)=g(7)+h(7)=5+9=14.

Continue iteration according to the aforementioned loop process until the f values of all nodes in the candidate node set, the relaxation operation set and the first node set are greater than g(8).
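The stopping condition stated above can be expressed as a small predicate. The name `search_finished` is an assumption for illustration; the values come from the fifth and sixth loops.

```python
def search_finished(edge_nodes, f, g_target):
    """True once every remaining edge node has f strictly greater than g(target)."""
    return all(f[n] > g_target for n in edge_nodes)

f = {2: 4, 4: 8, 7: 14}
# After the fifth loop: f(2)=4 is not greater than g(8)=6, so iteration continues.
print(search_finished({2, 4, 7}, f, g_target=6))
# Once node 2 is no longer an edge node, the remaining f values exceed g(8).
print(search_finished({4, 7}, f, g_target=6))
```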

In an embodiment of this article, in order to avoid invalid iterations and thus affect the speed of finding the minimum path, the above step S242 after determining the relevant information of the first node also includes:

In an embodiment of this article, in order to avoid missing the shortest path, the small step algorithm also includes:

In an embodiment of this article, in order to improve the efficiency of shortest path determination, a task-parallel distributed system is also provided. This distributed system can use multiple cluster devices to simultaneously process multiple single-target-node shortest path determination tasks and single-source-node shortest path determination tasks, and can implement the different computing processes within each task using cluster or parallel computing methods.

Based on the same inventive concept, this article also provides an information acquisition device based on the shortest path of the knowledge graph, as described in the following embodiments. Since the principle by which the device solves the problem is similar to that of the information acquisition method based on the shortest path of the knowledge graph, the implementation of the device can refer to the implementation of the method, and the overlapping parts are not described again.

Specifically, as shown in Figure 7, an information acquisition device based on the shortest path of the knowledge graph includes:

The initialization unit 701 is used to initially set the iteration starting point and determine the shortest path according to the shortest path search strategy and the nodes in the user request;

The shortest path determination unit 702 is configured to use nodes that have determined shortest paths and have adjacent nodes with undetermined shortest paths as first nodes to form a first node set, to select, from the adjacent nodes with undetermined shortest paths of the first nodes in the first node set, the adjacent node closest to the iteration starting point, to determine its shortest path, and to update the first node set;

The loop control unit 703 is configured to repeatedly start the shortest path determination unit 702 until all expected shortest paths are found, wherein the expected shortest path is related to the shortest path search strategy;

The information acquisition unit 704 is configured to acquire information from the knowledge graph according to the shortest path of each node and respond to the user request according to the acquired information.

As shown in Figure 8, the information acquisition device based on the shortest path of the knowledge graph includes:

The initialization unit 801 is used to initially set the iteration starting point and edge nodes according to the shortest path search strategy and the source node and target node in the user request. The edge nodes include first node set nodes, which have determined shortest paths and have adjacent nodes with undetermined shortest paths; candidate node set nodes, which can be added to the first node set and whose shortest paths are not determined; and relaxation operation set nodes, whose paths are obtained through the relaxation operation but whose shortest paths are not determined. The first node set is initially set to include the iteration starting point with its shortest path determined, and the candidate node set and the relaxation operation set are empty. The determined path length of the iteration starting point is initially set to zero, and the determined path lengths of the remaining nodes to infinity;

The calculation unit 802 is used to calculate the heuristic path length of the edge node, where the heuristic path length of the edge node is the path length from the iteration starting point to the iteration end point passing through the edge node;

The screening unit 803 is configured to select, according to the heuristic path lengths and determined path lengths of the edge nodes, the node pB with the smallest determined path length among the nodes with the largest heuristic path length from the first node set and the candidate node set, and to select from the relaxation operation set the node pA with the largest determined path length among the nodes with the smallest heuristic path length;

The algorithm selection unit 804 is used to execute the following small-step algorithm if the relaxation operation set is empty, or the heuristic path length of node pB is less than the heuristic path length of node pA, or the heuristic path length of node pB is equal to the heuristic path length of node pA and the determined path length of node pB is greater than or equal to the determined path length of node pA: from the adjacent nodes with undetermined shortest paths of the first nodes in the first node set, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set and the determined path length of the newly added first node;

The node moving unit 805 is used to select excellent nodes from the relaxation operation set and move them to the candidate node set. The excellent node is the node pA2 with the largest determined path length among the nodes with the smallest heuristic path length in the relaxation operation set;

The loop control unit 806 is used to repeatedly start the calculation unit, the screening unit, the algorithm selection unit and the node moving unit until the shortest path from the source node to the target node is found and the heuristic path lengths of the edge nodes are greater than the length of the shortest path from the source node to the target node;

The information acquisition unit 807 is used to acquire information from the knowledge graph according to the shortest path of each node and respond to user requests according to the acquired information.
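The selection rules of the screening unit 803 and the algorithm selection unit 804 can be sketched in Python as follows. The function names are hypothetical, ties beyond the stated criteria are broken arbitrarily, and the first node set and candidate node set are assumed not to be both empty.

```python
def pick_pb(first, candidates, f, g):
    """Largest heuristic length; among those, smallest determined length."""
    return min(first | candidates, key=lambda n: (-f[n], g[n]))

def pick_pa(relaxation, f, g):
    """Smallest heuristic length; among those, largest determined length."""
    if not relaxation:
        return None
    return min(relaxation, key=lambda n: (f[n], -g[n]))

def use_small_step(pb, pa, f, g):
    """True if the small-step algorithm is chosen, False for the relaxation operation."""
    if pa is None:
        return True  # relaxation operation set is empty
    if f[pb] < f[pa]:
        return True
    return f[pb] == f[pa] and g[pb] >= g[pa]

# Values from the worked example in the description.
f = {0: 5, 1: 5, 2: 4, 3: 4, 4: 8, 6: 5, 7: 14}
g = {0: 0, 1: 1, 2: 2, 3: 2, 4: 3, 6: 4, 7: 5}
pb = pick_pb({2, 4}, set(), f, g)   # fourth loop
pa = pick_pa({1}, f, g)
print(pb, pa, use_small_step(pb, pa, f, g))
```

With the fourth-loop values this selects pB=4 and pA=1 and chooses the relaxation operation, matching the example.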

In an embodiment of this article, a computer device is also provided. As shown in Figure 9, the computer device 902 includes a memory 906, a processor 904, and a computer program stored in the memory 906 and executable on the processor 904. The processor 904 implements the method described in any of the foregoing embodiments when executing the computer program. The processor 904 may be, for example, one or more central processing units (CPUs), and each processing unit can implement one or more hardware threads. Memory 906 is used to store any kind of information such as code, settings, data, etc. For example, without limitation, the memory 906 may include any one or more combinations of the following: any type of RAM, any type of ROM, flash memory device, hard disk, optical disk, etc. More generally, any memory can use any technology to store information. Further, any memory can provide volatile or non-volatile retention of information. Further, any memory may represent a fixed or removable component of computer device 902. In one instance, when processor 904 executes associated instructions stored in any memory or combination of memories, computer device 902 may perform any operation of the associated instructions. Computer device 902 also includes one or more drive mechanisms 908 for interacting with any memory, such as a hard disk drive, an optical disk drive, and the like.

Computer device 902 may also include an input/output module 910 (I/O) for receiving various inputs (via input device 912) and for providing various outputs (via output device 914). One particular output mechanism may include a presentation device 916 and an associated graphical user interface 918 (GUI). In other embodiments, the input/output module 910 (I/O), the input device 912 and the output device 914 may be omitted, and the computer device may serve only as a computer device in a network. Computer device 902 may also include one or more network interfaces 920 for exchanging data with other devices via one or more communication links 922. One or more communication buses 924 couple together the components described above.

Communication link 922 may be implemented in any manner, such as through a local area network, a wide area network (eg, the Internet), a point-to-point connection, etc., or any combination thereof. Communication link 922 may include any combination of hardwired links, wireless links, routers, gateway functions, name servers, etc. governed by any protocol or combination of protocols.

Corresponding to the methods in Figures 1 to 2 and Figures 4 to 5, embodiments of this article also provide a computer-readable storage medium. The computer-readable storage medium stores a computer program, and when the computer program is run by a processor, the steps of the above method are performed.

Embodiments of this article also provide computer-readable instructions, wherein when a processor executes the instructions, the processor is caused to perform the methods shown in Figures 1 to 2 and Figures 4 to 5.

It should be understood that in the various embodiments of this article, the magnitude of the sequence numbers of the above-mentioned processes does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of this article.

It should also be understood that in the embodiments of this article, the term "and/or" merely describes an association relationship between associated objects, indicating that three relationships can exist. For example, A and/or B can mean three cases: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" in this article generally indicates that the related objects are in an "or" relationship.

Those of ordinary skill in the art can appreciate that the units and algorithm steps of each example described in conjunction with the embodiments disclosed herein can be implemented with electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. The skilled artisan may implement the described functionality using different methods for each particular application, but such implementations should not be considered beyond the scope of this article.

Those skilled in the art can clearly understand that for the convenience and simplicity of description, the specific working processes of the systems, devices and units described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be described again here.

In the several embodiments provided herein, it should be understood that the disclosed systems, devices and methods can be implemented in other ways. For example, the device embodiments described above are only illustrative. For example, the division of the units is only a logical function division. In actual implementation, there may be other division methods. For example, multiple units or components may be combined or can be integrated into another system, or some features can be ignored, or not implemented. In addition, the coupling or direct coupling or communication connection between each other shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, or may be electrical, mechanical or other forms of connection.

The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or they may be distributed to multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiments of this article.

In addition, each functional unit in each embodiment of this article can be integrated into one processing unit, each unit can exist physically alone, or two or more units can be integrated into one unit. The above integrated units can be implemented in the form of hardware or software functional units.

If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution in this article, in essence or in the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to cause a computer device (which can be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in various embodiments of this article. The aforementioned storage media include: U disk, mobile hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), magnetic disks or optical disks and other media that can store program code.

This article uses specific embodiments to illustrate its principles and implementation methods. The description of the above embodiments is only intended to help in understanding the methods and core ideas of this article. At the same time, for those of ordinary skill in the field, there will be changes in the specific implementation and application scope based on the ideas of this article. In summary, the content of this description should not be understood as a limitation of this article.

Claims

An information acquisition method based on the shortest path of a knowledge graph, characterized in that the knowledge graph includes multiple nodes and adjacent edges between nodes, and the weight of the adjacent edges between nodes represents the distance between nodes. The method includes:

S11, according to the shortest path search strategy and the nodes in the user request, initially set the iteration starting point and determine its shortest path;

S12, use nodes that have determined shortest paths and have adjacent nodes with undetermined shortest paths as first nodes to form a first node set; from the adjacent nodes with undetermined shortest paths of the first nodes in the first node set, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set;

S13, repeat the above step S12 until all expected shortest paths are found, where the expected shortest paths are related to the shortest path search strategy;

S14. According to the shortest path of each node, obtain information from the knowledge graph and respond to the user request according to the obtained information.
The method according to claim 1, characterized in that if the shortest path search strategy is a single source search, then the source node in the user request is set as the iteration starting point, and steps S12 and S13 are executed starting from the source node; the condition for determining that all expected shortest paths have been found is that no node with a determined shortest path has an adjacent node with an undetermined shortest path;

If the shortest path search strategy is a single source single target forward search, then set the source node in the user request as the iteration starting point, and execute steps S12 and S13 according to the forward search direction from the source node to the target node;

If the shortest path search strategy is a single source single target reverse search, then set the target node in the user request as the iteration starting point, and execute steps S12 and S13 according to the reverse search direction from the target node to the source node;

If the shortest path search strategy is a single source single target bidirectional search, then the source node and the target node in the user request are both set as iteration starting points, and steps S12 and S13 are executed respectively in the forward search direction from the source node to the target node and in the reverse search direction from the target node to the source node;

For single source single target forward search, single source single target reverse search, and single source single target bidirectional search, the iteration starting point of each search direction is associated with the nodes whose shortest paths have been determined in that search direction, and the source node and the target node are each associated with themselves; the condition for determining that all expected shortest paths have been found is that there is a node simultaneously associated with the source node and the target node.
The method of claim 1, wherein the shortest path of a node includes: the node's predecessor node and the node's determined path length;

The determined path length of the node is equal to the sum of the determined path length of the node's predecessor node and the weight of the adjacent edge between the node and that predecessor node, and the node includes at least one predecessor node;

In S11, initially setting the iteration starting point and determining its shortest path includes: setting the predecessor node of the iteration starting point to be empty, the determined path length of the iteration starting point to be zero, and the determined path lengths of the remaining nodes to be infinity.
The method according to claim 3, characterized in that in step S12, selecting, from the adjacent nodes with undetermined shortest paths of the first nodes in the first node set, the adjacent node closest to the iteration starting point, determining its shortest path, and updating the first node set includes:

S121. Determine relevant information for each first node in the first node set for which no relevant information has been determined, and remove first nodes without relevant information from the first node set. The relevant information of the first node includes: the estimated shortest path length of the second node, the adjacent edge to be processed, and the second node; the adjacent edge to be processed is the adjacent edge that satisfies the following conditions among all the adjacent edges between the first node and its adjacent nodes for which the shortest path has not been determined: the weight of the adjacent edge is the minimum among those adjacent edges, and the sum of the weight of the adjacent edge and the determined path length of the first node is greater than the cumulative movement step associated with the first node set; the second node is the endpoint of the adjacent edge to be processed other than the first node; the estimated shortest path length of the second node is equal to the weight of the adjacent edge to be processed plus the determined path length of the first node;

S122, select the first node with the smallest expected shortest path length of the second node from the first node set;

S123, the second node related to the filtered first node is the adjacent node closest to the iteration starting point;

S124, update the cumulative movement step length associated with the first node set to the estimated shortest path length of the second node associated with the filtered first node;

S125, perform the following processing on the second node related to each filtered first node, the adjacent edge to be processed, and the estimated shortest path length of the second node:

S1251. When the estimated shortest path length of the second node is equal to the determined path length of the second node, take the second node as a successor node, add a shortest path to the successor node according to the first node and its successor node, and determine the relevant information of the first node. If there is no relevant information, remove the first node from the first node set;

S1252. When the estimated shortest path length of the second node is less than the determined path length of the second node, take the second node as a successor node, add a shortest path to the successor node according to the first node and its successor node, and determine the relevant information of the first node. If there is no relevant information, remove the first node from the first node set. Add the second node to the first node set as a first node and determine the relevant information of the newly added first node. If there is no relevant information, remove the newly added first node from the first node set.
The method of claim 4, further comprising:

Determine the third node according to the ordered adjacent edge linear list of each first node;

If the sum of the determined path length of the first node and the weight of the adjacent edge between the first node and its third node is equal to the determined path length of the third node, then the third node is regarded as the successor node, and a shortest path is added to the successor node according to the first node and its successor node; otherwise,

If the sum of the determined path length of the first node and the weight of the adjacent edge between the first node and its third node is equal to the cumulative movement step associated with the first node set, then the third node is regarded as the successor node, a shortest path is added to the successor node according to the first node and its successor node, the third node is added to the first node set as a first node, and the relevant information of the newly added first node is determined. If there is no relevant information, the newly added first node is removed from the first node set.
The method of claim 4 or 5, wherein adding a shortest path to the successor node based on the first node and its successor node includes:

The predecessor node of the new shortest path of the successor node is the first node;

The new shortest path length of the successor node is the weight of the adjacent edge between the first node and the successor node plus the determined path length of the first node.
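For illustration only, the rule above for adding a shortest path to a successor node can be sketched in Python. The names `PathRecord` and `add_shortest_path` are hypothetical, and keeping several predecessors when path lengths tie is an assumption based on the statement elsewhere in the claims that a node includes at least one predecessor node.

```python
from dataclasses import dataclass, field


@dataclass
class PathRecord:
    """Shortest-path record of a node: determined length plus predecessor(s)."""
    length: float = float("inf")
    predecessors: list = field(default_factory=list)


def add_shortest_path(records, first, successor, weight):
    """Add a path to `successor` through `first` (hypothetical helper).

    New length = edge weight + determined path length of the first node;
    the predecessor of the new path is the first node.
    """
    new_length = records[first].length + weight
    rec = records.setdefault(successor, PathRecord())
    if new_length < rec.length:
        # Strictly shorter path: replace the record.
        rec.length = new_length
        rec.predecessors = [first]
    elif new_length == rec.length and first not in rec.predecessors:
        # Equally short path: assumption that it is kept as an alternative.
        rec.predecessors.append(first)
    return rec


records = {"a": PathRecord(0.0, []), "c": PathRecord(1.0, ["a"])}
add_shortest_path(records, "a", "b", 2.0)   # b reached from a with length 2
add_shortest_path(records, "c", "b", 1.0)   # equally short path through c
```

After the two calls, node `b` has length 2.0 and both `a` and `c` recorded as predecessors.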
The method according to claim 4 or 5, characterized in that after determining the relevant information of the first node, it further includes:

Determine whether the same second node exists in the relevant information of at least two first nodes;

If so, compare the estimated shortest path lengths of the second node for the first nodes that share the same second node;

If these estimated shortest path lengths differ, retain the relevant information only for the first node with the smallest estimated shortest path length of the second node, and re-determine the relevant information for the other first nodes.
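A minimal sketch of this comparison follows, assuming the relevant information is held in a dictionary that maps each first node to a pair (second node, estimated shortest path length). The function name `resolve_shared_second_nodes` and the data layout are hypothetical.

```python
def resolve_shared_second_nodes(info):
    """Keep, for each shared second node, only the first nodes with the
    smallest estimated length; return the first nodes to re-determine.

    info: dict mapping first node -> (second node, estimated length).
    """
    best = {}
    for first, (second, est) in info.items():
        if second not in best or est < best[second]:
            best[second] = est
    # First nodes whose estimate is strictly larger lose their information.
    losers = [first for first, (second, est) in info.items() if est > best[second]]
    for first in losers:
        del info[first]
    return losers


info = {"a": ("x", 5), "b": ("x", 3), "c": ("y", 4)}
to_redetermine = resolve_shared_second_nodes(info)
```

Here `a` and `b` share the second node `x`; only `b` keeps its information, and `a` is returned for re-determination. First nodes with equal estimates are both retained.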
An information acquisition method based on the shortest path of a knowledge graph, characterized in that the knowledge graph includes multiple nodes and adjacent edges between nodes, and the weight of the adjacent edges between nodes represents the distance between nodes. The method includes:

S21. According to the shortest path search strategy and the source node and target node in the user request, initially set the iteration starting point and the edge nodes. The edge nodes include: nodes of the first node set, whose shortest paths have been determined and which have adjacent nodes with undetermined shortest paths; nodes of the candidate node set, which can be added to the first node set and whose shortest paths have not been determined; and nodes of the relaxation operation set, whose paths have been obtained through the relaxation operation but whose shortest paths have not been determined. The first node set is initially set to include the iteration starting point, whose shortest path is determined, and the candidate node set and the relaxation operation set are initially empty; the determined path length of the iteration starting point is initially set to zero, and the determined path lengths of the remaining nodes to infinity;

S22, calculate the heuristic path length of the edge node, where the heuristic path length of the edge node is the path length from the iteration starting point to the iteration end point passing through the edge node;

S23. According to the heuristic path length and the determined path length of the edge nodes, select from the first node set and the candidate node set the node pB with the smallest determined path length among the nodes with the largest heuristic path length, and select from the relaxation operation set the node pA with the largest determined path length among the nodes with the smallest heuristic path length;

S24. If the relaxation operation set is empty, or the heuristic path length of node pB is less than the heuristic path length of node pA, or the heuristic path length of node pB is equal to the heuristic path length of node pA and the determined path length of node pB is greater than or equal to the determined path length of node pA, execute the following small-step algorithm: from the adjacent nodes of the first nodes in the first node set whose shortest paths have not been determined, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set;

Otherwise, calculate the paths of all adjacent nodes of the node pA in the relaxation operation set based on the relaxation operation and update the relaxation operation set;

S25. Select excellent nodes from the relaxation operation set and move them to the candidate node set, where the excellent node is the node pA2 with the largest determined path length among the nodes with the smallest heuristic path length in the relaxation operation set;

When the number of nodes in the first node set and the candidate node set is greater than a second predetermined value, the nodes exceeding the second predetermined value are filtered out from the first node set and the candidate node set and moved to the relaxation operation set;

S26, repeat the above process from step S22 to step S25 until the shortest path from the source node to the target node is found and the heuristic path length of the edge node is greater than the length of the shortest path from the source node to the target node;

S27: Obtain information from the knowledge graph according to the shortest path of each node and respond to the user request according to the obtained information.
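The choice made in steps S23 and S24 between the small-step algorithm and the relaxation operation can be sketched as follows. This is an illustrative reading of those steps, not a definitive implementation: `choose_step` is a hypothetical name, `front` stands for the union of the first node set and the candidate node set, `f` and `g` are dictionaries of heuristic and determined path lengths, and `front` is assumed to be non-empty whenever the relaxation operation set is non-empty.

```python
def choose_step(front, relax, f, g):
    """Return ("small_step", None) or ("relaxation", pA) per steps S23/S24.

    front: nodes of the first node set and the candidate node set.
    relax: nodes of the relaxation operation set.
    f, g:  dicts of heuristic and determined path lengths.
    """
    if not relax:
        return ("small_step", None)
    # pB: smallest determined length among nodes with the largest heuristic length.
    p_b = min(front, key=lambda n: (-f[n], g[n]))
    # pA: largest determined length among nodes with the smallest heuristic length.
    p_a = min(relax, key=lambda n: (f[n], -g[n]))
    if f[p_b] < f[p_a] or (f[p_b] == f[p_a] and g[p_b] >= g[p_a]):
        return ("small_step", None)
    return ("relaxation", p_a)


f = {"u": 10, "v": 12, "w": 11, "x": 11}
g = {"u": 3, "v": 4, "w": 5, "x": 7}
decision = choose_step(["u", "v"], ["w", "x"], f, g)
```

In this example pB is `v` (heuristic length 12) and pA is `x` (heuristic length 11, determined length 7), so the relaxation operation is applied to `x`.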
The method according to claim 8, characterized in that if the shortest path search strategy is a single-source single-target forward search, the iteration starting point is the source node, and steps S22 to S25 are executed in the forward search direction from the source node to the target node;

If the shortest path search strategy is a single-source single-target reverse search, the iteration starting point is the target node, and steps S22 to S25 are executed in the reverse search direction from the target node to the source node;

If the shortest path search strategy is a single-source single-target bidirectional search, the iteration starting points are the source node and the target node, and steps S22 to S25 are executed both in the forward search direction from the source node to the target node and in the reverse search direction from the target node to the source node;

The iteration starting point of each search direction is associated with the nodes whose shortest paths have been determined in that search direction, the source node and the target node are each associated with themselves, and the condition for determining that the shortest path from the source node to the target node has been found is that there exists a node associated with both the source node and the target node.
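The association test used as the termination condition in bidirectional search can be sketched as below. The helper names `associate` and `found_meeting_node` and the dictionary layout are assumptions made for illustration.

```python
def associate(assoc, node, start):
    """Record that `node` has a determined shortest path from `start`."""
    assoc.setdefault(node, set()).add(start)


def found_meeting_node(assoc, source, target):
    """True once some node is associated with both the source and the target."""
    return any({source, target} <= starts for starts in assoc.values())


assoc = {}
associate(assoc, "s", "s")   # source is associated with itself
associate(assoc, "t", "t")   # target is associated with itself
before = found_meeting_node(assoc, "s", "t")
associate(assoc, "m", "s")   # forward search determines m
associate(assoc, "m", "t")   # reverse search determines m
after = found_meeting_node(assoc, "s", "t")
```

The condition is false before node `m` is reached from both directions and true afterwards.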
The method of claim 8, wherein in step S25, filtering out nodes exceeding the second predetermined value from the first node set and the candidate node set and moving them to the relaxation operation set includes:

Determine the excellent nodes in the first node set and the candidate node set in the following way: order the nodes by heuristic path length from small to large and by determined path length from large to small, and select at most the top N nodes, where N is the second predetermined value;

Move the non-excellent nodes in the first node set and the candidate node set to the relaxation operation set.
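As a sketch of this filtering, the nodes can be ranked by heuristic path length (ascending) and then by determined path length (descending), keeping at most the second predetermined value of them. The function name `split_excellent` is hypothetical, and using the determined path length only as a tie-breaker is an assumption about how the two orderings combine.

```python
def split_excellent(nodes, f, g, limit):
    """Split nodes into (excellent, non_excellent).

    nodes: nodes of the first node set and the candidate node set.
    f, g:  dicts of heuristic and determined path lengths.
    limit: the second predetermined value.
    """
    ranked = sorted(nodes, key=lambda n: (f[n], -g[n]))
    return ranked[:limit], ranked[limit:]


f = {"a": 5, "b": 3, "c": 3, "d": 9}
g = {"a": 1, "b": 2, "c": 4, "d": 0}
keep, to_relaxation_set = split_excellent(["a", "b", "c", "d"], f, g, limit=2)
```

With a limit of 2, nodes `c` and `b` are kept and nodes `a` and `d` would be moved to the relaxation operation set.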
The method of claim 10, wherein the second predetermined value determination strategy includes:

Set a fixed second predetermined value based on partial or all knowledge graph data; or

Set a fixed second predetermined value based on the difference between the iteration starting point and the iteration end point; or

Design upper and lower limits for the second predetermined value based on the knowledge graph data and the application scenario. On initialization, the second predetermined value may be set to a value between the lower limit and the upper limit, and it may then be adjusted dynamically according to the following strategy: in every K0 consecutive iterations, if the number of nodes moved from the relaxation operation set to the candidate node set is less than K1, the second predetermined value is reduced by M; if that number is greater than K2, the second predetermined value is increased by M, where K0, K1, and M are positive integers.
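The dynamic adjustment strategy can be sketched as a function called once per window of K0 iterations. The name `adjust_limit`, clamping the result to the lower and upper limits, and the requirement that K1 not exceed K2 are assumptions; the text does not state them explicitly.

```python
def adjust_limit(limit, moved_count, k1, k2, m, lower, upper):
    """Adjust the second predetermined value after a window of K0 iterations.

    moved_count: nodes moved from the relaxation operation set to the
                 candidate node set during the window.
    Assumes k1 <= k2 and clamps the result to [lower, upper].
    """
    if moved_count < k1:
        limit -= m
    elif moved_count > k2:
        limit += m
    return max(lower, min(upper, limit))


shrunk = adjust_limit(10, moved_count=1, k1=3, k2=8, m=2, lower=4, upper=12)
grown = adjust_limit(10, moved_count=9, k1=3, k2=8, m=2, lower=4, upper=12)
```

With these sample parameters, few moves shrink the value from 10 to 8 and many moves grow it from 10 to 12.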
The method of claim 10, further comprising:

When the first node set changes, the first intersection nodes are removed from the relaxation operation set and the candidate node set, wherein a first intersection node is a node in the intersection of the first node set with the relaxation operation set and the candidate node set;

When the relaxation operation set changes, the second intersection nodes are removed from the first node set and the candidate node set, wherein a second intersection node is a node in the intersection of the relaxation operation set with the first node set and the candidate node set.
The method of claim 8, wherein the shortest path of a node includes: a predecessor node of the node and the determined path length of the node;

The predecessor node of the iteration starting point is empty;

The determined path length of the node is equal to the sum of the determined path length of the node's predecessor node and the weight of the adjacent edge between the node and that predecessor node, and the node includes at least one predecessor node.
The method of claim 8, wherein the heuristic path length of the edge node is calculated using the following formula:
f(n)=g(n)+h(n);

where n is an edge node, f(n) is the heuristic path length of edge node n, g(n) is the determined path length from the iteration starting point to edge node n, h(n) is the predicted path length from edge node n to the iteration end point, and h is the estimation function.
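As a worked example of the formula, the sketch below uses straight-line distance as the estimation function h. This choice, the node coordinates, and the function name `heuristic_length` are assumptions for illustration only; the claim does not fix a particular estimation function.

```python
import math


def heuristic_length(g_n, pos_n, pos_end):
    """f(n) = g(n) + h(n), with h(n) assumed to be Euclidean distance."""
    h_n = math.dist(pos_n, pos_end)
    return g_n + h_n


# g(n) = 2.0, node n at (0, 0), iteration end point at (3, 4): h(n) = 5.0
f_n = heuristic_length(2.0, (0, 0), (3, 4))
```

Here f(n) = 2.0 + 5.0 = 7.0.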
The method according to claim 8, characterized in that when initially setting the first node set to include the iteration starting point in step S21, the cumulative movement step associated with the first node set is also set to zero;

In step S24, the cumulative movement step associated with the first node set is also updated;

In step S25, the execution conditions for selecting excellent nodes from the relaxation operation set and moving them to the candidate node set include:

The first node set and the candidate node set are empty; or

The heuristic path length of the excellent node is less than the maximum heuristic path length of the nodes in the first node set and the candidate node set, and the determined path length of the excellent node is greater than or equal to the cumulative movement step associated with the first node set.
The method according to claim 8, characterized in that, in the small-step algorithm of step S24, selecting, from the adjacent nodes of the first nodes in the first node set whose shortest paths have not been determined, the adjacent node closest to the iteration starting point to determine its shortest path, and updating the first node set, includes:

S241. When the first node set is empty, move the node with the smallest determined path length in the candidate node set to the first node set, and set the cumulative movement step associated with the first node set to the determined path length of the moved node;

S242. For each first node in the first node set whose relevant information has not been determined, determine its relevant information, and remove any first node without relevant information from the first node set. The relevant information of a first node includes: the adjacent edge to be processed, the second node, and the estimated shortest path length of the second node. The adjacent edge to be processed is the adjacent edge that satisfies the following conditions among all adjacent edges between the first node and its adjacent nodes whose shortest paths have not been determined: its weight is the minimum among all such adjacent edges, and the sum of its weight and the determined path length of the first node is greater than the cumulative movement step associated with the first node set. The second node is the endpoint of the adjacent edge to be processed other than the first node. The estimated shortest path length of the second node is equal to the weight of the adjacent edge to be processed plus the determined path length of the first node;

S243. Select from the first node set the first node with the smallest estimated shortest path length of the second node; according to the selected first node, select a node from the candidate node set to move to the first node set;

S244, the second node related to the filtered first node is the adjacent node closest to the iteration starting point;

S245, update the cumulative movement step length associated with the first node set to the estimated shortest path length of the second node associated with the filtered first node;

S246: Perform the following processing on the second node, the adjacent edge to be processed and the estimated shortest path length of the second node related to each filtered out first node:

S2461. When the estimated shortest path length of the second node is equal to the determined path length of the second node, the second node is taken as the successor node, a shortest path is added to the successor node according to the first node and its successor node, and the relevant information of the first node is determined; if there is no relevant information, the first node is removed from the first node set;

S2462. When the estimated shortest path length of the second node is less than the determined path length of the second node, the second node is taken as the successor node, a shortest path is added to the successor node according to the first node and its successor node, and the relevant information of the first node is determined; if there is no relevant information, the first node is removed from the first node set; the second node is added to the first node set as a first node, and the relevant information of the newly added first node is determined; if there is no relevant information, the newly added first node is removed from the first node set.
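The core idea of the small-step algorithm can be illustrated with a deliberately simplified single-source sketch. It keeps only the first node set and the per-node adjacent edge to be processed; the candidate node set, the cumulative movement step, the third-node handling, and multiple predecessors are omitted. The function name and the adjacency-list layout are hypothetical, and edge weights are assumed to be non-negative.

```python
def small_step_shortest_paths(graph, source):
    """Simplified small-step search from `source`.

    graph: dict mapping node -> list of (neighbor, weight), weights >= 0.
    Returns (dist, pred): determined path lengths and one predecessor per node.
    """
    # Ordered adjacent-edge list of each node, by ascending weight.
    adj = {u: sorted(edges, key=lambda e: e[1]) for u, edges in graph.items()}
    dist = {source: 0.0}
    pred = {source: None}
    cursor = {source: 0}   # first node set: node -> index of its pending edge
    while cursor:
        pending = {}       # first node -> (second node, estimated length)
        for u in list(cursor):
            edges = adj.get(u, [])
            i = cursor[u]
            # Skip edges whose far endpoint already has a determined path.
            while i < len(edges) and edges[i][0] in dist:
                i += 1
            if i == len(edges):
                del cursor[u]          # no relevant information: drop u
            else:
                cursor[u] = i
                pending[u] = (edges[i][0], dist[u] + edges[i][1])
        if not pending:
            break
        step = min(est for _, est in pending.values())
        # Every second node at the minimum estimate is determined in this step.
        for u, (v, est) in pending.items():
            if est == step and v not in dist:
                dist[v] = est
                pred[v] = u
                cursor[v] = 0          # v joins the first node set
    return dist, pred


graph = {
    "s": [("a", 1), ("b", 4)],
    "a": [("b", 2), ("c", 5)],
    "b": [("c", 1)],
    "c": [],
}
dist, pred = small_step_shortest_paths(graph, "s")
```

Each iteration compares only one pending edge per first node, and all second nodes that tie at the minimum estimate are determined together, which mirrors the statement in the abstract that one or more shortest paths can be calculated in each iteration.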
The method of claim 16, wherein step S243, based on the filtered first node, selecting a node from the candidate node set to move to the first node set includes:

When the estimated shortest path length of the second node related to the filtered first node is greater than or equal to the minimum determined path length of the node in the candidate node set, follow the following steps to move the node in the candidate node set to the first node set:

For each node pX corresponding to the minimum determined path length in the set of nodes to be selected, determine whether the second node related to the filtered first node contains the node pX,

If not, move the node pX from the candidate node set to the first node set and determine relevant information for the node pX, where the estimated shortest path length of the second node related to the node pX = the determined path length of the node pX + the weight of the adjacent edge to be processed of the node pX; then jump to step S243 and execute that step again,

If so, remove the node pX from the candidate node set, and repeat the above steps until the determined path lengths of the nodes in the candidate node set are greater than the estimated shortest path length of the second node related to the filtered first node.
The method of claim 16, wherein the small step algorithm further includes:

Determine the third node according to the ordered adjacent edge linear list of each first node;

If the sum of the determined path length of the first node and the weight of the adjacent edge between the first node and its third node is equal to the determined path length of the third node, the third node is taken as the successor node, and a shortest path is added to the successor node according to the first node and its successor node; otherwise,

If the sum of the determined path length of the first node and the weight of the adjacent edge between the first node and its third node is equal to the cumulative movement step associated with the first node set, the third node is taken as the successor node, a shortest path is added to the successor node according to the first node and its successor node, the third node is added to the first node set as a first node, and the relevant information of the newly added first node is determined; if there is no relevant information, the newly added first node is removed from the first node set.
The method according to claim 16 or 18, characterized in that after determining the relevant information of the first node, it further includes:

Determine whether the same second node exists in the relevant information of at least two first nodes;

If so, compare the estimated shortest path lengths of the second node for the first nodes that share the same second node;

If these estimated shortest path lengths differ, retain the relevant information only for the first node with the smallest estimated shortest path length of the second node, and re-determine the relevant information for the other first nodes.
The method of claim 16 or 18, wherein adding a shortest path to the successor node based on the first node and its successor node includes:

The predecessor node of the new shortest path of the successor node is the first node;

The new shortest path length of the successor node is the weight of the adjacent edge between the first node and the successor node plus the determined path length of the first node.
An information acquisition device based on the shortest path of a knowledge graph, characterized in that the knowledge graph includes a plurality of nodes and adjacent edges between nodes, and the weight of the adjacent edges between nodes represents the distance between nodes, and the device includes:

The initialization unit is used to initially set the iteration starting point and determine the shortest path based on the shortest path search strategy and the nodes in the user request;

The shortest path determination unit is configured to take nodes whose shortest paths have been determined and which have adjacent nodes with undetermined shortest paths as first nodes to form a first node set, and to select, from the adjacent nodes of the first nodes in the first node set whose shortest paths are undetermined, the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set;

A loop control unit, configured to repeatedly start the shortest path determination unit until all expected shortest paths are found, wherein the expected shortest path is related to the shortest path search strategy;

An information acquisition unit is used to acquire information from the knowledge graph according to the shortest path of each node and respond to the user request according to the acquired information.
An information acquisition device based on the shortest path of a knowledge graph, characterized in that the knowledge graph includes a plurality of nodes and adjacent edges between nodes, and the weight of the adjacent edges between nodes represents the distance between nodes, and the device includes:

The initialization unit is used to initially set the iteration starting point and the edge nodes according to the shortest path search strategy and the source node and target node in the user request. The edge nodes include: nodes of the first node set, whose shortest paths have been determined and which have adjacent nodes with undetermined shortest paths; nodes of the candidate node set, which can be added to the first node set and whose shortest paths have not been determined; and nodes of the relaxation operation set, whose paths have been obtained through the relaxation operation but whose shortest paths have not been determined. The first node set is initially set to include the iteration starting point, whose shortest path is determined, and the candidate node set and the relaxation operation set are initially empty; the determined path length of the iteration starting point is initially set to zero, and the determined path lengths of the remaining nodes to infinity;

The calculation unit is used to calculate the heuristic path length of the edge node, where the heuristic path length of the edge node is the path length from the iteration starting point to the iteration end point passing through the edge node;

A screening unit, configured to select, based on the heuristic path length and the determined path length of the edge nodes, from the first node set and the candidate node set the node pB with the smallest determined path length among the nodes with the largest heuristic path length, and to select from the relaxation operation set the node pA with the largest determined path length among the nodes with the smallest heuristic path length;

An algorithm selection unit, configured to execute the following small-step algorithm if the relaxation operation set is empty, or the heuristic path length of node pB is less than the heuristic path length of node pA, or the heuristic path length of node pB is equal to the heuristic path length of node pA and the determined path length of node pB is greater than or equal to the determined path length of node pA: from the adjacent nodes of the first nodes in the first node set whose shortest paths have not been determined, select the adjacent node closest to the iteration starting point, determine its shortest path, and update the first node set;

Otherwise, calculate the paths of all adjacent nodes of the node pA in the relaxation operation set based on the relaxation operation and update the relaxation operation set;

The node moving unit is used to select excellent nodes from the relaxation operation set and move them to the candidate node set, where the excellent node is the node pA2 with the largest determined path length among the nodes with the smallest heuristic path length in the relaxation operation set;

When the number of nodes in the first node set and the candidate node set is greater than a second predetermined value, the nodes exceeding the second predetermined value are filtered out from the first node set and the candidate node set and moved to the relaxation operation set;

The loop control unit is used to repeatedly start the calculation unit, the screening unit, the algorithm selection unit and the node moving unit until the shortest path from the source node to the target node is found and the heuristic path length of the edge nodes is greater than the length of the shortest path from the source node to the target node;

The information acquisition unit is used to obtain information from the knowledge graph according to the shortest path of each node and respond to user requests based on the obtained information.
A computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that when the processor executes the computer program, it implements the method according to any one of claims 1 to 20.
A computer storage medium having a computer program stored thereon, characterized in that when the computer program is run by a processor of a computer device, the method according to any one of claims 1 to 20 is executed.