Disclosure of Invention
The technical problem to be solved by the application is to provide a knowledge-graph-based method for calculating the actual controller of an enterprise that improves both storage efficiency and computational efficiency. The application also provides a corresponding enterprise actual controller operation system.
To solve this technical problem, the application discloses an enterprise actual controller operation method comprising the following steps. Step S110: according to the equity data in the enterprise business information, construct a knowledge graph reflecting the shareholder investment relationships of enterprises, using a graph-computing data structure. Step S120: segment the knowledge graph to obtain one or more connected subgraphs. Step S130: in each connected subgraph, extend the first-class edges representing direct investment relationships by adding second-class edges representing absolute stock control relationships and third-class edges representing important shareholder relationships. Step S140: in each connected subgraph, extend the third-class edges representing important shareholder relationships by adding fourth-class edges representing the indirect investment relationships of important shareholders. Step S150: in each connected subgraph, search, for each enterprise node, for a target node representing an actual controller or a suspected actual controller of the enterprise. This enterprise actual controller operation method is the first embodiment of the application. Graph computation allows the actual controller and/or suspected actual controller nodes of an enterprise node to be calculated accurately, and the pruning operation adopted during graph computation reduces the amount of computation and increases its speed.
Further, in step S110, data cleaning is performed on the equity data in the enterprise business information, and then a graph is constructed based on the cleaned data. Therefore, additional burden on the construction of the knowledge graph caused by invalid data, error data and the like can be avoided, and interference on subsequent operation can also be avoided.
Further, the data cleaning comprises one or more of shareholding-ratio validity checking, data consistency checking, invalid data elimination and missing data filling. This is a preferred implementation of data cleaning.
Furthermore, in the knowledge graph, each enterprise and the direct stock holder thereof are respectively used as each node in the graph; the direct investment relationship of the direct shareholder node to the enterprise node is represented by a first type edge. This is a preferred implementation of constructing a knowledge graph.
Further, each node has an entity type attribute, which includes one or more of E, P, G, S, Z; E represents an enterprise; P represents a natural person; G represents a government agency; S represents a public institution; Z represents a social organization. This is used to distinguish between different types of nodes.
Further, the attribute value of the first class edge is a direct investment proportion. This is a preferred implementation of constructing a knowledge graph.
Further, the edges all have a type attribute to distinguish the different types of edges. This is used to distinguish between different types of edges.
Further, in step S130, if any two nodes are connected by a first class edge, and the attribute value of the direct investment proportion of the first class edge is greater than or equal to the first threshold, a second class edge representing the absolute stock control relationship is added between the two nodes. This is a preferred implementation of expanding the knowledge-graph, and new data is expanded by graph computation on the basis of the original data.
Further, the second-class edge has a direction that is the same as the direction of the first-class edge connecting the two nodes. This is a preferred implementation of the extended knowledge-graph.
Further, the attribute of the second-class edge is an absolute stock control relationship, and the direct shareholder node connected with the second-class edge is an absolute stock control shareholder node. This is a preferred implementation of the extended knowledge-graph.
Further, the first threshold is between 45% and 66.7%. This is a preferred range of values for a parameter.
Further, in step S130, if any two nodes are connected by a first class edge, and the direct investment proportion attribute value of the first class edge is smaller than the first threshold and is greater than or equal to the second threshold, a third class edge representing the important shareholder relationship is added between the two nodes. This is a preferred implementation of expanding the knowledge-graph, and new data is expanded by graph computation on the basis of the original data.
Further, the third class of edge has a direction that is the same as the direction of the first class of edge connecting the two nodes. This is a preferred implementation of the extended knowledge-graph.
Further, the attribute of the third-class edge is an important shareholder relationship, and the connected direct shareholder node is an important shareholder node. This is a preferred implementation of the extended knowledge-graph.
Further, the second threshold is between 25% and 35%. This is a preferred range of values for a parameter.
Further, in the step S140, if any two nodes are sequentially connected in the same direction through a plurality of third-type edges, a fourth-type edge representing an indirect investment relationship of an important shareholder is added between the two nodes. This is a preferred implementation of expanding the knowledge-graph, and new data is expanded by graph computation on the basis of the original data.
Further, the fourth class of edges has a direction that is the same as the direction in which the combination of the third class of edges connecting the two nodes points. This is a preferred implementation of the extended knowledge-graph.
Further, the attribute of the fourth-class edge is the indirect investment proportion of the important shareholder, and is obtained by summing the overall attribute values of all paths formed by first-class edges connecting the two nodes. This is a preferred implementation of the extended knowledge-graph.
Further, in step S150, when a certain source node is connected by one or more second-type edges, all nodes connected by the source node through the second-type edges are used as target nodes representing actual controllers of the enterprise. This is the first implementation to find the target node.
Further, in the step S150, when a certain source node is not connected by any second-class edge or third-class edge, the source node does not have a target node representing an actual controller or a suspected actual controller of the enterprise. This is a second implementation of finding a target node.
Further, in step S150, when a certain source node is not connected by any second-class edge but is connected by one or more third-class edges, all nodes connected to the source node by a single third-class edge or a single fourth-class edge are regarded as candidate nodes. For each candidate node, the attribute value considered is the direct investment proportion of the first-class edge connecting the candidate node and the source node, or the indirect investment proportion of the important shareholder carried by the fourth-class edge; the one or more candidate nodes with the largest such attribute value are regarded as target nodes representing suspected actual controllers of the enterprise. This is a third implementation of finding a target node.
The application also discloses an enterprise actual controller operation system which comprises a graph construction module, a connected subgraph segmentation module, a second-class edge extension module, a third-class edge extension module, a fourth-class edge extension module and a target node judgment module. The graph construction module is used for constructing, according to the equity data in the enterprise business information, a knowledge graph reflecting the shareholder investment relationships of enterprises, using a graph-computing data structure. The connected subgraph segmentation module is used for segmenting the knowledge graph to obtain one or more connected subgraphs. The second-class edge extension module is used for adding, on the basis of the first-class edges representing direct investment relationships, second-class edges representing absolute stock control relationships. The third-class edge extension module is used for adding, on the basis of the first-class edges representing direct investment relationships, third-class edges representing important shareholder relationships. The fourth-class edge extension module is used for adding, on the basis of the third-class edges representing important shareholder relationships, fourth-class edges representing the indirect investment relationships of important shareholders. The target node judgment module is used for searching, for each enterprise node, for a target node representing an actual controller or a suspected actual controller of the enterprise. This enterprise actual controller operation system is the first embodiment of the present application. Graph computation allows the actual controller and/or suspected actual controller nodes of an enterprise node to be calculated accurately, and the pruning operation adopted during graph computation reduces the amount of computation and increases its speed.
The application also discloses an enterprise actual controller operation method which comprises the following steps. Step S810: according to the equity data in the enterprise business information, construct a knowledge graph reflecting the shareholder investment relationships of enterprises, using a graph-computing data structure. Step S820: segment the knowledge graph to obtain one or more connected subgraphs. Step S830: in each connected subgraph, search, for each enterprise node, for a target node representing an actual controller or a suspected actual controller of the enterprise. This enterprise actual controller operation method is the second embodiment of the present application, and can be regarded as a variation of the first embodiment.
Further, in step S830, when a certain source node is connected to the first class of edges, and the direct investment proportion attribute value of at least one first class of edges is greater than or equal to the first threshold, all nodes connected to the source node through the first class of edges whose direct investment proportion attribute value is greater than or equal to the first threshold are used as target nodes representing actual controllers of the enterprise. This is the first implementation to find the target node.
Further, in step S830, when a certain source node is not connected to the first class edge, or is connected to the first class edge, but the direct investment proportion attribute values of all the first class edges are smaller than the second threshold, the source node does not have a target node representing an actual controller or a suspected actual controller of the enterprise. This is a second implementation of finding a target node.
Further, in step S830, when a certain source node is connected by first-class edges, the direct investment proportion attribute values of all these first-class edges are smaller than the first threshold, and the direct investment proportion attribute value of at least one of them is greater than or equal to the second threshold, all nodes connected to the source node, in sequence and in the same direction, through first-class edges whose direct investment proportion attribute values are smaller than the first threshold and greater than or equal to the second threshold are regarded as candidate nodes. Among all paths that connect the source node to a candidate node and are formed by such first-class edges, the one or more paths with the largest overall attribute value are identified, and the candidate nodes connected by those paths are regarded as target nodes representing suspected actual controllers of the enterprise. This is a third implementation of finding a target node.
The application also discloses an enterprise actual controller operation system which comprises a graph construction module, a connected subgraph segmentation module and a target node judgment module. The graph construction module is used for constructing, according to the equity data in the enterprise business information, a knowledge graph reflecting the shareholder investment relationships of enterprises, using a graph-computing data structure. The connected subgraph segmentation module is used for segmenting the knowledge graph to obtain one or more connected subgraphs. The target node judgment module is used for searching, for each enterprise node, for a target node representing an actual controller or a suspected actual controller of the enterprise. This enterprise actual controller operation system is the second embodiment of the present application, and can be regarded as a variation of the first embodiment.
The technical effect of the application is that enterprise-related data are stored in a graph database, and the actual controller and suspected actual controller of an enterprise are calculated relatively accurately by means of graph computation. The pruning operation adopted during graph computation reduces the amount of computation and increases its speed.
Detailed Description
Referring to fig. 1, a first embodiment of the enterprise actual controller operation method provided by the present application includes the following steps.
Step S110: according to the equity data in the enterprise business information, construct a knowledge graph reflecting the shareholder investment relationships of enterprises, using a graph-computing data structure.
The enterprise business information refers to the information an enterprise registers with the industrial and commercial administration authority, and comprises the enterprise name, enterprise address, registered capital, equity data, senior management data and the like. The equity data refers to the direct shareholders of the enterprise and their capital contribution ratios.
Preferably, in step S110, data cleaning is first performed on the equity data in the enterprise business information, and the knowledge graph is then constructed based on the cleaned data. The data cleaning comprises one or more of shareholding-ratio validity checking, data consistency checking, invalid data elimination and missing data filling.
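The cleaning step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure of the application: the record layout (shareholder, enterprise, ratio), the function name clean_equity_records and the rule that the first of several duplicate rows is kept are all hypothetical, and missing-data filling is omitted because the text does not say how gaps are filled.

```python
from typing import Optional

# Hypothetical record layout: (shareholder, enterprise, ratio as a fraction of 1).
Record = tuple[str, str, Optional[float]]

def clean_equity_records(records: list[Record]) -> list[tuple[str, str, float]]:
    """Drop invalid rows and collapse duplicates (illustrative sketch only)."""
    merged: dict[tuple[str, str], float] = {}
    for holder, enterprise, ratio in records:
        # Invalid data: a missing name or a missing ratio.
        if not holder or not enterprise or ratio is None:
            continue
        # Validity check: a shareholding ratio must lie in (0, 1].
        if not (0.0 < ratio <= 1.0):
            continue
        # Consistency check (assumed rule): keep the first row per pair.
        merged.setdefault((holder, enterprise), ratio)
    return [(h, e, r) for (h, e), r in merged.items()]
```

Rows that survive this filter can then be fed directly into graph construction.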
Referring to fig. 2, the construction of the knowledge graph specifically includes the following steps.
Step S210: take each enterprise in the enterprise business information, and each of its direct shareholders, as a node in the graph. Each node contains two attributes: entity name and entity type. The entity name attribute is the name of an organization or of a natural person. The entity type attribute includes one or more of E, P, G, S, Z, where E represents enterprises of various types, such as individual industrial and commercial households, sole proprietorships, partnerships and enterprise legal persons; P represents a natural person; G represents a government agency; S represents a public institution; Z represents a social organization.
Step S220: based on the equity data of each enterprise, add a first-class edge representing a direct investment relationship between the enterprise node and each of its direct shareholder nodes. The first-class edge is directed; for example, it may point from the direct shareholder node to the enterprise node, or the opposite direction may be used instead. The attribute of the first-class edge is the direct investment proportion.
The map constructed through steps S210 to S220 is a knowledge map reflecting the stockholder investment relationship of the enterprise.
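One possible in-memory representation of the graph built in steps S210 to S220 is sketched below in Python. The class names Node, Edge and EquityGraph, the method add_investment and the integer encoding of the edge type are assumptions made for illustration; the application itself does not prescribe a concrete data layout. Edges are assumed to point from the shareholder to the enterprise.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str         # entity name attribute
    entity_type: str  # one of "E", "P", "G", "S", "Z"

@dataclass
class Edge:
    src: str      # direct shareholder node
    dst: str      # invested enterprise node
    kind: int     # type attribute; 1 = first-class edge (direct investment)
    value: float  # direct investment proportion

@dataclass
class EquityGraph:
    nodes: dict[str, Node] = field(default_factory=dict)
    edges: list[Edge] = field(default_factory=list)

    def add_investment(self, holder: str, holder_type: str,
                       enterprise: str, ratio: float) -> None:
        """Step S210: register both nodes; step S220: add a first-class edge."""
        self.nodes.setdefault(holder, Node(holder, holder_type))
        self.nodes.setdefault(enterprise, Node(enterprise, "E"))
        self.edges.append(Edge(holder, enterprise, 1, ratio))
```

Keeping the edge type as an attribute on every edge lets the later steps add second-, third- and fourth-class edges to the same structure.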
Preferably, all edges in the knowledge graph have a type attribute that distinguishes the first-class edges, the second-class edges, and so on.
Step S120: the knowledge graph constructed in step S110 is segmented to obtain one or more connected subgraphs. In the knowledge graph constructed in step S110, if two nodes can be connected through one or more edges, the two nodes are in the same connected subgraph; otherwise, the two nodes belong to different connected subgraphs.
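The segmentation in step S120 corresponds to finding weakly connected components, that is, connectivity with edge direction ignored. A minimal Python sketch follows; the function name connected_subgraphs and the (source, target, ratio) tuple layout are hypothetical, and nodes with no edges at all are not covered by this sketch.

```python
from collections import defaultdict

def connected_subgraphs(edges):
    """Group nodes into connected subgraphs, ignoring edge direction."""
    neighbours = defaultdict(set)
    for src, dst, _ratio in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    seen, components = set(), []
    for start in neighbours:
        if start in seen:
            continue
        # Depth-first traversal collects every node reachable from start.
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(neighbours[node] - seen)
        components.append(component)
    return components
```

Because no edge crosses between components, every later step can be run on each subgraph independently.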
Step S130: in each connected subgraph segmented in step S120, the first-class edges representing direct investment relationships are extended by adding second-class edges representing absolute stock control relationships and third-class edges representing important shareholder relationships.
If two nodes are connected by a first-class edge, and the direct investment proportion attribute value of that first-class edge is greater than or equal to a first threshold, a second-class edge representing the absolute stock control relationship is added between the two nodes. The second-class edge is directed, and its direction is the same as that of the first-class edge connecting the two nodes. The attribute of the second-class edge is the absolute stock control relationship, and the direct shareholder node connected by the second-class edge is an absolute stock control shareholder node. The first threshold is between 45% and 66.7%, preferably 50%.
If two nodes are connected by a first-class edge, and the direct investment proportion attribute value of that first-class edge is smaller than the first threshold and greater than or equal to a second threshold, a third-class edge representing the important shareholder relationship is added between the two nodes. The third-class edge is directed, and its direction is the same as that of the first-class edge connecting the two nodes. The attribute of the third-class edge is the important shareholder relationship, and the direct shareholder node connected by the third-class edge is an important shareholder node. The second threshold is between 25% and 35%, preferably 30%.
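The two threshold rules of step S130 can be sketched in Python as follows. The constants use the preferred values of 50% and 30% given in the text; the function name expand_control_edges and the tuple layout are assumptions for illustration.

```python
FIRST_THRESHOLD = 0.50   # preferred first threshold from the text
SECOND_THRESHOLD = 0.30  # preferred second threshold from the text

def expand_control_edges(first_class_edges):
    """Derive second- and third-class edges from (src, dst, ratio) tuples."""
    second_class, third_class = [], []
    for src, dst, ratio in first_class_edges:
        if ratio >= FIRST_THRESHOLD:
            # Absolute stock control: same direction as the first-class edge.
            second_class.append((src, dst))
        elif ratio >= SECOND_THRESHOLD:
            # Important shareholder: same direction as the first-class edge.
            third_class.append((src, dst))
        # Edges below the second threshold produce no new edge.
    return second_class, third_class
```

Each first-class edge yields at most one new edge, so the number of third-class edges can never exceed the number of first-class edges.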
Step S140: in each connected subgraph divided in step S120, the fourth class of edges representing the indirect investment relationship of the important shareholder is added according to the third class of edge extension representing the relationship of the important shareholder.
If two nodes are connected, in sequence and in the same direction, through a plurality of third-class edges, a fourth-class edge representing the indirect investment relationship of the important shareholder is added between the two nodes. The fourth-class edge is directed, and its direction is the same as the direction in which the chain of third-class edges connecting the two nodes points. The attribute of the fourth-class edge is the indirect investment proportion of the important shareholder, i.e. the actual investment proportion. It is obtained by summing the overall attribute values of all paths formed by first-class edges connecting the two nodes. If a path consists of only one first-class edge, the overall attribute value of the path is the direct investment proportion attribute value of that edge. If a path is formed by a plurality of first-class edges connected in sequence in the same direction, the overall attribute value of the path is the product of the direct investment proportion attribute values of those edges. If a path is formed by first-class edges pointing in different directions, the path is excluded from the calculation of the fourth-class edge attribute, or equivalently its overall attribute value is zero.
Step S150: in each connected subgraph segmented in step S120, a node whose entity type attribute is E is called a source node, that is, every enterprise node is a source node, and a target node representing an actual controller or a suspected actual controller of the enterprise is searched for each source node.
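A Python sketch of step S140 is given below. It is an illustration under several assumptions: edges point from shareholder to enterprise, only simple paths (paths that do not revisit a node) are summed so that circular shareholdings terminate, and the names fourth_class_edges, downstream and path_sum are hypothetical.

```python
from collections import defaultdict

def fourth_class_edges(first_class_edges, first_threshold=0.50, second_threshold=0.30):
    """Return {(src, dst): indirect proportion} for pairs linked by two or
    more third-class edges in the same direction."""
    invest = defaultdict(dict)    # invest[src][dst] = direct proportion
    important = defaultdict(set)  # third-class adjacency (pruned graph)
    for src, dst, ratio in first_class_edges:
        invest[src][dst] = ratio
        if second_threshold <= ratio < first_threshold:
            important[src].add(dst)

    def downstream(node):
        """Nodes reachable from node through one or more third-class edges."""
        seen, stack = set(), list(important.get(node, ()))
        while stack:
            cur = stack.pop()
            if cur not in seen:
                seen.add(cur)
                stack.extend(important.get(cur, ()))
        return seen

    def path_sum(node, target, on_path):
        """Sum, over simple same-direction paths, of the product of proportions."""
        if node == target:
            return 1.0
        total = 0.0
        for nxt, ratio in invest.get(node, {}).items():
            if nxt not in on_path:
                total += ratio * path_sum(nxt, target, on_path | {nxt})
        return total

    result = {}
    for start in list(important):
        targets = set()
        for mid in important[start]:
            targets |= downstream(mid)  # at least two third-class edges away
        targets.discard(start)
        for target in targets:
            result[(start, target)] = path_sum(start, target, {start})
    return result
```

The search for node pairs walks only the third-class edges, while the attribute value is summed over all first-class paths between the pair, as the text describes.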
For example, when a source node is connected by one or more second-class edges, all nodes connected to the source node through those second-class edges are taken as target nodes representing actual controllers of the enterprise. If the first threshold is greater than or equal to 50%, each source node is connected by at most one second-class edge. If the first threshold is less than 50%, a source node may be connected by more than one second-class edge.
For another example, when a certain source node is not connected by any second-type edge or third-type edge, the source node does not have a target node representing an actual controller or a suspected actual controller of the enterprise, that is, the enterprise does not have an actual controller.
For another example, when a certain source node is not connected by any second-class edge but is connected by one or more third-class edges, all nodes connected to the source node by a single third-class edge or a single fourth-class edge are taken as candidate nodes. For each candidate node, the attribute value considered is the direct investment proportion of the first-class edge connecting the candidate node and the source node, or the indirect investment proportion of the important shareholder carried by the fourth-class edge; the one or more candidate nodes with the largest such attribute value are taken as target nodes representing suspected actual controllers of the enterprise.
In step S130, the number of newly added third-class edges representing important shareholder relationships is necessarily less than or equal to the number of first-class edges representing direct investment relationships. In step S140, the fourth-class edges representing the indirect investment relationships of important shareholders are expanded from the third-class edges, so the scale of this computation is necessarily no larger than if the fourth-class edges were expanded directly from the first-class edges. This is a pruning operation in graph computation, and it can greatly reduce the computing resources and computing time required.
Referring to fig. 3, this is an example of the knowledge graph constructed in step S110. Circles represent nodes and lines represent edges. For simplicity of description, nodes E1 to E7 have the entity type attribute E, node P1 has the entity type attribute P, node G1 has the entity type attribute G, node S1 has the entity type attribute S, and node Z1 has the entity type attribute Z. The direct investment proportion attribute values of the first-class edges are denoted k1, k2, and so on.
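The three cases of step S150 can be combined into one lookup, sketched below in Python. The function name find_controllers and the argument layout are hypothetical, edges are assumed to point from shareholder to enterprise, and where a candidate has both a third-class and a fourth-class edge to the source node the sketch assumes the larger of the two values is used, which the text does not state explicitly.

```python
def find_controllers(enterprise, first, second, third, fourth):
    """first, fourth: {(src, dst): proportion}; second, third: sets of (src, dst).
    Returns (status, nodes) with status "actual", "suspected" or "none"."""
    # Case 1: any second-class edge gives the actual controller(s).
    absolute = sorted(s for (s, d) in second if d == enterprise)
    if absolute:
        return "actual", absolute
    # Case 2: no second-class and no third-class edge means no controller.
    if not any(d == enterprise for (_s, d) in third):
        return "none", []
    # Case 3: candidates are linked by one third-class or one fourth-class edge.
    scores = {}
    for (s, d) in third:
        if d == enterprise:
            scores[s] = first[(s, d)]  # direct investment proportion
    for (s, d), value in fourth.items():
        if d == enterprise:
            scores[s] = max(scores.get(s, 0.0), value)  # assumed tie rule
    best = max(scores.values())
    return "suspected", sorted(s for s, v in scores.items() if v == best)
```

Because the second-, third- and fourth-class edges are precomputed and stored in the graph, this lookup only inspects edges that end at the queried enterprise.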
Referring to fig. 4, this is an example of splitting the connected subgraph in step S120. The knowledge graph shown in fig. 3 is segmented, and 3 connected subgraphs can be obtained.
Referring to fig. 5, this is an example of adding the second-class edges and third-class edges in step S130. The node E1 and the node E3 are connected by a first-class edge k2; if k2 is greater than or equal to the first threshold, a second-class edge s1 is added between the node E1 and the node E3. The direction of the second-class edge s1 is the same as that of the first-class edge k2. The attribute of the second-class edge s1 is the absolute stock control relationship. The node E2 and the node E3 are connected by a first-class edge k3; if k3 is smaller than the first threshold and greater than or equal to the second threshold, a third-class edge t1 is added between the node E2 and the node E3. The direction of the third-class edge t1 is the same as that of the first-class edge k3. The attribute of the third-class edge t1 is the important shareholder relationship.
Referring to fig. 6, this is an example of adding a fourth-class edge in step S140. The node E1 and the node E4 are connected, in sequence and in the same direction, through two third-class edges t1 and t2, so a fourth-class edge f1 is added between the nodes E1 and E4. The direction of the fourth-class edge f1 is the same as the direction in which the chain of third-class edges t1 and t2 points. The node E1 and the node E4 are also connected, in sequence and in the same direction, by the first-class edges k1 and k3, referred to here as path one, and by the first-class edges k2 and k4, referred to here as path two. The overall attribute value of path one is k1 × k3, the overall attribute value of path two is k2 × k4, and the indirect investment proportion attribute value of the fourth-class edge f1 is k1 × k3 + k2 × k4, which represents the actual investment proportion of the node E1 in the node E4 once indirect shareholding is taken into account.
Referring to fig. 7, corresponding to the first embodiment of the enterprise actual controller operation method described above, the present application further provides a first embodiment of an enterprise actual controller operation system. The enterprise actual controller operation system 700 comprises a graph construction module 710, a connected subgraph segmentation module 720, a second-class edge extension module 730, a third-class edge extension module 740, a fourth-class edge extension module 750 and a target node judgment module 760.
The graph construction module 710 is used for constructing, according to the equity data in the enterprise business information, a knowledge graph reflecting the shareholder investment relationships of enterprises, using a graph-computing data structure. In the constructed knowledge graph, each enterprise and each of its direct shareholders is a node. Each node contains two attributes: entity name and entity type. The entity name is the name of an organization or of a natural person. The entity type includes one or more of E, P, G, S, Z. The direct investment relationship and the direct investment proportion of a direct shareholder node in an enterprise node are represented by a directed first-class edge.
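The calculation in this example can be restated in general form, written here in LaTeX as an illustration. The symbols w(u, v) for the fourth-class edge attribute, \mathcal{P}(u, v) for the set of same-direction paths of first-class edges from u to v, and k_e for the direct investment proportion of edge e are notation introduced for this sketch, and the numeric values are hypothetical rather than taken from the figures.

```latex
w(u, v) = \sum_{p \in \mathcal{P}(u, v)} \prod_{e \in p} k_e,
\qquad
w(E1, E4) = k_1 k_3 + k_2 k_4 .
% Hypothetical values: k_1 = 0.4, k_3 = 0.3, k_2 = 0.3, k_4 = 0.4
% give w(E1, E4) = 0.12 + 0.12 = 0.24, i.e. 24%.
```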
The connected subgraph segmentation module 720 is used for segmenting the knowledge graph constructed by the graph construction module 710 to obtain one or more connected subgraphs.
The second-class edge extension module 730 is used for adding, on the basis of the first-class edges representing direct investment relationships, second-class edges representing absolute stock control relationships.
The third-class edge extension module 740 is used for adding, on the basis of the first-class edges representing direct investment relationships, third-class edges representing important shareholder relationships.
The fourth-class edge extension module 750 is used for adding, on the basis of the third-class edges representing important shareholder relationships, fourth-class edges representing the indirect investment relationships of important shareholders.
The target node determination module 760 is configured to find a target node representing an actual controller or a suspected actual controller of the enterprise for each enterprise node, and is responsible for executing step S150.
Referring to fig. 8, a second embodiment of the enterprise actual controller operation method provided by the present application includes the following steps.
Step S810: according to the equity data in the enterprise business information, construct a knowledge graph reflecting the shareholder investment relationships of enterprises, using a graph-computing data structure.
Step S820: segment the knowledge graph constructed in step S810 to obtain one or more connected subgraphs.
Step S830: in each connected subgraph segmented in step S820, a node whose entity type attribute is E is called a source node, that is, every enterprise node is a source node, and a target node representing an actual controller or a suspected actual controller of the enterprise is searched for each source node.
For example, when a certain source node is connected with a first class edge, and the direct investment proportion attribute value of at least one first class edge is greater than or equal to a first threshold, all nodes connected by the first class edge of which the direct investment proportion attribute value is greater than or equal to the first threshold are used as target nodes representing actual controllers of enterprises. The first threshold is between 45% and 66.7%, preferably 50%.
For another example, when a certain source node is not connected with the first class edge, or is connected with the first class edge but the direct investment proportion attribute values of all the first class edges are smaller than the second threshold, the source node does not have a target node representing an actual controller or a suspected actual controller of the enterprise. The second threshold is between 25% and 35%, preferably 30%.
For another example, when a certain source node is connected by first-class edges, the direct investment proportion attribute values of all these first-class edges are smaller than the first threshold, and the direct investment proportion attribute value of at least one of them is greater than or equal to the second threshold, all nodes connected to the source node, in sequence and in the same direction, through first-class edges whose direct investment proportion attribute values are smaller than the first threshold and greater than or equal to the second threshold are taken as candidate nodes. Among all paths that connect the source node to a candidate node and are formed by such first-class edges, the one or more paths with the largest overall attribute value are identified, and the candidate nodes connected by those paths are taken as target nodes representing suspected actual controllers of the enterprise. If a path consists of only one first-class edge, the overall attribute value of the path is the direct investment proportion attribute value of that edge. If a path is formed by a plurality of first-class edges connected in sequence in the same direction, the overall attribute value of the path is the product of the direct investment proportion attribute values of those edges. If a path is formed by first-class edges pointing in different directions, the overall attribute value of the path is zero.
In step S830, the suspected actual controller node of an enterprise is found by extending outwards only along first-class edges whose direct investment proportion attribute value is smaller than the first threshold and greater than or equal to the second threshold, which significantly reduces the scale of the computation compared with extending outwards along all first-class edges. This is a pruning operation in graph computation, and it can greatly reduce the computing resources and computing time required.
Referring to fig. 9, corresponding to the second embodiment of the enterprise actual controller operation method described above, the present application further provides a second embodiment of an enterprise actual controller operation system. The enterprise actual controller operation system 900 comprises a graph construction module 910, a connected subgraph segmentation module 920 and a target node judgment module 930.
The graph construction module 910 is the same as the graph construction module 710 described above.
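The search of step S830, which works directly on first-class edges without storing intermediate edge classes, can be sketched in Python as follows. The function name find_controllers_direct, the tuple layout and the default thresholds (the preferred 50% and 30%) are illustrative assumptions; edges are assumed to point from shareholder to enterprise, and only simple paths are followed so that circular shareholdings terminate.

```python
from collections import defaultdict

def find_controllers_direct(enterprise, first_class_edges, t1=0.50, t2=0.30):
    """Return (status, nodes) with status "actual", "suspected" or "none"."""
    incoming = defaultdict(list)  # incoming[dst] = [(src, ratio), ...]
    for src, dst, ratio in first_class_edges:
        incoming[dst].append((src, ratio))
    # Case 1: a direct shareholder at or above the first threshold.
    absolute = sorted(s for s, r in incoming.get(enterprise, []) if r >= t1)
    if absolute:
        return "actual", absolute
    # Case 3: walk upstream, following only edges with t2 <= ratio < t1.
    best = {}  # candidate -> largest overall path value found
    stack = [(enterprise, 1.0, {enterprise})]
    while stack:
        node, value, on_path = stack.pop()
        for src, ratio in incoming.get(node, []):
            if t2 <= ratio < t1 and src not in on_path:  # pruning condition
                path_value = value * ratio
                best[src] = max(best.get(src, 0.0), path_value)
                stack.append((src, path_value, on_path | {src}))
    # Case 2: nothing reachable within the band means no controller.
    if not best:
        return "none", []
    top = max(best.values())
    return "suspected", sorted(c for c, v in best.items() if v == top)
```

Edges below the second threshold are never expanded, which is where the reduction in search scale comes from.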
The connected subgraph segmentation module 920 is the same as the connected subgraph segmentation module 720.
The target node determination module 930 is configured to find a target node representing an actual controller or a suspected actual controller of the enterprise for each enterprise node, and is responsible for executing step S830.
The present application builds and stores the knowledge graph reflecting the equity investment relationships of enterprises in a graph database, finds and calculates the actual controller of each enterprise by means of graph computation, and stores the result in the knowledge graph. While keeping the results correct, the pruning operation greatly reduces the amount of graph computation, giving fast calculation and low hardware resource requirements. A query for the actual controller or suspected actual controller of any enterprise can be answered immediately from the knowledge graph, which greatly improves storage and computing efficiency and response timeliness.
The above are merely preferred embodiments of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.