CN111382843A - Method and device for establishing upstream and downstream relation recognition model of enterprise and relation mining - Google Patents


Publication number
CN111382843A
Authority
CN
China
Prior art keywords
enterprise
nodes
node
upstream
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010153608.3A
Other languages
Chinese (zh)
Other versions
CN111382843B (en)
Inventor
王炀
杨硕
孙望
钟娙雩
张志强
周俊
方彦明
余泉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang eCommerce Bank Co Ltd
Original Assignee
Zhejiang eCommerce Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang eCommerce Bank Co Ltd
Priority to CN202010153608.3A
Publication of CN111382843A
Application granted
Publication of CN111382843B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The embodiments of the specification provide a method and a device for establishing an enterprise upstream and downstream relationship identification model and for relationship mining. The method for establishing the model comprises: acquiring an enterprise network sample; performing node embedding vector expression calculation on the enterprise network sample to obtain a vector expressing the structural features of each enterprise node; and inputting the vectors of the enterprise nodes into a graph neural network model for iteration. Because the feature expression of an enterprise node calculated during the iteration of the graph neural network model aggregates the feature information of its neighbor nodes, the upstream and downstream relationship features of the enterprise node, which comprise the feature expression of the enterprise node, the feature expression of its neighbor nodes, and the feature expression of the relationship between the enterprise node and the neighbor nodes, can accurately express the upstream and downstream relationship. A graph neural network model that accurately identifies the upstream and downstream relationships of enterprise nodes can therefore be obtained through training, and upstream and downstream relationships of enterprises with different confidence degrees can be identified.

Description

Method and device for establishing upstream and downstream relation recognition model of enterprise and relation mining
Technical Field
The embodiment of the specification relates to the technical field of data mining, in particular to a method for establishing an upstream and downstream relation identification model of an enterprise. One or more embodiments of the present specification also relate to a method for mining the upstream and downstream relationship of an enterprise, an apparatus for building an enterprise upstream and downstream relationship identification model, an apparatus for mining the upstream and downstream relationship of an enterprise, a computing device and a computer readable storage medium.
Background
The enterprise upstream and downstream relationship refers to the relationship between upstream enterprises and downstream enterprises, determined according to the supply relationship. Generally, the health of an enterprise's upstream and downstream enterprises directly affects the business status of that enterprise. If the enterprises that have an upstream or downstream relationship with a given enterprise are known, many factors relating to those upstream and downstream enterprises can be taken into account when assessing it.
Therefore, in many scenarios, such as credit assessment for a business, it is desirable to be able to accurately learn the business upstream and downstream relationships.
Disclosure of Invention
In view of this, the embodiments of the present specification provide a method for establishing an enterprise upstream and downstream relationship identification model. One or more embodiments of the present disclosure relate to a method for mining an upstream and downstream relationship of an enterprise, an apparatus for building an enterprise upstream and downstream relationship identification model, an apparatus for mining an upstream and downstream relationship of an enterprise, a computing device, and a computer-readable storage medium, so as to solve technical defects in the prior art.
According to a first aspect of embodiments of the present specification, there is provided a method for establishing an enterprise upstream and downstream relationship identification model, including: acquiring an enterprise network sample; carrying out node embedding vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural characteristics of the enterprise nodes; inputting the vectors of the enterprise nodes into a graph neural network model for iteration, and using the upstream and downstream relation characteristics of the enterprise nodes to train the upstream and downstream relation identification of the enterprise nodes in the iteration process of the graph neural network model, wherein the upstream and downstream relation characteristics of the enterprise nodes comprise characteristic expression of the enterprise nodes, characteristic expression of neighbor nodes of the enterprise nodes and characteristic expression of relations between the enterprise nodes and the neighbor nodes; and finishing iteration to obtain the trained graph neural network model.
Optionally, the performing node-embedded vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural features of the enterprise node includes: and carrying out node embedding vector expression calculation on the enterprise network sample by using a node2vec algorithm to obtain a vector for expressing the structural characteristics of the enterprise nodes.
Optionally, the method further comprises: generating a vector for expressing the attribute characteristics of the enterprise nodes according to the enterprise attributes of the enterprise nodes; and combining the vector for expressing the structural characteristics of the enterprise nodes with the vector for expressing the attribute characteristics of the enterprise nodes to obtain the vector of the enterprise nodes.
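The combination of the two vectors can be illustrated with a minimal sketch. The specification only says the vectors are "combined"; concatenation and the one-hot attribute encoding used here are assumptions, and the names `encode_attributes`, `combine`, and the attribute vocabulary are hypothetical.

```python
from typing import Dict, List


def encode_attributes(attrs: Dict[str, str], vocab: Dict[str, List[str]]) -> List[float]:
    """One-hot encode categorical enterprise attributes (hypothetical encoding scheme)."""
    vec: List[float] = []
    for field, values in vocab.items():
        vec.extend(1.0 if attrs.get(field) == v else 0.0 for v in values)
    return vec


def combine(structure_vec: List[float], attribute_vec: List[float]) -> List[float]:
    """Concatenate the structural vector and the attribute vector (assumed combination)."""
    return list(structure_vec) + list(attribute_vec)


# Hypothetical attribute vocabulary and a three-dimensional structural embedding.
vocab = {"industry": ["textile", "retail"], "size": ["small", "large"]}
attr_vec = encode_attributes({"industry": "retail", "size": "small"}, vocab)
node_vec = combine([0.12, -0.40, 0.33], attr_vec)
```

Here `node_vec` carries the structural embedding in its first three positions and the attribute encoding in the remaining four.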
Optionally, inputting the vector of the enterprise node into the graph neural network model for iteration includes: inputting the vector of the enterprise node into an attention-based breadth adaptive function for aggregation, to obtain the aggregated feature expression, for the next iteration, of the neighbor nodes aggregated by the enterprise node; inputting the vector of the enterprise node and that aggregated feature expression into a depth adaptive function based on an LSTM operator, to obtain the feature expression of the enterprise node for the next iteration; updating the next iteration to be the current iteration; combining the feature expression of the enterprise node in the current iteration, the feature expression of the neighbor nodes of the enterprise node in the current iteration, and the feature expression of the relationship between the enterprise node and the neighbor nodes, to form the upstream and downstream relationship features of the enterprise node; classifying the upstream and downstream relationship features of the enterprise node through a fully connected layer to obtain the upstream and downstream relationship identification result of the enterprise node; judging whether the identification result satisfies the minimum-loss condition; if the minimum-loss condition is satisfied, ending the iteration to obtain the trained graph neural network model; if the minimum-loss condition is not satisfied, updating the parameters of the graph neural network model, inputting the feature expression of the enterprise node in the current iteration into the attention-based breadth adaptive function for aggregation to obtain the aggregated feature expression of the neighbor nodes for the next iteration, inputting the feature expression of the enterprise node in the current iteration and that aggregated feature expression into the depth adaptive function based on an LSTM operator to obtain the feature expression of the enterprise node for the next iteration, and returning to the step of updating the next iteration to be the current iteration.
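The attention-based aggregation over neighbors can be sketched as follows. This is an illustration only: the specification does not give the form of the breadth adaptive function, so the dot-product scoring, the softmax weighting, the absence of trainable parameters, and the name `attention_aggregate` are all assumptions.

```python
import math
from typing import Dict, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(node: str,
                        features: Dict[str, Vector],
                        neighbors: Dict[str, List[str]]) -> Vector:
    """Weighted sum of neighbor features; weights come from dot-product attention (assumed)."""
    h = features[node]
    nbrs = neighbors[node]
    if not nbrs:
        return [0.0] * len(h)
    weights = softmax([dot(h, features[n]) for n in nbrs])
    return [sum(w * features[n][i] for w, n in zip(weights, nbrs))
            for i in range(len(h))]


# Toy enterprise network: A trades with B and C.
features = {"A": [1.0, 0.0], "B": [1.0, 0.0], "C": [0.0, 1.0]}
neighbors = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
aggregated = attention_aggregate("A", features, neighbors)
```

In this toy case neighbor B, whose feature is more similar to A's, receives the larger attention weight, so the first component of `aggregated` dominates.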
According to a second aspect of the embodiments of the present specification, there is provided an apparatus for establishing an enterprise upstream and downstream relationship identification model, including: a sample acquisition module configured to acquire an enterprise network sample. And the sample vector calculation module is configured to perform node embedding vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural characteristics of the enterprise nodes. The model iteration module is configured to input the vectors of the enterprise nodes into a graph neural network model for iteration, and the graph neural network model performs training of identification of upstream and downstream relations of the enterprise nodes by using upstream and downstream relation characteristics of the enterprise nodes in an iteration process, wherein the upstream and downstream relation characteristics of the enterprise nodes comprise characteristic expressions of the enterprise nodes, characteristic expressions of neighbor nodes of the enterprise nodes and characteristic expressions of relations between the enterprise nodes and the neighbor nodes; and finishing iteration to obtain the trained graph neural network model.
Optionally, the sample vector calculation module is configured to perform node embedding vector expression calculation on the enterprise network sample by using a node2vec algorithm, so as to obtain a vector for expressing enterprise node structural features.
Optionally, the apparatus further comprises: a sample attribute feature calculation module configured to generate a vector for expressing the attribute features of the enterprise nodes according to the enterprise attributes of the enterprise nodes; and a sample vector combination module configured to combine the vector for expressing the structural features of the enterprise nodes with the vector for expressing the attribute features of the enterprise nodes to obtain the vector of the enterprise nodes.
Optionally, the model iteration module comprises: an initial vector input submodule configured to input the vector of the enterprise node into an attention-based breadth adaptive function for aggregation, to obtain the aggregated feature expression, for the next iteration, of the neighbor nodes aggregated by the enterprise node. An initial feature updating submodule configured to input the vector of the enterprise node and that aggregated feature expression into a depth adaptive function based on an LSTM operator, to obtain the feature expression of the enterprise node for the next iteration. An iteration process updating submodule configured to update the next iteration to be the current iteration. A relationship feature combination submodule configured to combine the feature expression of the enterprise node in the current iteration, the feature expression of the neighbor nodes of the enterprise node in the current iteration, and the feature expression of the relationship between the enterprise node and the neighbor nodes, to form the upstream and downstream relationship features of the enterprise node. An upstream and downstream judgment submodule configured to classify the upstream and downstream relationship features of the enterprise node through a fully connected layer to obtain the identification result of the upstream and downstream relationship of the enterprise node. A loss judgment submodule configured to judge whether the identification result satisfies the minimum-loss condition. An iteration ending submodule configured to end the iteration and obtain the trained graph neural network model if the minimum-loss condition is satisfied.
A parameter updating submodule configured to update the parameters of the graph neural network model if the minimum-loss condition is not satisfied. A neighbor feature updating submodule configured to input the feature expression of the enterprise node in the current iteration into the attention-based breadth adaptive function for aggregation, to obtain the aggregated feature expression, for the next iteration, of the neighbor nodes aggregated by the enterprise node. A node feature updating submodule configured to input the feature expression of the enterprise node in the current iteration and that aggregated feature expression into the depth adaptive function based on an LSTM operator, to obtain the feature expression of the enterprise node for the next iteration, and to re-trigger the iteration process updating submodule.
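The depth adaptive function based on an LSTM operator can be pictured as one LSTM cell step per iteration. The sketch below is an assumption about how the pieces fit together: it treats the aggregated neighbor feature as the cell input and the node's current feature as the hidden state, omits bias terms, and uses small random weights. The names `init_params` and `lstm_depth_step` are hypothetical and do not come from the specification.

```python
import math
import random
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def init_params(dim: int, seed: int = 0) -> Dict[str, Matrix]:
    """One (dim x 2*dim) weight matrix per gate: input, forget, output, candidate."""
    rng = random.Random(seed)
    return {gate: [[rng.uniform(-0.1, 0.1) for _ in range(2 * dim)] for _ in range(dim)]
            for gate in ("i", "f", "o", "g")}


def lstm_depth_step(aggregated: Vector, h_prev: Vector, c_prev: Vector,
                    params: Dict[str, Matrix]) -> Tuple[Vector, Vector]:
    """Standard LSTM cell update (no biases) applied to [aggregated ; h_prev]."""
    z = aggregated + h_prev  # list concatenation
    i = [sigmoid(a) for a in matvec(params["i"], z)]
    f = [sigmoid(a) for a in matvec(params["f"], z)]
    o = [sigmoid(a) for a in matvec(params["o"], z)]
    g = [math.tanh(a) for a in matvec(params["g"], z)]
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


params = init_params(dim=2)
h_next, c_next = lstm_depth_step([0.7, 0.3], [1.0, 0.0], [0.0, 0.0], params)
```

The memory cell `c_next` is carried from one iteration to the next, which is what lets the gates decide how much information from deeper neighborhoods to keep.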
According to a third aspect of the embodiments of the present specification, there is provided a method for mining upstream and downstream relationships of an enterprise, including: carrying out node embedding vector expression calculation on an enterprise network formed by two or more enterprises to obtain respective vectors of the two or more enterprise nodes for expressing the structural characteristics of the enterprise nodes; and inputting the vectors of the two or more enterprise nodes into a trained graph neural network model obtained by the method for establishing the upstream and downstream relation recognition model of the enterprise according to any embodiment of the specification, and outputting the recognition results of the upstream and downstream relations of the two or more enterprise nodes.
Optionally, the method further comprises: generating vectors for expressing the attribute characteristics of the enterprise nodes of the two or more enterprise nodes according to the enterprise attributes of the two or more enterprise nodes; and respectively aiming at the two or more enterprise nodes, combining the vector for expressing the structural characteristics of the enterprise nodes and the vector for expressing the attribute characteristics of the enterprise nodes to obtain respective vectors of the two or more enterprise nodes.
Optionally, the method further comprises: acquiring enterprise transaction data; calculating the delivery dispersion and the transaction frequency of the enterprise according to the enterprise transaction data; judging enterprises belonging to the affiliation mode by using the delivery dispersion and the transaction frequency; and outputting the enterprises belonging to the affiliation mode and buyer enterprises corresponding to the transaction as an upstream-downstream relationship.
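A rule of this kind could look like the sketch below. The specification does not define delivery dispersion or give thresholds, so everything concrete here is an assumption: dispersion is taken as the ratio of distinct delivery addresses to transactions, frequency as the transaction count in the observed window, transactions are grouped by seller, and the thresholds and the names `Transaction` and `affiliation_relations` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set, Tuple


class Transaction(NamedTuple):
    seller: str
    buyer: str
    delivery_address: str


def affiliation_relations(transactions: List[Transaction],
                          min_dispersion: float = 0.5,
                          min_frequency: int = 3) -> Set[Tuple[str, str]]:
    """Return (enterprise, buyer) pairs for enterprises judged to be in the affiliation mode."""
    by_seller: Dict[str, List[Transaction]] = defaultdict(list)
    for t in transactions:
        by_seller[t.seller].append(t)

    relations: Set[Tuple[str, str]] = set()
    for seller, txs in by_seller.items():
        frequency = len(txs)
        # Assumed definition: share of distinct delivery addresses among all transactions.
        dispersion = len({t.delivery_address for t in txs}) / frequency
        if frequency >= min_frequency and dispersion >= min_dispersion:
            relations.update((seller, t.buyer) for t in txs)
    return relations


transactions = [
    Transaction("S1", "B1", "addr-1"),
    Transaction("S1", "B1", "addr-2"),
    Transaction("S1", "B1", "addr-3"),
    Transaction("S1", "B1", "addr-1"),
    Transaction("S2", "B2", "addr-9"),
]
pairs = affiliation_relations(transactions)
```

With these toy records only S1 passes both thresholds, so the single pair (S1, B1) is output as a candidate upstream-downstream relationship.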
Optionally, the method further comprises: acquiring enterprise transaction data; calculating the transaction frequency of the enterprise and the number of commodities per transaction according to the enterprise transaction data; judging the enterprises belonging to the goods-in mode by utilizing the transaction frequency and the number of commodities per transaction; and outputting the enterprises belonging to the goods-in mode and buyer enterprises corresponding to the transaction as an upstream-downstream relationship.
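The second rule can be sketched in the same way. The thresholds, the grouping by seller, the use of the average quantity per transaction, and the names `Order` and `goods_in_relations` are assumptions made for illustration; the specification names only the two statistics.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set, Tuple


class Order(NamedTuple):
    seller: str
    buyer: str
    quantity: int  # number of commodities in this transaction


def goods_in_relations(orders: List[Order],
                       min_frequency: int = 3,
                       min_avg_quantity: float = 50.0) -> Set[Tuple[str, str]]:
    """Return (enterprise, buyer) pairs for enterprises judged to be in the goods-in mode."""
    by_seller: Dict[str, List[Order]] = defaultdict(list)
    for o in orders:
        by_seller[o.seller].append(o)

    relations: Set[Tuple[str, str]] = set()
    for seller, group in by_seller.items():
        frequency = len(group)
        avg_quantity = sum(o.quantity for o in group) / frequency
        # Repeated, bulk-sized transactions are taken to indicate stocking behaviour.
        if frequency >= min_frequency and avg_quantity >= min_avg_quantity:
            relations.update((seller, o.buyer) for o in group)
    return relations


orders = [
    Order("S1", "B1", 100), Order("S1", "B1", 120), Order("S1", "B1", 80),
    Order("S2", "B2", 1), Order("S2", "B2", 1), Order("S2", "B2", 1),
]
goods_in_pairs = goods_in_relations(orders)
```

S2 trades just as often as S1 but only in single items, so only the pair (S1, B1) is reported.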
Optionally, the identification result is a score of an upstream-downstream relationship between the two or more enterprise nodes, and the method further includes: and sequencing the upstream and downstream relations of the two or more enterprise nodes according to the scores of the upstream and downstream relations of the two or more enterprise nodes to obtain the upstream and downstream relations of the enterprises with different confidence degrees.
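Sorting by score can be sketched as follows. The tier boundaries and labels are hypothetical; the specification only states that the relationships are sorted by score to obtain relationships with different confidence degrees.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def rank_relations(scores: Dict[Pair, float],
                   high: float = 0.8,
                   low: float = 0.5) -> List[Tuple[Pair, float, str]]:
    """Sort candidate relationships by descending score and attach an assumed confidence tier."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    result = []
    for pair, score in ranked:
        tier = "strong" if score >= high else "medium" if score >= low else "weak"
        result.append((pair, score, tier))
    return result


ranked = rank_relations({("A", "B"): 0.92, ("A", "C"): 0.41, ("B", "C"): 0.67})
```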
According to a fourth aspect of embodiments herein, there is provided an apparatus for mining upstream and downstream relationships of an enterprise, including: and the node vector calculation module is configured to perform node embedding vector expression calculation on an enterprise network formed by two or more enterprises to obtain respective vectors of the two or more enterprise nodes for expressing the structural characteristics of the enterprise nodes. And the recognition module is configured to input the vectors of the two or more enterprise nodes into a trained graph neural network model obtained by the method for establishing the upstream and downstream relationship recognition model of the enterprise according to any embodiment of the specification, and output a recognition result of the upstream and downstream relationship of the two or more enterprise nodes.
Optionally, the apparatus further comprises: a node attribute feature calculation module configured to generate, for the two or more enterprise nodes, vectors for expressing the attribute features of the enterprise nodes according to the enterprise attributes of the two or more enterprise nodes; and a node vector combination module configured to combine, for each of the two or more enterprise nodes, the vector for expressing the structural features of the enterprise node with the vector for expressing the attribute features of the enterprise node, to obtain the respective vectors of the two or more enterprise nodes.
Optionally, the apparatus further comprises: a transaction data acquisition module configured to acquire enterprise transaction data; an affiliation feature calculation module configured to calculate the delivery dispersion and the transaction frequency of the enterprise according to the enterprise transaction data; an affiliation mode judging module configured to judge the enterprises belonging to the affiliation mode by using the delivery dispersion and the transaction frequency; and an affiliation output module configured to output the enterprises belonging to the affiliation mode and the buyer enterprises corresponding to the transaction as upstream and downstream relationships.
Optionally, the apparatus further comprises: a transaction data acquisition module configured to acquire enterprise transaction data; a goods-in feature calculation module configured to calculate the transaction frequency of the enterprise and the number of commodities per transaction according to the enterprise transaction data; a goods-in mode judging module configured to judge the enterprises belonging to the goods-in mode by using the transaction frequency and the number of commodities per transaction; and a goods-in relationship output module configured to output the enterprises belonging to the goods-in mode and the buyer enterprises corresponding to the transaction as upstream and downstream relationships.
According to a fifth aspect of embodiments herein, there is provided a computing device comprising: a memory and a processor; the memory is to store computer-executable instructions, and the processor is to execute the computer-executable instructions to: acquiring an enterprise network sample; carrying out node embedding vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural characteristics of the enterprise nodes; inputting the vectors of the enterprise nodes into a graph neural network model for iteration, and using the upstream and downstream relation characteristics of the enterprise nodes to train the upstream and downstream relation identification of the enterprise nodes in the iteration process of the graph neural network model, wherein the upstream and downstream relation characteristics of the enterprise nodes comprise characteristic expression of the enterprise nodes, characteristic expression of neighbor nodes of the enterprise nodes and characteristic expression of relations between the enterprise nodes and the neighbor nodes; and finishing iteration to obtain the trained graph neural network model.
According to a sixth aspect of embodiments herein, there is provided a computer readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the method for establishing an enterprise upstream and downstream relationship identification model according to any of the embodiments herein.
According to a seventh aspect of embodiments herein, there is provided a computing device comprising: a memory and a processor; the memory is to store computer-executable instructions, and the processor is to execute the computer-executable instructions to: carrying out node embedding vector expression calculation on an enterprise network formed by two or more enterprises to obtain respective vectors of the two or more enterprise nodes for expressing the structural characteristics of the enterprise nodes; and inputting the vectors of the two or more enterprise nodes into a trained graph neural network model obtained by the method for establishing the upstream and downstream relation recognition model of the enterprise according to any embodiment of the specification, and outputting the recognition results of the upstream and downstream relations of the two or more enterprise nodes.
According to an eighth aspect of embodiments herein, there is provided a computer readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the method for enterprise upstream and downstream relationship mining according to any of the embodiments herein.
The embodiment of one aspect of the specification provides a method for establishing an enterprise upstream and downstream relation identification model, which comprises the steps of carrying out node embedding vector expression calculation on an enterprise network sample to obtain a vector for expressing enterprise node structural features, inputting the vector of the enterprise node into a graph neural network model for iteration because the vector of the enterprise node obtained by the node embedding vector expression calculation is enough to identify the neighbor of the enterprise node in the enterprise network, and aggregating the feature information of the neighbor node by the feature expression of the enterprise node calculated in the graph neural network model iteration process based on the mechanism of the graph neural network model. Therefore, the upstream and downstream relation characteristics of the enterprise nodes including the characteristic expression of the enterprise nodes, the characteristic expression of the neighbor nodes of the enterprise nodes and the relation characteristic expression between the enterprise nodes and the neighbor nodes can accurately express the upstream and downstream relation. Therefore, the graph neural network uses the upstream and downstream relation characteristics of the enterprise nodes to train the upstream and downstream relation identification of the enterprise nodes in the iteration process, and a graph neural network model for accurately identifying the upstream and downstream relation of the enterprise nodes can be obtained.
In another aspect of the present description, a method for mining an upstream-downstream relationship of an enterprise is provided, where an enterprise network formed by two or more enterprises is subjected to node-embedded vector expression calculation to obtain vectors for expressing structural features of the enterprise nodes of the two or more enterprises, and the vectors of the two or more enterprise nodes are input into a trained graph neural network model obtained by the method for establishing an enterprise upstream-downstream relationship recognition model according to any embodiment of the present description, so that the upstream-downstream relationship of the two or more enterprise nodes can be accurately recognized.
Drawings
FIG. 1 is a flow chart of a method for establishing an enterprise upstream and downstream relationship identification model according to an embodiment of the present disclosure;
FIG. 2 is a process flow diagram of a method for enterprise upstream and downstream relationship identification model building according to another embodiment of the present disclosure;
FIG. 3 is a schematic structural diagram of an apparatus for modeling an enterprise upstream and downstream relationship identification, according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of an apparatus for modeling an enterprise upstream and downstream relationship identification according to another embodiment of the present disclosure;
FIG. 5 is a flow chart of a method for enterprise upstream and downstream relationship mining provided by an embodiment of the present description;
FIG. 6 is a schematic diagram of an enterprise upstream and downstream relationship mining architecture provided by an embodiment of the present description;
FIG. 7 is a schematic structural diagram of an apparatus for mining upstream and downstream relationships of an enterprise according to an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of an apparatus for enterprise upstream and downstream relationship mining according to another embodiment of the present disclosure;
FIG. 9 is a block diagram of a computing device according to an embodiment of the present disclosure.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present description. This description may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein, as those skilled in the art will be able to make and use the present disclosure without departing from the spirit and scope of the present disclosure.
The terminology used in the description of the one or more embodiments is for the purpose of describing the particular embodiments only and is not intended to be limiting of the description of the one or more embodiments. As used in one or more embodiments of the present specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It will be understood that, although the terms first, second, etc. may be used herein in one or more embodiments to describe various information, these information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first can also be referred to as a second and, similarly, a second can also be referred to as a first without departing from the scope of one or more embodiments of the present description. The word "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination", depending on the context.
In the present specification, a method for establishing an enterprise upstream and downstream relationship recognition model is provided, and the present specification relates to an apparatus for establishing an enterprise upstream and downstream relationship recognition model, a method for mining an enterprise upstream and downstream relationship, an apparatus for mining an enterprise upstream and downstream relationship, a computing device, and a computer readable storage medium, which are described in detail in the following embodiments one by one.
Fig. 1 is a flowchart illustrating a method for establishing an enterprise upstream and downstream relationship identification model according to an embodiment of the present disclosure, which includes steps 102 to 108.
Step 102: Acquire an enterprise network sample.
For example, enterprise upstream and downstream relationship data with strong confidence may be extracted from the billing data and the supply and sale data of enterprises, and enterprise network samples are organized according to these strong-confidence relationship data. As another example, enterprise upstream and downstream relationship data with weak confidence is extracted from enterprise transaction data. Reasoning based on upstream and downstream rules is then applied to the weak-confidence data to obtain enterprise upstream and downstream relationship data with different confidence degrees, and enterprise network samples for verifying the recognition effect of the graph neural network model are organized according to these data. In an enterprise network, one enterprise may correspond to one node, and if there is a relationship such as a transaction relationship or a transfer relationship between any two enterprises, there is a corresponding edge between the nodes of the two enterprises.
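The node-and-edge organization described above can be sketched as an adjacency structure. Treating edges as undirected and the function name `build_enterprise_network` are assumptions made for this illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_enterprise_network(relations: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """One node per enterprise; an (assumed undirected) edge for each related pair."""
    adjacency: Dict[str, Set[str]] = defaultdict(set)
    for a, b in relations:
        if a == b:
            continue  # ignore self-relations
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


# Hypothetical pairs, for example derived from transaction or transfer records.
network = build_enterprise_network([("A", "B"), ("B", "C"), ("A", "B"), ("D", "D")])
```

Duplicate pairs collapse into a single edge, and an enterprise that only relates to itself does not appear as a node.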
Step 104: Perform node embedding vector expression calculation on the enterprise network sample to obtain vectors expressing the structural features of the enterprise nodes.
For example, methods such as DeepWalk, LINE, DNGR, SDNE, and node2vec can be used for the node embedding vector expression calculation. The goal of the calculation is a vector for each enterprise node that is sufficient to identify that node's neighbors in the enterprise network.
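Walk-based methods such as DeepWalk learn embeddings from node sequences sampled by random walks over the graph. The sketch below shows only that sampling step, using a uniform random walk; node2vec uses biased walks controlled by additional parameters, which are omitted here. The function name, the adjacency layout, and the sample graph are assumptions made for illustration.

```python
import random


def random_walk(adjacency, start, length, rng):
    """Sample a uniform random walk of at most `length` nodes from `start`."""
    walk = [start]
    while len(walk) < length:
        # Sort so that results are reproducible for a seeded generator.
        neighbors = sorted(adjacency.get(walk[-1], ()))
        if not neighbors:
            break  # isolated node: the walk cannot continue
        walk.append(rng.choice(neighbors))
    return walk


# Hypothetical three-enterprise chain A - B - C.
sample_adjacency = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
walk = random_walk(sample_adjacency, "A", 5, random.Random(0))
```

The sampled walks would then serve as the "sentences" from which an embedding model learns which nodes tend to appear near each other.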
Step 106: Input the vectors of the enterprise nodes into a graph neural network model for iteration.
During iteration, the graph neural network model is trained to identify the upstream and downstream relationships of enterprise nodes using their upstream and downstream relationship features. These features comprise the feature expression of the enterprise node, the feature expressions of its neighbor nodes, and the feature expression of the relationship between the enterprise node and each neighbor node.
Based on the mechanism of the graph neural network model, in each iteration the model computes the feature expression of an enterprise node by aggregating the feature information of its neighbor nodes with the input feature expression of the enterprise node, which yields the feature expression of the enterprise node for the next iteration. The upstream and downstream relationship features of the enterprise nodes are then trained through a fully connected layer. Before entering the next iteration, the graph neural network model adjusts its parameters so that the recognition result moves closer to the target; for example, gradient descent may be used to adjust the parameters. Training is achieved by adjusting the parameters iteration by iteration, and the trained graph neural network model is finally obtained.
The feature expression of the relationship between the enterprise node and the neighboring node can be specifically obtained according to an attribute value corresponding to a relationship attribute of an edge in an enterprise network sample. For example, the relational attributes of an edge may include the following attributes:
merchant purchasing relationship attribute: the corresponding attribute values may include, for example: commodity type, amount, quantity, transaction frequency, etc.;
communication relationship attribute: the corresponding attribute values may include, for example: whether the parties store each other's contact details, whether the remark name contains an industry keyword, etc.;
LBS (Location Based Services) relationship attribute: the corresponding attribute values may include, for example: business location, residence distance, etc.
Suppose that, based on these relationship attributes, the feature expression $X_{E(u,v)}$ of the relationship between enterprise nodes takes its values in the following order: merchant purchasing relationship attribute, communication relationship attribute, LBS relationship attribute. Missing information defaults to 0. For textual attribute values, a corresponding number can be assigned in advance and used in the vector to represent the attribute value; for example, the women's clothing commodity type corresponds to 1, mutually stored phone numbers correspond to 1, and the remark keyword "buyer" corresponds to 1. Suppose one enterprise purchases women's clothing from another enterprise in 5 transactions, for a total amount of 1000 and 10 items in total, the two enterprises store each other's phone numbers in their address books with the remark "buyer", and the distance between the two enterprises is 50. Under this scenario, the feature expression of the relationship between the two enterprise nodes is $X_{E(u,v)} = (1, 1000, 10, 5, 1, 1, 0, 50)$.
It should be noted that the above-mentioned relational attributes are only used to describe the characteristic expression of the relationship between the enterprise node and the neighboring node in the embodiment of the present specification, and do not limit the embodiment of the present specification.
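The worked example above can be reproduced with a short encoding routine. This is a sketch under stated assumptions: the dictionary keys, the lookup tables, and the function name `encode_edge_features` are hypothetical, and only the value ordering and the numeric codes come from the example.

```python
# Hypothetical lookup tables for textual attribute values (codes from the example).
COMMODITY_TYPE_CODES = {"women's clothing": 1}
REMARK_KEYWORD_CODES = {"buyer": 1}


def encode_edge_features(relation):
    """Encode one relationship record as an edge feature vector.

    Order: purchasing attributes (commodity type, amount, quantity,
    transaction count), communication attributes (mutual storage, remark
    keyword), LBS attributes (business location, distance).
    Missing information defaults to 0.
    """
    return (
        COMMODITY_TYPE_CODES.get(relation.get("commodity_type"), 0),
        relation.get("amount", 0),
        relation.get("quantity", 0),
        relation.get("transaction_count", 0),
        1 if relation.get("mutually_stored") else 0,
        REMARK_KEYWORD_CODES.get(relation.get("remark_keyword"), 0),
        relation.get("business_location", 0),
        relation.get("distance", 0),
    )


# The scenario from the text; the business location is not given, so it defaults to 0.
example_relation = {
    "commodity_type": "women's clothing",
    "amount": 1000,
    "quantity": 10,
    "transaction_count": 5,
    "mutually_stored": True,
    "remark_keyword": "buyer",
    "distance": 50,
}
edge_vector = encode_edge_features(example_relation)
```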
Step 108: End the iteration to obtain the trained graph neural network model.
Whether the iteration of the graph neural network model has finished can be determined according to whether the recognition result of the model reaches the target. For example, in one embodiment of this specification, a cross-entropy loss function may be used as the objective function of the graph neural network model. The recognition result is input into the cross-entropy loss function, and the iteration can be ended once the loss is determined to be minimal.
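For reference, a cross-entropy objective of the kind mentioned above can be sketched as follows. The sketch assumes a binary label per enterprise pair (1 if the upstream and downstream relationship holds, 0 otherwise) and a predicted probability per pair; the specification does not state the label encoding, so that choice and the function name are assumptions.

```python
import math


def cross_entropy(probabilities, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probabilities, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


# Uninformative predictions give log(2); confident correct predictions give less.
uninformed_loss = cross_entropy([0.5, 0.5], [1, 0])
confident_loss = cross_entropy([0.9, 0.1], [1, 0])
```

A lower loss indicates that the recognition result is closer to the target, which is the quantity the iteration drives toward its minimum.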
In summary, the method carries out node embedding vector expression calculation on an enterprise network sample to obtain a vector for expressing the structural characteristics of the enterprise node, and the vector of the enterprise node is enough to identify the neighbors of the enterprise node in the enterprise network, so the vector of the enterprise node is input into a graph neural network model for iteration, and the characteristic expression of the enterprise node calculated in the graph neural network model iteration process can aggregate the characteristic information of the neighbor nodes based on the mechanism of the graph neural network model. Therefore, the upstream and downstream relation characteristics of the enterprise nodes including the characteristic expression of the enterprise nodes, the characteristic expression of the neighbor nodes of the enterprise nodes and the relation characteristic expression between the enterprise nodes and the neighbor nodes can accurately express the upstream and downstream relation. Therefore, the graph neural network uses the upstream and downstream relation characteristics of the enterprise nodes to train the upstream and downstream relation identification of the enterprise nodes in the iteration process, and a graph neural network model for accurately identifying the upstream and downstream relation of the enterprise nodes can be obtained.
Next, a specific implementation manner of performing node embedding vector expression calculation by using a node2vec algorithm in the method for establishing an enterprise upstream and downstream relationship identification model provided in the embodiment of the present specification is described in detail.
For the identification of enterprise upstream and downstream relationships, the graph structure information of the enterprise network sample is an important basis for judgment, and a specific graph structure may exist between enterprise nodes that have an upstream and downstream relationship. Performing node embedding vector expression calculation with the node2vec algorithm encodes the graph information, yielding vectors that express the structural features of the enterprise nodes.
The goal of the node embedding vector expression calculation is a vector for each enterprise node that is sufficient to identify that node's neighbors in the enterprise network. In other words, the main motivation for the vector representation is to maximize the ability to identify an enterprise node's neighbors from its vector representation. To achieve this, the following objective function can be optimized:
$$\max_{f} \sum_{u \in V} \log \Pr\left(N_S(u) \mid f(u)\right)$$
where u is an enterprise node in the graph, V is the set of enterprise nodes, f is the mapping function from an enterprise node u to its vector representation, $N_S(u)$ is the set of neighbor nodes of enterprise node u, and $\Pr(N_S(u) \mid f(u))$ is the probability of inferring, from the vector representation of enterprise node u, that the neighborhood of u is $N_S(u)$.
To simplify solving and calculation, the probabilities of inferring each neighbor of an enterprise node from that node are assumed to be mutually independent, which gives:
$$\Pr\left(N_S(u) \mid f(u)\right) = \prod_{n_i \in N_S(u)} \Pr\left(n_i \mid f(u)\right)$$
To keep the likelihood that two enterprise nodes are neighbors of each other symmetric, the cosine similarity between their two vectors is used to represent that likelihood, and the sum of the similarities over all enterprise nodes is used as the normalization factor. This gives the probability that enterprise node $n_i$ is a neighbor of enterprise node u, which can be expressed as follows:
Figure BDA0002403273910000133
In summary, the node2vec algorithm is used for the node embedding vector expression calculation. During vector representation learning, cosine similarity represents the likelihood that enterprise nodes are neighbors of each other, and the sum of the similarities over all enterprise nodes serves as the normalization factor, giving the probability that enterprise node $n_i$ is a neighbor of enterprise node u. As a result, neighboring enterprise nodes are close in the embedding space and structurally similar enterprise nodes have similar embeddings. Because both the structural information between enterprise nodes and the presence or absence of edges are learned, the resulting vectors express the structural features of the enterprise nodes, which helps the graph neural network model accurately identify their upstream and downstream relationships.
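The neighbor probability and the independence assumption described above can be illustrated numerically. The sketch below assumes a softmax over cosine similarities, that is, each similarity is exponentiated and divided by the sum over all enterprise nodes; the exponentiation is an assumption, and the published node2vec formulation uses a dot product in that position. The embeddings and function names are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def neighbor_probability(embeddings, u, n_i):
    """Probability that n_i is a neighbor of u, normalized over all nodes."""
    exp_scores = {
        v: math.exp(cosine_similarity(vec, embeddings[u]))
        for v, vec in embeddings.items()
    }
    return exp_scores[n_i] / sum(exp_scores.values())


def log_likelihood(embeddings, neighborhoods):
    """Objective value: sum over u of log Pr(N_S(u) | f(u)), with neighbors independent."""
    return sum(
        math.log(neighbor_probability(embeddings, u, n_i))
        for u, neighbors in neighborhoods.items()
        for n_i in neighbors
    )


# Hypothetical 2-dimensional embeddings: B points roughly the same way as A, C opposite.
embeddings = {"A": [1.0, 0.0], "B": [0.8, 0.6], "C": [-1.0, 0.0]}
p_b_given_a = neighbor_probability(embeddings, "A", "B")
p_c_given_a = neighbor_probability(embeddings, "A", "C")
objective = log_likelihood(embeddings, {"A": ["B"], "B": ["A"]})
```

Training would adjust the embeddings to raise `objective`, which pushes the vectors of actual neighbors toward each other.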
To express the features of an enterprise node more accurately, in one or more embodiments of this specification a vector expressing the attribute features of the enterprise node is also generated from the node's enterprise attributes, that is, the feature information of the enterprise itself. The vector expressing the structural features of the enterprise node is combined with the vector expressing its attribute features to obtain the vector of the enterprise node, and the combined vector is input into the graph neural network model for training. For example, the two vectors may be spliced with a CONCAT vector splicing function. In this way the vector of an enterprise node expresses both its structural features and its attribute features, so the features of the enterprise node are expressed accurately and upstream and downstream relationship recognition fuses attribute semantic analysis with topological structure.
The vector for expressing the attribute features of the enterprise node may be generated by performing semantic analysis on enterprise attributes of the enterprise node in an enterprise network sample. Business attributes correspond to a business representation, for example, the business attributes may include the following:
online merchant category attribute: the corresponding attribute values may be, for example, Taobao, Tmall, Chinese station, etc.;
sales characteristic attribute: the corresponding attribute values may include, for example, main business, commodity type, sales amount, etc.;
offline payment collection code merchant attribute: the corresponding attribute values may include, for example, operating hours, amount, business address, LBS, etc.
Suppose that, based on these enterprise attributes, the vector of enterprise node attribute features takes its values in the following order: online merchant category attribute, sales characteristic attribute, offline payment collection code merchant attribute. Missing information may default to 0. For textual attribute values, corresponding numbers can be assigned and used in the vector to express the corresponding semantics; for example, a Taobao shop corresponds to 1, clothing corresponds to 1, and women's clothing corresponds to 2. Suppose an online enterprise is a Taobao shop whose main business is clothing, whose commodity types include women's clothing, and whose total sales amount is 10000. Under this scenario, the vector of attribute features of the enterprise node is (1, 1, 2, 10000).
The above-mentioned enterprise attributes are only used for describing the feature expression of the enterprise node attribute features in the embodiments of the present specification, and do not limit the embodiments of the present specification.
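The attribute encoding and the CONCAT splicing described above can be sketched together. The lookup tables, the profile keys, the function names, and the two-dimensional structural vector are hypothetical; only the numeric codes and the resulting attribute vector (1, 1, 2, 10000) come from the example.

```python
from itertools import chain

# Hypothetical code tables for textual attribute values (codes from the example).
ONLINE_CATEGORY_CODES = {"Taobao shop": 1}
MAIN_BUSINESS_CODES = {"clothing": 1}
COMMODITY_TYPE_CODES = {"women's clothing": 2}


def encode_enterprise_attributes(profile):
    """Encode an enterprise profile as an attribute feature vector; missing values become 0."""
    return (
        ONLINE_CATEGORY_CODES.get(profile.get("online_category"), 0),
        MAIN_BUSINESS_CODES.get(profile.get("main_business"), 0),
        COMMODITY_TYPE_CODES.get(profile.get("commodity_type"), 0),
        profile.get("sales_amount", 0),
    )


def concat(*vectors):
    """CONCAT: splice vectors end to end into one vector."""
    return tuple(chain.from_iterable(vectors))


profile = {
    "online_category": "Taobao shop",
    "main_business": "clothing",
    "commodity_type": "women's clothing",
    "sales_amount": 10000,
}
attribute_vector = encode_enterprise_attributes(profile)
structural_vector = (0.12, -0.40)  # stand-in for a learned node embedding
node_vector = concat(structural_vector, attribute_vector)
```

In practice the raw sales amount is on a very different scale from the other entries, so some normalization before training would probably be sensible; the specification does not discuss this.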
The method for establishing the enterprise upstream and downstream relationship identification model provided in the embodiments of this specification is further described below with reference to fig. 2, taking as an example the case in which an attention-based breadth adaptive function is used to aggregate enterprise nodes and a depth adaptive function based on an LSTM operator is used to update the feature expression. Fig. 2 is a flowchart illustrating a processing procedure of a method for establishing an enterprise upstream and downstream relationship identification model according to one or more embodiments of the present disclosure, where the specific steps include step 202 to step 222.
Step 202: Acquire an enterprise network sample.
Step 204: Perform node embedding vector expression calculation on the enterprise network sample to obtain vectors expressing the structural features of the enterprise nodes.
Step 206: Input the vector of the enterprise node into an attention-based breadth adaptive function for aggregation to obtain the aggregated feature expression, for the next iteration, of the neighbor nodes aggregated by the enterprise node.
The importance of enterprise nodes to one another is used to aggregate the enterprise nodes. For example, in a graph G = (V, E), V denotes the set of all nodes and E denotes the set of all edges. For enterprise node y, the importance of enterprise node x can be expressed as:
Figure BDA0002403273910000161
where x and y are the feature vector expressions of the corresponding enterprise nodes, $W_s^T$ and the matrix shown in Figure BDA0002403273910000162 are the feature transformation matrices for the originating and terminating enterprise nodes of an edge, respectively, $v^T$ is the attention translation vector, and $\mathrm{softmax}_{x,y}$ is a normalization function that can be expressed as:
Figure BDA0002403273910000163
where (x ', y) ∈ E represents an enterprise node x' in the enterprise network that is connected to enterprise node y.
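The normalization step can be sketched on its own. In the snippet below the raw importance scores are taken as given inputs, since the scoring expression itself is not reproduced here; the softmax then normalizes each score over all edges (x', y) that share the same enterprise node y. The function name and the sample scores are assumptions.

```python
import math
from collections import defaultdict


def normalize_attention(edge_scores):
    """Softmax-normalize raw scores over edges that end at the same node y.

    edge_scores: dict mapping an edge (x, y) to its raw importance score.
    Returns a dict with the same keys whose values sum to 1 for each y.
    """
    scores_by_target = defaultdict(list)
    for (_, y), score in edge_scores.items():
        scores_by_target[y].append(score)

    weights = {}
    for (x, y), score in edge_scores.items():
        peak = max(scores_by_target[y])  # subtract the peak for numerical stability
        denominator = sum(math.exp(s - peak) for s in scores_by_target[y])
        weights[(x, y)] = math.exp(score - peak) / denominator
    return weights


# Hypothetical raw scores: two edges into y with equal scores, one edge into z.
raw_scores = {("a", "y"): 0.0, ("b", "y"): 0.0, ("c", "z"): 3.0}
attention = normalize_attention(raw_scores)
```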
The aggregated feature expression of all neighbor nodes of enterprise node u for the next iteration t+1, denoted by the symbol in Figure BDA0002403273910000164, can be expressed as follows:
Figure BDA0002403273910000165
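One common way to realize such an aggregation is an attention-weighted sum of the neighbor feature expressions. The sketch below assumes that form; it is not confirmed by the text, and any additional transformation the actual expression applies is omitted. All names and values are illustrative.

```python
def aggregate_neighbors(neighbor_features, attention_weights):
    """Attention-weighted sum of neighbor feature vectors.

    neighbor_features: dict mapping neighbor id -> feature vector (list of floats).
    attention_weights: dict mapping neighbor id -> normalized importance.
    """
    dimension = len(next(iter(neighbor_features.values())))
    aggregated = [0.0] * dimension
    for neighbor, features in neighbor_features.items():
        weight = attention_weights[neighbor]
        for k in range(dimension):
            aggregated[k] += weight * features[k]
    return aggregated


# Hypothetical neighbors of enterprise node u with their normalized importance.
features_of_neighbors = {"v1": [1.0, 0.0], "v2": [0.0, 1.0]}
importance = {"v1": 0.25, "v2": 0.75}
aggregated_feature = aggregate_neighbors(features_of_neighbors, importance)
```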
Step 208: Input the vector of the enterprise node and the aggregated feature expression, for the next iteration, of the neighbor nodes aggregated by the enterprise node into a depth adaptive function based on an LSTM operator to obtain the feature expression of the enterprise node for the next iteration.
Assume the current iteration is order t, where t starts at 0 and the depth of the depth adaptive function increases as the number of iterations grows. The (t+1)-th order feature expression of an enterprise node is aggregated from the t-th order feature expression of that node and the t-th order feature expressions of all of its neighbor nodes. The cell state of the enterprise node must therefore be maintained across iterations, and an LSTM-like structure is used to fuse in the t-th order result. The structure consists of an input gate, a forget gate, and an output gate, which are described in detail below.
Input gate, which selects the important information in the t-th order result:
Figure BDA0002403273910000171
where the symbol in Figure BDA0002403273910000172 is the t-th order representation of enterprise node u, the symbol in Figure BDA0002403273910000173 is the vector obtained by aggregating all neighbor nodes of enterprise node u with the breadth adaptive function, the symbol in Figure BDA0002403273910000174 is the weight vector of the input gate, and CONCAT is the vector concatenation function.
Forget gate, which discards useless information in the previous cell state:
Figure BDA0002403273910000175
Output gate, which selects the useful information for the (t+1)-th iteration:
Figure BDA0002403273910000176
The cell state can be calculated as:
Figure BDA0002403273910000177
Figure BDA0002403273910000178
Figure BDA0002403273910000179
The output of the depth adaptive function can be expressed as the element-wise product of the output gate and the cell state:
Figure BDA00024032739100001710
For example, in the initial state, the parameters of the LSTM-operator-based depth adaptive function shown in Figure BDA00024032739100001711, Figure BDA00024032739100001712, and Figure BDA00024032739100001713 can be initialized; in later iterations, the parameters shown in Figure BDA00024032739100001716, Figure BDA00024032739100001714, and Figure BDA00024032739100001715 can be solved by gradient descent.
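The gate structure described above can be sketched with the standard LSTM cell equations. Everything specific in this sketch is an assumption: the sigmoid gate activations, the tanh candidate state, the tanh applied to the cell state before the output product, and the weight matrices `W_i`, `W_f`, `W_o`, `W_c` are borrowed from the conventional LSTM and are not confirmed by the text.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def depth_update(h_u, h_neighbors, cell, W_i, W_f, W_o, W_c):
    """One LSTM-style update of a node's feature expression and cell state.

    h_u: order-t feature expression of the node.
    h_neighbors: aggregated neighbor vector from the breadth function.
    cell: cell state carried over from the previous iteration.
    """
    z = list(h_u) + list(h_neighbors)  # CONCAT of node and neighbor vectors
    input_gate = [sigmoid(a) for a in matvec(W_i, z)]    # what to take in
    forget_gate = [sigmoid(a) for a in matvec(W_f, z)]   # what to discard
    output_gate = [sigmoid(a) for a in matvec(W_o, z)]   # what to expose
    candidate = [math.tanh(a) for a in matvec(W_c, z)]   # candidate cell content
    new_cell = [f * c + i * g
                for f, c, i, g in zip(forget_gate, cell, input_gate, candidate)]
    new_h = [o * math.tanh(c) for o, c in zip(output_gate, new_cell)]
    return new_h, new_cell


# With all-zero weights every gate equals 0.5 and the candidate equals 0,
# so the cell state is simply halved.
zero_weights = [[0.0, 0.0]]
next_h, next_cell = depth_update([0.3], [0.7], [2.0],
                                 zero_weights, zero_weights,
                                 zero_weights, zero_weights)
```

Keeping a separate cell state lets information from earlier iterations survive, which is why the depth of neighborhood exploration can grow without earlier results being overwritten.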
Step 210: Update the next iteration to be the current iteration.
For example, let t = t + 1.
Step 212: Combine the feature expression of the enterprise node in the current iteration, the feature expression of the neighbor node of the enterprise node in the current iteration, and the feature expression of the relationship between the enterprise node and the neighbor node to form the upstream and downstream relationship features of the enterprise node.
For example, assuming order t is the current iteration, the symbol in Figure BDA0002403273910000181 is the feature expression of enterprise node u, the symbol in Figure BDA0002403273910000182 is the feature expression of neighbor node v of u, and $X_{E(u,v)}$ is the feature expression of the relationship between enterprise node u and neighbor node v. The expression in Figure BDA0002403273910000183 is then the upstream and downstream relationship feature of enterprise node u, where CONCAT is the vector concatenation function.
Step 214: Evaluate the upstream and downstream relationship features of the enterprise node through a fully connected layer to obtain the upstream and downstream relationship recognition result for the enterprise node.
The fully connected layer evaluates the upstream and downstream relationship feature shown in Figure BDA0002403273910000184 to obtain a score for the upstream and downstream relationship between enterprise nodes u and v, which is the recognition result for the upstream and downstream relationship between the corresponding enterprise nodes u and v.
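Steps 212 and 214 can be sketched as a concatenation followed by a single fully connected unit. The sigmoid output, the weight values, and the function names are assumptions made for illustration; the text only states that a fully connected layer produces the score.

```python
import math


def relation_feature(h_u, h_v, x_edge):
    """CONCAT of the node expression, the neighbor expression, and the edge expression."""
    return list(h_u) + list(h_v) + list(x_edge)


def fully_connected_score(feature, weights, bias):
    """Single-unit fully connected layer with a sigmoid, giving a score in (0, 1)."""
    logit = sum(w * x for w, x in zip(weights, feature)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical small vectors for nodes u and v and their relationship.
feature_uv = relation_feature([0.1], [0.2], [1.0, 0.0])
neutral_score = fully_connected_score(feature_uv, [0.0, 0.0, 0.0, 0.0], 0.0)
positive_score = fully_connected_score(feature_uv, [0.0, 0.0, 2.0, 0.0], 0.0)
```

A score above a chosen threshold would be read as "upstream and downstream relationship present"; the threshold itself is a design choice not fixed by the text.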
Step 216: Determine whether the recognition result satisfies the minimum loss.
If the minimum loss is satisfied, proceed to step 222.
Step 218: If the minimum loss is not satisfied, update the parameters of the graph neural network model, and input the feature expression of the enterprise node in the current iteration into the attention-based breadth adaptive function for aggregation to obtain the aggregated feature expression, for the next iteration, of the neighbor nodes aggregated by the enterprise node.
For example, if the minimum loss is not satisfied, the parameters $W_s^t$, the parameter in Figure BDA0002403273910000185, $W_i^t$, the parameter in Figure BDA0002403273910000186, and the parameter in Figure BDA0002403273910000187 are solved and updated by gradient descent.
Step 220: Input the feature expression of the enterprise node in the current iteration and the aggregated feature expression, for the next iteration, of the neighbor nodes aggregated by the enterprise node into the depth adaptive function based on the LSTM operator to obtain the feature expression of the enterprise node for the next iteration.
Return to step 210.
Step 222: End the iteration to obtain the trained graph neural network model.
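The control flow of steps 214 to 222 (score, check the loss, update parameters by gradient descent, repeat, then stop) can be sketched on a deliberately simplified model. The sketch trains only a single fully connected unit on fixed feature vectors and omits the breadth and depth functions entirely; the stopping tolerance, learning rate, and data are assumptions.

```python
import math


def train_until_loss_is_minimal(features, labels, learning_rate=0.5,
                                tolerance=1e-6, max_iterations=10000):
    """Gradient descent on a single sigmoid unit with a cross-entropy loss.

    Stops when the loss no longer improves by more than `tolerance`,
    which stands in for the "minimum loss" check of step 216.
    """
    count, dimension = len(features), len(features[0])
    weights, bias = [0.0] * dimension, 0.0
    previous_loss, loss = float("inf"), float("inf")
    for _ in range(max_iterations):
        # Step 214: score every sample with the fully connected unit.
        scores = [1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(weights, f)) + bias)))
                  for f in features]
        # Step 216: evaluate the cross-entropy loss of the recognition result.
        loss = -sum(y * math.log(max(p, 1e-12)) + (1 - y) * math.log(max(1.0 - p, 1e-12))
                    for p, y in zip(scores, labels)) / count
        if previous_loss - loss < tolerance:
            break  # Step 222: the loss is treated as minimal, end the iteration.
        previous_loss = loss
        # Step 218: update the parameters by gradient descent.
        for k in range(dimension):
            gradient = sum((p - y) * f[k] for p, y, f in zip(scores, labels, features)) / count
            weights[k] -= learning_rate * gradient
        bias -= learning_rate * sum(p - y for p, y in zip(scores, labels)) / count
    return weights, bias, loss


# Hypothetical one-dimensional relationship features with 0/1 labels.
toy_features = [[0.0], [1.0], [2.0], [3.0]]
toy_labels = [0, 0, 1, 1]
trained_weights, trained_bias, final_loss = train_until_loss_is_minimal(toy_features, toy_labels)
```

In the full method the parameters of the breadth and depth functions would be updated in the same loop, with the node feature expressions recomputed on every pass.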
In this embodiment, the attention-based breadth adaptive function is used to aggregate the enterprise nodes to obtain their feature expressions, and the depth adaptive function based on the LSTM operator is used to update the feature expressions. The feature expression of each enterprise node is thus explored in both breadth and depth on the basis of the attention mechanism, which yields a more accurate feature expression of the upstream and downstream relationships and helps the graph neural network model accurately identify the upstream and downstream relationships of the enterprise nodes.
Corresponding to the above method embodiment, the present specification further provides an embodiment of an apparatus for establishing an enterprise upstream and downstream relationship identification model, and fig. 3 illustrates a schematic structural diagram of an apparatus for establishing an enterprise upstream and downstream relationship identification model provided in an embodiment of the present specification. As shown in fig. 3, the apparatus includes: a sample acquisition module 302, a sample vector calculation module 304, and a model iteration module 306.
The sample acquisition module 302 may be configured to acquire an enterprise network sample.
The sample vector calculation module 304 may be configured to perform node-embedded vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural features of the enterprise nodes.
The model iteration module 306 may be configured to input the vector of the enterprise node into a graph neural network model for iteration, where the graph neural network model performs training of identifying upstream and downstream relationships of the enterprise node by using upstream and downstream relationship features of the enterprise node in an iteration process, where the upstream and downstream relationship features of the enterprise node include feature expressions of the enterprise node, feature expressions of neighbor nodes of the enterprise node, and feature expressions of relationships between the enterprise node and the neighbor nodes; and finishing iteration to obtain the trained graph neural network model.
In summary, the device performs node embedding vector expression calculation on an enterprise network sample to obtain a vector for expressing the structural features of the enterprise node, and the vector of the enterprise node is sufficient for identifying the neighbors of the enterprise node in the enterprise network, so that the vector of the enterprise node is input into a graph neural network model for iteration, and the feature expression of the enterprise node calculated in the graph neural network model iteration process can aggregate the feature information of the neighbor nodes based on the mechanism of the graph neural network model. Therefore, the upstream and downstream relation characteristics of the enterprise nodes including the characteristic expression of the enterprise nodes, the characteristic expression of the neighbor nodes of the enterprise nodes and the relation characteristic expression between the enterprise nodes and the neighbor nodes can accurately express the upstream and downstream relation. Therefore, the graph neural network uses the upstream and downstream relation characteristics of the enterprise nodes to train the upstream and downstream relation identification of the enterprise nodes in the iteration process, and a graph neural network model for accurately identifying the upstream and downstream relation of the enterprise nodes can be obtained.
In one or more embodiments of the present description, the sample vector calculation module 304 may be configured to perform node embedding vector expression calculation on the enterprise network sample by using a node2vec algorithm, so as to obtain a vector for expressing the structural features of the enterprise nodes. When the node2vec algorithm is used for the node embedding vector expression calculation, cosine similarity represents, during vector representation learning, the likelihood that enterprise nodes are neighbors of each other, and the sum of the similarities over all enterprise nodes serves as the normalization factor, giving the probability that enterprise node $n_i$ is a neighbor of enterprise node u. As a result, neighboring enterprise nodes are close in the embedding space and structurally similar enterprise nodes have similar embeddings. Because both the structural information between enterprise nodes and the presence or absence of edges are learned, the resulting vectors express the structural features of the enterprise nodes, which helps the graph neural network model accurately identify their upstream and downstream relationships.
Fig. 4 is a schematic structural diagram illustrating an apparatus for establishing an enterprise upstream and downstream relationship identification model according to another embodiment of the present disclosure. As shown in fig. 4, the apparatus may further include: a sample attribute feature calculation module 308, and a sample vector combination module 310.
The sample attribute feature calculation module 308 may be configured to generate a vector for expressing the attribute features of the enterprise node based on the enterprise attributes of the enterprise node.
The sample vector combination module 310 may be configured to combine the vector for expressing the structural feature of the enterprise node with the vector for expressing the attribute feature of the enterprise node to obtain the vector of the enterprise node. Through the implementation mode, the vectors of the enterprise nodes can express the structural characteristics of the enterprise nodes and the attribute characteristics of the enterprise nodes, the characteristics of the enterprise nodes can be accurately expressed, and the upstream and downstream relation recognition of the fusion attribute semantic analysis and the topological structure is realized.
In one or more embodiments of the present description, the enterprise nodes are aggregated by using an attention-based breadth adaptive function, and the feature expression is updated by using a depth adaptive function based on an LSTM operator. Specifically, for example, as shown in fig. 4, the model iteration module 306 of the apparatus may include: an initial vector input submodule 3060, an initial feature update submodule 3061, an iterative process update submodule 3062, a relation feature combination submodule 3063, an upstream and downstream discrimination submodule 3064, a loss judgment submodule 3065, an iteration end submodule 3066, a parameter update submodule 3067, a neighbor feature update submodule 3068 and a node feature update submodule 3069.
The initial vector input submodule 3060 may be configured to input the vector of the enterprise node into an attention-based breadth adaptive function for aggregation, so as to obtain the aggregated feature expression, for the next iteration, of the neighbor nodes aggregated by the enterprise node.
The initial feature update submodule 3061 may be configured to input the vector of the enterprise node and the aggregated feature expression, for the next iteration, of the neighbor nodes aggregated by the enterprise node into a depth adaptive function based on an LSTM operator, so as to obtain the feature expression of the enterprise node for the next iteration.
The iterative process update submodule 3062 may be configured to update the next iterative process to the current iterative process.
The relationship feature combination submodule 3063 may be configured to combine the feature expression of the current iterative process of the enterprise node, the feature expression of the current iterative process of the neighboring node of the enterprise node, and the feature expression of the relationship between the enterprise node and the neighboring node to form the upstream and downstream relationship features of the enterprise node.
The upstream and downstream determining submodule 3064 may be configured to determine the upstream and downstream relationship characteristics of the enterprise node through a full connection layer, so as to obtain a result of identifying the upstream and downstream relationship of the enterprise node.
The loss determination submodule 3065 may be configured to determine whether the recognition result satisfies a loss minimum.
The end-of-iteration sub-module 3066 may be configured to enter a step of ending the iteration to obtain a trained graph neural network model if the minimum loss is satisfied.
The parameter update submodule 3067 may be configured to update parameters of the graph neural network model if the loss minimization is not satisfied.
The neighbor feature updating submodule 3068 may be configured to input the feature expression of the current iteration process of the enterprise node into an attention-based breadth adaptive function for aggregation, so as to obtain an aggregated feature expression of a next iteration process of the neighbor node aggregated by the enterprise node.
The node feature updating submodule 3069 may be configured to input the feature expression of the enterprise node in the current iteration and the aggregated feature expression, for the next iteration, of the neighbor nodes aggregated by the enterprise node into a depth adaptive function based on an LSTM operator, so as to obtain the feature expression of the enterprise node for the next iteration, and then to retrigger the iterative process update submodule 3062.
In this embodiment, the attention-based breadth adaptive function is used to aggregate the enterprise nodes to obtain their feature expressions, and the depth adaptive function based on the LSTM operator is used to update the feature expressions. The feature expression of each enterprise node is thus explored in both breadth and depth on the basis of the attention mechanism, which yields a more accurate feature expression of the upstream and downstream relationships and helps the graph neural network model accurately identify the upstream and downstream relationships of the enterprise nodes.
The above is an illustrative scheme of an apparatus for establishing an enterprise upstream and downstream relationship identification model according to this embodiment. It should be noted that the technical solution of the device for establishing the enterprise upstream and downstream relationship identification model and the technical solution of the method for establishing the enterprise upstream and downstream relationship identification model belong to the same concept, and details of the technical solution of the device for establishing the enterprise upstream and downstream relationship identification model, which are not described in detail, can be referred to the description of the technical solution of the method for establishing the enterprise upstream and downstream relationship identification model.
Fig. 5 is a flowchart illustrating a method for mining upstream and downstream relationships of an enterprise, according to an embodiment of the present disclosure, including steps 502 to 504.
Step 502: performing node embedding vector expression calculation on an enterprise network formed by two or more enterprises to obtain, for each of the two or more enterprise nodes, a vector expressing the structural features of that enterprise node.
For example, node embedding vector expression calculation may be performed on an enterprise network formed by two or more enterprises based on the node2vec algorithm adopted in the method for establishing an enterprise upstream and downstream relationship identification model according to the above embodiments of the present specification.
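As a rough illustration of the walk-generation stage of node2vec, the sketch below produces one biased second-order random walk over a toy enterprise graph. The graph, the enterprise names and the parameter values are hypothetical; the subsequent skip-gram training that turns walks into embedding vectors is omitted.

```python
import random


def node2vec_walk(graph, start, length, p=1.0, q=1.0, rng=None):
    """Generate one node2vec-style biased random walk.

    graph: dict mapping each node to the set of its neighbors.
    p: return parameter (a larger p makes stepping back less likely).
    q: in-out parameter (a larger q keeps the walk close to the previous node).
    """
    rng = rng or random.Random(0)
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbors = sorted(graph[current])
        if not neighbors:
            break
        if len(walk) == 1:
            # First step: no previous node yet, so choose uniformly.
            walk.append(rng.choice(neighbors))
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbors:
            if candidate == previous:
                weights.append(1.0 / p)  # step back to the previous node
            elif candidate in graph[previous]:
                weights.append(1.0)      # stay one hop away from the previous node
            else:
                weights.append(1.0 / q)  # move further away
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


# Hypothetical enterprise network: nodes are enterprises, edges are relations.
enterprise_graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
walk = node2vec_walk(enterprise_graph, "A", length=6, p=0.5, q=2.0)
```

Many such walks, started from every node, would then be fed to a skip-gram model to obtain the structural feature vector of each enterprise node.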
Step 504: inputting the vectors of the two or more enterprise nodes into a trained graph neural network model obtained by the method for establishing the enterprise upstream and downstream relationship identification model according to any embodiment of the present specification, and outputting the identification results of the upstream and downstream relationships of the two or more enterprise nodes.
Corresponding to the method for establishing the enterprise upstream and downstream relationship identification model, in an enterprise network formed by two or more enterprise nodes, edges may have corresponding relationship attributes, so that the graph neural network model can obtain feature expressions of the relationships between enterprise nodes. The model combines the feature expression of an enterprise node, the feature expressions of its neighbor nodes, and the feature expressions of the relationships between the enterprise node and its neighbor nodes to obtain the upstream and downstream relationship features of the enterprise node, which can then be identified through a fully connected layer.
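A minimal sketch of this combination step follows, assuming that the three feature expressions are simply concatenated and passed through a single fully connected layer with a sigmoid output. The function name, the weights and the feature values are hypothetical, and the specification does not state how the features are combined.

```python
import math


def relationship_score(h_node, h_neighbor, h_edge, weights, bias):
    """Concatenate node, neighbor and edge features, then apply one
    fully connected layer followed by a sigmoid to get a score in (0, 1)."""
    features = list(h_node) + list(h_neighbor) + list(h_edge)
    if len(features) != len(weights):
        raise ValueError("weight vector must match the concatenated feature length")
    logit = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Toy feature expressions and untrained (hypothetical) layer parameters.
score = relationship_score(
    h_node=[0.2, 0.7],
    h_neighbor=[0.5, 0.1],
    h_edge=[1.0],
    weights=[0.4, -0.3, 0.8, 0.1, 0.6],
    bias=-0.2,
)
```

In a trained model the weights and bias would be learned during the iteration process rather than set by hand.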
It can be seen that, in this embodiment, the graph structure information of the enterprise network is used to perform node embedding vector expression calculation to obtain vectors for expressing the structural features of the enterprise nodes, and the vectors of two or more enterprise nodes are input into the graph neural network model.
To express the features of the enterprise nodes more accurately, corresponding to an embodiment of the method for establishing the enterprise upstream and downstream relationship identification model, a vector expressing the attribute features of each of the two or more enterprise nodes may further be generated according to the enterprise attributes of that node; and, for each of the two or more enterprise nodes, the vector expressing the structural features of the enterprise node is combined with the vector expressing its attribute features to obtain the vector of that enterprise node. In this implementation, the vector of an enterprise node expresses both the structural features and the attribute features of the node, so the features of the enterprise node are expressed accurately, and upstream and downstream relationship identification that fuses attribute semantic analysis with topological structure is realized.
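One simple way to realize the combination is concatenation, sketched below. The attribute fields (`industry`, `employees`), the scaling constant and the encoding are all hypothetical; the specification does not fix which enterprise attributes are used or how they are encoded.

```python
def encode_attributes(enterprise, industries):
    """Turn enterprise attributes into a numeric vector:
    a one-hot industry code plus a scaled employee count (illustrative)."""
    one_hot = [1.0 if enterprise["industry"] == name else 0.0 for name in industries]
    return one_hot + [enterprise["employees"] / 1000.0]


def combine_vectors(structural, attribute):
    """Concatenate the structural vector and the attribute vector."""
    return list(structural) + list(attribute)


industries = ["retail", "wholesale"]
structural_vector = [0.1, 0.2]  # e.g. produced by the node embedding step
attribute_vector = encode_attributes({"industry": "retail", "employees": 500}, industries)
node_vector = combine_vectors(structural_vector, attribute_vector)
```

The resulting `node_vector` carries both the topological position of the enterprise and its own attributes into the graph neural network model.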
In one or more embodiments of the present specification, considering that enterprise transaction data can indirectly indicate potential upstream and downstream relationships between related enterprises, enterprise upstream and downstream relationships may be further mined from enterprise transaction data according to certain rules. Enterprises generally operate in one of two modes, namely an affiliate mode and a stocking mode. These two modes are described separately below.
In the affiliate mode, an enterprise usually ships goods directly to individual customers. The transaction data therefore shows high delivery dispersion and high transaction frequency, where the delivery dispersion equals the number of consignees divided by the number of transactions. In addition, for the upstream side of the affiliate mode, the proportion of enterprise buyers in the transaction data is high and the proportion of individual buyers is low. Based on these characteristics, enterprises belonging to the affiliate mode can be identified, and each such enterprise and the buyer enterprise of the corresponding transaction are output as an upstream-downstream relationship. Specifically, in this embodiment, enterprise transaction data may be further obtained; the delivery dispersion and the transaction frequency of the enterprise are calculated from the enterprise transaction data; the enterprises belonging to the affiliate mode are determined using the delivery dispersion and the transaction frequency; and each enterprise belonging to the affiliate mode and the buyer enterprise of the corresponding transaction are output as an upstream-downstream relationship.
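The two affiliate-mode features can be computed as in the sketch below. The delivery dispersion follows the definition given above (number of consignees divided by number of transactions); measuring transaction frequency as transactions per day, the record layout and the threshold values are assumptions made for illustration.

```python
def affiliate_features(transactions, days):
    """Return (delivery dispersion, transaction frequency) for one enterprise.

    transactions: list of dicts, each with a "consignee" key.
    days: length of the observation window in days (assumed frequency unit).
    """
    count = len(transactions)
    if count == 0:
        return 0.0, 0.0
    consignees = {t["consignee"] for t in transactions}
    dispersion = len(consignees) / count
    frequency = count / days
    return dispersion, frequency


def is_affiliate_mode(dispersion, frequency, min_dispersion=0.8, min_frequency=1.0):
    """Threshold rule; both thresholds are hypothetical placeholders."""
    return dispersion >= min_dispersion and frequency >= min_frequency


transactions = [
    {"consignee": "a"},
    {"consignee": "b"},
    {"consignee": "c"},
    {"consignee": "a"},
]
dispersion, frequency = affiliate_features(transactions, days=2)
```

Here three distinct consignees over four transactions give a dispersion of 0.75, and four transactions in two days give a frequency of 2.0 per day.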
In the stocking mode, an enterprise usually purchases goods from a wholesaler and then resells them. The transaction data therefore shows high transaction frequency and a large quantity of commodities per transaction. In addition, for the upstream side of the stocking mode, the proportion of enterprise buyers in the transaction data is high and the proportion of individual buyers is low. Based on these characteristics, enterprises belonging to the stocking mode can be identified, and each such enterprise and the buyer enterprise of the corresponding transaction are output as an upstream-downstream relationship. Specifically, in this embodiment, enterprise transaction data may be further obtained; the transaction frequency of the enterprise and the number of commodities per transaction are calculated from the enterprise transaction data; the enterprises belonging to the stocking mode are determined using the transaction frequency and the number of commodities per transaction; and each enterprise belonging to the stocking mode and the buyer enterprise of the corresponding transaction are output as an upstream-downstream relationship.
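A corresponding sketch for the stocking mode is given below. The record fields (`buyer`, `buyer_is_enterprise`, `quantity`), the per-day frequency unit and the thresholds are hypothetical; the output pairs an enterprise judged to be in the stocking mode with each enterprise buyer of its transactions.

```python
def stocking_features(transactions, days):
    """Return (transaction frequency, average commodities per transaction)."""
    count = len(transactions)
    if count == 0:
        return 0.0, 0.0
    frequency = count / days
    average_quantity = sum(t["quantity"] for t in transactions) / count
    return frequency, average_quantity


def stocking_relationships(enterprise, transactions, days,
                           min_frequency=1.0, min_quantity=50.0):
    """Return (enterprise, buyer enterprise) pairs if the enterprise passes
    the threshold rule; thresholds are placeholders, not values from the text."""
    frequency, average_quantity = stocking_features(transactions, days)
    if frequency < min_frequency or average_quantity < min_quantity:
        return []
    buyers = sorted({t["buyer"] for t in transactions if t["buyer_is_enterprise"]})
    return [(enterprise, buyer) for buyer in buyers]


transactions = [
    {"buyer": "B1", "buyer_is_enterprise": True, "quantity": 100},
    {"buyer": "B2", "buyer_is_enterprise": True, "quantity": 60},
    {"buyer": "p1", "buyer_is_enterprise": False, "quantity": 20},
]
pairs = stocking_relationships("S", transactions, days=2)
```

Individual buyers are excluded from the output because only buyer enterprises take part in the upstream-downstream relationship.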
The specific manner of determining the enterprises belonging to the affiliate mode using the delivery dispersion and the transaction frequency is not limited, nor is the specific manner of determining the enterprises belonging to the stocking mode using the transaction frequency and the number of commodities per transaction. For example, the affiliate mode may be determined by presetting a threshold for the delivery dispersion and a threshold for the transaction frequency. Likewise, the stocking mode may be determined by presetting a threshold for the transaction frequency and a threshold for the number of commodities per transaction. As another example, the affiliate mode may be determined by a trained decision-tree-based affiliate mode recognition model, and the stocking mode may be determined by a trained decision-tree-based stocking mode recognition model. Specifically, for example, strong-confidence enterprise upstream and downstream relationship samples may be obtained first; these samples are then used to train affiliate mode discrimination and/or stocking mode discrimination with a decision tree algorithm, so as to obtain a trained decision-tree-based affiliate mode recognition model and/or stocking mode recognition model.
A decision tree is a classification method in which a recognition model is obtained by learning from samples. For example, each enterprise upstream and downstream relationship sample may have a corresponding delivery dispersion, transaction frequency, and category, such as whether it belongs to the affiliate mode. As another example, each sample may have a corresponding transaction frequency, number of commodities per transaction, and category, such as whether it belongs to the stocking mode. By learning from these samples with a decision tree, a classifier is obtained, namely the affiliate mode recognition model or the stocking mode recognition model. For an enterprise relationship to be identified, the delivery dispersion and transaction frequency of the enterprise's transactions may be input into the affiliate mode recognition model to determine whether the enterprise belongs to the affiliate mode; the transaction frequency and number of commodities per transaction may be input into the stocking mode recognition model to determine whether the enterprise belongs to the stocking mode. Because the affiliate mode or stocking mode discrimination is trained with a decision tree algorithm on strong-confidence samples, transaction data of a large number of enterprises can be discriminated in a relatively short time, which improves the efficiency of mining enterprise upstream and downstream relationships.
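To make the decision tree idea concrete, the sketch below trains a very small Gini-based binary tree on hand-made samples of (delivery dispersion, transaction frequency) labeled 1 for affiliate mode and 0 otherwise. The samples, the depth limit and the function names are invented for illustration; the specification does not prescribe a particular decision tree algorithm or library.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    positive = sum(labels) / len(labels)
    return 2.0 * positive * (1.0 - positive)


def best_split(rows, labels):
    """Find the (feature, threshold) pair with the lowest weighted Gini."""
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, feature, threshold)
    return best


def build_tree(rows, labels, depth=2):
    """Return a leaf label (0 or 1) or a tuple (feature, threshold, left, right)."""
    majority = int(2 * sum(labels) >= len(labels))
    split = best_split(rows, labels) if depth > 0 and gini(labels) > 0 else None
    if split is None:
        return majority
    _, feature, threshold = split
    left = [(r, y) for r, y in zip(rows, labels) if r[feature] <= threshold]
    right = [(r, y) for r, y in zip(rows, labels) if r[feature] > threshold]
    return (
        feature,
        threshold,
        build_tree([r for r, _ in left], [y for _, y in left], depth - 1),
        build_tree([r for r, _ in right], [y for _, y in right], depth - 1),
    )


def predict(tree, row):
    """Walk the tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if row[feature] <= threshold else right
    return tree


# Hypothetical strong-confidence samples: (delivery dispersion, transaction frequency).
samples = [(0.9, 5.0), (0.8, 4.0), (0.2, 6.0), (0.3, 0.5), (0.9, 0.2), (0.85, 0.3)]
labels = [1, 1, 0, 0, 0, 0]
affiliate_model = build_tree(samples, labels, depth=2)
is_affiliate = predict(affiliate_model, (0.88, 4.5))
```

A stocking mode recognition model could be trained the same way by swapping in transaction frequency and number of commodities per transaction as the two features.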
In one or more embodiments of the present application, the identification result may be scores of upstream and downstream relationships of the two or more enterprise nodes, and the method may further rank the upstream and downstream relationships of the two or more enterprise nodes according to the scores of the upstream and downstream relationships of the two or more enterprise nodes, so as to obtain the upstream and downstream relationships of the enterprises with different confidence degrees.
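The ranking step can be sketched as a sort by score followed by bucketing into confidence tiers. The tier names and cut-off values are hypothetical, as are the enterprise pairs and scores.

```python
def rank_relationships(scored_pairs, strong=0.8, weak=0.5):
    """Sort (pair, score) items by descending score and attach a tier label.

    strong / weak are illustrative cut-offs, not values from the specification.
    """
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)
    result = []
    for pair, score in ranked:
        if score >= strong:
            tier = "strong"
        elif score >= weak:
            tier = "medium"
        else:
            tier = "weak"
        result.append((pair, score, tier))
    return result


scored = [(("A", "B"), 0.62), (("C", "D"), 0.91), (("E", "F"), 0.30)]
ranked = rank_relationships(scored)
```

With these toy scores, the pair `("C", "D")` comes first in the "strong" tier, followed by `("A", "B")` as "medium" and `("E", "F")` as "weak".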
Therefore, according to one or more embodiments of the present application, a large number of enterprise upstream and downstream relationships with different confidence degrees can be mined from an enterprise network formed by two or more enterprises and from enterprise transaction data, so that the relationships among enterprises can be described more comprehensively, which is of great help in scenarios such as enterprise risk control and marketing. For example, as shown in the schematic diagram of the enterprise upstream and downstream relationship mining architecture in fig. 6, strong-confidence enterprise upstream and downstream relationship samples may be extracted from strong-confidence data such as billing data between enterprises and enterprise supply and sales data; weak-confidence enterprise upstream and downstream relationship samples, namely the relationships identified by rules, may be mined from enterprise transaction data according to certain rules; the strong-confidence samples may serve as the evaluation basis for the graph neural network model and for rule mining (such as the affiliate mode recognition model and the stocking mode recognition model); and, based on the graph neural network model, the features of the enterprises may be used to output model-scored enterprise upstream and downstream relationships. Enterprise upstream and downstream relationships with different confidence degrees are thereby formed and can serve as a reference for scenarios such as enterprise risk control and marketing.
Corresponding to the above method embodiment, the present specification further provides an embodiment of an apparatus for mining an upstream and downstream relationship of an enterprise, and fig. 7 illustrates a schematic structural diagram of an apparatus for mining an upstream and downstream relationship of an enterprise, provided in an embodiment of the present specification. As shown in fig. 7, the apparatus includes: a node vector calculation module 702 and an identification module 704.
The node vector calculation module 702 may be configured to perform node-embedded vector expression calculation on an enterprise network formed by two or more enterprises, so as to obtain vectors for expressing the structural features of the enterprise nodes of the two or more enterprise nodes.
The identification module 704 may be configured to input the vectors of the two or more enterprise nodes into a trained graph neural network model obtained by the method for establishing the enterprise upstream and downstream relationship identification model according to any embodiment of the present specification, and to output the identification results of the upstream and downstream relationships of the two or more enterprise nodes.
In the embodiment, the graph structure information of the enterprise network is utilized to perform node embedding vector expression calculation to obtain vectors for expressing the structural features of the enterprise nodes, the vectors of two or more enterprise nodes are input into the graph neural network model, and the graph neural network model performs training of identifying the upstream and downstream relations of the enterprise nodes by using the upstream and downstream relation features of the enterprise nodes, wherein the upstream and downstream relation features of the enterprise nodes comprise feature expression of the enterprise nodes, feature expression of neighbor nodes of the enterprise nodes and feature expression of the relations between the enterprise nodes and the neighbor nodes, so that the relations between enterprises are described more clearly, and the upstream and downstream relations between the two or more enterprise nodes can be accurately identified.
Fig. 8 is a schematic structural diagram illustrating an apparatus for enterprise upstream and downstream relationship mining according to another embodiment of the present disclosure. As shown in fig. 8, the apparatus may further include: a node attribute feature calculation module 706 and a node vector combination module 708.
The node attribute feature calculation module 706 may be configured to generate a vector for expressing the attribute features of the enterprise node for each of the two or more enterprise nodes according to the enterprise attribute for each of the two or more enterprise nodes.
The node vector combination module 708 is configured to combine, for the two or more enterprise nodes, the vector for expressing the structural feature of the enterprise node and the vector for expressing the attribute feature of the enterprise node, respectively, to obtain respective vectors of the two or more enterprise nodes.
In this implementation, the vector of an enterprise node expresses both the structural features and the attribute features of the node, so the features of the enterprise node are expressed accurately, and upstream and downstream relationship identification that fuses attribute semantic analysis with topological structure is realized.
In one or more embodiments of the present specification, considering that enterprise transaction data can indirectly indicate potential upstream and downstream relationships between related enterprises, enterprise upstream and downstream relationships may be further mined from enterprise transaction data. Enterprises generally operate in one of two modes, namely an affiliate mode and a stocking mode. These two modes are described separately below.
For the affiliate mode, as shown in fig. 8, the apparatus for enterprise upstream and downstream relationship mining may further include: a transaction data acquisition module 7100, an affiliate feature calculation module 7102, an affiliate mode determination module 7104 and an affiliate relationship output module 7106.
The transaction data acquisition module 7100 may be configured to acquire enterprise transaction data.
The affiliate feature calculation module 7102 may be configured to calculate the delivery dispersion and the transaction frequency of the enterprise based on the enterprise transaction data.
The affiliate mode determination module 7104 may be configured to determine the enterprises belonging to the affiliate mode using the delivery dispersion and the transaction frequency.
The affiliate relationship output module 7106 may be configured to output each enterprise belonging to the affiliate mode and the buyer enterprise of the corresponding transaction as an upstream-downstream relationship.
For the stocking mode, as shown in fig. 8, the apparatus for enterprise upstream and downstream relationship mining may further include: a stocking feature calculation module 7202, a stocking mode determination module 7204, and a stocking relationship output module 7206.
The stocking feature calculation module 7202 may be configured to calculate the transaction frequency of the enterprise and the number of commodities per transaction according to the enterprise transaction data.
The stocking mode determination module 7204 may be configured to determine the enterprises belonging to the stocking mode using the transaction frequency and the number of commodities per transaction.
The stocking relationship output module 7206 may be configured to output each enterprise belonging to the stocking mode and the buyer enterprise of the corresponding transaction as an upstream-downstream relationship.
In one or more embodiments of the present application, the identification result may be scores of upstream and downstream relationships of the two or more enterprise nodes, and the device for identifying the upstream and downstream relationships of the enterprise may further rank the upstream and downstream relationships of the two or more enterprise nodes according to the scores of the upstream and downstream relationships of the two or more enterprise nodes, so as to obtain the upstream and downstream relationships of the enterprise with different confidence degrees.
In one or more embodiments of the present application, by using the graph neural network model and the decision tree discrimination models, a large number of enterprise upstream and downstream relationships with different confidence degrees can be mined from an enterprise network formed by two or more enterprises and from enterprise transaction data, so that the relationships among enterprises can be described more comprehensively, which is of great help in scenarios such as enterprise risk control and marketing.
The above is an illustrative scheme of the apparatus for mining the upstream and downstream relationships of the enterprise according to the embodiment. It should be noted that the technical solution of the apparatus for mining the upstream and downstream relationship of the enterprise belongs to the same concept as the technical solution of the method for mining the upstream and downstream relationship of the enterprise described above, and details of the technical solution of the apparatus for mining the upstream and downstream relationship of the enterprise, which are not described in detail, can be referred to the description of the technical solution of the method for mining the upstream and downstream relationship of the enterprise described above.
FIG. 9 illustrates a block diagram of a computing device 900 provided in accordance with one embodiment of the present specification. Components of the computing device 900 include, but are not limited to, a memory 910 and a processor 920. The processor 920 is coupled to the memory 910 via a bus 930, and a database 950 is used to store data.
Computing device 900 also includes access device 940, which enables computing device 900 to communicate via one or more networks 960. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the internet. Access device 940 may include one or more of any type of network interface (e.g., a Network Interface Card (NIC)), whether wired or wireless, such as an IEEE 802.11 Wireless Local Area Network (WLAN) wireless interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the present description, the above-described components of computing device 900, as well as other components not shown in FIG. 9, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 9 is for purposes of example only and is not limiting as to the scope of the description. Those skilled in the art may add or replace other components as desired.
Computing device 900 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), a mobile phone (e.g., smartphone), a wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 900 may also be a mobile or stationary server.
Wherein, the processor 920 is configured to execute the following computer-executable instructions:
acquiring an enterprise network sample;
carrying out node embedding vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural characteristics of the enterprise nodes;
inputting the vectors of the enterprise nodes into a graph neural network model for iteration, and using the upstream and downstream relation characteristics of the enterprise nodes to train the upstream and downstream relation identification of the enterprise nodes in the iteration process of the graph neural network model, wherein the upstream and downstream relation characteristics of the enterprise nodes comprise characteristic expression of the enterprise nodes, characteristic expression of neighbor nodes of the enterprise nodes and characteristic expression of relations between the enterprise nodes and the neighbor nodes;
and finishing iteration to obtain the trained graph neural network model.
Optionally, the performing node-embedded vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural features of the enterprise node includes:
and carrying out node embedding vector expression calculation on the enterprise network sample by using a node2vec algorithm to obtain a vector for expressing the structural characteristics of the enterprise nodes.
Optionally, the method further comprises:
generating a vector for expressing the attribute characteristics of the enterprise nodes according to the enterprise attributes of the enterprise nodes;
and combining the vector for expressing the structural characteristics of the enterprise nodes with the vector for expressing the attribute characteristics of the enterprise nodes to obtain the vector of the enterprise nodes.
Optionally, the inputting the vector of the enterprise node into the neural network model for iteration includes:
inputting the vector of the enterprise node into an attention-based breadth adaptive function for aggregation to obtain an aggregation characteristic expression of a next iteration process of a neighbor node aggregated by the enterprise node;
inputting the vector of the enterprise node and the aggregated feature expression of the next iteration process of the neighbor node aggregated by the enterprise node into a depth adaptive function based on an LSTM operator to obtain the feature expression of the next iteration process of the enterprise node;
updating the next iteration process to the current iteration process;
combining the feature expression of the current iterative process of the enterprise node, the feature expression of the current iterative process of the neighbor node of the enterprise node and the feature expression of the relationship between the enterprise node and the neighbor node to form the upstream and downstream relationship features of the enterprise node;
judging the upstream and downstream relation characteristics of the enterprise nodes through a full connection layer to obtain the upstream and downstream relation identification results of the enterprise nodes;
judging whether the identification result meets the requirement of minimum loss or not;
under the condition of meeting the minimum loss, ending the iteration to obtain a trained graph neural network model;
under the condition of not meeting the minimum loss, updating parameters of the graph neural network model, inputting the feature expression of the current iteration process of the enterprise node into an attention-based breadth adaptive function for aggregation, and obtaining the aggregated feature expression of the next iteration process of the neighbor node aggregated by the enterprise node; inputting the feature expression of the current iteration process of the enterprise node and the aggregated feature expression of the next iteration process of the neighbor node aggregated by the enterprise node into a depth adaptive function based on an LSTM operator to obtain the feature expression of the next iteration process of the enterprise node; and re-entering the step of updating the next iteration process to the current iteration process.
The above is an illustrative scheme of a computing device of the present embodiment. It should be noted that the technical solution of the computing device and the technical solution of the method for establishing the enterprise upstream and downstream relationship identification model belong to the same concept, and details that are not described in detail in the technical solution of the computing device can be referred to the description of the technical solution of the method for establishing the enterprise upstream and downstream relationship identification model.
In another embodiment of the present description, the processor 920 may be configured to execute the following computer-executable instructions:
carrying out node embedding vector expression calculation on an enterprise network formed by two or more enterprises to obtain respective vectors of the two or more enterprise nodes for expressing the structural characteristics of the enterprise nodes;
and inputting the vectors of the two or more enterprise nodes into a trained graph neural network model obtained by the method for establishing the enterprise upstream and downstream relationship identification model according to any embodiment of the present specification, and outputting the identification results of the upstream and downstream relationships of the two or more enterprise nodes.
Optionally, the method further comprises:
generating vectors for expressing the attribute characteristics of the enterprise nodes of the two or more enterprise nodes according to the enterprise attributes of the two or more enterprise nodes;
and respectively aiming at the two or more enterprise nodes, combining the vector for expressing the structural characteristics of the enterprise nodes and the vector for expressing the attribute characteristics of the enterprise nodes to obtain respective vectors of the two or more enterprise nodes.
Optionally, the method further comprises:
acquiring enterprise transaction data;
calculating the delivery dispersion and the transaction frequency of the enterprise according to the enterprise transaction data;
determining the enterprises belonging to the affiliate mode by using the delivery dispersion and the transaction frequency;
and outputting each enterprise belonging to the affiliate mode and the buyer enterprise of the corresponding transaction as an upstream-downstream relationship.
Optionally, the method further comprises:
acquiring enterprise transaction data;
calculating the transaction frequency of the enterprise and the number of commodities per transaction according to the enterprise transaction data;
determining the enterprises belonging to the stocking mode by using the transaction frequency and the number of commodities per transaction;
and outputting each enterprise belonging to the stocking mode and the buyer enterprise of the corresponding transaction as an upstream-downstream relationship.
Optionally, the identification result is a score of an upstream-downstream relationship between the two or more enterprise nodes, and the method further includes:
and sequencing the upstream and downstream relations of the two or more enterprise nodes according to the scores of the upstream and downstream relations of the two or more enterprise nodes to obtain the upstream and downstream relations of the enterprises with different confidence degrees.
The above is an illustrative scheme of a computing device of the present embodiment. It should be noted that the technical solution of the computing device and the technical solution of the method for mining the upstream and downstream relationship of the enterprise described above belong to the same concept, and details that are not described in detail in the technical solution of the computing device can be referred to the description of the technical solution of the method for mining the upstream and downstream relationship of the enterprise described above.
An embodiment of the present specification also provides a computer readable storage medium storing computer instructions that, when executed by a processor, are operable to:
acquiring an enterprise network sample;
carrying out node embedding vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural characteristics of the enterprise nodes;
inputting the vectors of the enterprise nodes into a graph neural network model for iteration, and using the upstream and downstream relation characteristics of the enterprise nodes to train the upstream and downstream relation identification of the enterprise nodes in the iteration process of the graph neural network model, wherein the upstream and downstream relation characteristics of the enterprise nodes comprise characteristic expression of the enterprise nodes, characteristic expression of neighbor nodes of the enterprise nodes and characteristic expression of relations between the enterprise nodes and the neighbor nodes;
and finishing iteration to obtain the trained graph neural network model.
Optionally, the performing node-embedded vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural features of the enterprise node includes:
and carrying out node embedding vector expression calculation on the enterprise network sample by using a node2vec algorithm to obtain a vector for expressing the structural characteristics of the enterprise nodes.
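As a rough illustration of the node2vec step, the sketch below generates one biased second-order random walk over an adjacency list. In node2vec, such walks are normally fed to a skip-gram model to learn the node vectors; that training stage is omitted here. The function name, the graph representation, and the default values of the return parameter `p` and in-out parameter `q` are assumptions for the example.

```python
import random
from typing import Dict, List, Optional


def node2vec_walk(graph: Dict[str, List[str]], start: str, length: int,
                  p: float = 1.0, q: float = 1.0,
                  rng: Optional[random.Random] = None) -> List[str]:
    """Generate one node2vec-style biased random walk of at most `length` nodes."""
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = graph.get(cur, [])
        if not neighbors:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            # The first step has no previous node, so sample uniformly.
            walk.append(rng.choice(neighbors))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbors:
            if nxt == prev:
                weights.append(1.0 / p)   # return to the previous node
            elif nxt in graph.get(prev, []):
                weights.append(1.0)       # stay close to the previous node
            else:
                weights.append(1.0 / q)   # move further away
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk
```

A small `q` pushes walks outward (closer to depth-first exploration), while a small `p` keeps them local; which setting suits an enterprise network is not stated in the specification.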
Optionally, the method further comprises:
generating a vector for expressing the attribute characteristics of the enterprise nodes according to the enterprise attributes of the enterprise nodes;
and combining the vector for expressing the structural characteristics of the enterprise nodes with the vector for expressing the attribute characteristics of the enterprise nodes to obtain the vector of the enterprise nodes.
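One simple way to combine the two vectors is concatenation, sketched below. The specification does not say how the combination is performed, so concatenation is an assumption, as are the example attributes (an industry label and an employee count) and both function names.

```python
from typing import List


def encode_attributes(industry: str, industries: List[str],
                      employee_count: int) -> List[float]:
    """Encode hypothetical enterprise attributes as a numeric vector.

    The industry is one-hot encoded against a fixed list; the employee
    count is appended as a single numeric feature.
    """
    one_hot = [1.0 if industry == name else 0.0 for name in industries]
    return one_hot + [float(employee_count)]


def build_node_vector(structural: List[float],
                      attributes: List[float]) -> List[float]:
    """Concatenate the structural vector and the attribute vector."""
    return list(structural) + list(attributes)
```

In practice the numeric attributes would likely be scaled before concatenation so that they do not dominate the embedding dimensions.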
Optionally, the inputting the vector of the enterprise node into the neural network model for iteration includes:
inputting the vector of the enterprise node into an attention-based breadth adaptive function for aggregation to obtain an aggregation characteristic expression of a next iteration process of a neighbor node aggregated by the enterprise node;
inputting the vector of the enterprise node and the aggregated feature expression of the next iteration process of the neighbor node aggregated by the enterprise node into a depth adaptive function based on an LSTM operator to obtain the feature expression of the next iteration process of the enterprise node;
updating the next iteration process to the current iteration process;
combining the feature expression of the current iterative process of the enterprise node, the feature expression of the current iterative process of the neighbor node of the enterprise node and the feature expression of the relationship between the enterprise node and the neighbor node to form the upstream and downstream relationship features of the enterprise node;
judging the upstream and downstream relation characteristics of the enterprise nodes through a full connection layer to obtain the upstream and downstream relation identification results of the enterprise nodes;
judging whether the identification result meets the requirement of minimum loss or not;
under the condition of meeting the minimum loss, ending the iteration to obtain a trained graph neural network model;
under the condition of not meeting the minimum loss, updating parameters of the graph neural network model, inputting the feature expression of the current iteration process of the enterprise node into an attention-based breadth adaptive function for aggregation, and obtaining the aggregated feature expression of the next iteration process of the neighbor node aggregated by the enterprise node; inputting the feature expression of the current iteration process of the enterprise node and the aggregated feature expression of the next iteration process of the neighbor node aggregated by the enterprise node into a depth adaptive function based on an LSTM operator to obtain the feature expression of the next iteration process of the enterprise node; and re-entering the step of updating the next iteration process to the current iteration process.
The above is an illustrative scheme of a computer-readable storage medium of the present embodiment. It should be noted that the technical solution of the storage medium and the technical solution of the method for establishing the enterprise upstream and downstream relationship identification model belong to the same concept; for details not described in the technical solution of the storage medium, refer to the description of the technical solution of that method.
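The per-iteration computation described in the steps above can be sketched in plain Python as three small functions: an attention-weighted aggregation over neighbor features (the breadth function), an LSTM-style update of the node's own feature (the depth function), and a single fully connected unit that scores the combined relationship feature. This is a simplified illustration under stated assumptions: dot-product attention, an element-wise LSTM cell with scalar gate weights, concatenation as the combination, and a sigmoid output are all choices made for the example, and every name here is hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def attention_aggregate(node: Vector, neighbors: List[Vector]) -> Vector:
    """Breadth step: softmax over dot-product scores, then a weighted sum."""
    if not neighbors:
        return [0.0] * len(node)
    scores = [sum(a * b for a, b in zip(node, n)) for n in neighbors]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for stability
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(w * n[i] for w, n in zip(weights, neighbors))
            for i in range(len(node))]


def lstm_step(x: Vector, h: Vector, c: Vector,
              w: Dict[str, Tuple[float, float, float]]) -> Tuple[Vector, Vector]:
    """Depth step: element-wise LSTM cell.

    `w` maps each gate name ("i", "f", "o", "g") to scalar (w_x, w_h, bias).
    """
    new_h, new_c = [], []
    for xi, hi, ci in zip(x, h, c):
        i = sigmoid(w["i"][0] * xi + w["i"][1] * hi + w["i"][2])
        f = sigmoid(w["f"][0] * xi + w["f"][1] * hi + w["f"][2])
        o = sigmoid(w["o"][0] * xi + w["o"][1] * hi + w["o"][2])
        g = math.tanh(w["g"][0] * xi + w["g"][1] * hi + w["g"][2])
        c_next = f * ci + i * g
        new_c.append(c_next)
        new_h.append(o * math.tanh(c_next))
    return new_h, new_c


def score_relation(node: Vector, neighbor: Vector, edge: Vector,
                   weights: Vector, bias: float) -> float:
    """Concatenate the three features and apply one fully connected unit."""
    combined = node + neighbor + edge
    return sigmoid(sum(w * v for w, v in zip(weights, combined)) + bias)
```

In a training loop, the score would be compared with the labelled relationship to compute a loss, the parameters would be updated, and the new node feature would be fed back into `attention_aggregate` for the next iteration until the loss stops decreasing.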
An embodiment of the present specification also provides another computer-readable storage medium storing computer instructions that, when executed by a processor, are operable to:
carrying out node embedding vector expression calculation on an enterprise network formed by two or more enterprises to obtain respective vectors of the two or more enterprise nodes for expressing the structural characteristics of the enterprise nodes;
and inputting the vectors of the two or more enterprise nodes into a trained neural network model obtained by the method for establishing the upstream and downstream relation recognition model of the enterprise according to any embodiment of the specification, and outputting the recognition results of the upstream and downstream relations of the two or more enterprise nodes.
Optionally, the method further comprises:
generating vectors for expressing the attribute characteristics of the enterprise nodes of the two or more enterprise nodes according to the enterprise attributes of the two or more enterprise nodes;
and respectively aiming at the two or more enterprise nodes, combining the vector for expressing the structural characteristics of the enterprise nodes and the vector for expressing the attribute characteristics of the enterprise nodes to obtain respective vectors of the two or more enterprise nodes.
Optionally, the method further comprises:
acquiring enterprise transaction data;
calculating the delivery dispersion and the transaction frequency of the enterprise according to the enterprise transaction data;
judging enterprises belonging to the affiliation mode by using the delivery dispersion and the transaction frequency;
and outputting the enterprises belonging to the affiliation mode and buyer enterprises corresponding to the transaction as an upstream-downstream relationship.
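The affiliation-mode rule can be illustrated with the sketch below. The passage does not define how delivery dispersion or transaction frequency are computed, nor which thresholds apply, so everything here is an assumption: dispersion is taken as the ratio of distinct delivery addresses to transactions, frequency as the transaction count in the data set, and both thresholds and the function name are hypothetical.

```python
from collections import defaultdict
from typing import List, Tuple

# (enterprise, buyer enterprise, delivery address)
Transaction = Tuple[str, str, str]


def affiliation_relations(transactions: List[Transaction],
                          min_dispersion: float = 0.5,
                          min_frequency: int = 3) -> List[Tuple[str, str]]:
    """Return (enterprise, buyer) pairs for enterprises judged to be in affiliation mode."""
    by_enterprise = defaultdict(list)
    for enterprise, buyer, address in transactions:
        by_enterprise[enterprise].append((buyer, address))

    relations = []
    for enterprise, rows in by_enterprise.items():
        frequency = len(rows)  # assumed definition: number of transactions
        # Assumed definition: share of transactions going to distinct addresses.
        dispersion = len({address for _, address in rows}) / frequency
        if dispersion >= min_dispersion and frequency >= min_frequency:
            for buyer in sorted({b for b, _ in rows}):
                relations.append((enterprise, buyer))
    return relations
```

The direction of each threshold (high dispersion, high frequency) is likewise a guess made for the example and would need to be checked against the detailed description.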
Optionally, the method further comprises:
acquiring enterprise transaction data;
calculating the transaction frequency of the enterprise and the number of commodities per transaction according to the enterprise transaction data;
judging the enterprises belonging to the goods-in mode by using the transaction frequency and the number of commodities per transaction;
and outputting the enterprises belonging to the goods-in mode and buyer enterprises corresponding to the transaction as an upstream-downstream relationship.
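A comparable sketch for the goods-in mode rule is given below. As with the affiliation rule, the passage gives no formulas or thresholds, so the definitions are assumptions: frequency is the transaction count, the per-transaction quantity is the mean number of commodities, and the threshold values, their direction, and the function name are hypothetical.

```python
from collections import defaultdict
from typing import List, Tuple

# (enterprise, buyer enterprise, number of commodities in the transaction)
Transaction = Tuple[str, str, int]


def goods_in_relations(transactions: List[Transaction],
                       min_frequency: int = 2,
                       min_avg_quantity: float = 100.0) -> List[Tuple[str, str]]:
    """Return (enterprise, buyer) pairs for enterprises judged to be in goods-in mode."""
    by_enterprise = defaultdict(list)
    for enterprise, buyer, quantity in transactions:
        by_enterprise[enterprise].append((buyer, quantity))

    relations = []
    for enterprise, rows in by_enterprise.items():
        frequency = len(rows)  # assumed definition: number of transactions
        avg_quantity = sum(q for _, q in rows) / frequency
        if frequency >= min_frequency and avg_quantity >= min_avg_quantity:
            relations.extend((enterprise, b) for b in sorted({b for b, _ in rows}))
    return relations
```

Pairs produced by such rules could be merged with the model's scored output, although how the two sources are reconciled is not described in this passage.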
Optionally, the identification result is a score of an upstream-downstream relationship between the two or more enterprise nodes, and the method further includes:
and ranking the upstream and downstream relationships of the two or more enterprise nodes according to their scores, to obtain enterprise upstream and downstream relationships with different confidence levels.
The above is an illustrative scheme of a computer-readable storage medium of the present embodiment. It should be noted that the technical solution of the storage medium and the technical solution of the method for mining the upstream and downstream relationship of the enterprise described above belong to the same concept; for details not described in the technical solution of the storage medium, refer to the description of the technical solution of that method.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The computer instructions comprise computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content covered by the computer-readable medium may be expanded or restricted as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals and telecommunications signals.
It should be noted that, for the sake of simplicity, the foregoing method embodiments are described as a series of acts, but those skilled in the art should understand that the present embodiment is not limited by the described acts, because some steps may be performed in other sequences or simultaneously according to the present embodiment. Further, those skilled in the art should also appreciate that the embodiments described in this specification are preferred embodiments and that acts and modules referred to are not necessarily required for an embodiment of the specification.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are intended only to aid in the description of the specification. Alternative embodiments are not exhaustive and do not limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the embodiments and the practical application, to thereby enable others skilled in the art to best understand and utilize the embodiments. The specification is limited only by the claims and their full scope and equivalents.

Claims (21)

1. A method for establishing an enterprise upstream and downstream relation identification model comprises the following steps:
acquiring an enterprise network sample;
carrying out node embedding vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural characteristics of the enterprise nodes;
inputting the vectors of the enterprise nodes into a graph neural network model for iteration, and using the upstream and downstream relation characteristics of the enterprise nodes to train the upstream and downstream relation identification of the enterprise nodes in the iteration process of the graph neural network model, wherein the upstream and downstream relation characteristics of the enterprise nodes comprise characteristic expression of the enterprise nodes, characteristic expression of neighbor nodes of the enterprise nodes and characteristic expression of relations between the enterprise nodes and the neighbor nodes;
and finishing iteration to obtain the trained graph neural network model.
2. The method of claim 1, wherein performing a node-embedded vector representation calculation on the enterprise network sample to obtain a vector for representing structural characteristics of an enterprise node comprises:
and carrying out node embedding vector expression calculation on the enterprise network sample by using a node2vec algorithm to obtain a vector for expressing the structural characteristics of the enterprise nodes.
3. The method of claim 1, further comprising:
generating a vector for expressing the attribute characteristics of the enterprise nodes according to the enterprise attributes of the enterprise nodes;
and combining the vector for expressing the structural characteristics of the enterprise nodes with the vector for expressing the attribute characteristics of the enterprise nodes to obtain the vector of the enterprise nodes.
4. The method of claim 1, wherein inputting the vectors of the enterprise nodes into the graph neural network model for iteration comprises:
inputting the vector of the enterprise node into an attention-based breadth adaptive function for aggregation to obtain an aggregation characteristic expression of a next iteration process of a neighbor node aggregated by the enterprise node;
inputting the vector of the enterprise node and the aggregated feature expression of the next iteration process of the neighbor node aggregated by the enterprise node into a depth adaptive function based on an LSTM operator to obtain the feature expression of the next iteration process of the enterprise node;
updating the next iteration process to the current iteration process;
combining the feature expression of the current iterative process of the enterprise node, the feature expression of the current iterative process of the neighbor node of the enterprise node and the feature expression of the relationship between the enterprise node and the neighbor node to form the upstream and downstream relationship features of the enterprise node;
judging the upstream and downstream relation characteristics of the enterprise nodes through a full connection layer to obtain the upstream and downstream relation identification results of the enterprise nodes;
judging whether the identification result meets the requirement of minimum loss or not;
under the condition of meeting the minimum loss, ending the iteration to obtain a trained graph neural network model;
under the condition of not meeting the minimum loss, updating parameters of the graph neural network model, inputting the feature expression of the current iteration process of the enterprise node into an attention-based breadth adaptive function for aggregation, and obtaining the aggregated feature expression of the next iteration process of the neighbor node aggregated by the enterprise node; inputting the feature expression of the current iteration process of the enterprise node and the aggregated feature expression of the next iteration process of the neighbor node aggregated by the enterprise node into a depth adaptive function based on an LSTM operator to obtain the feature expression of the next iteration process of the enterprise node; and re-entering the step of updating the next iteration process to the current iteration process.
5. An apparatus for establishing an enterprise upstream and downstream relationship identification model, comprising:
a sample acquisition module configured to acquire an enterprise network sample;
the sample vector calculation module is configured to perform node embedding vector expression calculation on the enterprise network sample to obtain a vector for expressing enterprise node structural features;
the model iteration module is configured to input the vectors of the enterprise nodes into a graph neural network model for iteration, and the graph neural network model performs training of identification of upstream and downstream relations of the enterprise nodes by using upstream and downstream relation characteristics of the enterprise nodes in an iteration process, wherein the upstream and downstream relation characteristics of the enterprise nodes comprise characteristic expressions of the enterprise nodes, characteristic expressions of neighbor nodes of the enterprise nodes and characteristic expressions of relations between the enterprise nodes and the neighbor nodes; and finishing iteration to obtain the trained graph neural network model.
6. The apparatus of claim 5, wherein the sample vector calculation module is configured to perform node embedding vector expression calculation on the enterprise network sample using a node2vec algorithm, to obtain a vector for expressing enterprise node structural features.
7. The apparatus of claim 5, further comprising:
the sample attribute feature calculation module is configured to generate a vector for expressing the attribute features of the enterprise nodes according to the enterprise attributes of the enterprise nodes;
and the sample vector combination module is configured to combine the vector for expressing the enterprise node structural feature and the vector for expressing the enterprise node attribute feature to obtain the vector of the enterprise node.
8. The apparatus of claim 5, wherein the model iteration module comprises:
the initial vector input submodule is configured to input the vector of the enterprise node into an attention-based breadth adaptive function for aggregation to obtain an aggregated feature expression of a next iteration process of neighbor nodes aggregated by the enterprise node;
the initial feature updating submodule is configured to input the vector of the enterprise node and the aggregated feature expression of the next iteration process of the neighbor node aggregated by the enterprise node into a depth adaptive function based on an LSTM operator to obtain the feature expression of the next iteration process of the enterprise node;
an iterative process update sub-module configured to update a next iterative process to a current iterative process;
the relation feature combination submodule is configured to combine the feature expression of the current iteration process of the enterprise node, the feature expression of the current iteration process of the neighbor node of the enterprise node and the feature expression of the relation between the enterprise node and the neighbor node to form an upstream and downstream relation feature of the enterprise node;
the upstream and downstream judgment sub-module is configured to judge the upstream and downstream relation characteristics of the enterprise nodes through a full connection layer to obtain the identification result of the upstream and downstream relation of the enterprise nodes;
a loss judgment submodule configured to judge whether the recognition result satisfies a loss minimum;
the iteration ending submodule is configured to enter the step of ending the iteration to obtain a trained graph neural network model under the condition of meeting the minimum loss;
a parameter updating submodule configured to update parameters of the graph neural network model if the loss minimization is not satisfied;
the neighbor feature updating submodule is configured to input the feature expression of the current iteration process of the enterprise node into an attention-based breadth adaptive function for aggregation, and obtain the aggregated feature expression of the next iteration process of the neighbor node aggregated by the enterprise node;
the node feature updating submodule is configured to input the feature expression of the current iteration process of the enterprise node and the aggregation feature expression of the next iteration process of the neighbor node aggregated by the enterprise node into a depth adaptive function based on an LSTM operator to obtain the feature expression of the next iteration process of the enterprise node; and re-triggering the iterative process updating submodule to execute.
9. A method for mining upstream and downstream relationships of an enterprise, comprising:
carrying out node embedding vector expression calculation on an enterprise network formed by two or more enterprises to obtain respective vectors of the two or more enterprise nodes for expressing the structural characteristics of the enterprise nodes;
inputting the vectors of the two or more enterprise nodes into the trained neural network model obtained by the method for establishing the upstream and downstream relationship identification model of the enterprise according to claim 1, and outputting the identification results of the upstream and downstream relationship of the two or more enterprise nodes.
10. The method of claim 9, further comprising:
generating vectors for expressing the attribute characteristics of the enterprise nodes of the two or more enterprise nodes according to the enterprise attributes of the two or more enterprise nodes;
and respectively aiming at the two or more enterprise nodes, combining the vector for expressing the structural characteristics of the enterprise nodes and the vector for expressing the attribute characteristics of the enterprise nodes to obtain respective vectors of the two or more enterprise nodes.
11. The method of claim 9, further comprising:
acquiring enterprise transaction data;
calculating the delivery dispersion and the transaction frequency of the enterprise according to the enterprise transaction data;
judging enterprises belonging to the affiliation mode by using the delivery dispersion and the transaction frequency;
and outputting the enterprises belonging to the affiliation mode and buyer enterprises corresponding to the transaction as an upstream-downstream relationship.
12. The method of claim 9, further comprising:
acquiring enterprise transaction data;
calculating the transaction frequency of the enterprise and the number of commodities per transaction according to the enterprise transaction data;
judging the enterprises belonging to the goods-in mode by using the transaction frequency and the number of commodities per transaction;
and outputting the enterprises belonging to the goods-in mode and buyer enterprises corresponding to the transaction as an upstream-downstream relationship.
13. The method of claim 9, wherein the identification result is a score of an upstream and downstream relationship of the two or more enterprise nodes, the method further comprising:
and ranking the upstream and downstream relationships of the two or more enterprise nodes according to their scores, to obtain enterprise upstream and downstream relationships with different confidence levels.
14. An apparatus for enterprise upstream and downstream relationship mining, comprising:
the node vector calculation module is configured to perform node embedding vector expression calculation on an enterprise network formed by two or more enterprises to obtain respective vectors of the two or more enterprise nodes for expressing the structural characteristics of the enterprise nodes;
a recognition module configured to input the vectors of the two or more enterprise nodes into the trained neural network model obtained by the method for establishing the upstream and downstream relationship recognition model of the enterprise according to claim 1, and output the recognition result of the upstream and downstream relationship of the two or more enterprise nodes.
15. The apparatus of claim 14, further comprising:
a node attribute feature calculation module configured to generate a vector for expressing an enterprise node attribute feature of each of the two or more enterprise nodes according to an enterprise attribute of each of the two or more enterprise nodes;
and the node vector combination module is configured to combine the vector for expressing the enterprise node structural feature and the vector for expressing the enterprise node attribute feature respectively for the two or more enterprise nodes to obtain respective vectors of the two or more enterprise nodes.
16. The apparatus of claim 14, further comprising:
a transaction data acquisition module configured to acquire enterprise transaction data;
the factoring characteristic calculation module is configured to calculate the delivery dispersion and the transaction frequency of the enterprise according to the enterprise transaction data;
the affiliation mode judging module is configured to judge the enterprises belonging to the affiliation mode by utilizing the delivery dispersion and the transaction frequency;
and the affiliation output module is configured to output the enterprises belonging to the affiliation mode and buyer enterprises corresponding to the transaction as upstream and downstream relations.
17. The apparatus of claim 14, further comprising:
a transaction data acquisition module configured to acquire enterprise transaction data;
the stock characteristic calculation module is configured to calculate the transaction frequency of the enterprise and the number of commodities per transaction according to the enterprise transaction data;
the goods-in mode judging module is configured to judge an enterprise belonging to the goods-in mode by using the transaction frequency and the number of commodities per transaction;
and the goods-in relation output module is configured to output the enterprise belonging to the goods-in mode and a buyer enterprise corresponding to the transaction as an upstream-downstream relation.
18. A computing device, comprising:
a memory and a processor;
the memory is to store computer-executable instructions, and the processor is to execute the computer-executable instructions to:
acquiring an enterprise network sample;
carrying out node embedding vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural characteristics of the enterprise nodes;
inputting the vectors of the enterprise nodes into a graph neural network model for iteration, and using the upstream and downstream relation characteristics of the enterprise nodes to train the upstream and downstream relation identification of the enterprise nodes in the iteration process of the graph neural network model, wherein the upstream and downstream relation characteristics of the enterprise nodes comprise characteristic expression of the enterprise nodes, characteristic expression of neighbor nodes of the enterprise nodes and characteristic expression of relations between the enterprise nodes and the neighbor nodes;
and finishing iteration to obtain the trained graph neural network model.
19. A computer readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the method for enterprise upstream and downstream relationship identification model building according to any one of claims 1 to 4.
20. A computing device, comprising:
a memory and a processor;
the memory is to store computer-executable instructions, and the processor is to execute the computer-executable instructions to:
carrying out node embedding vector expression calculation on an enterprise network formed by two or more enterprises to obtain respective vectors of the two or more enterprise nodes for expressing the structural characteristics of the enterprise nodes;
inputting the vectors of the two or more enterprise nodes into the trained neural network model obtained by the method for establishing the upstream and downstream relationship identification model of the enterprise according to claim 1, and outputting the identification results of the upstream and downstream relationship of the two or more enterprise nodes.
21. A computer readable storage medium storing computer instructions which, when executed by a processor, perform the steps of the method for enterprise upstream and downstream relationship mining of any of claims 9 to 13.
CN202010153608.3A 2020-03-06 2020-03-06 Method and device for establishing enterprise upstream and downstream relationship identification model and mining relationship Active CN111382843B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010153608.3A CN111382843B (en) 2020-03-06 2020-03-06 Method and device for establishing enterprise upstream and downstream relationship identification model and mining relationship

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010153608.3A CN111382843B (en) 2020-03-06 2020-03-06 Method and device for establishing enterprise upstream and downstream relationship identification model and mining relationship

Publications (2)

Publication Number Publication Date
CN111382843A true CN111382843A (en) 2020-07-07
CN111382843B CN111382843B (en) 2023-10-20

Family

ID=71217219

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010153608.3A Active CN111382843B (en) 2020-03-06 2020-03-06 Method and device for establishing enterprise upstream and downstream relationship identification model and mining relationship

Country Status (1)

Country Link
CN (1) CN111382843B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111915191A (en) * 2020-08-03 2020-11-10 支付宝(杭州)信息技术有限公司 Industrial chain identification method and device
CN112015907A (en) * 2020-08-18 2020-12-01 大连东软教育科技集团有限公司 Method and device for quickly constructing discipline knowledge graph and storage medium
CN112035683A (en) * 2020-09-30 2020-12-04 北京百度网讯科技有限公司 User interaction information processing model generation method and user interaction information processing method
CN113836903A (en) * 2021-08-17 2021-12-24 淮阴工学院 Method and device for extracting enterprise portrait label based on situation embedding and knowledge distillation

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120046992A1 (en) * 2010-08-23 2012-02-23 International Business Machines Corporation Enterprise-to-market network analysis for sales enablement and relationship building
CN106504084A (en) * 2016-11-16 2017-03-15 航天信息股份有限公司 A kind of method and system for recognizing core enterprise in supply chain
US20170115682A1 (en) * 2015-10-27 2017-04-27 Pulse Energy Inc. Extended business name categorization apparatus and method
CN108182295A (en) * 2018-02-09 2018-06-19 重庆誉存大数据科技有限公司 A kind of Company Knowledge collection of illustrative plates attribute extraction method and system
CN108288115A (en) * 2018-03-15 2018-07-17 安徽大学 A kind of daily short-term express delivery amount prediction technique of loglstics enterprise
CN109255054A (en) * 2017-07-14 2019-01-22 元素征信有限责任公司 A kind of community discovery algorithm in enterprise's map based on relationship weight
US20190043483A1 (en) * 2017-08-02 2019-02-07 [24]7.ai, Inc. Method and apparatus for training of conversational agents
WO2019081781A1 (en) * 2017-10-27 2019-05-02 Deepmind Technologies Limited Graph neural network systems for generating structured representations of objects
WO2019085328A1 (en) * 2017-11-02 2019-05-09 平安科技(深圳)有限公司 Enterprise relationship extraction method and device, and storage medium
CN109933703A (en) * 2019-03-14 2019-06-25 鹿寨知航科技信息服务有限公司 A kind of construction method of Intellectual Property Right of Enterprises appraisal Model
CN110245787A (en) * 2019-05-24 2019-09-17 阿里巴巴集团控股有限公司 A kind of target group's prediction technique, device and equipment
CN110489481A (en) * 2019-08-06 2019-11-22 北京邮电大学 Data analysing method, device and the data analytics server of industry data
CN110555455A (en) * 2019-06-18 2019-12-10 东华大学 Online transaction fraud detection method based on entity relationship
CN110570111A (en) * 2019-08-30 2019-12-13 阿里巴巴集团控股有限公司 Enterprise risk prediction method, model training method, device and equipment
WO2019242125A1 (en) * 2018-06-19 2019-12-26 平安科技(深圳)有限公司 Method and apparatus for acquiring upstream and downstream relationships between companies, terminal device and medium
CN110852856A (en) * 2019-11-04 2020-02-28 西安交通大学 Invoice false invoice identification method based on dynamic network representation

Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120046992A1 (en) * 2010-08-23 2012-02-23 International Business Machines Corporation Enterprise-to-market network analysis for sales enablement and relationship building
US20170115682A1 (en) * 2015-10-27 2017-04-27 Pulse Energy Inc. Extended business name categorization apparatus and method
CN106504084A (en) * 2016-11-16 2017-03-15 航天信息股份有限公司 A method and system for identifying core enterprises in a supply chain
CN109255054A (en) * 2017-07-14 2019-01-22 元素征信有限责任公司 A community discovery algorithm for enterprise graphs based on relationship weight
US20190043483A1 (en) * 2017-08-02 2019-02-07 [24]7.ai, Inc. Method and apparatus for training of conversational agents
WO2019081781A1 (en) * 2017-10-27 2019-05-02 Deepmind Technologies Limited Graph neural network systems for generating structured representations of objects
WO2019085328A1 (en) * 2017-11-02 2019-05-09 平安科技(深圳)有限公司 Enterprise relationship extraction method and device, and storage medium
CN108182295A (en) * 2018-02-09 2018-06-19 重庆誉存大数据科技有限公司 An enterprise knowledge graph attribute extraction method and system
CN108288115A (en) * 2018-03-15 2018-07-17 安徽大学 A daily short-term express delivery volume prediction method for logistics enterprises
WO2019242125A1 (en) * 2018-06-19 2019-12-26 平安科技(深圳)有限公司 Method and apparatus for acquiring upstream and downstream relationships between companies, terminal device and medium
CN109933703A (en) * 2019-03-14 2019-06-25 鹿寨知航科技信息服务有限公司 A construction method for an enterprise intellectual property evaluation model
CN110245787A (en) * 2019-05-24 2019-09-17 阿里巴巴集团控股有限公司 A target group prediction method, device and equipment
CN110555455A (en) * 2019-06-18 2019-12-10 东华大学 Online transaction fraud detection method based on entity relationship
CN110489481A (en) * 2019-08-06 2019-11-22 北京邮电大学 Data analysis method, device and data analysis server for industry data
CN110570111A (en) * 2019-08-30 2019-12-13 阿里巴巴集团控股有限公司 Enterprise risk prediction method, model training method, device and equipment
CN110852856A (en) * 2019-11-04 2020-02-28 西安交通大学 False invoice identification method based on dynamic network representation

Non-Patent Citations (6)

Title
BIJAYA ADHIKARI, et al.: "Sub2Vec: feature learning for subgraphs", 《ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING》, pages 170 - 182 *
MENGYUAN CHEN, et al.: "Inference for network structure and dynamics from time series data via graph neural network", pages 1 - 10 *
PAU RIBA, et al.: "Table detection in invoice documents by graph neural networks", pages 1 - 5 *
STEPHAN M. WAGNER, et al.: "Assessing the vulnerability of supply chains using graph theory", pages 121 - 129 *
彭志忠: "Application analysis of neural network forecasting techniques in supply chain demand forecasting", no. 12, pages 15 - 17 *
许爽 et al.: "Link prediction in scientist collaboration networks based on subgraph features", 《大连民族大学学报》, vol. 22, no. 1, pages 51 - 63 *

Cited By (5)

Publication number Priority date Publication date Assignee Title
CN111915191A (en) * 2020-08-03 2020-11-10 支付宝(杭州)信息技术有限公司 Industrial chain identification method and device
CN112015907A (en) * 2020-08-18 2020-12-01 大连东软教育科技集团有限公司 Method and device for quickly constructing discipline knowledge graph and storage medium
CN112035683A (en) * 2020-09-30 2020-12-04 北京百度网讯科技有限公司 User interaction information processing model generation method and user interaction information processing method
CN113836903A (en) * 2021-08-17 2021-12-24 淮阴工学院 Method and device for extracting enterprise portrait label based on situation embedding and knowledge distillation
CN113836903B (en) * 2021-08-17 2023-07-18 淮阴工学院 Enterprise portrait tag extraction method and device based on situation embedding and knowledge distillation

Also Published As

Publication number Publication date
CN111382843B (en) 2023-10-20

Similar Documents

Publication Publication Date Title
US20220335501A1 (en) Item recommendations using convolutions on weighted graphs
CN111382843A (en) Method and device for establishing upstream and downstream relation recognition model of enterprise and relation mining
CN109918454B (en) Method and device for embedding nodes into relational network graph
CN109615452B (en) Product recommendation method based on matrix decomposition
CN109636430A (en) Object identifying method and its system
CN111949887A (en) Item recommendation method and device and computer-readable storage medium
CN112633927B (en) Combined commodity mining method based on knowledge graph rule embedding
CN112231583A (en) E-commerce recommendation method based on dynamic interest group identification and generative adversarial network
CN111695024A (en) Object evaluation value prediction method and system, and recommendation method and system
CN113298546A (en) Sales prediction method and device, and commodity processing method and device
Solairaj et al. Enhanced Elman spike neural network based sentiment analysis of online product recommendation
US20220100720A1 (en) Method and system for entity resolution
CN113610610B (en) Session recommendation method and system based on graph neural network and comment similarity
CN114491263A (en) Recommendation model training method and device, and recommendation method and device
CN113506173A (en) Credit risk assessment method and related equipment thereof
CN113569059A (en) Target user identification method and device
CN111177581A (en) Multi-platform-based social e-commerce website commodity recommendation method and device
CN110889716A (en) Method and device for identifying potential registered user
CN115222203A (en) Risk identification method and device
CN113762415A (en) Neural network-based intelligent matching method and system for automobile financial products
CN114519600A (en) Graph neural network CTR estimation algorithm fusing adjacent node variances
Peddarapu et al. Customer Churn Prediction using Machine Learning
CN116028719B (en) Object recommendation method and device, and cross-domain federated commodity recommendation method and device
Wang Application of E-Commerce Recommendation Algorithm in Consumer Preference Prediction
Saha et al. Sentiment Analysis to Review Products based on Machine Learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant