CN111382843B - Method and device for establishing enterprise upstream and downstream relationship identification model and mining relationship - Google Patents

Publication number
CN111382843B
Authority
CN
China
Prior art keywords
enterprise
node
upstream
vector
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010153608.3A
Other languages
Chinese (zh)
Other versions
CN111382843A (en
Inventor
王炀
杨硕
孙望
钟娙雩
张志强
周俊
方彦明
余泉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang eCommerce Bank Co Ltd
Original Assignee
Zhejiang eCommerce Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang eCommerce Bank Co Ltd
Priority to CN202010153608.3A
Publication of CN111382843A
Application granted
Publication of CN111382843B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

Embodiments of this specification provide a method and a device for establishing an enterprise upstream and downstream relationship identification model and for mining such relationships. The method for establishing the model comprises: acquiring an enterprise network sample; performing node embedding vector expression calculation on the enterprise network sample to obtain a vector expressing the structural characteristics of the enterprise node; and inputting the vector of the enterprise node into a graph neural network model for iteration. The feature expression of the enterprise node calculated during the iteration of the graph neural network model aggregates the feature information of its neighbor nodes, so the upstream and downstream relationship features of the enterprise node, which comprise the feature expression of the enterprise node, the feature expression of its neighbor nodes, and the feature expression of the relationship between the enterprise node and the neighbor nodes, can accurately express the upstream and downstream relationship. Training therefore yields a graph neural network model that accurately identifies the upstream and downstream relationships of enterprise nodes, and enterprise upstream and downstream relationships with different confidence levels can then be identified.

Description

Method and device for establishing enterprise upstream and downstream relationship identification model and mining relationship
Technical Field
Embodiments of this specification relate to the technical field of data mining, and in particular to a method for establishing an enterprise upstream and downstream relationship identification model. One or more embodiments of this specification also relate to a method for mining enterprise upstream and downstream relationships, an apparatus for establishing an enterprise upstream and downstream relationship identification model, an apparatus for mining enterprise upstream and downstream relationships, a computing device, and a computer-readable storage medium.
Background
The enterprise upstream-downstream relationship refers to the relationship between an upstream enterprise and a downstream enterprise, determined according to their supply relationship. Typically, the health of the enterprises upstream and downstream of a given enterprise directly affects that enterprise's business status. If the enterprises that have an upstream-downstream relationship with a given enterprise are known, factors relating to those upstream and downstream enterprises can be taken into account.
Therefore, in many scenarios, for example, credit evaluation for an enterprise, it is desirable to accurately know the relationship between the upstream and downstream of the enterprise.
Disclosure of Invention
In view of this, the embodiments of the present disclosure provide a method for establishing an enterprise upstream and downstream relationship identification model. One or more embodiments of the present disclosure relate to a method for mining an upstream and downstream relationship of an enterprise, an apparatus for building an identification model of an upstream and downstream relationship of an enterprise, an apparatus for mining an upstream and downstream relationship of an enterprise, a computing device, and a computer-readable storage medium, so as to solve technical defects in the prior art.
According to a first aspect of embodiments of the present disclosure, there is provided a method for establishing an enterprise upstream and downstream relationship identification model, including: acquiring an enterprise network sample; performing node embedding vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural characteristics of the enterprise node; inputting the vector of the enterprise node into a graph neural network model for iteration, wherein during the iteration the graph neural network model uses the upstream and downstream relationship features of the enterprise node to train the identification of the upstream and downstream relationship of the enterprise node, and the upstream and downstream relationship features of the enterprise node comprise the feature expression of the enterprise node, the feature expression of the neighbor nodes of the enterprise node, and the feature expression of the relationship between the enterprise node and the neighbor nodes; and, after the iteration ends, obtaining a trained graph neural network model.
Optionally, the performing node embedded vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural feature of the enterprise node includes: and performing node embedding vector expression calculation on the enterprise network sample by using a node2vec algorithm to obtain a vector for expressing the structural characteristics of the enterprise node.
Optionally, the method further comprises: generating a vector for expressing the attribute characteristics of the enterprise node according to the enterprise attribute of the enterprise node; and combining the vector for expressing the structural characteristics of the enterprise node with the vector for expressing the attribute characteristics of the enterprise node to obtain the vector of the enterprise node.
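The combination step can be illustrated with a minimal sketch. The source does not say which enterprise attributes are used or how the two vectors are combined, so the attribute names (`industry`, `years_active`), the one-hot encoding, and the use of plain concatenation are all assumptions made for illustration.

```python
def encode_attributes(enterprise, industries):
    """Turn hypothetical enterprise attributes into a numeric attribute vector."""
    # One-hot encode the (assumed) industry attribute.
    one_hot = [1.0 if enterprise["industry"] == name else 0.0 for name in industries]
    # Append an (assumed) numeric attribute as-is.
    return one_hot + [float(enterprise["years_active"])]


def combine_vectors(structural, attribute):
    """Combine the structural and attribute vectors; concatenation is an assumption."""
    return list(structural) + list(attribute)


industries = ["retail", "manufacturing", "logistics"]
enterprise = {"industry": "manufacturing", "years_active": 7}
structural = [0.12, -0.40, 0.88]  # stand-in for a node embedding
node_vector = combine_vectors(structural, encode_attributes(enterprise, industries))
```

The resulting `node_vector` has one block per source of information, so a downstream model can learn separate weights for structure and attributes.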
Optionally, inputting the vector of the enterprise node into the graph neural network model for iteration includes: inputting the vector of the enterprise node into a breadth-adaptive function based on an attention mechanism for aggregation, to obtain the aggregate feature expression, for the next iteration process, of the neighbor nodes aggregated by the enterprise node; inputting the vector of the enterprise node and that aggregate feature expression into a depth-adaptive function based on an LSTM operator to obtain the feature expression of the enterprise node for the next iteration process; updating the next iteration process to be the current iteration process; combining the feature expression of the enterprise node in the current iteration process, the feature expression of the neighbor nodes of the enterprise node in the current iteration process, and the feature expression of the relationship between the enterprise node and the neighbor nodes to form the upstream and downstream relationship features of the enterprise node; discriminating the upstream and downstream relationship features of the enterprise node through a fully connected layer to obtain the identification result of the upstream and downstream relationship of the enterprise node; determining whether the identification result satisfies the minimum loss; if the minimum loss is satisfied, ending the iteration and obtaining the trained graph neural network model; if the minimum loss is not satisfied, updating the parameters of the graph neural network model, inputting the feature expression of the enterprise node in the current iteration process into the breadth-adaptive function based on the attention mechanism for aggregation to obtain the aggregate feature expression, for the next iteration process, of the neighbor nodes aggregated by the enterprise node, inputting the feature expression of the enterprise node in the current iteration process and that aggregate feature expression into the depth-adaptive function based on the LSTM operator to obtain the feature expression of the enterprise node for the next iteration process, and returning to the step of updating the next iteration process to be the current iteration process.
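The breadth and depth steps of one iteration can be sketched as follows. This is an illustrative simplification, not the patented functions: the attention scores are assumed to be softmax-normalized dot products, the LSTM operator is reduced to an element-wise cell with one shared, untrained weight triple per gate, and all names (`attention_aggregate`, `lstm_update`, `weights`) are hypothetical.

```python
import math


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(node_vec, neighbor_vecs):
    """Breadth step: weight each neighbor by softmax of its dot product with the node."""
    scores = [sum(a * b for a, b in zip(node_vec, vec)) for vec in neighbor_vecs]
    attn = softmax(scores)
    return [sum(w * vec[k] for w, vec in zip(attn, neighbor_vecs))
            for k in range(len(node_vec))]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_update(hidden, cell, aggregated, weights):
    """Depth step: element-wise LSTM cell; the aggregate is the input, hidden the old state."""
    new_hidden, new_cell = [], []
    for h, c, x in zip(hidden, cell, aggregated):
        pre = {name: wx * x + wh * h + b for name, (wx, wh, b) in weights.items()}
        i, f, o = sigmoid(pre["input"]), sigmoid(pre["forget"]), sigmoid(pre["output"])
        c_next = f * c + i * math.tanh(pre["candidate"])
        new_cell.append(c_next)
        new_hidden.append(o * math.tanh(c_next))
    return new_hidden, new_cell


# Hypothetical, untrained gate weights: (input weight, hidden weight, bias).
weights = {name: (0.5, 0.5, 0.0) for name in ("input", "forget", "output", "candidate")}
features = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
neighbors = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
cells = {node: [0.0, 0.0] for node in features}

for _ in range(2):  # two iteration processes
    next_features = {}
    for node in features:
        agg = attention_aggregate(features[node], [features[m] for m in neighbors[node]])
        next_features[node], cells[node] = lstm_update(features[node], cells[node], agg, weights)
    features = next_features  # "update the next iteration process to the current one"
```

Each pass widens the neighborhood a node's feature expression has seen by one hop, while the cell state carries information across iteration depths. In a real model the gate weights would be matrices learned during training.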
According to a second aspect of embodiments of the present disclosure, there is provided an apparatus for establishing an enterprise upstream and downstream relationship identification model, including: a sample acquisition module configured to acquire an enterprise network sample; a sample vector calculation module configured to perform node embedding vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural characteristics of the enterprise node; and a model iteration module configured to input the vector of the enterprise node into a graph neural network model for iteration, wherein during the iteration the graph neural network model uses the upstream and downstream relationship features of the enterprise node to train the identification of the upstream and downstream relationship of the enterprise node, and the upstream and downstream relationship features of the enterprise node comprise the feature expression of the enterprise node, the feature expression of the neighbor nodes of the enterprise node, and the feature expression of the relationship between the enterprise node and the neighbor nodes, and to obtain a trained graph neural network model after the iteration ends.
Optionally, the sample vector calculation module is configured to perform node-embedded vector expression calculation on the enterprise network sample using a node2vec algorithm, to obtain a vector for expressing the structural characteristics of the enterprise node.
Optionally, the apparatus further comprises: a sample attribute feature calculation module configured to generate a vector for expressing the attribute characteristics of the enterprise node according to the enterprise attributes of the enterprise node; and a sample vector combination module configured to combine the vector for expressing the structural characteristics of the enterprise node with the vector for expressing the attribute characteristics of the enterprise node to obtain the vector of the enterprise node.
Optionally, the model iteration module includes: an initial vector input sub-module configured to input the vector of the enterprise node into a breadth-adaptive function based on an attention mechanism for aggregation, to obtain the aggregate feature expression, for the next iteration process, of the neighbor nodes aggregated by the enterprise node; an initial feature updating sub-module configured to input the vector of the enterprise node and that aggregate feature expression into a depth-adaptive function based on an LSTM operator to obtain the feature expression of the enterprise node for the next iteration process; an iteration process updating sub-module configured to update the next iteration process to be the current iteration process; a relationship feature combination sub-module configured to combine the feature expression of the enterprise node in the current iteration process, the feature expression of the neighbor nodes of the enterprise node in the current iteration process, and the feature expression of the relationship between the enterprise node and the neighbor nodes to form the upstream and downstream relationship features of the enterprise node; an upstream and downstream discrimination sub-module configured to discriminate the upstream and downstream relationship features of the enterprise node through a fully connected layer to obtain the identification result of the upstream and downstream relationship of the enterprise node; a loss determination sub-module configured to determine whether the identification result satisfies the minimum loss; an iteration ending sub-module configured to end the iteration and obtain the trained graph neural network model if the minimum loss is satisfied; a parameter updating sub-module configured to update the parameters of the graph neural network model if the minimum loss is not satisfied; a neighbor feature updating sub-module configured to input the feature expression of the enterprise node in the current iteration process into the breadth-adaptive function based on the attention mechanism for aggregation, to obtain the aggregate feature expression, for the next iteration process, of the neighbor nodes aggregated by the enterprise node; and a node feature updating sub-module configured to input the feature expression of the enterprise node in the current iteration process and that aggregate feature expression into the depth-adaptive function based on the LSTM operator to obtain the feature expression of the enterprise node for the next iteration process, and to re-trigger execution of the iteration process updating sub-module.
According to a third aspect of embodiments of the present disclosure, there is provided a method for mining an upstream-downstream relationship of an enterprise, including: carrying out node embedding vector expression calculation on an enterprise network formed by two or more enterprises to obtain respective vectors of the two or more enterprise nodes for expressing structural characteristics of the enterprise nodes; and inputting the vectors of the two or more enterprise nodes into a trained graph neural network model obtained by the method for establishing the enterprise upstream and downstream relationship identification model according to any embodiment of the specification, and outputting the identification result of the upstream and downstream relationship of the two or more enterprise nodes.
Optionally, the method further comprises: generating respective vectors for expressing the attribute characteristics of the enterprise nodes of the two or more enterprise nodes according to the respective enterprise attributes of the two or more enterprise nodes; and combining the vector for expressing the structural characteristics of the enterprise nodes with the vector for expressing the attribute characteristics of the enterprise nodes for the two or more enterprise nodes respectively to obtain the respective vectors of the two or more enterprise nodes.
Optionally, the method further comprises: acquiring enterprise transaction data; calculating the shipping dispersion and transaction frequency of each enterprise according to the enterprise transaction data; using the shipping dispersion and the transaction frequency to determine which enterprises operate in the agency-sales mode; and outputting each enterprise in the agency-sales mode and the buyer enterprise of the corresponding transaction as an upstream-downstream relationship.
Optionally, the method further comprises: acquiring enterprise transaction data; calculating the transaction frequency of each enterprise and the number of commodities traded according to the enterprise transaction data; using the transaction frequency and the number of commodities traded to determine which enterprises operate in the goods-purchasing mode; and outputting each enterprise in the goods-purchasing mode and the buyer enterprise of the corresponding transaction as an upstream-downstream relationship.
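A rule of this kind might be sketched as below. The source names the two features but does not define them or give thresholds, so everything concrete here is an assumption: shipping dispersion is taken as distinct shipping origins divided by order count, transaction frequency as orders per day, and the thresholds and field names (`ship_from`, `buyer`) are hypothetical.

```python
def shipping_dispersion(orders):
    """Assumed definition: distinct shipping origins divided by number of orders."""
    origins = {order["ship_from"] for order in orders}
    return len(origins) / len(orders)


def transaction_frequency(orders, days):
    """Assumed definition: orders per day over the observation window."""
    return len(orders) / days


def matches_rule(orders, days, min_dispersion=0.5, min_frequency=1.0):
    """Hypothetical thresholds; both features must reach their minimum."""
    if not orders:
        return False
    return (shipping_dispersion(orders) >= min_dispersion
            and transaction_frequency(orders, days) >= min_frequency)


def rule_based_relationships(orders_by_seller, days):
    """Pair each seller that matches the rule with the buyers of its transactions."""
    pairs = set()
    for seller, orders in orders_by_seller.items():
        if matches_rule(orders, days):
            pairs.update((seller, order["buyer"]) for order in orders)
    return sorted(pairs)


orders_by_seller = {
    "S1": [
        {"buyer": "B1", "ship_from": "Hangzhou"},
        {"buyer": "B2", "ship_from": "Yiwu"},
        {"buyer": "B1", "ship_from": "Shenzhen"},
    ],
    "S2": [
        {"buyer": "B3", "ship_from": "Hangzhou"},
        {"buyer": "B4", "ship_from": "Hangzhou"},
        {"buyer": "B3", "ship_from": "Hangzhou"},
    ],
}
pairs = rule_based_relationships(orders_by_seller, days=2)
```

Here `S1` ships from three origins across three orders and so passes the assumed dispersion threshold, while `S2` ships everything from one origin and is filtered out.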
Optionally, the identifying result is a score of an upstream-downstream relationship of the two or more enterprise nodes, and the method further includes: and sorting the upstream and downstream relations of the two or more enterprise nodes according to the scores of the upstream and downstream relations of the two or more enterprise nodes to obtain the upstream and downstream relations of the enterprises with different confidence degrees.
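The ranking step can be sketched as a sort followed by bucketing. The tier names and the cut-off scores are hypothetical; the source only states that relationships are sorted by score to obtain relationships with different confidence levels.

```python
def rank_relationships(scored_pairs, strong=0.8, weak=0.5):
    """Sort (upstream, downstream, score) triples by score and attach a confidence tier."""
    ranked = sorted(scored_pairs, key=lambda item: item[2], reverse=True)
    result = []
    for upstream, downstream, score in ranked:
        if score >= strong:      # hypothetical cut-off
            tier = "strong"
        elif score >= weak:      # hypothetical cut-off
            tier = "weak"
        else:
            tier = "unlikely"
        result.append((upstream, downstream, score, tier))
    return result


scored = [("A", "B", 0.62), ("C", "D", 0.93), ("E", "F", 0.18)]
ranked = rank_relationships(scored)
```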
According to a fourth aspect of embodiments of the present disclosure, there is provided an apparatus for mining an upstream-downstream relationship of an enterprise, including: the node vector calculation module is configured to perform node embedded vector expression calculation on an enterprise network formed by two or more enterprises to obtain respective vectors of the two or more enterprise nodes for expressing structural characteristics of the enterprise nodes. The recognition module is configured to input the vectors of the two or more enterprise nodes into a trained graph neural network model obtained by the method for establishing the enterprise upstream and downstream relationship recognition model according to any embodiment of the specification, and output recognition results of the upstream and downstream relationship of the two or more enterprise nodes.
Optionally, the apparatus further comprises: a node attribute feature calculation module configured to generate, for each of the two or more enterprise nodes, a vector for expressing the attribute characteristics of the enterprise node according to its enterprise attributes; and a node vector combination module configured to combine, for each of the two or more enterprise nodes, the vector for expressing the structural characteristics of the enterprise node with the vector for expressing the attribute characteristics of the enterprise node to obtain the respective vectors of the two or more enterprise nodes.
Optionally, the apparatus further comprises: a transaction data acquisition module configured to acquire enterprise transaction data; an agency-sales feature calculation module configured to calculate the shipping dispersion and transaction frequency of each enterprise according to the enterprise transaction data; an agency-sales mode determination module configured to use the shipping dispersion and the transaction frequency to determine which enterprises operate in the agency-sales mode; and an agency-sales relationship output module configured to output each enterprise in the agency-sales mode and the buyer enterprise of the corresponding transaction as an upstream-downstream relationship.
Optionally, the apparatus further comprises: a transaction data acquisition module configured to acquire enterprise transaction data; a goods-purchasing feature calculation module configured to calculate the transaction frequency of each enterprise and the number of commodities traded according to the enterprise transaction data; a goods-purchasing mode determination module configured to use the transaction frequency and the number of commodities traded to determine which enterprises operate in the goods-purchasing mode; and a goods-purchasing relationship output module configured to output each enterprise in the goods-purchasing mode and the buyer enterprise of the corresponding transaction as an upstream-downstream relationship.
According to a fifth aspect of embodiments of the present specification, there is provided a computing device comprising: a memory and a processor; the memory is for storing computer-executable instructions, and the processor is for executing the computer-executable instructions to: acquire an enterprise network sample; perform node embedding vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural characteristics of the enterprise node; input the vector of the enterprise node into a graph neural network model for iteration, wherein during the iteration the graph neural network model uses the upstream and downstream relationship features of the enterprise node to train the identification of the upstream and downstream relationship of the enterprise node, and the upstream and downstream relationship features of the enterprise node comprise the feature expression of the enterprise node, the feature expression of the neighbor nodes of the enterprise node, and the feature expression of the relationship between the enterprise node and the neighbor nodes; and, after the iteration ends, obtain a trained graph neural network model.
According to a sixth aspect of embodiments of the present specification, there is provided a computer readable storage medium storing computer instructions which, when executed by a processor, implement the steps of a method for building an enterprise upstream and downstream relationship identification model according to any embodiment of the present specification.
According to a seventh aspect of embodiments of the present specification, there is provided a computing device comprising: a memory and a processor; the memory is for storing computer-executable instructions, and the processor is for executing the computer-executable instructions: carrying out node embedding vector expression calculation on an enterprise network formed by two or more enterprises to obtain respective vectors of the two or more enterprise nodes for expressing structural characteristics of the enterprise nodes; and inputting the vectors of the two or more enterprise nodes into a trained graph neural network model obtained by the method for establishing the enterprise upstream and downstream relationship identification model according to any embodiment of the specification, and outputting the identification result of the upstream and downstream relationship of the two or more enterprise nodes.
According to an eighth aspect of embodiments of the present specification, there is provided a computer readable storage medium storing computer instructions which, when executed by a processor, implement the steps of a method of enterprise upstream and downstream relationship mining according to any of the embodiments of the present specification.
An embodiment of one aspect of this specification provides a method for establishing an enterprise upstream and downstream relationship identification model. The method performs node embedding vector expression calculation on an enterprise network sample to obtain a vector for expressing the structural characteristics of the enterprise node, and inputs the vector of the enterprise node into a graph neural network model for iteration. Because the vector obtained by the node embedding vector expression calculation is sufficient to identify the neighbors of the enterprise node in the enterprise network, the feature expression of the enterprise node calculated during the iteration can, by the mechanism of the graph neural network model, aggregate the feature information of those neighbors. The upstream and downstream relationship features of the enterprise node, which comprise the feature expression of the enterprise node, the feature expression of the neighbor nodes of the enterprise node, and the feature expression of the relationship between the enterprise node and the neighbor nodes, can therefore accurately express the upstream and downstream relationship. By using these features to train the identification of upstream and downstream relationships during the iteration, a graph neural network model that accurately identifies the upstream and downstream relationships of enterprise nodes can be obtained.
The embodiment of the other aspect of the present disclosure provides a method for mining an upstream and downstream relationship of an enterprise, where the method performs node embedded vector expression calculation on an enterprise network formed by two or more enterprises to obtain respective vectors for expressing structural features of the two or more enterprise nodes, and inputs the vectors of the two or more enterprise nodes into a trained graph neural network model obtained by a method for building an upstream and downstream relationship recognition model of the enterprise according to any embodiment of the present disclosure, so that the upstream and downstream relationships of the two or more enterprise nodes can be accurately recognized.
Drawings
FIG. 1 is a flow chart of a method for establishing an enterprise upstream and downstream relationship identification model according to one embodiment of the present disclosure;
FIG. 2 is a process flow diagram of a method for modeling an enterprise upstream and downstream relationship identification model according to another embodiment of the present disclosure;
FIG. 3 is a schematic structural diagram of an apparatus for establishing an enterprise upstream and downstream relationship identification model according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of an apparatus for establishing an enterprise upstream and downstream relationship identification model according to another embodiment of the present disclosure;
FIG. 5 is a flow chart of a method for enterprise upstream and downstream relationship mining provided in one embodiment of the present disclosure;
FIG. 6 is a schematic diagram of an enterprise upstream and downstream relation mining architecture provided in one embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of an apparatus for mining an upstream-downstream relationship of an enterprise according to an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of an apparatus for mining an upstream-downstream relationship of an enterprise according to another embodiment of the present disclosure;
FIG. 9 is a block diagram of a computing device provided in one embodiment of the present description.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of this specification. However, this specification can be implemented in many ways other than those described herein, and those skilled in the art can make similar generalizations without departing from its spirit; this specification is therefore not limited by the specific implementations disclosed below.
The terminology used in the one or more embodiments of the specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the specification. As used in this specification, one or more embodiments and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various information, this information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first may also be referred to as a second, and similarly, a second may also be referred to as a first, without departing from the scope of one or more embodiments of the present description. The word "if" as used herein may be interpreted as "when", "upon", or "in response to a determination", depending on the context.
In the present specification, a method for establishing an enterprise upstream and downstream relationship recognition model is provided, and the present specification relates to an apparatus for establishing an enterprise upstream and downstream relationship recognition model, a method for mining an enterprise upstream and downstream relationship, an apparatus for mining an enterprise upstream and downstream relationship, a computing device, and a computer readable storage medium, which are described in detail in the following embodiments one by one.
FIG. 1 shows a flowchart of a method for establishing an enterprise upstream and downstream relationship identification model, according to one embodiment of the present disclosure, including steps 102 through 108.
Step 102: an enterprise network sample is obtained.
For example, enterprise upstream and downstream relationship data with strong confidence can be extracted from data such as billing data and business marketing data exchanged among enterprises, and enterprise network samples are organized from this strongly trusted data. As another example, enterprise upstream and downstream relationship data with weak confidence is extracted from enterprise transaction data, and inference based on upstream and downstream rules is applied to it to obtain enterprise upstream and downstream relationship data with different confidence levels. Enterprise network samples for verifying the recognition effect of the graph neural network model are organized from this data. In the enterprise network, each enterprise may correspond to one node, and if a relationship such as a transaction relationship or a transfer relationship exists between two enterprises, there is a corresponding edge between their nodes.
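The node-and-edge structure described here can be sketched with a plain adjacency map. The record layout (two enterprise identifiers plus a relationship kind) and the function name are assumptions for illustration; the source does not specify a data format.

```python
from collections import defaultdict


def build_enterprise_network(relations):
    """One node per enterprise; one undirected edge per pair with any relationship."""
    adjacency = defaultdict(set)
    for first, second, _kind in relations:
        if first == second:
            continue  # ignore self-relations
        adjacency[first].add(second)
        adjacency[second].add(first)
    # Sorted neighbor lists make the structure deterministic to iterate over.
    return {node: sorted(neigh) for node, neigh in adjacency.items()}


relations = [
    ("A", "B", "transaction"),
    ("B", "C", "transfer"),
    ("A", "B", "transfer"),  # a second relationship between A and B adds no new edge
]
network = build_enterprise_network(relations)
```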
Step 104: Perform node embedding vector expression calculation on the enterprise network sample to obtain a vector expressing the structural characteristics of each enterprise node.
For example, the node embedding vector expression calculation may be performed by a method such as DeepWalk, LINE, DNGR, SDNE, or node2vec. The goal of the calculation is that the vector of an enterprise node is sufficient to identify the neighbors of that enterprise node in the enterprise network.
Step 106: Input the vectors of the enterprise nodes into a graph neural network model for iteration.
During iteration, the graph neural network model is trained to identify the upstream and downstream relationships of enterprise nodes using the upstream and downstream relationship features of the enterprise nodes, which comprise the feature expression of the enterprise node, the feature expression of the neighbor nodes of the enterprise node, and the feature expression of the relationship between the enterprise node and its neighbor nodes.
By the mechanism of the graph neural network model, in each iteration the model computes the feature expression of an enterprise node by aggregating the feature information of its neighbor nodes with the input feature expression of the enterprise node, which yields the feature expression of the enterprise node for the next iteration. The upstream and downstream relationship features of the enterprise node are then trained through the fully connected layer. Before entering the next iteration, the graph neural network model adjusts its parameters so that the recognition result moves closer to the target; for example, the parameters may be adjusted by gradient descent. The graph neural network model is thus trained by adjusting its parameters over successive iterations, and the trained graph neural network model is finally obtained.
The feature expression of the relationship between the enterprise node and the neighbor node can be obtained specifically according to an attribute value corresponding to the relationship attribute of the edge in the enterprise network sample. For example, the relationship attributes of an edge may include the following attributes:
merchant purchasing relationship attributes: the corresponding attribute values may include, for example, commodity category, amount, number of items, transaction frequency, etc.;
communication relationship attributes: the corresponding attribute values may include, for example, whether the parties store each other's contact details, whether the remark name contains industry keywords, etc.;
LBS (Location Based Services) relationship attributes: the corresponding attribute values may include, for example, place of business, distance between habitual locations, etc.
Assume that, based on these relationship attributes, the values in the feature expression X_{E(u,v)} of the relationship between enterprise nodes are ordered as: merchant purchasing relationship attributes, communication relationship attributes, LBS relationship attributes. Missing information takes the value 0. For text attribute values, corresponding numeric values may be assigned in advance and used in the vector to represent the different attribute values; for example, the value for the women's goods category is 1, the value for mutually stored telephone numbers is 1, and the value for the keyword "buyer" is 1. Suppose one enterprise purchases women's goods from another enterprise in 5 transactions totaling 1000 in amount and 10 items, each enterprise stores the other's telephone number in its address book with the remark "buyer", and the distance between the two locations is 50 km. In this scenario, the feature expression of the relationship between the two enterprise nodes is X_{E(u,v)} = (1, 1000, 10, 5, 1, 1, 0, 50).
The above description of relationship attributes merely illustrates the feature expression of the relationship between an enterprise node and its neighbor nodes in the embodiments of this specification and does not limit those embodiments.
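The worked example can be reproduced with a small encoder. This is a sketch under stated assumptions: the field names in `EDGE_FIELDS`, the text keys in `TEXT_CODES`, and the choice that the seventh position is an unfilled LBS attribute defaulting to 0 are hypothetical; only the numeric values come from the example in the text.

```python
# Field order follows the example: purchasing attributes, communication
# attributes, then LBS attributes. All field names are hypothetical.
EDGE_FIELDS = (
    "commodity_category", "amount", "item_count", "transaction_count",
    "stores_contact", "remark_keyword",
    "business_location", "distance_km",
)

# Numeric codes assigned in advance to text values (codes taken from the example).
TEXT_CODES = {"women's goods": 1, "mutual phone": 1, "buyer": 1}

def encode_edge(attributes):
    """Encode a dict of edge attributes as a tuple; missing fields default to 0."""
    vector = []
    for field in EDGE_FIELDS:
        value = attributes.get(field, 0)
        if isinstance(value, str):
            value = TEXT_CODES.get(value, 0)  # unknown text also falls back to 0
        vector.append(value)
    return tuple(vector)

x_e_uv = encode_edge({
    "commodity_category": "women's goods",
    "amount": 1000,
    "item_count": 10,
    "transaction_count": 5,
    "stores_contact": "mutual phone",
    "remark_keyword": "buyer",
    "distance_km": 50,
})
```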
Step 108: After the iteration ends, obtain the trained graph neural network model.
Whether the iteration of the graph neural network model has ended can be determined according to whether the recognition result of the model has reached the target. For example, in one embodiment of this specification, a cross-entropy loss function may be used as the objective function of the graph neural network model. When inputting the recognition result into the cross-entropy loss function shows that the loss has reached its minimum, the iteration can be determined to have ended.
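A minimal sketch of such a stopping check follows. The text names only a cross-entropy loss; the binary (0/1 label) form, the function names, and the rule that the loss is treated as minimal once it stops decreasing by more than a tolerance are assumptions made for illustration.

```python
import math

def binary_cross_entropy(labels, scores, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, scores):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

def has_converged(previous_loss, current_loss, tolerance=1e-4):
    """Assumed stopping rule: the loss no longer drops by more than tolerance."""
    return previous_loss - current_loss < tolerance
```

In practice the tolerance, or an alternative rule such as a fixed iteration budget, would be chosen per application.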
In summary, the method performs node embedding vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural characteristics of the enterprise node, and because the vector of the enterprise node is enough to identify the neighbors of the enterprise node in the enterprise network, the vector of the enterprise node is input into the graph neural network model for iteration, and the characteristic expression of the enterprise node calculated in the iteration process of the graph neural network model can aggregate the characteristic information of the neighbor node based on the mechanism of the graph neural network model. Therefore, the upstream and downstream relation features of the enterprise node, which comprise the feature expression of the enterprise node, the feature expression of the neighbor nodes of the enterprise node and the relation feature expression between the enterprise node and the neighbor nodes, can accurately express the upstream and downstream relation. Therefore, the graph neural network uses the upstream and downstream relation characteristics of the enterprise nodes to train the upstream and downstream relation identification of the enterprise nodes in the iterative process, and a graph neural network model for accurately identifying the upstream and downstream relation of the enterprise nodes can be obtained.
Next, a specific embodiment of the method for establishing the enterprise upstream and downstream relationship identification model, in which the node embedding vector expression calculation is performed with the node2vec algorithm, is described in detail.
For identifying enterprise upstream and downstream relationships, the graph structure information of the enterprise network sample is an important basis for judgment, since specific graph structures may exist between enterprise nodes that have upstream and downstream relationships. Performing the node embedding vector expression calculation with the node2vec algorithm encodes this graph information and yields a vector expressing the structural characteristics of each enterprise node.
The goal of the node embedding vector expression calculation is that the vector of an enterprise node is sufficient to identify the neighbors of that enterprise node in the enterprise network. That is, the underlying motivation is for the vector expression of an enterprise node to identify the neighbors of that node as well as possible. To achieve this, the following optimization function may be employed:
$$\max_{f} \sum_{u \in V} \log \Pr(N_S(u) \mid f(u))$$
where u is an enterprise node in the graph, V is the set of enterprise nodes, f is the mapping function from an enterprise node u to its vector expression, N_S(u) is the set of neighbor nodes of enterprise node u, and Pr(N_S(u) | f(u)) is the probability that the neighbors of u, as inferred from the vector expression of enterprise node u, are N_S(u).
For ease of solution and calculation, the probabilities of the individual neighbors of an enterprise node, as inferred from that enterprise node, are assumed to be independent of each other, so that:
$$\Pr(N_S(u) \mid f(u)) = \prod_{n_i \in N_S(u)} \Pr(n_i \mid f(u))$$
To preserve the symmetry of the neighbor relationship between enterprise nodes, the cosine similarity between two vectors is used to represent the likelihood that two enterprise nodes are neighbors of each other. Using the sum of the similarities over all enterprise nodes as a normalization factor, the probability that enterprise node n_i is a neighbor of enterprise node u can be calculated by normalizing the similarity between n_i and u with that factor.
In summary, the node2vec algorithm is used for the node embedding vector expression calculation. During vector representation learning, cosine similarity represents the likelihood that enterprise nodes are neighbors of each other, and the sum of the similarities over all enterprise nodes serves as the normalization factor for obtaining the probability that enterprise node n_i is a neighbor of enterprise node u. As a result, enterprise nodes that are close in the network or structurally similar have similar embedded representations, the structural information between enterprise nodes (namely, whether an edge exists) is learned, and a vector expressing the structural characteristics of each enterprise node is obtained, which facilitates accurate identification of the upstream and downstream relationships of enterprise nodes in the graph neural network model.
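The normalized neighbor probability can be sketched as follows. Two points are assumptions rather than statements of the source: the similarities are exponentiated before normalization (a softmax, as in the published node2vec formulation), and node u itself is left out of the normalization. The embedding values are placeholders.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def neighbor_probabilities(embeddings, u):
    """Probability of each other node being a neighbor of u.

    Assumption: exp(similarity) is normalized by its sum over the other nodes.
    """
    weights = {
        n: math.exp(cosine_similarity(vec, embeddings[u]))
        for n, vec in embeddings.items()
        if n != u
    }
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}

# Placeholder embeddings: "a" points almost the same way as "u", "b" is orthogonal.
embeddings = {"u": [1.0, 0.0], "a": [0.9, 0.1], "b": [0.0, 1.0]}
probs = neighbor_probabilities(embeddings, "u")
```

With these placeholder vectors, node "a" receives a higher neighbor probability than node "b" because its embedding is more similar to that of "u".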
To express the characteristics of an enterprise node more accurately, in one or more embodiments of this specification, a vector expressing the attribute characteristics of the enterprise node is further generated from the enterprise attributes of the node, that is, the characteristic information of the enterprise itself, and the vector expressing the structural characteristics of the enterprise node is combined with the vector expressing its attribute characteristics to obtain the vector of the enterprise node. The combined vector is then input into the graph neural network model for training. For example, the two vectors may be spliced by a CONCAT vector concatenation function to obtain the vector of the enterprise node. In this implementation, the vector of an enterprise node expresses both its structural characteristics and its attribute characteristics, so it expresses the characteristics of the enterprise node accurately and integrates attribute semantic analysis with topology-based identification of upstream and downstream relationships.
The vector for expressing the enterprise node attribute features can be specifically generated by carrying out semantic analysis on enterprise attributes of enterprise nodes in an enterprise network sample. The enterprise attributes correspond to enterprise portraits, for example, the enterprise attributes may include the following:
online merchant category attribute: the corresponding attribute values may be, for example, Taobao, Tmall, Chinese-language site, etc.;
sales feature attributes: the corresponding attribute values may be, for example, main business, commodity type, sales amount, etc.;
offline payment-code merchant attributes: the corresponding attribute values may include, for example, business hours, amount received, business address, LBS, etc.
Assume that, based on these enterprise attributes, the values in the vector of enterprise node attribute features are ordered as: online merchant category attribute, sales feature attributes, offline payment-code merchant attributes. Missing information may take the value 0. Text attribute values may be assigned corresponding numeric values in advance, and these values represent the corresponding semantics in the vector; for example, 1 for a Taobao store, 1 for clothing as the main business, and 2 for women's clothing. Suppose an online enterprise is a Taobao store whose main business is clothing, whose commodity types include women's clothing, and whose sales amount is 10000. In this scenario, the vector of enterprise node attribute features is (1, 1, 2, 10000).
The above examples of enterprise attributes merely illustrate the feature expression of enterprise node attribute features in the embodiments of this specification and do not limit those embodiments.
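The attribute example and the CONCAT splicing of structural and attribute vectors can be sketched together. The code table `ATTRIBUTE_CODES`, the function names, and the structural embedding values are hypothetical; only the resulting attribute vector (1, 1, 2, 10000) comes from the text.

```python
# Hypothetical codes for text attribute values (codes taken from the example).
ATTRIBUTE_CODES = {"Taobao store": 1, "clothing": 1, "women's clothing": 2}

def encode_attributes(category, main_business, commodity_type, sales_amount):
    """Encode enterprise attributes in the order used by the example; unknown text -> 0."""
    return (
        ATTRIBUTE_CODES.get(category, 0),
        ATTRIBUTE_CODES.get(main_business, 0),
        ATTRIBUTE_CODES.get(commodity_type, 0),
        sales_amount,
    )

def concat(*vectors):
    """CONCAT: splice the given vectors end to end into one tuple."""
    result = []
    for vector in vectors:
        result.extend(vector)
    return tuple(result)

attribute_vector = encode_attributes("Taobao store", "clothing", "women's clothing", 10000)
structural_vector = (0.12, -0.40, 0.88)  # placeholder embedding, not from the source
node_vector = concat(structural_vector, attribute_vector)
```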
The following description takes as an example an embodiment of the method for establishing the enterprise upstream and downstream relationship identification model in which enterprise nodes are aggregated with a breadth-adaptive function based on an attention mechanism and feature expressions are updated with a depth-adaptive function based on an LSTM operator. FIG. 2 is a flowchart of the processing of a method for establishing an enterprise upstream and downstream relationship identification model according to one or more embodiments of this specification, comprising steps 202 to 222.
Step 202: an enterprise network sample is obtained.
Step 204: Perform node embedding vector expression calculation on the enterprise network sample to obtain a vector expressing the structural characteristics of each enterprise node.
Step 206: Input the vectors of the enterprise nodes into the breadth-adaptive function based on an attention mechanism for aggregation, to obtain the aggregated feature expression, for the next iteration, of the neighbor nodes aggregated by each enterprise node.
The breadth-adaptive function measures the importance of different enterprise nodes through an attention mechanism and then aggregates the enterprise nodes according to that importance. For example, in a graph G = (V, E), V denotes the set of all nodes and E denotes the set of all edges. For enterprise node y, the importance of enterprise node x is computed from the feature vector expressions x and y of the two nodes, using W_s^T and its counterpart for the terminating node, which are the feature transformation matrices of the originating and terminating enterprise nodes of an edge, respectively, together with v^T, the transformation vector of the attention, and softmax_{x,y}, a normalization function.
The normalization is taken over every (x', y) ∈ E, that is, over every enterprise node x' connected to enterprise node y in the enterprise network.
The aggregated feature expression, for the next iteration t+1, of all neighbor nodes aggregated by enterprise node u is then obtained by combining the feature expressions of the neighbor nodes according to these importance values.
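A minimal sketch of attention-based neighbor aggregation follows. The additive score v · tanh(W_s x + W_d y), the name `w_d` for the terminating-node matrix, and the plain weighted sum used for aggregation are assumptions chosen to match the symbols described above; the exact formulas of the embodiment may differ.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def attention_score(x, y, w_s, w_d, v):
    """Assumed unnormalized importance of node x for node y: v . tanh(W_s x + W_d y)."""
    hidden = [math.tanh(a + b) for a, b in zip(matvec(w_s, x), matvec(w_d, y))]
    return sum(vi * hi for vi, hi in zip(v, hidden))

def aggregate_neighbors(y, neighbors, w_s, w_d, v):
    """Softmax-normalize the scores over the neighbors of y, then take the weighted sum."""
    scores = [attention_score(x, y, w_s, w_d, v) for x in neighbors]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(neighbors[0])
    aggregated = [sum(w * x[k] for w, x in zip(weights, neighbors)) for k in range(dim)]
    return weights, aggregated

# Placeholder parameters and features, chosen only to make the example concrete.
identity = [[1.0, 0.0], [0.0, 1.0]]
weights, aggregated = aggregate_neighbors(
    y=[1.0, 0.0],
    neighbors=[[1.0, 0.0], [0.0, 1.0]],
    w_s=identity, w_d=identity, v=[1.0, 1.0],
)
```

The attention weights sum to 1 over the neighbors of y, so the aggregated vector is a convex combination of the neighbor feature vectors.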
Step 208: Input the vector of the enterprise node and the aggregated feature expression, for the next iteration, of the neighbor nodes aggregated by the enterprise node into the depth-adaptive function based on an LSTM operator, to obtain the feature expression of the enterprise node for the next iteration.
Assume the current iteration is of order t, where t is 0 initially and the order of the depth-adaptive function increases with the number of iterations. The (t+1)-th order feature expression of an enterprise node is aggregated from the t-th order feature expression of the enterprise node and the t-th order feature expressions of all its neighbor nodes. It is therefore necessary to maintain the cell state of the enterprise node at each order of iteration, and an LSTM-like structure is used to fuse the t-th order results. The structure consists of three components, an input gate, a forget gate, and an output gate, which are described below.
Input gate: selects the important information in the t-th order result. It is computed from the t-th order representation of enterprise node u and the vector aggregated from all neighbor nodes of enterprise node u by the breadth-adaptive function, using the weight vector of the input gate; CONCAT is the vector concatenation function.
Forget gate: discards useless information in the previous cell state.
Output gate: selects the information that is useful in the (t+1)-th iteration.
The cell state is then computed from these gates, and the output of the depth-adaptive function can be expressed as the element-wise product of the output gate and the cell state.
For example, in the initial state, the parameters of the depth-adaptive function based on the LSTM operator can be initialized, and in the subsequent iterations they can be solved by a gradient-descent-based method.
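One step of an LSTM-like depth-adaptive update can be sketched as follows. This follows the standard LSTM cell equations as an assumption: the candidate-state term, the `tanh` applied to the cell state before the output gate, and the parameter names in `params` are not confirmed by the text, which describes only the three gates, the cell state, and the element-wise product.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gate(weights, inputs, activation):
    """One gate: apply activation to each row of weights dotted with inputs."""
    return [activation(sum(w * x for w, x in zip(row, inputs))) for row in weights]

def depth_step(h_t, aggregated, cell_t, params):
    """Return (h_next, cell_next) for one iteration of the assumed LSTM-like update."""
    z = list(h_t) + list(aggregated)                     # CONCAT of the two inputs
    i = gate(params["input"], z, sigmoid)                # input gate
    f = gate(params["forget"], z, sigmoid)               # forget gate
    o = gate(params["output"], z, sigmoid)               # output gate
    candidate = gate(params["candidate"], z, math.tanh)  # assumed candidate state
    cell_next = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, cell_t, i, candidate)]
    # Element-wise product of the output gate and the (squashed) cell state.
    h_next = [ok * math.tanh(ck) for ok, ck in zip(o, cell_next)]
    return h_next, cell_next

# Placeholder parameters: all-zero weights make every gate output 0.5.
zeros = [[0.0] * 4 for _ in range(2)]
params = {"input": zeros, "forget": zeros, "output": zeros, "candidate": zeros}
h_next, cell_next = depth_step([0.5, -0.5], [0.1, 0.2], [1.0, -1.0], params)
```

With all-zero weights the forget gate halves the previous cell state and the candidate contributes nothing, which makes the update easy to check by hand.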
Step 210: Update the next iteration to be the current iteration.
For example, let t=t+1.
Step 212: Combine the feature expression of the enterprise node in the current iteration, the feature expression of the neighbor node of the enterprise node in the current iteration, and the feature expression of the relationship between the enterprise node and the neighbor node to form the upstream and downstream relationship feature of the enterprise node.
For example, assuming the t-th order is the current iteration, the upstream and downstream relationship feature of enterprise node u is obtained by applying the vector concatenation function CONCAT to the t-th order feature expression of enterprise node u, the t-th order feature expression of its neighbor node v, and X_{E(u,v)}, the feature expression of the relationship between enterprise node u and neighbor node v.
Step 214: Discriminate the upstream and downstream relationship feature of the enterprise node through the fully connected layer to obtain the identification result of the upstream and downstream relationship of the enterprise node.
The fully connected layer discriminates the upstream and downstream relationship feature to obtain a score for the upstream and downstream relationship between enterprise nodes u and v, which corresponds to the identification result of the upstream and downstream relationship between enterprise nodes u and v.
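Steps 212 and 214 can be sketched as a concatenation followed by a single-output fully connected layer. The sigmoid output, the layer size, the variable names, and the feature values are assumptions; the text states only that a fully connected layer produces a score.

```python
import math

def concat(*vectors):
    """CONCAT: splice the given vectors end to end into one list."""
    result = []
    for vector in vectors:
        result.extend(vector)
    return result

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fully_connected_score(features, weights, bias):
    """Assumed scoring head: one fully connected output unit followed by a sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)

h_u = [0.2, 0.7]                         # placeholder feature expression of node u
h_v = [0.5, 0.1]                         # placeholder feature expression of neighbor v
x_e_uv = [1, 1000, 10, 5, 1, 1, 0, 50]   # relationship feature from the earlier example
relation_feature = concat(h_u, h_v, x_e_uv)

# Untrained placeholder parameters: zero weights give a neutral score of 0.5.
score = fully_connected_score(relation_feature, [0.0] * len(relation_feature), 0.0)
```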
Step 216: Determine whether the identification result satisfies the minimum loss.
If the minimum loss is satisfied, proceed to step 222.
Step 218: If the minimum loss is not satisfied, update the parameters of the graph neural network model, and input the feature expression of the enterprise node in the current iteration into the breadth-adaptive function based on an attention mechanism for aggregation, to obtain the aggregated feature expression, for the next iteration, of the neighbor nodes aggregated by the enterprise node.
For example, if the minimum loss is not satisfied, the parameters of the model, such as W_s^t and W_i^t, are solved and updated by a gradient-descent-based method.
Step 220: Input the feature expression of the enterprise node in the current iteration and the aggregated feature expression, for the next iteration, of the neighbor nodes aggregated by the enterprise node into the depth-adaptive function based on an LSTM operator, to obtain the feature expression of the enterprise node for the next iteration.
Then return to step 210.
Step 222: After the iteration ends, obtain the trained graph neural network model.
In this embodiment, enterprise nodes are aggregated with the breadth-adaptive function based on an attention mechanism to obtain their feature expressions, and the feature expressions are updated with the depth-adaptive function based on an LSTM operator. The feature expression of each enterprise node is thus explored in both breadth and depth, so that a more accurate feature expression of the upstream and downstream relationship can be obtained, which facilitates accurate identification of the upstream and downstream relationships of enterprise nodes in the graph neural network model.
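The control flow of steps 214 to 222 (score, check the loss, update parameters, repeat) can be illustrated with a deliberately simplified stand-in: gradient descent on a single fully connected layer with a cross-entropy loss. This is not the graph neural network of the embodiment. The breadth-adaptive and depth-adaptive functions are omitted, and the learning rate, tolerance, and stopping rule are assumptions.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train(samples, labels, learning_rate=0.5, tolerance=1e-6, max_iterations=10000):
    """Train one fully connected unit; stop once the loss stops decreasing."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    previous_loss = float("inf")
    loss = previous_loss
    n = len(samples)
    for _ in range(max_iterations):
        # Step 214 (simplified): score every sample with the fully connected layer.
        scores = [sigmoid(sum(w * x for w, x in zip(weights, s)) + bias) for s in samples]
        loss = -sum(
            y * math.log(max(p, 1e-12)) + (1 - y) * math.log(max(1.0 - p, 1e-12))
            for y, p in zip(labels, scores)
        ) / n
        # Step 216 (assumed rule): treat the loss as minimal when it stops improving.
        if previous_loss - loss < tolerance:
            break  # step 222: end the iteration
        previous_loss = loss
        # Step 218 (simplified): gradient-descent update of the parameters.
        for k in range(len(weights)):
            grad = sum((p - y) * s[k] for p, y, s in zip(scores, labels, samples)) / n
            weights[k] -= learning_rate * grad
        bias -= learning_rate * sum(p - y for p, y in zip(scores, labels)) / n
    return weights, bias, loss

# Toy data: one feature, label 1 when the feature is 1.
weights, bias, final_loss = train([[0.0], [1.0]], [0, 1])
```

In the full method, each pass through the loop would also recompute the node feature expressions with the breadth-adaptive and depth-adaptive functions before scoring.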
Corresponding to the method embodiment, the present disclosure further provides an embodiment of an apparatus for establishing an enterprise upstream and downstream relationship identification model, and fig. 3 shows a schematic structural diagram of an apparatus for establishing an enterprise upstream and downstream relationship identification model according to an embodiment of the present disclosure. As shown in fig. 3, the apparatus includes: a sample acquisition module 302, a sample vector calculation module 304, and a model iteration module 306.
The sample acquisition module 302 may be configured to acquire enterprise network samples.
The sample vector calculation module 304 may be configured to perform node embedded vector expression calculation on the enterprise network samples to obtain a vector for expressing structural characteristics of the enterprise nodes.
The model iteration module 306 may be configured to input the vectors of the enterprise nodes into a graph neural network model for iteration, where, during iteration, the graph neural network model is trained to identify the upstream and downstream relationships of enterprise nodes using the upstream and downstream relationship features of the enterprise nodes, which comprise the feature expression of the enterprise node, the feature expression of the neighbor nodes of the enterprise node, and the feature expression of the relationship between the enterprise node and its neighbor nodes; and to obtain the trained graph neural network model after the iteration ends.
In summary, the device performs node embedding vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural characteristics of the enterprise node, and because the vector of the enterprise node is enough to identify the neighbors of the enterprise node in the enterprise network, the vector of the enterprise node is input into the graph neural network model for iteration, and the characteristic expression of the enterprise node calculated in the graph neural network model iteration process can aggregate the characteristic information of the neighbor node based on the mechanism of the graph neural network model. Therefore, the upstream and downstream relation features of the enterprise node, which comprise the feature expression of the enterprise node, the feature expression of the neighbor nodes of the enterprise node and the relation feature expression between the enterprise node and the neighbor nodes, can accurately express the upstream and downstream relation. Therefore, the graph neural network uses the upstream and downstream relation characteristics of the enterprise nodes to train the upstream and downstream relation identification of the enterprise nodes in the iterative process, and a graph neural network model for accurately identifying the upstream and downstream relation of the enterprise nodes can be obtained.
In one or more embodiments of this specification, the sample vector calculation module 304 may be configured to perform node embedding vector expression calculation on the enterprise network samples using the node2vec algorithm to obtain vectors expressing the structural characteristics of the enterprise nodes. During vector representation learning with the node2vec algorithm, cosine similarity represents the likelihood that enterprise nodes are neighbors of each other, and the sum of the similarities over all enterprise nodes serves as the normalization factor for obtaining the probability that enterprise node n_i is a neighbor of enterprise node u. As a result, enterprise nodes that are close in the network or structurally similar have similar embedded representations, the structural information between enterprise nodes (namely, whether an edge exists) is learned, and a vector expressing the structural characteristics of each enterprise node is obtained, which facilitates accurate identification of the upstream and downstream relationships of enterprise nodes in the graph neural network model.
Fig. 4 is a schematic structural diagram of an apparatus for building an enterprise upstream and downstream relationship recognition model according to another embodiment of the present disclosure. As shown in fig. 4, the apparatus may further include: sample attribute feature calculation module 308 and sample vector combination module 310.
The sample attribute feature calculation module 308 may be configured to generate a vector for expressing the enterprise node attribute feature based on the enterprise attributes of the enterprise node.
The sample vector combining module 310 may be configured to combine the vector for expressing the structural feature of the enterprise node with the vector for expressing the attribute feature of the enterprise node to obtain the vector of the enterprise node. By the implementation mode, the vector of the enterprise node can express the structural characteristics of the enterprise node and the attribute characteristics of the enterprise node, can accurately express the characteristics of the enterprise node, and realizes the integration of attribute semantic analysis and the identification of the upstream and downstream relations of the topological structure.
In one or more embodiments of this specification, enterprise nodes are aggregated using a breadth-adaptive function based on an attention mechanism, and feature expressions are updated using a depth-adaptive function based on an LSTM operator. Specifically, for example, as shown in FIG. 4, the model iteration module 306 of the apparatus may include: an initial vector input sub-module 3060, an initial feature update sub-module 3061, an iterative process update sub-module 3062, a relationship feature combination sub-module 3063, an upstream and downstream discrimination sub-module 3064, a loss determination sub-module 3065, an iteration end sub-module 3066, a parameter update sub-module 3067, a neighbor feature update sub-module 3068, and a node feature update sub-module 3069.
The initial vector input sub-module 3060 may be configured to input the vectors of the enterprise nodes into the breadth-adaptive function based on an attention mechanism for aggregation, to obtain the aggregated feature expression, for the next iteration, of the neighbor nodes aggregated by each enterprise node.
The initial feature update sub-module 3061 may be configured to input the vector of the enterprise node and the aggregated feature expression, for the next iteration, of the neighbor nodes aggregated by the enterprise node into the depth-adaptive function based on an LSTM operator, to obtain the feature expression of the enterprise node for the next iteration.
The iterative process update submodule 3062 may be configured to update a next iterative process to a current iterative process.
The relationship feature combining submodule 3063 may be configured to combine the feature expression of the current iterative process of the enterprise node, the feature expression of the current iterative process of the neighbor node of the enterprise node, and the feature expression of the relationship between the enterprise node and the neighbor node to form the upstream and downstream relationship feature of the enterprise node.
The upstream and downstream discriminating submodule 3064 may be configured to discriminate the upstream and downstream relationship features of the enterprise node through the full connection layer, so as to obtain a recognition result of the upstream and downstream relationship of the enterprise node.
The loss determination submodule 3065 may be configured to determine whether the recognition result satisfies a loss minimum.
The iteration end submodule 3066 may be configured to enter a step of ending the iteration to obtain a trained graph neural network model if the minimum loss is met.
The parameter update sub-module 3067 may be configured to update the parameters of the graph neural network model if the minimum loss is not satisfied.
The neighbor feature update submodule 3068 may be configured to input the feature expression of the current iteration process of the enterprise node into a breadth-adaptive function based on an attention mechanism for aggregation, so as to obtain an aggregate feature expression of a next iteration process of the neighbor node aggregated by the enterprise node.
The node feature update sub-module 3069 may be configured to input the feature expression of the enterprise node in the current iteration and the aggregated feature expression, for the next iteration, of the neighbor nodes aggregated by the enterprise node into the depth-adaptive function based on an LSTM operator, to obtain the feature expression of the enterprise node for the next iteration, and then to trigger the iterative process update sub-module 3062 again.
In this embodiment, enterprise nodes are aggregated with the breadth-adaptive function based on an attention mechanism to obtain their feature expressions, and the feature expressions are updated with the depth-adaptive function based on an LSTM operator. The feature expression of each enterprise node is thus explored in both breadth and depth, so that a more accurate feature expression of the upstream and downstream relationship can be obtained, which facilitates accurate identification of the upstream and downstream relationships of enterprise nodes in the graph neural network model.
The above is an exemplary scheme of an apparatus for establishing an enterprise upstream and downstream relationship identification model according to this embodiment. It should be noted that, the technical solution of the device for establishing the enterprise upstream and downstream relationship recognition model and the technical solution of the method for establishing the enterprise upstream and downstream relationship recognition model belong to the same concept, and details of the technical solution of the device for establishing the enterprise upstream and downstream relationship recognition model, which are not described in detail, can be referred to the description of the technical solution of the method for establishing the enterprise upstream and downstream relationship recognition model.
FIG. 5 shows a flowchart of a method for enterprise upstream and downstream relationship mining, including steps 502 through 504, provided in accordance with one embodiment of the present description.
Step 502: Perform node embedding vector expression calculation on an enterprise network formed by two or more enterprises to obtain, for each of the two or more enterprise nodes, a vector expressing the structural features of that enterprise node.
For example, the node embedding vector expression calculation may be performed on the enterprise network formed by two or more enterprises using the node2vec algorithm adopted in the method for establishing an enterprise upstream and downstream relationship identification model according to the above embodiment of the present disclosure.
Step 504: Input the vectors of the two or more enterprise nodes into a trained graph neural network model obtained by the method for establishing the enterprise upstream and downstream relationship identification model according to any embodiment of the specification, and output the identification result of the upstream and downstream relationships of the two or more enterprise nodes.
Corresponding to the method for establishing the enterprise upstream and downstream relationship identification model, the edges of the enterprise network formed by the two or more enterprise nodes may carry corresponding relationship attributes, so that the graph neural network model can obtain a feature expression of the relationship between enterprise nodes. The feature expression of an enterprise node, the feature expression of its neighbor nodes, and the feature expression of the relationship between them are combined into the upstream and downstream relationship feature of the enterprise node, which is then identified through the fully connected layer.
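As a hedged illustration of the walk-generation stage of node2vec, the sketch below produces a second-order biased random walk over an adjacency dictionary. The function name `biased_walk`, the toy graph, and the parameter values are assumptions made for illustration. The subsequent skip-gram training that turns walks into vectors is not shown.

```python
import random

def biased_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """Generate one node2vec-style biased walk.

    adj maps each node to the set of its neighbors. The return parameter p
    and in-out parameter q weight the next step by its distance from the
    previously visited node: 1/p (distance 0), 1 (distance 1), 1/q (distance 2).
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = sorted(adj.get(cur, ()))
        if not nbrs:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            walk.append(rng.choice(nbrs))  # first step is uniform
            continue
        prev = walk[-2]
        weights = []
        for n in nbrs:
            if n == prev:
                weights.append(1.0 / p)
            elif n in adj.get(prev, ()):
                weights.append(1.0)
            else:
                weights.append(1.0 / q)
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk

# Hypothetical four-enterprise network.
adj = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
walk = biased_walk(adj, "A", 6, p=0.5, q=2.0, rng=random.Random(0))
```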
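The combination and identification step can be sketched as follows, assuming that "combining" means concatenation and that the fully connected layer is a single linear unit followed by a sigmoid. Both are assumptions, and `score_relationship`, `weights`, and `bias` are hypothetical names. In practice the weights would be learned during training.

```python
import math

def score_relationship(node_feat, neighbor_feat, relation_feat, weights, bias):
    """Concatenate the three feature expressions and score them with one
    linear unit plus a sigmoid (a stand-in for the fully connected layer)."""
    x = list(node_feat) + list(neighbor_feat) + list(relation_feat)
    if len(x) != len(weights):
        raise ValueError("weights must match the concatenated feature length")
    z = sum(w * v for w, v in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# With all-zero weights and bias the score is exactly 0.5.
neutral = score_relationship([1.0, 2.0], [0.5], [3.0], [0.0, 0.0, 0.0, 0.0], 0.0)
```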
It can be seen that, in this embodiment, the node embedded vector expression calculation is performed by using the graph structure information of the enterprise network to obtain a vector for expressing the structural feature of the enterprise node, and the vector of two or more enterprise nodes is input into the graph neural network model.
To express the features of the enterprise nodes more accurately, and corresponding to an embodiment of the method for establishing the enterprise upstream and downstream relationship identification model, a vector expressing the attribute features of each of the two or more enterprise nodes may further be generated from the respective enterprise attributes of those nodes. For each of the two or more enterprise nodes, the vector expressing its structural features is then combined with the vector expressing its attribute features to obtain the vector of that enterprise node. In this embodiment, the vector of an enterprise node expresses both its structural features and its attribute features, so the features of the enterprise node are expressed accurately, and upstream and downstream relationship identification fuses attribute semantic analysis with topological structure.
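A minimal sketch of the combination step, assuming that the structural and attribute vectors are simply concatenated (the specification says only that they are combined). The function name and sample values are hypothetical.

```python
def combine_node_vectors(structural, attribute):
    """Concatenate each node's structural vector with its attribute vector.

    Both arguments map a node id to a list of floats. Concatenation is an
    assumed reading of "combining".
    """
    combined = {}
    for node, s_vec in structural.items():
        if node not in attribute:
            raise KeyError(f"no attribute vector for node {node!r}")
        combined[node] = list(s_vec) + list(attribute[node])
    return combined

structural = {"E1": [0.1, 0.2], "E2": [0.3, 0.4]}
attribute = {"E1": [1.0], "E2": [0.0]}
node_vectors = combine_node_vectors(structural, attribute)
```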
In one or more embodiments of the present disclosure, enterprise transaction data can indirectly reflect potential upstream and downstream relationships between related enterprises. Accordingly, enterprise upstream and downstream relationships are further mined from the enterprise transaction data according to certain rules. Enterprise upstream and downstream relationships generally fall into two modes, a sales replacement mode and a stock mode. These two modes are described below.
In the sales replacement mode, goods are usually shipped directly to individual customers. The transaction data therefore shows high shipping dispersion and high transaction frequency, where the shipping dispersion equals the number of receivers divided by the number of transactions. Further, the transaction data of the upstream enterprise in the sales replacement mode has a high proportion of enterprise buyers and a low proportion of individual buyers. Based on these characteristics, the enterprises belonging to the sales replacement mode can be identified, and each such enterprise and the buyer enterprise of the corresponding transaction can be output as an upstream and downstream relationship. Specifically, in this embodiment, the enterprise transaction data may further be acquired; the shipping dispersion and the transaction frequency of each enterprise are calculated from the enterprise transaction data; the enterprises belonging to the sales replacement mode are determined using the shipping dispersion and the transaction frequency; and each enterprise belonging to the sales replacement mode and the buyer enterprise of the corresponding transaction are output as an upstream and downstream relationship.
In the stock mode, an enterprise typically purchases goods from a wholesaler and then sells them. The transaction data therefore shows high transaction frequency and a large number of items per transaction. Further, the transaction data of the upstream enterprise in the stock mode has a high proportion of enterprise buyers and a low proportion of individual buyers. Based on these characteristics, the enterprises belonging to the stock mode can be identified, and each such enterprise and the buyer enterprise of the corresponding transaction can be output as an upstream and downstream relationship. Specifically, in this embodiment, the enterprise transaction data may further be acquired; the transaction frequency and the per-transaction item count of each enterprise are calculated from the enterprise transaction data; the enterprises belonging to the stock mode are determined using the transaction frequency and the per-transaction item count; and each enterprise belonging to the stock mode and the buyer enterprise of the corresponding transaction are output as an upstream and downstream relationship.
This embodiment does not limit the specific way in which the shipping dispersion and the transaction frequency are used to determine enterprises belonging to the sales replacement mode, nor the specific way in which the transaction frequency and the per-transaction item count are used to determine enterprises belonging to the stock mode. For example, the sales replacement mode may be determined against a preset threshold of shipping dispersion and a preset threshold of transaction frequency. Likewise, the stock mode may be determined against a preset threshold of transaction frequency and a preset threshold of per-transaction item count. As another example, the sales replacement mode may be determined by a trained decision-tree-based sales replacement mode identification model, and the stock mode by a trained decision-tree-based stock mode identification model. Specifically, enterprise upstream and downstream relationship samples with strong confidence may first be obtained; these samples are then used to train sales replacement mode discrimination and/or stock mode discrimination with a decision tree algorithm, yielding a trained decision-tree-based sales replacement mode identification model and/or stock mode identification model.
A decision tree is a classification method that learns from samples to obtain an identification model. For example, each enterprise upstream and downstream relationship sample may have a corresponding shipping dispersion, transaction frequency, and category, such as whether it belongs to the sales replacement mode. As another example, each sample may have a corresponding transaction frequency, per-transaction item count, and category, such as whether it belongs to the stock mode. Learning from these samples with a decision tree yields a classifier, namely the sales replacement mode identification model or the stock mode identification model. For an enterprise relationship to be identified, the shipping dispersion and the transaction frequency of the enterprise's transactions can be input into the sales replacement mode identification model to determine whether the enterprise belongs to the sales replacement mode; the transaction frequency and the per-transaction item count can be input into the stock mode identification model to determine whether the enterprise belongs to the stock mode. Training the sales replacement mode or stock mode discrimination with a decision tree algorithm on enterprise upstream and downstream relationship samples with strong confidence allows large volumes of enterprise transaction data to be discriminated in a relatively short time, which improves the efficiency of mining enterprise upstream and downstream relationships.
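The two features used for the sales replacement mode can be computed as in the sketch below. Shipping dispersion follows the definition in the text (number of receivers divided by number of transactions). Transaction frequency is assumed here to mean transactions per day over an observation window, which the specification does not define. The record layout, function names, and thresholds are hypothetical.

```python
from collections import defaultdict

def sales_replacement_features(transactions, window_days):
    """Return {seller: (shipping_dispersion, transaction_frequency)}.

    transactions is an iterable of (seller, receiver) pairs.
    shipping_dispersion = distinct receivers / number of transactions.
    transaction_frequency = transactions per day (assumed definition).
    """
    receivers = defaultdict(set)
    counts = defaultdict(int)
    for seller, receiver in transactions:
        receivers[seller].add(receiver)
        counts[seller] += 1
    return {
        seller: (len(receivers[seller]) / counts[seller],
                 counts[seller] / window_days)
        for seller in counts
    }

def is_sales_replacement(dispersion, frequency, min_dispersion, min_frequency):
    """Threshold rule: both features must reach their preset thresholds."""
    return dispersion >= min_dispersion and frequency >= min_frequency

transactions = [("S1", "r1"), ("S1", "r2"), ("S1", "r3"), ("S1", "r1"),
                ("S2", "r9"), ("S2", "r9")]
features = sales_replacement_features(transactions, window_days=2)
```

Here `S1` ships 4 orders to 3 distinct receivers in 2 days, giving a dispersion of 0.75 and a frequency of 2.0 per day.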
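The stock-mode features can be sketched in the same way. The record layout `(enterprise, quantity)`, the per-day definition of transaction frequency, and the thresholds are assumptions; the specification names only the two features.

```python
def stock_features(transactions, window_days):
    """Return {enterprise: (transaction_frequency, items_per_transaction)}.

    transactions is an iterable of (enterprise, quantity) pairs, where
    quantity is the number of items in one transaction.
    """
    counts = {}
    items = {}
    for enterprise, quantity in transactions:
        counts[enterprise] = counts.get(enterprise, 0) + 1
        items[enterprise] = items.get(enterprise, 0) + quantity
    return {
        e: (counts[e] / window_days, items[e] / counts[e])
        for e in counts
    }

def is_stock_mode(frequency, items_per_transaction, min_frequency, min_items):
    """Threshold rule: both features must reach their preset thresholds."""
    return frequency >= min_frequency and items_per_transaction >= min_items

stock = stock_features([("E1", 100), ("E1", 300), ("E2", 1)], window_days=4)
```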
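To make the decision-tree idea concrete, the sketch below fits a one-level tree (a decision stump) by minimizing weighted Gini impurity. This is a deliberate simplification, since the specification does not state the tree depth, split criterion, or library. The sample values and function names are invented for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def majority(labels):
    """Majority label; ties go to 1."""
    return int(2 * sum(labels) >= len(labels))

def fit_stump(rows, labels):
    """Find the single (feature, threshold) split with the lowest weighted
    Gini impurity. Returns (feature_index, threshold, left_label, right_label)."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[f] <= thr]
            right = [y for r, y in zip(rows, labels) if r[f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, thr, majority(left), majority(right))
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]

def predict(stump, row):
    f, thr, left_label, right_label = stump
    return left_label if row[f] <= thr else right_label

# Hypothetical strong-confidence samples: (shipping dispersion, frequency),
# labeled 1 if the enterprise belongs to the sales replacement mode.
rows = [(0.9, 12.0), (0.8, 9.0), (0.2, 10.0), (0.1, 1.0)]
labels = [1, 1, 0, 0]
stump = fit_stump(rows, labels)
```

A full decision tree would apply the same split search recursively to each side of the split.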
In one or more embodiments of the present application, the identification result may be a score of an upstream-downstream relationship of the two or more enterprise nodes, and the method may further sort the upstream-downstream relationship of the two or more enterprise nodes according to the score of the upstream-downstream relationship of the two or more enterprise nodes, so as to obtain the upstream-downstream relationship of the enterprise with different confidence degrees.
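The sorting step can be sketched as below. The tuple layout `(upstream, downstream, score)` and the tier thresholds are hypothetical; the specification states only that relationships are sorted by score to obtain different confidence levels.

```python
def rank_relationships(scored):
    """Sort (upstream, downstream, score) tuples by descending score."""
    return sorted(scored, key=lambda r: r[2], reverse=True)

def confidence_tier(score, strong=0.8, weak=0.5):
    """Map a score to a coarse confidence tier (thresholds are assumptions)."""
    if score >= strong:
        return "strong"
    if score >= weak:
        return "medium"
    return "weak"

scored = [("A", "B", 0.55), ("C", "D", 0.91), ("E", "F", 0.12)]
ranked = rank_relationships(scored)
tiers = [confidence_tier(score) for _, _, score in ranked]
```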
Therefore, in one or more embodiments of the present application, a large number of enterprise upstream and downstream relationships with different confidence levels can be mined from the enterprise transaction data of an enterprise network formed by two or more enterprises. The relationships among enterprises are thereby described more comprehensively, which greatly benefits scenarios such as enterprise risk control and marketing. For example, as shown in the schematic diagram of the enterprise upstream and downstream relationship mining architecture in Fig. 6, enterprise upstream and downstream relationship samples with strong confidence can be extracted from strong-confidence data such as invoicing data between enterprises and enterprise supply and sales data. Enterprise upstream and downstream relationship samples with weak confidence, namely the relationships identified by rules, can be mined from enterprise transaction data according to certain rules. The samples with strong confidence can serve as the evaluation basis for the graph neural network model and for rule mining (such as the sales replacement mode identification model and the stock mode identification model). Using enterprise features, the graph neural network model outputs model-scored enterprise upstream and downstream relationships. Together these form enterprise upstream and downstream relationships with different confidence levels, which serve as a reference for scenarios such as enterprise risk control and marketing.
Corresponding to the method embodiment, the present disclosure further provides an embodiment of an apparatus for mining an upstream and downstream relationship of an enterprise, and fig. 7 shows a schematic structural diagram of an apparatus for mining an upstream and downstream relationship of an enterprise according to one embodiment of the present disclosure. As shown in fig. 7, the apparatus includes: the node vector calculation module 702 and the identification module 704.
The node vector calculation module 702 may be configured to perform node embedded vector expression calculation on an enterprise network formed by two or more enterprises, to obtain respective vectors of the two or more enterprise nodes for expressing structural features of the enterprise nodes.
The identification module 704 may be configured to input the vectors of the two or more enterprise nodes into a trained graph neural network model obtained by the method for establishing an enterprise upstream and downstream relationship identification model according to any embodiment of the present disclosure, and to output an identification result of the upstream and downstream relationships of the two or more enterprise nodes.
According to the embodiment, node embedding vector expression calculation is performed by utilizing graph structure information of an enterprise network to obtain a vector for expressing structural characteristics of the enterprise node, and the vector of two or more enterprise nodes is input into the graph neural network model.
Fig. 8 is a schematic structural diagram of an apparatus for mining an upstream-downstream relationship of an enterprise according to another embodiment of the present disclosure. As shown in fig. 8, the apparatus may further include: the node attribute feature calculation module 706 and the node vector combination module 708.
The node attribute feature calculation module 706 may be configured to generate a vector for expressing the attribute feature of the enterprise node for each of the two or more enterprise nodes based on the respective enterprise attributes of the two or more enterprise nodes.
The node vector combination module 708 is configured to combine, for the two or more enterprise nodes, a vector for expressing the structural feature of the enterprise node and a vector for expressing the attribute feature of the enterprise node, so as to obtain respective vectors of the two or more enterprise nodes.
In this embodiment, the vector of an enterprise node expresses both its structural features and its attribute features, so the features of the enterprise node are expressed accurately, and upstream and downstream relationship identification fuses attribute semantic analysis with topological structure.
In one or more embodiments of the present description, enterprise transaction data can indirectly reflect potential upstream and downstream relationships between related enterprises, so enterprise upstream and downstream relationships can be mined from it. Accordingly, in one or more embodiments of the present description, enterprise upstream and downstream relationships are further mined from enterprise transaction data. Enterprise upstream and downstream relationships generally fall into two modes, a sales replacement mode and a stock mode. These two modes are described below.
For the sales replacement mode, as shown in Fig. 8, the apparatus for mining the upstream and downstream relationships of enterprises may further include: the transaction data acquisition module 7100, the sales replacement feature calculation module 7102, the sales replacement mode determination module 7104, and the sales replacement relationship output module 7106.
The transaction data acquisition module 7100 may be configured to acquire enterprise transaction data.
The sales replacement feature calculation module 7102 may be configured to calculate the shipping dispersion and the transaction frequency of each enterprise based on the enterprise transaction data.
The sales replacement mode determination module 7104 may be configured to determine an enterprise belonging to a sales replacement mode using the shipping dispersion and the transaction frequency.
The sales replacement relationship output module 7106 may be configured to output each enterprise belonging to the sales replacement mode and the buyer enterprise of the corresponding transaction as an upstream and downstream relationship.
For the stock mode, as shown in Fig. 8, the apparatus for mining the upstream and downstream relationships of enterprises may further include: a stock feature calculation module 7202, a stock mode determination module 7204, and a stock relationship output module 7206.
The stock feature calculation module 7202 may be configured to calculate the transaction frequency and the per-transaction item count of each enterprise from the enterprise transaction data.
The stock mode determination module 7204 may be configured to determine the enterprises belonging to the stock mode using the transaction frequency and the per-transaction item count.
The stock relationship output module 7206 may be configured to output each enterprise belonging to the stock mode and the buyer enterprise of the corresponding transaction as an upstream and downstream relationship.
In one or more embodiments of the present application, the identification result may be a score of an upstream-downstream relationship of the two or more enterprise nodes, and the device for identifying an upstream-downstream relationship of an enterprise may further sort the upstream-downstream relationship of the two or more enterprise nodes according to the score of the upstream-downstream relationship of the two or more enterprise nodes, so as to obtain the upstream-downstream relationship of the enterprise with different confidence degrees.
Therefore, in one or more embodiments of the present application, a large number of enterprise upstream and downstream relationships with different confidence levels can be mined from the enterprise transaction data of an enterprise network formed by two or more enterprises, using the graph neural network model and the decision-tree-based discrimination models. The relationships among enterprises are thereby described more comprehensively, which greatly benefits scenarios such as enterprise risk control and marketing.
The above is a schematic scheme of an apparatus for mining an upstream-downstream relationship of an enterprise in this embodiment. It should be noted that, the technical solution of the device for mining the upstream and downstream relationships of the enterprise and the technical solution of the method for mining the upstream and downstream relationships of the enterprise belong to the same concept, and details of the technical solution of the device for mining the upstream and downstream relationships of the enterprise, which are not described in detail, can be referred to the description of the technical solution of the method for mining the upstream and downstream relationships of the enterprise.
Fig. 9 illustrates a block diagram of a computing device 900 provided in accordance with one embodiment of the present specification. The components of computing device 900 include, but are not limited to, memory 910 and processor 920. Processor 920 is coupled to memory 910 via bus 930, and database 950 is configured to store data.
Computing device 900 also includes an access device 940, which enables computing device 900 to communicate via one or more networks 960. Examples of such networks include the public switched telephone network (PSTN), a local area network (LAN), a wide area network (WAN), a personal area network (PAN), or a combination of communication networks such as the Internet. Access device 940 may include one or more of any type of wired or wireless network interface (e.g., a network interface card (NIC)), such as an IEEE 802.11 wireless local area network (WLAN) interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a universal serial bus (USB) interface, a cellular network interface, a Bluetooth interface, a near field communication (NFC) interface, and so forth.
In one embodiment of the present description, the above-described components of computing device 900 and other components not shown in FIG. 9 may also be connected to each other, for example, by a bus. It should be understood that the block diagram of the computing device illustrated in FIG. 9 is for exemplary purposes only and is not intended to limit the scope of the present description. Those skilled in the art may add or replace other components as desired.
Computing device 900 may be any type of stationary or mobile computing device including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smart phone), wearable computing device (e.g., smart watch, smart glasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 900 may also be a mobile or stationary server.
Wherein the processor 920 is configured to execute the following computer-executable instructions:
acquiring an enterprise network sample;
performing node embedding vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural characteristics of the enterprise node;
inputting the vector of the enterprise node into a graph neural network model for iteration, wherein during the iteration the graph neural network model is trained to identify the upstream and downstream relationships of enterprise nodes using the upstream and downstream relationship features of the enterprise node, and the upstream and downstream relationship features of the enterprise node comprise the feature expression of the enterprise node, the feature expression of the neighbor nodes of the enterprise node, and the feature expression of the relationship between the enterprise node and the neighbor nodes;
and obtaining a trained graph neural network model after the iteration ends.
Optionally, the performing node embedded vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural feature of the enterprise node includes:
and performing node embedding vector expression calculation on the enterprise network sample by using a node2vec algorithm to obtain a vector for expressing the structural characteristics of the enterprise node.
Optionally, the method further comprises:
generating a vector for expressing the attribute characteristics of the enterprise node according to the enterprise attribute of the enterprise node;
and combining the vector for expressing the structural characteristics of the enterprise node with the vector for expressing the attribute characteristics of the enterprise node to obtain the vector of the enterprise node.
Optionally, inputting the vector of the enterprise node into the graph neural network model for iteration includes:
inputting the vector of the enterprise node into a breadth-adaptive function based on an attention mechanism for aggregation, to obtain an aggregate feature expression of the next iteration of the neighbor nodes aggregated by the enterprise node;
inputting the vector of the enterprise node and the aggregate feature expression of the next iteration of the neighbor nodes aggregated by the enterprise node into a depth-adaptive function based on an LSTM operator, to obtain the feature expression of the next iteration of the enterprise node;
updating the next iteration process to be the current iteration process;
combining the characteristic expression of the current iteration process of the enterprise node, the characteristic expression of the current iteration process of the neighbor node of the enterprise node and the characteristic expression of the relationship between the enterprise node and the neighbor node to form the upstream and downstream relationship characteristic of the enterprise node;
identifying the upstream and downstream relationship features of the enterprise node through the fully connected layer to obtain the identification result of the upstream and downstream relationships of the enterprise node;
judging whether the identification result satisfies the minimum-loss condition;
if the minimum-loss condition is satisfied, ending the iteration to obtain a trained graph neural network model;
if the minimum-loss condition is not satisfied, updating parameters of the graph neural network model; inputting the feature expression of the current iteration of the enterprise node into the breadth-adaptive function based on the attention mechanism for aggregation, to obtain an aggregate feature expression of the next iteration of the neighbor nodes aggregated by the enterprise node; inputting the feature expression of the current iteration of the enterprise node and the aggregate feature expression of the next iteration of the neighbor nodes aggregated by the enterprise node into the depth-adaptive function based on the LSTM operator, to obtain the feature expression of the next iteration of the enterprise node; and returning to the step of updating the next iteration to be the current iteration.
The foregoing is a schematic illustration of a computing device of this embodiment. It should be noted that, the technical solution of the computing device and the technical solution of the method for establishing the enterprise upstream and downstream relationship identification model belong to the same concept, and details of the technical solution of the computing device, which are not described in detail, can be referred to the description of the technical solution of the method for establishing the enterprise upstream and downstream relationship identification model.
In another embodiment of the present description, the processor 920 may be configured to execute the following computer-executable instructions:
Carrying out node embedding vector expression calculation on an enterprise network formed by two or more enterprises to obtain respective vectors of the two or more enterprise nodes for expressing structural characteristics of the enterprise nodes;
and inputting the vectors of the two or more enterprise nodes into a trained graph neural network model obtained by the method for establishing the enterprise upstream and downstream relationship identification model according to any embodiment of the specification, and outputting the identification result of the upstream and downstream relationship of the two or more enterprise nodes.
Optionally, the method further comprises:
generating respective vectors for expressing the attribute characteristics of the enterprise nodes of the two or more enterprise nodes according to the respective enterprise attributes of the two or more enterprise nodes;
and combining the vector for expressing the structural characteristics of the enterprise nodes with the vector for expressing the attribute characteristics of the enterprise nodes for the two or more enterprise nodes respectively to obtain the respective vectors of the two or more enterprise nodes.
Optionally, the method further comprises:
acquiring enterprise transaction data;
calculating shipping dispersion and transaction frequency of enterprises according to the enterprise transaction data;
determining the enterprises belonging to the sales replacement mode using the shipping dispersion and the transaction frequency;
and outputting each enterprise belonging to the sales replacement mode and the buyer enterprise of the corresponding transaction as an upstream and downstream relationship.
Optionally, the method further comprises:
acquiring enterprise transaction data;
calculating the transaction frequency and the per-transaction item count of each enterprise according to the enterprise transaction data;
determining the enterprises belonging to the stock mode using the transaction frequency and the per-transaction item count;
and outputting each enterprise belonging to the stock mode and the buyer enterprise of the corresponding transaction as an upstream and downstream relationship.
Optionally, the identifying result is a score of an upstream-downstream relationship of the two or more enterprise nodes, and the method further includes:
and sorting the upstream and downstream relations of the two or more enterprise nodes according to the scores of the upstream and downstream relations of the two or more enterprise nodes to obtain the upstream and downstream relations of the enterprises with different confidence degrees.
The foregoing is a schematic illustration of a computing device of this embodiment. It should be noted that, the technical solution of the computing device and the technical solution of the method for mining the upstream and downstream relationships of the enterprise belong to the same concept, and details of the technical solution of the computing device, which are not described in detail, can be referred to the description of the technical solution of the method for mining the upstream and downstream relationships of the enterprise.
An embodiment of the present disclosure also provides a computer-readable storage medium storing computer instructions that, when executed by a processor, are configured to:
acquiring an enterprise network sample;
performing node embedding vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural characteristics of the enterprise node;
inputting the vector of the enterprise node into a graph neural network model for iteration, wherein during the iteration the graph neural network model is trained to identify the upstream and downstream relationships of enterprise nodes using the upstream and downstream relationship features of the enterprise node, and the upstream and downstream relationship features of the enterprise node comprise the feature expression of the enterprise node, the feature expression of the neighbor nodes of the enterprise node, and the feature expression of the relationship between the enterprise node and the neighbor nodes;
and obtaining a trained graph neural network model after the iteration ends.
Optionally, the performing node embedded vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural feature of the enterprise node includes:
and performing node embedding vector expression calculation on the enterprise network sample by using a node2vec algorithm to obtain a vector for expressing the structural characteristics of the enterprise node.
Optionally, the method further comprises:
generating a vector for expressing the attribute characteristics of the enterprise node according to the enterprise attribute of the enterprise node;
and combining the vector for expressing the structural characteristics of the enterprise node with the vector for expressing the attribute characteristics of the enterprise node to obtain the vector of the enterprise node.
Optionally, inputting the vector of the enterprise node into the graph neural network model for iteration includes:
inputting the vector of the enterprise node into a breadth-adaptive function based on an attention mechanism for aggregation, to obtain an aggregate feature expression of the next iteration of the neighbor nodes aggregated by the enterprise node;
inputting the vector of the enterprise node and the aggregate feature expression of the next iteration of the neighbor nodes aggregated by the enterprise node into a depth-adaptive function based on an LSTM operator, to obtain the feature expression of the next iteration of the enterprise node;
updating the next iteration process to be the current iteration process;
combining the characteristic expression of the current iteration process of the enterprise node, the characteristic expression of the current iteration process of the neighbor node of the enterprise node and the characteristic expression of the relationship between the enterprise node and the neighbor node to form the upstream and downstream relationship characteristic of the enterprise node;
Judging the characteristics of the upstream and downstream relations of the enterprise nodes through the full connection layer to obtain the identification result of the upstream and downstream relations of the enterprise nodes;
judging whether the identification result meets the minimum loss or not;
under the condition of minimum loss, entering the iteration to end to obtain a trained graph neural network model;
under the condition that the minimum loss is not met, updating parameters of the graph neural network model, inputting the feature expression of the current iteration process of the enterprise node into a breadth self-adaptive function based on an attention mechanism for aggregation, and obtaining an aggregation feature expression of the next iteration process of the neighbor node aggregated by the enterprise node; inputting the feature expression of the current iteration process of the enterprise node and the aggregation feature expression of the next iteration process of the neighbor nodes aggregated by the enterprise node into a depth self-adaptive function based on an LStM operator to obtain the feature expression of the next iteration process of the enterprise node; re-entering the step of updating the next iterative process to the current iterative process.
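One forward pass of the iteration described above can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the model of the specification: dot-product attention stands in for the breadth-adaptive function, an elementwise LSTM-style cell with scalar weights shared across dimensions stands in for the depth-adaptive function, and all weights, vectors and function names are hypothetical. The loss check and the parameter update are omitted.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def attention_aggregate(h_node, h_neighbors):
    """Breadth step: softmax attention over neighbors, then a weighted sum."""
    scores = [dot(h_node, h_n) for h_n in h_neighbors]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    weights = [e / sum(exps) for e in exps]
    agg = [sum(w * h_n[i] for w, h_n in zip(weights, h_neighbors))
           for i in range(len(h_node))]
    return agg, weights

def lstm_style_update(h, c, agg, w):
    """Depth step: elementwise LSTM-style cell; agg is the input, (h, c) the state.

    w holds hypothetical scalar weights shared by every dimension.
    """
    h_new, c_new = [], []
    for h_k, c_k, x_k in zip(h, c, agg):
        i = sigmoid(w["i_x"] * x_k + w["i_h"] * h_k)    # input gate
        f = sigmoid(w["f_x"] * x_k + w["f_h"] * h_k)    # forget gate
        o = sigmoid(w["o_x"] * x_k + w["o_h"] * h_k)    # output gate
        g = math.tanh(w["g_x"] * x_k + w["g_h"] * h_k)  # candidate value
        c_next = f * c_k + i * g
        c_new.append(c_next)
        h_new.append(o * math.tanh(c_next))
    return h_new, c_new

def relation_score(h_u, h_v, e_uv, fc_weights, fc_bias):
    """Fully connected layer over [h_u, h_v, e_uv], squashed to (0, 1)."""
    features = h_u + h_v + e_uv
    return sigmoid(dot(fc_weights, features) + fc_bias)

# Toy data: node u with two neighbors; all numbers are made up.
h_u = [0.5, -0.2]
neighbors = [[0.4, 0.1], [-0.3, 0.8]]
weights = {k: 0.5 for k in ("i_x", "i_h", "f_x", "f_h", "o_x", "o_h", "g_x", "g_h")}

agg, attn = attention_aggregate(h_u, neighbors)
h_u_next, c_u_next = lstm_style_update(h_u, [0.0, 0.0], agg, weights)
score = relation_score(h_u_next, neighbors[0], [1.0], [0.1] * 5, 0.0)
```

In a trained model the attention scores, the gate weights and the fully connected layer would all be learned, and the steps would repeat until the loss condition is satisfied.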
The above is an exemplary version of a computer-readable storage medium of the present embodiment. It should be noted that, the technical solution of the storage medium and the technical solution of the method for establishing the enterprise upstream and downstream relationship identification model belong to the same concept, and details of the technical solution of the storage medium which are not described in detail can be referred to the description of the technical solution of the method for establishing the enterprise upstream and downstream relationship identification model.
An embodiment of the present disclosure also provides another computer-readable storage medium storing computer instructions that, when executed by a processor, are configured to:
carrying out node embedding vector expression calculation on an enterprise network formed by two or more enterprises to obtain respective vectors of the two or more enterprise nodes for expressing structural characteristics of the enterprise nodes;
and inputting the vectors of the two or more enterprise nodes into a trained graph neural network model obtained by the method for establishing the enterprise upstream and downstream relationship identification model according to any embodiment of the specification, and outputting the identification result of the upstream and downstream relationship of the two or more enterprise nodes.
Optionally, the method further comprises:
generating respective vectors for expressing the attribute characteristics of the enterprise nodes of the two or more enterprise nodes according to the respective enterprise attributes of the two or more enterprise nodes;
and combining the vector for expressing the structural characteristics of the enterprise nodes with the vector for expressing the attribute characteristics of the enterprise nodes for the two or more enterprise nodes respectively to obtain the respective vectors of the two or more enterprise nodes.
Optionally, the method further comprises:
acquiring enterprise transaction data;
calculating shipping dispersion and transaction frequency of enterprises according to the enterprise transaction data;
judging enterprises belonging to the sales substituting mode by utilizing the shipping dispersion and the transaction frequency;
and outputting the enterprise belonging to the sales substituting mode and the buyer enterprise corresponding to the transaction as an upstream-downstream relationship.
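The rule-based steps above can be sketched as follows. The specification does not define how shipping dispersion is computed or which thresholds are used, so the definitions in the docstring, the threshold values, the record fields `seller`, `buyer` and `ship_from`, and the function name are all assumptions for illustration.

```python
from collections import defaultdict

def find_sales_substituting(transactions, min_dispersion=0.5, min_frequency=3):
    """Return sellers flagged as sales substituting and their (seller, buyer) pairs.

    Assumed definitions (not taken from the specification):
      transaction frequency = number of transactions of a seller
      shipping dispersion   = distinct shipping origins / number of transactions
    """
    by_seller = defaultdict(list)
    for tx in transactions:
        by_seller[tx["seller"]].append(tx)

    flagged, pairs = set(), set()
    for seller, txs in by_seller.items():
        frequency = len(txs)
        dispersion = len({tx["ship_from"] for tx in txs}) / frequency
        if frequency >= min_frequency and dispersion >= min_dispersion:
            flagged.add(seller)
            pairs.update((seller, tx["buyer"]) for tx in txs)
    return flagged, pairs

transactions = [
    {"seller": "S1", "buyer": "B1", "ship_from": "Hangzhou"},
    {"seller": "S1", "buyer": "B2", "ship_from": "Shenzhen"},
    {"seller": "S1", "buyer": "B1", "ship_from": "Chengdu"},
    {"seller": "S2", "buyer": "B3", "ship_from": "Hangzhou"},
    {"seller": "S2", "buyer": "B3", "ship_from": "Hangzhou"},
    {"seller": "S2", "buyer": "B4", "ship_from": "Hangzhou"},
]
flagged, pairs = find_sales_substituting(transactions)
```

In this toy data, seller S1 ships three orders from three different origins and is flagged, while S2 always ships from one origin and is not.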
Optionally, the method further comprises:
acquiring enterprise transaction data;
calculating the transaction frequency of enterprises and the average number of commodities per transaction according to the enterprise transaction data;
judging the enterprises belonging to the commodity feeding mode by utilizing the transaction frequency and the average number of commodities per transaction;
and outputting the enterprise belonging to the commodity feeding mode and the buyer enterprise corresponding to the transaction as an upstream-downstream relationship.
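A comparable sketch for the commodity feeding mode is given below. It interprets the commodity count as an average quantity per transaction and flags sellers whose frequency and average quantity both reach a threshold; this interpretation, the thresholds, the record fields and the function name are assumptions for illustration, not details given in the specification.

```python
from collections import defaultdict

def find_commodity_feeding(transactions, min_frequency=2, min_avg_quantity=50):
    """Return (seller, buyer) pairs for sellers flagged as commodity feeding.

    Assumed rule: a seller is flagged when it trades at least min_frequency
    times and its average quantity per transaction is at least min_avg_quantity.
    """
    by_seller = defaultdict(list)
    for tx in transactions:
        by_seller[tx["seller"]].append(tx)

    relations = []
    for seller, txs in sorted(by_seller.items()):
        frequency = len(txs)
        avg_quantity = sum(tx["quantity"] for tx in txs) / frequency
        if frequency >= min_frequency and avg_quantity >= min_avg_quantity:
            buyers = sorted({tx["buyer"] for tx in txs})
            relations.extend((seller, buyer) for buyer in buyers)
    return relations

transactions = [
    {"seller": "W1", "buyer": "R1", "quantity": 100},
    {"seller": "W1", "buyer": "R2", "quantity": 80},
    {"seller": "W2", "buyer": "R3", "quantity": 2},
    {"seller": "W2", "buyer": "R3", "quantity": 1},
]
relations = find_commodity_feeding(transactions)
```

Here W1 averages 90 units per transaction and is paired with both of its buyers, whereas W2 averages 1.5 units and is ignored.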
Optionally, the identifying result is a score of an upstream-downstream relationship of the two or more enterprise nodes, and the method further includes:
and sorting the upstream and downstream relations of the two or more enterprise nodes according to the scores of the upstream and downstream relations of the two or more enterprise nodes to obtain the upstream and downstream relations of the enterprises with different confidence degrees.
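The sorting step can be sketched as below. The scores, the tier boundaries `high` and `low`, and the function name are hypothetical; the specification only states that relationships are sorted by score to obtain relationships with different confidence degrees.

```python
def rank_relations(scored_relations, high=0.8, low=0.5):
    """Sort (pair, score) items by descending score and bucket them into tiers."""
    ranked = sorted(scored_relations, key=lambda item: item[1], reverse=True)
    tiers = {"high": [], "medium": [], "low": []}
    for pair, score in ranked:
        if score >= high:
            tiers["high"].append(pair)
        elif score >= low:
            tiers["medium"].append(pair)
        else:
            tiers["low"].append(pair)
    return ranked, tiers

# Made-up model outputs: (upstream, downstream) pair and its score.
scored = [(("A", "B"), 0.92), (("C", "D"), 0.41), (("A", "C"), 0.67)]
ranked, tiers = rank_relations(scored)
```

Downstream consumers can then use only the high tier where precision matters, or include the medium tier where coverage matters more.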
The above is an exemplary version of a computer-readable storage medium of the present embodiment. It should be noted that, the technical solution of the storage medium and the technical solution of the method for mining the upstream and downstream relationships of the enterprise belong to the same concept, and details of the technical solution of the storage medium, which are not described in detail, can be referred to the description of the technical solution of the method for mining the upstream and downstream relationships of the enterprise.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The computer instructions include computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer-readable medium may be appropriately added to or reduced according to the requirements of legislation and patent practice in a given jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of combinations of actions, but it should be understood by those skilled in the art that the embodiments are not limited by the order of actions described, as some steps may be performed in other order or simultaneously according to the embodiments of the present disclosure. Further, those skilled in the art will appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily all required for the embodiments described in the specification.
In the foregoing embodiments, the description of each embodiment has its own emphasis; for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are merely used to help clarify the present specification. Alternative embodiments are not intended to be exhaustive or to limit the invention to the precise form disclosed. Obviously, many modifications and variations are possible in light of the teaching of the embodiments. The embodiments were chosen and described in order to best explain the principles of the embodiments and the practical application, to thereby enable others skilled in the art to best understand and utilize the invention. This specification is to be limited only by the claims and the full scope and equivalents thereof.

Claims (18)

1. A method for establishing an enterprise upstream and downstream relation recognition model comprises the following steps:
acquiring an enterprise network sample;
performing node embedding vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural characteristics of the enterprise node, wherein the vector for expressing the structural characteristics of the enterprise node is used for identifying all neighbor nodes of the enterprise node in the enterprise network;
generating a vector for expressing the attribute characteristics of the enterprise node according to the enterprise attribute of the enterprise node;
combining the vector for expressing the structural characteristics of the enterprise node with the vector for expressing the attribute characteristics of the enterprise node to obtain the vector of the enterprise node;
inputting the vector of the enterprise node into a graph neural network model for iteration, wherein the graph neural network model uses the upstream and downstream relationship features of the enterprise node to train identification of the upstream and downstream relationships of the enterprise node during the iteration, and the upstream and downstream relationship features of the enterprise node comprise a feature expression of the enterprise node, a feature expression of the neighbor nodes of the enterprise node, and a feature expression of the relationship between the enterprise node and the neighbor nodes;
and obtaining a trained graph neural network model after the iteration ends.
2. The method of claim 1, the performing node embedded vector expression computation on the enterprise network sample to obtain a vector for expressing enterprise node structural features comprising:
and performing node embedding vector expression calculation on the enterprise network sample by using a node2vec algorithm to obtain a vector for expressing the structural characteristics of the enterprise node.
3. The method of claim 1, wherein inputting the vector of the enterprise node into the graph neural network model for iteration comprises:
inputting the vector of the enterprise node into a breadth-adaptive function based on an attention mechanism for aggregation, so as to obtain the aggregate feature expression, for the next iteration process, of the neighbor nodes aggregated by the enterprise node;
inputting the vector of the enterprise node and the aggregate feature expression, for the next iteration process, of the neighbor nodes aggregated by the enterprise node into a depth-adaptive function based on an LSTM operator, so as to obtain the feature expression of the enterprise node for the next iteration process;
updating the next iteration process to be the current iteration process;
combining the feature expression of the enterprise node in the current iteration process, the feature expression of the neighbor nodes of the enterprise node in the current iteration process, and the feature expression of the relationship between the enterprise node and the neighbor nodes to form the upstream and downstream relationship features of the enterprise node;
judging the upstream and downstream relationship features of the enterprise node through a fully connected layer to obtain an identification result of the upstream and downstream relationships of the enterprise node;
judging whether the identification result satisfies the minimum loss;
if the minimum loss is satisfied, proceeding to the end of the iteration to obtain a trained graph neural network model;
if the minimum loss is not satisfied, updating the parameters of the graph neural network model; inputting the feature expression of the enterprise node in the current iteration process into the breadth-adaptive function based on the attention mechanism for aggregation, so as to obtain the aggregate feature expression, for the next iteration process, of the neighbor nodes aggregated by the enterprise node; inputting the feature expression of the enterprise node in the current iteration process and the aggregate feature expression, for the next iteration process, of the neighbor nodes aggregated by the enterprise node into the depth-adaptive function based on the LSTM operator, so as to obtain the feature expression of the enterprise node for the next iteration process; and returning to the step of updating the next iteration process to be the current iteration process.
4. An apparatus for establishing an enterprise upstream and downstream relation recognition model, comprising:
a sample acquisition module configured to acquire an enterprise network sample;
a sample vector calculation module configured to perform node embedding vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural features of the enterprise node, wherein the vector for expressing the structural features of the enterprise node is used for identifying all neighbor nodes of the enterprise node in the enterprise network;
a sample attribute feature calculation module configured to generate a vector for expressing the attribute feature of the enterprise node according to the enterprise attribute of the enterprise node;
a sample vector combination module configured to combine a vector for expressing the structural feature of the enterprise node with a vector for expressing the attribute feature of the enterprise node to obtain a vector of the enterprise node;
a model iteration module configured to input the vector of the enterprise node into a graph neural network model for iteration, wherein the graph neural network model uses the upstream and downstream relationship features of the enterprise node to train identification of the upstream and downstream relationships of the enterprise node during the iteration, and the upstream and downstream relationship features of the enterprise node comprise a feature expression of the enterprise node, a feature expression of the neighbor nodes of the enterprise node, and a feature expression of the relationship between the enterprise node and the neighbor nodes; and to obtain a trained graph neural network model after the iteration ends.
5. The apparatus of claim 4, the sample vector computation module configured to perform node-embedded vector expression computation on the enterprise network samples using a node2vec algorithm to obtain vectors for expressing enterprise node structural features.
6. The apparatus of claim 4, the model iteration module comprising:
an initial vector input sub-module configured to input the vector of the enterprise node into a breadth-adaptive function based on an attention mechanism for aggregation, so as to obtain the aggregate feature expression, for the next iteration process, of the neighbor nodes aggregated by the enterprise node;
an initial feature updating sub-module configured to input the vector of the enterprise node and the aggregate feature expression, for the next iteration process, of the neighbor nodes aggregated by the enterprise node into a depth-adaptive function based on an LSTM operator, so as to obtain the feature expression of the enterprise node for the next iteration process;
an iteration process updating sub-module configured to update a next iteration process to a current iteration process;
a relation feature combination sub-module configured to combine the feature expression of the current iteration process of the enterprise node, the feature expression of the current iteration process of the neighbor node of the enterprise node, and the feature expression of the relation between the enterprise node and the neighbor node to form an upstream-downstream relation feature of the enterprise node;
an upstream and downstream judging sub-module configured to judge the upstream and downstream relationship features of the enterprise node through a fully connected layer to obtain an identification result of the upstream and downstream relationships of the enterprise node;
a loss determination sub-module configured to determine whether the identification result satisfies the minimum loss;
an iteration ending sub-module configured to proceed to the end of the iteration and obtain a trained graph neural network model if the minimum loss is satisfied;
a parameter updating sub-module configured to update parameters of the graph neural network model if the minimum loss is not satisfied;
the neighbor feature updating sub-module is configured to input the feature expression of the current iteration process of the enterprise node into a breadth self-adaptive function based on an attention mechanism for aggregation to obtain an aggregation feature expression of the next iteration process of the neighbor node aggregated by the enterprise node;
a node feature updating sub-module configured to input the feature expression of the enterprise node in the current iteration process and the aggregate feature expression, for the next iteration process, of the neighbor nodes aggregated by the enterprise node into the depth-adaptive function based on the LSTM operator, so as to obtain the feature expression of the enterprise node for the next iteration process, and to re-trigger execution of the iteration process updating sub-module.
7. A method of enterprise upstream and downstream relationship mining, comprising:
carrying out node embedding vector expression calculation on an enterprise network formed by two or more enterprises to obtain respective vectors for expressing the structural characteristics of the enterprise nodes of the two or more enterprise nodes, wherein the vectors for expressing the structural characteristics of the enterprise nodes are used for identifying all neighbor nodes of the enterprise nodes in the enterprise network;
generating a vector for expressing the attribute characteristics of the enterprise node according to the enterprise attribute of the enterprise node;
combining the vector for expressing the structural characteristics of the enterprise node with the vector for expressing the attribute characteristics of the enterprise node to obtain the vector of the enterprise node;
inputting the vectors of the two or more enterprise nodes into a trained graph neural network model obtained by the method for establishing the enterprise upstream and downstream relationship identification model according to claim 1, and outputting the identification result of the upstream and downstream relationship of the two or more enterprise nodes.
8. The method of claim 7, further comprising:
generating respective vectors for expressing the attribute characteristics of the enterprise nodes of the two or more enterprise nodes according to the respective enterprise attributes of the two or more enterprise nodes;
and combining the vector for expressing the structural characteristics of the enterprise nodes with the vector for expressing the attribute characteristics of the enterprise nodes for the two or more enterprise nodes respectively to obtain the respective vectors of the two or more enterprise nodes.
9. The method of claim 7, further comprising:
acquiring enterprise transaction data;
calculating shipping dispersion and transaction frequency of enterprises according to the enterprise transaction data;
judging enterprises belonging to the sales substituting mode by utilizing the shipping dispersion and the transaction frequency;
and outputting the enterprise belonging to the sales substituting mode and the buyer enterprise corresponding to the transaction as an upstream-downstream relationship.
10. The method of claim 7, further comprising:
acquiring enterprise transaction data;
calculating the transaction frequency of enterprises and the average number of commodities per transaction according to the enterprise transaction data;
judging the enterprises belonging to the commodity feeding mode by utilizing the transaction frequency and the average number of commodities per transaction;
and outputting the enterprise belonging to the commodity feeding mode and the buyer enterprise corresponding to the transaction as an upstream-downstream relationship.
11. The method of claim 7, the recognition result being a score of an upstream-downstream relationship of the two or more enterprise nodes, the method further comprising:
and sorting the upstream and downstream relations of the two or more enterprise nodes according to the scores of the upstream and downstream relations of the two or more enterprise nodes to obtain the upstream and downstream relations of the enterprises with different confidence degrees.
12. An apparatus for enterprise upstream and downstream relationship mining, comprising:
the node vector calculation module is configured to perform node embedding vector expression calculation on an enterprise network formed by two or more enterprises to obtain respective vectors for expressing enterprise node structural features of the two or more enterprise nodes, wherein the vectors for expressing the enterprise node structural features are used for identifying all neighbor nodes of the enterprise nodes in the enterprise network;
a node attribute feature calculation module configured to generate a vector for expressing an attribute feature of the enterprise node for each of the two or more enterprise nodes according to the enterprise attribute of each of the two or more enterprise nodes;
a node vector combination module configured to combine, for the two or more enterprise nodes, a vector for expressing the structural features of the enterprise nodes and a vector for expressing the attribute features of the enterprise nodes, respectively, to obtain respective vectors of the two or more enterprise nodes;
an identification module configured to input the vectors of the two or more enterprise nodes into a trained graph neural network model obtained by the method for establishing the enterprise upstream and downstream relationship identification model according to claim 1, and to output the identification result of the upstream and downstream relationship of the two or more enterprise nodes.
13. The apparatus of claim 12, further comprising:
a transaction data acquisition module configured to acquire enterprise transaction data;
a sales substituting feature calculation module configured to calculate shipping dispersion and transaction frequency of enterprises according to the enterprise transaction data;
a sales substituting mode judging module configured to judge enterprises belonging to the sales substituting mode by utilizing the shipping dispersion and the transaction frequency;
and a sales substituting relationship output module configured to output the enterprise belonging to the sales substituting mode and the buyer enterprise corresponding to the transaction as an upstream-downstream relationship.
14. The apparatus of claim 12, further comprising:
a transaction data acquisition module configured to acquire enterprise transaction data;
a commodity feeding feature calculation module configured to calculate the transaction frequency of enterprises and the average number of commodities per transaction according to the enterprise transaction data;
a commodity feeding mode judging module configured to judge the enterprises belonging to the commodity feeding mode by utilizing the transaction frequency and the average number of commodities per transaction;
and a commodity feeding relationship output module configured to output the enterprise belonging to the commodity feeding mode and the buyer enterprise corresponding to the transaction as an upstream-downstream relationship.
15. A computing device, comprising:
a memory and a processor;
the memory is for storing computer-executable instructions, and the processor is for executing the computer-executable instructions:
acquiring an enterprise network sample;
performing node embedding vector expression calculation on the enterprise network sample to obtain a vector for expressing the structural characteristics of the enterprise node, wherein the vector for expressing the structural characteristics of the enterprise node is used for identifying all neighbor nodes of the enterprise node in the enterprise network;
generating a vector for expressing the attribute characteristics of the enterprise node according to the enterprise attribute of the enterprise node;
combining the vector for expressing the structural characteristics of the enterprise node with the vector for expressing the attribute characteristics of the enterprise node to obtain the vector of the enterprise node;
the vector of the enterprise node is input into a graph neural network model for iteration, and the graph neural network model uses the upstream and downstream relation characteristics of the enterprise node to train the upstream and downstream relation identification of the enterprise node in the iteration process, wherein the upstream and downstream relation characteristics of the enterprise node comprise characteristic expression of the enterprise node, characteristic expression of neighbor nodes of the enterprise node and characteristic expression of the relation between the enterprise node and the neighbor nodes;
and obtaining a trained graph neural network model after the iteration ends.
16. A computer readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the method of enterprise upstream and downstream relationship identification model establishment of any one of claims 1 to 3.
17. A computing device, comprising:
a memory and a processor;
the memory is for storing computer-executable instructions, and the processor is for executing the computer-executable instructions:
carrying out node embedding vector expression calculation on an enterprise network formed by two or more enterprises to obtain respective vectors of the two or more enterprise nodes for expressing structural characteristics of the enterprise nodes;
inputting the vectors of the two or more enterprise nodes into a trained graph neural network model obtained by the method for establishing the enterprise upstream and downstream relationship identification model according to claim 1, and outputting the identification result of the upstream and downstream relationship of the two or more enterprise nodes.
18. A computer readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the method of enterprise upstream and downstream relationship mining of any one of claims 7 to 11.
CN202010153608.3A 2020-03-06 2020-03-06 Method and device for establishing enterprise upstream and downstream relationship identification model and mining relationship Active CN111382843B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010153608.3A CN111382843B (en) 2020-03-06 2020-03-06 Method and device for establishing enterprise upstream and downstream relationship identification model and mining relationship

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010153608.3A CN111382843B (en) 2020-03-06 2020-03-06 Method and device for establishing enterprise upstream and downstream relationship identification model and mining relationship

Publications (2)

Publication Number Publication Date
CN111382843A CN111382843A (en) 2020-07-07
CN111382843B true CN111382843B (en) 2023-10-20

Family

ID=71217219

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010153608.3A Active CN111382843B (en) 2020-03-06 2020-03-06 Method and device for establishing enterprise upstream and downstream relationship identification model and mining relationship

Country Status (1)

Country Link
CN (1) CN111382843B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111915191A (en) * 2020-08-03 2020-11-10 支付宝(杭州)信息技术有限公司 Industrial chain identification method and device
CN112015907A (en) * 2020-08-18 2020-12-01 大连东软教育科技集团有限公司 Method and device for quickly constructing discipline knowledge graph and storage medium
CN112035683A (en) * 2020-09-30 2020-12-04 北京百度网讯科技有限公司 User interaction information processing model generation method and user interaction information processing method
CN113836903B (en) * 2021-08-17 2023-07-18 淮阴工学院 Enterprise portrait tag extraction method and device based on situation embedding and knowledge distillation

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106504084A (en) * 2016-11-16 2017-03-15 航天信息股份有限公司 A kind of method and system for recognizing core enterprise in supply chain
CN108182295A (en) * 2018-02-09 2018-06-19 重庆誉存大数据科技有限公司 A kind of Company Knowledge collection of illustrative plates attribute extraction method and system
CN108288115A (en) * 2018-03-15 2018-07-17 安徽大学 A kind of daily short-term express delivery amount prediction technique of loglstics enterprise
CN109255054A (en) * 2017-07-14 2019-01-22 元素征信有限责任公司 A kind of community discovery algorithm in enterprise's map based on relationship weight
WO2019081781A1 (en) * 2017-10-27 2019-05-02 Deepmind Technologies Limited Graph neural network systems for generating structured representations of objects
WO2019085328A1 (en) * 2017-11-02 2019-05-09 平安科技(深圳)有限公司 Enterprise relationship extraction method and device, and storage medium
CN109933703A (en) * 2019-03-14 2019-06-25 鹿寨知航科技信息服务有限公司 A kind of construction method of Intellectual Property Right of Enterprises appraisal Model
CN110245787A (en) * 2019-05-24 2019-09-17 阿里巴巴集团控股有限公司 A kind of target group's prediction technique, device and equipment
CN110489481A (en) * 2019-08-06 2019-11-22 北京邮电大学 Data analysing method, device and the data analytics server of industry data
CN110555455A (en) * 2019-06-18 2019-12-10 东华大学 Online transaction fraud detection method based on entity relationship
CN110570111A (en) * 2019-08-30 2019-12-13 阿里巴巴集团控股有限公司 Enterprise risk prediction method, model training method, device and equipment
WO2019242125A1 (en) * 2018-06-19 2019-12-26 平安科技(深圳)有限公司 Method and apparatus for acquiring upstream and downstream relationships between companies, terminal device and medium
CN110852856A (en) * 2019-11-04 2020-02-28 西安交通大学 Invoice false invoice identification method based on dynamic network representation

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120046992A1 (en) * 2010-08-23 2012-02-23 International Business Machines Corporation Enterprise-to-market network analysis for sales enablement and relationship building
US10274983B2 (en) * 2015-10-27 2019-04-30 Yardi Systems, Inc. Extended business name categorization apparatus and method
WO2019028261A1 (en) * 2017-08-02 2019-02-07 [24]7.ai, Inc. Method and apparatus for training of conversational agents

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106504084A (en) * 2016-11-16 2017-03-15 航天信息股份有限公司 Method and system for identifying core enterprises in a supply chain
CN109255054A (en) * 2017-07-14 2019-01-22 元素征信有限责任公司 Community discovery algorithm for enterprise graphs based on relationship weight
WO2019081781A1 (en) * 2017-10-27 2019-05-02 Deepmind Technologies Limited Graph neural network systems for generating structured representations of objects
WO2019085328A1 (en) * 2017-11-02 2019-05-09 平安科技(深圳)有限公司 Enterprise relationship extraction method and device, and storage medium
CN108182295A (en) * 2018-02-09 2018-06-19 重庆誉存大数据科技有限公司 Enterprise knowledge graph attribute extraction method and system
CN108288115A (en) * 2018-03-15 2018-07-17 安徽大学 Daily short-term express delivery volume prediction method for logistics enterprises
WO2019242125A1 (en) * 2018-06-19 2019-12-26 平安科技(深圳)有限公司 Method and apparatus for acquiring upstream and downstream relationships between companies, terminal device and medium
CN109933703A (en) * 2019-03-14 2019-06-25 鹿寨知航科技信息服务有限公司 Construction method of an enterprise intellectual property evaluation model
CN110245787A (en) * 2019-05-24 2019-09-17 阿里巴巴集团控股有限公司 Target group prediction method, device and equipment
CN110555455A (en) * 2019-06-18 2019-12-10 东华大学 Online transaction fraud detection method based on entity relationship
CN110489481A (en) * 2019-08-06 2019-11-22 北京邮电大学 Data analysis method and device for industry data, and data analysis server
CN110570111A (en) * 2019-08-30 2019-12-13 阿里巴巴集团控股有限公司 Enterprise risk prediction method, model training method, device and equipment
CN110852856A (en) * 2019-11-04 2020-02-28 西安交通大学 Method for identifying falsely issued invoices based on dynamic network representation

Non-Patent Citations (6)

Title
Mengyuan Chen, et al. Inference for network structure and dynamics from time series data via graph neural network. arXiv. 2020, pp. 1-10. *
Pau Riba, et al. Table detection in invoice documents by graph neural networks. 2019 International Conference on Document Analysis and Recognition. 2020, pp. 1-5. *
Stephan M. Wagner, et al. Assessing the vulnerability of supply chains using graph theory. International Journal of Production Economics. 2010, pp. 121-129. *
Sub2Vec: feature learning for subgraphs; Bijaya Adhikari, et al.; Advances in Knowledge Discovery and Data Mining; pp. 170-182 *
Link prediction in scientist collaboration networks based on subgraph features; Xu Shuang et al.; Journal of Dalian Minzu University; Vol. 22 (No. 1); pp. 51-63 *
Peng Zhizhong. Analysis of the application of neural network forecasting techniques in supply chain demand forecasting. China Business and Market. 2007, (No. 12), pp. 15-17. *

Also Published As

Publication number Publication date
CN111382843A (en) 2020-07-07

Similar Documents

Publication Publication Date Title
CN111382843B (en) Method and device for establishing enterprise upstream and downstream relationship identification model and mining relationship
US20220335501A1 (en) Item recommendations using convolutions on weighted graphs
CN110009174B (en) Risk recognition model training method and device and server
CN108320171A (en) Hot item prediction method, system and device
CN112633927B (en) Combined commodity mining method based on knowledge graph rule embedding
AU2020260401A1 (en) Prospect recommendation
Malik et al. EPR-ML: E-Commerce Product Recommendation Using NLP and Machine Learning Algorithm
CN111695024A (en) Object evaluation value prediction method and system, and recommendation method and system
Zhang et al. Improvement of collaborative filtering recommendation algorithm based on intuitionistic fuzzy reasoning under missing data
CN114723535A (en) Supply chain and knowledge graph-based item recommendation method, equipment and medium
CN113506173A (en) Credit risk assessment method and related equipment thereof
Barik et al. A blockchain-based evaluation approach to analyse customer satisfaction using AI techniques
Zhang et al. Combination classification method for customer relationship management
Vaca et al. Buy & sell trends analysis using decision trees
Nagaraju et al. Methodologies used for customer churn detection in customer relationship management
Agarwal et al. A Comparative Study and enhancement of classification techniques using Principal Component Analysis for credit card dataset
Arutjothi et al. Assessment of probability defaults using K-means based multinomial logistic regression
CN115169960A (en) Supply chain risk control processing method and equipment
Ullah et al. Predicting Default Payment of Credit Card Users: Applying Data Mining Techniques
CN113762415A (en) Neural network-based intelligent matching method and system for automobile financial products
Kanamarlapudi et al. Classification and Prediction of Financial Datasets Using Genetic Algorithms
Kumar et al. Review and Analysis of Stock Market Data Prediction Using Data mining Techniques
KR20200029647A (en) Generalization method for curated e-Commerce system by user personalization
Chua et al. AI To Predict Price Movements in the Stock Market
Dinavahi et al. Customer Segmentation in Retailing using Machine Learning Techniques

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant