CN116542323A - Training prediction method, system and storage medium for multivalent value chain evolution

Training prediction method, system and storage medium for multivalent value chain evolution

Info

Publication number
CN116542323A
CN116542323A (application CN202310580677.6A)
Authority
CN
China
Prior art keywords
client
node
local
server
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310580677.6A
Other languages
Chinese (zh)
Inventor
唐小川
刘鑫
王宇
李莹莹
胡强
李冬芬
樊超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu University of Technology
Original Assignee
Chengdu University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu University of Technology
Priority to CN202310580677.6A
Publication of CN116542323A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/098 Distributed learning, e.g. federated learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/042 Knowledge-based neural networks; Logical representations of neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0637 Strategic management or analysis, e.g. setting a goal or target of an organisation; Planning actions based on goals; Analysis or evaluation of effectiveness of goals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks, e.g. Hopfield networks, characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Development Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a training and prediction method, system and storage medium for multivalent value chain evolution, comprising the following steps: a feature vector of a node is generated based on the local raw data of a client and of its neighboring clients, and the feature vectors and node information of the nodes of all clients together form distributed temporal graph data; each client computes the local network parameters of a pre-established distributed federated temporal graph neural network model at the current time and sends them to a server; the server derives global network parameters from the local network parameters and broadcasts them to all clients; each client updates the network parameters of the distributed federated temporal graph neural network model with the global network parameters, and the model then receives the distributed temporal graph data of the next time step for learning and outputs the predicted values of the edges at the next time step. In this way, future relationships between the companies in a multivalent value chain, and whether a given company is about to fail, can be predicted.

Description

Training prediction method, system and storage medium for multivalent value chain evolution
Technical Field
The invention belongs to the field of artificial intelligence, and in particular relates to a training and prediction method, system and storage medium for multivalent value chain evolution.
Background
This section is intended to provide a background or context to the embodiments of the invention that are recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.
Owing to major improvements in machine hardware performance, neural-network-based deep learning methods have been very successful in fields such as intelligent recommendation and network performance prediction. However, existing prediction methods rarely consider dynamic time delays: they predict under a given, fixed topology and cannot output time-varying information.
As various social actors pay increasing attention to issues such as data compliance, trade secrets, information security and business competition, the upstream and downstream companies, enterprises and individuals in a multivalent value chain are unwilling to disclose their own raw data. As a result, the degree of data sharing among upstream and downstream parties is low, a large amount of data goes underused, and it is difficult for any company, enterprise or individual in the multivalent value chain to learn the relationships among the other parties. How to predict the relationships between the parties without acquiring their raw data is therefore an urgent problem for every enterprise in the value chain.
The relationships among the parties in a multivalent value chain change over time, and include, without limitation, cooperation relationships, supply-and-demand flows, business flows and value flows. If two parties have a good foundation of cooperation, their cooperative relationship may strengthen, yet conventional methods cannot predict the cooperative relationship between two parties after a period of time. On this basis, it is difficult for each party to know the cooperative relationships among the other parties. Along the time dimension, it is therefore impossible to predict the future relationships between parties in a multivalent value chain system, or whether a given party is about to fail.
A multivalent value chain contains an extremely large number of nodes, and one node in the graph may even correspond to several clients. When the number of graph nodes is large, inter-node communication and the message passing and back propagation of a distributed graph neural network must run over the Internet; the communication volume becomes very large and very time-consuming. Node degrees also differ widely: in a scale-free network, for instance, node degrees follow a heavy-tailed distribution in which a few nodes have very large degree while most nodes have small degree. Since the number of messages a node sends and receives is proportional to its degree, a high-degree node becomes the performance bottleneck of the whole distributed temporal graph neural network, and the excessive time cost cannot be overcome simply by deleting nodes.
How to predict the relationships between companies in a future multivalent value chain system, and whether a company is about to fail, is therefore an urgent issue to be resolved.
Disclosure of Invention
In order to solve the above problems in the prior art, a training and prediction method, system and storage medium for multivalent value chain evolution are provided, with which these problems can be solved.
The present invention provides the following.
In one embodiment, the present invention provides a training and prediction method for multivalent value chain evolution, including: generating, based on the local raw data of a client $c_i$ and of its neighboring clients $c_j$, the feature vector $h_i$ of the node $v_i$ at which client $c_i$ is located, the feature vector $h_i$ characterizing the information of the edges between client $c_i$ and the nodes of its respective neighboring clients $c_j$. The feature vectors and node information of the nodes of all clients form the distributed temporal graph data $G_t(V, E_t)$, where $V$ is the set of nodes $v_i$, each node $v_i$ representing a client, i.e. an enterprise; $E_t$ is the set of edges $e_{ij}$, each edge $e_{ij}$ representing whether a cooperation relationship, supply-and-demand flow, business flow or value flow exists between nodes $v_i$ and $v_j$; and $t$ denotes the time step, taking values between 0 and $T+1$. Because a distributed federated temporal graph neural network is used, the network of each node is one component of the whole distributed federated temporal graph neural network, the neural network parameters obtained by each client's local training differ, and no party in the network, whether client or server, knows the complete distributed temporal graph data. The client computes, from the feature vector $h_i$ of its node, the local network parameters of the distributed federated temporal graph neural network model at the current time and sends them to a server, the client having established the distributed federated temporal graph neural network model in advance; the server derives global network parameters from the local network parameters and broadcasts them to all clients; the client updates the network parameters of the distributed federated temporal graph neural network model with the global network parameters, inputs the distributed temporal graph data of the next time step into the updated model for learning, and outputs the predicted value of edge $e_{ij}$ at the next time step. Preferably, the prediction for edge $e_{ij}$ is its label vector $y_{ij}$. Preferably, the next time step is time $t+1$.
One of the advantages of the above embodiment is that the node of each client knows only the edges between itself and its neighboring nodes, while the server knows only the information of all nodes; the feature transformation matrix $W_l(t)$ and the local loss function $L(t)$ obtained through the federated temporal graph neural network implicitly encode the transfer relationships of each node on the multivalent value chain at time $t$, and neither the server nor any node holds edge information between other nodes.
In another embodiment, the present invention provides a training and prediction method for multivalent value chain evolution, wherein generating the feature vector of the node at which a client is located based on the local raw data of the client and its neighboring clients includes: at time $t$, each client $c_i$ generates the feature vector $h_i$ of its node $v_i$ based on its local raw data; client $c_i$ contacts its neighboring clients $c_j$ and generates a neighbor list $N(i)$ of the neighboring nodes willing to participate in prediction, where $i$ is the sequence number of the node and $j$ is the sequence number of a neighboring node.
In another embodiment, the present invention provides a training and prediction method for multivalent value chain evolution, the method further comprising:
the client c i Updating the own feature vector by the following aggregation formula:
wherein,,representing the saidClient c i Feature vector of the node where the node is located, +.>Representing the neighbor client c j The feature vector of the node where the end is located, i represents the sequence number of the node, j represents the sequence number of the neighbor node, and the Aggregate aggregation mode comprises summation and/or averaging and/or maximization.
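By way of illustration only, this aggregation step can be sketched as follows in Python (PyTorch); the function name, tensor shapes and the inclusion of the node's own feature vector are assumptions of this sketch, not details fixed by the patent text.

```python
import torch

def aggregate_neighbors(h_i: torch.Tensor, neighbor_feats: torch.Tensor,
                        mode: str = "mean") -> torch.Tensor:
    """Update a node's feature vector from the feature vectors of its neighbors N(i).

    h_i:            shape (d,)   -- this client's current node feature vector
    neighbor_feats: shape (k, d) -- feature vectors received from the k neighbors
    mode:           "sum", "mean" or "max" -- the Aggregate variants named above
    """
    stacked = torch.cat([h_i.unsqueeze(0), neighbor_feats], dim=0)  # self included (an assumption)
    if mode == "sum":
        return stacked.sum(dim=0)
    if mode == "mean":
        return stacked.mean(dim=0)
    if mode == "max":
        return stacked.max(dim=0).values
    raise ValueError(f"unknown aggregation mode: {mode}")

# Usage: 4-dimensional features, two responding neighbors.
h_i = torch.tensor([0.1, 0.2, 0.3, 0.4])
neighbors = torch.tensor([[0.0, 1.0, 0.5, 0.2],
                          [0.3, 0.1, 0.9, 0.7]])
h_i_new = aggregate_neighbors(h_i, neighbors, mode="mean")
```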
In another embodiment, the invention provides a training and prediction method for multivalent value chain evolution, wherein the server records the response time each client takes to complete forward propagation, sorts all clients in descending order, selects the nodes of the $\theta$ clients with the longest response times for regularization, and notifies those $\theta$ nodes to reduce their number of neighbors.
In another embodiment, the present invention provides a training and prediction method for multivalent value chain evolution, the method further comprising: when client $c_i$ aggregates the feature vectors of the nodes of its neighboring clients $c_j$, client $c_i$ learns a correlation coefficient $a_{ij}$ for each neighboring node $v_j \in N(i)$ of its node and updates its feature vector with the following formula:

$$h_i(t) = \mathrm{Aggregate}\big(\{\, a_{ij}\,\mathrm{Linear}\big(W_l(t)\, h_j(t)\big) \mid v_j \in N(i) \,\}\big)$$

Through the regularization term $R(A)$ in the final loss function, the correlation coefficients of most neighbors are driven toward 0 while the absolute values of the correlation coefficients of a few neighbors remain far greater than 0, where $A$ is the correlation coefficient matrix, $a_{ij}$ is a correlation coefficient, $h_j(t)$ denotes the feature vector of the node of the neighboring client, $i$ denotes the sequence number of the node, $j$ denotes the sequence number of a neighboring node, $W_l(t)$ denotes the local feature transformation matrix, and Linear denotes a linear operation.
In another embodiment, the present invention provides a training and prediction method for multivalent value chain evolution, the method further comprising:

The regularization includes $\ell_1$ regularization,

$$R(A) = \sum_{i,j} |a_{ij}|,$$

and $\ell_{2,1}$ regularization,

$$R(A) = \sum_i \sqrt{\textstyle\sum_j a_{ij}^2},$$

where $a_{ij} \in A$. Important neighbors are retained and a large number of unimportant neighbors are removed through the regularization method, thereby adaptively reducing the number of neighbors of the $\theta$ nodes $v_i$ and reducing the communication volume of client $c_i$. The neighbor node set $N(i)$ is updated to the $k$ neighbors with the highest correlation, so that the few important neighboring nodes are kept and the unimportant ones are removed.
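A minimal sketch of the two regularizers and the top-k neighbor pruning they enable is given below; treating $A$ as a dense matrix, and all helper names, are assumptions made for illustration.

```python
import torch

def l1_reg(A: torch.Tensor) -> torch.Tensor:
    # R(A) = sum_{i,j} |a_ij|
    return A.abs().sum()

def l21_reg(A: torch.Tensor) -> torch.Tensor:
    # R(A) = sum_i sqrt(sum_j a_ij^2), one l2 norm per row of A
    return A.pow(2).sum(dim=1).sqrt().sum()

def prune_neighbors(a_i: torch.Tensor, neighbor_ids: list, k: int) -> list:
    """Keep only the k neighbors with the largest |a_ij|; N(i) shrinks accordingly."""
    top = torch.topk(a_i.abs(), k=min(k, a_i.numel())).indices
    return [neighbor_ids[j] for j in top.tolist()]

# Usage: after training drives most coefficients toward 0, prune to k = 2.
a_i = torch.tensor([0.01, 0.9, -0.02, 0.7, 0.03])
print(prune_neighbors(a_i, ["B", "C", "D", "E", "F"], k=2))  # -> ['C', 'E']
```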
In another embodiment, the present invention provides a training and prediction method for multivalent value chain evolution, where the local network parameters generated by client $c_i$ include the local weight $W_E(t-1)$, the local feature transformation matrix $W_l(t-1)$ and the local loss function $L(t-1)$; the server performs back propagation based on the local network parameters sent by each client to train a recurrent neural network and obtain the global network parameters, which include the global weight $\bar{W}_E(t)$, the global feature transformation matrix $\bar{W}_l(t)$ and the global loss function $\bar{L}(t)$, where $t$ denotes the time step.
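The following sketch illustrates one way a server could fuse per-client parameters and evolve them over time with a recurrent cell; the class name, the flattening of parameters into vectors, and the simple averaging before the recurrent step are assumptions, since the patent does not fix these details.

```python
import torch
import torch.nn as nn

class GlobalParameterServer(nn.Module):
    """Fuses per-client local parameters and evolves the result over time with a GRU,
    so that the global parameters at time t depend on those at time t-1."""
    def __init__(self, dim: int):
        super().__init__()
        self.rnn = nn.GRUCell(dim, dim)       # one recurrent step per time t
        self.state = torch.zeros(1, dim)      # hidden state carries the t-1 history

    def step(self, local_params: torch.Tensor) -> torch.Tensor:
        # local_params: (n_clients, dim), e.g. flattened W_E or W_l from each client
        fused = local_params.mean(dim=0, keepdim=True)  # simple federated averaging
        self.state = self.rnn(fused, self.state)
        return self.state.squeeze(0)                    # broadcast as the global parameter

# Usage: three clients, 8-dimensional flattened parameter vectors.
server = GlobalParameterServer(dim=8)
w_global_t = server.step(torch.randn(3, 8))
```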
In another embodiment, the present invention provides a training and prediction method for multivalent value chain evolution, where client $v_i$ generates the feature vector $h_i^l(t)$ of its node from its local raw data, $l$ denoting the layer of the distributed federated temporal graph neural network model, $i$ the sequence number of the node and $t$ the time step, preferably $t \in [1, T]$ or $t \in [1, T-1]$;

the client establishes the distributed federated temporal graph neural network model and, from the feature vector $h_i^l(t-1)$ of node $v_i$, computes the local weight $W_E(t-1)$, the local feature transformation matrix $W_l(t-1)$, the local loss function $L(t-1)$ and the feature vector $h_{ij}^l(t-1)$ of the edge $e_{ij}$ connected to node $v_i$, where $e_{ij}(t-1)$ denotes the edge between node $v_i$ and neighboring node $v_j$, $v_j \in N(i)$;

the server $S$ obtains the local weight $W_E(t-1)$, local feature transformation matrix $W_l(t-1)$ and local loss function $L(t-1)$ sent by client $c_i$ and performs back propagation to obtain the global weight $\bar{W}_E(t)$, the global feature transformation matrix $\bar{W}_l(t)$ and the global loss function $\bar{L}(t)$, where $t$ denotes the time step; preferably, the server uses the Adam algorithm for back propagation to obtain the global network parameters, which include the global weight $\bar{W}_E(t)$, the global feature transformation matrix $\bar{W}_l(t)$ and/or the global loss function $\bar{L}(t)$;

each node $v_i$ updates its feature vector: each client receives the global weight $\bar{W}_E(t)$, global feature transformation matrix $\bar{W}_l(t)$ and global loss function $\bar{L}(t)$ sent by the server, where $t$ denotes the time step, and performs back propagation to obtain the pre-trained distributed federated temporal graph neural network. Preferably, $t$ runs from 0 to $T$ in sequence.
In another embodiment, the invention provides a training and prediction method for multivalent value chain evolution, wherein the distributed federated temporal graph neural network model comprises a feature learning network and a label prediction network; the feature learning network consists mainly of message passing layers, which implement graph convolution and graph attention computations, and the label prediction network consists of a fully connected layer that generates the predicted value $\hat{y}_{ij}(t)$ of edge $e_{ij}$.
In another embodiment, the present invention provides a training and prediction method for multivalent value chain evolution, wherein the global loss is computed by the following formula:

$$\bar{L}(t) = \sum_{i,j} L_{ij}(t) + \lambda \cdot R(A)$$

where $\sum_{i,j} L_{ij}(t)$ is the edge loss, $\lambda \cdot R(A)$ is a regularization term, $i$ is the sequence number of a node, $j$ is the sequence number of one of its neighboring nodes, and $t$ is the time step.
In another embodiment, the present invention provides a training and prediction method for multivalent value chain evolution, wherein the client computes the predicted value of edge $e_{ij}$ from the feature vectors $h_i^l(t)$ and $h_j^l(t)$:

$$\hat{y}_{ij}(t) = \bar{W}_E(t)\,\big[h_i^l(t);\, h_j^l(t)\big]$$

where $h_i^l(t)$ is the feature vector of the node of the client, $h_j^l(t)$ is the feature vector of the node of the neighboring client, $\bar{W}_E(t)$ is the updated global weight, $e_{ij}$ is the edge, $l$ denotes the layer of the distributed federated temporal graph neural network model, $i$ denotes the sequence number of the node, $j$ denotes the sequence number of a neighboring node of the node, and $t$ denotes the time step. In one embodiment, the client may send the predicted value $\hat{y}_{ij}(t)$ of edge $e_{ij}$ to the server.
In another embodiment, the present invention provides a training and prediction method for multivalent value chain evolution, wherein the client computes the local loss function of edge $e_{ij}$ with the following formula:

$$L_{ij}(t) = \mathrm{Loss}\big(y_{ij}(t),\, \hat{y}_{ij}(t)\big)$$

where $L_{ij}(t)$ is the local loss function of the client, $y_{ij}(t)$ is the label of edge $e_{ij}$, $\hat{y}_{ij}(t)$ is the predicted value of edge $e_{ij}$, $i$ is the sequence number of the node, $j$ is the sequence number of a neighboring node of the node, and $t$ is the time step.
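A compact sketch of the edge label predictor and a local loss is shown below; the concatenation of the two endpoint features, the use of cross-entropy as the loss, and all names are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgePredictor(nn.Module):
    """Fully connected layer scoring edge e_ij from the two endpoint feature vectors;
    its weights play the role of the global weight W_E(t)."""
    def __init__(self, dim: int, n_labels: int):
        super().__init__()
        self.fc = nn.Linear(2 * dim, n_labels)

    def forward(self, h_i: torch.Tensor, h_j: torch.Tensor) -> torch.Tensor:
        return self.fc(torch.cat([h_i, h_j], dim=-1))   # logits of y_hat_ij(t)

def local_edge_loss(logits: torch.Tensor, label: torch.Tensor) -> torch.Tensor:
    # loss between the predicted and the observed edge label (cross-entropy here)
    return F.cross_entropy(logits.unsqueeze(0), label.unsqueeze(0))

# Usage: 8-dimensional node features, binary edge label (edge exists / does not).
predictor = EdgePredictor(dim=8, n_labels=2)
logits = predictor(torch.randn(8), torch.randn(8))
loss = local_edge_loss(logits, torch.tensor(1))
```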
In another embodiment, the invention provides a training and prediction system for multivalent value chain evolution, comprising a server side and a plurality of clients. Each client comprises a data acquisition module, a raw data module, a local parameter module, a data labeling module, a message passing module, a local training module and a blockchain, wherein the data acquisition module is used to collect and record the raw data of the client, the raw data module is used to store the raw data of the client, the local parameter module is used to store all parameters of the client's local graph neural network, the data labeling module labels the data for training the graph neural network, the message passing module supports highly concurrent communication between the client and the communication module of the server and records information on the chain, the local training module executes all computation of the client, including computing feature vectors and the training and prediction of the client's federated temporal graph neural network, and the blockchain records all communication data between the server and the clients. The server side comprises a metadata module, a parameter database module, a parameter management module, a communication module, a global training module and a blockchain, wherein the metadata module is used to manage and store metadata, the parameter database module stores all parameters of the server's recurrent neural network, the parameter management module manages and updates the parameters of the graph neural network, the communication module supports highly concurrent communication between the server and the message passing modules of the multiple clients and records information on the chain, the global training module fuses the local network parameters sent by the multiple clients, and the blockchain records all communication data between the server and the clients. The server side and the clients each comprise at least one processor and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method according to any one of the embodiments of the present invention.
In another embodiment, the invention provides a computer readable storage medium storing a program which, when executed by a multi-core processor, causes the multi-core processor to perform a method according to any of the embodiments of the invention.
Other advantages of the present invention will be explained in more detail in connection with the following description and accompanying drawings.
It should be understood that the foregoing description is only an overview of the technical solutions of the present invention, so that the technical means of the present invention may be more clearly understood and implemented in accordance with the content of the specification. The following specific embodiments of the present invention are described in order to make the above and other objects, features and advantages of the present invention more comprehensible.
Drawings
The advantages and benefits described herein, as well as other advantages and benefits, will become apparent to those of ordinary skill in the art upon reading the following detailed description of the exemplary embodiments. The drawings are only for purposes of illustrating exemplary embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 is a schematic diagram of a system architecture according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a distributed federal timing diagram neural network architecture according to an embodiment of the present invention.
In the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
In the description of embodiments of the present application, it should be understood that terms such as "comprises" or "comprising" are intended to indicate the presence of features, numbers, steps, acts, components, portions or combinations thereof disclosed in the present specification, and are not intended to exclude the possibility of the presence of one or more other features, numbers, steps, acts, components, portions or combinations thereof.
Unless otherwise indicated, "/" means "or"; for example, A/B may represent A or B. Herein, "and/or" merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, both A and B exist, or B exists alone.
The terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first", "a second", etc. may explicitly or implicitly include one or more such feature. In the description of the embodiments of the present application, unless otherwise indicated, the meaning of "a plurality" is two or more.
All code in the present invention is exemplary and variations will occur to those skilled in the art depending upon the programming language used, the specific needs and personal habits, etc., without departing from the spirit of the invention.
In order to clearly illustrate the embodiments of the present application, concepts that may appear in some of the following embodiments will first be described.
Raw data are the actual data collected by each party in the multivalent value chain over a specific period, and comprise features and samples. Raw data may be collected according to rules and methods specified by those skilled in the art, and record the client's data in the form of tensors or tables. For example, feature 1 may be automobile sales, with samples comprising each client's automobile sales data for month 1, month 2, month 3, and so on; feature 2 may be automobile inventory, with samples comprising each client's automobile inventory data for month 1, month 2, month 3, and so on; feature 3 may be parts sales and feature 4 parts inventory, with corresponding monthly samples, and so on. It should be apparent to those skilled in the art that the features and samples referred to herein are not limited to the above list: features may be some or all of the data actually collected or recorded by each client, and samples may likewise be some or all of those data. The features, dimensions and types of the raw data may differ between clients and between groups of clients; clients may collect raw data monthly, weekly, or in real time, as those skilled in the art may freely arrange according to actual needs. Metadata are data describing objects such as information resources or data, and serve to: identify a resource; evaluate a resource; track changes to a resource during use; enable simple and efficient management of large amounts of networked data; and enable effective discovery, retrieval, integrated organization and management of information resources. Metadata are therefore data that can be disclosed. Metadata include the features, size, dimensions, attributes, generation time, type, shape, client identifier, variable names and dimensionality of the input data. Metadata are data that describe other data, or structured data providing information about a resource, and may be generated by a client or by the server side from the input data or the raw data of each client.
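For illustration, a client's raw data and the metadata derived from it might look like the following sketch; all field names and values are invented examples, not data from the patent.

```python
# Hypothetical raw-data table held locally by one client (all names/values invented):
raw_data = {
    "car_sales":       [120.0, 135.0, 150.0],   # samples for month 1, 2, 3
    "car_inventory":   [300.0, 280.0, 260.0],
    "parts_sales":     [45.0, 50.0, 48.0],
    "parts_inventory": [500.0, 490.0, 510.0],
}

# Metadata that could be shared with the server: structure only, no raw values.
metadata = {
    "client_id":   "client_A",
    "features":    list(raw_data.keys()),
    "n_samples":   3,
    "granularity": "monthly",
    "dtype":       "float",
}
```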
With the prediction method based on multivalent value chain evolution of the present invention, clients only need to share metadata rather than raw data, which protects the data privacy and security of enterprises and makes certain kinds of cooperation among the enterprises in the multivalent value chain safely possible without sharing data.
The invention provides a distributed federated temporal graph neural network that adopts a server-and-multiple-clients distributed architecture, in which the server mainly manages parameters and coordinates the clients that together form the distributed federated temporal graph neural network. The federated temporal graph neural network corresponds to one graph at each time step; the nodes of the graph are clients, i.e. companies, with clients corresponding to companies, and the edges are the relationships between the clients or companies, including cooperation, trade and supply-and-demand relationships. As time passes, the topology of the graph changes and its edges change. The federated temporal graph neural network constructed by the invention allows companies to predict the edges of future graphs, and to predict whether a company will fail, without sharing their raw data.
The present invention will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Referring first to FIG. 1, a schematic diagram of an environment 100 in which an exemplary implementation according to the present disclosure may be used is schematically illustrated.
Fig. 1 shows a schematic diagram of an example of a computing device 100 according to an embodiment of the disclosure. It should be noted that fig. 1 is a schematic structural diagram of the hardware operating environment of the prediction method and device architecture for a multivalent value chain. In embodiments of the invention, the client device may be a terminal device such as a PC or a portable computer.
As shown in fig. 1, in the "server-multi-client" distributed architecture, the client is configured to locally train, using its raw data, a sub-model representing the nodes and edges of the current graph. Preferably, the client comprises a raw data module, a local parameter module, a data acquisition module, a data labeling module, a message passing module and a local training module, wherein the raw data module stores the client's raw data, the data acquisition module collects and records the client's raw data, the local parameter module stores all parameters of the client's local graph neural network, the data labeling module labels the data for training the graph neural network, the message passing module supports highly concurrent communication with the communication module of the server and records information on the chain, the local training module trains the client's federated temporal graph neural network, and the blockchain records all communication between the server and the clients. The server manages the global parameters of the graph neural network and preferably comprises a metadata module, a parameter database module, a parameter management module, a communication module and a global training module, wherein the metadata module manages and stores metadata, the parameter database module stores all parameters of the server graph neural network, the parameter management module manages and updates the parameters of the graph neural network, the communication module supports highly concurrent communication with the message passing modules of multiple clients and records information on the chain, the global training module fuses the data sent by the multiple clients, and the blockchain records all communication between the server and the clients for tracing information leaks and preventing repudiation. In one embodiment of the invention, each graph neural network is trained locally at a client, and the server mainly performs parameter processing without a graph neural network, the processing including averaging. In another embodiment, each graph neural network is trained locally at a client, and the server uses a neural network for parameter processing, including fitting and/or predicting parameters with the neural network. The client and the server may further include: a processor such as a CPU, a network interface, a user interface, a memory and a communication bus, wherein the communication bus realizes connection and communication among the components. The user interface may comprise a display and an input unit such as a keyboard, and optionally a standard wired interface and a wireless interface.
The network interface may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface, bluetooth interface, 5G interface). The memory may be a high-speed RAM memory or a stable memory (non-volatile memory), such as a disk memory. The memory may alternatively be a storage device separate from the aforementioned processor.
Those skilled in the art will appreciate that the structure of the server and client shown in fig. 1 does not limit the server or the client; they may include more or fewer components than illustrated, combine certain components, or arrange the components differently.
Example 1
The evolution state of a node is predicted as shown in fig. 2. The memory, as a computer storage medium, may contain an operating system, a network communication module, a user interface module and a federated temporal graph neural network program. One graph node may be set for each client, with graph nodes corresponding one-to-one to clients. The operating system is a program that manages and controls the hardware and software resources of the client device and supports the running of the federated temporal graph neural network program and other software or programs. The federated temporal graph neural network corresponds to the topology of one graph at each time step; with the nodes unchanged, the edges of the graph change over time. In the server shown in fig. 1, the communication module is mainly used to receive and send requests, data and the like between the server and multiple clients such as client A, client B, client C, client D and client E, while the message passing module of each client is mainly used to receive and send requests, data and the like between that client and the server; the number of clients may be larger or smaller. Preferably, the communication module and the message passing modules exchange data through their respective network interfaces, and the processor may be configured to invoke the federated temporal graph neural network program stored in the memory and perform the following operations:
(1) The client establishes a distributed graph neural network model, which comprises a feature learning network and a label prediction network.
The invention provides a distributed graph neural network architecture with a server-multi-client structure, consisting of one server $S$ and $n$ clients $\{c_1, c_2, \ldots, c_n\}$. A node $v_i$ is established for each client, one enterprise corresponding to one client, and the nodes corresponding to the clients in the multivalent value chain form the temporal graph data $G_t(V, E_t)$. In the present invention, where no conflict arises, the set of temporal graph data $G_t(V, E_t)$ over multiple time steps is called multi-temporal graph data, abbreviated as graph $G$, which exists over the whole network space, or logically. Each client $c_i$ corresponds to one node $v_i$ of graph $G$, so client $c_i$ may also be written as client $v_i$, and the two are interchangeable where no contradiction arises. Because a distributed federated temporal graph neural network is used, the network of each node is one component of the whole distributed federated temporal graph neural network, the neural network parameters obtained by each client's local training differ, and no party in the network, whether client or server, knows the complete distributed temporal graph data. Each client $c_i$ can communicate only with the clients $c_j$ of its neighboring nodes $v_j$ and with the server, where $n$ is a natural number or a positive integer, $i$ denotes the sequence number of a node and takes values between 0 and $n$, and $j$ denotes the sequence number of a neighboring node and takes values between 0 and $n$. In one embodiment, a neighboring client or neighboring node in the present invention is a first-order neighbor. In another embodiment, the neighboring nodes include first-order and/or second-order neighbors. In another embodiment, the neighboring nodes include first-order and/or second-order and/or other higher-order neighbors. Any node $v_i$ constructs on the graph its first feature, namely the structural relationship of the graph: the neighbor list $N(i)$.
In an alternative embodiment, the server sends the clients a request to establish the distributed neural network, the request message including the data resources required for the requested training, preferably metadata such as description information of value flows, supply-and-demand flows and business flows. If client $c_i$ wishes to participate, it returns a feedback message to the server indicating acceptance of the request. In another alternative embodiment, client $c_i$ may actively send the server a request message applying to join the neural network prediction, and the server chooses whether to agree to client $c_i$ joining; if it agrees, client $c_i$ creates a graph neural network. In the present invention, where no contradiction arises, client $c_i$ refers to the client corresponding to node $v_i$; a node sends and receives signals, and performs its computation, through its corresponding client. The distributed graph neural network model includes a feature learning network and a label prediction network. The feature learning network consists mainly of message passing layers: the client receives neighbor messages, and operations such as graph convolution (GNN), graph attention (GAT) and graph Transformer are realized through the message passing layers; the label prediction network comprises a fully connected layer. In the present invention the global loss function may be written $\bar{L}(t)$ or $L^S(t)$, with the same meaning. Preferably, as shown in fig. 2, nodes A, B, C, D and E correspond to clients A, B, C, D and E respectively. The local parameters of client A at time $t-1$ include at least the local feature transformation matrix $W_l(t-1)$, the local weight $W_E(t-1)$ and the local loss function $\mathrm{Loss}_A(t-1)$ in the dashed box beside client A; in the present invention $\mathrm{Loss}_A(t-1)$ denotes the loss function of client A, the local loss function of each node is written $\mathrm{Loss}(t-1)$, abbreviated $L(t-1)$. The local parameters at time $t$ include at least the local feature transformation matrix $W_l(t)$, the local weight $W_E(t)$ and the local loss function $\mathrm{Loss}_A(t)$ in the dashed box beside client A; correspondingly, every other node, such as node B, node C, node D and node E, has its corresponding local parameters at the corresponding times (not shown in the figure). The global parameters in the server include at least the global feature transformation matrix $\bar{W}_l(t-1)$, the global weight $\bar{W}_E(t-1)$ and the global loss function $\bar{L}(t-1)$ at time $t-1$, and the global feature transformation matrix $\bar{W}_l(t)$, global weight $\bar{W}_E(t)$ and global loss function $\bar{L}(t)$ at time $t$, where $t$ denotes the time, moment or sequence index. Node B can communicate only with the server, node A and node C; node D can communicate only with the server, node A, node C and node E; and so on for the other nodes, here and below.
(2) The server obtains a graph node set V.
When the server agrees to client $c_i$ joining the graph neural network, the server sends client $c_i$ an agreement token with which client $c_i$ joins the training of the graph neural network. In an alternative embodiment, client $c_i$ invites neighboring clients that have related data to participate in the modeling, i.e. in training the graph neural network evolution prediction model of the multivalent value chain, and determines the neighbor list $N(i)$ of those willing to participate, the neighbor list not including client $c_i$ itself, where $i$ is the sequence number of the client and a natural number or positive integer. Each client obtains the feature vectors of its neighboring nodes. The distributed, memory-resident temporal graph data $G_t(V, E_t)$ are thus formed logically, where $t = 0, 1, 2, \ldots$ denotes the time or time step; the server does not know the complete temporal graph data $G_t(V, E_t)$. $V$ denotes the set of nodes, each node $v_i$ representing a client, i.e. an enterprise; $E_t$ is the set of edges, each edge $e_{ij}$ representing whether a cooperation relationship, supply-and-demand flow, business flow or value flow exists between nodes $v_i$ and $v_j$, with the label vector corresponding to an edge written $y_{ij}(t)$. At this point the server knows the graph node set $V$ but knows neither the edge set $E_t$ nor the edge label vectors $y_{ij}(t)$. The functions of the clients and the server are as follows:
client c i For processing a node v in the graph G i Related calculation of v i E V, comprising: computing node v i Feature vectors of (i.e. compute and node v) i Connected edge e ij Feature vectors of (a)v j E N (i), predictive node v i With neighbor node v j Weights w of the continuous edges of (2) ij Wherein i represents the node v i J represents the neighbor node v j Is a sequence number of (c). Transmitting and/or receiving graph node v i Feature vector on, graph node v i And the error is counter-propagated. Client c i Neighbor node information v with corresponding graph node j And side information e ij ,v j ∈N i Label of edge->Preferably, the node v i With neighbor node v j Weight w of continuous edge at time t ij I.e. edge tag->
The third-party server coordinates the training process of the whole graph neural network, including: collecting the losses of the nodes and computing the global loss, updating the global parameters with the Adam algorithm, and initiating back propagation on all client sides. As shown in fig. 2, the third-party server, also simply called the server, may be one or more servers; a distributed server cluster may also implement the related functions, and a neural network may also be deployed on the server. In one embodiment, the neural network on the server differs from the clients' neural networks; preferably, it is implemented with any one or more of a recurrent neural network (RNN), LSTM, graph convolution (GNN), graph attention (GAT) or graph Transformer. In another embodiment, when generating the parameters for the current time, the server may take the historical parameter data it holds into account and broadcast the generated current parameters to all clients or to designated clients. In another embodiment, the neural network on the server is similar to the clients' neural networks, is part of the distributed neural network, and corresponds to one node of the distributed neural network. In another embodiment, there is no neural network on the server, which only aggregates parameters and uploads them to all clients at each time step. To protect privacy, the third-party server does not know the global topological structure information of graph $G$: it knows the nodes $V$ of the graph but not the edges $E$, thereby protecting the privacy of each node.
(3) Training the graph neural network.
(1) The client updates the node feature vector;
the method comprises the steps of firstly, respectively utilizing the graph snapshot of each moment to learn graph node characteristics, wherein the graph snapshot of each moment refers to time sequence graph data of a certain moment, then utilizing multi-time phase graph data to learn the evolution characteristics of the node characteristics along with time, and finally utilizing the obtained graph node characteristics to predict graph edges. Specifically, in the r-th iteration, the following operations are performed on the first layer of the graph neural network at the t-th moment:
for each graph node v i There is a client v i Is fully responsible for processing learning tasks on the node. Node v i The feature vector update rule of (1) is:
wherein v is j E N (i) is node v i Is a neighbor node of (a). Node v i Node updating is realized in a distributed mode through the Internet: the node v i To neighbor node v through the internet j Sending a request to obtain v j Node feature vector of (a)v i After the feature vectors of all the neighbor nodes are received, the feature vectors of the neighbor nodes are aggregated to update the feature vectors of the neighbor nodes The optional Aggregate aggregation mode is any one or combination of summation, average, maximum and the like.
Node $v_i$ sends the server a request to obtain the global feature transformation matrix $\bar{W}_l(t)$ at time $t$, which the server computes from $\bar{W}_l(t-1)$; in other words, the weight matrix learns the temporal effect of the multi-temporal graph neural network. The aggregated vector is linearly transformed with $\bar{W}_l(t)$ to obtain the final feature of node $v_i$ at layer $l$:

$$h_i^l(t) = \bar{W}_l(t)\,\mathrm{Aggregate}\big(\{\, h_j^{l-1}(t) \mid v_j \in N(i) \,\}\big)$$

The above steps are repeated to compute the feature vectors of the nodes at every layer.
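The per-layer update a client performs could be sketched as follows; the two callables stand in for the Internet round-trips to the neighbors and to the server, and, like the function names, are assumptions of this sketch.

```python
import torch

def client_layer_update(h_i, request_neighbor_feats, request_global_matrix,
                        layer: int, t: int) -> torch.Tensor:
    """One message-passing layer executed by client v_i at time t.

    request_neighbor_feats(layer, t) -> (k, d) tensor: gathers h_j from each v_j in N(i)
    request_global_matrix(layer, t)  -> (d, d) tensor: fetches the global matrix W_l(t)
    Both callables stand in for Internet round-trips in this sketch.
    """
    neighbor_feats = request_neighbor_feats(layer, t)
    aggregated = torch.cat([h_i.unsqueeze(0), neighbor_feats], dim=0).mean(dim=0)
    W = request_global_matrix(layer, t)
    return W @ aggregated        # linear transform -> final h_i at this layer
```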
(2) The client selects neighbors;
the distributed graph neural network transmits and back propagates messages through the Internet, and has large traffic and long time consumption. The degree of graph nodes varies greatly, for example, the degree of a scaleless network node is exponentially distributed, the degree of a few nodes is large, and the degree of most nodes is small. Since the number of messages sent/received by a node is proportional to the degree of the node, a node with a large degree will become a performance bottleneck of the entire distributed time-sequential neural network. To address this problem, client c when it aggregates neighbor information i For each neighbour node v j E N (i) learning oneCorrelation coefficient a ij The update matrix becomes:
Through the regularization term $R(A)$ in the final global loss function, the correlation coefficients of most neighboring nodes approach 0 while the absolute values of the correlation coefficients of a few neighbors remain far greater than 0. The preferred regularization methods are $\ell_1$ regularization,

$$R(A) = \sum_{i,j} |a_{ij}|,$$

and $\ell_{2,1}$ regularization,

$$R(A) = \sum_i \sqrt{\textstyle\sum_j a_{ij}^2}.$$

The neighbor node set $N(i)$ is then updated to the $k$ neighbors with the highest correlation, keeping the few important neighboring nodes and removing the unimportant ones. This reduces the number of communication messages generated by forward and backward propagation during the training of the distributed graph neural network and markedly improves communication efficiency.
Optionally, the server records the response time each node takes to complete forward propagation, sorts all clients in descending order, selects the $\theta$ nodes with the longest response times for regularization, and notifies those $\theta$ nodes to reduce their number of neighbors.
(3) The client predicts edge labels;
client v i Computing edge e using a full connection layer ij Predictive value of the tag. The client acquires global weight of the edge prediction full connection layer from the serverCalculating edge e ij Preferably, the edge e ij Is a label vector
/>
Wherein the method comprises the steps ofThe distance between the node i and the node j is represented as a distance function, alternatively, the distance function may employ a cosine distance, a euclidean distance, or a mahalanobis distance. In an embodiment, the client c i Or client c j Edge e can be used ij Predicted value of +.>And sending the data to a server. The local edge loss function is
Client v i Will locally edge lossAnd sending the data to a server.
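A sketch of the distance-based edge scoring is given below; squashing the weighted similarity through a sigmoid is an assumption of this sketch, since the text only names the candidate distance functions.

```python
import torch
import torch.nn.functional as F

def edge_score(h_i: torch.Tensor, h_j: torch.Tensor, w_e: float,
               metric: str = "cosine") -> torch.Tensor:
    """Score edge e_ij from a distance between the endpoint features, weighted by w_e."""
    if metric == "cosine":
        dist = 1.0 - F.cosine_similarity(h_i, h_j, dim=0)
    elif metric == "euclidean":
        dist = torch.dist(h_i, h_j, p=2)
    else:
        raise ValueError(f"unknown metric: {metric}")
    # Map "small distance" to "edge likely": the sigmoid squashing is an assumption.
    return torch.sigmoid(torch.as_tensor(w_e) * (1.0 - dist))

# Usage:
score = edge_score(torch.randn(8), torch.randn(8), w_e=2.0, metric="cosine")
```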
(4) The server updates the global parameters;
the server performs global weight matrix according to the t-1 momentCalculate global weight matrix at time t>Alternatively, RNN, GRU, LSTM or transducer is used. Alternatively, the server S will +.>Broadcasting to all clients, and reducing the waiting time of the clients for receiving the global parameters. The server calculates the global penalty as the sum of the node penalty and the edge penalty, i.e
Wherein,,for edge loss, λ·r (a) is a regularization term.
(5) The server initiates back propagation;
the server begins back propagation after computing the global penalty. Parameters of the model were optimized using Adam algorithm. In an embodiment, the neural network on the server is different from the neural network of the client, and preferably, the neural network on the server is implemented by using any one or more of a cyclic graph neural network (RNN), LSTM, graph convolution GNN, graph meaning force GAT, or graph transform. In another embodiment, the neural network on the server is similar to the neural network of the client, is part of a distributed neural network, and corresponds to one node of the distributed neural network. The server is responsible for updating global parameters And->And->k represents the number of global parameters. In an embodiment, the server broadcasts the global parameter to clients and initiates back propagation for all clients. In another embodiment, the server broadcasts the global parameter to clients that decide whether to initiate back propagation. The client is responsible for updating the local parameters, including the correlation coefficient matrix a.
Optionally, a blockchain is established, and the communication traffic between clients and between the server and the clients, such as node feature vectors, parameters and gradients, is stored on the blockchain; this prevents repudiation and supports auditing whether the server or any client has leaked private information.
(4) Predicting labels of edges.
In one embodiment, to predict the label of edge $e_{ij}$ at time $T$, the server uses the graph data $\{G_t(V, E_t)\}$, $t \in \{0, 1, \ldots, T-1\}$, with the trained distributed federated temporal graph neural network model and instructs the clients to input the graph $G_T(V, E) \setminus e_{ij}$ at time $T$; in one embodiment the server finally outputs the predicted value of the label of edge $e_{ij}$, and in another embodiment a client finally outputs it, the client being client $c_i$ or client $c_j$ or a designated client. In another embodiment, to predict the label of edge $e_{ij}$ at time $T+1$, the server uses the graph data $\{G_t(V, E_t)\}$, $t \in \{0, 1, \ldots, T\}$, with the trained distributed federated temporal graph neural network model and instructs the clients to input the graph $G_{T+1}(V, E) \setminus e_{ij}$ at time $T+1$; the server finally outputs the predicted value of the label of edge $e_{ij}$, or, in another embodiment, a client finally outputs it, the client being client $c_i$ or client $c_j$ or a designated client.
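As a high-level sketch of this inference procedure, assuming a hypothetical trained model object with `observe` and `forward_edge` methods (placeholders, not an interface defined by the patent):

```python
def predict_edge(model, snapshots, i: int, j: int):
    """snapshots: graph data G_0 .. G_T, where the last snapshot has edge e_ij removed.

    `model.observe` and `model.forward_edge` are placeholder methods of a hypothetical
    trained federated temporal graph model; the patent does not define this interface.
    """
    for G in snapshots[:-1]:
        model.observe(G)                            # replay the historical snapshots
    return model.forward_edge(snapshots[-1], i, j)  # predicted label of e_ij at time T
```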
Example 2
The evolution state of a node is predicted. A manufacturing multivalent value chain is a complex system in which complex competition-cooperation relationships exist among enterprises. As technology is updated, innovative enterprises keep emerging and technologically lagging enterprises are gradually eliminated, so the evolution state of enterprise nodes is difficult to predict. The following steps use the distributed federated temporal graph neural network to predict the evolution state of the enterprise nodes on the multivalent value chain:
(1) Obtaining the feature vectors of the enterprises in the value chain.
The server S can be a third-party server, and the enterprises in the value chain serve as the clients {v_1, v_2, …, v_n}, where n represents the number of clients. The server sends a request to the clients to establish the distributed graph neural network; the request message includes description information of the data resources required for training, such as value flows, supply-and-demand flows, and traffic flows. If an enterprise client v_i wishes to participate, it sends a feedback message to the server indicating acceptance of the request. v_i then invites its neighbor clients to participate in the modeling and determines the list N(i) of neighbors willing to participate.
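The recruitment handshake just described can be pictured as follows; all message fields are assumptions for illustration, since the patent only states what information the messages carry:

```python
# Illustrative recruitment handshake: the server requests participation and
# each willing enterprise client accepts and recruits its neighbors.

def build_request(required_resources):
    """Server -> clients: ask to establish the distributed graph neural network."""
    return {"type": "establish_gnn", "resources": required_resources}

def client_response(client_id, willing, neighbors):
    """Client -> server: accept/decline, plus the willing-neighbor list N(i)."""
    msg = {"client": client_id, "accept": willing}
    if willing:
        msg["neighbors_willing"] = neighbors
    return msg

req = build_request(["value flow", "supply-demand flow", "traffic flow"])
print(client_response("v1", True, ["v2", "v5"]))
```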
(2) Establishing the distributed graph neural network model.
The distributed graph neural network is divided into a feature learning network and a label prediction network. The feature learning network mainly comprises message-passing layers; operations such as graph convolution and graph attention are implemented through message passing. The label prediction network consists of fully connected layers. The architecture of the distributed graph neural network is a client-server structure consisting of a server side and n clients {c_1, c_2, …, c_n}. Each client corresponds to the temporal graph data G_t(V, E_t), where t = 0, 1, 2, … is the time; V is the set of nodes v_i, each node v_i representing an enterprise. In the invention, when nodes correspond one-to-one with clients, node v_i, client v_i, and client c_i may be used interchangeably; when one node corresponds to multiple clients, node v_i and client v_i must be distinguished. E_t is the set of edges e_ij at time t; each edge e_ij represents network flows such as value flows, supply-and-demand flows, service flows, and/or technology flows between enterprise nodes v_i and v_j, and its label vector is denoted $y_{ij}(t)$. The server knows the set of graph nodes V but knows neither the edge set E_t nor the edge label vectors $y_{ij}(t)$. Since each node v_i knows the weights w_ij of the edges connecting it to its neighbor nodes v_j, in one embodiment the edge weight w_ij is taken as the label vector $y_{ij}(t)$ of that edge, i.e., each node v_i knows the label vectors of all its own edges. The label vector of an edge is referred to simply as the label of the edge.
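For illustration, one possible in-memory layout of a single snapshot G_t(V, E_t) with per-edge label vectors is sketched below; the field layout is an assumption, not the patent's storage format:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple
import numpy as np

@dataclass
class GraphSnapshot:
    """One temporal snapshot G_t(V, E_t) of the enterprise graph."""
    t: int                                            # time index
    nodes: List[int] = field(default_factory=list)    # node set V
    # edge (i, j) -> label vector y_ij(t), e.g. value/supply/service/tech flows
    edges: Dict[Tuple[int, int], np.ndarray] = field(default_factory=dict)

    def neighbors(self, i: int) -> List[int]:
        return [b if a == i else a for (a, b) in self.edges if i in (a, b)]

g0 = GraphSnapshot(t=0, nodes=[1, 2, 3])
g0.edges[(1, 2)] = np.array([0.7, 0.1, 0.0, 0.2])     # label y_12(0)
print(g0.neighbors(1))                                # -> [2]
```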
The enterprise client consists of raw data, local parameters, a data acquisition module, a data labeling module, a message-passing module, and a local training module. Its function is to process one node v_i ∈ V of graph G, including: computing the feature vector of node v_i; computing the feature vectors of the edges e_ij connected to node v_i at time t-1, where j is the index of neighbor node v_j ∈ N(i); predicting the weights w_ij of the edges connecting node v_i to other nodes in the graph; and sending and/or receiving feature vectors among the nodes participating in multivalent value chain evolution prediction. Preferably, graph node v_i receives the feature vectors of its neighbor nodes, and graph node v_i back-propagates the error. Client v_i holds the neighbor node information v_j and edge information e_ij of its corresponding graph node, v_j ∈ N(i), together with the label vectors $y_{ij}(t)$ of the edges.
The server side consists of metadata, a parameter database, a parameter management module, a communication module, and a global training module. Its function is to coordinate the training process of the whole graph neural network, including: collecting the loss of each node, calculating the global loss, updating the global parameters, and initiating back propagation, where initiating back propagation includes initiating back propagation on the server side and/or initiating back propagation on the client side. The third-party server does not know the global topology of graph G: it knows the node set V of the graph, but knows neither the edge set E, nor the weights of the edges, nor the label vectors $y_{ij}(t)$ of the edges.
(3) Training the federated temporal graph neural network model.
The clients and the server jointly train the distributed federated temporal graph neural network model, mainly in the following five steps:
(1) Updating node feature vectors on the clients of retailers, distributors, factories, and other enterprises on the value chain. First, the graph snapshot at each time is used to learn the graph node features; then the multi-temporal graph data are used to learn how the node features evolve over time; finally, the learned graph node features are used to predict graph nodes/edges. Specifically, in the r-th iteration, the following operations are performed on the l-th layer of the neural network at time t:
For each graph node v_i there is a client v_i that is fully responsible for the learning tasks on that node. The feature vector update rule of node v_i is:
node v_i sends a request over the Internet to each neighbor node v_j to obtain the node feature vector $h_j^{l-1}(t)$ of v_j. After v_i receives the feature vectors of all its neighbor nodes, it aggregates them to update its own feature vector: $\tilde{h}_i^l(t) = \mathrm{Aggregate}\big(\{h_j^{l-1}(t) : v_j \in N(i)\}\big)$. Optional Aggregate aggregation modes include summation, averaging, and maximization.
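A sketch of the Aggregate step, covering the three modes the text allows (summation, averaging, maximization); the function name and interface are illustrative:

```python
import numpy as np

def aggregate(neighbor_feats, mode="mean"):
    """Combine the neighbors' layer-(l-1) feature vectors into one vector."""
    H = np.stack(neighbor_feats)          # one row per neighbor h_j^{l-1}(t)
    if mode == "sum":
        return H.sum(axis=0)
    if mode == "mean":
        return H.mean(axis=0)
    if mode == "max":
        return H.max(axis=0)              # element-wise maximum
    raise ValueError(f"unknown mode: {mode}")

h_tilde = aggregate([np.array([1.0, 0.0]), np.array([0.0, 2.0])], mode="max")
print(h_tilde)                            # -> [1. 2.]
```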
Node v_i sends a request to the server to obtain the feature transformation matrix $\hat{W}^l(t)$ at time t, where the server uses $\hat{W}^l(t-1)$ to calculate the global weight matrix $\hat{W}^l(t)$. When the quantity at time t is calculated, the quantities at historical times are considered synchronously, i.e., the weight matrix learns the temporal effect of the multi-temporal graph neural network. Preferably, the quantity at time T-1 is considered. Preferably, the quantities at times 0 to T-1 are considered. Preferably, the quantities at times 0 to T are considered. Node v_i then obtains its final feature at layer l by linearly transforming $\tilde{h}_i^l(t)$ with $\hat{W}^l(t)$: $h_i^l(t) = \hat{W}^l(t)\,\tilde{h}_i^l(t)$.
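One way to realize the recurrent weight evolution just described is sketched below with a GRU, one of the options the text names; treating the flattened weight matrix as both the input and the hidden state of the GRU cell is an illustrative choice, not mandated by the patent:

```python
import torch

d = 4                                            # feature dimension
cell = torch.nn.GRUCell(input_size=d * d, hidden_size=d * d)

W_prev = torch.zeros(1, d * d)                   # flattened W^l(t-1)
W_t = cell(W_prev, W_prev)                       # W^l(t) from historical state
W_t = W_t.view(d, d)                             # back to matrix form
print(W_t.shape)                                 # torch.Size([4, 4])
```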
The above steps are repeated to calculate the node feature vectors at every layer.
(2) The client selects neighbors. When aggregating neighbor information, client c_i learns a correlation coefficient a_ij for each neighbor node v_j ∈ N(i) of the node v_i where it is located, and the update rule becomes: $\tilde{h}_i^l(t) = \mathrm{Aggregate}\big(\{a_{ij}\,\mathrm{Linear}(W^l(t)\,h_j^{l-1}(t)) : v_j \in N(i)\}\big)$.
A regularization operation is added to the final loss function so that the correlation coefficients of most neighbors shrink to approximately 0 while the absolute values of the correlation coefficients of a few neighbors remain far greater than 0; the regularization method is $l_1$ regularization and/or $l_{2,1}$ regularization. The neighbor node set N(i) is then updated to the k neighbors with the highest correlation, thereby retaining a few important neighbor nodes and removing the unimportant ones. This reduces the number of communication messages generated by forward and backward propagation during training of the distributed graph neural network and significantly improves communication efficiency.
Meanwhile, the server records the response time of each node to complete forward propagation, sorts all clients in descending order, selects the θ nodes with the longest response times for regularization, and notifies those θ nodes to reduce their numbers of neighbors.
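The neighbor-selection step above can be pictured as follows; the soft threshold stands in for the shrinking effect of the $l_1$ term (which the patent optimizes jointly with the loss), and the cut-off k is whatever neighbor budget the client sets:

```python
import numpy as np

def prune_neighbors(a_row, neighbor_ids, k, lam=0.1):
    """Keep the k neighbors whose shrunken correlation coefficients survive."""
    shrunk = np.sign(a_row) * np.maximum(np.abs(a_row) - lam, 0.0)  # soft threshold
    order = np.argsort(-np.abs(shrunk))                             # by relevance
    keep = [neighbor_ids[idx] for idx in order[:k] if shrunk[idx] != 0.0]
    return keep, shrunk

keep, a = prune_neighbors(np.array([0.9, 0.05, -0.6, 0.02]),
                          neighbor_ids=[7, 3, 11, 5], k=2)
print(keep)   # -> [7, 11]: only the important neighbors remain in N(i)
```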
(3) Predicting labels of edges by client enterprises. Client c_i computes the predicted value of the label of edge e_ij using a fully connected layer. The client obtains the global weight $\hat{W}_E(t)$ of the edge-prediction fully connected layer from the server and calculates the predicted value $\hat{y}_{ij}(t)$. The local loss of the edge is $L_{ij}(t)$. Node v_i sends at least the local edge loss $L_{ij}(t)$ to the server. The server updates the global parameters. Preferably, the global parameters include the global weight $\hat{W}_E(t)$, the global feature transformation matrix $\hat{W}^l(t)$, and the global loss function $\hat{L}(t)$, where l represents the layer of the distributed federated temporal graph neural network model and t represents the time. The global parameters are also referred to as the global network parameters. The server calculates the global weight matrix $\hat{W}^l(t)$ at time t from the global weight matrix $\hat{W}^l(t-1)$ at time t-1. Optionally, the server calculates the global weight matrix using an RNN, GRU, LSTM, or Transformer. Optionally, the server S broadcasts at least one global parameter to all clients, reducing the time the clients wait to receive the global parameters. In one embodiment, the server calculates the global loss as the sum of the node loss and the edge loss, i.e.:
$\hat{L}(t) = \sum_{i,j} L_{ij}(t) + \lambda \cdot R(A)$, where $\sum_{i,j} L_{ij}(t)$ is the edge loss and $\lambda \cdot R(A)$ is a regularization term, i.e., the node loss.
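A sketch of assembling this global loss from the clients' reported edge losses, with R(A) instantiated as $l_1$ or $l_{2,1}$ regularization; λ and the toy values are assumptions:

```python
import numpy as np

def R_l1(A):
    return np.abs(A).sum()                        # l1: drives most a_ij to 0

def R_l21(A):
    return np.sqrt((A ** 2).sum(axis=1)).sum()    # l2,1: row-sparse A

def global_loss(edge_losses, A, lam=0.01, reg=R_l1):
    """hat{L}(t) = sum of reported edge losses + lambda * R(A)."""
    return sum(edge_losses) + lam * reg(A)

A = np.array([[0.9, 0.0], [0.1, 0.4]])            # correlation coefficients
print(global_loss([0.5, 0.2, 0.3], A))            # -> scalar global loss
```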
(4) Initiating back propagation. The server begins back propagation after computing the global loss. The server is responsible for updating the global parameters $\hat{W}_E(t)$ and $\hat{W}^l(t)$; it broadcasts at least one of the updated global parameters to all clients, and the clients back-propagate and are responsible for updating the parameter A, where A is the correlation coefficient matrix. Preferably, the server updates at least one of the global parameters to $\hat{W}_E$ and $\hat{W}^l$, where $\hat{W}_E$ is an abbreviation of $\hat{W}_E(t)$ and $\hat{W}^l$ is an abbreviation of $\hat{W}^l(t)$.
(5) Establishing a blockchain. The communication traffic between clients and between the server and the clients (node feature vectors, parameters, gradients, and the like) is stored in the blockchain, preventing repudiation and supporting audits of whether the server or any client has leaked private data.
(4) Predicting enterprise cooperation relationships according to the model.
In one embodiment, to predict the label of edge e_ij at time T, the server uses the graph data {G_t(V,E) : t ∈ {0,1,…,T-1}} and notifies the clients to input the graph G_T(V,E)\e_ij at time T; the server finally outputs the predicted value of the label of edge e_ij. In another embodiment, the predicted value of the label of edge e_ij at time T is finally output by a client. Similarly, the label of edge e_ij at time T+1 can be predicted, which is not described again here. For any node v_i, if the predicted values of all edges connected to v_i are 0 or keep trending toward 0, the evolution state of v_i is "extinct".
By this method, not only can future network flows among enterprises (value flows, supply-and-demand flows, service flows, technology flows, and the like) be predicted, but also whether an enterprise will evolve to the "extinct" state.
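The "extinction" test described above can be sketched as follows; the decay window and tolerance are assumptions, since the patent only states that all edge predictions must be 0 or keep trending toward 0:

```python
def is_extinct(edge_pred_series, tol=1e-3, window=3):
    """edge_pred_series: {edge: [predictions over the last `window` times]}."""
    for preds in edge_pred_series.values():
        recent = preds[-window:]
        near_zero = abs(recent[-1]) < tol                  # already ~0
        decaying = all(abs(a) >= abs(b)                    # monotone decay
                       for a, b in zip(recent, recent[1:]))
        if not (near_zero or decaying):
            return False                                   # one live edge
    return True

series = {(4, 9): [0.30, 0.10, 0.02], (4, 2): [0.05, 0.01, 0.00]}
print(is_extinct(series))   # -> True: every edge of node 4 vanishes
```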
Example III
Predicting enterprise sales. In the multivalent value chain digital ecosystem, an accurate sales prediction cannot be obtained using an enterprise's internal data alone; the data of related enterprises upstream and downstream of the enterprise's value chain must be used comprehensively. The distributed federated temporal graph neural network provided by the invention is applied to product sales prediction for enterprises in a multivalent value chain as follows:
(1) Acquiring the feature data of associated enterprises in each value chain
The server S is located at a trusted third party, and the clients are {c_1, c_2, …, c_n}, where n represents the number of clients. The server sends a request to the clients to establish the distributed graph neural network; the request message includes description information of the data resources required for training, such as value flows, supply-and-demand flows, and traffic flows.
(2) Establishing a distributed graph neural network model
The distributed graph neural network is divided into a feature learning network and a label prediction network. The feature learning network mainly comprises message-passing layers; operations such as graph convolution and graph attention are implemented through message passing. The label prediction network consists of fully connected layers. The architecture of the distributed graph neural network is a client-server structure consisting of a server side and n clients {c_1, c_2, …, c_n}. Each client corresponds to the graph G_t(V, E_t), where t = 0, 1, 2, … is the time; V is the set of nodes, each node v_i representing an enterprise; E_t is the set of edges, each edge e_ij representing whether there is a cooperative relationship between enterprise nodes v_i and v_j, with the label vector of the edge denoted $y_{ij}(t)$.
The enterprise client consists of raw data, local parameters, a data acquisition module, a data labeling module, a message-passing module, and a local training module, and uses the data of the client enterprise to locally train a sub-model representing the current graph node and its edges.
The server side consists of metadata, a parameter database, a parameter management module, a communication module, and a global training module; its function is to coordinate the training process of the whole graph neural network, including: collecting the loss of each node, calculating the global loss, updating the global parameters, and initiating back propagation.
(3) Training the model
The server and the clients train the distributed federated temporal graph neural network model mainly in the following five steps:
(1) Clients such as retailers, distributors, and manufacturers update the node feature vectors. First, the graph snapshot at each time is used to learn the graph node features; then the multi-temporal graph data are used to learn how the node features evolve over time; finally, the learned graph node features are used to predict the labels of the edges. Specifically, in the r-th iteration, the following operations are performed on the l-th layer of the neural network at time t:
For each graph node v_i there is a client v_i that is fully responsible for the learning tasks on that node. The feature vector update rule of node v_i is:
node v_i sends a request over the Internet to each neighbor node v_j to obtain the node feature vector $h_j^{l-1}(t)$ of v_j. After v_i receives the feature vectors of all its neighbor nodes, it aggregates them to update its own feature vector: $\tilde{h}_i^l(t) = \mathrm{Aggregate}\big(\{h_j^{l-1}(t) : v_j \in N(i)\}\big)$. Optional Aggregate aggregation modes include summation, averaging, and maximization.
Node v_i sends a request to the server to obtain the global weight $\hat{W}_E(t)$, the global feature transformation matrix $\hat{W}^l(t)$, and the global loss function $\hat{L}(t)$ at time t, where l represents the layer of the distributed federated temporal graph neural network model and t represents the time. The server calculates the global weight matrix $\hat{W}^l(t)$ at time t using at least one global network parameter of a historical time, i.e., the weight matrix learns the temporal effect of the multi-temporal graph neural network. Node v_i then obtains its final feature at layer l by linearly transforming $\tilde{h}_i^l(t)$ with $W^l(t)$: $h_i^l(t) = W^l(t)\,\tilde{h}_i^l(t)$.
The above steps are repeated to calculate the node feature vectors at every layer.
(2) The node where the client is located selects important neighbor nodes. When aggregating neighbor information, client c_i learns a correlation coefficient a_ij for each neighbor node v_j ∈ N(i) of the node v_i where it is located, and the update rule becomes: $\tilde{h}_i^l(t) = \mathrm{Aggregate}\big(\{a_{ij}\,\mathrm{Linear}(W^l(t)\,h_j^{l-1}(t)) : v_j \in N(i)\}\big)$,
where a_ij ∈ A. Important neighbors are retained through the regularization method while a large number of unimportant neighbors are removed, adaptively reducing the number of neighbors of each node v_i and the communication volume of client c_i. Regularizing R(A) in the final loss function makes the correlation coefficients of most neighbors approach 0 while the absolute values of the correlation coefficients of a small fraction of neighbors are far greater than 0, where a_ij is the correlation coefficient, A is the correlation coefficient matrix, $h_i^l(t)$ denotes the feature vector of the node where the client is located, $h_j^{l-1}(t)$ denotes the feature vector of the node where the neighbor client is located, $W^l(t)$ denotes the local feature transformation matrix, and Linear denotes a linear operation; the regularization method is $l_1$ or $l_{2,1}$ regularization. The neighbor node set N(i) is updated to the k neighbors with the highest correlation, retaining a few important neighbor nodes and removing the unimportant ones; this reduces the number of communication messages generated by forward and backward propagation during training of the distributed graph neural network and significantly improves communication efficiency. Meanwhile, the server records the response time of each node to complete forward propagation, sorts all clients in descending order, selects the θ nodes with the longest response times for regularization, and notifies those θ nodes to reduce their numbers of neighbors.
(3) Client enterprises predict labels of edges.
Client v_i computes the predicted value of the label of edge e_ij using a fully connected layer. The client obtains the global weight $\hat{W}_E(t)$ of the edge-prediction fully connected layer from the server and calculates the predicted value $\hat{y}_{ij}(t)$, where $d(h_i, h_j)$ is a distance function representing the distance between node v_i and node v_j, selected from the cosine distance, the Euclidean distance, and the Mahalanobis distance. The local loss of the edge is $L_{ij}(t)$. Node v_i sends the local edge loss $L_{ij}(t)$ to the server.
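For concreteness, a sketch of this edge-label predictor; composing the fully connected layer W_E with a cosine distance between the endpoint features in exactly this way is an illustrative reading, since the patent names the candidate distance functions but not a closed-form composition:

```python
import numpy as np

def cosine_distance(h_i, h_j):
    return 1.0 - h_i @ h_j / (np.linalg.norm(h_i) * np.linalg.norm(h_j))

def predict_edge(W_E, b_E, h_i, h_j, dist=cosine_distance):
    """Fully connected layer on the node distance -> predicted label y_ij."""
    z = W_E * dist(h_i, h_j) + b_E
    return 1.0 / (1.0 + np.exp(-z))               # sigmoid activation

h_i, h_j = np.array([1.0, 0.2]), np.array([0.9, 0.3])
y_hat = predict_edge(W_E=2.0, b_E=-0.5, h_i=h_i, h_j=h_j)
L_ij = (y_hat - 1.0) ** 2                         # local edge loss sent to server
print(y_hat, L_ij)
```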
(4) The server updates the global parameters. The server calculates the global weight matrix $\hat{W}^l(t)$ at time t from the global weight matrix $\hat{W}^l(t-1)$ at time t-1. Optionally, the server calculates the global weight matrix at time t using an RNN, GRU, LSTM, or Transformer. Optionally, the server S broadcasts $\hat{W}^l(t)$ to all clients, reducing the time the clients wait to receive the global parameters. In one embodiment, the server calculates the global loss as the sum of the node loss and the edge loss, i.e.
$\hat{L}(t) = \sum_{i,j} L_{ij}(t) + \lambda \cdot R(A)$, where $\sum_{i,j} L_{ij}(t)$ is the edge loss and $\lambda \cdot R(A)$ is a regularization term.
(5) Back propagation is initiated. The server begins back propagation after computing the global loss. The server is responsible for updating the global parameters: the global weight $\hat{W}_E(t)$, the global feature transformation matrix $\hat{W}^l(t)$, and the global loss function $\hat{L}(t)$. The clients are responsible for updating the local parameters. Preferably, the local parameters for which the clients are responsible include the correlation coefficient matrix A.
(4) Predicting sales according to the model
Clients such as retailers, distributors, and factories input local data; the feature vector of each node at the next time is learned through the distributed graph neural network model; and finally the predicted value of product sales at the next time is obtained. The predicted value of product sales of node v_i at the next time is the sum of the predicted sales values of all its products. Preferably, the server or a designated client finally outputs the predicted value of product sales at the next time; the designated client is preferably client c_i.
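A sketch of this final readout: per-product sales decoded from the node's next-time feature vector and summed into the node total. The linear decoder is an assumed stand-in for the model's prediction head:

```python
import numpy as np

def node_sales_forecast(h_next, W_out, b_out):
    """Decode per-product sales from h_i(t+1) and sum them for node v_i."""
    per_product = W_out @ h_next + b_out
    return per_product, per_product.sum()

h_next = np.array([0.4, 1.2, 0.1])               # learned feature at time t+1
W_out = np.ones((2, 3)); b_out = np.zeros(2)     # toy decoder for 2 products
per_product, total = node_sales_forecast(h_next, W_out, b_out)
print(per_product, total)                        # node total = sum over products
```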
Through the above steps, a distributed federated temporal graph neural network model is built on the basis of private data and used for product sales prediction. Distributed training comprehensively utilizes the data of related enterprises upstream and downstream of the target enterprise's value chain, achieving more accurate sales prediction.
It should be noted that, for the steps not described in detail in this embodiment, reference may be made to the descriptions of the related steps in the embodiment shown in FIG. 1, which are not repeated here.
In the description of the present specification, reference to the terms "some possible embodiments," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiments or examples is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the various embodiments or examples described in this specification and the features of the various embodiments or examples may be combined and combined by those skilled in the art without contradiction.
With respect to the method flowcharts of the embodiments of the present application, certain operations are described as distinct steps performed in a certain order. Such a flowchart is illustrative and not limiting. Some steps described herein may be grouped together and performed in a single operation, may be partitioned into multiple sub-steps, and may be performed in an order different than that shown herein. The various steps illustrated in the flowcharts may be implemented in any manner by any circuit structure and/or tangible mechanism (e.g., by software running on a computer device, hardware (e.g., processor or chip implemented logic functions), etc., and/or any combination thereof).
While the spirit and principles of the present invention have been described with reference to several particular embodiments, it is to be understood that the invention is not limited to the disclosed embodiments, nor does the division into aspects imply that features in those aspects cannot be combined to advantage; this division is adopted merely for convenience of expression. The invention is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (10)

1. A training prediction method for multivalent value chain evolution, comprising:
generating a feature vector of the node where a client is located based on local raw data of the client and its adjacent clients, wherein the feature vector is used for representing information of the edges between the node where the client is located and the nodes where its adjacent clients are located, and the feature vectors together with the node information of the nodes where the clients are located form distributed temporal graph data;
the client calculates local network parameters of a distributed federated temporal graph neural network model at the current time according to the feature vector of its node and sends the local network parameters to a server, wherein the client establishes the distributed federated temporal graph neural network model in advance;
The server obtains global network parameters according to the local network parameters and broadcasts the global network parameters to all clients;
and the client updates the network parameters of the distributed federated temporal graph neural network model using the global network parameters, inputs the distributed temporal graph data at the next time into the updated distributed federated temporal graph neural network model for learning, and outputs the predicted value at the next time.
2. The method of claim 1, wherein:
the local network parameters generated by the client include a local weight $W_E(t-1)$, a local feature transformation matrix $W^l(t-1)$, and a local loss function $L(t-1)$;
the server performs back propagation based on the local network parameters sent by each client to train a recurrent neural network and obtain the global network parameters, wherein the global network parameters include a global weight $\hat{W}_E(t)$, a global feature transformation matrix $\hat{W}^l(t)$, and a global loss function $\hat{L}(t)$, where l represents the layer of the distributed federated temporal graph neural network model and t represents the time.
3. The method of claim 2, wherein generating the feature vector of the node where the client is located based on the local raw data of the client and its adjacent clients comprises:
Each client generates a feature vector of a node where the client is located based on local original data, and the client acquires feature vectors of adjacent clients and generates a neighbor list N (i) of neighbor nodes willing to participate in prediction, wherein i is a sequence number of the node where the client is located.
4. A method according to claim 3, characterized in that the method further comprises:
the client updates its own feature vector by the following aggregation formula:
$h_i^l(t) = \mathrm{Aggregate}\big(\{h_j^{l-1}(t) : v_j \in N(i)\}\big)$, where $h_i^l(t)$ represents the feature vector of the node where the client is located, $h_j^{l-1}(t)$ represents the feature vector of the node where a neighbor client is located, l represents a layer of the distributed federated temporal graph neural network model, i represents the index of the node, t represents the time, and the Aggregate aggregation mode comprises summation and/or averaging and/or maximization.
5. The method of claim 4, wherein:
the server records the response time of each client to complete forward propagation, sorts all the clients in descending order, selects the nodes of the θ clients with the longest response times for regularization, and notifies those θ nodes to reduce their numbers of neighbor nodes.
6. The method of claim 5, wherein the method further comprises:
when the client aggregates the feature vectors of the nodes where its neighbor clients are located, the client learns a correlation coefficient $a_{ij}$ for the client of each neighbor node of its node and updates its feature vector using the following formula:
$h_i^l(t) = \mathrm{Aggregate}\big(\{a_{ij}\,\mathrm{Linear}(W^l(t)\,h_j^{l-1}(t)) : v_j \in N(i)\}\big)$, where $a_{ij} \in A$; in the final global loss function, regularizing R(A) makes the correlation coefficients of most neighbor nodes approach 0 while the absolute values of the correlation coefficients of a small fraction of neighbors are far greater than 0, where i represents the index of the node, j represents the index of a neighbor node, $h_i^l(t)$ represents the feature vector of the node where the client is located, $h_j^{l-1}(t)$ represents the feature vector of the node where the neighbor client is located, $W^l(t)$ represents the local feature transformation matrix, and Linear represents a linear operation.
7. The method of claim 6, wherein the method further comprises:
the regularization R(A) comprises $l_1$ regularization
$R(A) = \sum_{i,j} |a_{ij}|$,
and $l_{2,1}$ regularization
$R(A) = \sum_i \sqrt{\sum_j a_{ij}^2}$,
wherein $a_{ij} \in A$; important neighbor nodes are retained through the regularization method while a large number of unimportant neighbor nodes are removed, so that the number of neighbor nodes of the θ nodes is adaptively reduced and the communication volume of the clients is reduced.
8. The method of claim 7, wherein:
the client uses the local raw data to generate the feature vector $h_i^l(t)$ of the node where it is located, where l represents a layer of the distributed federated temporal graph neural network model, i represents the index of the node, and t represents the time;
the client establishes the distributed federated temporal graph neural network model and, according to the feature vector $h_i^l(t)$ of its node, calculates the local weight $W_E(t-1)$, the local feature transformation matrix $W^l(t-1)$, the local loss function $L(t-1)$, and the feature vector of the edge $e_{ij}$ connected to the node at time t-1;
the server acquires the local weight $W_E(t-1)$, the local feature transformation matrix $W^l(t-1)$, and the local loss function $L(t-1)$ sent by the client and performs back propagation to obtain the global weight $\hat{W}_E(t)$, the global feature transformation matrix $\hat{W}^l(t)$, and the global loss function $\hat{L}(t)$;
each node v_i updates its feature vector, wherein each client receives the global weight $\hat{W}_E(t)$, the global feature transformation matrix $\hat{W}^l(t)$, and the global loss function $\hat{L}(t)$ sent by the server and performs back propagation to obtain the pre-trained distributed federated temporal graph neural network, where l represents a layer of the distributed federated temporal graph neural network model, i represents the index of a node, and t represents the time.
9. A training prediction system for multivalent value chain evolution, the system comprising:
a server side and a plurality of clients;
the client comprises a data acquisition module, a raw data module, a local parameter module, a data labeling module, a message-passing module, a local training module, and a blockchain, wherein the data acquisition module is used for acquiring and recording the raw data of the client, the raw data module is used for storing the raw data of the client, the local parameter module is used for storing all parameters of the client's local graph neural network, the data labeling module labels the data for training the graph neural network, the message-passing module is used for supporting concurrent communication between the client and the communication module of the server and for putting information on the chain, the local training module is used for executing all computation of the client, including computing feature vectors and the training and prediction of the client's federated temporal graph neural network, and the blockchain is used for recording all communication data between the server and the clients;
the server side comprises a metadata module, a parameter database module, a parameter management module, a communication module, a global training module, and a blockchain, wherein the metadata module is used for managing and storing metadata, the parameter database module is used for storing all parameters of the server's recurrent neural network, the parameter management module is used for managing and updating the parameters of the graph neural network, the communication module is used for supporting concurrent communication between the server and the message-passing modules of multiple clients and for putting information on the chain, the global training module is used for fusing the local network parameters sent by the multiple clients, and the blockchain is used for recording all communication data between the server and the clients;
The server side and the client side comprise at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8.
10. A computer readable storage medium storing a program which, when executed by a multi-core processor, causes the multi-core processor to perform the method of any of claims 1-8.