CN115511053A - Method and device for generating timing diagram sample, storage medium and electronic equipment - Google Patents

Method and device for generating timing diagram sample, storage medium and electronic equipment

Info

Publication number
CN115511053A
Authority
CN
China
Prior art keywords
edge
timing
time sequence
node
constraint condition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211185663.6A
Other languages
Chinese (zh)
Inventor
宦成颖
刘永超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202211185663.6A priority Critical patent/CN115511053A/en
Publication of CN115511053A publication Critical patent/CN115511053A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The specification discloses a method, an apparatus, a storage medium and an electronic device for generating timing graph samples. The method comprises: generating a mask vector according to the timing of the edge from the previous node to the current node, the timing of each edge connected to the current node, and the timing constraint condition; performing alias sampling according to an alias table constructed from the weights of the edges to obtain a candidate edge; judging, by means of the mask vector, whether the candidate edge satisfies the timing constraint condition; if so, adding the obtained edge to the random walk path; and when the random walk ends, taking the random walk path as a training sample for training a graph neural network. Because alias sampling with an alias table is used during the random walk on the timing graph, sampling efficiency is improved; at the same time, the mask vector is used to judge whether an edge obtained by alias sampling satisfies the timing constraint condition, so the random walk path is generated quickly and paths serving as training samples can be generated efficiently on the timing graph.

Description

Method and device for generating timing diagram sample, storage medium and electronic equipment
Technical Field
The present specification relates to the field of computer technologies, and in particular to a method and an apparatus for generating timing graph samples, a storage medium, and an electronic device.
Background
With the development of science and technology, machine learning has come to be widely applied, and machine learning models, graph neural networks in particular, are used in more and more scenarios.
There are many types of training samples for a graph neural network; among them, a path from one node to another node in a graph can be used as a training sample. At present, such training samples are mainly paths from one node to another node on a static graph. In practical applications, however, most graphs carry timing information, i.e., information on the edge from a node to its neighboring node that constrains the order of the nodes in a path from one node to another. A graph whose edges carry timing information is called a timing graph, and timing graphs have a very wide range of applications, as shown in fig. 1.
Fig. 1 is a schematic diagram of using a timing graph to illustrate the usage flow of an application in the prior art. Fig. 1 contains 4 nodes: node 1 indicates logging in to the application, node 2 indicates entering the contact module, node 3 indicates selecting a contact, and node 4 indicates sending a message in a dialog box. Each edge connecting one node to another carries timing information: the timing on the edge connecting node 1 to node 2 is 1, the timing on the edge connecting node 2 to node 3 is 2, and the timing on the edge connecting node 3 to node 4 is 3. When the application is used, node 1 is executed first to log in and enter the application, then node 2 is executed to enter the contact module, then node 3 is executed to select the contact to whom a message is to be sent, and finally node 4 is executed to send the message in the dialog box. The nodes are ordered: node 3 (selecting the contact) cannot be executed before node 2 (entering the contact module), and node 2 cannot be executed after node 4 (sending the message in the dialog box). That is, the timing constraint condition of the timing graph shown in fig. 1 is: for any two edges in a path, the edge with the larger timing must come after the edge with the smaller timing. This is the constraint the timing information imposes on the order of the nodes in a path.
Meanwhile, training a graph neural network with good performance requires a large number of training samples, and collecting training samples is a time-consuming and labor-intensive process. How to efficiently generate paths as training samples from a timing graph is therefore an urgent problem to be solved.
Disclosure of Invention
The present specification provides a method, an apparatus, a storage medium, and an electronic device for generating timing graph samples, so as to at least partially solve the above problems in the prior art.
The technical solution adopted by the specification is as follows:
The present specification provides a method for generating a timing graph sample, including:
determining, in a timing graph, the current node to which the current random walk has arrived and the previous node of the current node in the path of the current random walk;
determining the timing of the edge from the previous node to the current node in the timing graph as a reference timing;
determining each edge connected to the current node in the timing graph, constructing an alias table according to the weights of the edges, and generating a mask vector according to the timing of each edge, the reference timing, and the timing constraint condition corresponding to the timing graph;
performing alias sampling on the edges according to the alias table to obtain a candidate edge;
judging, according to the mask vector, whether the candidate edge satisfies the timing constraint condition;
if so, adding the candidate edge to the path as a sampling edge;
and when the current random walk ends, taking the path of the current random walk as a generated training sample, wherein the training sample is used as input to a graph neural network to be trained so as to train the graph neural network to be trained.
Optionally, generating a mask vector according to the timing of each edge, the reference timing, and the timing constraint condition corresponding to the timing graph specifically includes:
for each edge, judging whether the edge satisfies the timing constraint condition according to the timing of the edge, the reference timing, and the timing constraint condition corresponding to the timing graph; if so, setting the element corresponding to the edge in the mask vector to a first element, and otherwise setting the element corresponding to the edge in the mask vector to a second element;
judging whether the candidate edge satisfies the timing constraint condition specifically includes:
when the element corresponding to the candidate edge in the mask vector is the first element, determining that the candidate edge satisfies the timing constraint condition;
and when the element corresponding to the candidate edge in the mask vector is the second element, determining that the candidate edge does not satisfy the timing constraint condition.
Optionally, before setting the element corresponding to the edge in the mask vector as the first element, the method further includes:
determining that the edge is not in the current randomly walked path.
Optionally, judging whether the edge satisfies the timing constraint condition specifically includes:
when the timing of the edge is greater than the reference timing, determining that the edge satisfies the timing constraint condition;
and when the timing of the edge is not greater than the reference timing, determining that the edge does not satisfy the timing constraint condition.
Optionally, when the candidate edge does not satisfy the timing constraint condition, the method further includes:
performing alias sampling on the edges again according to the alias table until a candidate edge obtained by alias sampling satisfies the timing constraint condition.
Optionally, when the candidate edge satisfies the timing constraint condition, the method further includes:
taking the neighbor node connected to the current node via the sampling edge in the timing graph as the new current node of the current random walk;
and performing alias sampling again according to the newly determined current node until the current random walk ends.
Optionally, the ending of the current random walk specifically includes:
ending the current random walk when the length of the path of the current random walk reaches a set threshold; or
ending the current random walk when no candidate edge satisfying the timing constraint condition can be obtained.
The present specification provides an apparatus for generating a timing graph sample, comprising:
an acquisition module, configured to determine, in a timing graph, the current node to which the current random walk has arrived and the previous node of the current node in the path of the current random walk;
a timing determination module, configured to determine the timing of the edge from the previous node to the current node in the timing graph as a reference timing;
a generation module, configured to determine each edge connected to the current node in the timing graph, construct an alias table according to the weights of the edges, and generate a mask vector according to the timing of each edge, the reference timing, and the timing constraint condition corresponding to the timing graph;
a sampling module, configured to perform alias sampling on the edges according to the alias table to obtain a candidate edge;
a first judgment module, configured to judge, according to the mask vector, whether the candidate edge satisfies the timing constraint condition;
a processing module, configured to add the candidate edge to the path as a sampling edge when the judgment result of the first judgment module is yes;
and an output module, configured to take the path of the current random walk as a generated training sample when the current random walk ends, wherein the training sample is used as input to a graph neural network to be trained so as to train the graph neural network to be trained.
Optionally, the generation module is specifically configured to: for each edge, judge whether the edge satisfies the timing constraint condition according to the timing of the edge, the reference timing, and the timing constraint condition corresponding to the timing graph; if so, set the element corresponding to the edge in the mask vector to a first element, and otherwise set the element corresponding to the edge in the mask vector to a second element;
the first judgment module is specifically configured to determine that the candidate edge satisfies the timing constraint condition when the element corresponding to the candidate edge in the mask vector is the first element, and determine that the candidate edge does not satisfy the timing constraint condition when the element corresponding to the candidate edge in the mask vector is the second element.
Optionally, the generation module is further configured to determine, before setting the element corresponding to the edge in the mask vector to the first element, that the edge is not in the path of the current random walk.
Optionally, the generation module is specifically configured to determine that the edge satisfies the timing constraint condition when the timing of the edge is greater than the reference timing, and determine that the edge does not satisfy the timing constraint condition when the timing of the edge is not greater than the reference timing.
Optionally, the sampling module is further configured to, when the judgment result of the first judgment module is no, perform alias sampling on the edges again according to the alias table until the obtained candidate edge satisfies the timing constraint condition.
Optionally, the processing module is further configured to, when the judgment result of the first judgment module is yes, instruct the acquisition module to take the neighbor node connected to the current node via the sampling edge in the timing graph as the new current node of the current random walk, and instruct the timing determination module, the generation module and the sampling module to perform alias sampling again according to the newly determined current node until the current random walk ends.
Optionally, the apparatus further comprises:
a second judgment module, configured to determine that the current random walk ends when the number of nodes in the path of the current random walk reaches a set threshold, or when no candidate edge satisfying the timing constraint condition can be obtained.
The present specification provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above method for generating a timing graph sample.
The present specification provides an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the above method for generating a timing graph sample when executing the program.
The technical solution adopted by the specification can achieve the following beneficial effects:
In the method for generating a timing graph sample provided by this specification, the current node to which the random walk has currently arrived in the timing graph and the previous node of the current node in the current random walk path are determined; a mask vector is generated according to the timing of the edge from the previous node to the current node, the timing of each edge connected to the current node, and the timing constraint condition of the timing graph; an alias table is constructed according to the weights of the edges; the mask vector is used to judge whether an edge obtained by alias sampling satisfies the timing constraint condition; if so, the obtained edge is added to the random walk path; and when the random walk ends, the random walk path is used as a training sample for training a graph neural network.
In this method, alias sampling with an alias table is used when performing the random walk on the timing graph, which improves sampling efficiency. At the same time, the mask vector is used to judge whether the edge obtained by alias sampling satisfies the timing constraint condition, and only edges that do are added to the random walk path, so the random walk path is generated quickly and paths serving as training samples can be generated efficiently on the timing graph.
Drawings
The accompanying drawings, which are included to provide a further understanding of the specification and are incorporated in and constitute a part of this specification, illustrate embodiments of the specification and, together with the description, serve to explain the specification without unduly limiting it. In the drawings:
FIG. 1 is a schematic diagram of a prior art application usage flow illustrated with a timing graph;
FIG. 2 is a schematic flow chart of a method for generating timing graph samples provided in this specification;
FIG. 3 is a schematic diagram of a timing graph on which a random walk generates a path sample in this specification;
FIG. 4 is a schematic diagram of creating a histogram according to the adjusted weights of the neighbor edges in this specification;
FIG. 5 is a schematic flow chart of a method for generating samples when the edges in the timing graph are sorted;
FIG. 6 is a schematic diagram of creating a histogram according to the adjusted weights of each group of neighbor edges in this specification;
FIG. 7 is a schematic diagram of an apparatus for generating timing graph samples provided in this specification;
FIG. 8 is a schematic diagram of an electronic device corresponding to FIG. 2 provided in this specification.
Detailed Description
To make the objects, technical solutions, and advantages of this specification clearer, the technical solutions of this specification will be described clearly and completely below with reference to specific embodiments of this specification and the accompanying drawings. The described embodiments are only some, not all, of the embodiments of this specification. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this specification without creative effort fall within the protection scope of this specification.
As described above, how to efficiently generate paths as training samples from a timing graph is a problem to be solved. At present, a random walk on a timing graph usually starts from some node of the graph. At each step of the walk, transition probabilities from the currently reached node to each of its neighbor nodes are determined according to the weights of the neighbor edges of that node, and a new edge is sampled from those neighbor edges according to the transition probability distribution. The timing of the sampled edge must be larger than the timing of the edge by which the current node was reached; only then can the sampled edge be used as the next step of the random walk. This continues until no new edge can be sampled, and the generated random walk path is used as a training sample of the graph neural network. However, sampling from the transition probability distribution has time complexity O(log N) per draw, so the sampling efficiency is low, the random walk path is generated slowly, and generating paths as training samples on the timing graph is inefficient.
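For illustration only, the following minimal Python sketch (not from the patent) shows the baseline sampling described above: drawing a neighbor edge from the transition probability distribution via prefix sums and binary search, which costs O(log N) per draw. The function name and the example edge list and weights are hypothetical.

```python
import bisect
import itertools
import random

def sample_by_transition_probability(edges, weights):
    """Draw one edge id proportionally to its weight using a cumulative
    distribution and binary search (O(log N) per draw)."""
    cumulative = list(itertools.accumulate(weights))   # prefix sums of the weights
    r = random.uniform(0.0, cumulative[-1])            # uniform point over the total mass
    return edges[bisect.bisect_left(cumulative, r)]    # first prefix sum >= r

# Hypothetical neighbor edges with unnormalized weights.
print(sample_by_transition_probability([3, 5, 6, 7], [0.3, 0.1, 0.1, 0.5]))
```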
Therefore, embodiments of the present specification provide a method, an apparatus, a storage medium, and an electronic device for generating a timing chart sample, and technical solutions provided by embodiments of the present specification are described in detail below with reference to the accompanying drawings.
Fig. 2 is a schematic flow chart of a method for generating timing graph samples in this specification, which specifically includes the following steps:
S100: determining, in a timing graph, the current node to which the current random walk has arrived and the previous node of the current node in the path of the current random walk.
When training a graph neural network, a path from one node to another node on the timing graph can be used as a training sample, and such path samples are mainly generated by random walks on the timing graph. A random walk on a timing graph requires multiple rounds of sampling. In each round, a new edge is sampled from the neighbor edges of the current node, the neighbor node corresponding to the sampled edge becomes the next node of the random walk, and the sampled edge is added to the random walk path. Finally, the nodes sampled in each round and the edges connecting them are output as a path.
Based on this, in this specification, the apparatus for generating training samples determines the current node to which the random walk has currently arrived in the timing graph and the node immediately preceding the current node in the path of the random walk. The apparatus for generating training samples may be a server used for training the graph neural network, or a device such as a mobile phone or a personal computer (PC) capable of executing the solution of this specification. For convenience of explanation, the following description takes a server as the execution subject.
The current node is the node to which the random walk has currently arrived on the timing graph, and the previous node of the current node is the node immediately before the current node in the path of the current random walk. For example, fig. 3 is a schematic diagram of a timing graph on which a random walk generates a path sample. The graph contains nodes 1 to 7, and each edge connecting two nodes carries timing information and a weight (only the timing information is shown in fig. 3); for example, the timing on the edge between node 2 and node 3 is 1. The path of the current random walk runs from node 1 to node 2, so the current node, i.e., the node to which the random walk has currently arrived, is node 2, and the previous node of the current node in the path of the current random walk is node 1.
S102: and determining the time sequence of the edge from the last node to the current node in the time sequence chart as a reference time sequence.
The server determines the timing of the edge from the previous node to the current node in the timing chart as a reference timing. And the reference time sequence is used for generating the mask vector in the subsequent mask vector generating process according to the edges connected with the current node and the time sequence constraint condition corresponding to the time sequence diagram. For example, in the timing chart shown in fig. 3, the current node that is currently randomly walked to is node 2, and the previous node is node 1, so the timing of the edge from the previous node to the current node, that is, the timing of the edge from node 1 to node 2 is 1, and therefore the reference timing is 1 when the current node is randomly walked to in fig. 3.
S104: and determining each edge connected with the current node in the time sequence diagram, and constructing a nickname table according to the weight of each edge.
The server determines each edge (i.e., neighbor edge) of the current node connection in the timing graph, and constructs a nickname table according to the weight of each edge, wherein the nickname table is constructed based on equal probability distribution and binomial distribution. Specifically, the weights of the edges may be normalized to obtain the normalized weights of the edges, and then, for each edge, a product of the normalized weight of the edge and the number of the edges connected to the current node is determined, and as the adjusted weight of the edge, a histogram is created according to the adjustment of each edge, where each column in the histogram corresponds to one edge, and the column height of each column is the adjusted weight of the corresponding edge, and the histogram is adjusted in a manner of transferring the weight of the edge with the adjusted weight greater than 1 in the histogram to the edge with the adjusted weight less than 1, so that the column height of each column of the adjusted histogram is 1, and each column contains at most two edges, and a nickname table is generated according to the adjusted histogram, where the nickname table includes a first array and a second array, the first array stores the probability of the original corresponding edge of the ith column, and the second array stores the number of the edge transferred to the ith column. Wherein i is a positive integer.
Continuing with the example of fig. 3, an alias table is constructed for the current node (node 2). The nodes connected to node 2 by edges are node 3, node 5, node 6 and node 7; taking the number of the connected node as the number of the connecting edge, the neighbor edges of node 2 are edge 3, edge 5, edge 6 and edge 7. Assume that the weights of the neighbor edges of node 2 are as shown in table 1.
Neighbor edge    Weight
Edge 3           0.3
Edge 5           0.1
Edge 6           0.1
Edge 7           0.5
TABLE 1
Based on the weights shown in table 1, since the weights in table 1 are already normalized, the weights of the neighbor edges do not need to be normalized again. For each neighbor edge, the product of its weight and the number of neighbor edges is taken as its adjusted weight, i.e., each weight is multiplied by 4. After adjustment, the adjusted weights of edge 3, edge 5, edge 6 and edge 7 are 1.2, 0.4, 0.4 and 2, respectively.
Then, as shown in fig. 4, a histogram is created according to the adjusted weight of each neighbor edge. The 4 neighbor edges are divided into 4 columns, each column corresponding to one neighbor edge, and the height of each column is the adjusted weight of the corresponding edge: column 1 (edge 3) is 1.2, column 2 (edge 5) is 0.4, column 3 (edge 6) is 0.4, and column 4 (edge 7) is 2. Weight is transferred from the neighbor edges whose adjusted weight is greater than 1 to the neighbor edges whose adjusted weight is less than 1, so that each column of the adjusted histogram has height 1 and contains at most two neighbor edges: the part of edge 3's adjusted weight above 1 (i.e., 0.2) goes to the column of edge 7, and the part of edge 7's adjusted weight above 1 (1.2 in total) goes to the columns of edge 5 and edge 6, so that every column has height 1. In the adjusted histogram, the probability of edge 3 in column 1 is 1; the probability of edge 5 in column 2 is 0.4 and that of edge 7 is 0.6; the probability of edge 6 in column 3 is 0.4 and that of edge 7 is 0.6; the probability of edge 7 in column 4 is 0.8 and that of edge 3 is 0.2. The adjusted histogram can be used directly as the alias table: the probability of the edge originally corresponding to each column is stored in the first array, and the number of the edge transferred into each column is stored in the second array, so the first array stores (1, 0.4, 0.4, 0.8) and the second array stores (null, 7, 7, 3).
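To make the construction above concrete, here is a minimal Python sketch (an illustration, not the patent's implementation) of building the alias table from a list of edge numbers and weights; the name build_alias_table is introduced here. Running it on the Table 1 weights reproduces the first array (1, 0.4, 0.4, 0.8) and second array (null, 7, 7, 3) derived above.

```python
def build_alias_table(edges, weights):
    """Build an alias table: 'prob' is the first array (probability of each
    column's own edge) and 'alias' is the second array (edge moved into the
    column, or None)."""
    n = len(edges)
    total = float(sum(weights))
    adjusted = [w / total * n for w in weights]              # normalized weight * number of edges
    prob = [0.0] * n
    alias = [None] * n
    short = [i for i, a in enumerate(adjusted) if a < 1.0]   # columns below height 1
    tall = [i for i, a in enumerate(adjusted) if a >= 1.0]   # columns at or above height 1
    while short and tall:
        s, t = short.pop(), tall.pop()
        prob[s] = adjusted[s]
        alias[s] = edges[t]                                  # top the short column up with edge t
        adjusted[t] -= 1.0 - adjusted[s]                     # edge t gave away (1 - adjusted[s])
        (short if adjusted[t] < 1.0 else tall).append(t)
    for i in short + tall:                                   # remaining columns are exactly full
        prob[i] = 1.0
    return edges, prob, alias

# Table 1 example: edges 3, 5, 6, 7 of node 2 with weights 0.3, 0.1, 0.1, 0.5.
print(build_alias_table([3, 5, 6, 7], [0.3, 0.1, 0.1, 0.5]))
```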
S106: and generating a mask vector according to the time sequence of each side, the reference time sequence and the time sequence constraint condition corresponding to the time sequence chart.
And the server generates a mask vector according to the time sequence of each side, the reference time sequence and the time sequence constraint condition corresponding to the time sequence. The timing constraint condition is defined on each timing graph, and the timing constraint conditions may be the same or different for each timing graph, and are determined according to the definition of the timing graph, for example, fig. 1 is a schematic diagram showing a use process of an application by using a timing graph in the prior art, the sequence of nodes in the path of fig. 1 is node 1, node 2, node 3, and node 4, and is performed in sequence, and the timing constraint condition is that for two edges in fig. 1, an edge with a small timing in the path needs to be arranged before an edge with a large timing. Since the timing constraints are determined from the timing diagram, for ease of explanation, the following timing constraints are the timing constraints of FIG. 1. The mask vector is composed of elements, each element corresponds to each edge connected with the current node one by one, the elements are set according to the time sequence of each edge, the reference time sequence and the time sequence constraint condition corresponding to the time sequence, and the mask vector can shield the edges which do not meet the time sequence constraint condition according to the difference of the elements. For example, the mask vector may be a vector having an element of 0 or 1, where the element 0 or 1 is determined according to the timing of the edge, the reference timing, and the timing constraint, the element 1 indicates that the edge connected to the current node satisfies the timing constraint, the element 0 indicates that the edge connected to the current node does not satisfy the timing constraint, and each element corresponds to each edge connected to the current node one by one, and the mask vector may determine, according to the element, which edges connected to the current node satisfy the timing constraint and which edges connected to the current node do not satisfy the timing constraint.
Specifically, the timing constraint condition of the timing graph is determined first, and whether an edge satisfies it is judged as follows: when the timing of the edge is greater than the reference timing, the timing constraint condition is satisfied; when the timing of the edge is not greater than the reference timing, it is not. For each edge, whether the edge satisfies the timing constraint condition is judged according to the timing of the edge, the reference timing, and the timing constraint condition corresponding to the timing graph; if so, the element corresponding to the edge in the mask vector is set to the first element, and otherwise it is set to the second element. For example, the reference timing of the current random walk on the timing graph of fig. 3 is 1, and the neighbor edges of the current node (node 2) are edge 3, edge 5, edge 6 and edge 7. When setting the element corresponding to each neighbor edge, the element of a neighbor edge whose timing is greater than 1 is set to the first element 1, and the element of a neighbor edge whose timing is not greater than 1 is set to the second element 0. The timing of edge 3 is 1, which is not greater than the reference timing 1, so the element corresponding to edge 3 in the mask vector is set to the second element 0; the timing of edge 5 is 5, which is greater than the reference timing 1, so the element corresponding to edge 5 is set to the first element 1; the timing of edge 6 is 2, which is greater than the reference timing 1, so the element corresponding to edge 6 is set to the first element 1; the timing of edge 7 is 0, which is not greater than the reference timing 1, so the element corresponding to edge 7 is set to the second element 0. The generated mask vector is therefore [0, 1, 1, 0].
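As an illustration (not the patent's code), the mask-vector generation for this example can be sketched in a few lines of Python; build_mask is a name introduced here, and the first and second elements are represented as 1 and 0.

```python
def build_mask(edge_timings, reference_timing):
    """One element per neighbor edge: 1 (first element) if the edge's timing is
    greater than the reference timing, 0 (second element) otherwise."""
    return [1 if timing > reference_timing else 0 for timing in edge_timings]

# Node 2 in fig. 3: edges 3, 5, 6, 7 carry timings 1, 5, 2, 0; the reference
# timing is 1, so the mask vector is [0, 1, 1, 0].
print(build_mask([1, 5, 2, 0], reference_timing=1))
```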
S108: and performing alias sampling on each edge according to the alias table to obtain candidate edges.
And (5) performing alias sampling on each edge according to the alias table constructed according to the weight in the step (S104), and taking the sampled edge as a candidate edge. Specifically, a random number is generated within the range from 0 to the number of columns of the nickname table, and the generated random number is rounded to be used as the number of candidate columns, and the jth column in the nickname table corresponding to the number of candidate columns is determined, wherein j is a positive integer. And randomly generating a probability p1, judging whether the probability of the p1 is smaller than that of the edge originally corresponding to the jth row in the first array, if so, taking the edge originally corresponding to the jth row as a candidate edge, and if not, taking the edge transferred to the jth row in the second array as a candidate edge. Wherein p1 is a number in the range of 0 to 1. Continuing with fig. 4, assuming that a random number 1.7 is generated, the randomly generated number is rounded up to be the number of candidate columns, and the jth column in the nickname table corresponding to the number of candidate columns, i.e. 2 rounded up to 1.7, is determined to be the number of candidate columns, i.e. the 2 nd column in the corresponding nickname table. Assuming that a probability 0.2 is randomly generated, judging whether the randomly generated probability is smaller than the probability of the edge corresponding to the 2 nd row in the first array, if so, taking the edge corresponding to the 2 nd row as a candidate edge, if not, taking the edge transferred to the 2 nd row in the second array as a candidate edge, wherein the probability of 0.4 in the 2 nd row is edge 5,0.6 is edge 7, the probability of the edge corresponding to the 2 nd row is 0.4, and the randomly generated probability of 0.2 is smaller than the probability of the edge corresponding to the 2 nd row, and then taking the edge 5 corresponding to the 2 nd row as a candidate edge.
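A single alias-sampling draw as described in step S108 can be sketched as follows (illustrative Python, not the patent's code); the table below is the one derived for node 2, with the first array as prob and the second array as alias. The only difference from the text is that the column index is drawn directly as an integer rather than by rounding a real-valued random number, which is equivalent.

```python
import random

def alias_draw(edges, prob, alias):
    """Draw one candidate edge from an alias table in O(1): pick a column
    uniformly, keep its own edge with probability prob[j], otherwise take the
    edge that was transferred into the column."""
    j = random.randrange(len(edges))       # candidate column (0-based)
    if random.random() < prob[j]:          # p1 < probability of the column's own edge
        return edges[j]
    return alias[j]

# Alias table of node 2 from the example above.
edges = [3, 5, 6, 7]
prob = [1.0, 0.4, 0.4, 0.8]    # first array
alias = [None, 7, 7, 3]        # second array
print(alias_draw(edges, prob, alias))
```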
S110: and judging whether the candidate edge meets the time sequence constraint condition or not according to the mask vector, if so, executing a step S112, and otherwise, executing a step S108.
And determining corresponding elements of the candidate edges according to the mask vectors, and judging whether the candidate edges meet the time sequence constraint conditions according to the elements. Specifically, the corresponding elements of the candidate edge are determined according to the mask vector, when the corresponding elements of the candidate edge in the mask vector are the first elements, the candidate edge satisfies the timing constraint condition, and when the corresponding elements of the candidate edge in the mask vector are the second elements, the candidate edge does not satisfy the timing constraint condition. Continuing with the above example, assuming that the candidate edge obtained in step S108 is edge 5, and if the element of the edge 5 corresponding to the mask vector is the first element 1, the edge 5 satisfies the timing constraint condition, and step S112 is executed.
S112: adding the candidate edge as a sampling edge to the path.
If the candidate edge obtained in step S108 satisfies the timing constraint condition, the server adds the candidate edge as a sampling edge to the path, and continues to use the above example, assuming that the edge 5 is the candidate edge obtained in step S108 and satisfies the timing constraint condition, adds the edge 5 to the path, and the path that is currently randomly walked is node 1, node 2, and node 5.
S114: and when the current random walk is finished, taking the path of the current random walk as a generated training sample, wherein the training sample is used for inputting a graph neural network to be trained so as to train the graph neural network to be trained.
And when the current random walk is finished on the timing diagram, taking the path of the current random walk as a generated training sample, and inputting the training sample into the graph neural network to be trained so as to train the graph neural network to be trained. Specifically, when the length of the current random walk path reaches a set threshold, the current random walk path is used as a generated training sample, and the training sample is input to the graph neural network to be trained so as to train the graph neural network to be trained. The length of the path may be determined by the number of nodes in the path or by the number of edges in the path. Or when the candidate edge meeting the timing constraint condition cannot be obtained, taking the current path which is randomly walked as a generated training sample, and inputting the training sample into the graph neural network to be trained so as to train the graph neural network to be trained.
Specifically, path samples for training the graph neural network and their labels are obtained, where the graph neural network to be trained is used to find the shortest path from one node to another on the timing graph, a path sample is a path generated on the timing graph by a random walk from the one node to the other node, and the label of the path sample is the shortest path from the one node to the other node on the timing graph. The path sample is input to the graph neural network to be trained to obtain the network's output path, and the network is trained with the objective of minimizing the difference between its output path and the label of the path sample.
It can be seen from the above method that, when generating samples on a timing graph, the timing graph is sampled by random walk to generate path samples; an alias table is used for sampling, the element corresponding to the candidate edge in the mask vector is determined, whether the candidate edge satisfies the timing constraint condition is judged from that element, and if so, the candidate edge is added to the random walk path as a sampling edge. When the random walk ends, the output random walk path can be used as a training sample. Although the path samples are generated by randomly walking the timing graph, alias sampling improves the sampling efficiency, and the mask vector allows checking whether a candidate edge satisfies the timing constraint condition before adding it to the random walk path, so the random walk path is generated quickly and paths serving as training samples can be generated efficiently on the timing graph.
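Putting the pieces together, the following Python sketch shows one possible end-to-end walk loop for steps S100 to S114. It is an illustration under assumptions, not the patent's implementation: it reuses the build_alias_table, build_mask and alias_draw helpers sketched earlier, represents the timing graph as a hypothetical adjacency structure graph[node] = [(neighbor, timing, weight), ...], and treats the first step as having no reference timing.

```python
import random

def timed_random_walk(graph, start, max_len=10, max_tries=50):
    """Random walk on a timing graph that only follows edges whose timing is
    greater than the timing of the previously traversed edge."""
    path = [start]
    current = start
    reference_timing = float("-inf")        # no previous edge at the first step
    while len(path) < max_len:              # end condition: path length threshold
        neighbors = graph.get(current, [])
        if not neighbors:
            break
        local_edges = list(range(len(neighbors)))             # index edges 0..k-1 locally
        timings = [timing for _, timing, _ in neighbors]
        weights = [weight for _, _, weight in neighbors]
        table = build_alias_table(local_edges, weights)        # S104
        mask = build_mask(timings, reference_timing)           # S106
        if not any(mask):                                      # no edge can satisfy the constraint
            break
        for _ in range(max_tries):                             # S108/S110: resample until accepted
            candidate = alias_draw(*table)
            if mask[candidate]:
                neighbor, timing, _ = neighbors[candidate]     # S112: extend the path
                path.append(neighbor)
                current = neighbor
                reference_timing = timing
                break
        else:
            break
    return path                                                # S114: path used as a training sample

# Hypothetical adjacency structure loosely following fig. 3 around nodes 1 and 2.
graph = {
    1: [(2, 1, 1.0)],
    2: [(3, 1, 0.3), (5, 5, 0.1), (6, 2, 0.1), (7, 0, 0.5)],
    3: [], 5: [], 6: [], 7: [],
}
print(timed_random_walk(graph, start=1))   # e.g. [1, 2, 5] or [1, 2, 6]
```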
In step S106, for each edge, whether the edge satisfies the timing constraint condition is judged according to the timing of the edge, the reference timing, and the timing constraint condition corresponding to the timing graph, and if so, the element corresponding to the edge in the mask vector is set to the first element. It may additionally be checked, before setting that element to the first element, that the edge is not already in the path of the current random walk. Specifically, if the current random walk samples without replacement, i.e., an edge already in the current random walk path may not be added again, then for each edge it is first judged whether the edge satisfies the timing constraint condition according to the timing of the edge, the reference timing, and the timing constraint condition corresponding to the timing graph; if so, it is then determined whether the edge already exists in the current random walk path; if it does not, the element corresponding to the edge in the mask vector is set to the first element, and if it does, the element is set to the second element. For example, continuing with fig. 3, assume the current random walk path is node 2, node 6, node 3, node 6, so the current node is node 6 and the edge connected to it is edge 3, whose timing is 4 while the reference timing is 3; edge 3 therefore satisfies the timing constraint condition. It is then determined whether this edge already exists in the current random walk path: the path already contains the step from node 6 to node 3 (node 6 and node 3 are connected by edge 3), i.e., edge 3 is already in the current random walk path, so the element corresponding to edge 3 in the mask vector is set to the second element.
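The without-replacement variant described above only changes the mask: an edge gets the first element 1 when it both satisfies the timing constraint and is not already on the walked path. A minimal illustrative sketch (function name and example values introduced here, not from the patent):

```python
def build_mask_no_repeat(edge_ids, edge_timings, reference_timing, visited_edges):
    """First element 1 only for edges that satisfy the timing constraint and are
    not already in the current random walk path."""
    return [1 if timing > reference_timing and edge not in visited_edges else 0
            for edge, timing in zip(edge_ids, edge_timings)]

# Edge 3 has timing 4 > reference timing 3, but it is already on the path, so
# its element is forced to the second element 0.
print(build_mask_no_repeat([3], [4], reference_timing=3, visited_edges={3}))
```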
In step S110, whether the candidate edge satisfies the timing constraint condition is judged according to the mask vector; if so, step S112 is executed, and otherwise step S108 is executed. When the candidate edge does not satisfy the timing constraint condition, alias sampling is performed on the edges again according to the alias table until a candidate edge obtained by alias sampling satisfies the timing constraint condition. Specifically, when the element corresponding to the candidate edge in the mask vector is the first element, the candidate edge satisfies the timing constraint condition and step S112 is executed; when the element is the second element, the candidate edge does not satisfy the timing constraint condition, so the process returns to step S108 and alias sampling is performed again using the alias table generated in step S104 until a candidate edge obtained by alias sampling satisfies the timing constraint condition, after which step S112 is executed. Continuing with fig. 3, suppose the candidate edge obtained in step S108 is edge 7; the element corresponding to edge 7 in the mask vector is the second element 0, so edge 7 does not satisfy the timing constraint condition, step S108 must be executed again, and alias sampling is repeated using the alias table generated in step S104 until a candidate edge that satisfies the timing constraint condition is obtained.
In step S112 of the method, when the candidate edge satisfies the timing constraint condition and has been added to the path as a sampling edge, the neighbor node connected to the current node via the sampling edge in the timing graph is taken as the new current node of the random walk, and steps S100 to S112 are executed again for the newly determined current node until the current random walk ends. For example, continuing the example in step S112, edge 5 is the candidate edge obtained in step S108 and satisfies the timing constraint condition, so edge 5 is added to the path and the path of the current random walk is node 1, node 2, node 5. The neighbor node connected to the current node via the sampling edge, i.e., node 5, becomes the new current node, and steps S100 to S112 are executed again starting from node 5 until the current random walk ends.
The method for generating timing graph samples shown in fig. 2 can efficiently generate paths as training samples from a timing graph: the element of each edge in the mask vector is set according to the timing of the edge from the previous node to the current node, the timing of each edge connected to the current node, and the timing constraint condition of the timing graph; the mask vector is used to judge whether a candidate edge obtained by alias sampling with the alias table satisfies the timing constraint condition; a candidate edge that satisfies it is added to the current random walk path as a sampling edge; and when the current random walk ends, the path is taken as the generated training sample. This improves the sampling efficiency and hence the speed at which random walk paths are generated, so that paths serving as training samples can be generated efficiently. However, some edges do not satisfy the timing constraint condition, i.e., the edges set to the second element in the method of fig. 2. During alias sampling, edges that do not satisfy the timing constraint condition are sampled together with edges that do; although the element corresponding to the candidate edge in the mask vector is checked after the candidate edge is obtained, the fact that constraint-violating edges take part in the alias sampling can reduce the sampling efficiency.
Based on this, the edges connected to the current node may be arranged in a specified order of their timing information, the arranged edges may be grouped, and the maximum timing of each group determined. For each group, an alias table of the group is constructed according to the weights of the edges in the group, and a mask vector of the group is generated according to the timing of each edge in the group, the reference timing, and the timing constraint condition corresponding to the timing graph. According to the reference timing and the maximum timing of each group, the groups meeting the condition are determined, and one group is randomly selected from them as the candidate group. Alias sampling is then performed on the candidate group, i.e., the operations of steps S108 to S114 in fig. 2 are performed. The specified order may be increasing or decreasing; for convenience of description, the increasing order is taken as an example, as shown in fig. 5. Fig. 5 is a schematic flow chart of a method for generating samples when the edges in the timing graph are sorted, which specifically includes the following steps:
s200: and determining each edge connected with the current node, and arranging each edge according to the increasing sequence of the time sequence information to obtain the increasing sequence.
Determining each edge (namely, neighbor edge) connected with the current node, and arranging each edge according to the ascending order of the time sequence information. Wherein the current node is the current random walk-to node in the timing graph. Continuing with the example in step S102 in the method shown in fig. 2, the nodes connected to the current node 2 by the edges include node 3, node 5, node 6, and node 7, and the neighboring edges connected to node 2 include edge 3, edge 5, edge 6, and edge 7 by using the connected node number as the number of the connecting edge. The time sequences of the edge 3, the edge 5, the edge 6 and the edge 7 are respectively 1, 5, 2 and 0, and the edges are arranged according to the time sequences of all the adjacent edges and the increasing sequence of the time sequence information to obtain the increasing sequence, namely the edges 3, the edge 5, the edge 6 and the edges 7 are arranged according to the increasing sequence of the time sequence information to obtain the increasing sequence of the edges 7,3, 5 and 6.
S202: and grouping the increasing sequence, and determining the maximum time sequence of each group.
And grouping the edges which are arranged in an increasing order, wherein each group can comprise the same number of edges or different numbers of edges, and determining the maximum time sequence of each group. Continuing with the above example, the increasing order is grouped, i.e. the sides 7,3, 5, and 6 are grouped, assuming that the groups are divided into two groups a and B, where the group a includes the sides 7 and 3, the group B includes the sides 6 and 5, the maximum timing of the group a is 1, and the maximum timing of the group B is 5.
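For illustration, the sorting and grouping of steps S200 and S202 can be sketched as follows (Python, names introduced here, not the patent's code); with a group size of 2 it reproduces the A/B grouping above.

```python
def group_edges_by_timing(edge_ids, edge_timings, group_size=2):
    """Sort edges by timing, split them into consecutive groups and return the
    groups together with each group's maximum timing."""
    ordered = sorted(zip(edge_ids, edge_timings), key=lambda pair: pair[1])
    groups = [ordered[i:i + group_size] for i in range(0, len(ordered), group_size)]
    max_timings = [group[-1][1] for group in groups]   # last edge of a group has its max timing
    return groups, max_timings

# Node 2 in fig. 3: edges 3, 5, 6, 7 with timings 1, 5, 2, 0.
groups, max_timings = group_edges_by_timing([3, 5, 6, 7], [1, 5, 2, 0])
print(groups)        # [[(7, 0), (3, 1)], [(6, 2), (5, 5)]]
print(max_timings)   # [1, 5]
```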
S204: and aiming at each group, constructing a nickname table of the group according to the weight of each edge in the group.
For each group, the nickname table of the group is constructed according to the weight of each edge in the group, and the construction process of the nickname table is substantially the same as that of step S104 in the method shown in fig. 2, and is not described here again. Continuing with the above example, assume that the weights of the neighbor edges of node 2 are as shown in Table 2.
Group    Neighbor edge    Weight
A        Edge 7           0.5
A        Edge 3           0.3
B        Edge 6           0.1
B        Edge 5           0.1
TABLE 2
Based on the weights shown in table 2, since the weights of the neighbor edges within each group in table 2 are not normalized, the weights of the neighbor edges in each group need to be normalized first. The weights of edge 7 and edge 3 in group A are 0.5 and 0.3, respectively; after normalization, their normalized weights are 5/8 and 3/8. The weights of edge 6 and edge 5 in group B are 0.1 and 0.1, respectively; after normalization, their normalized weights are 1/2 and 1/2. For each group, the product of the normalized weight of each neighbor edge and the number of neighbor edges in the group is taken as the adjusted weight of that edge, i.e., each weight in the group is multiplied by 2. After adjustment, the adjusted weights of edge 7 and edge 3 in group A are 5/4 and 3/4, respectively, and the adjusted weights of edge 6 and edge 5 in group B are 1 and 1, respectively.
A histogram is created based on the adjusted weights, as shown in fig. 6. The 2 neighbor edges of group A are divided into 2 columns, each corresponding to one neighbor edge, and the height of each column is the adjusted weight of the corresponding edge: column 1 (edge 7) is 5/4 and column 2 (edge 3) is 3/4. The weight of the neighbor edge whose adjusted weight is greater than 1 is transferred to the neighbor edge whose adjusted weight is less than 1, so that each column of the adjusted histogram of group A has height 1 and contains at most two neighbor edges: the part of edge 7's adjusted weight above 1 (i.e., 1/4) is transferred to the column of edge 3, so that every column has height 1. In the adjusted histogram of group A, the probability of edge 7 in column 1 is 1, and in column 2 the probability of edge 3 is 3/4 and that of edge 7 is 1/4. The adjusted histogram of group B is obtained in the same way. An alias table is generated from each adjusted histogram shown in fig. 6; the adjusted histogram can be used directly as the alias table, with the probability of the edge originally corresponding to each column stored in the first array and the number of the edge transferred into each column stored in the second array. Thus, for group A the first array stores (1, 3/4) and the second array stores (null, 7); for group B the first array stores (1, 1) and the second array stores (null, null).
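Step S204 simply applies the earlier alias-table construction to each group with the weights re-normalized inside the group. A minimal illustrative sketch, assuming the build_alias_table helper and grouping output from the earlier sketches and the hypothetical per-edge weights of table 2:

```python
# Per-edge weights from table 2 (hypothetical helper data for the example).
weights = {3: 0.3, 5: 0.1, 6: 0.1, 7: 0.5}
groups = [[(7, 0), (3, 1)], [(6, 2), (5, 5)]]   # output of the grouping sketch

# One alias table per group; build_alias_table normalizes within the group.
group_tables = [
    build_alias_table([edge for edge, _ in group], [weights[edge] for edge, _ in group])
    for group in groups
]
print(group_tables)   # group A: prob (1, 3/4), alias (None, 7); group B: prob (1, 1), alias (None, None)
```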
S206: and aiming at each group, generating a mask vector of the group according to the time sequence of each side in the group, the reference time sequence and the time sequence constraint condition corresponding to the time sequence.
And determining the time sequence from the previous node to the edge of the current node in the time sequence chart as a reference time sequence according to the current node and the previous node. And aiming at each group, generating a mask vector of the group according to the time sequence of each side in the group, the reference time sequence and the time sequence constraint condition corresponding to the time sequence. Specifically, the timing constraint condition of the timing diagram is determined first, and whether the timing constraint condition is satisfied is determined as follows: when the time sequence of the edge is greater than the reference time sequence, the time sequence constraint condition is satisfied, and when the time sequence of the edge is not greater than the reference time sequence, the time sequence constraint condition is not satisfied. And aiming at each edge in each group, judging whether the edge meets the time sequence constraint condition according to the time sequence of the edge, the reference time sequence and the time sequence constraint condition corresponding to the time sequence, if so, setting an element of the edge corresponding to the mask vector as a first element, and otherwise, setting an element of the edge corresponding to the mask vector as a second element.
Continuing with the above example, when the reference timing sequence of the current random walk on the graph is 1, when the element corresponding to each neighbor edge is set, the element corresponding to the neighbor edge whose timing sequence is greater than 1 on the neighbor edge in the mask vector is set as the first element 1, the element corresponding to the neighbor edge whose timing sequence is not greater than 1 on the neighbor edge in the mask vector is set as the second element 0, after the setting of each group of edges is completed, the mask vector of each group is generated, the timing sequence on the edge 7 is 0 and is not greater than the reference timing sequence 1, so the element corresponding to the edge 7 in the mask vector is set as the second element 0, the timing sequence on the edge 3 is set as the second element 0 and is not greater than the reference timing sequence 1, so the element corresponding to the edge 3 in the mask vector is set as the second element 0, and the generated group a mask vector is [0,0]. The timing on edge 6 is 2, which is greater than the reference timing 1, so the corresponding element of edge 6 in the mask vector is set to the first element 1, the timing on edge 5 is 5, which is greater than the reference timing 1, so the corresponding element of edge 5 in the mask vector is set to the first element 1, and the resulting group B mask vector is [1,1].
S208: the groups meeting the condition are determined according to the reference timing and the maximum timing of each group, and one group is randomly selected from the groups meeting the condition as the candidate group.
The qualified groups are determined according to the reference timing and the maximum timing of each group: for each group, when the maximum timing of the group is greater than the reference timing, the group is determined to be a qualified group, and when the maximum timing of the group is not greater than the reference timing, the group is determined not to be a qualified group. One group is then randomly selected from the qualified groups as the candidate group. Specifically, since the edges are arranged in ascending order of timing, the groups are also arranged in ascending order, so the smallest maximum timing that is greater than the reference timing can be determined from the maximum timing of each group and the reference timing and used as the boundary timing, and the group containing the boundary timing is used as the boundary group. The maximum timings of the groups arranged after the boundary group are then all greater than the reference timing, that is, the boundary group and the groups arranged after it can be taken directly as the qualified groups, and one group is randomly selected from them as the candidate group. Continuing with the above example, the reference timing is 1. Group A contains edge 7 and edge 3, and the maximum timing of group A is 1, which is not greater than the reference timing, so group A is not a qualified group. Group B contains edge 6 and edge 5, and the maximum timing of group B is 5, which is greater than the reference timing, so group B is a qualified group. Since only one group is qualified in this example, it is directly used as the candidate group, that is, group B is the candidate group.
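A minimal sketch of this candidate-group selection, assuming the groups are stored in ascending order of their maximum timing, might be:

```python
import bisect
import random

def pick_candidate_group(group_max_timings, reference_timing):
    """Return the index of a randomly chosen qualified group, or None.

    group_max_timings is assumed to be sorted in ascending order, so the first
    group whose maximum timing exceeds the reference timing (the boundary group)
    is found by binary search; it and every group after it are qualified.
    """
    boundary = bisect.bisect_right(group_max_timings, reference_timing)
    if boundary == len(group_max_timings):
        return None                       # no group satisfies the condition
    return random.randrange(boundary, len(group_max_timings))

# Example: group A has maximum timing 1, group B has maximum timing 5, reference timing 1
print(pick_candidate_group([1, 5], 1))    # always 1, i.e. group B
```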
S210: alias sampling is performed on the edges according to the alias table of the candidate group to obtain a candidate edge.
Alias sampling is performed on the edges according to the alias table of the candidate group, and the sampled edge is taken as the candidate edge. Specifically, a random number is generated in the range from 0 to the number of columns of the candidate group's alias table, the generated random number is rounded up and used as the candidate column number, and the jth column of the candidate group's alias table corresponding to the candidate column number is determined, where j is a positive integer. A probability p1 is then generated at random, and it is judged whether p1 is smaller than the probability, stored in the first array, of the edge originally corresponding to the jth column; if so, the edge originally corresponding to the jth column is taken as the candidate edge, and if not, the edge transferred into the jth column, as recorded in the second array, is taken as the candidate edge. Here p1 is a number in the range of 0 to 1. Continuing with FIG. 6, assume that the random number 1.7 is generated; rounding it up gives the candidate column number 2, so the 2nd column of the alias table of the candidate group (i.e., group B) is determined. Assume that the probability 0.2 is then generated at random; it is judged whether this probability is smaller than the probability of the edge originally corresponding to the 2nd column in the first array, and if so, the edge originally corresponding to the 2nd column is taken as the candidate edge, otherwise the edge transferred into the 2nd column in the second array is taken as the candidate edge. Since the 2nd column contains only edge 5, edge 5 is directly taken as the candidate edge.
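One alias-sampling draw, following the column-number and probability comparison just described, might be sketched as follows (the function name and the explicit (edges, prob, alias) representation of the table are assumptions):

```python
import math
import random

def alias_draw(edges, prob, alias):
    """Draw one edge from a group via its alias table.

    edges[i] is the edge originally corresponding to column i, prob[i] the
    probability kept for it, and alias[i] the index of the edge transferred
    into column i (None if the column holds a single edge).
    """
    # Random number in [0, number of columns], rounded up -> 1-based column number j.
    # max(1, ...) guards against the (measure-zero) draw of exactly 0.
    j = max(1, math.ceil(random.uniform(0, len(edges))))
    col = j - 1
    p1 = random.random()
    if alias[col] is None or p1 < prob[col]:
        return edges[col]                 # the edge originally in column j
    return edges[alias[col]]              # the edge transferred into column j

# Group B from the example: columns hold edge 6 and edge 5, both with probability 1
print(alias_draw([6, 5], [1.0, 1.0], [None, None]))   # 6 or 5, each with probability 1/2
```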
S212: whether the candidate edge satisfies the timing constraint condition is judged according to the mask vector of the candidate group; if so, step S214 is executed, otherwise step S208 is executed.
The element corresponding to the candidate edge is determined from the mask vector, and whether the candidate edge satisfies the timing constraint condition is judged from that element. Specifically, when the element corresponding to the candidate edge in the mask vector is the first element, the candidate edge satisfies the timing constraint condition; when it is the second element, the candidate edge does not satisfy the timing constraint condition. When the candidate edge does not satisfy the timing constraint condition, step S208 is executed again to reselect a group, and the operations of steps S210 to S212 are repeated until the candidate edge obtained by alias sampling satisfies the timing constraint condition, after which step S214 is executed. Continuing with the above example, assuming that the candidate edge obtained in step S210 is edge 5 and the element corresponding to edge 5 in the mask vector is the first element 1, edge 5 satisfies the timing constraint condition and step S214 is executed.
S214: the candidate edge is added to the path as a sampling edge.
If the candidate edge satisfies the timing constraint condition, the server adds the candidate edge to the path as a sampling edge, takes the neighbor node connected to the current node through the sampling edge in the timing graph as the new current node of the current random walk, and re-executes steps S200 to S216 according to the newly determined current node until the current random walk ends. Continuing with the above example, assuming that edge 5 is the candidate edge obtained in step S210 and satisfies the timing constraint condition, edge 5 is added to the path; the path of the current random walk is then node 1, node 2, node 5, and node 5 is taken as the current node to which the current random walk has now walked.
S216: when the current random walk ends, the path of the current random walk is taken as the generated training sample, where the training sample is used as input to the graph neural network to be trained so as to train the graph neural network to be trained.
When the current random walk on the timing graph ends, the path of the current random walk is taken as the generated training sample, and the training sample is input into the graph neural network to be trained so as to train it. Specifically, when the length of the path of the current random walk reaches a set threshold, the path is taken as the generated training sample and input into the graph neural network to be trained. The length of the path may be measured by the number of nodes in the path or by the number of edges in the path. Alternatively, when no candidate edge satisfying the timing constraint condition can be obtained, the current path of the random walk is taken as the generated training sample and input into the graph neural network to be trained.
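Putting the steps together, a simplified, self-contained sketch of one timed random walk is shown below. For brevity it filters the admissible neighbor edges first and draws directly by weight instead of using the per-group alias tables and mask vectors sketched above, but the timing constraint and the two termination conditions (path length threshold, no admissible edge) are the same; the toy graph, node labels, and function names are assumptions.

```python
import random

# Toy timing graph: node -> list of (neighbor, edge_timing, edge_weight); assumed data.
graph = {
    1: [(2, 1, 1.0)],
    2: [(5, 5, 1.0), (6, 2, 1.0), (7, 0, 1.0), (3, 1, 1.0)],
    5: [],
}

def timed_random_walk(start, max_len=5):
    """Generate one training path: each step must use an edge whose timing is
    strictly greater than the timing of the edge used to reach the current node."""
    path = [start]
    current, reference_timing = start, float("-inf")
    while len(path) < max_len:
        valid = [(n, t, w) for n, t, w in graph.get(current, [])
                 if t > reference_timing]          # timing constraint
        if not valid:                              # no admissible edge: walk ends
            break
        weights = [w for _, _, w in valid]
        nxt, t, _ = random.choices(valid, weights=weights, k=1)[0]
        path.append(nxt)
        current, reference_timing = nxt, t
    return path

print(timed_random_walk(1))   # e.g. [1, 2, 5] or [1, 2, 6]
```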
As for the training itself, a path sample for training the graph neural network to be trained and a label of the path sample are obtained, where the graph neural network to be trained is used to obtain the shortest path from one node to another node on the timing graph, the path sample is a randomly walked path generated on the timing graph from that one node to the other node as a training sample, and the label of the path sample is the shortest path from that one node to the other node on the timing graph. The path sample is input into the graph neural network to be trained to obtain its output path, and the graph neural network to be trained is trained with the objective of minimizing the difference between its output path and the label of the path sample.
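The specification does not fix a model architecture, a path encoding, or a concrete loss, so the following PyTorch sketch is only a hypothetical illustration of training on (random-walk path, shortest-path label) pairs; the PathScorer module, the next-node cross-entropy loss, and the toy tensors are all assumptions.

```python
import torch
import torch.nn as nn

class PathScorer(nn.Module):
    """Toy stand-in for the graph neural network to be trained (assumed architecture)."""
    def __init__(self, num_nodes: int, dim: int = 32):
        super().__init__()
        self.embed = nn.Embedding(num_nodes, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, num_nodes)

    def forward(self, path: torch.Tensor) -> torch.Tensor:
        # path: (batch, length) node ids of a sampled random-walk path
        h, _ = self.rnn(self.embed(path))
        return self.head(h)               # (batch, length, num_nodes) node logits

num_nodes = 8
model = PathScorer(num_nodes)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One training step: the random-walk path is the sample, the shortest path is the
# label (padded/truncated to the same length here purely for simplicity).
sample = torch.tensor([[1, 2, 5, 7]])     # path generated by the timed random walk
label = torch.tensor([[1, 2, 7, 7]])      # shortest-path label (hypothetical)
logits = model(sample)
loss = loss_fn(logits.reshape(-1, num_nodes), label.reshape(-1))
opt.zero_grad()
loss.backward()
opt.step()
```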
In the above method, the edges arranged in ascending order of timing are grouped, and one group is randomly selected from the groups whose maximum timing is greater than the timing of the edge from the previous node to the current node. This reduces the probability of selecting an edge that does not satisfy the timing constraint condition, improves the sampling efficiency, and speeds up the generation of random walk paths, so that paths serving as training samples can be generated efficiently.
Based on the same idea, the present specification further provides a corresponding apparatus for generating a timing chart sample, as shown in fig. 7.
Fig. 7 is a schematic diagram of a timing chart sample generation apparatus provided in this specification, which specifically includes:
an obtaining module 300, configured to determine, in a timing chart, a current node to which the current random walk arrives and a node that is previous to the current node in a path of the current random walk;
a timing determining module 302, configured to determine a timing of an edge from the previous node to the current node in the timing graph as a reference timing;
a generating module 304, configured to determine each edge connected to the current node in the timing graph, construct an alias table according to the weight of each edge, and generate a mask vector according to the timing of each edge, the reference timing, and a timing constraint condition corresponding to the timing graph;
a sampling module 306, configured to perform alias sampling on each edge according to the alias table to obtain a candidate edge;
a first determining module 308, configured to determine whether the candidate edge satisfies the timing constraint condition according to the mask vector;
a processing module 310, configured to, if a determination result of the first determining module is yes, add the candidate edge as a sampling edge to the path;
an output module 312, configured to, when the current random walk is finished, use a path of the current random walk as a generated training sample, where the training sample is used to input a to-be-trained graph neural network, so as to train the to-be-trained graph neural network.
Optionally, the generating module 304 is specifically configured to, for each edge, determine whether the edge satisfies the timing constraint condition according to the timing of the edge, the reference timing, and the timing constraint condition corresponding to the timing graph; if so, set the element corresponding to the edge in the mask vector as a first element, otherwise set the element corresponding to the edge in the mask vector as a second element;
the first determining module 308 is specifically configured to determine that the candidate edge satisfies the timing constraint condition when an element of the candidate edge corresponding to the mask vector is the first element; and when the element of the candidate edge corresponding to the mask vector is the second element, determining that the candidate edge does not satisfy the timing constraint condition.
Optionally, the generating module 304 is further configured to determine that the edge is not in the current randomly-traveled path before setting the element of the edge corresponding to the mask vector as the first element.
Optionally, the generating module 304 is specifically configured to, when the time sequence of the edge is greater than the reference time sequence, determine that the edge satisfies a time sequence constraint condition; and when the time sequence of the edge is not greater than the reference time sequence, determining that the edge does not meet the time sequence constraint condition.
Optionally, the sampling module 306 is further configured to, when the determination result of the first determining module 308 is negative, perform alias sampling on each edge according to the alias table again to obtain the candidate edge until the candidate edge satisfies the timing constraint condition.
Optionally, the processing module 310 is further configured to, when the determination result of the first determining module 308 is yes, instruct the obtaining module 300 to take a neighboring node connected to the current node via the sampling edge in the timing diagram as the current node to which the current random walk arrives again, and instruct the timing determining module 302, the generating module 304, and the sampling module 306 to perform alias sampling again according to the re-determined current node until the current random walk ends.
Optionally, the apparatus further comprises:
a second determining module 314, configured to determine that the current random walk ends when the number of nodes in the path of the current random walk has reached a set threshold; or, when the candidate edge meeting the timing constraint condition cannot be obtained, determining that the current random walk is finished.
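Purely as an illustrative sketch (all class, method, and attribute names are assumptions), the module layout above could be mirrored by a small Python class whose methods correspond to the obtaining, timing determination, generation, sampling, judging, processing, and output modules:

```python
class TimingGraphSampleGenerator:
    """Illustrative skeleton mirroring modules 300-314 of FIG. 7 (names assumed)."""

    def __init__(self, timing_graph, max_path_len):
        self.graph = timing_graph            # assumed graph object with edge timings
        self.max_path_len = max_path_len

    def current_and_previous_node(self, path):        # obtaining module 300
        return path[-1], (path[-2] if len(path) > 1 else None)

    def reference_timing(self, prev, cur):            # timing determining module 302
        return self.graph.edge_timing(prev, cur) if prev is not None else float("-inf")

    def build_alias_table_and_mask(self, cur, ref):   # generating module 304
        ...

    def alias_sample(self, alias_table):              # sampling module 306
        ...

    def satisfies_constraint(self, edge, mask):       # first judging module 308
        ...

    def add_edge_to_path(self, path, edge):           # processing module 310
        path.append(edge)

    def walk_finished(self, path, has_candidate):     # second judging module 314
        return len(path) >= self.max_path_len or not has_candidate

    def emit_training_sample(self, path):             # output module 312
        return list(path)
```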
The present specification also provides a computer-readable storage medium storing a computer program, which can be used to execute the method for generating the timing chart sample shown in fig. 2.
This specification also provides a schematic block diagram of the electronic device shown in fig. 8. As shown in fig. 8, at the hardware level, the electronic device includes a processor, an internal bus, a network interface, a memory, and a non-volatile memory, but may also include hardware required for other services. The processor reads the corresponding computer program from the non-volatile memory into the memory and then runs the computer program to implement the method for generating the timing chart sample shown in fig. 2. Of course, besides the software implementation, this specification does not exclude other implementations, such as logic devices or combination of software and hardware, and so on, that is, the execution subject of the following processing flow is not limited to each logic unit, and may be hardware or logic devices.
In the 1990s, it was possible to clearly distinguish whether an improvement to a technology was an improvement in hardware (for example, an improvement to a circuit structure such as a diode, a transistor, or a switch) or an improvement in software (an improvement to a method flow). However, as technology has developed, many of today's improvements to method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain a corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be realized by a hardware entity module. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by a user programming the device. A designer "integrates" a digital system onto a single PLD by programming it, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, nowadays, instead of manually fabricating integrated circuit chips, this programming is mostly implemented with "logic compiler" software, which is similar to a software compiler used in program development, and the original code to be compiled must be written in a specific programming language called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language), among which VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used. It should also be clear to those skilled in the art that a hardware circuit implementing a logical method flow can easily be obtained merely by slightly logically programming the method flow in one of the above hardware description languages and programming it into an integrated circuit.
The controller may be implemented in any suitable manner; for example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of such controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20 and Silicon Labs C8051F320; a memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art also know that, in addition to implementing the controller purely as computer-readable program code, the same functionality can be implemented by logically programming the method steps so that the controller takes the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a controller may therefore be regarded as a hardware component, and the means included in it for performing the various functions may also be regarded as structures within the hardware component. Or even the means for performing the functions may be regarded both as software modules for implementing the method and as structures within the hardware component.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the various elements may be implemented in the same one or more pieces of software and/or hardware in the practice of this description.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read-Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, Phase-change Memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memory or other memory technology, Compact Disc Read-Only Memory (CD-ROM), Digital Versatile Disc (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the description may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
This description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
All the embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only an example of the present specification, and is not intended to limit the present specification. Various modifications and alterations to this description will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present specification should be included in the scope of the claims of the present specification.

Claims (16)

1. A method of generating timing diagram samples, comprising:
determining a current node to which the current random walk arrives and a node which is previous to the current node in a path of the current random walk in a time sequence diagram;
determining the time sequence of the edge from the last node to the current node in the time sequence chart as a reference time sequence;
determining each edge connected with the current node in the time sequence diagram, constructing an alias table according to the weight of each edge, and generating a mask vector according to the time sequence of each edge, the reference time sequence and the time sequence constraint condition corresponding to the time sequence diagram;
performing alias sampling on each edge according to the alias table to obtain candidate edges;
judging whether the candidate edge meets the time sequence constraint condition or not according to the mask vector;
if so, adding the candidate edge as a sampling edge into the path;
and when the current random walk is finished, taking the path of the current random walk as a generated training sample, wherein the training sample is used for inputting a graph neural network to be trained so as to train the graph neural network to be trained.
2. The method according to claim 1, wherein generating a mask vector according to the timing sequence of each edge, the reference timing sequence, and the timing constraint condition corresponding to the timing graph includes:
for each edge, judging whether the edge meets the time sequence constraint condition or not according to the time sequence of the edge, the reference time sequence and the time sequence constraint condition corresponding to the time sequence chart, if so, setting an element of the edge corresponding to the mask vector as a first element, and otherwise, setting an element of the edge corresponding to the mask vector as a second element;
judging whether the candidate edge meets the time sequence constraint condition specifically comprises the following steps:
when the element of the candidate edge corresponding to the mask vector is the first element, determining that the candidate edge satisfies the timing constraint condition;
and when the element of the candidate edge corresponding to the mask vector is the second element, determining that the candidate edge does not satisfy the timing constraint condition.
3. The method of claim 2, prior to setting the element of the edge corresponding in the mask vector to the first element, the method further comprising:
determining that the edge is not in the current randomly walked path.
4. The method of claim 2, wherein determining whether the edge satisfies the timing constraint specifically comprises:
when the time sequence of the edge is greater than the reference time sequence, determining that the edge meets a time sequence constraint condition;
and when the time sequence of the edge is not greater than the reference time sequence, determining that the edge does not meet the time sequence constraint condition.
5. The method of claim 1, when the candidate edge does not satisfy the timing constraint, the method further comprising:
and performing alias sampling on each edge again according to the alias table until candidate edges obtained by alias sampling meet the time sequence constraint condition.
6. The method of claim 1, when the candidate edge satisfies the timing constraint, the method further comprising:
taking the neighbor node connected with the current node through the sampling edge in the timing diagram as the current node to which the current random walk is carried out again;
and carrying out alias sampling again according to the redetermined current node until the current random walk is finished.
7. The method according to claim 6, wherein the end of the current random walk specifically includes:
when the length of the path of the current random walk reaches a set threshold value, the current random walk is finished; or
And when the candidate edge meeting the time sequence constraint condition cannot be obtained, finishing the current random walk.
8. An apparatus for generating timing diagram samples, comprising:
the acquisition module is used for determining a current node which is currently and randomly walked to and a previous node of the current node in a path which is currently and randomly walked;
a timing determination module, configured to determine a timing of an edge from the previous node to the current node in the timing graph, as a reference timing;
a generating module, configured to determine each edge connected to the current node in the timing graph, construct an alias table according to the weight of each edge, and generate a mask vector according to the timing sequence of each edge, the reference timing sequence, and a timing constraint condition corresponding to the timing graph;
the sampling module is used for carrying out alias sampling on each edge according to the alias table to obtain candidate edges;
the first judging module is used for judging whether the candidate edge meets the time sequence constraint condition or not according to the mask vector;
the processing module is used for adding the candidate edge as a sampling edge into the path when the judgment result of the first judgment module is yes;
and the output module is used for taking the path of the current random walk as a generated training sample when the current random walk is finished, wherein the training sample is used for inputting the graph neural network to be trained so as to train the graph neural network to be trained.
9. The apparatus according to claim 8, wherein the generating module is specifically configured to determine whether the edge satisfies the timing constraint condition according to the timing of the edge, the reference timing, and the timing constraint condition corresponding to the timing graph, if so, set an element of the edge corresponding to the mask vector as a first element, otherwise, set an element of the edge corresponding to the mask vector as a second element;
the first determining module is specifically configured to determine that the candidate edge satisfies the timing constraint condition when an element of the candidate edge corresponding to the mask vector is the first element; and when the element of the candidate edge corresponding to the mask vector is the second element, determining that the candidate edge does not satisfy the timing constraint condition.
10. The apparatus of claim 9, the generation module, prior to setting the element of the edge corresponding to the mask vector as the first element, is further to determine that the edge is not in the path of the current random walk.
11. The apparatus of claim 9, wherein the generating module is specifically configured to determine that the edge satisfies a timing constraint when the timing of the edge is greater than the reference timing; and when the time sequence of the edge is not greater than the reference time sequence, determining that the edge does not meet the time sequence constraint condition.
12. The apparatus according to claim 8, wherein the sampling module is further configured to, when the determination result of the first determining module is negative, perform alias sampling on the edges again according to the alias table to obtain the candidate edges until the candidate edges satisfy the timing constraint condition.
13. The apparatus according to claim 8, wherein the processing module is further configured to, when the determination result of the first determining module is yes, instruct the obtaining module to re-use a neighbor node in the timing diagram, to which the current node is connected via the sampling edge, as the current node to which the current random walk arrives, and instruct the timing determining module, the generating module, and the sampling module to perform alias sampling again according to the re-determined current node until the current random walk ends.
14. The apparatus of claim 13, the apparatus further comprising:
a second judging module, configured to determine that the current random walk ends when the number of nodes in the path of the current random walk has reached a set threshold; or, when the candidate edge meeting the timing constraint condition cannot be obtained, determining that the current random walk is finished.
15. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method of any one of the preceding claims 1 to 7.
16. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of any of the preceding claims 1 to 7 when executing the program.
CN202211185663.6A 2022-09-27 2022-09-27 Method and device for generating timing diagram sample, storage medium and electronic equipment Pending CN115511053A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211185663.6A CN115511053A (en) 2022-09-27 2022-09-27 Method and device for generating timing diagram sample, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211185663.6A CN115511053A (en) 2022-09-27 2022-09-27 Method and device for generating timing diagram sample, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN115511053A true CN115511053A (en) 2022-12-23

Family

ID=84506523

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211185663.6A Pending CN115511053A (en) 2022-09-27 2022-09-27 Method and device for generating timing diagram sample, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN115511053A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination