CN113849698A - Method and device for determining pre-estimated weight matrix of undirected graph

Publication number
CN113849698A
Authority
CN
China
Prior art keywords
weight matrix
vertex
current
matrix
grouping
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111082304.3A
Other languages
Chinese (zh)
Inventor
庞博
曾凤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202111082304.3A priority Critical patent/CN113849698A/en
Publication of CN113849698A publication Critical patent/CN113849698A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Software Systems (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computing Systems (AREA)
  • Algebra (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method for determining a pre-estimated weight matrix of an undirected graph is provided. An original weight matrix $W_0$ and original grouping information are obtained, and the current weight matrix $W_x$ and the current grouping information are initialized to $W_0$ and the original grouping information, respectively. The following steps are executed in a loop until a loop end condition is met: according to $W_x$ and the current grouping information, the current total network information is calculated as
$$T = \sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} W_{ij} Q_{ij}$$
where $Q_{ij}$ is the degree of difference between vertex i and vertex j, which is inversely related to the probability that vertex i and vertex j belong to the same group; whether the loop end condition is met is then judged; if so, the loop ends, and if not, $W_x$ and the current grouping information are updated. After the loop ends, $W_x$ at the end of the loop is determined as the estimated weight matrix of the undirected graph.

Description

Method and device for determining pre-estimated weight matrix of undirected graph
Technical Field
One or more embodiments of the present disclosure relate to the field of graph application technologies, and in particular, to a method and an apparatus for determining a pre-estimated weight matrix of an undirected graph.
Background
Graph theory has a wide range of practical applications, and undirected graphs, as a branch of graph theory, are likewise widely used. When an undirected graph is applied in practice, it must be constructed according to the actual application scenario: what each vertex represents, what the edges between vertices represent, what the edge weights represent, and so on. Taking social relationships as an example, each vertex represents a user, an edge between two vertices indicates that a direct friend relationship exists between the two corresponding users, and the weight of the edge represents the closeness of the relationship between those two users.
In the related art, when calculations are performed on a constructed undirected graph, the graph is generally assumed by default to be accurate; that is, the undirected graph is assumed to have been constructed from comprehensive and accurate information about the application scenario, so that calculations based on it are also accurate.
In practice, however, the constructed undirected graph is often inaccurate because of missing data, dirty data, and the like, so calculations based on it are inaccurate as well.
Disclosure of Invention
In view of this, one or more embodiments of the present disclosure provide a method and an apparatus for determining an estimated weight matrix of an undirected graph.
To achieve the above object, one or more embodiments of the present disclosure provide the following technical solutions:
according to a first aspect of one or more embodiments of the present specification, there is provided a method for determining an estimated weight matrix of an undirected graph, the method including:
obtaining an original weight matrix $W_0$ and original grouping information of an undirected graph; wherein any element $W_{ij}$ of any weight matrix $W$ represents the weight between vertex i and vertex j, $W_{ij} > 0$, $i \neq j$, $i, j \in \{1, 2, 3, \dots, n\}$, and n is the total number of vertices;
taking the original weight matrix as a current weight matrix $W_x$, and taking the original grouping information as current grouping information;
executing the following steps in a loop until a loop end condition is met:
calculating, according to the current weight matrix $W_x$ and the current grouping information, the current total network information of the undirected graph
$$T = \sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} W_{ij} Q_{ij}$$
where $Q_{ij}$ is the degree of difference between vertex i and vertex j, which is inversely related to the probability that vertex i and vertex j belong to the same group; the probability that vertex i and vertex j belong to the same group is obtained from the grouping information;
if the loop end condition is not met, updating the current weight matrix $W_x$ and the current grouping information according to a preset updating algorithm based on T;
after the loop ends, determining the current weight matrix $W_x$ at the end of the loop as the estimated weight matrix of the undirected graph.
According to a second aspect of one or more embodiments of the present specification, there is provided an apparatus for determining an estimated weight matrix of an undirected graph, the apparatus comprising:
an obtaining module, configured to obtain an original weight matrix $W_0$ and original grouping information of an undirected graph; wherein any element $W_{ij}$ of any weight matrix $W$ represents the weight between vertex i and vertex j, $W_{ij} > 0$, $i \neq j$, $i, j \in \{1, 2, 3, \dots, n\}$, and n is the total number of vertices;
an initialization module, configured to take the original weight matrix as a current weight matrix $W_x$ and to take the original grouping information as current grouping information;
a loop execution module, comprising a calculation unit and an updating unit, configured to control the calculation unit and the updating unit in a loop until a loop end condition is met:
the calculation unit is configured to calculate, according to the current weight matrix $W_x$ and the current grouping information, the current total network information of the undirected graph
$$T = \sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} W_{ij} Q_{ij}$$
where $Q_{ij}$ is the degree of difference between vertex i and vertex j, which is inversely related to the probability that vertex i and vertex j belong to the same group; the probability that vertex i and vertex j belong to the same group is obtained from the grouping information;
the updating unit is configured to update the current weight matrix $W_x$ and the current grouping information according to a preset updating algorithm based on T if the loop end condition is not met;
an estimated weight matrix determining module, configured to determine, after the loop ends, the current weight matrix $W_x$ at the end of the loop as the estimated weight matrix of the undirected graph.
According to a third aspect of one or more embodiments of the present specification, there is provided an electronic apparatus including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor executes the executable instructions to implement the method for determining the pre-estimated weight matrix of the undirected graph.
According to a fourth aspect of one or more embodiments of the present specification, a computer-readable storage medium is presented, on which computer instructions are stored, which when executed by a processor, implement the method for determining a pre-estimated weight matrix of an undirected graph as described above.
In one or more embodiments of the present specification, an original weight matrix $W_0$ and original grouping information of an undirected graph are obtained, and the current weight matrix $W_x$ and the current grouping information are initialized to $W_0$ and the original grouping information, respectively. The following steps are then executed in a loop until a loop end condition is met: according to the current weight matrix $W_x$ and the current grouping information, the current total network information of the undirected graph is calculated as
$$T = \sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} W_{ij} Q_{ij}$$
where $Q_{ij}$ is the degree of difference between vertex i and vertex j, which is inversely related to the probability that vertex i and vertex j belong to the same group; whether the loop end condition is met is judged; if so, the loop ends, and if not, the current weight matrix $W_x$ and the current grouping information are updated according to a preset updating algorithm based on T. After the loop ends, the current weight matrix $W_x$ at the end of the loop is taken as the estimated weight matrix of the undirected graph.
With one or more embodiments of the present specification, an estimated weight matrix that approaches the true weight matrix is obtained step by step from the relationship between the weight matrix and the grouping information, so that calculations based on the weight matrix become more accurate.
Drawings
Fig. 1 is a schematic flowchart illustrating a method for determining a pre-estimated weight matrix of an undirected graph according to an exemplary embodiment.
Fig. 2 is a block diagram illustrating an apparatus for determining an estimated weight matrix of an undirected graph according to an exemplary embodiment.
Fig. 3 is a schematic structural diagram of a computer device according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with one or more embodiments of the present specification. Rather, they are merely examples of apparatus and methods consistent with certain aspects of one or more embodiments of the specification, as detailed in the claims which follow.
It should be noted that: in other embodiments, the steps of the corresponding methods are not necessarily performed in the order shown and described herein. In some other embodiments, the method may include more or fewer steps than those described herein. Moreover, a single step described in this specification may be broken down into multiple steps for description in other embodiments; multiple steps described in this specification may be combined into a single step in other embodiments.
In many application scenarios, the weight matrix of an undirected graph obtained from the relevant data is inaccurate, yet many further calculations, such as graph clustering and graph partitioning, are performed on the weight matrix of the constructed undirected graph. In many cases the data is assumed by default to be correct and complete, so the constructed undirected graph is assumed to be accurate, and further calculations based on it (that is, calculations that take the weight matrix of the constructed undirected graph as a hyper-parameter) are assumed to be accurate as well. In practice, however, the obtained data is neither fully accurate nor complete. In a social scenario, for example, missing or insufficient interaction data between the two users represented by two vertices makes the weight representing the closeness of their relationship much smaller than its actual value.
The inventor found that, in many practical application scenarios, the vertices of an undirected graph are grouped: the smaller the degree of difference between two vertices (the more similar their related information), the more likely the two vertices belong to the same group. The grouping information is strongly linked to the weights: the weight between two vertices belonging to the same group is relatively large, and the weight between two vertices not belonging to the same group is relatively small. Taking social relationships as an example, each vertex is a user, a circle of friends is a group (e.g., a family is a group), and the weight between two vertices represents the closeness of the relationship between the two users; the closeness between two users in the same circle of friends is relatively high, while that between two users in different circles is relatively low and may even be 0.
Based on the above, the present specification proposes a method for determining an estimated weight matrix of an undirected graph, for application scenarios in which the weight matrix and the grouping information have the relationship described above. An original weight matrix $W_0$ and original grouping information of the undirected graph are obtained, and the current weight matrix $W_x$ and the current grouping information are initialized to $W_0$ and the original grouping information, respectively. The following steps are then executed in a loop until a loop end condition is met: according to the current weight matrix $W_x$ and the current grouping information, the current total network information of the undirected graph is calculated as
$$T = \sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} W_{ij} Q_{ij}$$
where $Q_{ij}$ is the degree of difference between vertex i and vertex j, which is inversely related to the probability that vertex i and vertex j belong to the same group; whether the loop end condition is met is judged; if so, the loop ends, and if not, the current weight matrix $W_x$ and the current grouping information are updated according to a preset updating algorithm based on T. After the loop ends, the current weight matrix $W_x$ at the end of the loop is taken as the estimated weight matrix of the undirected graph.
With one or more embodiments of the present specification, an estimated weight matrix that approaches the true weight matrix is obtained step by step from the relationship between the weight matrix and the grouping information, so that calculations based on the weight matrix become more accurate.
This specification provides a method for determining an estimated weight matrix of an undirected graph, together with a corresponding apparatus, device, and computer storage medium. The method is described in detail first.
As described above, in many application scenarios the vertices of an undirected graph are grouped, and the grouping information is generally strongly related to the weights: the weight between two vertices belonging to the same group is relatively large, and the weight between two vertices not belonging to the same group is relatively small. The method of this specification relies on this relationship; in other words, if the grouping information and the weight matrix are not related in this way, the method is not applicable.
First, the principle of the method for determining the estimated weight matrix of the undirected graph is described. The weight between two vertices belonging to the same group is relatively large and the degree of difference between them is relatively small; conversely, the weight between two vertices that do not belong to the same group is relatively small and the degree of difference between them is relatively large. Based on this observation, for any two vertices i and j, let $W_{ij}$ denote the weight between them and $Q_{ij}$ the degree of difference between them. Since one of $W_{ij}$ and $Q_{ij}$ is always relatively small (either the weight or the degree of difference), the product $W_{ij} Q_{ij}$ is always relatively small. Let
$$T = \sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} W_{ij} Q_{ij}$$
be the total network information of the undirected graph, where n is the total number of vertices. The smaller T is, the better the relationship between the weight matrix and the grouping information agrees with the observation above, and the closer the weight matrix is to the true weight matrix.
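As a minimal illustration of the principle above, the following Python sketch computes T as the sum of the products W_ij * Q_ij over all ordered pairs with i != j. The function name `total_network_information` and the 3-vertex matrices are hypothetical and chosen only for illustration, and the exact form of the sum (for example, whether each unordered pair is counted once or twice) is an assumption, since the original formula is not reproduced in the text.

```python
def total_network_information(W, Q):
    """Sum W[i][j] * Q[i][j] over all ordered pairs with i != j (assumed form of T)."""
    n = len(W)
    return sum(W[i][j] * Q[i][j] for i in range(n) for j in range(n) if i != j)


# Hypothetical example: vertices 0 and 1 share a group, vertex 2 is in another group.
Q = [[0.0, 0.0, 2.0],
     [0.0, 0.0, 2.0],
     [2.0, 2.0, 0.0]]

# Weights that agree with the grouping: large inside the group, small across groups.
W_consistent = [[0.0, 0.9, 0.1],
                [0.9, 0.0, 0.1],
                [0.1, 0.1, 0.0]]

# Weights that contradict the grouping: small inside the group, large across groups.
W_inconsistent = [[0.0, 0.1, 0.9],
                  [0.1, 0.0, 0.9],
                  [0.9, 0.9, 0.0]]

T_consistent = total_network_information(W_consistent, Q)
T_inconsistent = total_network_information(W_inconsistent, Q)
print(T_consistent, T_inconsistent)
```

Under these assumed values, the weight matrix that agrees with the grouping yields the smaller T, which is the behavior the method relies on.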
As shown in fig. 1, a schematic flow chart of a method for determining a pre-estimated weight matrix of an undirected graph shown in this specification includes the following steps:
Step 101, obtaining an original weight matrix $W_0$ and original grouping information of an undirected graph.
Note that, in this specification, any element $W_{ij}$ of any weight matrix $W$ represents the weight between vertex i and vertex j, where $W_{ij} > 0$, $i \neq j$ (i.e., vertex i and vertex j are two different vertices), $i, j \in \{1, 2, 3, \dots, n\}$, and n is the total number of vertices.
In this specification, the original weight matrix is constructed from original data, which may contain missing or dirty (erroneous) data, so the constructed original weight matrix may be inaccurate. Again taking a social scenario as an example, each vertex is a user, and for any two users the closeness of their relationship is determined from their interactions in the social scenario, which yields a relationship weight matrix.
It should be noted that this specification is directed to undirected graphs, so every weight matrix is symmetric (i.e., the value in row i, column j equals the value in row j, column i, for i ≠ j). There is no weight between a vertex and itself, so the entries in row i, column i are invalid values (NaN).
Table 1 shows an example of the weight matrix described in this specification:
TABLE 1 Weight matrix example
\          Vertex A   Vertex B   Vertex C   Vertex D
Vertex A   /          0.5        0.7        0.1
Vertex B   0.5        /          0.6        0.2
Vertex C   0.7        0.6        /          0.9
Vertex D   0.1        0.2        0.9        /
Here A, B, C, and D are vertices, and each value is the weight between the corresponding pair of vertices.
In practice, the weight between two vertices may be impossible to determine because data is missing. In that case the weight is set to 0; that is, for any two vertices with no known weight in the original weight matrix, the corresponding position in the matrix is filled with 0.
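The construction described above (symmetric matrix, invalid diagonal, missing pairs filled with 0) can be sketched in Python as follows. The helper `build_weight_matrix` and the dictionary layout of the input are hypothetical; the values are those of Table 1, with the A-D pair deliberately left out to mimic missing data.

```python
import math


def build_weight_matrix(vertices, known_weights):
    """Build a symmetric weight matrix; the diagonal is NaN, unknown pairs are 0."""
    index = {v: k for k, v in enumerate(vertices)}
    n = len(vertices)
    W = [[math.nan if i == j else 0.0 for j in range(n)] for i in range(n)]
    for (a, b), w in known_weights.items():
        i, j = index[a], index[b]
        W[i][j] = w
        W[j][i] = w  # undirected graph: keep the matrix symmetric
    return W


vertices = ["A", "B", "C", "D"]
# Weights taken from Table 1; the A-D pair is omitted on purpose (missing data).
known = {("A", "B"): 0.5, ("A", "C"): 0.7, ("B", "C"): 0.6,
         ("B", "D"): 0.2, ("C", "D"): 0.9}
W0 = build_weight_matrix(vertices, known)
for row in W0:
    print(row)
```

Storing each known pair once and mirroring it keeps the matrix symmetric by construction, so the symmetry requirement does not need to be checked separately.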
The original grouping information classifies the vertices. Generally, vertices with high similarity (i.e., a small degree of difference) are placed in the same group; for example, interest groups and friend groups in a social scenario each collect users who are highly similar in some respect.
The grouping information of each vertex may be labeled in advance or obtained by clustering based on similarity. The number of groups is fixed, whereas the probability that a vertex belongs to each group is not necessarily fixed.
Let the grouping matrix be P, where $P_{lk}$ is the probability that vertex l belongs to the k-th group, $l \in \{1, 2, 3, \dots, n\}$, $k \in \{1, 2, 3, \dots, m\}$, m is the total number of groups, and
$$\sum_{k=1}^{m} P_{lk} = 1$$
Table 2 shows an example of the grouping matrix described in this specification:
TABLE 2 Grouping matrix example
           Group 1   Group 2   Group 3
Vertex A   0.2       0.5       0.3
Vertex B   0         1         0
Vertex C   0.5       0.3       0.2
Vertex D   0.2       0.1       0.7
Vertex E   1         0         0
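As a small sketch, the grouping matrix of Table 2 can be held as one probability row per vertex and checked for validity. The check assumes that each row sums to 1, which the rows of Table 2 satisfy; the names `is_valid_grouping_matrix` and `most_likely_group` are hypothetical helpers introduced only for this illustration.

```python
# Grouping matrix from Table 2: one row of group-membership probabilities per vertex.
P = {
    "A": [0.2, 0.5, 0.3],
    "B": [0.0, 1.0, 0.0],
    "C": [0.5, 0.3, 0.2],
    "D": [0.2, 0.1, 0.7],
    "E": [1.0, 0.0, 0.0],
}


def is_valid_grouping_matrix(P, tol=1e-9):
    """Every entry lies in [0, 1] and every row sums to 1 (assumed constraint)."""
    return all(
        all(0.0 <= p <= 1.0 for p in row) and abs(sum(row) - 1.0) <= tol
        for row in P.values()
    )


def most_likely_group(row):
    """Return the 1-based index of the group with the highest probability."""
    return max(range(len(row)), key=row.__getitem__) + 1


print(is_valid_grouping_matrix(P))
print({v: most_likely_group(row) for v, row in P.items()})
```

Vertices B and E have a deterministic group (probability 1), while A, C, and D carry soft memberships that the loop described below is allowed to adjust.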
Step 103, taking the original weight matrix as a current weight matrix $W_x$, and taking the original grouping information as current grouping information.
Step 105, calculating, according to the current weight matrix $W_x$ and the current grouping information, the current total network information of the undirected graph
$$T = \sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} W_{ij} Q_{ij}$$
where $Q_{ij}$ is the degree of difference between vertex i and vertex j, which is inversely related to the probability that vertex i and vertex j belong to the same group; the probability that vertex i and vertex j belong to the same group is obtained from the grouping information.
In the case where the grouping information is the grouping matrix described above, $Q_{ij}$ is computed from the grouping matrix:
[Formula not recoverable from the source: $Q_{ij}$ as a function of the grouping matrix P]
$Q_{ij}$ takes its smallest value (0) when vertex i and vertex j are determined to belong to the same group, and its largest value (2) when they are determined not to belong to the same group. The larger the probability that the two vertices belong to the same group, the smaller $Q_{ij}$; conversely, the smaller that probability, the larger $Q_{ij}$.
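The text fixes only the two endpoints of the degree of difference (0 and 2). One form that reproduces both endpoints, used here purely as an assumption, is the squared Euclidean distance between the two membership rows of the grouping matrix; the sum of absolute differences would match the endpoints equally well. The function name `difference_degree` is hypothetical.

```python
def difference_degree(p_i, p_j):
    """Assumed Q_ij: squared Euclidean distance between two membership rows."""
    return sum((a - b) ** 2 for a, b in zip(p_i, p_j))


# Both vertices certainly in group 2: smallest value.
same = difference_degree([0, 1, 0], [0, 1, 0])
# Vertices certainly in different groups: largest value.
different = difference_degree([1, 0, 0], [0, 1, 0])
# Soft memberships (vertices A and C of Table 2): a value in between.
partial = difference_degree([0.2, 0.5, 0.3], [0.5, 0.3, 0.2])
print(same, different, partial)
```

With this assumed form, the value shrinks as the two membership rows become more alike, which matches the stated inverse relation to the probability of sharing a group.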
In addition, other influencing factors are generally taken into account, giving
[Formula not recoverable from the source: the preceding expression extended with the term ε]
where ε represents other influencing or uncontrollable factors; its value is generally an empirical value with some uncertainty.
Step 107, judging whether the loop end condition is met; if the loop end condition is not met, jumping to step 109; if the loop end condition is met, jumping to step 111.
The loop end condition may be that the total network information T converges, that the number of loop executions reaches a preset number, or that either of these two events occurs.
When the loop end condition is that the total network information T converges, the difference between the current total network information and its minimum value is considered negligible at that point, and the corresponding current weight matrix is considered closest to the true weight matrix.
When the loop end condition is that the number of loop executions reaches the preset number, the difference between the total network information T after that many executions and its minimum value is considered negligible, and the corresponding current weight matrix is considered closest to the true weight matrix.
When the loop end condition is that T converges or the number of loop executions reaches the preset number, the difference between T and its minimum value is considered negligible as soon as either condition is satisfied, and the corresponding current weight matrix is considered closest to the true weight matrix.
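The three variants of the loop end condition can be sketched as a single predicate. The convergence test used here (absolute change of T between two consecutive iterations below a tolerance) is an assumption, since the text does not define convergence; the names `loop_should_end`, `tol`, `max_iterations`, and `mode` are hypothetical.

```python
def loop_should_end(T_history, iteration, tol=1e-6, max_iterations=100, mode="either"):
    """Decide whether the loop ends.

    mode "converge": T has converged (assumed: |T_k - T_{k-1}| < tol).
    mode "count":    the iteration count has reached max_iterations.
    mode "either":   either of the two conditions holds.
    """
    converged = len(T_history) >= 2 and abs(T_history[-1] - T_history[-2]) < tol
    exhausted = iteration >= max_iterations
    if mode == "converge":
        return converged
    if mode == "count":
        return exhausted
    return converged or exhausted


print(loop_should_end([5.0, 5.0000001], iteration=3))   # T barely changed
print(loop_should_end([5.0, 4.0], iteration=3))         # still decreasing
print(loop_should_end([5.0, 4.0], iteration=100))       # iteration budget used up
```

Combining both conditions is the safest choice in practice, because a pure convergence test may never trigger if the updates oscillate.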
Step 109, updating the current weight matrix $W_x$ and the current grouping information according to a preset updating algorithm based on T, and jumping to step 105.
The current weight matrix $W_x$ and the current grouping information may be updated in a randomly chosen direction with respect to T; generally, to reach a result faster, they are updated step by step in the direction in which T decreases fastest (gradient descent).
When the current weight matrix and the current grouping information are updated along the gradient, the update may be performed as follows.
For each element $W^{x}_{ij}$ of the current weight matrix $W_x$, the formula
$$W^{x}_{ij} \leftarrow W^{x}_{ij} - \gamma \frac{\partial T}{\partial W^{x}_{ij}}$$
is used to update $W^{x}_{ij}$, and the updated current weight matrix $W_x$ is then normalized; γ is the learning rate of the current weight matrix.
The normalization may be performed as follows:
$$W^{x}_{ij} \leftarrow \frac{W^{x}_{ij} - W^{x}_{\min}}{W^{x}_{\max} - W^{x}_{\min}}$$
That is, for any weight $W^{x}_{ij}$ in the current weight matrix, the difference between that weight and the smallest weight $W^{x}_{\min}$ in the weight matrix is divided by the largest weight difference $W^{x}_{\max} - W^{x}_{\min}$ in the weight matrix to give the normalized $W^{x}_{ij}$.
In this way, every weight of the normalized current weight matrix lies in [0,1].
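The weight update and its normalization can be sketched as below. The sketch assumes that T is the sum of W_ij * Q_ij over i != j, so that the partial derivative of T with respect to W_ij is simply Q_ij, and that the normalization is a min-max rescaling over the off-diagonal entries. The function name `update_weights`, the example matrices, and the choice to store 0.0 on the diagonal are hypothetical.

```python
def update_weights(W, Q, gamma):
    """One gradient step on the off-diagonal weights, then min-max normalization."""
    n = len(W)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    # Assumption: T = sum of W_ij * Q_ij, hence dT/dW_ij = Q_ij.
    stepped = {(i, j): W[i][j] - gamma * Q[i][j] for i, j in pairs}
    lo, hi = min(stepped.values()), max(stepped.values())
    span = hi - lo
    W_new = [[0.0] * n for _ in range(n)]  # diagonal kept at 0.0 as a placeholder
    for (i, j), value in stepped.items():
        # Guard against division by zero when all stepped weights are equal.
        W_new[i][j] = (value - lo) / span if span > 0 else value
    return W_new


W = [[0.0, 0.5, 0.7],
     [0.5, 0.0, 0.6],
     [0.7, 0.6, 0.0]]
Q = [[0.0, 0.14, 2.0],
     [0.14, 0.0, 1.5],
     [2.0, 1.5, 0.0]]
W_new = update_weights(W, Q, gamma=0.1)
for row in W_new:
    print(row)
```

Note that min-max normalization always maps the smallest stepped weight to 0 and the largest to 1, so it rescales the matrix rather than merely clipping it.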
For each element $P^{x}_{lk}$ of the current grouping matrix $P_x$, the formula
$$P^{x}_{lk} \leftarrow P^{x}_{lk} - \delta \frac{\partial T}{\partial P^{x}_{lk}}$$
is used to update $P^{x}_{lk}$, and the updated current grouping matrix $P_x$ is then normalized; δ is the learning rate of the current grouping matrix.
The normalization may be performed as follows:
$$P^{x}_{lk} \leftarrow \frac{P^{x}_{lk} - P^{x}_{\min}}{P^{x}_{\max} - P^{x}_{\min}}$$
That is, for any grouping probability $P^{x}_{lk}$ in the current grouping matrix, the difference between that probability and the smallest grouping probability $P^{x}_{\min}$ in the grouping matrix is divided by the largest probability difference $P^{x}_{\max} - P^{x}_{\min}$ in the grouping matrix to give the normalized $P^{x}_{lk}$.
In this way, every grouping probability of the normalized current grouping matrix lies in [0,1].
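A corresponding sketch for the grouping matrix is given below. It assumes that Q_ij is the squared Euclidean distance between membership rows i and j and that W is symmetric; under those assumptions the partial derivative of T with respect to P_lk is 4 * sum over j != l of W_lj * (P_lk - P_jk). The function name `update_grouping` and the example values are hypothetical.

```python
def update_grouping(P, W, delta):
    """One gradient step on the grouping matrix, then min-max normalization."""
    n, m = len(P), len(P[0])
    stepped = []
    for l in range(n):
        row = []
        for k in range(m):
            # Assumed gradient for Q_ij = sum_k (P_ik - P_jk)^2 and symmetric W.
            grad = 4.0 * sum(W[l][j] * (P[l][k] - P[j][k]) for j in range(n) if j != l)
            row.append(P[l][k] - delta * grad)
        stepped.append(row)
    flat = [p for row in stepped for p in row]
    lo, span = min(flat), max(flat) - min(flat)
    if span == 0:
        return stepped  # nothing to rescale when all entries are equal
    return [[(p - lo) / span for p in row] for row in stepped]


# Vertex 1 is strongly tied to vertex 0 (weight 0.9) and weakly to vertex 2.
W = [[0.0, 0.9, 0.1],
     [0.9, 0.0, 0.1],
     [0.1, 0.1, 0.0]]
P = [[1.0, 0.0],
     [0.6, 0.4],
     [0.0, 1.0]]
P_new = update_grouping(P, W, delta=0.05)
for row in P_new:
    print(row)
```

In this example the membership of vertex 1 moves toward the group of its strongly connected neighbor, vertex 0. The global min-max rescaling follows the description above and does not by itself keep each row summing to 1.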
In practice, the learning rate γ of the weight matrix and the learning rate δ of the grouping matrix may be the same or different, and both are set empirically.
This specification shows the update formulas for the current weight matrix and the current grouping information only schematically; those skilled in the art can derive further update formulas from the idea of this specification, and they are not enumerated here.
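Steps 103 to 111 can be put together in one compact sketch. It is not the patented implementation: it assumes Q_ij is the squared Euclidean distance between membership rows, T is the sum of W_ij * Q_ij over i != j, convergence means a small change of T between iterations, and both matrices are min-max normalized after each gradient step. All function names, default parameters, and input values are hypothetical.

```python
def difference(P):
    """Assumed Q_ij: squared Euclidean distance between membership rows i and j."""
    n = len(P)
    return [[sum((a - b) ** 2 for a, b in zip(P[i], P[j])) for j in range(n)]
            for i in range(n)]


def total_info(W, Q):
    """Assumed T: sum of W_ij * Q_ij over ordered pairs with i != j."""
    n = len(W)
    return sum(W[i][j] * Q[i][j] for i in range(n) for j in range(n) if i != j)


def min_max(M, skip_diagonal=False):
    """Min-max normalize M into [0, 1]; a constant input is returned unchanged."""
    vals = [v for i, row in enumerate(M) for j, v in enumerate(row)
            if not (skip_diagonal and i == j)]
    lo, span = min(vals), max(vals) - min(vals)
    if span == 0:
        return [list(row) for row in M]
    return [[0.0 if (skip_diagonal and i == j) else (v - lo) / span
             for j, v in enumerate(row)] for i, row in enumerate(M)]


def estimate_weight_matrix(W0, P0, gamma=0.05, delta=0.05, tol=1e-6, max_iterations=50):
    W = [list(row) for row in W0]  # step 103: initialize the current state
    P = [list(row) for row in P0]
    n, m = len(P), len(P[0])
    history = []
    while True:
        Q = difference(P)
        history.append(total_info(W, Q))  # step 105: current T
        converged = len(history) >= 2 and abs(history[-1] - history[-2]) < tol
        if converged or len(history) >= max_iterations:  # step 107
            return W, P, history  # step 111
        # Step 109: gradient step on both matrices, then normalization.
        W_step = [[W[i][j] - gamma * Q[i][j] for j in range(n)] for i in range(n)]
        P_step = [[P[l][k] - delta * 4.0 * sum(W[l][j] * (P[l][k] - P[j][k])
                                               for j in range(n) if j != l)
                   for k in range(m)] for l in range(n)]
        W, P = min_max(W_step, skip_diagonal=True), min_max(P_step)


W0 = [[0.0, 0.9, 0.0], [0.9, 0.0, 0.1], [0.0, 0.1, 0.0]]
P0 = [[1.0, 0.0], [0.8, 0.2], [0.1, 0.9]]
W_est, P_est, T_history = estimate_weight_matrix(W0, P0)
print(len(T_history), W_est)
```

The iteration cap guarantees termination even if T does not settle, which corresponds to the combined loop end condition described in step 107.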
Step 111, determining the current weight matrix $W_x$ at the end of the loop as the estimated weight matrix of the undirected graph.
After the total network information T converges or the number of loop executions reaches the preset number, T at that point is taken to approximate the minimum total network information, and the current weight matrix $W_x$ at that point is taken to be the true, accurate weight matrix; that is, it is used as the estimated weight matrix of the undirected graph.
Because the estimated weight matrix is considered close to the true weight matrix, it provides strong guidance: the constructed original weight matrix can be evaluated by comparing it with the estimated weight matrix.
Let the original weight matrix be $W_0$, with elements $W^{0}_{ij}$, and let the estimated weight matrix obtained at the end of the loop be $W_1$, with elements $W^{1}_{ij}$. For a pair of vertices i and j with non-zero weight, i.e., $W^{0}_{ij} \neq 0$ in the original weight matrix, if $|W^{1}_{ij} - W^{0}_{ij}|$ is less than a preset difference $\theta_1$, the original weight $W^{0}_{ij}$ and the estimated weight $W^{1}_{ij}$ are almost the same, and the confidence of the original weight $W^{0}_{ij}$ is determined to be high; in other words, the confidence of the data relating vertex i and vertex j is high.
Further, if $W^{1}_{ij} > W^{0}_{ij}$, the estimated weight is greater than the original weight, i.e., the original weight is too low; this is presumed to be caused by missing data (for example, part of the social interaction was not captured in the social relationship data). If $W^{1}_{ij} < W^{0}_{ij}$, the estimated weight is smaller than the original weight, i.e., the original weight is too high; this is presumed to be caused by dirty data (for example, in social relationships the interaction data is wrong, so that weak interaction is treated as strong interaction and the closeness appears high).
For a pair of vertices i and j with zero weight, i.e., $W^{0}_{ij} = 0$ in the original weight matrix, if $W^{1}_{ij}$ is greater than a preset value $\theta_2$, there should be a weight between vertex i and vertex j; that is, it is determined that a link exists between vertex i and vertex j.
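The evaluation rules above can be sketched as a per-pair classifier. This is one possible reading of the text, in which the missing-data and dirty-data diagnoses apply only when the difference reaches the threshold; the function name `evaluate_pair`, the label strings, and the threshold defaults are hypothetical.

```python
def evaluate_pair(w0, w1, theta1=0.1, theta2=0.3):
    """Compare an original weight w0 with the estimated weight w1 for one vertex pair."""
    if w0 == 0:
        # Zero original weight: a large estimate suggests an unrecorded link.
        return "link suggested" if w1 > theta2 else "no link"
    if abs(w1 - w0) < theta1:
        return "high confidence"
    if w1 > w0:
        return "original too low (possible missing data)"
    return "original too high (possible dirty data)"


print(evaluate_pair(0.5, 0.55))
print(evaluate_pair(0.2, 0.6))
print(evaluate_pair(0.8, 0.3))
print(evaluate_pair(0.0, 0.5))
print(evaluate_pair(0.0, 0.1))
```

Applying this function to every pair i < j of the original and estimated matrices gives a simple quality report on the original data.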
The above is a description of a method of determining an estimated weight matrix of an undirected graph, and a device corresponding thereto is described in detail below.
The present specification provides an apparatus for determining an estimated weight matrix of an undirected graph, as shown in fig. 2, the apparatus comprising:
an obtaining module 201, configured to obtain an original weight matrix $W^0$ of an undirected graph and original grouping information; wherein any element $W_{ij}$ of any weight matrix $W$ represents the weight between vertex i and vertex j, $W_{ij} > 0$, $i \neq j$, $i, j \in \{1, 2, 3, \ldots, n\}$, and n is the total number of vertices;
an initialization module 203, configured to take the original weight matrix as the current weight matrix $W^x$ and the original grouping information as the current grouping information;
a loop execution module 205, comprising a calculation unit 2051 and an update unit 2052, and configured to run the calculation unit and the update unit in a loop until a loop end condition is met:
the calculation unit is configured to calculate, according to the current weight matrix $W^x$ and the current grouping information, the current total network information of the undirected graph
Figure BDA0003264516930000081
wherein $Q_{ij}$ is the degree of difference between vertex i and vertex j, which is negatively correlated with the probability that vertex i and vertex j belong to the same group; this probability is obtained from the grouping information;
the update unit is configured to, if the loop end condition is not met, update the current weight matrix $W^x$ and the current grouping information according to a preset update algorithm based on T;
an estimated weight matrix determining module 207, configured to, after the loop ends, determine the current weight matrix $W^1$ at the end of the loop as the estimated weight matrix of the undirected graph.
The loop end condition may be one of the following:
the total network information T converges;
the number of loop executions reaches a preset number;
the total network information T converges or the number of loop executions reaches a preset number.
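The loop end conditions listed above can be sketched as a small Python driver. This is an illustrative sketch only: the names `run_loop`, `total_info`, `update`, `max_iters` and `tol` are hypothetical, and treating "T converges" as "the change in T between two consecutive iterations is below `tol`" is an assumption, since the text does not define the convergence test.

```python
def run_loop(state, total_info, update, max_iters=100, tol=1e-6):
    """Alternate calculation and update until T converges or max_iters is hit.

    state:      current weight matrix and grouping information (any object).
    total_info: function state -> T (role of the calculation unit).
    update:     function (state, T) -> new state (role of the update unit).
    """
    prev_t = None
    for iteration in range(1, max_iters + 1):
        t = total_info(state)
        # Assumed convergence test: T barely changed since the last pass.
        if prev_t is not None and abs(t - prev_t) < tol:
            return state, iteration, "converged"
        state = update(state, t)
        prev_t = t
    return state, max_iters, "max_iters"


# Toy stand-ins for the real calculation and update steps.
def toy_total(x):
    return x * x


def toy_update(x, t):
    return 0.5 * x


_, iters, reason = run_loop(1.0, toy_total, toy_update)
print(reason, iters)

_, _, reason_capped = run_loop(1.0, toy_total, toy_update, max_iters=3)
print(reason_capped)
```

Passing a very small `tol` effectively selects the "preset number of executions" condition, while a large `max_iters` effectively selects the "T converges" condition; the defaults correspond to the combined third condition.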
The grouping information may be a grouping matrix P, where element $P_{lk}$ of the grouping matrix represents the probability that vertex l belongs to the k-th group, $l \in \{1, 2, 3, \ldots, n\}$, $k \in \{1, 2, 3, \ldots, m\}$, and m is the total number of groups,
Figure BDA0003264516930000082
in this case,
Figure BDA0003264516930000083
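As an illustration of how a degree of difference could be derived from a grouping matrix, the following Python sketch computes a same-group probability for every vertex pair and a quantity that decreases as that probability increases. It is not the formula of this specification: the function names, the assumption that group memberships of two vertices are independent (so that the same-group probability is the sum over k of $P_{ik} P_{jk}$), and the choice of one minus that probability as the degree of difference are all hypothetical.

```python
def same_group_probability(p):
    """p[l][k] is the probability that vertex l belongs to group k.

    Assumes independent memberships, so the probability that vertices
    i and j fall into the same group is sum_k p[i][k] * p[j][k].
    """
    n = len(p)
    return [[sum(a * b for a, b in zip(p[i], p[j])) for j in range(n)]
            for i in range(n)]


def difference_degree(p):
    """Hypothetical Q: one minus the same-group probability."""
    s = same_group_probability(p)
    n = len(p)
    return [[1.0 - s[i][j] for j in range(n)] for i in range(n)]


# Three vertices, two groups: vertices 0 and 1 share a group, vertex 2 does not.
p = [[1.0, 0.0],
     [1.0, 0.0],
     [0.0, 1.0]]
q = difference_degree(p)
print(q[0][1])  # 0.0
print(q[0][2])  # 1.0
```

Any other decreasing function of the same-group probability would satisfy the negative correlation described above equally well.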
In addition, when the grouping information is the grouping matrix P, the update unit may be specifically configured to:
in the current weight matrix $W^x$, for
Figure BDA0003264516930000084
use the formula
Figure BDA0003264516930000085
to update
Figure BDA0003264516930000086
and normalize the updated current weight matrix $W^x$, where $\gamma$ is the learning rate of the current weight matrix;
in the current grouping matrix $P^x$, for
Figure BDA0003264516930000087
use the formula
Figure BDA0003264516930000088
to update
Figure BDA0003264516930000089
and normalize the updated current grouping matrix $P^x$, where $\delta$ is the learning rate of the current grouping matrix.
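The update-then-normalize pattern with learning rates can be sketched in Python as follows. The actual update formulas are not reproduced here, so everything specific in this sketch is an assumption: the gradients are passed in as placeholder matrices, the step direction (subtracting the gradient), the clipping at zero, the row-wise normalization of the grouping matrix and the whole-matrix normalization of the weight matrix are illustrative choices, and all function names are hypothetical.

```python
def gradient_step(matrix, grad, rate):
    """Move each entry against its gradient and clip at zero (assumed rule)."""
    return [[max(0.0, value - rate * g) for value, g in zip(row, grow)]
            for row, grow in zip(matrix, grad)]


def normalize_rows(matrix):
    """Scale each row to sum to 1; rows that sum to 0 are left unchanged."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([v / total for v in row] if total > 0 else list(row))
    return result


def normalize_total(matrix):
    """Scale the whole matrix so that all entries sum to 1."""
    total = sum(sum(row) for row in matrix)
    if total <= 0:
        return [list(row) for row in matrix]
    return [[v / total for v in row] for row in matrix]


# Placeholder gradients; in practice they would be derived from T.
p_x = [[0.75, 0.25], [0.5, 0.5]]
grad_p = [[1.0, -1.0], [0.0, 0.0]]
p_next = normalize_rows(gradient_step(p_x, grad_p, rate=0.25))  # rate plays the role of delta

w_x = [[0.0, 2.0], [2.0, 0.0]]
grad_w = [[0.0, 1.0], [1.0, 0.0]]
w_next = normalize_total(gradient_step(w_x, grad_w, rate=0.5))  # rate plays the role of gamma

print(p_next)
print(w_next)
```

Row-wise normalization keeps each row of the grouping matrix a valid probability distribution over groups, which is why it is used for P in this sketch.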
Furthermore, the apparatus may further include:
a comparison module, configured to compare the estimated weight matrix $W^1$ with the original weight matrix $W^0$;
a determination module, configured to, for vertex i and vertex j with $W^0_{ij}$ greater than 0, if
Figure BDA00032645169300000811
is less than a preset difference $\theta_1$, determine that the obtained $W^0_{ij}$ has high confidence.
Furthermore, the apparatus may further include:
a comparison module, configured to compare the estimated weight matrix $W^1$ with the original weight matrix $W^0$;
a determination module, configured to, for vertex i and vertex j with $W^0_{ij}$ equal to 0, if $W^1_{ij}$ is greater than a preset value $\theta_2$, determine that a link exists between vertex i and vertex j.
The implementation process of the functions and actions of each module in the above device is specifically described in the implementation process of the corresponding step in the above method, and is not described herein again.
Since the device embodiments substantially correspond to the method embodiments, the relevant details can be found in the corresponding description of the method embodiments. The device embodiments described above are merely illustrative: the modules described as separate parts may or may not be physically separate, and the parts shown as modules may or may not be physical modules; they may be located in one place or distributed over a plurality of network modules. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution in this specification. One of ordinary skill in the art can understand and implement it without inventive effort.
The present specification also provides an electronic device comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor executes the executable instructions to implement the method for determining the pre-estimated weight matrix of the undirected graph.
In a typical configuration, a computer includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Fig. 3 is a schematic diagram illustrating a more specific hardware structure of a computing device according to an embodiment of the present disclosure, where the computing device may include: a processor 310, a memory 320, an input/output interface 330, a communication interface 340, and a bus 350. Wherein the processor 310, memory 320, input/output interface 330, and communication interface 340 are communicatively coupled to each other within the device via bus 350.
The processor 310 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present specification.
The Memory 320 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 320 may store an operating system and other application programs, and when the technical solution provided by the embodiments of the present specification is implemented by software or firmware, the relevant program codes are stored in the memory 320 and called to be executed by the processor 310.
The input/output interface 330 is used for connecting an input/output module to realize information input and output. The input/output module may be configured as a component within the device (not shown in the figure) or may be external to the device to provide corresponding functions. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The communication interface 340 is used for connecting a communication module (not shown in the figure) to implement communication between the present device and other devices. The communication module can communicate in a wired mode (such as USB or network cable) or in a wireless mode (such as mobile network, Wi-Fi or Bluetooth).
Bus 350 includes a path that transfers information between the various components of the device, such as processor 310, memory 320, input/output interface 330, and communication interface 340.
It should be noted that although the above-mentioned device only shows the processor 310, the memory 320, the input/output interface 330, the communication interface 340 and the bus 350, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only those components necessary to implement the embodiments of the present description, and not necessarily all of the components shown in the figures.
The present specification also provides a computer readable storage medium having stored thereon computer instructions which, when executed by a processor, carry out the steps of the method for determining a pre-estimated weight matrix of an undirected graph as described in any one of the above.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic disk storage, quantum memory, graphene-based storage media or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The terminology used in the description of the one or more embodiments is for the purpose of describing the particular embodiments only and is not intended to be limiting of the description of the one or more embodiments. As used in one or more embodiments of the present specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in one or more embodiments of the present description to describe various information, such information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of one or more embodiments herein. The word "if" as used herein may be interpreted as "upon ..." or "when ..." or "in response to determining", depending on the context.
The above description is only for the purpose of illustrating the preferred embodiments of the one or more embodiments of the present disclosure, and is not intended to limit the scope of the one or more embodiments of the present disclosure, and any modifications, equivalent substitutions, improvements, etc. made within the spirit and principle of the one or more embodiments of the present disclosure should be included in the scope of the one or more embodiments of the present disclosure.

Claims (14)

1. A method for determining a pre-estimated weight matrix of an undirected graph, comprising:
obtaining an original weight matrix $W^0$ of an undirected graph and original grouping information; wherein any element $W_{ij}$ of any weight matrix $W$ represents the weight between vertex i and vertex j, $W_{ij} > 0$, $i \neq j$, $i, j \in \{1, 2, 3, \ldots, n\}$, and n is the total number of vertices;
taking the original weight matrix as the current weight matrix $W^x$ and taking the original grouping information as the current grouping information;
and repeatedly executing the following steps until a loop end condition is met:
calculating, according to the current weight matrix $W^x$ and the current grouping information, the current total network information of the undirected graph
Figure FDA0003264516920000011
wherein $Q_{ij}$ is the degree of difference between vertex i and vertex j, which is negatively correlated with the probability that vertex i and vertex j belong to the same group; this probability is obtained from the grouping information;
if the loop end condition is not met, updating the current weight matrix $W^x$ and the current grouping information according to a preset update algorithm based on T;
after the loop ends, determining the current weight matrix $W^x$ at the end of the loop as the estimated weight matrix of the undirected graph.
2. The method of claim 1, wherein the loop end condition is one of:
the total network information T converges;
the number of loop executions reaches a preset number;
the total network information T converges or the number of loop executions reaches a preset number.
3. The method of claim 1, wherein the grouping information is a grouping matrix P, element $P_{lk}$ of the grouping matrix represents the probability that vertex l belongs to the k-th group, $l \in \{1, 2, 3, \ldots, n\}$, $k \in \{1, 2, 3, \ldots, m\}$, and m is the total number of groups,
Figure FDA0003264516920000012
the above-mentioned
Figure FDA0003264516920000013
4. The method of claim 3, wherein updating the current weight matrix $W^x$ and the current grouping information according to a preset gradient of T comprises:
in the current weight matrix $W^x$, for
Figure FDA0003264516920000014
using the formula
Figure FDA0003264516920000015
to update
Figure FDA0003264516920000016
and normalizing the updated current weight matrix $W^x$, where $\gamma$ is the learning rate of the current weight matrix;
in the current grouping matrix $P^x$, for
Figure FDA0003264516920000017
using the formula
Figure FDA0003264516920000018
to update
Figure FDA0003264516920000019
and normalizing the updated current grouping matrix $P^x$, where $\delta$ is the learning rate of the current grouping matrix.
5. The method of claim 1, further comprising:
comparing the estimated weight matrix $W^1$ with the original weight matrix $W^0$;
for vertex i and vertex j with $W^0_{ij}$ greater than 0, if
Figure FDA00032645169200000111
is less than a preset difference $\theta_1$, determining that the obtained $W^0_{ij}$ has high confidence.
6. The method of claim 1, further comprising:
comparing the estimated weight matrix $W^1$ with the original weight matrix $W^0$;
for vertex i and vertex j with $W^0_{ij}$ equal to 0, if $W^1_{ij}$ is greater than a preset value $\theta_2$, determining that a link exists between vertex i and vertex j.
7. An apparatus for determining a pre-estimated weight matrix of an undirected graph, the apparatus comprising:
an obtaining module, configured to obtain an original weight matrix $W^0$ of an undirected graph and original grouping information; wherein any element $W_{ij}$ of any weight matrix $W$ represents the weight between vertex i and vertex j, $W_{ij} > 0$, $i \neq j$, $i, j \in \{1, 2, 3, \ldots, n\}$, and n is the total number of vertices;
an initialization module, configured to take the original weight matrix as the current weight matrix $W^x$ and the original grouping information as the current grouping information;
a loop execution module, comprising a calculation unit and an update unit, and configured to run the calculation unit and the update unit in a loop until a loop end condition is met:
the calculation unit is configured to calculate, according to the current weight matrix $W^x$ and the current grouping information, the current total network information of the undirected graph
Figure FDA0003264516920000021
wherein $Q_{ij}$ is the degree of difference between vertex i and vertex j, which is negatively correlated with the probability that vertex i and vertex j belong to the same group; this probability is obtained from the grouping information;
the update unit is configured to, if the loop end condition is not met, update the current weight matrix $W^x$ and the current grouping information according to a preset update algorithm based on T;
a pre-estimated weight matrix determining module, configured to, after the loop ends, determine the current weight matrix $W^1$ at the end of the loop as the estimated weight matrix of the undirected graph.
8. The apparatus of claim 7, wherein the loop end condition is one of:
the total network information T converges;
the number of loop executions reaches a preset number;
the total network information T converges or the number of loop executions reaches a preset number.
9. The apparatus of claim 7, wherein the grouping information is a grouping matrix P, element $P_{lk}$ of the grouping matrix represents the probability that vertex l belongs to the k-th group, $l \in \{1, 2, 3, \ldots, n\}$, $k \in \{1, 2, 3, \ldots, m\}$, and m is the total number of groups,
Figure FDA0003264516920000022
the above-mentioned
Figure FDA0003264516920000023
10. The apparatus of claim 9, wherein the update unit is specifically configured to:
in the current weight matrix $W^x$, for
Figure FDA0003264516920000024
use the formula
Figure FDA0003264516920000025
to update
Figure FDA0003264516920000026
and normalize the updated current weight matrix $W^x$, where $\gamma$ is the learning rate of the current weight matrix;
in the current grouping matrix $P^x$, for
Figure FDA0003264516920000027
use the formula
Figure FDA0003264516920000028
to update
Figure FDA0003264516920000029
and normalize the updated current grouping matrix $P^x$, where $\delta$ is the learning rate of the current grouping matrix.
11. The apparatus of claim 7, further comprising:
a comparison module, configured to compare the estimated weight matrix $W^1$ with the original weight matrix $W^0$;
a determination module, configured to, for vertex i and vertex j with $W^0_{ij}$ greater than 0, if
Figure FDA00032645169200000211
is less than a preset difference $\theta_1$, determine that the obtained $W^0_{ij}$ has high confidence.
12. The apparatus of claim 7, further comprising:
a comparison module, configured to compare the estimated weight matrix $W^1$ with the original weight matrix $W^0$;
a determination module, configured to, for vertex i and vertex j with $W^0_{ij}$ equal to 0, if $W^1_{ij}$ is greater than a preset value $\theta_2$, determine that a link exists between vertex i and vertex j.
13. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor implements the method of any one of claims 1-6 by executing the executable instructions.
14. A computer readable storage medium having stored thereon computer instructions which, when executed by a processor, carry out the steps of the method according to any one of claims 1 to 6.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111082304.3A CN113849698A (en) 2021-09-15 2021-09-15 Method and device for determining pre-estimated weight matrix of undirected graph

Publications (1)

Publication Number Publication Date
CN113849698A true CN113849698A (en) 2021-12-28

Family

ID=78974044

Country Status (1)

Country Link
CN (1) CN113849698A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination