CN117671277A - Graph feature extraction method and device, storage medium and electronic equipment - Google Patents
Graph feature extraction method and device, storage medium and electronic equipment
- Publication number
- CN117671277A (application CN202311226890.3A)
- Authority
- CN
- China
- Prior art keywords
- graph
- matrix
- feature extraction
- hidden
- determining
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
Abstract
The specification discloses a graph feature extraction method, a device, a storage medium and electronic equipment. In the graph feature extraction method provided in the present specification, a target graph is acquired; inputting the target graph into a pre-trained feature extraction model, wherein the feature extraction model at least comprises a comparison subnet and an output subnet; determining, by the comparison subnet, for each hidden graph stored in the feature extraction model, a similarity between the hidden graph and the target graph, where the hidden graph is obtained by training the feature extraction model; and outputting the graph characteristics of the target graph according to the similarity between the target graph and each hidden graph through the output subnet.
Description
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a graph feature extraction method and device, a storage medium, and electronic equipment.
Background
Nowadays, the graph is a very common data structure, and graph data built on the graph structure are widely applied in various fields. For example, graph data constructed from a user's personal information can be used for risk control prediction. In general, when applying graph data it is necessary to extract their features, that is, graph features. In this process, the quality of the extracted graph features often directly affects the performance of subsequent tasks. Therefore, extracting high-quality graph features is a vital link.
In conventional methods, when extracting graph features from graph data, a model is first constructed and the features are then extracted by the model from the vertex and edge information contained in the graph data. However, the graph features extracted in this way carry little information, and the results achieved when such features are used in subsequent tasks are often unsatisfactory.
Therefore, how to extract graph features containing more information from graph data is an urgent problem to be solved.
Disclosure of Invention
The present disclosure provides a graph feature extraction method, a device, a storage medium, and an electronic apparatus, so as to at least partially solve the foregoing problems of the prior art.
The technical scheme adopted in the specification is as follows:
the specification provides a graph feature extraction method, which comprises the following steps:
obtaining a target graph;
inputting the target graph into a pre-trained feature extraction model, wherein the feature extraction model at least comprises a comparison subnet and an output subnet;
determining, by the comparison subnet, for each hidden graph stored in the feature extraction model, a similarity between the hidden graph and the target graph, where the hidden graph is obtained by training the feature extraction model;
And outputting the graph characteristics of the target graph according to the similarity between the target graph and each hidden graph through the output subnet.
Optionally, determining the similarity between the hidden graph and the target graph specifically includes:
determining a common walk count between the hidden graph and the target graph through a random walk kernel function;
and determining the similarity between the hidden graph and the target graph according to the common walk count.
Optionally, obtaining the target graph specifically includes:
determining a target graph containing the user portrait of a user according to the service information of the user;
the method further comprises:
inputting the graph features into a pre-trained risk control model to obtain a prediction result output by the risk control model;
and determining the risk type of the user according to the prediction result.
Optionally, the feature extraction model is trained in advance, specifically including:
acquiring a sample graph and determining a graph limit of the sample graph;
sampling the graph limit to obtain a subgraph of the graph limit, and taking the subgraph as a labeling graph of the sample graph;
determining the labeling characteristics of the labeling graph;
inputting the sample graph into a feature extraction model to be trained;
Determining the similarity to be optimized between each preset hidden graph to be optimized and the sample graph through the comparison sub-network;
outputting the characteristics of the to-be-optimized graph of the sample graph according to the to-be-optimized similarity between the sample graph and each hidden graph to be optimized through the output subnet;
and training the feature extraction model with minimizing the difference between the graph features to be optimized and the labeling features as the optimization objective.
Optionally, determining a graph limit of the sample graph specifically includes:
determining an adjacency matrix of the sample graph as an original matrix;
performing matrix diffusion on the original matrix to obtain a diffusion matrix of the sample graph;
and carrying out singular value decomposition on the diffusion matrix, and determining the graph limit of the sample graph according to the decomposition result.
Optionally, singular value decomposition is performed on the diffusion matrix, and a graph limit of the sample graph is determined according to the decomposition result, which specifically includes:
singular value decomposition is carried out on the diffusion matrix to obtain a left singular matrix, a right singular matrix and singular values of the diffusion matrix;
adjusting the singular value by adopting preset appointed weight to obtain a weighted singular value;
performing matrix multiplication on the left singular matrix, the weighted singular values and the right singular matrix to obtain a limit matrix;
and determining a graph corresponding to the limit matrix as a graph limit of the sample graph.
The present specification provides a graph feature extraction device, including:
the acquisition module is used for acquiring the target graph;
the input module is used for inputting the target graph into a pre-trained feature extraction model, and the feature extraction model at least comprises a comparison subnet and an output subnet;
the comparison module is used for determining the similarity between each hidden graph and the target graph according to each hidden graph stored in the feature extraction model through the comparison sub-network, wherein the hidden graphs are obtained by training the feature extraction model;
and the output module is used for outputting the graph characteristics of the target graph according to the similarity between the target graph and each hidden graph through the output subnet.
Optionally, the comparison module is specifically configured to determine, through a random walk kernel function, a common walk count between the hidden graph and the target graph; and to determine the similarity between the hidden graph and the target graph according to the common walk count.
Optionally, the acquiring module is specifically configured to determine, according to service information of a user, a target graph including a user portrait of the user;
the device further comprises an air control module, wherein the air control module is specifically used for inputting the graph characteristics into a pre-trained air control model to obtain a prediction result output by the air control model; and determining the risk type of the user according to the prediction result.
Optionally, the device further comprises a training module, specifically configured to acquire a sample graph and determine a graph limit of the sample graph; sample the graph limit to obtain a subgraph of the graph limit, and take the subgraph as the labeling graph of the sample graph; determine the labeling features of the labeling graph; input the sample graph into the feature extraction model to be trained; determine, through the comparison subnet, the similarity to be optimized between the sample graph and each preset hidden graph to be optimized; output, through the output subnet, the graph features to be optimized of the sample graph according to the similarities to be optimized between the sample graph and each hidden graph to be optimized; and train the feature extraction model with minimizing the difference between the features to be optimized and the labeling features as the optimization objective.
Optionally, the training module is specifically configured to determine an adjacency matrix of the sample graph as an original matrix; performing matrix diffusion on the original matrix to obtain a diffusion matrix of the sample graph; and carrying out singular value decomposition on the diffusion matrix, and determining the graph limit of the sample graph according to the decomposition result.
Optionally, the training module is specifically configured to perform singular value decomposition on the diffusion matrix to obtain a left singular matrix, a right singular matrix and singular values of the diffusion matrix; adjust the singular values with preset designated weights to obtain weighted singular values; perform matrix multiplication on the left singular matrix, the weighted singular values and the right singular matrix to obtain a limit matrix; and determine the graph corresponding to the limit matrix as the graph limit of the sample graph.
The present specification provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the graph feature extraction method described above.
The present specification provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the graph feature extraction method described above when executing the program.
At least one of the technical solutions adopted in the present specification can achieve the following beneficial effects:
in the graph feature extraction method provided in the present specification, a target graph is acquired; inputting the target graph into a pre-trained feature extraction model, wherein the feature extraction model at least comprises a comparison subnet and an output subnet; determining, by the comparison subnet, for each hidden graph stored in the feature extraction model, a similarity between the hidden graph and the target graph, where the hidden graph is obtained by training the feature extraction model; and outputting the graph characteristics of the target graph according to the similarity between the target graph and each hidden graph through the output subnet.
When the graph features of the target graph are determined by the graph feature extraction method provided in this specification, a feature extraction model comprising a comparison subnet and an output subnet is employed: the comparison subnet determines the similarity between the target graph and each hidden graph, where the hidden graphs represent different topology types, and the output subnet then outputs the graph features of the target graph according to these similarities. In this way, the structural information in the target graph can be captured automatically: the relationship between the target graph and various topologies is characterized by its similarity to the different hidden graphs, and graph features containing the structural information of the target graph are finally determined from these similarities.
Drawings
The accompanying drawings, which are included to provide a further understanding of the specification, illustrate the exemplary embodiments of the present specification and, together with their description, serve to explain the specification without unduly limiting it. In the drawings:
FIG. 1 is a schematic flow chart of a method for extracting features of a graph provided in the present specification;
FIG. 2 is a schematic structural diagram of a feature extraction model provided in the present specification;
FIG. 3 is a schematic view of a graph feature extraction device provided in the present specification;
fig. 4 is a schematic view of an electronic device corresponding to fig. 1 provided in the present specification.
Detailed Description
In practical application scenarios, the information contained in graph-structured data exists not only in the features of vertices and edges but also in the structural information; extracting this structural information can support decision-making and qualitative judgement, thereby improving model performance. For example, in online shopping, users and commodities form an obvious bipartite graph, and the risks involved in such a scenario are reflected more intuitively by the structural information than by the information contained in the vertices and edges of the graph data.
However, for feature extraction from graph data, conventional methods still remain at the level of extracting the features of vertices and edges. On this basis, the present specification provides a better graph feature extraction method that can automatically capture the structural information in graph data, so as to extract more comprehensive and accurate graph features.
For the purposes of making the objects, technical solutions and advantages of the present specification more apparent, the technical solutions of the present specification will be clearly and completely described below with reference to specific embodiments of the present specification and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present specification. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the present disclosure.
The following describes in detail the technical solutions provided by the embodiments of the present specification with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of a graph feature extraction method provided in the present specification, including the following steps:
s100: and obtaining a target graph.
In the present specification, an execution body for implementing the graph feature extraction method may refer to a designated device such as a server provided on a service platform, and for convenience of description, only the server is taken as an execution body in the present specification to describe a graph feature extraction method provided in the present specification.
In the graph feature extraction method provided in the present specification, the final object to be achieved is to extract the graph features of the target graph. Based on this, the target map may be acquired first in this step. The target graph can be data of any graph structure, and the target graph comprises a plurality of vertexes and edges.
S102: inputting the target graph into a pre-trained feature extraction model, wherein the feature extraction model at least comprises a comparison subnet and an output subnet.
In the graph feature extraction method provided in the present specification, a pre-trained feature extraction model is used to obtain the graph features of the target graph. After the target graph is acquired in step S100, the target graph may be input into the feature extraction model in this step. Fig. 2 is a schematic structural diagram of the feature extraction model used in the method. As shown in fig. 2, the feature extraction model may include at least a comparison subnet and an output subnet.
The comparison sub-network is used for determining the topological structure of the target graph in a comparison mode and representing the topological structure by adopting the similarity; and the output subnetwork is used for obtaining the graph characteristics of the target graph according to the determined similarity. In addition, besides the comparison sub-network and the output sub-network, other sub-networks or network layers can also exist in the feature extraction model, and the specification is not particularly limited, and only needs to ensure that the original functions are not affected.
S104: and determining the similarity between each hidden graph and the target graph according to each hidden graph stored in the feature extraction model through the comparison sub-network, wherein the hidden graph is obtained by training the feature extraction model.
After the target graph is input into the feature extraction model, it first enters the comparison subnet. A number of hidden graphs are stored in the feature extraction model, and the comparison subnet determines the similarity between each hidden graph and the target graph. The hidden graphs may be stored directly in the feature extraction model or stored in the model through a memory network, which is not specifically limited in this specification.
In the feature extraction model adopted by this method, the hidden graphs can be regarded as standard graphs. Typically, each hidden graph has one and only one type of topology, and the topologies of the hidden graphs differ from one another; in other words, each hidden graph represents a standard topology type. Topology types may include, but are not limited to, bus, star, ring, tree, mesh, hybrid, and so on. On this basis, calculating the similarity between a hidden graph and the target graph can be regarded as determining the similarity between the target graph and one topology type. The higher the similarity between the target graph and a hidden graph, the closer the topology of the target graph is to the topology of that hidden graph. The structural information of the target graph can therefore be characterized by the similarities determined by the comparison subnet.
It should be noted that, in practice, a target graph may contain a plurality of different topologies; that is, one target graph may have high similarity with several different hidden graphs.
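As a concrete illustration of such standard topology types, the adjacency matrices of a star and a ring graph could be constructed as follows. This is a minimal sketch: the helper names and the use of NumPy adjacency matrices are assumptions for illustration, not part of the specification.

```python
import numpy as np

def star_adj(n: int) -> np.ndarray:
    """Adjacency matrix of a star topology: hub vertex 0 linked to all others."""
    a = np.zeros((n, n), dtype=int)
    a[0, 1:] = 1
    a[1:, 0] = 1
    return a

def ring_adj(n: int) -> np.ndarray:
    """Adjacency matrix of a ring topology: each vertex linked to its two neighbours."""
    a = np.zeros((n, n), dtype=int)
    for i in range(n):
        a[i, (i + 1) % n] = 1
        a[(i + 1) % n, i] = 1
    return a
```

A learned hidden graph would not be fixed like these templates, since it is adjusted during training, but templates of this kind convey what "one standard topology type per hidden graph" means.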
There may be a variety of ways to determine the similarity between a hidden graph and the target graph, and a specific embodiment is provided here for reference. Specifically, the common walk count between the hidden graph and the target graph may be determined by a random walk kernel function, and the similarity between the hidden graph and the target graph is then determined according to the common walk count.
The random walk kernel function may perform random walks on the hidden graph and the target graph respectively and record the paths taken during the walks. When the random walk kernel function can complete a random walk along the same path on both the target graph and the hidden graph, that walk may be counted as a common walk between the two graphs. Conceivably, with the total number of random walks fixed, the higher the common walk count between the target graph and a hidden graph, the more the two graphs share in topology, the more similar their topologies are, and therefore the higher their similarity.
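The common-walk idea above can be sketched with the classical product-graph formulation of random walk kernels: a walk exists simultaneously in two graphs exactly when it exists in their direct (tensor) product, so powers of the product adjacency matrix count the common walks of each length. This is a sketch under that formulation; the finite maximum walk length and the normalisation are assumptions, not details given in the specification.

```python
import numpy as np

def common_walk_count(a1: np.ndarray, a2: np.ndarray, max_len: int = 3) -> float:
    """Count common walks of length 1..max_len via the direct-product graph."""
    ax = np.kron(a1, a2)               # adjacency matrix of the direct-product graph
    total, power = 0.0, np.eye(ax.shape[0])
    for _ in range(max_len):
        power = power @ ax             # walks that are one step longer
        total += power.sum()           # number of common walks of this length
    return total

def walk_similarity(a1: np.ndarray, a2: np.ndarray, max_len: int = 3) -> float:
    """Normalise by each graph's own walk counts so the score is in (0, 1]."""
    c12 = common_walk_count(a1, a2, max_len)
    c11 = common_walk_count(a1, a1, max_len)
    c22 = common_walk_count(a2, a2, max_len)
    return c12 / np.sqrt(c11 * c22)    # 1.0 when the two graphs are identical
```

The normalisation mirrors the intuition in the text: with the total walk budget fixed, a larger shared-walk count relative to each graph's own walk counts means more similar topologies.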
In the feature extraction model employed in this method, the hidden graphs stored in the model are part of the model and can be trained along with the feature extraction model. The number of hidden graphs included in the feature extraction model may be set according to specific requirements, which is not specifically limited in this specification.
S106: and outputting the graph characteristics of the target graph according to the similarity between the target graph and each hidden graph through the output subnet.
After the similarity between each hidden graph and the target graph is determined in step S104, the graph features of the target graph can be output by the output subnet in this step according to the determined similarities. It should be noted that when the output subnet outputs the graph features of the target graph, it does not rely on the similarities alone: besides the similarities determined in this method, the original information contained in the vertices and edges of the target graph also needs to be considered. On this basis, more complete and comprehensive graph features can finally be output.
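One way to combine the similarities with the original vertex information, as described above, is to pool the vertex features and concatenate them with the similarity vector before a learned projection. The function below is a hypothetical sketch: the mean pooling and the parameters `w` and `b` (which would be learned) are assumptions for illustration.

```python
import numpy as np

def output_subnet(similarities: np.ndarray, node_feats: np.ndarray,
                  w: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Sketch of an output subnet: fuse the similarity vector with the
    graph's own pooled vertex features, then apply a linear projection."""
    pooled = node_feats.mean(axis=0)                # mean-pool the vertex information
    fused = np.concatenate([similarities, pooled])  # structure signal + attribute signal
    return w @ fused + b                            # project to the final graph feature
```
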
When the graph features of the target graph are determined by the graph feature extraction method provided in this specification, a feature extraction model comprising a comparison subnet and an output subnet is employed: the comparison subnet determines the similarity between the target graph and each hidden graph, where the hidden graphs represent different topology types, and the output subnet then outputs the graph features of the target graph according to these similarities. In this way, the structural information in the target graph can be captured automatically: the relationship between the target graph and various topologies is characterized by its similarity to the different hidden graphs, and graph features containing the structural information of the target graph are finally determined from these similarities.
After the graph features of the target graph are extracted by the graph feature extraction method provided in this specification, they can subsequently be used in many different application scenarios. The present description now takes the scenario of risk control for a user in a service as an example to describe the subsequent application of the graph features extracted by this method.
Specifically, in the application scenario of risk control for a user, a target graph containing the user portrait of the user can be determined according to the service information of the user; then the feature extraction model in this method can be used to determine the graph features of the target graph; the graph features are input into a pre-trained risk control model to obtain a prediction result output by the risk control model; and the risk type of the user is determined according to the prediction result.
The function of the risk control model is to output, according to the input graph features, the risk type of the user to whom the graph features belong. Risk types may include, but are not limited to, funding risk, account risk, transaction anomalies, and the like. Through graph features containing the structural information of the target graph, the risk control model can accurately judge the risk type of the user.
Different risk types generally correspond to different topologies, such as a star graph, a bipartite graph, a dense closed-loop graph, and the like, so the risk types of the user can be determined accurately according to the structural information of the target graph.
Additionally, the feature extraction model employed in this method may be trained in advance. Specifically, a sample graph may be acquired and its graph limit determined; the graph limit is sampled to obtain a subgraph of the graph limit, which serves as the labeling graph of the sample graph; the labeling features of the labeling graph are determined; the sample graph is input into the feature extraction model to be trained; the similarity to be optimized between the sample graph and each preset hidden graph to be optimized is determined through the comparison subnet; the graph features to be optimized of the sample graph are output through the output subnet according to the similarities to be optimized; and the feature extraction model is trained with minimizing the difference between the features to be optimized and the labeling features as the optimization objective.
The input of the feature extraction model is a graph and the output is the features of that graph, so the sample graph and the labeling features can be used correspondingly for training. The sample graph can be a graph of any structure, and the data it contains can be determined according to the training target and the subsequent application. For example, when the subsequent application is a risk control scenario for users, the data contained in the sample graph may be the user information of sample users.
When the feature extraction model is trained in this method, the labeling features can be determined by first determining the graph limit of the sample graph. For any graph, when its vertices and edges are increased continuously according to the graph's original law, the new graph obtained as the graph tends to infinity is its graph limit. In general, a graph limit may be represented in the form of an adjacency matrix. In practice, the original law of a graph is its topology. Since the graph limit is obtained by extending the vertices and edges of a graph to the limit according to this law, the graph limit can represent the topology of the graph to the greatest extent. However, because the vertices and edges in a graph limit tend to infinity, the graph limit cannot be used directly as a graph; it must be sampled to obtain a part of it, namely a subgraph of the graph limit, which is used as the labeling graph embodying the topological law of the sample graph. There are various sampling methods, such as random sampling, and the present specification does not specifically limit the sampling method.
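The sampling step described above, taking a finite induced subgraph of the (very large) graph limit as the labeling graph, could look like the following sketch. It assumes the graph limit is held as a large adjacency matrix and uses uniform random sampling, which is one of the admissible sampling methods mentioned in the text.

```python
import numpy as np

def sample_labeling_graph(limit_adj: np.ndarray, n_nodes: int, seed: int = 0) -> np.ndarray:
    """Randomly sample n_nodes vertices of the (large) limit matrix and
    return the adjacency matrix of the induced subgraph, which serves as
    the labeling graph of the sample graph."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(limit_adj.shape[0], size=n_nodes, replace=False)
    return limit_adj[np.ix_(idx, idx)]   # keep only rows/columns of the chosen vertices
```
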
Then, the features extracted from the labeling graph of the sample graph can be used as the labeling features during training; these labeling features represent the topological information of the sample graph to the greatest extent. During training, the feature extraction model can be trained by taking the minimization of the difference between the graph features to be optimized output by the model and the labeling features of the sample graph as the optimization target, adjusting both the parameters and the hidden graphs in the feature extraction model. The initial hidden graphs preset in the feature extraction model may be random graphs, which is not specifically limited in this specification.
Preferably, when determining the graph limit of the sample graph, singular value decomposition can be adopted. Specifically, the adjacency matrix of the sample graph can be determined as the original matrix; matrix diffusion is performed on the original matrix to obtain a diffusion matrix of the sample graph; singular value decomposition is then performed on the diffusion matrix, and the graph limit of the sample graph is determined according to the decomposition result. In general, performing singular value decomposition on a matrix yields a left singular matrix, a right singular matrix, and singular values. On this basis, when determining the graph limit of the sample graph according to the decomposition result, the singular values can be adjusted with a preset designated weight to obtain weighted singular values; matrix multiplication is performed on the left singular matrix, the weighted singular values, and the right singular matrix to obtain a limit matrix; and the graph corresponding to the limit matrix is determined as the graph limit of the sample graph.
Here, the matrix diffusion operation is a process of finding higher-order adjacencies of a matrix. For example, for any matrix, performing one matrix diffusion operation on the matrix finds all pairs of points in the matrix that have a second-order adjacency and connects those points with edges. In colloquial terms, for any point in the matrix, an edge is added between that point and every point that can be reached by traversing two edges. It follows that performing one matrix diffusion operation links together all points having a second-order adjacency; performing a second matrix diffusion operation links together all points having a third-order adjacency, and so on. In the method, the number of matrix diffusion operations applied to the original matrix can be determined according to requirements, which is not specifically limited in this specification.
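A single diffusion step as described, connecting every pair of points joined by a path of at most two edges, might be sketched as follows. The numpy implementation, the function name, and the removal of self-loops are illustrative assumptions:

```python
import numpy as np

def diffuse(adj: np.ndarray) -> np.ndarray:
    """One matrix diffusion step: link every pair of points that have a
    second-order adjacency (reachable within two edges)."""
    reach = adj + adj @ adj            # walk counts of length 1 or 2
    diffused = (reach > 0).astype(adj.dtype)
    np.fill_diagonal(diffused, 0)      # keep the graph free of self-loops
    return diffused
```

On a 4-point path graph 0-1-2-3, one diffusion step adds the edges (0, 2) and (1, 3) but not (0, 3), which is three edges away; repeated application links progressively higher-order adjacencies, matching the iteration count chosen per requirements.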
The designated weight is a preset weight used to adjust the singular values; it can be determined according to specific requirements and can be set, for example, to 10%. Matrix multiplication can be performed on the left singular matrix, the weighted singular values, and the right singular matrix according to the following formula to obtain the limit matrix:
M=U×(ω·S)×V
wherein M represents the limit matrix, U represents the left singular matrix, V represents the right singular matrix, S represents the singular values, and ω represents the preset designated weight. The limit matrix can be regarded as the adjacency matrix of the graph limit, and the graph corresponding to the limit matrix can be determined as the graph limit of the sample graph.
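The decomposition-and-recomposition step can be sketched with numpy's SVD. Note that `np.linalg.svd` returns the right singular matrix already transposed, so the product below matches the formula M = U × (ω·S) × V with S expanded into a diagonal matrix; the default ω = 0.1 mirrors the 10% example above and is an assumption, not a prescribed value:

```python
import numpy as np

def limit_matrix(diffusion: np.ndarray, omega: float = 0.1) -> np.ndarray:
    """Weight the singular values of the diffusion matrix by omega and
    recompose, giving the adjacency matrix of the graph limit."""
    U, S, Vt = np.linalg.svd(diffusion)   # numpy returns V transposed
    return U @ np.diag(omega * S) @ Vt
```

With ω = 1 the product reconstructs the diffusion matrix exactly; scaling the singular values by ω scales the recomposed matrix by the same factor, so ω acts as a global damping of edge weights in the limit matrix.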
The above is one or more embodiments of the method for extracting the graph features in the present specification, and based on the same concept, the present specification further provides a corresponding device for extracting the graph features, as shown in fig. 3.
Fig. 3 is a schematic diagram of a graph feature extraction device provided in the present specification, including:
an acquisition module 200, configured to acquire a target graph;
an input module 202, configured to input the target graph into a pre-trained feature extraction model, where the feature extraction model includes at least a comparison subnet and an output subnet;
a comparison module 204, configured to determine, for each hidden graph stored in the feature extraction model, a similarity between the hidden graph and the target graph through the comparison subnet, where the hidden graph is obtained by training the feature extraction model;
And the output module 206 is configured to output, through the output subnet, the graph features of the target graph according to the similarity between the target graph and each hidden graph.
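Taken together, modules 202, 204, and 206 amount to the following forward pass: score the target graph against every stored hidden graph, then map the resulting similarity vector to the graph feature. The single linear layer standing in for the output subnet, the `compare` callable, and the parameter names are assumptions; the patent fixes only the two-subnet structure, not the architecture:

```python
import numpy as np

def extract_features(target, hidden_graphs, compare, weights, bias):
    """Comparison subnet: similarity of the target graph to each hidden
    graph; output subnet (here a single linear layer): similarity
    vector -> graph features."""
    sims = np.array([compare(h, target) for h in hidden_graphs])
    return weights @ sims + bias
```

Because the hidden graphs are trained along with the output subnet's parameters, they act as learned reference structures: the feature of any target graph is expressed entirely through its similarity to them.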
Optionally, the comparison module 204 is specifically configured to determine, through a random walk kernel function, the common walk quantity between the hidden graph and the target graph; and to determine the similarity between the hidden graph and the target graph according to the common walk quantity.
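One standard way to realize a random walk kernel, counting walks that the two graphs share on their direct (tensor) product graph with a geometric discount, is sketched below. The truncation depth, the discount λ, and the size normalization in `similarity` are assumptions; the patent states only that a common walk quantity is computed and mapped to a similarity:

```python
import numpy as np

def common_walks(adj1: np.ndarray, adj2: np.ndarray,
                 lam: float = 0.1, steps: int = 3) -> float:
    """Truncated random walk kernel: sum of discounted walk counts on
    the direct product of the two graphs (a walk there corresponds to a
    pair of simultaneous walks, one in each graph)."""
    ax = np.kron(adj1, adj2)              # product-graph adjacency
    total, power = 0.0, np.eye(ax.shape[0])
    for k in range(1, steps + 1):
        power = power @ ax                # length-k walk counts
        total += (lam ** k) * power.sum()
    return total

def similarity(hidden: np.ndarray, target: np.ndarray) -> float:
    # Normalize by graph sizes so larger graphs do not dominate.
    return common_walks(hidden, target) / (hidden.shape[0] * target.shape[0])
```

The kernel is symmetric in its two arguments, matching the intuition that the similarity between the hidden graph and the target graph should not depend on their order.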
Optionally, the acquisition module 200 is specifically configured to determine, according to the service information of a user, a target graph containing the user portrait of the user;
the device further comprises a wind control module 208, specifically configured to input the graph features into a pre-trained wind control model to obtain a prediction result output by the wind control model; and to determine the risk type of the user according to the prediction result.
Optionally, the apparatus further includes a training module 210, specifically configured to acquire a sample graph and determine the graph limit of the sample graph; sample the graph limit to obtain a sub graph of the graph limit and take the sub graph as the labeling graph of the sample graph; determine the labeling features of the labeling graph; input the sample graph into the feature extraction model to be trained; determine, through the comparison subnet, the to-be-optimized similarity between each preset to-be-optimized hidden graph and the sample graph; output, through the output subnet, the to-be-optimized features of the sample graph according to the to-be-optimized similarity between the sample graph and each to-be-optimized hidden graph; and train the feature extraction model with the minimization of the difference between the to-be-optimized features and the labeling features as the optimization target.
Optionally, the training module 210 is specifically configured to determine an adjacency matrix of the sample graph as an original matrix; performing matrix diffusion on the original matrix to obtain a diffusion matrix of the sample graph; and carrying out singular value decomposition on the diffusion matrix, and determining the graph limit of the sample graph according to the decomposition result.
Optionally, the training module 210 is specifically configured to perform singular value decomposition on the diffusion matrix to obtain a left singular matrix, a right singular matrix, and singular values of the diffusion matrix; adjusting the singular value by adopting preset appointed weight to obtain a weighted singular value; performing matrix multiplication on the left singular matrix, the singular value and the right singular matrix to obtain a limit matrix; and determining a graph corresponding to the limit matrix as a graph limit of the sample graph.
The present specification also provides a computer readable storage medium storing a computer program operable to perform a graph feature extraction method as provided in fig. 1 above.
The present specification also provides a schematic structural diagram, shown in fig. 4, of an electronic device corresponding to fig. 1. At the hardware level, as shown in fig. 4, the electronic device includes a processor, an internal bus, a network interface, a memory, and a non-volatile storage, and may of course also include hardware required by other services. The processor reads the corresponding computer program from the non-volatile storage into the memory and then runs it to implement the graph feature extraction method described above with respect to fig. 1. Of course, other implementations, such as logic devices or combinations of hardware and software, are not excluded from the present specification; that is, the execution subject of the above processing flow is not limited to each logic unit and may also be hardware or a logic device.
In the 1990s, an improvement to a technology could be clearly distinguished as an improvement in hardware (for example, an improvement to a circuit structure such as a diode, a transistor, or a switch) or an improvement in software (an improvement to a method flow). With the development of technology, however, many improvements of method flows today can be regarded as direct improvements of hardware circuit structures. Designers almost always obtain a corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement of a method flow cannot be realized by a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD) (such as a field programmable gate array (Field Programmable Gate Array, FPGA)) is an integrated circuit whose logic function is determined by the user's programming of the device. A designer programs to "integrate" a digital system onto a PLD, without requiring a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, nowadays, instead of manually fabricating integrated circuit chips, this programming is mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled is written in a specific programming language called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used.
It will also be apparent to those skilled in the art that a hardware circuit implementing a logic method flow can be readily obtained merely by slightly logically programming the method flow in one of the above hardware description languages and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer readable medium storing computer readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller. Examples of such controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320; a memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art will also appreciate that, in addition to implementing the controller in pure computer readable program code, it is entirely possible to implement the same functionality by logically programming the method steps so that the controller takes the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may thus be regarded as a hardware component, and the means included therein for performing various functions may also be regarded as structures within the hardware component. Indeed, the means for performing various functions may be regarded both as software modules implementing the method and as structures within the hardware component.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being functionally divided into various units, respectively. Of course, the functions of each element may be implemented in one or more software and/or hardware elements when implemented in the present specification.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present description is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, random Access Memory (RAM) and/or nonvolatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
Computer readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device. As defined herein, computer readable media do not include transitory computer readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for system embodiments, since they are substantially similar to method embodiments, the description is relatively simple, as relevant to see a section of the description of method embodiments.
The foregoing is merely exemplary of the present disclosure and is not intended to limit the disclosure. Various modifications and alterations to this specification will become apparent to those skilled in the art. Any modifications, equivalent substitutions, improvements, or the like, which are within the spirit and principles of the present description, are intended to be included within the scope of the claims of the present description.
Claims (14)
1. A graph feature extraction method, comprising:
obtaining a target graph;
inputting the target graph into a pre-trained feature extraction model, wherein the feature extraction model at least comprises a comparison subnet and an output subnet;
determining, by the comparison subnet, for each hidden graph stored in the feature extraction model, a similarity between the hidden graph and the target graph, where the hidden graph is obtained by training the feature extraction model;
And outputting the graph characteristics of the target graph according to the similarity between the target graph and each hidden graph through the output subnet.
2. The method of claim 1, determining a similarity between the hidden graph and the target graph, specifically comprising:
determining a common walk quantity between the hidden graph and the target graph through a random walk kernel function;
and determining the similarity between the hidden graph and the target graph according to the common walk quantity.
3. The method of claim 1, obtaining a target graph, specifically comprising:
determining a target graph containing user portrait of the user according to the service information of the user;
the method further comprises the steps of:
inputting the graph characteristics into a pre-trained wind control model to obtain a prediction result output by the wind control model;
and determining the risk type of the user according to the prediction result.
4. The method of claim 1, pre-training a feature extraction model, comprising in particular:
acquiring a sample graph and determining a graph limit of the sample graph;
sampling the graph limit to obtain a sub graph of the graph limit, and taking the sub graph as a labeling graph of the sample graph;
determining the labeling characteristics of the labeling graph;
Inputting the sample graph into a feature extraction model to be trained;
determining the similarity to be optimized between each preset hidden graph to be optimized and the sample graph through the comparison sub-network;
outputting the characteristics of the to-be-optimized graph of the sample graph according to the to-be-optimized similarity between the sample graph and each hidden graph to be optimized through the output subnet;
and training the feature extraction model with the minimization of the difference between the to-be-optimized features and the labeling features as the optimization target.
5. The method of claim 4, determining a graph limit for the sample graph, comprising:
determining an adjacency matrix of the sample graph as an original matrix;
performing matrix diffusion on the original matrix to obtain a diffusion matrix of the sample graph;
and carrying out singular value decomposition on the diffusion matrix, and determining the graph limit of the sample graph according to the decomposition result.
6. The method of claim 5, wherein the singular value decomposition is performed on the diffusion matrix, and the graph limit of the sample graph is determined according to the decomposition result, and specifically comprises:
singular value decomposition is carried out on the diffusion matrix to obtain a left singular matrix, a right singular matrix and singular values of the diffusion matrix;
Adjusting the singular value by adopting preset appointed weight to obtain a weighted singular value;
performing matrix multiplication on the left singular matrix, the singular value and the right singular matrix to obtain a limit matrix;
and determining a graph corresponding to the limit matrix as a graph limit of the sample graph.
7. A graph feature extraction apparatus comprising:
the acquisition module is used for acquiring the target graph;
the input module is used for inputting the target graph into a pre-trained feature extraction model, and the feature extraction model at least comprises a comparison subnet and an output subnet;
the comparison module is used for determining the similarity between each hidden graph and the target graph according to each hidden graph stored in the feature extraction model through the comparison sub-network, wherein the hidden graphs are obtained by training the feature extraction model;
and the output module is used for outputting the graph characteristics of the target graph according to the similarity between the target graph and each hidden graph through the output subnet.
8. The apparatus of claim 7, the comparison module being configured to determine a common number of walks between the hidden graph and the target graph by a random walk kernel; and to determine the similarity between the hidden graph and the target graph according to the common number of walks.
9. The device of claim 8, wherein the acquisition module is specifically configured to determine a target graph containing a user portrait of the user according to service information of the user;
the device further comprises an air control module, wherein the air control module is specifically used for inputting the graph characteristics into a pre-trained air control model to obtain a prediction result output by the air control model; and determining the risk type of the user according to the prediction result.
10. The apparatus of claim 7, further comprising a training module, specifically configured to acquire a sample graph and determine the graph limit of the sample graph; sample the graph limit to obtain a sub graph of the graph limit and take the sub graph as the labeling graph of the sample graph; determine the labeling features of the labeling graph; input the sample graph into the feature extraction model to be trained; determine, through the comparison subnet, the to-be-optimized similarity between each preset to-be-optimized hidden graph and the sample graph; output, through the output subnet, the to-be-optimized features of the sample graph according to the to-be-optimized similarity between the sample graph and each to-be-optimized hidden graph; and train the feature extraction model with the minimization of the difference between the to-be-optimized features and the labeling features as the optimization target.
11. The apparatus of claim 10, the training module being specifically configured to determine an adjacency matrix of the sample graph as an original matrix; performing matrix diffusion on the original matrix to obtain a diffusion matrix of the sample graph; and carrying out singular value decomposition on the diffusion matrix, and determining the graph limit of the sample graph according to the decomposition result.
12. The apparatus of claim 11, wherein the training module is specifically configured to perform singular value decomposition on the diffusion matrix to obtain a left singular matrix, a right singular matrix, and singular values of the diffusion matrix; adjusting the singular value by adopting preset appointed weight to obtain a weighted singular value; performing matrix multiplication on the left singular matrix, the singular value and the right singular matrix to obtain a limit matrix; and determining a graph corresponding to the limit matrix as a graph limit of the sample graph.
13. A computer readable storage medium storing a computer program which, when executed by a processor, implements the method of any of the preceding claims 1-6.
14. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of any of the preceding claims 1-6 when the program is executed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311226890.3A CN117671277A (en) | 2023-09-21 | 2023-09-21 | Graph feature extraction method and device, storage medium and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311226890.3A CN117671277A (en) | 2023-09-21 | 2023-09-21 | Graph feature extraction method and device, storage medium and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117671277A true CN117671277A (en) | 2024-03-08 |
Family
ID=90062947
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311226890.3A Pending CN117671277A (en) | 2023-09-21 | 2023-09-21 | Graph feature extraction method and device, storage medium and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117671277A (en) |
- 2023-09-21: CN application CN202311226890.3A filed, published as CN117671277A (en), status Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN115828162B (en) | Classification model training method and device, storage medium and electronic equipment | |
CN112735407B (en) | Dialogue processing method and device | |
CN117194992B (en) | Model training and task execution method and device, storage medium and equipment | |
CN115841335B (en) | Data processing method, device and equipment | |
CN115600157B (en) | Data processing method and device, storage medium and electronic equipment | |
CN116049761A (en) | Data processing method, device and equipment | |
CN112365513A (en) | Model training method and device | |
CN117409466B (en) | Three-dimensional dynamic expression generation method and device based on multi-label control | |
CN116434787B (en) | Voice emotion recognition method and device, storage medium and electronic equipment | |
CN115545572B (en) | Method, device, equipment and storage medium for business wind control | |
CN116186330B (en) | Video deduplication method and device based on multi-mode learning | |
CN116453615A (en) | Prediction method and device, readable storage medium and electronic equipment | |
CN117671277A (en) | Graph feature extraction method and device, storage medium and electronic equipment | |
CN110704742B (en) | Feature extraction method and device | |
CN114154579A (en) | Image classification method and device, storage medium and electronic equipment | |
CN115952271B (en) | Method and device for generating dialogue information, storage medium and electronic equipment | |
CN115862675B (en) | Emotion recognition method, device, equipment and storage medium | |
CN115795342B (en) | Method and device for classifying business scenes, storage medium and electronic equipment | |
CN115545938B (en) | Method, device, storage medium and equipment for executing risk identification service | |
CN116563387A (en) | Training method and device of calibration model, storage medium and electronic equipment | |
CN117909926A (en) | Risk identification method and device, storage medium and electronic equipment | |
CN116543759A (en) | Speech recognition processing method and device | |
CN117591217A (en) | Information display method, device, equipment and storage medium | |
CN117633644A (en) | Multitasking learning method and device, storage medium and electronic equipment | |
CN118229294A (en) | Wind control method and device, storage medium and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |