CN114781553B - Unsupervised patent clustering method based on parallel multi-graph convolution neural network


Info

Publication number
CN114781553B
CN114781553B (application CN202210695144.8A)
Authority
CN
China
Prior art keywords
attention
graph
vector
patent data
vectors
Prior art date
Legal status
Active
Application number
CN202210695144.8A
Other languages
Chinese (zh)
Other versions
CN114781553A (en)
Inventor
韩蒙
梁兵
况欢
陈灏毅
陈唯
林昶廷
Current Assignee
Binjiang Research Institute Of Zhejiang University
Original Assignee
Binjiang Research Institute Of Zhejiang University
Priority date
Filing date
Publication date
Application filed by Binjiang Research Institute Of Zhejiang University
Priority: CN202210695144.8A
Publication of CN114781553A
Application granted
Publication of CN114781553B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an unsupervised patent clustering method based on a parallel multi-graph convolutional neural network. On the basis of 4 types of patent graphs and the coding vectors produced by an autoencoder over the patent data, graph convolution operations fully exploit the 4 graph types together with the coding vectors to comprehensively extract effective feature vectors of the patent data. A parallel single-graph self-attention module assigns weights to each type of feature vector, raising the importance of the salient features of each single graph to obtain single-graph attention vectors. A multi-graph attention module then fuses and learns over the single-graph attention vectors of all types, assigning larger weights to the more important single graphs. The resulting global attention vector integrates feature information from multiple aspects and improves clustering precision.

Description

Unsupervised patent clustering method based on parallel multi-graph convolution neural network
Technical Field
The invention belongs to the technical field of patent classification, and particularly relates to an unsupervised patent clustering method based on a parallel multi-graph convolution neural network.
Background
Analysis of patent data reveals specific market trends and the innovation strength of organizations. People often search for patents on intellectual property platforms using information such as patent names, keywords, and CPC (Cooperative Patent Classification) codes. The CPC is an extension of the IPC (International Patent Classification) and is jointly managed by the EPO (European Patent Office) and the United States Patent and Trademark Office. It is divided into nine sections, A-H and Y, which are in turn divided into classes, subclasses, groups, and subgroups, with approximately 250,000 classification entries. Whichever institution processes and approves a patent determines the classification code assigned to the invention, and once a patent application is approved, its CPC code can no longer be changed. It is therefore extremely important for a patent applicant to predict the patent's CPC code in advance.
At present, patent CPC codes are mostly assigned manually by checking patent names, abstracts, and full texts to match the corresponding codes, which is tedious for patent examiners and error-prone.
Some researchers have studied NLP (Natural Language Processing) techniques, classifying patents with word-embedding systems and machine-learning classification models; this improves the speed and accuracy of patent classification and reduces labor cost.
Deep learning methods have also been studied for patent classification, including Convolutional Neural Networks (CNNs), Graph Neural Networks (GNNs), and Graph Convolutional Networks (GCNs). A graph convolutional network introduces graph embedding to capture the structural information of the original patent samples and exploits the important relations between nodes through convolution operations on the graph, giving the model better representations and better patent classification capability. However, a traditional graph convolutional network focuses only on the embedding of a single graph and depends heavily on the quality of that graph, so the model's generalization is insufficient. Moreover, labeled training samples for fine-grained classification are scarce, which leaves supervised models with insufficient classification performance and makes them inadequate for fine-grained classification of CPC codes.
Patent document CN109446319A discloses a K-means-based clustering analysis method for biomedical patents, which selects four important evaluation indexes used in patent analysis (patent application volume, patent grant volume, patent growth rate, and patent efficiency) as clustering variables. It can deeply mine the associations within the data and group patent data well, but it cannot classify patent CPC codes.
Disclosure of Invention
In view of the above, the invention provides an unsupervised patent clustering method based on a parallel multi-graph convolutional neural network, which improves the precision of fine-grained patent classification under unsupervised learning.
In order to achieve the above object, an embodiment of the present invention provides an unsupervised patent clustering method based on a parallel multi-graph convolutional neural network, including the following steps:
vectorizing the patent data to be clustered to obtain vectorized patent data;
constructing multiple types of patent graphs from the vectorized patent data, the types comprising a KNN patent graph built from patent similarity, a co-applicant patent graph, a co-inventor patent graph, and a co-keyword patent graph;
processing the patent data to be clustered with a model constructed for unsupervised learning, comprising: vector-encoding each item of vectorized patent data with the encoder of an autoencoder to obtain coding vectors; extracting, in parallel, the feature vectors of each type of patent graph combined with the coding vectors, using the graph convolutional networks of a parallel graph convolution module; computing, in parallel, a single-graph attention vector from each type of feature vector, using the single-graph self-attention layers of a parallel single-graph self-attention module; and computing the global attention vector of each patent datum from the single-graph attention vectors of all types, using a multi-graph attention module;
and clustering the global attention vectors of all the patent data to obtain a clustering result.
In one embodiment, each patent datum includes the invention title, abstract, applicant, and inventor, and these fields are vectorized to obtain the vectorized patent data.
In one embodiment, when constructing the multiple types of patent graphs, each patent serves as a node and its vectorized patent data as the node attributes; the connecting edges between nodes differ with the type of patent graph and are constructed as follows:
for the KNN patent graph, the similarity between every pair of patent data is computed, and for each patent the k most similar patent data are selected as its neighborhood patent data; connecting edges are then built between the nodes of each patent and its neighborhood patent data;
for the co-applicant patent graph, connecting edges are built between nodes that share an applicant;
for the co-inventor patent graph, connecting edges are built between nodes that share an inventor;
for the co-keyword patent graph, connecting edges are built between nodes that share a keyword.
In one embodiment, the encoder comprises L coding layers, and the input vectorized patent data passes through these coding layers to yield a coding vector from each layer;
each graph convolutional network corresponding to a patent-graph type likewise comprises L graph convolution layers, equal in number to the coding layers. Each graph convolution layer first performs a weight distribution between the coding vector output by the corresponding coding layer and the feature vector output by the previous graph convolution layer, then takes the weighted feature vector as input to the current graph convolution, combined with the adjacency matrix of the patent-graph type, and outputs the feature vector:

$$Z_v^{(l)} = \mathrm{ReLU}\!\left(\tilde{D}_v^{-\frac{1}{2}} \tilde{A}_v \tilde{D}_v^{-\frac{1}{2}} \left((1-\epsilon)\, Z_v^{(l-1)} + \epsilon\, H^{(l-1)}\right) W_v^{(l)}\right)$$

where $l$ indexes the network layer, $v$ indexes the patent-graph type, $\epsilon$ is the weight balancing the importance of the coding vector and the feature vector, $H^{(l-1)}$ is the coding vector output by coding layer $l-1$, $Z_v^{(l-1)}$ and $Z_v^{(l)}$ are the feature vectors output by graph convolution layers $l-1$ and $l$ for patent-graph type $v$, $(1-\epsilon) Z_v^{(l-1)} + \epsilon H^{(l-1)}$ is the weighted feature vector, $W_v^{(l)}$ is the weight of graph convolution layer $l$ for type $v$, $\tilde{A}_v = A_v + I$ is the sum of the adjacency matrix $A_v$ of patent-graph type $v$ and the identity matrix, $\tilde{D}_v$ is the diagonal degree matrix of $\tilde{A}_v$, and $\mathrm{ReLU}()$ is the ReLU activation function;
for the first graph convolution layer, $Z_v^{(0)} = X$, the node attribute matrix of each patent graph.
In one embodiment, each single-graph self-attention layer computes a single-graph attention vector from its type's feature vectors in parallel, as follows: attention weights are first computed from each type of feature vector, and an activation is then applied to the feature vectors weighted by those attention weights, yielding the single-graph attention vector corresponding to each type of feature vector.
In one embodiment, computing the global attention vector of each patent datum from the single-graph attention vectors of all types with the multi-graph attention module comprises: first applying a nonlinear transformation to each type's single-graph attention vector to obtain a multi-layer attention value per type; then normalizing each type's multi-layer attention value against those of all types to obtain a global attention weight per type; and finally computing the weighted sum of the single-graph attention vectors of all types under these global attention weights to obtain the global attention vector of each patent datum.
In one embodiment, the model requires parameter optimization before being applied, comprising:
decoding the coding vectors output by the encoder with the decoder of the autoencoder to obtain reconstructed patent data for each item of vectorized patent data;
constructing a total loss: a reconstruction loss built from the vectorized patent data input to the autoencoder and the reconstructed patent data it outputs, and a multi-graph correlation loss built from the single-graph attention vectors of all types, the total loss being their weighted sum;
and optimizing the model parameters with the total loss in an unsupervised manner to obtain the optimized model.
In one embodiment, constructing the reconstruction loss from the vectorized patent data input to the autoencoder and the reconstructed patent data comprises: building the reconstruction loss from the squared Euclidean norm between the vectorized patent data and the reconstructed patent data, summed over all patent data.
In one embodiment, constructing the multi-graph correlation loss from the single-graph attention vectors of all types comprises: first computing the autocorrelation similarity of each type's single-graph attention vectors, then building the multi-graph correlation loss from the squared Euclidean norm between the autocorrelation similarities of every two types.
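The two loss terms described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the function names are hypothetical, and the autocorrelation similarity of a type is taken as the plain Gram matrix S Sᵀ of its single-graph attention vectors, which the patent does not spell out in this excerpt.

```python
import numpy as np

def reconstruction_loss(X, X_hat):
    # squared Euclidean (Frobenius) norm between vectorized and reconstructed data
    return np.sum((X - X_hat) ** 2)

def multigraph_correlation_loss(S_list):
    """S_list holds one (n_patents x d) single-graph attention matrix per
    patent-graph type. The autocorrelation similarity of a type is taken
    here as S @ S.T (an assumption); the loss sums the squared Euclidean
    norms of the pairwise differences between types."""
    C = [S @ S.T for S in S_list]
    loss = 0.0
    for a in range(len(C)):
        for b in range(a + 1, len(C)):
            loss += np.sum((C[a] - C[b]) ** 2)
    return loss
```

The total loss would then be a weighted sum of the two terms, with the weight treated as a hyperparameter.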
In one embodiment, the unsupervised patent clustering method further includes:
performing CPC code classification on each patent datum according to the clustering result: patent data belonging to the same cluster are considered to share the same CPC code, so once the CPC code of one patent datum in a cluster is judged manually, the CPC codes of all other patent data in that cluster follow.
Compared with the prior art, the method has the beneficial effects that at least:
on the basis of constructing 4 types of patent drawings and coding vectors of patent data from a coder, 4 types of patent drawings and coding vectors are fully extracted through a drawing convolution operation, effective feature vectors of the patent data are comprehensively extracted, weights are distributed to each type of feature vectors through a parallel single-drawing self-attention module, the importance degree of important features of a single drawing is improved to obtain a single-drawing attention vector, the single-drawing attention vectors of all types are fused through a multi-drawing attention module for learning, and larger weights are distributed to the important single drawing, so that the obtained global attention vector integrates multi-aspect feature information, and the clustering precision is improved.
The model is built on unsupervised learning, which improves its generalization for deep clustering of patent data in the absence of fine-grained classification labels, improves the comprehensiveness of its feature extraction, and thus improves the effectiveness of patent data clustering.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a flowchart of an unsupervised patent clustering method based on a parallel multi-graph convolutional neural network provided by an embodiment;
FIG. 2 is a schematic diagram of a model provided by the embodiment;
FIG. 3 is a schematic view of the structure of each convolution layer provided by the embodiment;
FIG. 4 is a schematic structural diagram of each single-drawing self-attention layer provided by the embodiment;
fig. 5 is a schematic structural diagram of a multi-view attention module according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and examples. It should be understood that the detailed description and specific examples are intended for purposes of illustration only and are not intended to limit the scope of the invention.
The invention addresses the problem that, when training samples with fine-grained patent classification labels are too few, supervised classification models have insufficient classification performance, and the further problem that relying on a single patent graph leaves a classification model with insufficient generalization and hence inaccurate patent classification. The embodiment provides an unsupervised patent clustering method based on a parallel multi-graph convolutional neural network, which improves the precision of fine-grained patent classification under unsupervised learning.
Fig. 1 is a flowchart of an unsupervised patent clustering method based on a parallel multi-graph convolutional neural network according to an embodiment. As shown in fig. 1, the unsupervised patent clustering method based on the parallel multi-graph convolutional neural network provided in the embodiment includes the following steps:
step 1, vectorizing the patent data to be clustered to obtain vectorized patent data.
In the embodiment, each item of patent data to be clustered corresponds to one patent document and includes the patent's title, abstract, applicant, and inventor; these fields are vectorized to obtain the vectorized patent data, which is represented as a one-dimensional vector group.
And 2, constructing multiple types of patent diagrams according to the vectorized patent data.
In an embodiment, the multiple types of patent graphs comprise a KNN (K-nearest-neighbor) patent graph built from patent similarity, a co-applicant patent graph, a co-inventor patent graph, and a co-keyword patent graph. When constructing them, each patent serves as a node and its vectorized patent data as the node attributes; the connecting edges between nodes are built differently for each graph type, as follows:
For the KNN patent graph, the similarity between every pair of patent data is computed, and for each patent the k most similar patent data are selected as its neighborhood patent data; connecting edges are built between the corresponding nodes, forming the KNN patent graph.
In one embodiment, the cosine similarity between any two patent data can be used, screening for each patent the k patent data with the highest cosine similarity as its neighborhood patent data, from which the connecting edges are built.
In the embodiment, for the co-applicant patent graph, connecting edges are built between nodes sharing an applicant, forming the co-applicant patent graph; for the co-inventor patent graph, connecting edges are built between nodes sharing an inventor; and for the co-keyword patent graph, connecting edges are built between nodes sharing a keyword, where the keywords are extracted from the invention titles and abstracts.
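The KNN patent-graph construction described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the function name `knn_patent_graph` is hypothetical, and edges are stored as a symmetric 0/1 adjacency matrix.

```python
import numpy as np

def knn_patent_graph(X, k):
    """Build a KNN patent-graph adjacency matrix from row-vectorized
    patent data X (n_patents x dim): cosine similarity between every
    pair, then a connecting edge from each patent to its k most
    similar neighborhood patents."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    sim = Xn @ Xn.T                        # pairwise cosine similarity
    np.fill_diagonal(sim, -np.inf)         # a patent is not its own neighbor
    A = np.zeros_like(sim)
    for i in range(len(X)):
        for j in np.argsort(sim[i])[-k:]:  # k most similar patents
            A[i, j] = A[j, i] = 1.0        # undirected connecting edge
    return A
```

The co-applicant, co-inventor, and co-keyword graphs can be built the same way, replacing the top-k similarity test with a shared-field test between the two patents.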
And 3, calculating the patent data to be clustered by using the model constructed based on unsupervised learning to obtain the global attention vector of each patent data.
Fig. 2 is a schematic structural diagram of the model provided in the embodiment. As shown in fig. 2, the constructed model comprises an autoencoder (an encoder and a decoder), a parallel graph convolution neural network module, a parallel single-graph self-attention module, and a multi-graph attention module. The encoder vector-encodes the vectorized patent data into coding vectors; the decoder decodes the coding vectors into reconstructed patent data; the parallel graph convolution module extracts, in parallel, the feature vectors of each type of patent graph combined with the coding vectors; the parallel single-graph self-attention module computes single-graph attention vectors from each type of feature vector in parallel; and the multi-graph attention module computes the global attention vector of each patent datum from the single-graph attention vectors of all types.
In an embodiment, the encoder comprises L coding layers, and the input vectorized patent data passes through them to yield a coding vector from each layer:

$$H^{(l)} = \mathrm{ReLU}\!\left(W_e^{(l)} H^{(l-1)} + b_e^{(l)}\right)$$

where $l$ indexes the coding layer, $\mathrm{ReLU}()$ is the ReLU activation function, $W_e^{(l)}$ and $b_e^{(l)}$ are the weight and bias of coding layer $l$, and $H^{(l-1)}$ and $H^{(l)}$ are the coding vectors output by coding layers $l-1$ and $l$. In particular, for the first coding layer ($l=1$), $H^{(0)}$ is the input vectorized patent data. Each coding layer can be a fully connected layer, and the resulting coding vectors enhance the data representation of the patent graphs.
In an embodiment, the decoder has the same number of layers as the encoder, comprising L decoding layers; the input coding vector passes through them, and the decoded vector output by the last decoding layer serves as the reconstructed patent data used to build the reconstruction loss:

$$\hat{H}^{(l)} = \mathrm{ReLU}\!\left(W_d^{(l)} \hat{H}^{(l-1)} + b_d^{(l)}\right)$$

where $W_d^{(l)}$ and $b_d^{(l)}$ are the weight and bias of decoding layer $l$, and $\hat{H}^{(l-1)}$ and $\hat{H}^{(l)}$ are the decoded vectors output by decoding layers $l-1$ and $l$. In particular, for the first decoding layer ($l=1$), $\hat{H}^{(0)}$ is the input coding vector.
In the embodiment, the parallel graph convolution neural network module contains one graph convolutional network per patent-graph type, i.e., 4 graph convolutional networks for the 4 types of patent graphs; in parallel, each network extracts the features of its patent-graph type combined with the coding vectors, yielding feature vectors for the 4 graph types.
In the embodiment, each graph convolutional network corresponding to a patent-graph type includes L graph convolution layers, equal in number to the coding layers. As shown in fig. 3, each graph convolution layer comprises a weight distribution operation and a graph convolution operation: the layer first performs a weight distribution between the coding vector output by the corresponding coding layer (the coding layer with the same index as the convolution layer) and the feature vector output by the previous graph convolution layer, then takes the weighted feature vector as input to the current graph convolution, combined with the adjacency matrix of the patent-graph type, and outputs the feature vector:

$$Z_v^{(l)} = \mathrm{ReLU}\!\left(\tilde{D}_v^{-\frac{1}{2}} \tilde{A}_v \tilde{D}_v^{-\frac{1}{2}} \left((1-\epsilon)\, Z_v^{(l-1)} + \epsilon\, H^{(l-1)}\right) W_v^{(l)}\right)$$

where $l$ indexes the network layer (coding layer or graph convolution layer), $v$ indexes the patent-graph type (the KNN, co-applicant, co-inventor, and co-keyword patent graphs), $\epsilon$ is the weight balancing the importance of the coding vector and the feature vector, $Z_v^{(l-1)}$ and $Z_v^{(l)}$ are the feature vectors output by graph convolution layers $l-1$ and $l$ for type $v$, $(1-\epsilon) Z_v^{(l-1)} + \epsilon H^{(l-1)}$ is the weighted feature vector, $W_v^{(l)}$ is the weight of graph convolution layer $l$ for type $v$, $\tilde{A}_v = A_v + I$ is the sum of the adjacency matrix $A_v$ of patent-graph type $v$ and the identity matrix, $\tilde{D}_v$ is the diagonal degree matrix of $\tilde{A}_v$, and $\mathrm{ReLU}()$ is the ReLU activation function. In particular, for the first graph convolution layer ($l=1$), $Z_v^{(0)} = X$, the node attribute matrix of each patent graph.
In the embodiment, by combining the coding vectors of the autoencoder with the graph information of each patent-graph type, the parallel graph convolution neural network module improves the feature aggregation capability of the model and comprehensively captures the distinctive features of the patent data.
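One such graph convolution layer can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the function name `gcn_layer`, the default `eps=0.5`, and the symmetric D^{-1/2}(A+I)D^{-1/2} normalisation (a standard GCN choice consistent with the adjacency-plus-identity and diagonal-matrix terms above) are assumptions.

```python
import numpy as np

def gcn_layer(Z_prev, H_prev, A, W, eps=0.5):
    """One graph convolution layer for one patent-graph type: first the
    weight distribution mixing the previous feature vector Z_prev with
    the coding vector H_prev of the same layer index, then propagation
    over the normalised adjacency D^{-1/2} (A + I) D^{-1/2}, a linear
    map W, and a ReLU."""
    Z_mix = (1.0 - eps) * Z_prev + eps * H_prev      # weight distribution step
    A_tilde = A + np.eye(len(A))                     # adjacency plus identity
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    A_norm = A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ Z_mix @ W, 0.0)       # graph convolution + ReLU
```

Running four such stacks in parallel, one per patent-graph type over the same coding vectors, reproduces the parallel module's structure.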
In the embodiment, the parallel single-graph self-attention module contains one single-graph self-attention layer per patent-graph type, i.e., 4 layers for the 4 types of patent graphs; in parallel, each layer computes its type's single-graph attention vectors from the corresponding feature vectors.
In an embodiment, as shown in fig. 4, each single-graph self-attention layer corresponding to a patent-graph type comprises an attention-weight calculation and an activation calculation: attention weights are first computed from each type's feature vectors, and the feature vectors are then passed through an activation weighted by those attention weights, yielding the single-graph attention vector of that type:

$$\alpha_i^v = \mathrm{Sigmoid}\!\left(W_s^v z_i^v + b_s^v\right)$$

$$s_i^v = \tanh\!\left(\alpha_i^v \odot z_i^v\right)$$

where $i$ indexes the patent data, $z_i^v$, $\alpha_i^v$, and $s_i^v$ are the feature vector, attention weight, and single-graph attention vector of patent datum $i$ in patent-graph type $v$, $W_s^v$ and $b_s^v$ are the weight and bias of the attention-weight calculation, $\tanh()$ is the hyperbolic tangent, and $\mathrm{Sigmoid}()$ is the Sigmoid activation function.
In an embodiment, each attention layer of the parallel single-graph self-attention module assigns higher weights to the important features of its own patent graph, so that the resulting single-graph attention vector focuses on the characteristic information carried by that graph type.
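Under the tanh/sigmoid gating reading of the attention-weight and activation calculations above (an assumption, since the original formula images do not survive extraction), a minimal sketch is:

```python
import numpy as np

def single_graph_attention(Z, W_att, b_att):
    """Gating self-attention for one patent graph: an attention weight is
    computed from each feature vector, and the feature vector is then
    re-weighted element-wise by that attention weight."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    a = sigmoid(np.tanh(Z @ W_att + b_att))  # attention weights in (0, 1)
    return a * Z                             # single-graph attention vectors
```

`Z` holds one feature vector per patent datum (one row each); `W_att` and `b_att` stand in for the learned weight and bias of the attention-weight calculation.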
In an embodiment, the multi-graph attention module is configured to compute a global attention vector from the single-graph attention vectors of all types. As shown in fig. 5, the multi-graph attention module comprises a nonlinear-transformation calculation, a global-attention-weight calculation, and a global-attention-vector calculation: first, a nonlinear transformation is applied to each type of single-graph attention vector to obtain a multi-layer attention value per type; next, each type's multi-layer attention value is normalized against the values of all types to obtain a per-type global attention weight; finally, the single-graph attention vectors of all types are summed, weighted by their global attention weights, to give the global attention vector of each patent datum. Expressed as formulas:

$w_{v,i} = q^{T}\tanh\big(W^{mg}\, e_{v,i} + b^{mg}\big)$

$\alpha_{v,i} = \dfrac{\exp(w_{v,i})}{\sum_{t=1}^{V}\exp(w_{t,i})}$

$g_i = \sum_{v=1}^{V} \alpha_{v,i}\, e_{v,i}$

where $q$ denotes the shared attention vector and the superscript $T$ the transpose; $W^{mg}$ and $b^{mg}$ denote, respectively, the weight and bias of the nonlinear-transformation operation; $w_{v,i}$, $\alpha_{v,i}$ and $g_i$ denote, respectively, the multi-layer attention value, the global attention weight and the global attention vector associated with the $i$-th patent datum in the $v$-th type of patent graph.
In this embodiment, the multi-graph attention module assigns higher weights to the important single-graph attention vectors, which improves the feature-extraction ability of the model and, in turn, its deep-clustering ability.
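The three operations of the multi-graph attention module can be sketched as follows; the shapes and parameter names (`W`, `b` for the nonlinear transformation, `q` for the shared attention vector) are illustrative assumptions:

```python
import numpy as np

def multi_graph_attention(E_list, W, b, q):
    """Fuse the V single-graph attention vectors of each patent datum
    into one global attention vector: nonlinear transformation, softmax
    normalization across graph types, then a weighted sum."""
    E = np.stack(E_list)                     # (V, N, d): V graphs, N data
    w = np.tanh(E @ W + b) @ q               # multi-layer attention values (V, N)
    w = w - w.max(axis=0, keepdims=True)     # numerical stability
    alpha = np.exp(w) / np.exp(w).sum(axis=0, keepdims=True)  # weights over V
    return (alpha[..., None] * E).sum(axis=0)  # global attention vectors (N, d)
```

When all graph types contribute identical vectors, the softmax weights are uniform and the output equals any single input, which is a quick sanity check.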
In an embodiment, the constructed model requires parameter optimization before being applied, including: constructing a total loss, namely constructing a reconstruction loss based on the vectorized patent data input to the self-encoder and the reconstructed patent data it outputs, constructing a multi-graph correlation loss based on the single-graph attention vectors of all types, and taking the weighted sum of the reconstruction loss and the multi-graph correlation loss as the total loss; and optimizing the model parameters with the total loss in an unsupervised-learning manner to obtain a parameter-optimized model. The total loss $Loss_{final}$ is expressed as:

$Loss_{final} = \alpha\, Loss_{Reconstruction} + \beta\, Loss_{Multi\text{-}graph}$

where $\alpha$ and $\beta$ are hyper-parameters determined by unsupervised learning.
In an embodiment, the reconstruction loss $Loss_{Reconstruction}$ is constructed from the vectorized patent data input to the self-encoder and the reconstructed patent data it outputs, specifically: the reconstruction loss is built from the square of the Euclidean norm between the vectorized patent data and the reconstructed patent data over all patent data, expressed as:

$Loss_{Reconstruction} = \dfrac{1}{2N}\sum_{i=1}^{N}\lVert x_i - \hat{x}_i \rVert_2^2 = \dfrac{1}{2N}\lVert X - \hat{X} \rVert_2^2$

where $x_i$ and $\hat{x}_i$ denote, respectively, the vectorized and reconstructed patent data of the $i$-th patent datum; $X$ and $\hat{X}$ denote, respectively, the vectorized and reconstructed patent data of all patent data; $N$ is the total number of patent data; $\lVert\cdot\rVert_2^2$ is the square of the Euclidean norm and $\lVert\cdot\rVert_2$ the Euclidean norm.
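A minimal sketch of the reconstruction loss follows; the 1/(2N) scaling is a common convention assumed here, since the exact constant is not recoverable from the extracted formula:

```python
import numpy as np

def reconstruction_loss(X, X_hat):
    """Squared-Euclidean-norm reconstruction loss between the vectorized
    patent data X and the self-encoder's reconstruction X_hat, averaged
    over the N patent data (1/(2N) scaling assumed)."""
    N = X.shape[0]
    return np.sum((X - X_hat) ** 2) / (2.0 * N)
```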
In an embodiment, the multi-graph correlation loss $Loss_{Multi\text{-}graph}$ is constructed from the single-graph attention vectors of all types, specifically: first, the autocorrelation similarity of each type of single-graph attention vectors is calculated; then the multi-graph correlation loss is built from the square of the Euclidean norm between the autocorrelation similarities of every two types of single-graph attention vectors, expressed as:

$S_v = M_{nor,v}\, M_{nor,v}^{T}$

$Loss_{Multi\text{-}graph} = \sum_{t=1}^{V}\sum_{v=t+1}^{V}\lVert S_t - S_v \rVert_2^2$

where $M_{nor,v}$ and $S_v$ denote, respectively, the normalization result and the autocorrelation similarity of the $v$-th type of single-graph attention vectors; $t$ is an index over the autocorrelation similarities of the single-graph attention vectors; $S_t$ and $S_v$ are the autocorrelation similarities of the $t$-th and $v$-th types; and $V$ is the number of types of patent graphs.
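The multi-graph correlation loss can be sketched as follows, assuming row-wise L2 normalization for the normalization result (an assumption; the exact normalization is not recoverable from the extracted formula):

```python
import numpy as np

def multi_graph_loss(E_list):
    """Multi-graph correlation loss: penalizes the squared Euclidean
    distance between the autocorrelation similarity matrices of every
    pair of graph types."""
    S = []
    for E in E_list:
        M = E / np.linalg.norm(E, axis=1, keepdims=True)  # row-normalize (assumed)
        S.append(M @ M.T)                                  # autocorrelation similarity
    loss = 0.0
    for t in range(len(S)):
        for v in range(t + 1, len(S)):                     # every pair of types
            loss += np.sum((S[t] - S[v]) ** 2)             # squared Euclidean norm
    return loss
```

The loss is zero exactly when all graph types induce the same pairwise similarity structure, which is the alignment the embodiment aims for.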
The total loss of this embodiment fuses the reconstruction loss and the multi-graph correlation loss, improving the generalization of the model on deep clustering of patent data and thereby the effectiveness of patent CPC code classification.
The model optimized with this total loss through unsupervised learning has strong generalization ability and yields a comprehensive global attention vector that enables effective and reliable classification of patent CPC codes.
In this embodiment, the parameter-optimized model processes the patent data to be clustered as follows: the encoder contained in the self-encoder performs vector coding on each vectorized patent datum to obtain coding vectors; the graph convolutional neural networks of the parallel graph convolutional neural network module extract, in parallel, the feature vectors of each type of patent graph combined with the coding vectors; the single-graph self-attention layers of the parallel single-graph self-attention module compute the single-graph attention vectors from each type of feature vector in parallel; and the multi-graph attention module computes the global attention vector of each patent datum from the single-graph attention vectors of all types.
Step 4, clustering the global attention vectors of all the patent data to obtain a clustering result.
In this embodiment, a clustering operation is performed on the global attention vector of each patent datum to obtain the clustering result; each cluster contains the global attention vectors of several patent data. Since every global attention vector comprehensively expresses the features of its patent datum, the patent data grouped into one cluster share highly similar features and can be considered to belong to the same class, i.e., to carry the same CPC code. A clustering algorithm such as k-means can be used.
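As the embodiment notes, k-means can serve as the clustering algorithm. A self-contained sketch over the global attention vectors follows; in practice a library routine such as scikit-learn's `KMeans` would typically be used instead:

```python
import numpy as np

def kmeans(G, k, iters=100, seed=0):
    """Plain k-means over the global attention vectors G (N x d);
    returns one cluster label per patent datum."""
    rng = np.random.default_rng(seed)
    centers = G[rng.choice(len(G), size=k, replace=False)]  # random init
    for _ in range(iters):
        # assign each vector to its nearest center
        labels = np.argmin(((G[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        # recompute centers (keep old center if a cluster empties)
        new = np.array([G[labels == j].mean(axis=0) if (labels == j).any()
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels
```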
Step 5, performing CPC code classification on each patent datum according to the clustering result.
In this embodiment, patent data belonging to the same cluster are considered to have the same CPC code; once the CPC code of one patent datum in a cluster has been determined manually, the CPC codes of all other patent data in that cluster follow.
In summary, the unsupervised patent clustering method based on the parallel multi-graph convolutional neural network provided by the embodiment realizes deep clustering of patents by considering multi-graph information and coding information of patent data, improves effectiveness and generalization of CPC code classification of the patents, and has a high application value to CPC code classification of the patents.
The technical solutions and advantages of the present invention have been described in detail in the foregoing detailed description, and it should be understood that the above description is only the most preferred embodiment of the present invention, and is not intended to limit the present invention, and any modifications, additions, and equivalents made within the scope of the principles of the present invention should be included in the protection scope of the present invention.

Claims (4)

1. A CPC code classification method based on a parallel multi-graph convolution neural network is characterized by comprising the following steps:
step 1, vectorizing the patent data to be clustered to obtain vectorized patent data in a 1-dimensional vector group form, wherein each patent data comprises an invention name, an abstract, an applicant and an inventor;
step 2, constructing multiple types of patent graphs from the vectorized patent data, the multiple types comprising a KNN patent graph constructed based on patent similarity, a common-applicant patent graph, a common-inventor patent graph and a common-keyword patent graph; for the KNN patent graph, the cosine similarity between every two patent data is computed over all patent data, and for each patent datum the patent data corresponding to the top k cosine similarities are selected as its neighborhood patent data and used to construct connecting edges between nodes, i.e., a connecting edge is constructed between the node of the patent datum and the node of each of its neighborhood patent data;
step 3, calculating the patent data to be clustered by using the model constructed based on unsupervised learning, and the method comprises the following steps:
(a) performing vector coding on each vectorized patent datum using the encoder contained in the self-encoder to obtain coding vectors, wherein the encoder comprises L coding layers, and the input vectorized patent data pass through the successive coding layers so that each layer outputs a coding vector;
(b) extracting, in parallel, the feature vectors of each type of patent graph combined with the coding vectors using the graph convolutional neural networks contained in the parallel graph convolutional neural network module, wherein the graph convolutional neural network corresponding to each type of patent graph contains L graph convolutional layers, equal in number to the coding layers; each graph convolutional layer first performs weight distribution between the coding vector output by the corresponding coding layer and the feature vector output by the previous graph convolutional layer, then takes the weighted feature vector as the input of the current graph convolution operation, and performs the graph convolution operation combined with the adjacency matrix of each type of patent graph to output a feature vector, expressed by the formulas:

$\tilde{Z}_v^{(l-1)} = (1-\epsilon)\, Z_v^{(l-1)} + \epsilon\, H^{(l-1)}$

$Z_v^{(l)} = \mathrm{ReLU}\big(D^{-1/2}\, \tilde{A}_v\, D^{-1/2}\, \tilde{Z}_v^{(l-1)}\, W_v^{(l)}\big)$

wherein $l$ is an index of the network layer, $v$ is an index of the type of patent graph, and $\epsilon$ is a weight balancing the importance of the coding vector and the feature vector; $H^{(l-1)}$ represents the coding vector output by the $(l-1)$-th coding layer; $Z_v^{(l-1)}$ and $Z_v^{(l)}$ respectively represent the feature vectors output by the $(l-1)$-th and $l$-th graph convolution operations for the $v$-th type of patent graph; $\tilde{Z}_v^{(l-1)}$ represents the weighted feature vector; $W_v^{(l)}$ represents the weight of the $l$-th graph convolution operation for the $v$-th type of patent graph; $\tilde{A}_v$ represents the sum of the adjacency matrix $A_v$ of the $v$-th type of patent graph and the identity matrix; $D$ represents the diagonal degree matrix of $\tilde{A}_v$; and $\mathrm{ReLU}()$ denotes the ReLU activation function; for the first graph convolutional layer, $Z_v^{(0)} = X$, the node matrix of each type of patent graph;
(c) Calculating a single-drawing attention vector according to each type of feature vector in parallel by utilizing each single-drawing self-attention layer contained in the parallel single-drawing self-attention module;
(d) Calculating a global attention vector of each patent datum according to all the class single-drawing attention vectors by using a multi-drawing attention module;
wherein the model requires parameter optimization before being applied, including: decoding the coding vectors output by the encoder with the decoder contained in the self-encoder to obtain the reconstructed patent data corresponding to each vectorized patent datum; constructing a total loss, including constructing a reconstruction loss based on the vectorized patent data input to the self-encoder and the reconstructed patent data it outputs, constructing a multi-graph correlation loss based on the single-graph attention vectors of all types, and taking the weighted sum of the reconstruction loss and the multi-graph correlation loss as the total loss; and optimizing the model parameters with the total loss in an unsupervised-learning manner to obtain the parameter-optimized model, the total loss $Loss_{final}$ being expressed as:

$Loss_{final} = \alpha\, Loss_{Reconstruction} + \beta\, Loss_{Multi\text{-}graph}$

wherein $\alpha$ and $\beta$ are hyper-parameters determined by unsupervised learning;
the reconstruction loss $Loss_{Reconstruction}$ is constructed from the square of the Euclidean norm between the vectorized patent data and the reconstructed patent data of all patent data, expressed as:

$Loss_{Reconstruction} = \dfrac{1}{2N}\sum_{i=1}^{N}\lVert x_i - \hat{x}_i \rVert_2^2 = \dfrac{1}{2N}\lVert X - \hat{X} \rVert_2^2$

wherein $x_i$ and $\hat{x}_i$ respectively represent the vectorized and reconstructed patent data of the $i$-th patent datum; $X$ and $\hat{X}$ respectively represent the vectorized and reconstructed patent data of all patent data; $N$ represents the total number of patent data; $\lVert\cdot\rVert_2^2$ represents the square of the Euclidean norm and $\lVert\cdot\rVert_2$ the Euclidean norm;
loss associated with multiple graphs Multiple diagrams Constructing according to attention vectors of all kinds of single graphs, specifically comprising: firstly, calculating the autocorrelation similarity of attention vectors of each type of single images; then, a multi-graph correlation loss is constructed according to the square of the Euclidean norm between the autocorrelation similarities of any two types of single-graph attention vectors, and is expressed by a formula as follows:
Figure FDA0004045413870000036
Figure FDA0004045413870000037
wherein, M nor,v 、S v Respectively representing the normalization result and the autocorrelation similarity of the class v single-chart attention vector relative to the single-chart attention vector, t representing the autocorrelation similarity index of the single-chart attention vector, S t 、S v Respectively representing the autocorrelation similarity of the attention vectors of the single graphs of the t-th class and the V-th class, wherein V represents the type of the patent graph;
step 4, clustering the global attention vectors of all patent data to obtain a clustering result;
step 5, performing CPC code classification on each patent datum according to the clustering result, comprising: patent data belonging to the same cluster are considered to have the same CPC code, and once the CPC code of one patent datum in a cluster is determined manually, the CPC codes of all other patent data in the cluster are obtained.
2. The CPC code classification method based on the parallel multi-graph convolutional neural network according to claim 1, wherein when constructing multiple classes of patent graphs, each patent is used as a node, vectorized patent data is used as a node attribute, and connecting edges between nodes are constructed in different ways according to different classes of patent graphs, including:
for the common-applicant patent graph, constructing connecting edges between nodes corresponding to a common applicant;

for the common-inventor patent graph, constructing connecting edges between nodes corresponding to a common inventor;

and for the common-keyword patent graph, constructing connecting edges between nodes corresponding to common keywords.
3. The CPC code classification method based on the parallel multi-graph convolutional neural network of claim 1, wherein each single-graph self-attention layer calculates a single-graph attention vector from each type of feature vector in parallel, comprising: first calculating an attention weight of the features from each type of feature vector, and then performing an activation calculation on each type of feature vector according to the attention weight to obtain the single-graph attention vector corresponding to each type of feature vector.
4. The CPC code classification method based on the parallel multi-graph convolutional neural network of claim 1, wherein the calculating of the global attention vector of each patent data according to all class single-graph attention vectors by using the multi-graph attention module comprises: firstly, carrying out nonlinear transformation on each type of single-image attention vector to obtain each type of multi-layer attention value; then, carrying out normalization processing on each type of multilayer attention value relative to all types of multilayer attention values to obtain a global attention weight of each type; and finally, carrying out weighted summation on the attention vectors of the single images of each type according to the global attention weight of each type to obtain the global attention vector of each patent data.
CN202210695144.8A 2022-06-20 2022-06-20 Unsupervised patent clustering method based on parallel multi-graph convolution neural network Active CN114781553B (en)

Publications (2)

Publication Number Publication Date
CN114781553A CN114781553A (en) 2022-07-22
CN114781553B true CN114781553B (en) 2023-04-07




Similar Documents

Publication Publication Date Title
CN112991354B (en) High-resolution remote sensing image semantic segmentation method based on deep learning
CN110689086B (en) Semi-supervised high-resolution remote sensing image scene classification method based on generating countermeasure network
Basu et al. Deepsat: a learning framework for satellite imagery
CN107526785B (en) Text classification method and device
CN109063666A (en) The lightweight face identification method and system of convolution are separated based on depth
CN109918528A (en) A kind of compact Hash code learning method based on semanteme protection
CN109960737B (en) Remote sensing image content retrieval method for semi-supervised depth confrontation self-coding Hash learning
CN108920720A (en) The large-scale image search method accelerated based on depth Hash and GPU
CN113326377B (en) Name disambiguation method and system based on enterprise association relationship
CN111667022A (en) User data processing method and device, computer equipment and storage medium
CN112765352A (en) Graph convolution neural network text classification method based on self-attention mechanism
CN108304573A (en) Target retrieval method based on convolutional neural networks and supervision core Hash
CN114896434B (en) Hash code generation method and device based on center similarity learning
CN109255381A (en) A kind of image classification method based on the sparse adaptive depth network of second order VLAD
CN113537384B (en) Hash remote sensing image retrieval method, device and medium based on channel attention
CN114373099A (en) Three-dimensional point cloud classification method based on sparse graph convolution
CN114120041A (en) Small sample classification method based on double-pair anti-variation self-encoder
CN111861756A (en) Group partner detection method based on financial transaction network and implementation device thereof
CN109902808A (en) A method of convolutional neural networks are optimized based on floating-point numerical digit Mutation Genetic Algorithms Based
Qin et al. Making deep neural networks robust to label noise: Cross-training with a novel loss function
CN114880538A (en) Attribute graph community detection method based on self-supervision
CN116977763A (en) Model training method, device, computer readable storage medium and computer equipment
Lu et al. Fine crop classification in high resolution remote sensing based on deep learning
CN114781553B (en) Unsupervised patent clustering method based on parallel multi-graph convolution neural network
CN114741473B (en) Event extraction method based on multi-task learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant