CN114781553B

CN114781553B - Unsupervised patent clustering method based on parallel multi-graph convolution neural network

Info

Publication number: CN114781553B
Application number: CN202210695144.8A
Authority: CN
Inventors: 韩蒙; 梁兵; 况欢; 陈灏毅; 陈唯; 林昶廷
Original assignee: Binjiang Research Institute Of Zhejiang University
Current assignee: Binjiang Research Institute Of Zhejiang University
Priority date: 2022-06-20
Filing date: 2022-06-20
Publication date: 2023-04-07
Anticipated expiration: 2042-06-20
Also published as: CN114781553A

Abstract

The invention discloses an unsupervised patent clustering method based on a parallel multi-graph convolution neural network, which is characterized in that on the basis of constructing 4 types of patent graphs and coding vectors of a self-encoder for patent data, 4 types of patent graphs and coding vectors are fully extracted through graph convolution operation, effective feature vectors of the patent data are comprehensively extracted, weight is distributed to each type of feature vectors through a parallel single graph self-attention module, the importance degree of important features of a single graph is improved to obtain a single graph attention vector, the single graph attention vectors of all types are fused through a multi-graph attention module for learning, a larger weight is distributed to the important single graph, the obtained global attention vector integrates multi-aspect feature information, and the clustering precision is improved.

Description

Unsupervised patent clustering method based on parallel multi-graph convolution neural network

Technical Field

The invention belongs to the technical field of patent classification, and particularly relates to an unsupervised patent clustering method based on a parallel multi-graph convolution neural network.

Background

Through the analysis of the patent data, specific market development wind vane and organization innovation strength can be obtained. People often search for patents on various intellectual property platforms using information such as Patent names, keywords, and CPC (co-Patent Classification) codes. Among them, the CPC code is an extension of IPC (International Patent Classification), and is commonly managed by EPO (European Patent Office) and the us Patent and trademark Office. It is divided into nine parts, a-H and Y, which are in turn divided into classes, subclasses, groups and subgroups, with approximately 250000 classification entries. Whichever institution participates in processing and approving the patent will determine the type of classification code used for the invention. Once the patent application is approved, the CPC code cannot be changed any more. Therefore, it is extremely important for the patent applicant to prejudge the patent CPC code in advance.

At present, the classification of patent CPC codes mostly adopts a manual method to check patent names, abstracts and texts so as to match the corresponding patent CPC codes, which is very tedious for patent examiners and easy to make mistakes.

Some scholars study the NLP (Natural Language Processing) technology, and classify patents through a word embedding system and a machine learning classification model, so that the speed and accuracy of classifying patents are improved, and the labor cost is reduced.

The prior scholars also study deep learning methods for classifying patents, which may include Convolutional Neural Networks (CNNs), graph Neural Networks (GNNs), and Graph Convolutional Neural Networks (GCNs). The graph convolution neural network introduces graph embedding to consider structural information of original patent samples, and effectively utilizes important relations between nodes by using convolution operation on the graph, so that the model achieves better cognition and patent classification capability, but the traditional graph convolution neural network only focuses on embedding of a single graph and depends heavily on the quality of the single graph, and the generalization performance of the model is insufficient. Moreover, label training samples in fine classification rarely lead to insufficient classification performance of supervised models and are not enough for realizing fine classification of CPC codes.

Patent document CN109446319a discloses a biomedical patent clustering analysis method based on K-means, which selects 4 important evaluation indexes of patent application amount, patent authorization amount, patent growth rate, and patent efficiency in patent analysis as clustering variables to perform clustering analysis, and can deeply mine the association between data and better classify patent data, but cannot classify patent CPC codes.

Disclosure of Invention

In view of the above, the invention provides an unsupervised patent clustering method based on a parallel multi-graph convolution neural network, which improves the precision of a model for finely classifying patents and improves the accuracy of patent classification under unsupervised learning.

In order to achieve the above object, an embodiment of the present invention provides an unsupervised patent clustering method based on a parallel multi-graph convolutional neural network, including the following steps:

vectorizing the patent data to be clustered to obtain vectorized patent data;

constructing multiple types of patent diagrams according to vectorized patent data, wherein the multiple types of patent diagrams comprise KNN patent diagrams, patent diagrams of the same applicant, patent diagrams of the same inventor and patent diagrams of the same keyword, which are constructed based on the similarity of patents;

the patent data to be clustered are calculated by utilizing a model constructed based on unsupervised learning, and the method comprises the following steps: carrying out vector coding on each vectorization patent data by utilizing a coder contained in a self-coder to obtain a coding vector; extracting feature vectors of each type of patent drawings combined with the coding vectors in parallel by using each image convolutional neural network contained in the parallel image convolutional neural network module; calculating a single-image attention vector according to each type of feature vector in parallel by utilizing each single-image self-attention layer contained in the parallel single-image self-attention module; calculating a global attention vector of each patent datum according to all the class single-figure attention vectors by using a multi-figure attention module;

and clustering the global attention vectors of all the patent data to obtain a clustering result.

In one embodiment, each patent data includes an invention name, a summary, an applicant, and an inventor, and these data are vectorized to obtain vectorized patent data.

In one embodiment, when constructing multiple types of patent graphs, each patent is used as a node, vectorized patent data is used as a node attribute, and connecting edges between nodes are different according to the types of the patent graphs, and the construction modes are also different, including:

for the KNN patent diagram, similarity calculation between any two patent data is carried out on all the patent data, and the patent data corresponding to k large similarities before the patent data are screened according to the similarity value to serve as neighborhood patent data to be used for constructing a connecting edge between nodes, namely the connecting edge is constructed between any two corresponding nodes of all the neighborhood patent data;

aiming at the patent drawings of the common applicant, constructing connecting edges between the nodes corresponding to the common applicant;

aiming at the patent drawings of the common inventor, constructing connecting edges among nodes corresponding to the common inventor;

and aiming at the common keyword patent graph, constructing a connecting edge between nodes corresponding to the common keywords.

In one embodiment, the encoder comprises L encoding layers, and the input vectorization patent data is subjected to vector encoding of the plurality of encoding layers to obtain an encoding vector output by each layer;

each graph convolution neural network corresponding to each type of patent graph comprises L graph convolution layers, the number of the graph convolution layers is equal to that of the coding layers, each graph convolution layer firstly carries out weight distribution on a coding vector output by the corresponding coding layer and a characteristic vector output by the last layer of graph convolution layer, then takes the characteristic vector distributed with the weight as the input of the current graph convolution operation, carries out the graph convolution operation by combining the adjacent matrix of each type of patent graph to output the characteristic vector, and is expressed by a formula as follows:

wherein the content of the first and second substances,lexpressed as an index of the number of network layer layers,van index indicating the kind of the patent drawing,

representing weights for balancing importance of code vectors and feature vectorsThe degree of the sexual function is high,

is shown asl-coding vectors output by a 1-layer coding layer,

and

respectively representvCorresponding to patent-like drawingl-1 layer and the secondlThe feature vectors output by the layer map convolution operation,

a feature vector representing the assigned weight is assigned,

is shown asvCorresponding to patent-like drawinglThe weight of the layer map convolution operation,

is shown asvAdjacency matrix of similar patent graph

And the sum of the identity matrix and the identity matrix,Drepresent

The diagonal matrix of (2), reLU () represents the ReLU activation function;

with respect to the first graph convolution layer,

and the node matrix X represents each type of patent graph.

In one embodiment, each single graph calculates a single graph attention vector from the attention layer in parallel according to each type of feature vector, and the method comprises the following steps: firstly, the attention weight of the feature is calculated according to each type of feature vector, and then the activation calculation is carried out on each type of feature vector according to the attention weight so as to obtain the single-image attention vector corresponding to each type of feature vector.

In one embodiment, the computing the global attention vector for each patent data from all class single map attention vectors using a multi-map attention module comprises: firstly, carrying out nonlinear transformation on each type of single-image attention vector to obtain each type of multi-layer attention value; then, carrying out normalization processing on each type of multilayer attention value relative to all types of multilayer attention values to obtain each type of global attention weight; and finally, carrying out weighted summation on the attention vectors of the single images of each type according to the global attention weight of each type to obtain the global attention vector of each patent data.

In one embodiment, the model requires parameter optimization before being applied, including:

decoding the coding vector output by the encoder by using a decoder contained in the self-encoder to obtain reconstructed patent data corresponding to each vectorized patent data;

constructing total loss, namely constructing reconstruction loss based on vectorization patent data input by a self-encoder and output reconstruction patent data, constructing multi-graph correlation loss based on attention vectors of all classes of single graphs, and taking weighted summation of the reconstruction loss and the multi-graph correlation loss as the total loss;

and optimizing the model parameters by using the total loss and adopting an unsupervised learning mode to obtain a model with optimized parameters.

In one embodiment, the constructing of the reconstruction loss based on the vectorized patent data input from the encoder and the output reconstructed patent data includes: and constructing reconstruction loss according to the squares of Euclidean norms between vectorized patent data and reconstructed patent data corresponding to all the patent data.

In one embodiment, constructing a multi-map correlation penalty based on all class single-map attention vectors includes: firstly, calculating the autocorrelation similarity of attention vectors of each type of single images; and then constructing the multi-graph correlation loss according to the square of the Euclidean norm between the autocorrelation similarities of the self-correlation of any two types of single graph attention vectors.

In one embodiment, the unsupervised patent clustering method further includes:

and performing CPC code classification on each patent data according to the clustering result, wherein the CPC code classification comprises the following steps: patent data belonging to the same cluster are considered to have the same CPC code, and when the CPC of one patent data in the cluster is judged manually, the CPC codes of all other patent data in the cluster can be obtained.

Compared with the prior art, the method has the beneficial effects that at least:

on the basis of constructing 4 types of patent drawings and coding vectors of patent data from a coder, 4 types of patent drawings and coding vectors are fully extracted through a drawing convolution operation, effective feature vectors of the patent data are comprehensively extracted, weights are distributed to each type of feature vectors through a parallel single-drawing self-attention module, the importance degree of important features of a single drawing is improved to obtain a single-drawing attention vector, the single-drawing attention vectors of all types are fused through a multi-drawing attention module for learning, and larger weights are distributed to the important single drawing, so that the obtained global attention vector integrates multi-aspect feature information, and the clustering precision is improved.

The model is constructed based on unsupervised learning, the generalization performance of the model to the deep clustering of the patent data is improved under the condition that the fine classification labels are true, the comprehensiveness of the model in feature extraction is improved, and the effectiveness of the patent data clustering is further improved.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.

FIG. 1 is a flowchart of an unsupervised patent clustering method based on a parallel multi-graph convolutional neural network provided by an embodiment;

FIG. 2 is a schematic diagram of a model provided by the embodiment;

FIG. 3 is a schematic view of the structure of each convolution layer provided by the embodiment;

FIG. 4 is a schematic structural diagram of each single-drawing self-attention layer provided by the embodiment;

fig. 5 is a schematic structural diagram of a multi-view attention module according to an embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the detailed description and specific examples, while indicating the scope of the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.

The problem of when patent fine classification label training sample is too few, lead to having the classification performance of supervised classification model not enough is solved, still for according to unilateral patent drawing, lead to the generalization performance of classification model not enough and then the inaccurate problem of patent classification that arouses. The embodiment provides an unsupervised patent clustering method based on a parallel multi-graph convolution neural network, which improves the precision of a model for finely classifying patents and the accuracy of patent classification under unsupervised learning.

Fig. 1 is a flowchart of an unsupervised patent clustering method based on a parallel multi-graph convolutional neural network according to an embodiment. As shown in fig. 1, the unsupervised patent clustering method based on the parallel multi-graph convolutional neural network provided in the embodiment includes the following steps:

step 1, vectorizing the patent data to be clustered to obtain vectorized patent data.

In the embodiment, each piece of patent data to be clustered corresponds to one patent document, and includes the name, abstract, applicant and inventor of the patent, vectorization is performed on the data to obtain vectorized patent data, and the specific vectorized patent data is expressed in a form of a 1-dimensional vector group.

And 2, constructing multiple types of patent diagrams according to the vectorized patent data.

In an embodiment, the multi-class patent drawings comprise KNN (K-nearest-neighbor) patent drawings, patent drawings of the same applicant, patent drawings of the same inventor and patent drawings of the same keyword, which are constructed based on the similarity of patents. When constructing multiple types of patent graphs, each patent is used as a node, vectorized patent data is used as a node attribute, connecting edges between the nodes are different according to different types of the patent graphs, and the construction mode is also different, and the method comprises the following steps:

and (3) aiming at the KNN patent graph, calculating the similarity between any two patent data of all the patent data, screening the patent data corresponding to k large similarities as neighborhood patent data according to the similarity value, and constructing a connecting edge between nodes, namely constructing a connecting edge between any two corresponding nodes of all the neighborhood patent data to form the KNN patent graph.

In one embodiment, cosine similarity between any two patent data can be calculated, and patent data corresponding to k cosine similarities before the patent data are screened as neighborhood patent data according to the cosine similarity, so as to construct a connecting edge between nodes.

In the embodiment, for the patent drawings of the common applicant, connecting edges are constructed among nodes corresponding to the common applicant to form the patent drawings of the common applicant; aiming at the co-inventor patent drawings, constructing connecting edges among nodes corresponding to the co-inventors to form the co-inventor patent drawings; and aiming at the common keyword patent diagrams, constructing connecting edges among nodes corresponding to the common keywords so as to form the common keyword patent diagrams. Wherein, the key words are extracted from the invention names and the abstract contents.

And 3, calculating the patent data to be clustered by using the model constructed based on unsupervised learning to obtain the global attention vector of each patent data.

Fig. 2 is a schematic structural diagram of a model provided in the embodiment. As shown in fig. 2, the constructed model includes a self-encoder including an encoder and a decoder, a parallel graph convolution neural network module, a parallel single-graph self-attention module, and a multi-graph attention module, wherein the encoder is configured to perform vector encoding on vectorized patent data to obtain an encoded vector; the decoder is used for decoding the coding vector to obtain reconstructed patent data; the parallel graph convolution neural network module is used for extracting the feature vectors of each type of patent graphs combined with the coding vectors in parallel; the parallel single-graph self-attention module is used for calculating single-graph attention vectors according to each type of feature vectors in parallel; the multi-map attention module is used for calculating a global attention vector of each patent datum according to all the class single-map attention vectors.

In an embodiment, the encoder includes L encoding layers, and the input vectorized patent data is subjected to vector encoding by the plurality of encoding layers to obtain an encoding vector output by each layer, and is expressed by a formula:

wherein the content of the first and second substances,lexpressed as an index of the coding layer, reLU () represents the ReLU activation function,

and

representing the weights and offsets of the coding layers,

and

respectively representl-1 layer and the secondlLayer-coding the coded vectors output by the layer, in particular whenlWhen =1, i.e. for the first layer coding layer,

the input vectorized patent data is represented, the coding layer can adopt a full connection layer, and the obtained coding vector can be used for enhancing the data representation of the patent drawing.

In an embodiment, the number of layers of the decoder is the same as that of the encoder, and the decoder includes L decoding layers, an input encoded vector is subjected to vector decoding of the plurality of decoding layers to obtain a decoded vector output by a last decoding layer as reconstruction patent data, and the reconstruction patent data is used for constructing a reconstruction loss and is expressed by a formula:

wherein the content of the first and second substances,

and

representing the weights and offsets of the decoded layers,

and

respectively represent the firstl-1 layer and the second layerlLayer decoding the decoded vector output by the layer, in particular whenlWhen =1, i.e. the layer is decoded for the first layer,

representing the input code vector.

In the embodiment, the parallel graph convolution neural network module includes graph convolution neural networks with the same number as the types of the patent drawings, that is, 4 graph convolution neural networks exist for 4 types of patent drawings, and the 4 graph convolution neural networks respectively perform feature extraction on feature vectors of 4 types of patent drawings combined with coding vectors in parallel to obtain feature vectors of 4 types of patent drawings.

In the embodiment, each graph convolutional neural network corresponding to each type of patent graph includes L graph convolutional layers, that is, the number of graph convolutional layers is equal to the number of coding layers, as shown in fig. 3, each graph convolutional layer includes a weight assignment operation and a graph convolutional operation, that is, after each graph convolutional layer first performs weight assignment on a coding vector output by a corresponding coding layer (correspondingly coded into a coding layer having the same index as that of the convolutional layer) and a feature vector output by a previous graph convolutional layer, then the feature vector to which the weight is assigned is taken as an input of the current graph convolutional operation, the graph convolutional operation is performed in combination with an adjacent matrix of each type of patent graph, so as to output the feature vector, which is expressed by a formula:

wherein the content of the first and second substances,lan index indicating the number of network layers (coding layers or picture convolution layers),vindexes representing the types of the patent drawings, namely a KNN patent drawing, a co-applicant patent drawing, a co-inventor patent drawing and a co-keyword patent drawing,

representing weights for balancing the degree of importance of the code vector and the feature vector,

and

a feature vector representing the assigned weight is assigned,

denotes the firstvCorresponding to patent-like drawinglThe weight of the layer map convolution operation,

is shown asvAdjacency matrix of similar patent drawings

And the sum of identity matrices, i.e.

，DTo represent

A diagonal matrix of (1), reLU () representing a ReLU activation function, in particular, whenlWhere =1, i.e. for the first map convolutional layer,

and the node matrix X represents each type of patent graph.

In the embodiment, the parallel graph convolution neural network module can improve the feature aggregation capability of the model by combining the coding vector of the self-coder and the graph information of each type of patent graph, and comprehensively obtain the special features of the patent data.

In the embodiment, the parallel single-graph self-attention module includes single-graph self-attention layers with the same number as the types of the patent graphs, that is, 4 single-graph self-attention layers exist for 4 types of patent graphs, and the 4 single-graph self-attention layers respectively calculate 4 types of single-graph attention vectors according to the 4 types of feature vectors in parallel.

In an embodiment, as shown in fig. 4, each single-graph self-attention layer corresponding to each type of patent graph includes an attention weight calculation operation and an activation calculation operation, that is, first, an attention weight of a feature is calculated according to each type of feature vector, and then, activation calculation is performed on each type of feature vector according to the attention weight, so as to obtain a single-graph attention vector corresponding to each type of feature vector, which is expressed by a formula:

wherein the content of the first and second substances,iandmeach of which represents an index of the patent data,

、

and

respectively representvClass I patent drawings containiFeature vectors, attention weights and single-map attention vectors corresponding to the individual patent data,

and

respectively, the weight and bias of the attention weight calculation, tan () represents tan trigonometric function, and Sigmoid () represents Sigmoid activation function.

In an embodiment, each attention layer of the parallel single-graph self-attention module can assign a higher weight to important features of a single patent graph, so that the obtained single-graph attention vector focuses more on characteristic information embodied by the category of the single-graph attention vector.

In an embodiment, the multi-map attention module is configured to compute a global attention vector based on all class single-map attention vectors. As shown in fig. 5, the multi-map attention module includes a non-linear transformation calculation operation, a global attention weight calculation operation, and a global attention vector calculation operation, that is, first, a non-linear transformation is performed on each type of single-map attention vector to obtain each type of multi-layer attention value; then, carrying out normalization processing on each type of multilayer attention value relative to all types of multilayer attention values to obtain a global attention weight of each type; and finally, carrying out weighted summation on the attention vectors of the single images of each type according to the global attention weight of each type to obtain the global attention vector of each patent data, wherein the global attention vector is expressed by a formula as follows:

wherein, the first and the second end of the pipe are connected with each other,

representing shared attention vectors, superscriptTThe transpose is represented by,

and

respectively representing the weight and bias of the nonlinear transformation computation operation,

、

and

respectively representvClass I patent drawing containsiThe patent data corresponds to a plurality of layers of attention values, a global attention weight and a global attention vector.

In the embodiment, the multi-graph attention module allocates higher weight to the important single-graph attention vector, so that the feature extraction capability of the model is improved, and the deep clustering capability is further improved.

In an embodiment, the constructed model needs to be optimized for parameters before being applied, including: constructing total loss, namely constructing reconstruction loss based on vectorization patent data input by a self-encoder and output reconstruction patent data, constructing multi-graph correlation loss based on attention vectors of all kinds of single graphs, and taking weighted summation of the reconstruction loss and the multi-graph correlation loss as the total loss; optimizing model parameters by using total loss and adopting an unsupervised learning mode to obtain a parameter-optimized model, wherein the total lossLoss _final Expressed as:

wherein the content of the first and second substances,α，βthe hyper-parameter is determined by unsupervised learning.

In an embodiment, reconstruction lossLoss _{Reconstruction} The construction of the vectorized patent data and the output reconstructed patent data based on the input from the encoder specifically comprises the following steps: vectorizing patent data and reconstructing patent data corresponding to all patent dataAnd constructing reconstruction loss by utilizing the square of the Euclidean norm among the data, and expressing the reconstruction loss by using a formula as follows:

、

respectively representiVectorized patent data and reconstructed patent data corresponding to the individual patent data,

、

respectively representing vectorized patent data and reconstructed patent data corresponding to all patent data, N representing the total amount of patent data,

representing the square of the euclidean norm,

representing the euclidean norm result.

In the examples, the loss associated with multiple graphsLoss _{Multiple diagrams} Constructing according to attention vectors of all kinds of single graphs, specifically comprising: firstly, calculating the autocorrelation similarity of attention vectors of each type of single images; then, a multi-graph correlation loss is constructed according to the square of the Euclidean norm between the autocorrelation similarities of any two types of single-graph attention vectors, and is expressed by a formula as follows:

wherein the content of the first and second substances,

、

respectively represent the firstvThe class single graph attention vector is relative to the normalization result and the autocorrelation similarity of the class single graph attention vector,tan autocorrelation similarity index representing a single graph attention vector,

the autocorrelation similarity of the attention vectors of the single graphs of the t-th class and the V-th class is respectively shown, and V represents the type of the patent graph.

The total loss provided by the embodiment is fused with the reconstruction loss and the multi-graph loss, and the generalization performance of the model on the deep clustering of the patent data is improved, so that the effectiveness of the classification of the CPC codes of the patent is improved.

The model obtained by adopting the total loss and optimizing through unsupervised learning has strong generalization capability, and can obtain a comprehensive global attention vector which can realize effective and reliable classification of the patent CPC codes.

In the embodiment, the calculation of the patent data to be clustered after the parameter optimization comprises the following processes: carrying out vector coding on each vectorization patent data by using an encoder contained in a self-encoder to obtain a coding vector; extracting feature vectors of each type of patent drawings combined with the coding vectors in parallel by using each image convolution neural network contained in the parallel image convolution neural network module; calculating a single-image attention vector according to each type of feature vector in parallel by utilizing each single-image self-attention layer contained in the parallel single-image self-attention module; a global attention vector for each patent datum is calculated from all class simplex attention vectors using a multi-graph attention module.

And 4, clustering the global attention vectors of all the patent data to obtain a clustering result.

In the embodiment, based on the global attention vector corresponding to each patent data, clustering operation is performed to obtain a clustering result, each clustering cluster comprises a plurality of global attention vectors corresponding to the patent data, and each global attention vector has a vector capable of comprehensively expressing patent data characteristics, so that clustering clusters obtained by clustering based on the global attention vectors have very same patent data characteristics, and can be considered to belong to the same class and have the same CPC code. The clustering algorithm can adopt an algorithm such as k-means clustering and the like.

And 5, performing CPC code classification on each patent data according to the clustering result.

In the embodiment, the patent data belonging to the same cluster is considered to have the same CPC code, and when the CPC of one patent data in the cluster is judged manually, the CPC codes of all other patent data in the cluster can be obtained.

In summary, the unsupervised patent clustering method based on the parallel multi-graph convolutional neural network provided by the embodiment realizes deep clustering of patents by considering multi-graph information and coding information of patent data, improves effectiveness and generalization of CPC code classification of the patents, and has a high application value to CPC code classification of the patents.

The technical solutions and advantages of the present invention have been described in detail in the foregoing detailed description, and it should be understood that the above description is only the most preferred embodiment of the present invention, and is not intended to limit the present invention, and any modifications, additions, and equivalents made within the scope of the principles of the present invention should be included in the protection scope of the present invention.

Claims

1. A CPC code classification method based on a parallel multi-graph convolution neural network is characterized by comprising the following steps:

step 1, vectorizing the patent data to be clustered to obtain vectorized patent data in a 1-dimensional vector group form, wherein each patent data comprises an invention name, an abstract, an applicant and an inventor;

step 2, constructing multiple types of patent drawings according to vectorized patent data, wherein the multiple types of patent drawings comprise KNN patent drawings constructed based on patent similarity, patent drawings of the same applicant, patent drawings of the same inventor and patent drawings of the same keyword; according to the KNN patent graph, cosine similarity between any two patent data is carried out on all the patent data, and the patent data corresponding to k cosine similarities before are screened according to the cosine similarity to serve as neighborhood patent data and are used for constructing a connecting edge between nodes, namely the connecting edge is constructed between any two nodes corresponding to all the neighborhood patent data;

step 3, calculating the patent data to be clustered by using the model constructed based on unsupervised learning, and the method comprises the following steps:

(a) Carrying out vector coding on each vectorization patent data by utilizing a coder contained in a self-coder to obtain a coding vector, wherein the coder comprises L coding layers, and the input vectorization patent data is subjected to vector coding of a plurality of coding layers to obtain the coding vector output by each layer;

(b) Extracting feature vectors of each type of patent drawings combined with coding vectors in parallel by using each graph convolutional neural network contained in a parallel graph convolutional neural network module, wherein each graph convolutional neural network corresponding to each type of patent drawings contains L graph convolutional layers, the number of the graph convolutional layers is equal to that of the coding layers, each graph convolutional layer firstly carries out weight distribution on the coding vectors output by the corresponding coding layers and the feature vectors output by the last layer of the graph convolutional layers, then the feature vectors distributed with the weights are used as the input of the current graph convolutional operation, and the graph convolutional operation is carried out by combining an adjacent matrix of each type of patent drawings to output the feature vectors, and the formula is expressed as follows:

wherein l is an index of the number of network layers, v is an index of the type of the patent drawing, epsilon is a weight for balancing the importance degree of the coding vector and the characteristic vector,

represents the coded vector that is output by the l-1 coding layer, and->

And &>

Respectively represents the characteristic vectors output by the convolution operation of the l-1 th layer and the l-th layer corresponding to the v-th type patent diagram, and>

feature vector representing an assigned weight>

Weight, A, representing the l-th level graph convolution operation corresponding to the v-th type patent graph _v Adjacency matrix A representing class v patent drawings _v And the sum of identity matrices, D represents A _v The diagonal matrix of (1), reLU () representing the ReLU activation function; for the first map convolutional layer>

A node matrix X representing each type of patent graph;

(c) Calculating a single-drawing attention vector according to each type of feature vector in parallel by utilizing each single-drawing self-attention layer contained in the parallel single-drawing self-attention module;

(d) Calculating a global attention vector of each patent datum according to all the class single-drawing attention vectors by using a multi-drawing attention module;

wherein, the model needs parameter optimization before being applied, including: decoding the coded vector output by the encoder by using a decoder contained in the self-encoder to obtain reconstructed patent data corresponding to each vectorized patent data; constructing total loss, including constructing reconstruction loss based on vectorized patent data input from the encoder and reconstructed patent data output, constructing multi-graph correlation loss based on attention vectors of all kinds of single graphs, and performing weighted solving on the reconstruction loss and the multi-graph correlation lossAnd as total losses; optimizing model parameters by using total Loss and adopting an unsupervised learning mode to obtain a parameter-optimized model, wherein the total Loss is less _final Expressed as:

Loss _final ＝αLoss _{reconstruction} +βLoss _{Multiple diagrams}

Wherein, alpha and beta are hyper-parameters and are determined by unsupervised learning;

loss of reconstruction Loss _{Reconstruction} In order to construct according to the square of the Euclidean norm between vectorized patent data and reconstructed patent data corresponding to all patent data, the formula is expressed as follows:

wherein the content of the first and second substances,

respectively representing the vectorized patent data and the reconstructed patent data corresponding to the ith patent data,

respectively representing vectorized patent data and reconstructed patent data corresponding to all patent data, wherein N represents the total amount of the patent data, and/or the judgment result is greater than>

Represents the square of the Euclidean norm, <' > is>

Representing the euclidean norm result;

loss associated with multiple graphs _{Multiple diagrams} Constructing according to attention vectors of all kinds of single graphs, specifically comprising: firstly, calculating the autocorrelation similarity of attention vectors of each type of single images; then, a multi-graph correlation loss is constructed according to the square of the Euclidean norm between the autocorrelation similarities of any two types of single-graph attention vectors, and is expressed by a formula as follows:

wherein, M _nor,v 、S _v Respectively representing the normalization result and the autocorrelation similarity of the class v single-chart attention vector relative to the single-chart attention vector, t representing the autocorrelation similarity index of the single-chart attention vector, S _t 、S _v Respectively representing the autocorrelation similarity of the attention vectors of the single graphs of the t-th class and the V-th class, wherein V represents the type of the patent graph;

step 4, clustering the global attention vectors of all patent data to obtain a clustering result;

and 5, performing CPC code classification on each patent data according to the clustering result, wherein the CPC code classification comprises the following steps: patent data belonging to the same cluster are considered to have the same CPC code, and when the CPC of one patent data in the cluster is judged manually, the CPC codes of all other patent data of the cluster can be obtained.

2. The CPC code classification method based on the parallel multi-graph convolutional neural network according to claim 1, wherein when constructing multiple classes of patent graphs, each patent is used as a node, vectorized patent data is used as a node attribute, and connecting edges between nodes are constructed in different ways according to different classes of patent graphs, including:

aiming at the patent drawings of the common applicant, constructing connecting edges among nodes corresponding to the common applicant;

aiming at the patent drawings of the common inventors, constructing connecting edges among nodes corresponding to the common inventors;

3. The CPC code classification method based on the parallel multi-graph convolutional neural network of claim 1, wherein each single graph calculates a single graph attention vector from each class of feature vectors in parallel from an attention layer, comprising: firstly, attention weight of the feature is calculated according to each type of feature vector, and then activation calculation is carried out on each type of feature vector according to the attention weight so as to obtain a single-image attention vector corresponding to each type of feature vector.

4. The CPC code classification method based on the parallel multi-graph convolutional neural network of claim 1, wherein the calculating of the global attention vector of each patent data according to all class single-graph attention vectors by using the multi-graph attention module comprises: firstly, carrying out nonlinear transformation on each type of single-image attention vector to obtain each type of multi-layer attention value; then, carrying out normalization processing on each type of multilayer attention value relative to all types of multilayer attention values to obtain a global attention weight of each type; and finally, carrying out weighted summation on the attention vectors of the single images of each type according to the global attention weight of each type to obtain the global attention vector of each patent data.