CN117828090B - Document classification method, device, equipment and storage medium


Info

Publication number
CN117828090B
Authority
CN
China
Prior art keywords
graph
coarsening
vertex
chain
Haar wavelet
Prior art date
Legal status
Active
Application number
CN202410225858.1A
Other languages
Chinese (zh)
Other versions
CN117828090A
Inventor
胡克坤
董刚
曹其春
杨宏斌
Current Assignee
Suzhou Metabrain Intelligent Technology Co Ltd
Original Assignee
Suzhou Metabrain Intelligent Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Suzhou Metabrain Intelligent Technology Co Ltd
Priority to CN202410225858.1A
Publication of CN117828090A
Application granted
Publication of CN117828090B


Abstract

The embodiments of the present application relate to the technical field of data processing, and in particular to a document classification method, device, equipment and storage medium, aiming at classifying documents accurately. The method includes the following steps: receiving a document relationship graph, an adjacency matrix, a vertex feature matrix and a vertex label matrix; coarsening the document relationship graph with a coarsening algorithm to obtain a coarsened graph chain corresponding to the document relationship graph; anti-coarsening the coarsened graph chain with an anti-coarsening algorithm to obtain an anti-coarsened graph chain; building a multi-scale graph Haar wavelet convolutional neural network from the coarsened graph chain and the anti-coarsened graph chain; training the multi-scale graph Haar wavelet convolutional neural network with the adjacency matrix, the vertex feature matrix and the vertex label matrix; and, once the multi-scale graph Haar wavelet convolutional neural network is trained, obtaining a document classification result corresponding to the document relationship graph.

Description

Document classification method, device, equipment and storage medium
Technical Field
The embodiments of the present application relate to the technical field of data processing, and in particular to a document classification method, device, equipment and storage medium.
Background
Documents must be digitized and stored in a document database, and document classification is a key prerequisite and an important basis for accurate document retrieval, document recommendation and bibliometric analysis in such a database. In the related art, the associations between documents are generally represented by graph modeling, and a graph neural network is then used to classify the documents based on these associations.
In the related art, when a graph neural network is used to classify documents, its performance is limited: the influence of noise cannot be effectively eliminated, so the documents cannot be classified well, and the classification is not accurate enough.
Disclosure of Invention
The embodiments of the present application provide a document classification method, device, equipment and storage medium, aiming at classifying documents accurately.
A first aspect of the embodiments of the present application provides a document classification method, including:
receiving a document relationship graph, an adjacency matrix corresponding to the document relationship graph, a vertex feature matrix corresponding to the document relationship graph, and a vertex label matrix corresponding to the document relationship graph;
coarsening the document relationship graph with a coarsening algorithm to obtain a coarsened graph chain corresponding to the document relationship graph;
anti-coarsening the coarsened graph chain with an anti-coarsening algorithm to obtain an anti-coarsened graph chain;
building a multi-scale graph Haar wavelet convolutional neural network from the coarsened graph chain and the anti-coarsened graph chain;
training the multi-scale graph Haar wavelet convolutional neural network with the adjacency matrix, the vertex feature matrix and the vertex label matrix;
and, once the multi-scale graph Haar wavelet convolutional neural network is trained, obtaining a document classification result corresponding to the document relationship graph.
Optionally, before receiving the document relationship graph, the adjacency matrix corresponding to the document relationship graph, the vertex feature matrix corresponding to the document relationship graph, and the vertex label matrix corresponding to the document relationship graph, the method further includes:
obtaining document data from a document database;
constructing the document relationship graph with each document in the document data as a vertex and the citation relationships between documents as edges between the vertices;
obtaining the vertex feature matrix from the features of each vertex in the document relationship graph;
obtaining the adjacency matrix from the connection relationships between the vertices in the document relationship graph;
and obtaining the vertex label matrix from the labels of the labeled vertices in the document relationship graph.
Optionally, coarsening the document relationship graph with a coarsening algorithm to obtain a coarsened graph chain corresponding to the document relationship graph includes:
initializing the document relationship graph to obtain a first-level coarsened graph;
performing module density iterative optimization on the first-level coarsened graph to obtain a second-level coarsened graph;
and generating coarsened graphs level by level; after the coarsened graph of a preset level is obtained, forming the coarsened graph chain from all the coarsened graphs, with each coarsened graph in the chain arranged in order of generation.
Optionally, initializing the document relationship graph to obtain a first-level coarsened graph includes:
taking each vertex in the document relationship graph as a community;
setting the community state of each community to unlabeled;
and adding the communities whose state is unlabeled to an unlabeled community set, thereby obtaining the first-level coarsened graph.
Optionally, performing module density iterative optimization on the first-level coarsened graph to obtain a second-level coarsened graph includes:
determining all neighbor communities of each unlabeled community in the first-level coarsened graph;
for each unlabeled community, pairing the community with each of its neighbor communities to obtain a plurality of adjacent community pairs;
determining the module density gain obtained after merging the community and the neighbor community in each adjacent community pair;
determining the adjacent community pair with the largest module density gain among the plurality of adjacent community pairs;
performing a community merging operation on the communities in the adjacent community pair if the module density gain of the pair is larger than a preset gain threshold;
setting the community state of the neighbor community in the adjacent community pair to labeled if its state is unlabeled;
deleting the neighbor community from the unlabeled community set;
performing the module density optimization iteratively, and stopping the iteration once the module density of the community division no longer increases;
taking each community in the merged first-level coarsened graph as a super vertex;
and adding connecting edges between the super vertices according to the corresponding connection relationships between them, thereby obtaining the second-level coarsened graph.
Optionally, before taking each community in the merged first-level coarsened graph as a super vertex, the method further includes:
if unlabeled communities remain in the unlabeled community set, performing the community merging operation on them according to a preset merging rule.
Optionally, the method further includes:
constructing a matching matrix between each two adjacent levels of coarsened graphs;
and, whenever a new coarsened graph is obtained, constructing the adjacency matrix corresponding to that coarsened graph from the connection relationships between its vertices.
Optionally, anti-coarsening the coarsened graph chain with an anti-coarsening algorithm to obtain an anti-coarsened graph chain includes:
initializing the coarsened graph chain;
and anti-coarsening the initialized coarsened graph chain with the anti-coarsening algorithm to obtain the anti-coarsened graph chain.
Optionally, initializing the coarsened graph chain includes:
taking the highest-level coarsened graph in the coarsened graph chain as the first-level anti-coarsened graph in the anti-coarsened graph chain;
taking the anti-coarsened graph adjacent to the first-level anti-coarsened graph as the second-level anti-coarsened graph;
taking each vertex in the first-level anti-coarsened graph as a parent vertex;
taking each vertex in the second-level anti-coarsened graph as a child vertex;
and setting the state of each child vertex to unlabeled.
Optionally, anti-coarsening the initialized coarsened graph chain with the anti-coarsening algorithm to obtain the anti-coarsened graph chain includes:
starting from the first-level anti-coarsened graph, determining the child vertices in the second-level anti-coarsened graph corresponding to each parent vertex in the first-level anti-coarsened graph;
determining the migration cost of moving each child vertex from its parent vertex to any other parent vertex, where the migration destinations of a child vertex do not include the parent vertex where it originally resides;
determining, from the migration costs, the exchange cost of any pair of unlabeled child vertices in the second-level anti-coarsened graph that have different parent vertices;
sorting the exchange costs of all such child vertex pairs to obtain an exchange cost ordering result;
determining, from the ordering result, the child vertex pair with the largest positive benefit;
exchanging the parent vertices of that child vertex pair in the first-level anti-coarsened graph;
setting the state of each child vertex in the pair to labeled;
finishing the anti-coarsening of the second-level anti-coarsened graph once the states of all child vertices are labeled;
and generating anti-coarsened graphs iteratively; after the anti-coarsened graph of a preset level is obtained, forming the anti-coarsened graph chain from all the anti-coarsened graphs obtained, with each anti-coarsened graph in the chain arranged in order of generation.
Optionally, when the states of all the child vertices are labeled, the method further includes:
generating a matching matrix between the first-level anti-coarsened graph and the second-level anti-coarsened graph from the exchange records of the child vertex pairs.
Optionally, building a multi-scale graph Haar wavelet convolutional neural network from the coarsened graph chain and the anti-coarsened graph chain includes:
processing the coarsened graph chain and the anti-coarsened graph chain according to a preset graph Haar wavelet transform matrix construction rule to obtain a graph Haar wavelet transform matrix for each coarsened graph and for each anti-coarsened graph;
and building the multi-scale graph Haar wavelet convolutional neural network from the graph Haar wavelet transform matrices of the coarsened graphs and of the anti-coarsened graphs.
Optionally, building the multi-scale graph Haar wavelet convolutional neural network from the graph Haar wavelet transform matrices of the coarsened graphs and of the anti-coarsened graphs includes:
constructing an input layer of the multi-scale graph Haar wavelet convolutional neural network;
constructing, from the graph Haar wavelet transform matrices of the coarsened graphs, a plurality of graph Haar wavelet convolution layers corresponding to the coarsened graph chain;
constructing, from the graph Haar wavelet transform matrices of the anti-coarsened graphs, a plurality of graph Haar wavelet convolution layers corresponding to the anti-coarsened graph chain;
and constructing an output layer of the multi-scale graph Haar wavelet convolutional neural network.
Optionally, when constructing the plurality of graph Haar wavelet convolution layers corresponding to the coarsened graph chain, the method further includes:
performing a graph Haar wavelet convolution operation on each coarsened graph in the coarsened graph chain through the graph Haar wavelet convolution layers;
and performing an average pooling operation on each coarsened graph in the coarsened graph chain through the graph Haar wavelet convolution layers.
Optionally, when constructing the plurality of graph Haar wavelet convolution layers corresponding to the anti-coarsened graph chain, the method further includes:
performing an anti-pooling operation on each anti-coarsened graph in the anti-coarsened graph chain through the graph Haar wavelet convolution layers;
and performing a graph Haar wavelet convolution operation on each anti-coarsened graph in the anti-coarsened graph chain through the graph Haar wavelet convolution layers.
Optionally, training the multi-scale graph Haar wavelet convolutional neural network with the adjacency matrix, the vertex feature matrix and the vertex label matrix includes:
inputting the adjacency matrix, the vertex feature matrix and the vertex label matrix into the multi-scale graph Haar wavelet convolutional neural network;
adjusting the parameters of the multi-scale graph Haar wavelet convolutional neural network according to a preset loss function;
and stopping training and outputting a predicted vertex label matrix when the prediction error of the multi-scale graph Haar wavelet convolutional neural network meets a preset error threshold.
Optionally, before inputting the adjacency matrix, the vertex feature matrix and the vertex label matrix into the multi-scale graph Haar wavelet convolutional neural network, the method further includes:
setting the loss function based on cross entropy;
and initializing the network parameters of each layer of the multi-scale graph Haar wavelet convolutional neural network.
A second aspect of the embodiments of the present application provides a document classification device, the device including:
a data receiving module, configured to receive a document relationship graph, an adjacency matrix corresponding to the document relationship graph, a vertex feature matrix corresponding to the document relationship graph, and a vertex label matrix corresponding to the document relationship graph;
a coarsening processing module, configured to coarsen the document relationship graph with a coarsening algorithm to obtain a coarsened graph chain corresponding to the document relationship graph;
an anti-coarsening processing module, configured to anti-coarsen the coarsened graph chain with an anti-coarsening algorithm to obtain an anti-coarsened graph chain;
a neural network building module, configured to build a multi-scale graph Haar wavelet convolutional neural network from the coarsened graph chain and the anti-coarsened graph chain;
a neural network training module, configured to train the multi-scale graph Haar wavelet convolutional neural network with the adjacency matrix, the vertex feature matrix and the vertex label matrix;
and a document classification result obtaining module, configured to obtain, once the multi-scale graph Haar wavelet convolutional neural network is trained, a document classification result corresponding to the document relationship graph.
Optionally, the device further includes:
a document data obtaining module, configured to obtain document data from a document database;
a document relationship graph construction module, configured to construct the document relationship graph with each document in the document data as a vertex and the citation relationships between documents as edges between the vertices;
a vertex feature matrix construction module, configured to obtain the vertex feature matrix from the features of each vertex in the document relationship graph;
an adjacency matrix construction module, configured to obtain the adjacency matrix from the connection relationships between the vertices in the document relationship graph;
and a vertex label matrix construction module, configured to obtain the vertex label matrix from the labels of the labeled vertices in the document relationship graph.
Optionally, the coarsening processing module includes:
a first initialization submodule, configured to initialize the document relationship graph to obtain a first-level coarsened graph;
a module density iterative optimization submodule, configured to perform module density iterative optimization on the first-level coarsened graph to obtain a second-level coarsened graph;
and a coarsened graph chain obtaining submodule, configured to generate coarsened graphs level by level and, after the coarsened graph of a preset level is obtained, form the coarsened graph chain from all the coarsened graphs obtained, with each coarsened graph in the chain arranged in order of generation.
Optionally, the first initialization submodule includes:
a community determination submodule, configured to take each vertex in the document relationship graph as a community;
a first community state setting submodule, configured to set the community state of each community to unlabeled;
and a community adding submodule, configured to add the communities whose state is unlabeled to an unlabeled community set, thereby obtaining the first-level coarsened graph.
Optionally, the module density iterative optimization submodule includes:
a neighbor community determination submodule, configured to determine all neighbor communities of each unlabeled community in the first-level coarsened graph;
an adjacent community pair generation submodule, configured to pair each unlabeled community with each of its neighbor communities to obtain a plurality of adjacent community pairs;
a module density gain determination submodule, configured to determine the module density gain obtained after merging the community and the neighbor community in each adjacent community pair;
an adjacent community pair determination submodule, configured to determine the adjacent community pair with the largest module density gain among the plurality of adjacent community pairs;
a community merging submodule, configured to perform a community merging operation on the communities in the adjacent community pair if the module density gain of the pair is larger than a preset gain threshold;
a second community state setting submodule, configured to set the community state of the neighbor community in the adjacent community pair to labeled if its state is unlabeled;
a community deletion submodule, configured to delete the neighbor community from the unlabeled community set;
an iterative optimization submodule, configured to perform the module density optimization iteratively and stop the iteration once the module density of the community division no longer increases;
a super vertex determination submodule, configured to take each community in the merged first-level coarsened graph as a super vertex;
and a connecting edge adding submodule, configured to add connecting edges between the super vertices according to the corresponding connection relationships between them, thereby obtaining the second-level coarsened graph.
Optionally, the device further includes:
a second community merging submodule, configured to, if unlabeled communities remain in the unlabeled community set, perform the community merging operation on them according to a preset merging rule.
Optionally, the device further includes:
a matching matrix construction submodule, configured to construct a matching matrix between each two adjacent levels of coarsened graphs;
and an adjacency matrix construction submodule, configured to, whenever a new coarsened graph is obtained, construct the adjacency matrix corresponding to that coarsened graph from the connection relationships between its vertices.
Optionally, the anti-coarsening processing module includes:
a second initialization submodule, configured to initialize the coarsened graph chain;
and an anti-coarsened graph chain obtaining submodule, configured to anti-coarsen the initialized coarsened graph chain with the anti-coarsening algorithm to obtain the anti-coarsened graph chain.
Optionally, the second initialization submodule includes:
a first-level anti-coarsened graph determination submodule, configured to take the highest-level coarsened graph in the coarsened graph chain as the first-level anti-coarsened graph in the anti-coarsened graph chain;
a second-level anti-coarsened graph determination submodule, configured to take the anti-coarsened graph adjacent to the first-level anti-coarsened graph as the second-level anti-coarsened graph;
a parent vertex determination submodule, configured to take each vertex in the first-level anti-coarsened graph as a parent vertex;
a child vertex determination submodule, configured to take each vertex in the second-level anti-coarsened graph as a child vertex;
and a first vertex state setting submodule, configured to set the state of each child vertex to unlabeled.
Optionally, the anti-coarsened graph chain obtaining submodule includes:
a corresponding vertex determination submodule, configured to determine, starting from the first-level anti-coarsened graph, the child vertices in the second-level anti-coarsened graph corresponding to each parent vertex in the first-level anti-coarsened graph;
a migration cost determination submodule, configured to determine the migration cost of moving each child vertex from its parent vertex to any other parent vertex, where the migration destinations of a child vertex do not include the parent vertex where it originally resides;
an exchange cost determination submodule, configured to determine, from the migration costs, the exchange cost of any pair of unlabeled child vertices in the second-level anti-coarsened graph that have different parent vertices;
an exchange cost ordering submodule, configured to sort the exchange costs of all such child vertex pairs to obtain an exchange cost ordering result;
a largest positive benefit determination submodule, configured to determine, from the ordering result, the child vertex pair with the largest positive benefit;
a vertex exchange submodule, configured to exchange the parent vertices of that child vertex pair in the first-level anti-coarsened graph;
a second vertex state setting submodule, configured to set the state of each child vertex in the pair to labeled;
an anti-coarsening completion submodule, configured to finish the anti-coarsening of the second-level anti-coarsened graph once the states of all child vertices are labeled;
and an anti-coarsened graph chain completion submodule, configured to generate anti-coarsened graphs iteratively and, after the anti-coarsened graph of a preset level is obtained, form the anti-coarsened graph chain from all the anti-coarsened graphs obtained, with each anti-coarsened graph in the chain arranged in order of generation.
Optionally, the device further includes:
a second matching matrix construction submodule, configured to generate a matching matrix between the first-level anti-coarsened graph and the second-level anti-coarsened graph from the exchange records of the child vertex pairs.
Optionally, the neural network building module includes:
a graph Haar wavelet transform matrix construction submodule, configured to process the coarsened graph chain and the anti-coarsened graph chain according to a preset graph Haar wavelet transform matrix construction rule to obtain a graph Haar wavelet transform matrix for each coarsened graph and for each anti-coarsened graph;
and a neural network construction submodule, configured to build the multi-scale graph Haar wavelet convolutional neural network from the graph Haar wavelet transform matrices of the coarsened graphs and of the anti-coarsened graphs.
Optionally, the neural network construction submodule includes:
an input layer construction submodule, configured to construct an input layer of the multi-scale graph Haar wavelet convolutional neural network;
a first graph Haar wavelet convolution layer construction submodule, configured to construct, from the graph Haar wavelet transform matrices of the coarsened graphs, a plurality of graph Haar wavelet convolution layers corresponding to the coarsened graph chain;
a second graph Haar wavelet convolution layer construction submodule, configured to construct, from the graph Haar wavelet transform matrices of the anti-coarsened graphs, a plurality of graph Haar wavelet convolution layers corresponding to the anti-coarsened graph chain;
and an output layer construction submodule, configured to construct an output layer of the multi-scale graph Haar wavelet convolutional neural network.
Optionally, the device further includes:
a first convolution operation submodule, configured to perform a graph Haar wavelet convolution operation on each coarsened graph in the coarsened graph chain through the graph Haar wavelet convolution layers;
and a first pooling operation submodule, configured to perform an average pooling operation on each coarsened graph in the coarsened graph chain through the graph Haar wavelet convolution layers.
Optionally, the device further includes:
a second pooling operation submodule, configured to perform an anti-pooling operation on each anti-coarsened graph in the anti-coarsened graph chain through the graph Haar wavelet convolution layers;
and a second convolution operation submodule, configured to perform a graph Haar wavelet convolution operation on each anti-coarsened graph in the anti-coarsened graph chain through the graph Haar wavelet convolution layers.
Optionally, the neural network training module includes:
a data input submodule, configured to input the adjacency matrix, the vertex feature matrix and the vertex label matrix into the multi-scale graph Haar wavelet convolutional neural network;
a parameter adjustment submodule, configured to adjust the parameters of the multi-scale graph Haar wavelet convolutional neural network according to a preset loss function;
and a vertex label matrix output submodule, configured to stop training and output a predicted vertex label matrix when the prediction error of the multi-scale graph Haar wavelet convolutional neural network meets a preset error threshold.
Optionally, the device further includes:
a loss function determination submodule, configured to set the loss function based on cross entropy;
and a parameter initialization submodule, configured to initialize the network parameters of each layer of the multi-scale graph Haar wavelet convolutional neural network.
A third aspect of the embodiments of the present application provides a readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the method according to the first aspect of the present application.
A fourth aspect of the embodiments of the present application provides an electronic device, including a memory, a processor and a computer program stored on the memory and executable on the processor, where the processor implements the steps of the method according to the first aspect of the present application when executing the computer program.
With the document classification method provided by the present application, a document relationship graph, an adjacency matrix corresponding to the document relationship graph, a vertex feature matrix corresponding to the document relationship graph and a vertex label matrix corresponding to the document relationship graph are received; the document relationship graph is coarsened with a coarsening algorithm to obtain a coarsened graph chain corresponding to the document relationship graph; the coarsened graph chain is anti-coarsened with an anti-coarsening algorithm to obtain an anti-coarsened graph chain; a multi-scale graph Haar wavelet convolutional neural network is built from the coarsened graph chain and the anti-coarsened graph chain; the multi-scale graph Haar wavelet convolutional neural network is trained with the adjacency matrix, the vertex feature matrix and the vertex label matrix; and, once the multi-scale graph Haar wavelet convolutional neural network is trained, a document classification result corresponding to the document relationship graph is obtained.
In the present application, the document relationship graph is coarsened to obtain a coarsened graph chain, and the coarsened graph chain is anti-coarsened to obtain an anti-coarsened graph chain. Graph Haar wavelet transform matrices can be constructed efficiently from the coarsened and anti-coarsened graph chains, avoiding computationally expensive matrix eigendecomposition. By capturing vertex features of different granularities and graph topology information at the different levels of the coarsened and anti-coarsened graph chains, the constructed multi-scale graph Haar wavelet convolutional neural network effectively expands the receptive field of the graph Haar wavelet convolution kernel, reduces the influence of noise, avoids the over-smoothing problem, enhances the embedded representation of the vertices, and greatly improves the accuracy of document classification. Moreover, since the graph Haar wavelet transform matrix is sparse, the efficiency of network training and inference can be greatly improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained from them by a person skilled in the art without inventive effort.
FIG. 1 is a flowchart of a document classification method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a coarsened graph chain and an anti-coarsened graph chain according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a multi-scale graph Haar wavelet convolutional neural network according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a document classification system according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a document classification device according to an embodiment of the present application;
FIG. 6 is a schematic diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following clearly and completely describes the technical solutions in the embodiments of the present application with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without inventive effort shall fall within the protection scope of the present application.
Referring to FIG. 1, FIG. 1 is a flowchart of a document classification method according to an embodiment of the present application. As shown in FIG. 1, the method includes the following steps:
S11: receive a document relationship graph, an adjacency matrix corresponding to the document relationship graph, a vertex feature matrix corresponding to the document relationship graph, and a vertex label matrix corresponding to the document relationship graph.
In this embodiment, the document relationship graph is a graph structure whose points are documents and whose edges are the relationships between documents; it can represent the association relationships among all documents in the document database. The adjacency matrix corresponding to the document relationship graph represents the edges of the graph; the vertex feature matrix represents the features of each vertex; and the vertex label matrix represents the labels on the vertices of the graph.
In this embodiment, the document relationship graph, the adjacency matrix corresponding to it, the vertex feature matrix corresponding to it, and the vertex label matrix corresponding to it are received first.
Illustratively, for documents obtained from a paper database, each vertex of the document relationship graph is a paper, and the edges of the document relationship graph represent the citation relationships between the papers.
In another embodiment of the present application, before receiving the document relationship graph, the adjacency matrix corresponding to the document relationship graph, the vertex feature matrix corresponding to the document relationship graph, and the vertex label matrix corresponding to the document relationship graph, the method further includes:
S21: obtain document data from a document database.
In this embodiment, the document database stores document data and can also serve document retrieval.
In this embodiment, before the document relationship graph is constructed, document data is first obtained from the document database.
For example, the document database may be any database on the network, or a database in a local area network.
S22: construct the document relationship graph with each document in the document data as a vertex and the citation relationships between documents as edges between the vertices.
In this embodiment, after the document data is collected, each document in the document data is taken as a vertex of the document relationship graph and the citation relationships between documents are taken as edges between the vertices, so as to construct a document relationship graph that characterizes the citation relationships among all documents in the document database.
Illustratively, the graph is denoted $G=(V,E)$, where $V$ is the vertex set, each vertex $v$ representing a document, and $E$ is the set of citation relationships between documents. Such citations are usually directional, i.e. a connecting edge $(v_i, v_j)$ is a directed edge indicating that document $v_i$ cites document $v_j$. Considering that the direction of an edge has little influence on the document classification result, this embodiment treats directed edges as undirected, i.e. $(v_i, v_j)=(v_j, v_i)$ merely records that documents $v_i$ and $v_j$ have a citation relationship. For each vertex $v$, its neighbor vertex set is denoted $N(v)$, and the number of neighbors is called the degree of the vertex, denoted $d(v)=|N(v)|$.
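To make the construction concrete, here is a minimal Python sketch of the undirected document relationship graph described above; the paper identifiers and citation pairs are hypothetical placeholders, not data from the patent.

```python
# Hypothetical toy data: four papers and three raw directed citations.
papers = ["p0", "p1", "p2", "p3"]                        # vertex set V
citations = [("p0", "p1"), ("p1", "p2"), ("p0", "p3")]   # directed citation pairs

# Directed citation edges are treated as undirected connecting edges,
# as the embodiment above prescribes.
neighbors = {p: set() for p in papers}
for u, v in citations:
    neighbors[u].add(v)
    neighbors[v].add(u)

# d(v) = |N(v)|: the degree of each vertex.
degree = {p: len(neighbors[p]) for p in papers}
```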
S23: obtain the vertex feature matrix from the features of each vertex in the document relationship graph.
In this embodiment, each document in the document relationship graph has corresponding features, i.e. the features of its vertex, which can be extracted from the document by deep learning.
In this embodiment, the vertex feature matrix is obtained from the features of each vertex in the document relationship graph.
Illustratively, each document $v$ has $d$ attributes, which can be extracted from the title, abstract, keywords and body of the document by deep learning techniques such as BERT. The feature vector $x_v \in \mathbb{R}^d$ of each document is extracted with a deep neural network such as BERT, and the feature vectors of all $n$ documents form the feature matrix $X \in \mathbb{R}^{n \times d}$ corresponding to the documents; each column vector $X_{:,j}$ constitutes one signal on the graph $G$, representing one attribute value over all documents.
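As an illustration only, such a feature extractor could be sketched with the Hugging Face transformers library; the checkpoint name, the use of the [CLS] embedding, and the input strings are all assumptions, since the text only says features are extracted by deep learning techniques such as BERT.

```python
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel

# Assumption: a generic pretrained BERT; the patent does not name a checkpoint.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def doc_feature(text: str) -> np.ndarray:
    """Embed one document's title/abstract text as its feature vector x_v."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        out = model(**inputs)
    # Assumption: take the [CLS] token embedding as the document feature.
    return out.last_hidden_state[0, 0].numpy()

# Stack the feature vectors of all documents row-wise into the n x d matrix X.
texts = ["title and abstract of p0", "title and abstract of p1"]  # placeholders
X = np.stack([doc_feature(t) for t in texts])
```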
S24: obtain the adjacency matrix from the connection relationships between the vertices in the document relationship graph.
In this embodiment, the adjacency matrix is obtained from the connection relationships between the vertices in the document relationship graph.
By way of example only, the graph $G$ is represented by an $n \times n$ adjacency matrix $A$, whose element $A_{ij}=1$ indicates that documents $v_i$ and $v_j$ have a citation relationship, and $A_{ij}=0$ indicates that they do not.
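Continuing the toy sketch above, the adjacency matrix A can be assembled with NumPy as follows.

```python
import numpy as np

idx = {p: i for i, p in enumerate(papers)}   # vertex -> matrix index
n = len(papers)
A = np.zeros((n, n), dtype=np.int8)
for u, v in citations:
    A[idx[u], idx[v]] = 1   # A_ij = 1: documents v_i and v_j share a citation link
    A[idx[v], idx[u]] = 1   # symmetric, since the edges are undirected
```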
S25: obtain the vertex label matrix from the labels of the labeled vertices in the document relationship graph.
In this embodiment, some of the vertices in the document relationship graph are labeled, and the labels give the exact classification of the corresponding documents.
In this embodiment, the vertex label matrix is obtained from the labels of the labeled vertices in the document relationship graph.
Illustratively, some documents have category labels $y_v \in Y$, where $Y$ denotes the set of document category labels. The sets of vertices with and without category labels are denoted $V_L$ and $V_U$ respectively, and satisfy: (1) $V_L \cup V_U = V$; (2) $V_L \cap V_U = \emptyset$.
The problem of graph-based document classification is then: given a citation network $G$ representing the citation relationships between documents, the document attribute matrix $X$, and the known category labels $y_v$ ($v \in V_L$) of a few documents, infer the category label $y_u$ of each unclassified document $u \in V_U$. That is, supervised learning on the category-labeled documents in the graph is combined with unsupervised learning on the unlabeled documents to obtain a document classifier $f: V \rightarrow Y$.
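A small sketch of this semi-supervised setup follows; the class names and which papers are labeled are hypothetical.

```python
classes = ["ML", "DB", "IR"]                 # the label set Y
labeled = {"p0": "ML", "p2": "DB"}           # V_L: the few documents with labels

Y = np.zeros((n, len(classes)), dtype=np.float32)  # one-hot vertex label matrix
train_mask = np.zeros(n, dtype=bool)               # True on rows belonging to V_L
for p, c in labeled.items():
    Y[idx[p], classes.index(c)] = 1.0
    train_mask[idx[p]] = True
# The classifier f must predict a label for every vertex in V_U = V \ V_L.
```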
S12: coarsen the document relationship graph with a coarsening algorithm to obtain a coarsened graph chain corresponding to the document relationship graph.
In this embodiment, the coarsening algorithm is a Bottom-up Community Detection algorithm based on Modularity density Optimization (BCD-MO). Its core idea is to cluster the community division of the generated graph from the bottom up until the module density of the community division of the whole graph no longer increases. The coarsened graph chain is a graph chain in which the coarsened graphs obtained by coarsening the document relationship graph are arranged in order; starting from the first-level coarsened graph, coarsening a lower-level coarsened graph yields the next, higher-level coarsened graph.
In this embodiment, coarsening the document relationship graph with the coarsening algorithm to obtain the coarsened graph chain corresponding to the document relationship graph specifically includes the following steps:
S12-1: initialize the document relationship graph to obtain a first-level coarsened graph.
In this embodiment, the first-level coarsened graph is the initial graph obtained by initializing the document relationship graph.
In this embodiment, initializing the document relationship graph to obtain the first-level coarsened graph includes the following specific steps:
S12-1-1: take each vertex in the document relationship graph as a community.
In this embodiment, the community is the unit of computation when the document relationship graph is coarsened.
In this embodiment, each vertex in the document relationship graph is taken as a community.
S12-1-2: set the community state of each community to unlabeled.
In this embodiment, the community state of each community in the document relationship graph is set to unlabeled for the subsequent computation.
S12-1-3: add the communities whose state is unlabeled to an unlabeled community set, thereby obtaining the first-level coarsened graph.
In this embodiment, the communities whose state is unlabeled are added to the unlabeled community set, which completes the initialization of the document relationship graph and yields the first-level coarsened graph.
Illustratively, at the initial time, the $0$-level coarsened graph is set to $G^{(0)}=G$, with adjacency matrix $A^{(0)}=A$. Each vertex $v_i$ of $G^{(0)}$ is taken on its own as a community $c_i=\{v_i\}$. The division formed by these communities is denoted $P=\{c_1, c_2, \dots, c_n\}$; obviously $|P|=n$. The state of each community is set to unlabeled, and each community is added to the unlabeled community set $U$.
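The initialization of steps S12-1-1 to S12-1-3 amounts to a few lines; this sketch indexes vertices by integer for simplicity.

```python
# Every vertex of the 0-level coarsened graph starts as its own community.
communities = {i: {i} for i in range(n)}       # c_i = {v_i}
state = {i: "unlabeled" for i in communities}  # all community states unlabeled
U = set(communities)                           # the unlabeled community set
```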
S12-2: perform module density iterative optimization on the first-level coarsened graph to obtain a second-level coarsened graph.
In this embodiment, the second-level coarsened graph is the coarsened graph obtained by performing module density iterative optimization on the first-level coarsened graph.
In this embodiment, performing module density iterative optimization on the first-level coarsened graph to obtain the second-level coarsened graph specifically includes the following steps:
S12-2-1: determine all neighbor communities of each unlabeled community in the first-level coarsened graph.
In this embodiment, when a connection relationship exists between one community and another, the two communities are neighbor communities.
In this embodiment, all neighbor communities of each unlabeled community in the first-level coarsened graph are determined from the connection relationships between the communities of the first-level coarsened graph.
Illustratively, for each unlabeled community $c$ in the community division $P$ of $G^{(0)}$, the set of its neighbor communities is denoted $N(c)$.
S12-2-2: for each unlabeled community, pair the community with each of its neighbor communities to obtain a plurality of adjacent community pairs.
In this embodiment, an adjacent community pair consists of an unlabeled community and one of its neighbor communities. Each unlabeled community has several neighbor communities and can therefore form several adjacent community pairs.
S12-2-3: determine the module density gain obtained after merging the community and the neighbor community in each adjacent community pair.
In this embodiment, the module density gain is the degree to which the module density increases; merging a community with one of its neighbor communities produces a corresponding module density gain.
Illustratively, let $P=\{c_1, c_2, \dots, c_k\}$ be a set of communities of $G^{(l)}$, where each community $c_i$ is a subset of the vertex set, i.e. $c_i \subseteq V^{(l)}$. $P$ is called a community division of $G^{(l)}$ if and only if every vertex belongs to exactly one community, i.e. $\bigcup_i c_i = V^{(l)}$ and $c_i \cap c_j = \emptyset$ for all $i \neq j$. To evaluate a community division, this embodiment introduces the module density $Q(P)$:

$$Q(P)=\sum_{c\in P}\frac{2\,|E_c|-|\partial c|}{|c|}\;e^{-|c|}\qquad(1)$$

where $E_c$ denotes the set of edges internal to community $c$ and $\partial c$ the set of edges crossing its boundary; the ratio $\frac{2|E_c|-|\partial c|}{|c|}$ measures how "tight inside, loose outside" community $c$ is. The larger the value, the higher the cohesion of all communities and the lower the coupling between communities. To avoid generating oversized communities and to keep all communities of similar scale, the penalty term $e^{-|c|}$ penalizes communities with too many vertices; here $e^{(\cdot)}$ denotes the exponential function with the natural constant $e$ as base.

Suppose communities $c_i$ and $c_j$ of $P$ are merged into $c_{ij}=c_i \cup c_j$ while the remaining communities stay unchanged, and denote the resulting new community division $P'$. The resulting module density gain $\Delta Q$ can be calculated by

$$\Delta Q = Q(P') - Q(P)\qquad(2)$$

Equation (2) is used to calculate the module density gain $\Delta Q$ obtained by merging each unlabeled community $c_i$ with any one of its neighbor communities $c_j$.
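A sketch of Eqs. (1) and (2) follows, under the reconstructed form of the module density given above; the penalty term is an assumption rather than the patent's exact expression. Here P is a list of vertex-index sets and A the adjacency matrix of the current coarsened graph.

```python
import numpy as np

def module_density(P, A):
    """Module density Q(P) of Eq. (1), with penalty e^{-|c|} (assumed form)."""
    Q = 0.0
    for c in P:
        members = sorted(c)
        e_in = A[np.ix_(members, members)].sum() / 2   # |E_c|: internal edges
        deg = A[members].sum()                         # total degree of members
        e_out = deg - 2 * e_in                         # boundary (cut) edges
        Q += (2 * e_in - e_out) / len(c) * np.exp(-len(c))
    return Q

def merge_gain(P, A, ci, cj):
    """Module density gain Delta Q of Eq. (2) for merging communities ci and cj."""
    merged = [c for c in P if c is not ci and c is not cj] + [ci | cj]
    return module_density(merged, A) - module_density(P, A)
```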
S12-2-4: determine the adjacent community pair with the largest module density gain among the plurality of adjacent community pairs.
In this embodiment, each unlabeled community forms several adjacent community pairs, and the module density gain over the whole coarsened graph is calculated after merging the unlabeled community and the neighbor community of each pair.
In this embodiment, after the module density gains of the adjacent community pairs of each unlabeled community are obtained, the gains are sorted in descending order, the pair with the largest gain is determined, and the remaining pairs are discarded; only the adjacent community pair with the largest module density gain is kept.
S12-2-5: perform a community merging operation on the communities in the adjacent community pair if the module density gain of the pair is larger than a preset gain threshold.
In this embodiment, the preset gain threshold is a preset minimum value of the module density gain.
In this embodiment, when the module density gain of the retained adjacent community pair is larger than the preset gain threshold, the two communities in the pair are merged.
Illustratively, the community merging operation is performed on the communities $c_i$ and $c_j$ whose maximum module density gain $\Delta Q$ exceeds the preset gain threshold.
S12-2-6: set the community state of the neighbor community in the adjacent community pair to labeled if its state is unlabeled.
In this embodiment, after the merging operation is performed, if the community state of the neighbor community in the adjacent community pair is unlabeled, the neighbor community is marked by setting its state to labeled.
S12-2-7: delete the neighbor community from the unlabeled community set.
In this embodiment, once the neighbor community in the adjacent community pair is in the labeled state, it is deleted from the unlabeled community set.
Illustratively, the community is placed in the labeled state and deleted from $U$, i.e. $U = U \setminus \{c_j\}$.
S12-2-8: perform the module density optimization iteratively, and stop the iteration once the module density of the community division no longer increases.
In this embodiment, the module density optimization is performed iteratively, that is, steps S12-2-1 to S12-2-7 are repeated until the module density of the community division no longer increases, i.e. the communities in the first-level coarsened graph can no longer be merged, at which point the iteration stops.
S12-2-9: take each community in the merged first-level coarsened graph as a super vertex.
In this embodiment, each community in the first-level coarsened graph after community merging is taken as a super vertex.
S12-2-10: add connecting edges between the super vertices according to the corresponding connection relationships, thereby obtaining the second-level coarsened graph.
In this embodiment, connecting edges are added between the super vertices according to the corresponding connection relationships between them, yielding the second-level coarsened graph.
Illustratively, when constructing the next-level coarsened graph, i.e. the second-level coarsened graph $G^{(1)}$, each community merged in the previous step on $G^{(0)}$ is taken as a super vertex, and the set of all super vertices forms the super vertex set $V^{(1)}$ of $G^{(1)}$. If and only if there is a connecting edge between any two communities $c_i$ and $c_j$ of $G^{(0)}$, a connecting edge is added between the corresponding super vertices $v_i^{(1)}$ and $v_j^{(1)}$ of $G^{(1)}$ and added to the edge set $E^{(1)}$.
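Steps S12-2-9 and S12-2-10 can be sketched as follows: each community of the division P becomes a super vertex, and super vertices are connected whenever any of their member vertices were.

```python
import numpy as np

def coarsen_adjacency(A, P):
    """Build the next-level adjacency matrix from a community division P."""
    parent = {}                               # original vertex -> super-vertex index
    for s, c in enumerate(P):
        for v in c:
            parent[v] = s
    A_next = np.zeros((len(P), len(P)), dtype=np.int8)
    for u, v in zip(*np.nonzero(A)):
        if parent[u] != parent[v]:
            A_next[parent[u], parent[v]] = 1  # one connecting edge per super-vertex pair
    return A_next
```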
In this embodiment, before each community in the merged first-level coarsened graph is taken as a super vertex, the method further includes:
S12-2-11: if unlabeled communities remain in the unlabeled community set, perform the community merging operation on them according to a preset merging rule.
In this embodiment, after the community merging operation has been performed iteratively, if unlabeled communities still remain in the unlabeled community set, the community merging operation is performed on them according to the preset merging rule.
For example, check whether $U$ is empty; if it is not, determine for each remaining unlabeled community $c_i$ which neighbor community to merge it with according to

$$c_j^{*}=\arg\max_{c_j\in N(c_i)}\frac{|E_{c_i\cup c_j}|}{d(c_i\cup c_j)}\qquad(3)$$

where $E_{c_i\cup c_j}$ denotes the edges inside the merged community and $d(c_i\cup c_j)$ its vertex degree as a super vertex.
In this embodiment, the method further includes:
S12-2-12: construct a matching matrix between each two adjacent levels of coarsened graphs.
In this embodiment, the matching matrix characterizes the matching relationship between the communities of two adjacent coarsened graphs.
In this embodiment, a matching matrix is constructed between every two adjacent levels of coarsened graphs.
Illustratively, the matching matrix $M_{l,l+1}$ between two adjacent levels of coarsened graphs $G^{(l)}$ and $G^{(l+1)}$ is constructed from the module density optimization result of $G^{(l)}$: the element in row $i$ and column $j$, $[M_{l,l+1}]_{ij}$, is set to 1 if and only if vertex $v_i^{(l)}$ of $G^{(l)}$ is merged into the super vertex $v_j^{(l+1)}$ of $G^{(l+1)}$; otherwise $[M_{l,l+1}]_{ij}$ is set to 0.
S12-2-13: whenever a new coarsened graph is obtained, construct the adjacency matrix corresponding to the coarsened graph from the connection relationships between its vertices.
In this embodiment, whenever a new coarsened graph is obtained, the adjacency matrix corresponding to the coarsened graph is constructed from the connection relationships between the vertices of the coarsened graph.
Illustratively, the adjacency matrix $A^{(l+1)}$ of the next-level coarsened graph $G^{(l+1)}$ is calculated as

$$A^{(l+1)} = M_{l,l+1}^{\top}\, A^{(l)}\, M_{l,l+1}\qquad(4)$$

where $M_{l,l+1}$ is the matching matrix between the two adjacent levels of coarsened graphs $G^{(l)}$ and $G^{(l+1)}$.
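Continuing the sketch, the matching matrix and the update of Eq. (4) look like this, assuming P is a community division like the one produced by the merging steps above; binarizing the product and zeroing the diagonal is an interpretation, since Eq. (4) itself yields edge multiplicities and self-loops.

```python
import numpy as np

def matching_matrix(P, n_vertices):
    """M[i, s] = 1 iff vertex v_i of G^(l) merges into super vertex s of G^(l+1)."""
    M = np.zeros((n_vertices, len(P)), dtype=np.int8)
    for s, c in enumerate(P):
        for v in c:
            M[v, s] = 1
    return M

M = matching_matrix(P, A.shape[0])
A_next = (M.T @ A @ M > 0).astype(np.int8)   # Eq. (4): A^(l+1) = M^T A^(l) M
np.fill_diagonal(A_next, 0)                  # drop self-loops from internal edges
```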
S12-3: generate coarsened graphs level by level; after the coarsened graph of a preset level is obtained, form the coarsened graph chain from all the coarsened graphs, with each coarsened graph in the chain arranged in order of generation.
In this embodiment, module density iterative optimization is performed on each coarsened graph; after the coarsened graph of the preset level is obtained, all the coarsened graphs obtained are taken as a coarsened graph chain, in which the coarsened graphs are arranged in order of generation: the first coarsened graph generated is the first in the chain, and the highest-level coarsened graph generated is the last.
Illustratively, the preset number of levels $L$ can be set according to the actual situation and is not limited here. After a coarsened graph is constructed, the level $l$ is updated to $l+1$; if $l < L$, the module density iterative optimization step is repeated until all coarsened graphs $G^{(0)}, G^{(1)}, \dots, G^{(L)}$ are obtained, i.e. the coarsened graph chain, denoted $\{G^{(l)}\}_{l=0}^{L}$.
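Putting the pieces together, the level-by-level generation of the coarsened graph chain could be sketched as below; bcd_mo stands in for one full round of the module density optimization and is an assumption about the interface, not the patent's code.

```python
def coarsening_chain(A0, L, bcd_mo):
    """Return adjacency matrices [A^(0), ..., A^(L)] of the coarsened graph chain."""
    chain, matchings = [A0], []
    for _ in range(L):
        P = bcd_mo(chain[-1])                        # community division of this level
        M = matching_matrix(P, chain[-1].shape[0])
        A_next = (M.T @ chain[-1] @ M > 0).astype(np.int8)
        np.fill_diagonal(A_next, 0)
        chain.append(A_next)
        matchings.append(M)                          # kept for the anti-coarsening pass
    return chain, matchings
```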
In this embodiment, the proposed BCD-MO algorithm discovers communities from the bottom up based on the module density evaluation index, and the resulting communities are more balanced in scale, which lays a foundation for generating high-quality graph Haar wavelet transform matrices in the subsequent steps.
S13: anti-coarsen the coarsened graph chain with an anti-coarsening algorithm to obtain the anti-coarsened graph chain.
In this embodiment, the anti-coarsening algorithm is a Multi-way Kernighan-Lin Refinement algorithm (MKLR). The anti-coarsened graph chain is the graph chain obtained by anti-coarsening each coarsened graph in the coarsened graph chain.
In this embodiment, anti-coarsening the coarsened graph chain with the anti-coarsening algorithm to obtain the anti-coarsened graph chain includes the following steps:
s13-1: and initializing the coarsened graph chain.
In this embodiment, the step of initializing the coarsened graph chain includes:
s13-1-1: and taking the coarsened drawing of the highest level in the coarsened drawing chain as a first-level anti-coarsened drawing in the anti-coarsened drawing chain.
In this embodiment, the top-level coarsened drawing in the coarsened drawing chain is used as the first-level anti-coarsened drawing in the anti-coarsened drawing chain.
S13-1-2: and taking the anti-coarsening graph adjacent to the first-stage anti-coarsening graph as a second-stage anti-coarsening graph.
In this embodiment, a coarsened pattern adjacent to the first-stage inverse coarsened pattern is used as the second-stage inverse coarsened pattern.
S13-1-3: and taking each vertex in the first-stage reverse coarsening graph as a father vertex.
In this embodiment, each vertex in the first-stage inverse coarsening graph is taken as a parent vertex.
S13-1-4: and taking each vertex in the second-stage anti-coarsening graph as a child vertex.
In this embodiment, each vertex in the first-stage anti-coarsening graph is formed by merging one or more vertices in the second-stage anti-coarsening graph. Each vertex in the first-stage anti-coarsening graph is called a parent vertex, and the corresponding vertex or vertices in the second-stage anti-coarsening graph are called its child vertices. A child vertex is defined relative to its parent vertex: several child vertices compose one parent vertex, and a child vertex at one level of the anti-coarsening graph chain can in turn serve as a parent vertex for the next level.

In this embodiment, each vertex in the second-stage anti-coarsening graph is used as a child vertex, that is, a child vertex corresponding to a parent vertex in the first-stage anti-coarsening graph. Which child vertices correspond to which parent vertices can be determined from the community merging record produced when the coarsening graph was generated.
S13-1-5: the state of each of the child vertices is set to an unlabeled state.
In this embodiment, the state of each child vertex is set to an unmarked state for subsequent operations.
Illustratively, at the initial time the highest-level coarsening graph $G_J$ is taken and added to the anti-coarsening graph chain $\mathcal{G}'$ as the first-stage anti-coarsening graph; $G_J$ is obtained from $G_{J-1}$ by coarsening via the BCD-MO algorithm. Each child vertex (i.e., each vertex of $G_{J-1}$) is initialized to the unmarked state, and the maximum gain $g$ is initialized to minus infinity.
S13-2: and performing anti-roughening treatment on the coarsened graph chain after the initialization treatment by using the anti-roughening algorithm to obtain the anti-coarsened graph chain.
In this embodiment, a reverse roughening algorithm is used to perform a reverse roughening treatment on the roughened graph chain after the initialization treatment, so as to obtain a reverse roughening graph chain, and the specific steps include:
s13-2-1: starting from a first-level anti-coarsening graph, determining corresponding child vertices of each parent vertex in the first-level anti-coarsening graph in a second-level anti-coarsening graph.
In this embodiment, starting from the first-level anti-coarsening graph, a corresponding child vertex of each parent vertex in the first-level anti-coarsening graph in the second-level anti-coarsening graph is determined.
S13-2-2: determining migration cost of each child vertex from the corresponding parent vertex to any one of the parent vertices, wherein the migration destination of the child vertex does not comprise the parent vertex where the child vertex is originally located.
In this embodiment, the migration cost is a migration cost caused by migration of the child vertex from one parent vertex to another parent vertex.
In this embodiment, a migration cost for each child vertex to migrate from a corresponding parent vertex to any parent vertex is determined, and a migration destination of each child vertex does not include the parent vertex where the child vertex originally resides.
Illustratively, it is known that $G_{j+1}$ is obtained from $G_j$ by coarsening via the BCD-MO algorithm, so each vertex of $G_{j+1}$ is formed by clustering one or more vertices of $G_j$: each vertex $c$ of $G_{j+1}$ is called a parent vertex (super vertex), and the corresponding vertices of $G_j$ are its child vertices. The MKLR algorithm performs anti-coarsening, i.e., refinement, on $G_{j+1}$ to obtain the anti-coarsening graph. Its core idea is to continually exchange child vertex pairs from different parent vertices so that the number of cut edges between different super vertices (communities) of $G_{j+1}$ decreases and the module density increases. Specifically, for any pair of super vertices $c_p$ and $c_q$ of $G_{j+1}$, denote their child vertex sets in $G_j$ by $V(c_p)$ and $V(c_q)$. The migration cost $P(u, c_q)$ of each child vertex $u$ moving from the super vertex $c_p$ to which it currently belongs to any other super vertex $c_q$ is calculated according to the following formula:

$P(u, c_q) = |CE(u, c_q)| - |IE(u)|$ (5)

wherein $CE(u, c_q)$ represents the set of connecting edges between vertex $u$ and the child vertices of super vertex $c_q$, namely the set of cut edges; $IE(u)$ represents the set of connecting edges between vertex $u$ and the other child vertices sharing the same parent vertex $c_p$, namely the set of inner edges. Since migrating $u$ to different super vertices incurs different costs, the maximum value is taken as the migration cost $P(u)$ of $u$:

$P(u) = \max_{c_q \neq c_p} P(u, c_q)$ (6)

The migration cost is calculated for every child vertex of every super vertex of $G_{j+1}$.
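The migration cost of formulas (5)-(6) can be sketched as follows; the mapping parent from child vertices to super vertices and the neighbor dictionary are assumed data structures for illustration, not the patent's:

from collections import defaultdict

def migration_cost(v, parent, neighbors):
    """Formulas (5)-(6): P(v) = max over foreign super vertices of
    (cut edges from v into that super vertex) - (inner edges of v)."""
    cut = defaultdict(int)   # edges from v into each foreign super vertex
    inner = 0                # edges from v to siblings under the same parent
    for u in neighbors[v]:
        if parent[u] == parent[v]:
            inner += 1
        else:
            cut[parent[u]] += 1
    if not cut:              # v has no foreign neighbors
        return -inner
    return max(cut.values()) - inner

# Toy example: vertices 0,1 under parent 'a'; vertices 2,3 under parent 'b'.
parent = {0: 'a', 1: 'a', 2: 'b', 3: 'b'}
neighbors = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
print(migration_cost(0, parent, neighbors))  # 1 cut edge to 'b' minus 1 inner edge = 0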
S13-2-3: and according to the migration cost, determining the exchange cost of any unlabeled child vertex pair with different parent vertices in the second-stage reverse coarsening graph.
In this embodiment, according to the obtained migration cost, the exchange cost of any unlabeled child vertex pair with different parent vertices in the second-stage reverse coarsening graph is determined.
Illustratively, for every child vertex pair $(u, v)$ from different parent vertices, the benefit $g(u, v)$ brought by exchanging the positions of the two child vertices is calculated according to the following formula:

$g(u, v) = P(u) + P(v) - 2\,e(u, v)$ (7)

wherein $u$ and $v$ come from different parent vertices of $G_{j+1}$; $e(u, v)$ indicates the connecting edge between the child vertex pair $(u, v)$, which may be empty, in which case $e(u, v) = 0$, and otherwise equals 1; $P$ is the migration cost. When $g(u, v) > 0$, the exchange achieves a positive gain, which means that the migration cost is reduced and the module density is increased; otherwise the migration cost is increased and the module density is reduced.
S13-2-4: and ordering the exchange cost of all the child vertex pairs with different parent vertices in the second-stage anti-coarsening graph to obtain an exchange cost ordering result.
In this embodiment, the exchange costs of all the child vertex pairs with different parent vertices in the second-stage anti-coarsening graph are ordered, so as to obtain an exchange cost ordering result.
Illustratively, the unmarked child vertex pairs $(u, v)$ with different parent vertices are taken from the vertex set of $G_j$, where $u$ and $v$ belong to different parent vertices. According to formula (7), the benefit $g(u, v)$ of exchanging each such pair of child vertices is calculated.
S13-2-5: and determining the child vertex pair with the largest positive benefit according to the exchange cost ordering result.
In this embodiment, the sub-vertex pair with the largest positive benefit is determined according to the exchange cost ordering result.
In the present embodiment, when $g(u, v) > 0$ the exchange yields a positive benefit; the child vertex pair with the largest positive benefit is determined from the exchange cost ordering result.
S13-2-6: the child vertex pairs are swapped for the parent vertex in the first level anti-coarsening graph.
In this embodiment, the child vertex pair with the greatest positive benefit is swapped for the parent vertex in the first level reverse coarsening graph.
S13-2-7: the state of each of the child vertices in the pair is set to a labeled state.
In this embodiment, the state of each child vertex in the pair of child vertices is set to a labeled state.
Illustratively, the exchange benefits of all child vertex pairs with different parent vertices are ordered, and the vertex pair with the largest positive benefit, denoted $(u^*, v^*)$, is selected to implement the swap: their parent vertices in $G_{j+1}$ are exchanged, and the states of $u^*$ and $v^*$ are set to marked.
S13-2-8: and finishing the anti-roughening treatment of the second-stage anti-roughening graph under the condition that all the states of the sub-vertices are the marked states.
In this embodiment, when the states of all the child vertices are marked, the anti-roughening process for the second-stage anti-roughening map is completed.
Illustratively, steps S13-2-1 through S13-2-7 are repeated until all vertices are in the marked state, or the exchange benefits of 10 consecutive vertex pairs are negative. The leading sequence of consecutive exchange operations that maximizes the accumulated exchange benefit is then retained, which yields the anti-coarsening graph of $G_{j+1}$.
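Putting steps S13-2-1 through S13-2-8 together, one MKLR refinement sweep can be sketched as below. It reuses the migration_cost function from the previous sketch; the patience of 10 comes from the text, everything else (data structures, tie-breaking) is an illustrative assumption:

def kl_refinement_sweep(vertices, parent, neighbors, patience=10):
    """One multi-way Kernighan-Lin sweep; mutates parent, returns the kept swaps."""
    marked = set()
    history, gains = [], []
    negatives = 0
    while negatives < patience:
        best = None
        for u in vertices:
            if u in marked:
                continue
            for v in vertices:
                if v in marked or parent[u] == parent[v]:
                    continue
                e_uv = 1 if v in neighbors[u] else 0        # formula (7)
                g = (migration_cost(u, parent, neighbors)
                     + migration_cost(v, parent, neighbors) - 2 * e_uv)
                if best is None or g > best[0]:
                    best = (g, u, v)
        if best is None:                                    # no unmarked cross-parent pair left
            break
        g, u, v = best
        parent[u], parent[v] = parent[v], parent[u]         # exchange parent vertices
        marked.update((u, v))
        history.append((u, v))
        gains.append(g)
        negatives = negatives + 1 if g < 0 else 0
    # keep only the leading swaps that maximize the accumulated exchange benefit
    acc, best_acc, cut = 0, 0, 0
    for i, g in enumerate(gains, 1):
        acc += g
        if acc > best_acc:
            best_acc, cut = acc, i
    for u, v in reversed(history[cut:]):                    # undo the unprofitable tail
        parent[u], parent[v] = parent[v], parent[u]
    return history[:cut]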
S13-2-9: and iteratively generating a reverse coarsening diagram, wherein after the reverse coarsening diagram of a preset level is obtained, the reverse coarsening diagram chain is formed by all the obtained reverse coarsening diagrams, and each reverse coarsening diagram in the reverse coarsening diagram chain is arranged according to the generation sequence.
In this embodiment, the steps are iteratively performed, and when one reverse coarsening diagram is completed, the next reverse coarsening diagram is generated until the reverse coarsening diagram of the preset level is generated, and the obtained multiple reverse coarsening diagrams are arranged according to the generation sequence, so as to obtain the reverse coarsening diagram chain.
Illustratively, the preset number of levels $J$ can be set according to the actual situation. After the anti-coarsening graph of one level is generated, the level is updated; if it has not reached $J$, the above steps are repeated until all anti-coarsening graphs are obtained; they constitute the anti-coarsening graph chain $\mathcal{G}'$.
In the embodiment, MKLR algorithm continuously exchanges vertex pairs among different communities on the premise of keeping community scale balance, reduces community cutting edges, improves community module density, optimizes graph coarsening results, and further ensures efficient construction of graph Haar wavelet transformation matrixes and multi-scale graph Haar convolutional neural networks.
Referring to FIG. 2, FIG. 2 is a schematic diagram of a coarsening graph chain and an anti-coarsening graph chain according to an embodiment of the present application. As shown in FIG. 2, the coarsening graph chain is generated from the original graph upward to the highest-level coarsening graph, and the anti-coarsening graph chain is generated from the highest-level coarsening graph back down, the two chains being arranged in opposite generation orders.
In this embodiment, when the anti-roughening processing on the second-stage anti-roughening graph is completed under the condition that the states of all the child vertices are the marked states, the method further includes:
S13-2-10: and generating a matching matrix between the first-stage reverse coarsening diagram and the second-stage reverse coarsening diagram according to the exchange record of the child vertex pairs.
In this embodiment, a matching matrix between the first-stage anti-coarsening graph and the second-stage anti-coarsening graph is generated according to the exchange record of the child vertex pairs.
Illustratively, the matching matrix between the first-stage anti-coarsening graph and the second-stage anti-coarsening graph is updated according to the exchange record of the child vertex pairs, so as to reflect the new affiliation between parent vertices and child vertices. It should be noted that an exchange only swaps the parent vertices of the pair of child vertices and does not change the adjacency relations of the pair of child vertices themselves; therefore the adjacency matrix describing the adjacency relations between the vertices of the corresponding coarsening graph remains unchanged.
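A sketch of the matching-matrix bookkeeping of S13-2-10 under the same illustrative conventions: exchanging the parent vertices of child vertices u and v swaps the corresponding rows of the matching matrix, while the child-level adjacency matrix stays unchanged:

import numpy as np

def update_matching_matrix(M, swaps):
    """M[i, k] = 1 iff child vertex i belongs to parent (super) vertex k.
    Swapping the parents of child vertices u and v swaps rows u and v."""
    M = M.copy()
    for u, v in swaps:
        M[[u, v]] = M[[v, u]]
    return M

M = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
print(update_matching_matrix(M, [(1, 2)]))  # rows 1 and 2 exchanged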
S14: and establishing a multi-scale graph Haar wavelet convolutional neural network according to the coarsening graph chain and the anti-coarsening graph chain.
In this embodiment, referring to fig. 3, fig. 3 is a schematic diagram of a multi-scale graph Haar wavelet convolution network according to an embodiment of the present application. The figure comprises an input layer, 2J convolution layers and an output layer, wherein H represents hidden features extracted from the convolution layers, A represents an adjacent matrix, X represents a vertex feature matrix and Z represents an output label matrix. The neural network is used to classify documents.
In this embodiment, according to the coarsened graph chain and the anti-coarsened graph chain, a multi-scale graph Haar wavelet convolutional neural network is established, and the specific steps include:
s14-1, processing the coarsened graph chain and the anti-coarsened graph chain according to a preset graph Haar wavelet transformation matrix construction rule to obtain a graph Haar wavelet transformation matrix corresponding to each coarsened graph and a graph Haar wavelet transformation matrix corresponding to each anti-coarsened graph.
In this embodiment, according to a preset graph Haar wavelet transform matrix construction rule, the coarsening graph chain and the anti-coarsening graph chain are processed to obtain the graph Haar wavelet transform matrix corresponding to each coarsening graph and the graph Haar wavelet transform matrix corresponding to each anti-coarsening graph.
Illustratively, let $G_{j+1}$ be the coarsening graph obtained from graph $G_j$ by the coarsening algorithm BCD-MO. Each super vertex $c$ of $G_{j+1}$ is a community of one or more vertices of $G_j$, and its size $|c|$ is the number of child vertices it contains. All super vertices of $G_{j+1}$ are ordered by size from large to small, and the ordered vertex set of $G_{j+1}$ is denoted $V_{j+1} = \{c_1, c_2, \dots, c_{N_{j+1}}\}$. For $G_{j+1}$, an orthogonal transform basis of the second-order norm space $\ell^2(G_{j+1})$ is constructed according to the following rules.

Formula (8) defines the first basis vector as the normalized constant vector on the vertex set of $G_{j+1}$:

$\varphi_1^{(j+1)}(c) = 1/\sqrt{N_{j+1}}, \quad c \in V_{j+1}$ (8)

When $2 \le \ell \le N_{j+1}$, the $\ell$-th basis vector $\varphi_\ell^{(j+1)}$ is constructed from the indicator functions of the ordered super vertices (formulas (9)-(10)), where the indicator function defined on $V_{j+1}$ equals 1 at the corresponding super vertex and 0 elsewhere.

Since $G_{j+1}$ is the coarsening graph of $G_j$ obtained by the proposed coarsening algorithm BCD-MO, each vertex of $G_j$ corresponds to exactly one super vertex of $G_{j+1}$. The orthogonal transform basis vectors defined on $G_{j+1}$ are therefore converted into vectors defined on $G_j$ (formula (11)): each basis vector is lifted to $G_j$ by assigning, for every super vertex, its value to all of that super vertex's child vertices with suitable normalization.

For a super vertex $c_k$ of $G_{j+1}$, let its child vertex set in $G_j$ consist of the vertices ordered from high degree to low degree. The remaining basis vectors on $G_j$ are constructed within each super vertex from the indicator functions defined on its child vertex set (formulas (12)-(13)).

It is easy to verify that the constructed vector set is mutually orthogonal and constitutes an orthogonal transform basis of the second-order norm space $\ell^2(G_j)$. Through formulas (8)-(13), the graph Haar wavelet transform matrix of a coarsening graph and the conversion method that converts it into the graph Haar wavelet transform matrix corresponding to the finer-grained graph can be constructed.
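The lifting step described for formula (11) can be illustrated as follows, under the assumption (not the patent's exact formula) that each parent value is copied to its child vertices and rescaled by $1/\sqrt{|c|}$ so that the unit 2-norm is preserved:

import numpy as np

def lift_basis_vector(phi_coarse, M):
    """Expand a unit basis vector on the coarse graph to the fine graph.

    phi_coarse: (m,) basis vector on the coarsening graph.
    M: (n, m) matching matrix, M[i, k] = 1 iff fine vertex i has parent k.
    Each parent value is copied to its children and divided by sqrt(|c|),
    which keeps the lifted vector at unit 2-norm.
    """
    sizes = M.sum(axis=0)                   # |c| for every super vertex
    return M @ (phi_coarse / np.sqrt(sizes))

M = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
phi = np.array([1 / np.sqrt(2), -1 / np.sqrt(2)])
lifted = lift_basis_vector(phi, M)
print(lifted, np.linalg.norm(lifted))       # norm stays 1.0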
S14-2: and establishing the multi-scale graph Haar wavelet convolutional neural network according to the graph Haar wavelet transformation matrix corresponding to the coarsening graph and the graph Haar wavelet transformation matrix corresponding to the anti-coarsening graph.
In this embodiment, after obtaining the graph Haar wavelet transform matrix corresponding to each coarsening graph and the graph Haar wavelet transform matrix corresponding to each anti-coarsening graph, the multi-scale graph Haar wavelet convolutional neural network is built according to the obtained graph Haar wavelet transform matrices, and the specific steps include:
s14-2-1: and constructing an input layer of the multi-scale graph Haar wavelet convolutional neural network.
In this embodiment, when constructing the multi-scale map Haar wavelet convolutional neural network, an input layer of the multi-scale map Haar wavelet convolutional neural network is first constructed. The input layer is used for receiving the adjacency matrix, the vertex characteristic matrix and the vertex label matrix of the literature relation graph.
S14-2-2: and constructing a plurality of graph Haar wavelet convolution layers corresponding to the coarsened graph chains according to the graph Haar wavelet transformation matrix corresponding to the coarsened graph.
In this embodiment, according to the graph Haar wavelet transform matrix corresponding to each coarsening graph, a plurality of graph Haar wavelet convolution layers corresponding to the coarsening graph chain are constructed.
S14-2-3: and constructing a plurality of graph Haar wavelet convolution layers corresponding to the anti-coarsening graph chains according to the graph Haar wavelet transformation matrix corresponding to the anti-coarsening graph.
In this embodiment, a plurality of wavelet convolution layers corresponding to the inverse coarsening graph chain are constructed according to the graph Haar wavelet transform matrix corresponding to the inverse coarsening graph.
Illustratively, let $\Phi$ be the graph Haar wavelet transform basis associated with graph $G$; the set of basis vectors constitutes the graph Haar wavelet transform matrix $\Phi$. For a signal $x$ on graph $G$ and its transform $\hat{x}$, the adjoint graph Haar wavelet transform and the forward graph Haar wavelet transform are defined as follows:

$\hat{x} = \Phi^{\mathrm{T}} x$ (14)

$x = \Phi \hat{x}$ (15)

wherein $\Phi$ is the graph Haar wavelet transform matrix, and $x$ and $\hat{x}$ are signals on graph $G$.
Based on formulas (14)-(15), the graph Haar wavelet convolution operation is defined as follows:

$g \star x = \Phi\,\mathrm{diag}(\theta)\,\Phi^{\mathrm{T}} x$ (16)

wherein $g$ is the graph Haar wavelet convolution kernel, $\mathrm{diag}(\theta)$ is its parameterized diagonal form, and $\theta$ is the parameter to be learned; the filtering corresponds to the Hadamard product of the transformed kernel and the transformed signal. Decomposing the graph Haar wavelet convolution operation in formula (16) into the two steps of linear transformation and convolution, and adding a nonlinear activation function, yields the implementation of the graph Haar wavelet convolution layer:

$\tilde{X} = X W$ (17)

$H = \sigma\big(\Phi\,\mathrm{diag}(\theta)\,\Phi^{\mathrm{T}} \tilde{X}\big)$ (18)

wherein $X$ and $\tilde{X}$ respectively represent the vertex feature matrices of the graph before and after the linear transformation; $W$ is the linear transformation matrix to be learned, which completes the conversion of the vertex features from $d$ dimensions to $p$ dimensions; $\sigma(\cdot)$ represents a nonlinear activation function; $H$ represents the vertex hidden layer representation extracted from the vertex feature matrix $X$ via the graph Haar wavelet convolution operation.

Through formulas (14)-(18), a corresponding graph Haar wavelet convolution layer can be constructed according to each graph Haar wavelet transform matrix.
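Formulas (14)-(18) condense into a short numpy sketch; Phi, theta and W are illustrative parameter names, not the patent's notation:

import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def haar_conv_layer(Phi, theta, W, X):
    """Formulas (17)-(18): H = sigma(Phi diag(theta) Phi^T (X W)).

    Phi:   (n, n) graph Haar wavelet transform matrix (orthogonal).
    theta: (n,)   spectral convolution kernel, a parameter to be learned.
    W:     (d, p) linear transformation matrix, a parameter to be learned.
    X:     (n, d) vertex feature matrix.
    """
    X_lin = X @ W                 # formula (17): linear transformation
    X_hat = Phi.T @ X_lin         # formula (14): forward transform
    X_hat = theta[:, None] * X_hat  # spectral filtering (Hadamard product)
    return relu(Phi @ X_hat)      # formula (15) plus activation: back to vertex domain

# Toy check with the 2-vertex Haar basis.
Phi = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
H = haar_conv_layer(Phi, np.ones(2), np.eye(3), np.random.rand(2, 3))
print(H.shape)  # (2, 3)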
S14-2-4: and constructing an output layer of the multi-scale graph Haar wavelet convolutional neural network.
In this embodiment, the output layer of the multi-scale map Haar wavelet convolutional neural network is used to output the predicted vertex tag matrix.
Illustratively, the matrix output by the output layer is the predicted vertex label matrix $Z$:

$Z = \mathrm{softmax}(H)$ (19)

wherein $\mathrm{softmax}(x_i) = \exp(x_i) / \sum_k \exp(x_k)$ is applied row by row; $z_c$ is a column vector of $N$ dimensions representing the probabilities that all vertices belong to class $c$, i.e., its $i$-th element represents the probability that vertex $i$ belongs to class $c$; $H$ is the vertex hidden layer representation of the last-level anti-coarsening graph. The prediction result vectors of all classes form the $N \times C$ prediction result matrix $Z$.
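A numerically stable row-wise softmax for formula (19) can be sketched as follows (function name illustrative):

import numpy as np

def softmax_rows(H):
    """Row-wise softmax: each row of Z sums to 1 and gives class probabilities."""
    e = np.exp(H - H.max(axis=1, keepdims=True))  # shift rows for numerical stability
    return e / e.sum(axis=1, keepdims=True)

Z = softmax_rows(np.array([[2.0, 1.0, 0.1]]))
print(Z, Z.sum())  # probabilities for one vertex over three classes, summing to 1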
In this embodiment, when constructing a plurality of graph Haar wavelet convolution layers corresponding to the coarsened graph chain according to the graph Haar wavelet transform matrix corresponding to the coarsened graph, the method further includes:
s14-2-5: and executing graph Haar wavelet convolution operation on each coarsened graph in the coarsened graph chain through a plurality of graph Haar wavelet convolution layers.
In this embodiment, the graph Haar wavelet convolution operation is performed on each coarsening graph in the coarsening graph chain by a plurality of graph Haar wavelet convolution layers, i.e., the first J convolution layers in FIG. 3.
S14-2-6: and carrying out average pooling operation on each coarsened graph in the coarsened graph chain through a plurality of graph Haar wavelet convolution layers.
In this embodiment, the average pooling operation is performed on each coarsened graph in the coarsened graph chain by multiple graph Haar wavelet convolution layers, i.e., the first J convolution layers in fig. 3.
Illustratively, a graph Haar wavelet convolution operation and an average pooling operation are performed on each graph in the coarsening graph chain. For any graph $G_j$ in the coarsening graph chain, its graph Haar wavelet transform matrix $\Phi_j$ and vertex feature matrix $X_j$ are taken as input, and the vertex hidden layer representation $H_j$ is extracted via one graph Haar wavelet convolution layer:

$\tilde{X}_j = X_j W_j$ (20)

$H_j = \sigma\big(\Phi_j\,\mathrm{diag}(\theta_j)\,\Phi_j^{\mathrm{T}} \tilde{X}_j\big)$ (21)

wherein the linear transformation matrix $W_j$ and the convolution kernel $\mathrm{diag}(\theta_j)$ are both network parameters to be learned; $\Phi_j$ is the graph Haar wavelet transform matrix constructed for graph $G_j$; $\sigma(\cdot)$ represents a nonlinear activation function. From the coarsening process of the coarsening algorithm BCD-MO it is known that the matching matrix between $G_j$ and $G_{j+1}$ is $M_{j,j+1}$; column regularization is applied to it to obtain $\hat{M}_{j,j+1}$:

$\hat{M}_{j,j+1} = M_{j,j+1} D^{-1}$, where $D$ is the diagonal matrix of the column sums of $M_{j,j+1}$ (22)

Multiplying $\hat{M}_{j,j+1}^{\mathrm{T}}$ and $H_j$ gives the average pooling result, which serves as the vertex feature matrix $X_{j+1}$ of the next-level coarsening graph $G_{j+1}$:

$X_{j+1} = \hat{M}_{j,j+1}^{\mathrm{T}} H_j$ (23)

When $j = J$, the graph Haar wavelet convolution operation defined by formulas (20)-(21) is performed on the vertex feature matrix $X_J$ of $G_J$ to obtain the graph vertex hidden layer representation $H_J$; the average pooling operation defined by formula (23) is not performed.
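Formulas (22)-(23) amount to averaging the child-vertex hidden features into each super vertex. A sketch with illustrative names:

import numpy as np

def average_pool(H, M):
    """Formulas (22)-(23): column-normalize M, then X_next = M_hat^T H.

    H: (n, p) vertex hidden representations of the current graph.
    M: (n, m) matching matrix to the next-level coarsening graph.
    """
    M_hat = M / M.sum(axis=0, keepdims=True)  # formula (22): column regularization
    return M_hat.T @ H                        # formula (23): per-community average

M = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
H = np.arange(8, dtype=float).reshape(4, 2)
print(average_pool(H, M))  # each super vertex receives the mean of its children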
In this embodiment, when constructing the plurality of graph Haar wavelet convolution layers corresponding to the inverse coarsening graph chain according to the graph Haar wavelet transform matrix corresponding to the inverse coarsening graph, the method further includes:
S14-2-7: and executing anti-pooling operation on each anti-coarsening graph in the anti-coarsening graph chain through a plurality of graph Haar wavelet convolution layers.
In this embodiment, the anti-pooling operation is performed on each anti-coarsening graph in the anti-coarsening graph chain by a plurality of graph Haar wavelet convolution layers, i.e., the (J+1)-th to 2J-th convolution layers in FIG. 3.
S14-2-8: and performing graph Haar wavelet convolution operation on each anti-coarsening graph in the anti-coarsening graph chain through a plurality of graph Haar wavelet convolution layers.
In this embodiment, the graph Haar wavelet convolution operation is performed on each anti-coarsening graph in the anti-coarsening graph chain by a plurality of graph Haar wavelet convolution layers, i.e., the (J+1)-th to 2J-th convolution layers in FIG. 3.
Illustratively, for any graph $G'_j$ in the anti-coarsening graph chain, the vertex hidden layer representation of the preceding level (when $j = 1$, the representation $H_J$ of the highest-level coarsening graph) and the matching matrix $\hat{M}$ between the two levels are used to perform the anti-pooling operation on the graph vertex hidden layer representation, obtaining the vertex feature matrix $X'_j$ of the anti-coarsening graph $G'_j$:

$X'_j = \hat{M} H'_{j-1}$ (24)

Taking $\Phi'_j$ and $X'_j$ as input, the vertex hidden layer representation $H'_j$ of the anti-coarsening graph $G'_j$ is extracted via one graph Haar wavelet convolution layer:

$\tilde{X}'_j = X'_j W'_j$ (25)

$H'_j = \sigma\big(\Phi'_j\,\mathrm{diag}(\theta'_j)\,\Phi_j'^{\mathrm{T}} \tilde{X}'_j\big)$ (26)

wherein $\sigma(\cdot)$ represents a nonlinear activation function, $W'_j$ and $\mathrm{diag}(\theta'_j)$ are network parameters to be learned, and $\Phi'_j$ is the graph Haar wavelet transform matrix constructed for graph $G'_j$.

Through formulas (24)-(26), the anti-pooling operation and the graph Haar wavelet convolution operation are performed for each anti-coarsening graph.
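The anti-pooling of formula (24) broadcasts each super vertex's hidden representation back to its child vertices; a minimal sketch:

import numpy as np

def unpool(H_coarse, M):
    """Formula (24): X_fine = M H_coarse.

    H_coarse: (m, p) hidden representations of the coarser graph.
    M: (n, m) matching matrix between the fine and coarse graphs;
       each fine vertex receives its parent's representation.
    """
    return M @ H_coarse

M = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
H_coarse = np.array([[1.0, 2.0], [5.0, 6.0]])
print(unpool(H_coarse, M))  # rows 0-1 copy parent a, rows 2-3 copy parent b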
In this embodiment, for a coarsening graph chain containing $J$ levels of coarsening graphs and an anti-coarsening graph chain containing $J$ levels of anti-coarsening graphs, the multi-scale graph Haar wavelet convolutional neural network comprises $2J$ graph Haar convolution layers. The first $J$ graph Haar convolution layers are respectively responsible for performing the graph Haar wavelet convolution operation and the average pooling operation on each coarsening graph in the coarsening graph chain; the last $J$ graph Haar convolution layers are respectively responsible for performing the anti-pooling operation and the graph Haar wavelet convolution operation on each anti-coarsening graph in the anti-coarsening graph chain. Finally, the vertex hidden layer representation $H'_J$ of the last anti-coarsening graph is obtained; this representation embeds the adjacency information of the original graph data at different scales, and it is input into the output layer to predict the vertex classification results, obtaining the predicted vertex label matrix $Z$.
S15: training the multi-scale graph Haar wavelet convolutional neural network through the adjacency matrix, the vertex feature matrix and the vertex label matrix.
In this embodiment, training the multi-scale graph Haar wavelet convolutional neural network through the adjacency matrix, the vertex feature matrix and the vertex label matrix includes the following specific steps:
s15-1: and inputting the adjacency matrix, the vertex characteristic matrix and the vertex label matrix into the multi-scale graph Haar wavelet convolutional neural network.
S15-2: and adjusting parameters of the multi-scale graph Haar wavelet convolutional neural network according to a preset loss function.
S15-3: and stopping training when the prediction error of the multi-scale graph Haar wavelet convolutional neural network meets a preset error threshold value, and outputting a predicted vertex tag matrix.
In this embodiment, the predicted vertex label matrix is a matrix of labels of the literature without labels in the literature relationship graph predicted by the multi-scale graph Haar wavelet convolutional neural network.
In this embodiment, the preset error threshold is a preset threshold for the document-category prediction error of the multi-scale graph Haar wavelet convolutional neural network; when the prediction error meets this threshold, training is stopped and the predicted vertex label matrix is output.
In this embodiment, after the multi-scale graph Haar wavelet convolutional neural network is constructed, it is trained through the adjacency matrix, the vertex feature matrix and the vertex label matrix. For a given graph data set, the adjacency matrix, the vertex feature matrix and the vertex label matrix are taken as input and fed into the multi-scale graph Haar wavelet convolutional neural network for forward message passing; the prediction results of all vertices belonging to each class are calculated, the network loss function value is calculated, and the network parameters of each layer are updated according to a preset strategy until the network error reaches the preset error threshold or the number of iterations reaches the specified maximum, at which point training ends.
Illustratively, in training, the network parameters of each layer of the graph Haar wavelet neural network, including the linear transformation matrices $W_j$, $W'_j$ and the graph convolution kernels $\mathrm{diag}(\theta_j)$, $\mathrm{diag}(\theta'_j)$, are corrected and updated according to a specific strategy such as stochastic gradient descent (Stochastic Gradient Descent, SGD), momentum gradient descent (Momentum Gradient Descent, MGD), Nesterov momentum, AdaGrad, RMSprop, Adam (Adaptive Moment Estimation), or batch gradient descent (Batch Gradient Descent, BGD), so as to optimize the loss function value.
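A minimal PyTorch-style training-loop sketch for steps S15-1 to S15-3, assuming the network above has been wrapped as a torch module model that returns pre-softmax scores (F.cross_entropy applies log-softmax internally); the optimizer choice, learning rate, epoch count and threshold are illustrative, not the patent's values:

import torch
import torch.nn.functional as F

def train(model, A, X, Y, labeled_mask, max_epochs=200, err_threshold=1e-3):
    """Forward pass, cross-entropy loss on labeled vertices, parameter update.

    Y: (n,) long tensor of class indices; labeled_mask: (n,) boolean tensor.
    """
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
    for epoch in range(max_epochs):
        optimizer.zero_grad()
        Z = model(A, X)                              # forward message passing
        loss = F.cross_entropy(Z[labeled_mask], Y[labeled_mask])
        loss.backward()                              # gradients for W and theta
        optimizer.step()                             # e.g. Adam parameter update
        if loss.item() < err_threshold:              # preset error threshold reached
            break
    return Z.detach()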
In this embodiment, before inputting the adjacency matrix, the vertex feature matrix, and the vertex label matrix into the multi-scale map Haar wavelet convolutional neural network, the method further includes:
s15-4: the loss function is set based on cross entropy.
In this embodiment, the loss function is first set based on cross entropy prior to training the neural network.
Illustratively, incorporating the vertex true labels, the loss function loss based on cross entropy (CrossEntropy) is designed as follows:

$\mathrm{loss} = -\sum_{i} \sum_{c=1}^{C} Y_{ic} \ln Z_{ic}$ (27)

wherein $Y_i$ and $Z_i$ are the row vectors of the true label matrix $Y$ and the predicted label matrix $Z$ respectively, and the outer summation runs over the labeled vertices.
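Formula (27) as a numpy sketch over the labeled vertices (the mask and names are illustrative):

import numpy as np

def cross_entropy_loss(Y, Z, labeled):
    """Formula (27): loss = -sum_i sum_c Y[i,c] * ln Z[i,c] over labeled vertices.

    Y: (n, C) one-hot true label matrix; Z: (n, C) predicted probabilities.
    labeled: boolean mask of the vertices whose labels are known.
    """
    eps = 1e-12                                  # avoid log(0)
    return float(-(Y[labeled] * np.log(Z[labeled] + eps)).sum())

Y = np.array([[1, 0], [0, 1]], dtype=float)
Z = np.array([[0.9, 0.1], [0.2, 0.8]])
print(cross_entropy_loss(Y, Z, np.array([True, True])))  # -(ln 0.9 + ln 0.8)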
S15-5: and initializing network parameters of each layer of the multi-scale graph Haar wavelet convolutional neural network.
In this embodiment, before training the neural network, the network parameters of each layer are initialized using a specific policy.
For example, the initialization policy may be normal distribution random initialization, Xavier initialization or He initialization, and it initializes the network parameters of each layer of the network, including the feature transformation matrix and the graph convolution kernel of each layer.
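A sketch of the Xavier (Glorot) initialization mentioned here, applied to one feature transformation matrix; the shapes are illustrative:

import numpy as np

def xavier_init(fan_in, fan_out, rng=np.random.default_rng(0)):
    """Glorot uniform: W ~ U(-a, a) with a = sqrt(6 / (fan_in + fan_out))."""
    a = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-a, a, size=(fan_in, fan_out))

W0 = xavier_init(128, 64)   # e.g. a first-layer feature transformation matrix
print(W0.shape, W0.std())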
S16: and obtaining a literature classification result corresponding to the literature relation graph under the condition that the multi-scale graph Haar wavelet convolutional neural network is trained.
In this embodiment, when the training of the multi-scale graph Haar wavelet convolutional neural network is completed, a predicted vertex tag matrix is output, and the predicted vertex tag matrix is used as a corresponding document classification result.
Illustratively, at the end of training, the predicted vertex label matrix $Z$ is obtained; for each vertex without a class label, the class to which it belongs can be obtained from the corresponding row of $Z$.
Referring to FIG. 4, FIG. 4 is a schematic structural diagram of a document classification system according to an embodiment of the present application. In this system, document features and document citation relationships are extracted from a document database to generate the graph G; multi-level coarsening graphs and anti-coarsening graphs are then constructed; and the multi-scale graph Haar wavelet convolutional neural network is then used to perform document label prediction, obtaining the document classification result, namely the predicted label matrix Z.
Based on the same inventive concept, an embodiment of the present application provides a document classification apparatus. Referring to fig. 5, fig. 5 is a schematic diagram of a document classification apparatus 500 according to an embodiment of the application. As shown in fig. 5, the apparatus includes:
A data receiving module 501, configured to receive a literature relationship graph, an adjacent matrix corresponding to the literature relationship graph, a vertex feature matrix corresponding to the literature relationship graph, and a vertex label matrix corresponding to the literature relationship graph;
the coarsening module 502 is configured to coarsen the literature relationship graph by using a coarsening algorithm to obtain a coarsened graph chain corresponding to the literature relationship graph;
a reverse roughening processing module 503, configured to use a reverse roughening algorithm to perform reverse roughening processing on the roughened graph chain, so as to obtain a reverse roughened graph chain;
The neural network building module 504 is configured to build a multi-scale graph Haar wavelet convolutional neural network according to the coarsened graph chain and the inverse coarsened graph chain;
The neural network training module 505 is configured to train the multi-scale map Haar wavelet convolutional neural network through the adjacency matrix, the vertex feature matrix, and the vertex label matrix;
and the document classification result obtaining module 506 is configured to obtain a document classification result corresponding to the document relation graph when the training of the multi-scale graph Haar wavelet convolutional neural network is completed.
Optionally, the apparatus further comprises:
A document data acquisition module for acquiring document data from a document database;
a document relation diagram construction module, configured to construct the document relation diagram with each document in the document data as a vertex and with a reference relation between documents as a side between the vertices;
the vertex feature matrix construction module is used for obtaining the vertex feature matrix according to the feature of each vertex in the literature relation graph;
The adjacency matrix construction module is used for obtaining the adjacency matrix according to the connection relation between each vertex in the literature relation graph;
and the vertex tag matrix construction module is used for obtaining the vertex tag matrix according to the tags marked with the vertices in the literature relationship graph.
Optionally, the roughening module includes:
The first initialization processing sub-module is used for initializing the literature relation graph to obtain a first-stage coarsening graph;
The module density iterative optimization sub-module is used for performing module density iterative optimization on the first-stage coarsening graph to obtain a second-stage coarsening graph;
and the coarsening diagram chain acquisition sub-module is used for generating a plurality of coarsening diagrams step by step, after the coarsening diagrams of the preset series are obtained, the coarsening diagram chain is formed by all the obtained coarsening diagrams, and each coarsening diagram in the coarsening diagram chain is arranged according to the generation sequence.
Optionally, the first initialization processing sub-module includes:
A community determination submodule, configured to use each vertex in the literature relationship graph as a community;
A first community state setting sub-module for setting a community state of each community to be unlabeled;
the community adding sub-module is used for adding the communities with the untagged community states into the untagged community set to obtain the first-level coarsening graph.
Optionally, the module density iterative optimization submodule includes:
the neighbor community determination submodule is used for determining all neighbor communities corresponding to each untagged community in the first-level coarsening chart;
An adjacent community generation sub-module, configured to, for each untagged community, form an adjacent community pair with each neighbor community, so as to obtain a plurality of adjacent community pairs;
the module density gain determining submodule is used for determining the module density gain of the communities and the neighbor communities in each adjacent community pair after combination;
An adjacent community pair determining submodule, configured to determine the adjacent community pair with the greatest module density gain corresponding to the plurality of adjacent community pairs;
a community merging operation execution sub-module, configured to execute a community merging operation on the communities in the adjacent community pair when the module density gain corresponding to the adjacent community pair is greater than a preset gain threshold;
a second community state setting sub-module, configured to set, when the community state of the neighbor community in the adjacent community pair is an untagged state, the community state of the neighbor community to a tagged state;
A community deleting sub-module for deleting the neighbor communities from the untagged community set;
the iteration optimization sub-module is used for carrying out module density optimization in an iteration mode, and stopping iteration under the condition that the community division density is not increased any more;
the super-vertex determining sub-module is used for taking each community in the first-level coarsening diagram after community combination as a super-vertex;
and the connecting edge adding sub-module is used for adding connecting edges between the super vertices according to the corresponding connection relation between the super vertices to obtain the second-stage coarsening graph.
Optionally, the apparatus further comprises:
the second community merging operation execution sub-module is used for executing community merging operation on the untagged communities through preset merging rules under the condition that the untagged communities exist in the untagged community set.
Optionally, the apparatus further comprises:
a matching matrix construction sub-module for constructing a matching matrix between the coarsening graphs of adjacent two stages;
and the adjacency matrix construction submodule is used for constructing an adjacency matrix corresponding to the coarsening diagram according to the connection relation between each vertex in the coarsening diagram under the condition of obtaining a new coarsening diagram.
Optionally, the anti-coarsening processing module includes:
The second initialization processing sub-module is used for initializing the coarsened graph chain;
and the anti-coarsening diagram chain acquisition sub-module is used for carrying out anti-coarsening treatment on the coarsening diagram chain after the initialization treatment by using the anti-coarsening algorithm to obtain the anti-coarsening diagram chain.
Optionally, the second initialization processing submodule includes:
A first-stage anti-coarsening diagram determining submodule, configured to use the coarsening diagram of the highest stage in the coarsening diagram chain as a first-stage anti-coarsening diagram in the anti-coarsening diagram chain;
A second-stage anti-coarsening map determining sub-module, configured to use an anti-coarsening map adjacent to the first-stage anti-coarsening map as a second-stage anti-coarsening map;
the father vertex determining sub-module is used for taking each vertex in the first-stage reverse coarsening graph as a father vertex;
a sub-vertex determining sub-module, configured to take each vertex in the second-stage inverse coarsening graph as a sub-vertex;
And the first vertex state setting sub-module is used for setting the state of each sub-vertex to be an unlabeled state.
Optionally, the inverse coarsening graph chain obtaining submodule includes:
The corresponding vertex determining sub-module is used for determining corresponding child vertices of each father vertex in the first-stage reverse coarsening diagram in the second-stage reverse coarsening diagram from the first-stage reverse coarsening diagram;
A migration cost determining submodule, configured to determine a migration cost for each child vertex migrating from the corresponding parent vertex to any one of the parent vertices, where a migration destination of the child vertex does not include the parent vertex where the child vertex is originally located;
A swap cost determination submodule, configured to determine, according to the migration cost, a swap cost of any unmarked child vertex pair having a different parent vertex in the second-stage anti-coarsening graph;
A exchange cost ordering result determining sub-module, configured to order exchange costs of all the child vertex pairs with different parent vertices in the second-stage anti-coarsening graph, so as to obtain an exchange cost ordering result;
The maximum positive benefit determining sub-module is used for determining the sub-vertex pair with the maximum positive benefit according to the exchange cost ordering result;
a vertex exchange sub-module for exchanging the pair of child vertices for the parent vertex in the first level reverse coarsening graph;
A second vertex state setting sub-module configured to set a state of each of the sub-vertices in the pair of sub-vertices to a labeling state;
an anti-coarsening processing completion sub-module, configured to complete the anti-coarsening treatment of the second-stage anti-coarsening graph when the states of all the child vertices are the marked states;
And the anti-coarsening diagram chain processing completion submodule is used for iteratively generating anti-coarsening diagrams, after the anti-coarsening diagrams of the preset series are obtained, the anti-coarsening diagram chain is formed by all the obtained anti-coarsening diagrams, and each anti-coarsening diagram in the anti-coarsening diagram chain is arranged according to the generation sequence.
Optionally, the apparatus further comprises:
and the second matching matrix construction submodule is used for generating a matching matrix between the first-stage reverse coarsening diagram and the second-stage reverse coarsening diagram according to the exchange record of the child vertex pair.
Optionally, the neural network building module includes:
The map Haar wavelet transformation matrix construction submodule is used for processing the coarsened map chain and the inverse coarsened map chain according to a preset map Haar wavelet transformation matrix construction rule to obtain a map Haar wavelet transformation matrix corresponding to each coarsened map and a map Haar wavelet transformation matrix corresponding to each inverse coarsened map;
And the neural network construction submodule is used for building the multi-scale graph Haar wavelet convolution neural network according to the graph Haar wavelet transformation matrix corresponding to the coarsening graph and the graph Haar wavelet transformation matrix corresponding to the anti-coarsening graph.
Optionally, the neural network building sub-module includes:
An input layer construction submodule for constructing an input layer of the multi-scale image Haar wavelet convolution neural network;
The first graph Haar wavelet convolution layer construction submodule is used for constructing a plurality of graph Haar wavelet convolution layers corresponding to the coarsened graph chains according to the graph Haar wavelet transformation matrix corresponding to the coarsened graph;
The second graph Haar wavelet convolution layer construction submodule is used for constructing a plurality of graph Haar wavelet convolution layers corresponding to the anti-coarsening graph chains according to the graph Haar wavelet transformation matrix corresponding to the anti-coarsening graph;
and the output layer construction submodule is used for constructing an output layer of the multi-scale image Haar wavelet convolution neural network.
Optionally, the apparatus further comprises:
The first convolution operation submodule is used for executing graph Haar wavelet convolution operation on each coarsened graph in the coarsened graph chain through a plurality of graph Haar wavelet convolution layers;
And the first pooling operation submodule is used for executing average pooling operation on each coarsening graph in the coarsening graph chain through a plurality of graph Haar wavelet convolution layers.
Optionally, the apparatus further comprises:
A second pooling operation sub-module, configured to perform an average pooling operation on each of the anti-coarsening graphs in the anti-coarsening graph chain through a plurality of graph Haar wavelet convolution layers;
and the second convolution operation submodule is used for executing graph Haar wavelet convolution operation on each anti-coarsening graph in the anti-coarsening graph chain through a plurality of graph Haar wavelet convolution layers.
Optionally, the network training module includes:
The data input submodule is used for inputting the adjacency matrix, the vertex characteristic matrix and the vertex label matrix into the multi-scale graph Haar wavelet convolutional neural network;
the parameter adjustment sub-module is used for adjusting parameters of the multi-scale image Haar wavelet convolutional neural network according to a preset loss function;
and the vertex tag matrix output sub-module is used for stopping training and outputting a predicted vertex tag matrix when the prediction error of the multi-scale image Haar wavelet convolutional neural network meets a preset error threshold value.
Optionally, the apparatus further comprises:
a loss function determination submodule for setting the loss function based on cross entropy;
and the parameter initialization sub-module is used for initializing network parameters of each layer of the multi-scale image Haar wavelet convolutional neural network.
Based on the same inventive concept, another embodiment of the present application provides a readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the document classification method according to any of the above embodiments of the present application.
Based on the same inventive concept, another embodiment of the present application provides an electronic device, and referring to fig. 6, fig. 6 is a schematic diagram of an electronic device 600 according to an embodiment of the present application, including a memory 602, a processor 601, and a computer program stored in the memory and capable of running on the processor, where the processor executes the steps in the document classification method according to any one of the foregoing embodiments of the present application.
For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described by differences from other embodiments, and identical and similar parts between the embodiments are all enough to be referred to each other.
It will be apparent to those skilled in the art that embodiments of the present application may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the application may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the application.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or terminal device that comprises the element.
The above detailed description of the document classification method, apparatus, device and storage medium provided by the present application applies specific examples to illustrate the principles and embodiments of the present application, and the above examples are only used to help understand the method and core idea of the present application; meanwhile, as those skilled in the art will have variations in the specific embodiments and application scope in accordance with the ideas of the present application, the present description should not be construed as limiting the present application in view of the above.

Claims (20)

1. A method of document classification, the method comprising:
Receiving a literature relationship graph, an adjacent matrix corresponding to the literature relationship graph, a vertex characteristic matrix corresponding to the literature relationship graph and a vertex label matrix corresponding to the literature relationship graph;
coarsening the literature relation graph by using a coarsening algorithm to obtain a coarsened graph chain corresponding to the literature relation graph;
Performing anti-roughening treatment on the coarsened drawing chain by using an anti-roughening algorithm to obtain an anti-coarsened drawing chain;
establishing a multi-scale graph Haar wavelet convolutional neural network according to the coarsening graph chain and the anti-coarsening graph chain;
Training the multi-scale graph Haar wavelet convolutional neural network through the adjacency matrix, the vertex feature matrix and the vertex label matrix;
and obtaining a literature classification result corresponding to the literature relation graph under the condition that the multi-scale graph Haar wavelet convolutional neural network is trained.
2. The method of claim 1, wherein prior to receiving a literature relationship graph, an adjacency matrix corresponding to the literature relationship graph, a vertex feature matrix corresponding to the literature relationship graph, and a vertex label matrix corresponding to the literature relationship graph, the method further comprises:
obtaining literature data from a literature database;
constructing the literature relation graph by taking each literature in the literature data as a vertex and taking the reference relation among the literature as a side among the vertices;
Obtaining the vertex feature matrix according to the feature of each vertex in the literature relation graph;
obtaining the adjacency matrix according to the connection relation between the vertexes in the literature relation graph;
and obtaining the vertex label matrix according to the labels marked with the vertices in the literature relation graph.
3. The method of claim 1, wherein the roughening the document relation graph using a roughening algorithm to obtain a roughened graph chain corresponding to the document relation graph comprises:
Initializing the literature relation graph to obtain a first-stage coarsening graph;
Performing module density iterative optimization on the first-stage coarsening graph to obtain a second-stage coarsening graph;
and generating a plurality of coarsening graphs step by step, after the coarsening graphs of a preset level are obtained, forming a coarsening graph chain by all the coarsening graphs, wherein each coarsening graph in the coarsening graph chain is arranged according to the generation sequence.
4. The method of claim 3, wherein initializing the literature relationship graph to obtain a first level coarsening graph comprises:
Taking each vertex in the literature relationship graph as a community;
setting a community state of each community to be unlabeled;
Adding the communities with the untagged community states into an untagged community set to obtain the first-stage coarsening graph.
5. A method according to claim 3, wherein performing module density iterative optimization on the first-stage coarsening map to obtain a second-stage coarsening map comprises:
Determining all neighbor communities corresponding to each untagged community in the first-level coarsening graph;
For each untagged community, forming an adjacent community pair by the community and each neighbor community respectively, so as to obtain a plurality of adjacent community pairs;
determining module density gains of communities in each adjacent community pair after the communities are combined;
determining the adjacent community pairs with the maximum module density gain corresponding to the adjacent community pairs;
Executing community merging operation on the communities in the adjacent community pair under the condition that the module density gain corresponding to the adjacent community pair is larger than a preset gain threshold;
Setting the community state of the neighbor communities to be marked under the condition that the community state of the neighbor communities in the adjacent community pair is unmarked;
deleting the neighbor communities from the unmarked community set;
Performing module density optimization in an iteration mode, and stopping iteration under the condition that community division density is not increased any more;
taking each community in the first-level coarsening graph after community combination as a super vertex;
and adding a connecting edge between each super vertex according to the corresponding connection relation between each super vertex to obtain the second-stage coarsening graph.
6. The method of claim 5, wherein prior to merging each of the communities in the first level coarsening graph as a super vertex, the method further comprises:
Under the condition that the unmarked communities exist in the unmarked community set, executing community merging operation on the unmarked communities through a preset merging rule.
7. A method according to claim 3, characterized in that the method further comprises:
Constructing a matching matrix between the coarsening graphs of two adjacent stages;
and under the condition that a new coarsening diagram is obtained, constructing the adjacent matrix corresponding to the coarsening diagram according to the connection relation between each vertex in the coarsening diagram.
8. The method of claim 1, wherein the performing the anti-roughening treatment on the roughened graph chain using an anti-roughening algorithm to obtain an anti-roughened graph chain comprises:
Initializing the coarsened graph chain;
and performing anti-roughening treatment on the coarsened graph chain after the initialization treatment by using the anti-roughening algorithm to obtain the anti-coarsened graph chain.
9. The method of claim 8, wherein initializing the coarsened graph chain comprises:
taking the coarsened drawing of the highest level in the coarsened drawing chain as a first-level anti-coarsened drawing in the anti-coarsened drawing chain;
taking the anti-coarsening diagram adjacent to the first-stage anti-coarsening diagram as a second-stage anti-coarsening diagram;
taking each vertex in the first-stage reverse coarsening graph as a father vertex;
taking each vertex in the second-stage anti-coarsening graph as a child vertex;
the state of each of the child vertices is set to an unlabeled state.
10. The method of claim 8, wherein the performing, using the anti-coarsening algorithm, the anti-coarsening process on the coarsened graph chain after the initializing process to obtain the anti-coarsened graph chain includes:
Starting from a first-stage reverse coarsening diagram, determining a corresponding child vertex of each father vertex in the first-stage reverse coarsening diagram in a second-stage reverse coarsening diagram;
determining migration cost of each child vertex from the corresponding parent vertex to any one of the parent vertices, wherein the migration destination of the child vertex does not comprise the parent vertex where the child vertex is originally located;
Determining the exchange cost of any unlabeled child vertex pair with different parent vertices in the second-stage reverse coarsening graph according to the migration cost;
ordering the exchange costs of all the child vertex pairs with different father vertices in the second-stage anti-coarsening graph to obtain an exchange cost ordering result;
Determining the sub-vertex pair with the largest positive benefit according to the exchange cost ordering result;
Exchanging the pair of child vertices for the parent vertex in the first level anti-coarsening graph;
setting the state of each of the child vertices in the pair of child vertices to a labeled state;
finishing the anti-roughening treatment of the second-stage anti-roughening graph under the condition that the states of all the sub-vertexes are the marked states;
And iteratively generating a reverse coarsening diagram, wherein after the reverse coarsening diagram of a preset level is obtained, the reverse coarsening diagram chain is formed by all the obtained reverse coarsening diagrams, and each reverse coarsening diagram in the reverse coarsening diagram chain is arranged according to the generation sequence.
11. The method according to claim 10, wherein when the anti-roughening processing of the second-stage anti-roughening map is completed in the case where the states of all the child vertices are the labeled states, the method further comprises:
And generating a matching matrix between the first-stage reverse coarsening diagram and the second-stage reverse coarsening diagram according to the exchange record of the child vertex pairs.
12. The method of claim 1, wherein establishing the multi-scale graph Haar wavelet convolutional neural network from the coarsened graph chain and the anti-coarsened graph chain comprises:
processing the coarsened graph chain and the anti-coarsened graph chain according to a preset graph Haar wavelet transform matrix construction rule to obtain a graph Haar wavelet transform matrix for each coarsened graph and a graph Haar wavelet transform matrix for each anti-coarsened graph;
and establishing the multi-scale graph Haar wavelet convolutional neural network from the graph Haar wavelet transform matrices of the coarsened graphs and the graph Haar wavelet transform matrices of the anti-coarsened graphs.
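The claims leave the "preset graph Haar wavelet transform matrix construction rule" unspecified. As a point of reference, a textbook-style one-level graph-Haar-like orthonormal basis can be built from a cluster assignment as sketched below; this is an illustrative construction, not the patented rule.

```python
import numpy as np

def haar_like_basis(assignment):
    """Sketch of a one-level graph-Haar-like orthonormal basis.

    assignment[i] gives the cluster (super-vertex) of vertex i. Each cluster
    contributes one scaling (low-pass) vector and size-1 difference
    (high-pass) vectors, in the usual Haar fashion.
    """
    n = len(assignment)
    clusters = {}
    for i, c in enumerate(assignment):
        clusters.setdefault(c, []).append(i)
    columns = []
    for members in clusters.values():
        k = len(members)
        scaling = np.zeros(n)
        scaling[members] = 1.0 / np.sqrt(k)      # low-pass (scaling) vector
        columns.append(scaling)
        for j in range(1, k):                    # high-pass (wavelet) vectors
            vec = np.zeros(n)
            vec[members[:j]] = 1.0
            vec[members[j]] = -float(j)
            columns.append(vec / np.sqrt(j * (j + 1)))
    return np.stack(columns, axis=1)             # columns are orthonormal
```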
13. The method of claim 12, wherein establishing the multi-scale graph Haar wavelet convolutional neural network from the graph Haar wavelet transform matrices of the coarsened graphs and the anti-coarsened graphs comprises:
constructing an input layer of the multi-scale graph Haar wavelet convolutional neural network;
constructing, from the graph Haar wavelet transform matrices of the coarsened graphs, a plurality of graph Haar wavelet convolution layers corresponding to the coarsened graph chain;
constructing, from the graph Haar wavelet transform matrices of the anti-coarsened graphs, a plurality of graph Haar wavelet convolution layers corresponding to the anti-coarsened graph chain;
and constructing an output layer of the multi-scale graph Haar wavelet convolutional neural network.
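Putting claims 12 and 13 together, a minimal PyTorch skeleton of such a network might look as follows. The spectral filtering form, layer widths, and the `pools`/`unpools` matrices passed to `forward` (all torch tensors) are assumptions for illustration, not the patent's architecture.

```python
import torch
import torch.nn as nn

class HaarConv(nn.Module):
    """One graph Haar wavelet convolution layer: transform, filter, transform back.

    Sketch of the usual spectral form U diag(theta) U^T X W, with U an
    orthonormal graph Haar basis; the patented layer may differ in detail.
    """
    def __init__(self, u, in_dim, out_dim):
        super().__init__()
        self.register_buffer("u", u)                       # n x n Haar basis
        self.theta = nn.Parameter(torch.ones(u.shape[1]))  # spectral filter
        self.weight = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x):
        x_hat = self.u.t() @ x                    # forward Haar transform
        x_hat = self.theta.unsqueeze(1) * x_hat   # filter in the Haar domain
        return torch.relu(self.weight(self.u @ x_hat))

class MultiScaleHaarNet(nn.Module):
    """Encoder-decoder sketch: convolve and pool down the coarsened chain,
    then unpool and convolve back up the anti-coarsened chain, then classify."""
    def __init__(self, bases_down, bases_up, in_dim, hidden, n_classes):
        super().__init__()
        dims = [in_dim] + [hidden] * len(bases_down)
        self.down = nn.ModuleList(
            [HaarConv(u, dims[i], dims[i + 1]) for i, u in enumerate(bases_down)])
        self.up = nn.ModuleList([HaarConv(u, hidden, hidden) for u in bases_up])
        self.out = nn.Linear(hidden, n_classes)   # output layer

    def forward(self, x, pools, unpools):
        for conv, p in zip(self.down, pools):     # coarsening side
            x = p @ conv(x)                       # p: average-pooling matrix
        for conv, q in zip(self.up, unpools):     # anti-coarsening side
            x = conv(q @ x)                       # q: unpooling matrix
        return self.out(x)
```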
14. The method of claim 13, wherein, when constructing the plurality of graph Haar wavelet convolution layers corresponding to the coarsened graph chain from the graph Haar wavelet transform matrices of the coarsened graphs, the method further comprises:
performing a graph Haar wavelet convolution operation on each coarsened graph in the coarsened graph chain through the plurality of graph Haar wavelet convolution layers;
and performing an average pooling operation on each coarsened graph in the coarsened graph chain through the plurality of graph Haar wavelet convolution layers.
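The average pooling operation of claim 14 can be realized as a row-normalized membership matrix applied after each convolution, as in this sketch (same assumed assignment convention as above); usage would be `x_coarse = average_pooling_matrix(assignment) @ x_fine`.

```python
import numpy as np

def average_pooling_matrix(assignment):
    """Sketch of an average-pooling matrix for the coarsening side: row c
    averages the features of the fine vertices merged into super-vertex c."""
    n_fine = len(assignment)
    n_coarse = int(np.max(assignment)) + 1
    p = np.zeros((n_coarse, n_fine))
    for i, c in enumerate(assignment):
        p[c, i] = 1.0
    return p / p.sum(axis=1, keepdims=True)   # each row averages its cluster
```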
15. The method of claim 13, wherein, when constructing the plurality of graph Haar wavelet convolution layers corresponding to the anti-coarsened graph chain from the graph Haar wavelet transform matrices of the anti-coarsened graphs, the method further comprises:
performing an anti-pooling operation on each anti-coarsened graph in the anti-coarsened graph chain through the plurality of graph Haar wavelet convolution layers;
and performing a graph Haar wavelet convolution operation on each anti-coarsened graph in the anti-coarsened graph chain through the plurality of graph Haar wavelet convolution layers.
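Conversely, the anti-pooling operation of claim 15 can be sketched as copying each parent vertex's features down to its children via the transpose of the matching matrix from claim 11 (0/1 convention assumed):

```python
def unpool_features(matching, x_parents):
    """Sketch of anti-pooling: each child vertex inherits its parent's
    feature vector. `matching` is the parents-by-children 0/1 matrix, so
    its transpose copies parent rows onto child rows."""
    return matching.T @ x_parents
```

A call such as `x_children = unpool_features(m, x_parents)` would then feed the next graph Haar wavelet convolution layer on the anti-coarsening side.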
16. The method of claim 1, wherein training the multi-scale graph Haar wavelet convolutional neural network through the adjacency matrix, the vertex feature matrix and the vertex label matrix comprises:
inputting the adjacency matrix, the vertex feature matrix and the vertex label matrix into the multi-scale graph Haar wavelet convolutional neural network;
adjusting the parameters of the multi-scale graph Haar wavelet convolutional neural network according to a preset loss function;
and stopping training and outputting the predicted vertex label matrix when the prediction error of the multi-scale graph Haar wavelet convolutional neural network meets a preset error threshold.
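A minimal training-loop sketch for claim 16 follows; the optimizer choice, learning rate, epoch budget, and concrete threshold value are assumptions, since the claims only name a preset loss function and a preset error threshold.

```python
import torch
import torch.nn.functional as F

def train(model, features, labels, pools, unpools, threshold=1e-3, epochs=500):
    """Train until the loss (standing in for the prediction error) meets the threshold."""
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
    for _ in range(epochs):
        optimizer.zero_grad()
        logits = model(features, pools, unpools)
        loss = F.cross_entropy(logits, labels)   # preset loss function
        loss.backward()
        optimizer.step()
        if loss.item() < threshold:              # error meets the preset threshold
            break
    return logits.argmax(dim=1)                  # predicted vertex labels
```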
17. The method of claim 16, wherein, before the adjacency matrix, the vertex feature matrix and the vertex label matrix are input into the multi-scale graph Haar wavelet convolutional neural network, the method further comprises:
setting the loss function based on cross entropy;
and initializing the network parameters of each layer of the multi-scale graph Haar wavelet convolutional neural network.
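The pre-training setup of claim 17 amounts to choosing a cross-entropy criterion and initializing each layer's parameters; the Xavier scheme below is an assumed choice, as the claims do not fix one.

```python
import torch.nn as nn

def prepare_for_training(model):
    """Sketch: cross-entropy loss plus layer-wise parameter initialization."""
    criterion = nn.CrossEntropyLoss()
    for module in model.modules():
        if isinstance(module, nn.Linear):
            nn.init.xavier_uniform_(module.weight)   # weight matrices
            if module.bias is not None:
                nn.init.zeros_(module.bias)          # biases
    return criterion
```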
18. A document classification device, the device comprising:
a data receiving module, configured to receive a literature relationship graph and the adjacency matrix, the vertex feature matrix and the vertex label matrix corresponding to the literature relationship graph;
a coarsening module, configured to coarsen the literature relationship graph using a coarsening algorithm to obtain the coarsened graph chain corresponding to the literature relationship graph;
an anti-coarsening module, configured to perform anti-coarsening processing on the coarsened graph chain using an anti-coarsening algorithm to obtain the anti-coarsened graph chain;
a neural network building module, configured to establish a multi-scale graph Haar wavelet convolutional neural network from the coarsened graph chain and the anti-coarsened graph chain;
a neural network training module, configured to train the multi-scale graph Haar wavelet convolutional neural network through the adjacency matrix, the vertex feature matrix and the vertex label matrix;
and a document classification result acquisition module, configured to obtain the document classification result corresponding to the literature relationship graph once the multi-scale graph Haar wavelet convolutional neural network has been trained.
19. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 17.
20. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 17.
CN202410225858.1A 2024-02-29 2024-02-29 Document classification method, device, equipment and storage medium Active CN117828090B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410225858.1A CN117828090B (en) 2024-02-29 2024-02-29 Document classification method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN117828090A (en) 2024-04-05
CN117828090B (en) 2024-05-03

Family

ID=90523182

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410225858.1A Active CN117828090B (en) 2024-02-29 2024-02-29 Document classification method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117828090B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112966114A (en) * 2021-04-10 2021-06-15 北京工商大学 Document classification method and device based on symmetric graph convolutional neural network
CN116431816A (en) * 2023-06-13 2023-07-14 浪潮电子信息产业股份有限公司 Document classification method, apparatus, device and computer readable storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101324937B (en) * 2007-06-15 2015-05-20 国际商业机器公司 System and method for roughening picture

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant