CN116263783A - Text classification method, device, equipment and storage medium - Google Patents

Text classification method, device, equipment and storage medium Download PDF

Info

Publication number
CN116263783A
CN116263783A (Application No. CN202111506316.4A)
Authority
CN
China
Prior art keywords
text
node
nodes
word
feature vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111506316.4A
Other languages
Chinese (zh)
Inventor
丁辰晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Suzhou Software Technology Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN202111506316.4A priority Critical patent/CN116263783A/en
Publication of CN116263783A publication Critical patent/CN116263783A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F 16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a text classification method, device, equipment and storage medium, wherein the method comprises the following steps: acquiring text data; determining feature vectors of document nodes, concept nodes and word nodes corresponding to the text data based on the text data; constructing a text heterogeneous graph based on the feature vectors of the document nodes, concept nodes and word nodes; determining the weights of the edges between nodes in the text heterogeneous graph; obtaining a text feature vector corresponding to the text data based on the text heterogeneous graph; and classifying the text feature vector by using a classification function to determine the text category. In this way, prior knowledge in the text is obtained by acquiring the feature vectors of the concept nodes; because concept nodes are fused into the text heterogeneous graph when it is constructed, the feature sparsity problem caused by short texts lacking context can be alleviated to a certain extent, so that the text feature vector extracted based on the text heterogeneous graph represents the features of the text more accurately, which further improves the accuracy of text classification.

Description

Text classification method, device, equipment and storage medium
Technical Field
The present invention relates to the field of data processing, and in particular, to a text classification method, apparatus, device, and storage medium.
Background
In the big data era, a large amount of short text appears on the network. Because these short texts are short, lack context information, and contain a large amount of colloquial noise, how to accurately extract text features and classify short texts with a suitable classification model is an important problem. When prior-art text classification methods are used for text classification, the semantic relations among short texts cannot be fully captured, which leads to problems such as low classification accuracy.
Disclosure of Invention
In order to solve the above technical problems, an embodiment of the present application is expected to provide a text classification method, apparatus, device and storage medium.
The technical scheme of the application is realized as follows:
in a first aspect, a text classification method is provided, the method comprising:
acquiring text data;
determining feature vectors of document nodes, feature vectors of concept nodes and feature vectors of word nodes corresponding to the text data based on the text data;
constructing a text heterogeneous graph based on the feature vectors of the document nodes, the feature vectors of the concept nodes and the feature vectors of the word nodes;
determining the weights of edges between nodes in the text heterogeneous graph;
obtaining a text feature vector corresponding to the text data based on the text heterogeneous graph;
and classifying the text feature vector by using a classification function to determine the text category.
In the above solution, the obtaining, based on the text heterogeneous graph, a text feature vector corresponding to the text data includes: determining at least one type attention weight of each node based on the text heterogeneous graph, where the type attention weight is a document-type attention weight, a concept-type attention weight or a word-type attention weight; determining an inter-node attention weight between each node and its neighboring nodes based on the at least one type attention weight, the feature vector of each node, and the feature vectors of the at least one type of neighboring nodes; and determining the text feature vector based on the inter-node attention weights between all nodes and their neighboring nodes and the feature vectors of all nodes.
In the above solution, the determining at least one type attention weight of each node based on the text heterogeneous graph includes: calculating the sum of the feature vectors of the i-th type neighboring nodes of the k-th node in the text heterogeneous graph to obtain the i-th type feature vector of the k-th node, where the i-th type is any one of the document type, the concept type or the word type; and determining the i-th type attention weight of the k-th node based on the feature vector of the k-th node, the feature vectors of the i-th type neighboring nodes and the i-th type feature vector.
In the above solution, the determining the text feature vector based on the inter-node attention weights between all nodes and their neighboring nodes and the feature vectors of all nodes includes: inputting the inter-node attention weights between all nodes and their neighboring nodes and the feature vectors of all nodes into a heterogeneous graph convolutional network to obtain the text feature vector corresponding to the text data, where the heterogeneous graph convolutional network is constructed based on a multi-head attention mechanism.
In the above solution, the determining, based on the text data, the feature vector of the document node, the feature vector of the concept node and the feature vector of the word node corresponding to the text data includes: calculating the term frequency-inverse document frequency (TF-IDF) vector of the text data as the feature vector of the document node; acquiring a concept set of the text data based on a concept graph; mapping the concepts in the concept set into feature vectors based on a word vector model to obtain the feature vectors of the concept nodes; and acquiring the one-hot vectors of the word nodes in the text data from a preset vocabulary as the feature vectors of the word nodes.
In the above solution, the determining the weights of the edges between the nodes in the text heterogeneous graph includes: acquiring, based on a concept graph, at least one concept node corresponding to a document node and a relevance value between the document node and the at least one concept node; determining the weights of the edges between the document node and the at least one concept node based on the relevance values; determining the weights of the edges between the document node and at least one word node based on the term frequency-inverse document frequency (TF-IDF) algorithm; and determining the weights of the edges between word nodes based on the pointwise mutual information between words.
In the above solution, the acquiring text data includes: acquiring original text data; and preprocessing the original text data to obtain the text data, where the preprocessing includes at least one of: noise information removal, word segmentation and stop-word removal.
In a second aspect, there is provided a text classification apparatus, the apparatus comprising:
the acquisition module is used for acquiring text data;
the processing module is used for determining the feature vector of the document node, the feature vector of the concept node and the feature vector of the word node corresponding to the text data based on the text data;
The processing module is further used for constructing a text iso-graph based on the feature vector of the document node, the feature vector of the concept node and the feature vector of the word node;
the processing module is further used for determining the weight of the edges between the nodes in the text iso-graph;
the processing module is further used for obtaining text feature vectors corresponding to the text data based on the text iso-graph;
the processing module is further used for classifying the text feature vectors by using a classification function to determine text categories.
In a third aspect, there is provided a text classification apparatus, the apparatus comprising: a processor and a memory configured to store a computer program capable of running on the processor, wherein the processor is configured to perform the steps of any of the preceding methods when the computer program is run.
In a fourth aspect, a computer storage medium is provided, on which a computer program is stored, wherein the computer program, when being executed by a processor, carries out the steps of the aforementioned method.
The application discloses a text classification method, device, equipment and storage medium. The method obtains prior knowledge in the text by acquiring the feature vectors of concept nodes; because concept nodes are fused into the text heterogeneous graph when it is constructed, the feature sparsity problem caused by short texts lacking context can be alleviated to a certain extent, so that the text feature vector extracted based on the text heterogeneous graph represents the features of the text more accurately, which further improves the accuracy of text classification.
Drawings
FIG. 1 is a schematic flow chart of a text classification method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a text heterogeneous graph set in an embodiment of the present application;
FIG. 3 is a schematic flow chart of determining text feature vectors according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a text heterogeneous graph aggregation calculation method according to an embodiment of the present application;
fig. 5 is a schematic diagram of a composition structure of a text classification device according to an embodiment of the present application;
fig. 6 is a schematic diagram of a composition structure of a text classification apparatus according to an embodiment of the present application.
Detailed Description
For a more complete understanding of the features and technical content of the embodiments of the present application, reference should be made to the following detailed description of the embodiments of the present application, taken in conjunction with the accompanying drawings, which are for purposes of illustration only and not intended to limit the embodiments of the present application.
Fig. 1 is a schematic flow chart of a text classification method according to an embodiment of the present application. As shown in fig. 1, the text classification method specifically may include:
step 101: acquiring text data;
here, the text data may be understood as text data corresponding to a text to be classified (also referred to as "document" in the embodiment of the present application), and if multiple texts need to be classified, the multiple text data may form a text data set, where the acquiring text data includes: text data of a text to be classified is obtained from a text data set.
Illustratively, in some embodiments, the acquiring text data includes: acquiring original text data; and preprocessing the original text data to obtain the text data, where the preprocessing includes at least one of: noise information removal, word segmentation and stop-word removal.
Here, the original text data may be understood as the original data corresponding to the text to be classified. The preprocessing simplifies the original text data while retaining the relevant terms in it, which improves the efficiency of text classification.
Step 102: determining feature vectors of document nodes, feature vectors of concept nodes and feature vectors of word nodes corresponding to the text data based on the text data;
illustratively, in some embodiments, the determining, based on the text data, the feature vector of the document node, the feature vector of the concept node and the feature vector of the word node corresponding to the text data includes: calculating the term frequency-inverse document frequency (TF-IDF) vector of the text data as the feature vector of the document node; acquiring a concept set of the text data based on a concept graph; mapping the concepts in the concept set into feature vectors based on a word vector model to obtain the feature vectors of the concept nodes; and acquiring the one-hot vectors of the word nodes in the text data from a preset vocabulary as the feature vectors of the word nodes.
For example, in practical application, the TF-IDF vector of a document to be classified in the text data is calculated and used as the feature vector of its document node.
For example, the acquiring the concept set of the text data based on the concept graph may be: calling an API of the concept graph, inputting the text data, and obtaining the concept set corresponding to the text data. Illustratively, inputting the text data "Eason Chan was born in Hong Kong" into the concept graph yields the concept set {top chinese entertainer, singer, place, asian city}.
For example, in practical application, the preset vocabulary includes all words in the corpus and the one-hot vectors corresponding to the words. The one-hot vector of a word node in the text data can be obtained directly from the preset vocabulary and used as the feature vector of the word node.
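The following Python sketch illustrates one way the three kinds of node feature vectors described above could be assembled. It is a minimal sketch, not the patent's reference implementation: the toy corpus, the concept list and the `pretrained_vectors` lookup table are illustrative assumptions standing in for the concept graph API and real pre-trained word vectors.

```python
# Minimal sketch: building document, word and concept node features.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = ["eason chan was born in hong kong",
          "the stock market fell sharply today"]

# 1) Document nodes: TF-IDF vector of each document.
vectorizer = TfidfVectorizer()
doc_features = vectorizer.fit_transform(corpus).toarray()   # shape: (num_docs, vocab_size)

# 2) Word nodes: one-hot vectors taken from a preset vocabulary.
vocab = vectorizer.get_feature_names_out()
word_features = np.eye(len(vocab))                           # row i is the one-hot of word i

# 3) Concept nodes: concepts returned by a concept graph, mapped through
#    pre-trained word vectors (here a toy lookup table, for illustration only).
pretrained_vectors = {"singer": np.random.rand(50), "place": np.random.rand(50)}
concepts = ["singer", "place"]
concept_features = np.stack([pretrained_vectors[c] for c in concepts])

print(doc_features.shape, word_features.shape, concept_features.shape)
```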
Step 103: constructing a text iso-graph based on the feature vector of the document node, the feature vector of the concept node and the feature vector of the word node;
here, by acquiring the feature vector of the concept node corresponding to the text, a priori knowledge in the text is obtained; by constructing the text iso-composition based on the feature vectors of the document nodes, the word nodes and the concept nodes, the feature sparseness problem caused by short text lack context can be relieved to a certain extent, so that the text feature vector extracted based on the text iso-composition can more accurately represent the features of the text, and further the accuracy of text classification is improved.
Here, when text of a document is included in the text data, all document nodes, concept nodes, and word nodes in the document are included in the text iso-graph, and the document is classified.
Illustratively, in some embodiments, the method further comprises: a set of text iso-composition containing a plurality of texts is constructed, which set can be used to categorize the plurality of texts.
Fig. 2 is a schematic diagram of a text iso-composition set according to an embodiment of the present application. As shown in fig. 2, the text iso-composition set includes at least three text-corresponding text iso-compositions. The text iso-composition of each text includes: document nodes, concept nodes, and word nodes.
Step 104: determining the weights of the edges between the nodes in the text heterogeneous graph;
illustratively, in some embodiments, the determining the weights of the edges between the nodes in the text heterogeneous graph includes: acquiring, based on a concept graph, at least one concept node corresponding to a document node and a relevance value between the document node and the at least one concept node; determining the weights of the edges between the document node and the at least one concept node based on the relevance values; determining the weights of the edges between the document node and at least one word node based on the term frequency-inverse document frequency (TF-IDF) algorithm; and determining the weights of the edges between word nodes based on the pointwise mutual information between words.
Illustratively, inputting the text data "Eason Chan was born in Hong Kong" into the concept graph yields a set of concepts and relevance values {<top chinese entertainer, 0.6>, <singer, 0.4>, <place, 0.2>, <asian city, 0.1>}. The relevance value between each concept and the text data is taken as the weight of the edge between the document node and the corresponding concept node. For example, the weight of the edge between the concept node "top chinese entertainer" and the document node corresponding to the text data "Eason Chan was born in Hong Kong" is 0.6.
Illustratively, in some embodiments, the determining the weights of the edges between the document node and at least one word node based on the term frequency-inverse document frequency (TF-IDF) algorithm comprises: calculating the TF-IDF value between a word and the document corresponding to the text data; and taking the TF-IDF value as the weight of the edge between the document node and the word node.
Specifically, the TF-IDF value between a word and the document corresponding to the text data is calculated as follows: the ratio of the number of times the word appears in the document to the total number of words in the document is calculated as the term frequency TF; the inverse document frequency IDF is calculated as

IDF = log(N / n)

where N is the total number of texts in the text data set and n is the number of documents in the text data set that contain the word; the product of TF and IDF is taken as the TF-IDF value.
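As a concrete illustration of this calculation, the short sketch below computes the TF-IDF weight of one word with respect to one document, following the description above; the toy corpus is an assumption made only for the example.

```python
import math

def tf_idf(word, document, corpus):
    """TF-IDF of `word` for `document`:
    TF  = count of word in document / total words in document,
    IDF = log(total number of texts / number of documents containing the word)."""
    words = document.split()
    tf = words.count(word) / len(words)
    n_containing = sum(1 for doc in corpus if word in doc.split())
    idf = math.log(len(corpus) / n_containing) if n_containing else 0.0
    return tf * idf

corpus = ["eason chan was born in hong kong",
          "hong kong is an asian city",
          "the singer released a new song"]
print(tf_idf("hong", corpus[0], corpus))   # weight of the edge between the document node and the word node "hong"
```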
Illustratively, determining the weights of the edges between word nodes based on the pointwise mutual information between words includes: acquiring the pointwise mutual information between the words in the text data from a word pointwise-mutual-information library; establishing an edge between two words whose pointwise mutual information is positive; and using the pointwise mutual information between the two words as the weight of the edge between the corresponding word nodes.
Here, the word pointwise-mutual-information library is determined based on a corpus and contains the pointwise mutual information between the words of all documents in the corpus. Illustratively, to utilize the global co-occurrence information of words, co-occurrence statistics are collected by sliding a fixed-size window over all documents in the corpus; pointwise mutual information (PMI) is used to measure the relevance between words and serves as the weight between two word nodes. The PMI is calculated as follows:

PMI(i, j) = log( p(i, j) / (p(i) · p(j)) )
p(i, j) = #W(i, j) / #W
p(i) = #W(i) / #W

where PMI(i, j) is the pointwise mutual information between words i and j, #W(i) is the number of sliding windows in the corpus that contain word i, #W(i, j) is the number of sliding windows that contain both words i and j, and #W is the total number of sliding windows in the corpus. When the PMI is positive, the semantic association between the words in the corpus is high; when the PMI is negative, the semantic association between the words in the corpus is low or absent.
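A minimal sketch of the sliding-window PMI statistics described above is shown below; the window size and the toy corpus are assumptions made for illustration, and, as in the text, only word pairs with positive PMI produce an edge.

```python
import math
from itertools import combinations
from collections import Counter

def pmi_weights(corpus, window_size=3):
    """Collect #W, #W(i) and #W(i, j) with a fixed-size sliding window,
    then return PMI(i, j) for every co-occurring word pair with positive PMI."""
    single, pair, total = Counter(), Counter(), 0
    for doc in corpus:
        tokens = doc.split()
        for start in range(max(1, len(tokens) - window_size + 1)):
            window = set(tokens[start:start + window_size])
            total += 1
            single.update(window)
            pair.update(frozenset(p) for p in combinations(sorted(window), 2))
    weights = {}
    for key, n_ij in pair.items():
        i, j = tuple(key)
        pmi = math.log((n_ij / total) / ((single[i] / total) * (single[j] / total)))
        if pmi > 0:                       # only positive PMI produces a word-word edge
            weights[(i, j)] = pmi
    return weights

corpus = ["eason chan was born in hong kong", "hong kong is an asian city"]
print(pmi_weights(corpus))
```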
Step 105: obtaining a text feature vector corresponding to the text data based on the text heterogeneous graph;
here, the text heterogeneous graph includes the document nodes, concept nodes and word nodes, together with the weights of the edges between the nodes. The text feature vector corresponding to the text data is a feature vector used to represent the text data, and the text is classified based on this text feature vector.
Illustratively, in some embodiments, the obtaining, based on the text heterogeneous graph, a text feature vector corresponding to the text data includes: calculating the attention weights between the nodes in the text heterogeneous graph; and inputting the attention weights between the nodes and the feature vectors of all nodes in the text heterogeneous graph into a heterogeneous graph convolutional neural network to obtain the text feature vector corresponding to the text data. Here, the graph convolutional neural network is used to extract the spatial features of the topological graph and can be obtained through training.
Step 106: and classifying the text feature vectors by using a classification function to determine text categories.
For example, in some embodiments, the classification function may be a softmax function.
Illustratively, in some embodiments, the text classification method further comprises: acquiring a training text data set; the text classification model is trained based on the training text dataset.
Illustratively, in some embodiments, the text classification method further comprises: constructing a training text heterogeneous graph containing all texts in the training text data set. The training text heterogeneous graph is used to train the text classification model, and the trained text classification model can classify the texts in a directly input text data set to obtain the text category corresponding to each text.
Illustratively, constructing the training text heterogeneous graph containing all texts in the training text data set includes: constructing a training text heterogeneous graph with the short-text documents D = {d_1, d_2, ..., d_m}, the words W = {w_1, w_2, ..., w_n} and the concepts C = {c_1, c_2, ..., c_k} as nodes, where m is the total number of documents in the corpus, n is the number of unique words in the corpus (the vocabulary size), and k is the total number of concepts of all documents in the training text data set. For document nodes, their term frequency-inverse document frequency (TF-IDF) vectors are used as their feature vectors; the one-hot vectors of the vocabulary are used as the feature vectors of the word nodes; and pre-trained word vectors are used to map the concept words into feature vectors.
Illustratively, in some embodiments, cross entropy is used as the loss function when the model is trained, and L2 regularization is used to prevent the model from overfitting. The loss function is:

Loss = - Σ_{d∈D} Σ_{c=1}^{C} y_dc · ln(Z_dc) + λ‖θ‖_2

where C is the number of classes, D is the training set, Z is the predicted category, y is the actual category, and λ‖θ‖_2 is the regularization term; the model is optimized by gradient descent.
Here, the execution subject of steps 101 to 106 may be a processor of the text classification apparatus.
The text data is text data corresponding to short texts. Because short texts are too short and lack context information, common text classification methods cannot obtain effective classification features from their sparse text features. According to the technical scheme of the present application, prior knowledge in the text is obtained by acquiring the feature vectors of the concept nodes corresponding to the text; by constructing the text heterogeneous graph based on the feature vectors of the document nodes, the word nodes and the concept nodes, the feature sparsity problem caused by short texts lacking context can be alleviated to a certain extent, so that the text feature vector extracted based on the text heterogeneous graph represents the features of the text more accurately, which further improves the accuracy of text classification.
On the basis of the above embodiment, the method for obtaining the text feature vector corresponding to the text data based on the text heterogeneous graph in step 105 is further illustrated. Fig. 3 is a schematic flow chart of determining a text feature vector according to an embodiment of the present application. As shown in fig. 3, the procedure for determining the text feature vector includes:
Step 301: determining at least one type attention weight of each node based on the text heterogeneous graph; the type attention weight is a document-type attention weight, a concept-type attention weight or a word-type attention weight;
illustratively, in some embodiments, the determining at least one type attention weight of each node based on the text heterogeneous graph includes: calculating the sum of the feature vectors of the i-th type neighboring nodes of the k-th node in the text heterogeneous graph to obtain the i-th type feature vector of the k-th node, where the i-th type is any one of the document type, the concept type or the word type; and determining the i-th type attention weight of the k-th node based on the feature vector of the k-th node, the feature vectors of the i-th type neighboring nodes and the i-th type feature vector.
Illustratively, the i-th type is the document type (type 1), the concept type (type 2) or the word type (type 3). The i-th type feature vector of the k-th node is accordingly the document-type feature vector, the concept-type feature vector or the word-type feature vector of the k-th node: the document-type feature vector of the k-th node is the sum of the feature vectors of its document-type neighboring nodes; the concept-type feature vector of the k-th node is the sum of the feature vectors of its concept-type neighboring nodes; and the word-type feature vector of the k-th node is the sum of the feature vectors of its word-type neighboring nodes.
Illustratively, in some embodiments, the i-th type attention weight of the k-th node is determined based on the feature vector of the k-th node, the feature vectors of the i-th type neighboring nodes and the i-th type feature vector as follows:

α_i = softmax( LeakyReLU( μ_i^T · [h_k || h_i] ) )

where α_i is the i-th type attention weight of the k-th node, μ_i is the i-th type attention vector, h_k is the feature vector of the k-th node, h_i is the i-th type feature vector of the k-th node, || denotes the concatenation operation, LeakyReLU(·) is the LeakyReLU activation function, and softmax(·) denotes processing with the softmax function.
Here μ_i, the i-th type attention vector of the current node, is obtained from the learnable network parameters V^T, W and U together with h_k, the feature vector of the k-th node, and h_k', the feature vectors of the i-th type neighboring nodes of the k-th node.
Illustratively, in some embodiments, the method further comprises: the type attention value is regularized using a softmax function.
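A small numpy sketch of the type-level attention just described is given below, under the assumption that the type attention vector μ is a directly learnable parameter (the patent derives it from the further parameters V, W and U, whose exact combination is not reproduced here); all shapes and the toy inputs are illustrative.

```python
import numpy as np

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def type_attention(h_k, neighbors_by_type, mu_by_type):
    """h_k: feature vector of node k.
    neighbors_by_type: {type: list of neighbor feature vectors}.
    mu_by_type: {type: learnable type attention vector} (assumed parameterisation).
    Returns one softmax-regularized attention weight per node type."""
    scores = {}
    for t, neigh in neighbors_by_type.items():
        h_t = np.sum(neigh, axis=0)                  # i-th type feature vector: sum of type-t neighbors
        scores[t] = leaky_relu(mu_by_type[t] @ np.concatenate([h_k, h_t]))
    types = list(scores)
    weights = softmax(np.array([scores[t] for t in types]))
    return dict(zip(types, weights))

dim = 4
h_k = np.random.rand(dim)
neighbors = {"doc": [np.random.rand(dim)],
             "word": [np.random.rand(dim), np.random.rand(dim)]}
mu = {t: np.random.rand(2 * dim) for t in neighbors}
print(type_attention(h_k, neighbors, mu))
```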
Step 302: determining an inter-node attention weight between each node and its neighboring nodes based on the at least one type attention weight, the feature vector of each node, and the feature vectors of the at least one type of neighboring nodes;
illustratively, in some embodiments, the inter-node attention weight β_kk' between a node k (also referred to herein as the k-th node) and its neighboring node k' can be calculated by the following formula, where the neighboring node k' is a node of the i'-th type:

β_kk' = softmax( LeakyReLU( V^T · α_i' [h_k || h_k'] ) )

where V is the attention vector of node k, a learnable network parameter shared by all neighboring nodes of node k; α_i' is the i'-th type attention weight of node k; h_k is the feature vector of the k-th node; h_k' is the feature vector of the neighboring node k'; || denotes the concatenation operation, LeakyReLU(·) is the LeakyReLU activation function, and softmax(·) denotes processing with the softmax function.
Illustratively, in some embodiments, the method further comprises: the inter-node attention value is regularized using a softmax function.
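Continuing the sketch, the inter-node attention between node k and each of its neighbors k' can be computed as below; V and the per-type weights α come from the previous step, and all parameter shapes are illustrative assumptions rather than the patent's exact parameterisation.

```python
import numpy as np

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def node_attention(h_k, neighbors, neighbor_types, type_weights, V):
    """Attention weight beta_kk' for each neighbor k' of node k.
    neighbors: list of neighbor feature vectors; neighbor_types: their node types;
    type_weights: {type: alpha} from the type-level attention; V: learnable attention vector."""
    scores = []
    for h_kp, t in zip(neighbors, neighbor_types):
        scores.append(leaky_relu(V @ (type_weights[t] * np.concatenate([h_k, h_kp]))))
    return softmax(np.array(scores))     # regularized over all neighbors of node k

dim = 4
h_k = np.random.rand(dim)
neighbors = [np.random.rand(dim), np.random.rand(dim), np.random.rand(dim)]
types = ["word", "word", "concept"]
alpha = {"word": 0.7, "concept": 0.3}
V = np.random.rand(2 * dim)
print(node_attention(h_k, neighbors, types, alpha, V))
```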
Step 303: the text feature vector is determined based on the inter-node attention weights between all nodes and neighboring nodes, and the feature vectors of all nodes.
Illustratively, in some embodiments, the determining the text feature vector based on the inter-node attention weights between all nodes and their neighboring nodes and the feature vectors of all nodes includes: inputting the inter-node attention weights between all nodes and their neighboring nodes and the feature vectors of all nodes into a heterogeneous graph convolutional network to obtain the text feature vector corresponding to the text data.
Illustratively, in some embodiments, the propagation rule of the heterogeneous graph convolutional network includes:

H^(l+1) = σ( Σ_{i∈I} β_i · H_i^(l) · W_i^(l) )

where H^(l+1) is the text feature vector of layer L+1; β_i is the attention weight between the i-th type nodes and their neighboring nodes; H_i^(l) is the layer-L feature vector of the i-th type nodes; W_i^(l) is a trainable transformation matrix; σ(·) is the ReLU activation function; and I is the set of node types.
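Putting this propagation rule into code, the sketch below implements one single-head heterogeneous graph attention convolution layer with dense numpy matrices; the attention matrices B are assumed to have been assembled from the β weights above, and all shapes are illustrative (the multi-head variant is discussed next).

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def hetero_gat_layer(H_by_type, B_by_type, W_by_type):
    """One propagation step H^(l+1) = ReLU( sum_i B_i @ H_i @ W_i ).
    H_by_type: {type: (num_type_nodes, in_dim_type) feature matrix},
    B_by_type: {type: (num_nodes, num_type_nodes) attention weight matrix},
    W_by_type: {type: (in_dim_type, out_dim) trainable transformation matrix}."""
    out = None
    for t, H_t in H_by_type.items():
        msg = B_by_type[t] @ H_t @ W_by_type[t]
        out = msg if out is None else out + msg
    return relu(out)

num_nodes, out_dim = 5, 8
H = {"doc": np.random.rand(2, 6), "word": np.random.rand(3, 4)}
B = {t: np.random.rand(num_nodes, H[t].shape[0]) for t in H}
W = {t: np.random.rand(H[t].shape[1], out_dim) for t in H}
print(hetero_gat_layer(H, B, W).shape)    # (5, 8)
```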
Illustratively, in some embodiments, the heterogeneous graph convolutional network is a multi-head heterogeneous graph convolutional network constructed based on a multi-head attention mechanism.
By calculating the attention weights of the different types, combining the type attention weights to calculate the inter-node attention weights between nodes, and then adding a multi-head heterogeneous graph attention mechanism to calculate the importance of the different types of neighboring nodes, text feature extraction based on the text heterogeneous graph achieves better performance and stronger robustness.
In order to further illustrate the purpose of the application and its embodiments, a text classification method based on a heterogeneous graph attention network is provided, which can be applied to a text classification model. The method specifically includes the following steps:
Step 401: acquiring an original text data set and preprocessing the original text data set to obtain the text data set;
wherein the text data set comprises at least one text data.
Specifically, the original short text data set is preprocessed: word spelling errors are corrected with a script, unnecessary labels are cleaned, punctuation marks and special symbols are removed, and high-frequency words that do not affect the semantics are deleted from the original text data with the nltk toolkit, i.e., stop-word removal is performed, so as to obtain the text data set.
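A rough sketch of this preprocessing step is shown below; it uses the nltk stop-word list mentioned above, while the regular expressions and the toy spelling-correction table are assumptions of the example, not the patent's actual script.

```python
import re
import nltk
from nltk.corpus import stopwords

nltk.download("stopwords", quiet=True)
STOPWORDS = set(stopwords.words("english"))
SPELL_FIXES = {"borne": "born"}           # toy correction table, for illustration only

def preprocess(text):
    text = re.sub(r"<[^>]+>", " ", text)              # clean unnecessary labels/tags
    text = re.sub(r"[^a-zA-Z\s]", " ", text).lower()  # remove punctuation and special symbols
    tokens = [SPELL_FIXES.get(tok, tok) for tok in text.split()]
    return [tok for tok in tokens if tok not in STOPWORDS]

print(preprocess("Eason Chan was <b>borne</b> in Hong Kong!"))
```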
Illustratively, in some embodiments, the method further comprises: a training text dataset is obtained. Wherein the training text data set contains text data of a plurality of training texts.
Step 402: using the term frequency-inverse document frequency (TF-IDF) vector of the text as its feature vector, i.e., as the feature vector of the document node;
step 403: acquiring the concept set of the short text by using the Microsoft Concept Graph, and mapping the concepts into feature vectors by using pre-trained word vectors to serve as the feature vectors of the concept nodes;
illustratively, the short text data is input into the concept graph by calling an API of the concept graph to obtain a short text concept set. The method comprises the following specific steps:
Step 411: input the original text data set t, the number of expansion words Topk, the expansion algorithm EX, the concept graph data MSCG and the selection mode SelectMode;
Step 412: Words = SplitData(t); // segment each text in the text data set to obtain the initial feature set Words
Step 413: Words = ReduceStopWords(Words); // remove the stop words from the feature words
Step 414: Select = GetSelectWordSet(MSCG, SelectMode); // obtain the corresponding word set Select in the concept graph according to the representation mode
Step 415: Sel_Words = Words ∩ Select; // select the feature word set Sel_Words to be expanded according to Select
Step 416: Words_dic = AccessAPI(Sel_Words, EX, Topk); // for each feature word in Sel_Words, call the corresponding interface of the conceptualized expansion algorithm; the interface returns the Topk concept words related to the feature word, forming the expansion dictionary Words_dic
Step 417: d* = GetExpansion(Words_dic); // expand the original feature words according to Words_dic to obtain the conceptually expanded semantic representation d*
Step 418: return d*.
In addition to the concept set of the short text, the relevance value between each concept in the concept set and the short text can also be obtained from the concept graph. By calling the API of the concept graph and inputting the short text, the concept set C = {<c_1, w_1>, <c_2, w_2>, ..., <c_k, w_k>} can be obtained, where c_i represents a concept in the concept set and w_i represents the relevance between the concept c_i and the short text. For example, inputting the short text {Eason Chan was born in Hong Kong} yields the concept set and relevance values {<top chinese entertainer, 0.6>, <singer, 0.4>, <place, 0.2>, <asian city, 0.1>}.
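Since the exact interface of the concept-graph API is not reproduced here, the sketch below only mimics its behaviour with a local lookup table; `concept_lookup` and the Topk truncation are assumptions used to illustrate steps 411 to 418, not the real Microsoft Concept Graph API.

```python
# Hypothetical stand-in for the concept-graph API: maps a feature word to
# (concept, relevance) pairs, as the Microsoft Concept Graph would.
concept_lookup = {
    "eason": [("top chinese entertainer", 0.6), ("singer", 0.4)],
    "kong":  [("place", 0.2), ("asian city", 0.1)],
}

def conceptualize(text, topk=2):
    words = [w.lower() for w in text.split()]                      # step 412: segmentation
    sel_words = [w for w in words if w in concept_lookup]          # steps 414-415: words present in the graph
    words_dic = {w: concept_lookup[w][:topk] for w in sel_words}   # step 416: Topk concepts per word
    expansion = [pair for pairs in words_dic.values() for pair in pairs]  # step 417: expanded representation d*
    return expansion

print(conceptualize("Eason Chan was born in Hong Kong"))
# e.g. [('top chinese entertainer', 0.6), ('singer', 0.4), ('place', 0.2), ('asian city', 0.1)]
```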
Step 404: using One-Hot vector of the vocabulary as a feature vector of the word node;
step 405: constructing a text heterogeneous graph;
specifically, a text heterogeneous graph corresponding to each text in the text data set is constructed, and the text heterogeneous graphs of all texts in the text data set are integrated to obtain a text heterogeneous graph set. A text heterogeneous graph is represented as G = (V, E), where V is the set of nodes in the graph and E is the set of edges between nodes.
Step 406: determining the weights of the edges between the nodes in the text heterogeneous graph;
specifically, an edge is established between a document and its related concepts by using the relevance values between the concepts and the short text, which are acquired from the concept graph together with the concepts, and the relevance value is taken as the weight of the edge between the concept node and the document node;
edges between document nodes and word nodes are built based on word occurrence in the document, and the edge weights are calculated with the TF-IDF algorithm, where the term frequency is the number of times the word occurs in the document and the inverse document frequency is the logarithm of the ratio of the total number of documents to the number of documents containing the word;
To take advantage of the global co-occurrence information of words, co-occurrence statistics are collected by sliding a fixed-size window over all documents in the corpus. The system uses pointwise mutual information (PMI) to measure the relevance between words and to calculate the weight between two word nodes; the PMI is calculated as follows:

PMI(i, j) = log( p(i, j) / (p(i) · p(j)) )
p(i, j) = #W(i, j) / #W
p(i) = #W(i) / #W

where PMI(i, j) is the pointwise mutual information between words i and j, #W(i) is the number of sliding windows in the corpus that contain word i, #W(i, j) is the number of sliding windows that contain both words i and j, and #W is the total number of sliding windows in the corpus. When the PMI is positive, the semantic association between the words in the corpus is high; when the PMI is negative, the semantic association between the words in the corpus is low or absent. Edges are added between word pairs whose PMI is positive.
Step 407: in the constructed text iso-graph, aggregation calculation is carried out on each node by using a multi-head iso-graph attention mechanism, and finally text feature vectors are aggregated.
The spatial features of a topological graph can be extracted by a graph convolutional neural network (GCN), a multi-layer neural network that calculates an embedding vector for each node by aggregating the features of its neighboring nodes. For the text heterogeneous graph G = (V, E), where V and E are the sets of nodes and edges in the graph, the node feature matrix X ∈ R^(|V|×q) contains the feature vectors x_v ∈ R^q of all nodes. With the self-connected adjacency matrix A' = A + I and its degree matrix D', the propagation rule of the standard graph convolutional neural network is:

H^(l+1) = σ( D'^(-1/2) · A' · D'^(-1/2) · H^(l) · W^(l) )

where H^(l) ∈ R^(N×D) is the hidden feature of layer L, H^(0) = X, W^(l) is a trainable transformation matrix, D'^(-1/2) · A' · D'^(-1/2) is the normalized symmetric adjacency matrix, and σ(·) is the ReLU activation function. The formula expresses that the text feature vector of layer L+1 is calculated by aggregating the text feature vectors of layer L.
However, the standard GCN cannot be directly applied to the text heterogeneous graph in the present application, because the text heterogeneous graph contains three different types of nodes, namely the document type, the concept type and the word type, and different types of nodes have different feature spaces. A common solution is to concatenate the feature vectors of the different types of nodes to obtain a new, larger feature space and to fill the irrelevant dimensions of each node with 0 values, but this method ignores some feature information and affects the performance of the model.
In order to solve this problem, the method uses heterogeneous graph convolution, which takes the differences between different types of information into account and uses a separate transformation matrix for each node type to project the nodes into a common implicit space. The basic propagation rule of standard heterogeneous graph convolution is as follows:

H^(l+1) = σ( Σ_{t∈T} Ã_t · H_t^(l) · W_t^(l) )

where Ã_t is the submatrix of the normalized adjacency matrix Ã whose rows correspond to all nodes V and whose columns correspond to the |V_t| neighboring nodes of node type t. The expression transforms the feature vectors of the different types of neighboring nodes of a node with the type-specific transformation matrices W_t^(l) and aggregates the different types of neighboring nodes to obtain H^(l+1). The features of the nodes thus take into account the differences between the different feature spaces and are projected into the common implicit space, with the initial H_t^(0) being the corresponding node feature matrix X_t.
Specifically, for a given node in the heterogeneous graph, its neighboring nodes of different types have different effects on it. Neighboring nodes of the same type can contribute more useful information, while nodes of different types also carry some effective information; in order to capture the effective information between nodes and between different types, the method uses a multi-head heterogeneous graph attention mechanism.
Given a node v (corresponding to the k-th node in this application), type attention learns the weights of the different types of its neighboring nodes. The feature vector of type t (where type t is any one of the document type, the concept type and the word type, and corresponds to the i-th type in this application) is expressed as:

h_t = Σ_{v'} Ã_{vv'} · h_v'

where Ã_{vv'} is the entry of the adjacency matrix for node v and its neighboring node v', and v' denotes a t-type neighboring node of the current node v; the expression sums the feature vectors h_v' of the t-type neighboring nodes to obtain the t-type feature vector h_t. Then, using the type feature vectors of all types together with the feature vector h_v of the current node, the t-type attention weight α_t is calculated by the following formula:

α_t = softmax( LeakyReLU( μ_t^T · [h_v || h_t] ) )

where α_t is the t-type attention weight of node v and μ_t is the t-type attention vector; || denotes the concatenation operation, LeakyReLU(·) is the LeakyReLU activation function, and softmax(·) denotes processing with the softmax function.
Here μ_t, the t-type attention vector of the current node, is derived from the learnable network parameters V^T, W and U.
The text classification method further comprises: regularizing the attention values of all types with a softmax function to obtain the type attention weights.
To capture the information of important neighboring nodes while reducing the weight of noise nodes, the model uses node-level attention over the neighboring nodes. Given a node v of type t and its neighboring node v' ∈ N_v of type t', the attention weight between the nodes is calculated by the following formula:

β_vv' = softmax( LeakyReLU( V^T · α_t' [h_v || h_v'] ) )

where V is the attention vector, α_t' is the t'-type attention weight, h_v is the feature vector of the current node, and h_v' is the feature vector of the neighboring node; the node attention values are then regularized, again using the softmax function.
Integrating the obtained inter-node attention weights into the heterogeneous graph convolution yields a new heterogeneous graph attention network, whose propagation rule is as follows:

H^(l+1) = σ( Σ_{t∈T} β_t · H_t^(l) · W_t^(l) )

where H^(l+1) is the text feature vector of layer L+1, β_t is the attention weight calculated between the current nodes of type t and their neighboring nodes, H_t^(l) is the layer-L feature vector of the current nodes of type t, W_t^(l) is a trainable transformation matrix, σ(·) is the ReLU activation function, and T is the set of node types.
Fig. 4 is a schematic diagram illustrating the text heterogeneous graph aggregation calculation method according to an embodiment of the present application. As shown in Fig. 4, h_di^(l) represents the layer-L feature vector of document d_i, c_1, c_2 and c_3 represent the concept nodes adjacent to document d_i, and w_1, w_2, w_3 and w_4 represent the word nodes adjacent to document d_i; convolution based on the inter-node attention and the node feature vectors yields the layer-(L+1) feature vector h_di^(l+1) of document d_i.
To make the model more stable, the present system incorporates a multi-head attention mechanism. K groups of attention mechanisms are calculated independently, each group calculating its own attention weights, and the calculation results are concatenated to obtain the features of all heads, as shown in the following formula:

H^(l+1) = ||_{k=1..K} σ( Σ_{t∈T} β_t^k · H_t^(l) · W_t^k )

where H^(l+1) is the text feature vector of layer L+1, || is the concatenation operation, β_t^k is the k-th inter-node attention weight between the t-type nodes and the other nodes, H_t^(l) is the layer-L feature vector of the current nodes of type t, and W_t^k is the k-th group of trainable transformation matrices.
For the last layer of the network, concatenation is no longer used; instead, averaging is used to obtain the final node features:

H^(l+1) = σ( (1/K) · Σ_{k=1..K} Σ_{t∈T} β_t^k · H_t^(l) · W_t^k )
calculating a neighboring concept node c i Word node w i With its attention weight, aggregate computation is performed in combination with the attention weight, while using a multi-headed attention mechanism, different arrow patterns represent independent attention computation by concatenationEach head is then or averaged to obtain the node characteristics of the new layer. The multi-head heterographing attention mechanism calculates the importance of different types of adjacent nodes, so that the model has better performance and stronger robustness.
Step 408: text feature vectors are classified using a softmax function.
Specifically, through the calculation of L layers of the multi-head heterogeneous graph attention network (MHGAT), the text feature vectors corresponding to all texts in the text heterogeneous graph can be obtained, and the text feature vectors are then classified with a softmax function to obtain the text category corresponding to each text.
Illustratively, in some embodiments, the text classification method further comprises: acquiring a training text data set; the text classification model is trained based on the training text dataset.
Illustratively, in some embodiments, the text classification method further comprises: constructing a training text heterogeneous graph containing all texts in the training text data set. The training text heterogeneous graph is used to train the text classification model, and the trained text classification model can classify the texts in a directly input text data set to obtain the text category corresponding to each text.
Illustratively, constructing the training text heterogeneous graph containing all texts in the training text data set includes: constructing a training text heterogeneous graph with the short-text documents D = {d_1, d_2, ..., d_m}, the words W = {w_1, w_2, ..., w_n} and the concepts C = {c_1, c_2, ..., c_k} as nodes, where m is the total number of documents in the corpus, n is the number of unique words in the corpus (the vocabulary size), and k is the total number of concepts of all documents in the training text data set. For document nodes, their term frequency-inverse document frequency (TF-IDF) vectors are used as their feature vectors; the one-hot vectors of the vocabulary are used as the feature vectors of the word nodes; and pre-trained word vectors are used to map the concept words into feature vectors.
Illustratively, in some embodiments, cross entropy is used as the loss function when the model is trained, and L2 regularization is used to prevent the model from overfitting. The loss function is:

Loss = - Σ_{d∈D} Σ_{c=1}^{C} y_dc · ln(Z_dc) + λ‖θ‖_2

where C is the number of classes, D is the training set, Z is the predicted category, y is the actual category, and λ‖θ‖_2 is the regularization term; the model is optimized by gradient descent.
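A compressed sketch of this training objective (softmax classification, cross-entropy loss plus an L2 term, plain gradient descent) is shown below; the tiny synthetic data and the single linear layer standing in for the graph network are assumptions of the example, not the patent's actual model.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.random((20, 16))                 # stand-in for the text feature vectors H^(L)
y = rng.integers(0, 3, size=20)          # actual categories
Y = np.eye(3)[y]                         # one-hot labels
W = rng.random((16, 3)) * 0.01           # classifier weights (theta)
lr, lam = 0.1, 1e-3

for step in range(200):
    Z = softmax(X @ W)                                                # predicted category distribution
    loss = -np.sum(Y * np.log(Z + 1e-12)) + lam * np.sum(W ** 2)      # cross entropy + L2 term
    grad = X.T @ (Z - Y) + 2 * lam * W                                # gradient of the loss
    W -= lr * grad                                                    # gradient descent update

print(round(float(loss), 4), (Z.argmax(axis=1) == y).mean())          # final loss and training accuracy
```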
According to the technical scheme of the present application, when the text heterogeneous graph is constructed, not only the feature vectors of the document nodes and the word nodes are used, but the feature vectors of the concept nodes are also acquired, so that prior knowledge in the text is obtained; this alleviates, to a certain extent, the feature sparsity problem caused by short texts lacking context, so that the text feature vector extracted based on the text heterogeneous graph represents the features of the text more accurately, which further improves the accuracy of text classification. In addition, the type attention weights are obtained by calculating the attention weights between nodes of different types, the dual inter-node attention weights are calculated in combination with the type attention weights, and a multi-head heterogeneous graph attention mechanism is added to calculate the importance of the different types of neighboring nodes, so that the aggregation calculation on the text heterogeneous graph has better performance and stronger robustness; more accurate text feature vectors are thereby obtained, and the accuracy of the text classification result is improved.
Fig. 5 is a schematic diagram of a composition structure of a text classification device in an embodiment of the present application, which shows a device 50 for implementing a text classification method, where the device 50 specifically includes:
an obtaining module 501, configured to obtain text data;
a processing module 502, configured to determine, based on the text data, a feature vector of a document node, a feature vector of a concept node, and a feature vector of a word node corresponding to the text data;
the processing module 502 is further configured to construct a text iso-graph based on the feature vector of the document node, the feature vector of the concept node, and the feature vector of the word node;
the processing module 502 is further configured to determine weights of edges between nodes in the text iso-graph;
the processing module 502 is further configured to obtain a text feature vector corresponding to the text data based on the text iso-graph;
the processing module 502 is further configured to classify the text feature vector by using a classification function, and determine a text category.
In some embodiments, the processing module 502 is configured to determine at least one type attention weight of each node based on the text heterogeneous graph, where the type attention weight is a document-type attention weight, a concept-type attention weight or a word-type attention weight; determine an inter-node attention weight between each node and its neighboring nodes based on the at least one type attention weight, the feature vector of each node, and the feature vectors of the at least one type of neighboring nodes; and determine the text feature vector based on the inter-node attention weights between all nodes and their neighboring nodes and the feature vectors of all nodes.
In some embodiments, the processing module 502 is configured to calculate the sum of the feature vectors of the i-th type neighboring nodes of the k-th node in the text heterogeneous graph to obtain the i-th type feature vector of the k-th node, where the i-th type is any one of the document type, the concept type or the word type; and determine the i-th type attention weight of the k-th node based on the feature vector of the k-th node, the feature vectors of the i-th type neighboring nodes and the i-th type feature vector.
In some embodiments, the processing module 502 is configured to input the inter-node attention weights between all nodes and their neighboring nodes and the feature vectors of all nodes into the heterogeneous graph convolutional network to obtain the text feature vector corresponding to the text data, where the heterogeneous graph convolutional network is constructed based on a multi-head attention mechanism.
In some embodiments, the processing module 502 is configured to calculate the term frequency-inverse document frequency (TF-IDF) vector of the text data as the feature vector of the document node; acquire a concept set of the text data based on a concept graph; map the concepts in the concept set into feature vectors based on a word vector model to obtain the feature vectors of the concept nodes; and acquire the one-hot vectors of the word nodes in the text data from a preset vocabulary as the feature vectors of the word nodes.
In some embodiments, the processing module 502 is configured to acquire, based on a concept graph, at least one concept node corresponding to a document node and a relevance value between the document node and the at least one concept node; determine the weights of the edges between the document node and the at least one concept node based on the relevance values; determine the weights of the edges between the document node and at least one word node based on the term frequency-inverse document frequency (TF-IDF) algorithm; and determine the weights of the edges between word nodes based on the pointwise mutual information between words.
In some embodiments, the obtaining module 501 is configured to acquire original text data and preprocess the original text data to obtain the text data, where the preprocessing includes at least one of: noise information removal, word segmentation and stop-word removal.
Based on the hardware implementation of each unit in the text classification device, another text classification device is further provided in the embodiment of the present application, and fig. 6 is a schematic diagram of the composition structure of the text classification device in the embodiment of the present application. As shown in fig. 6, the apparatus 60 includes: a processor 601 and a memory 602 configured to store a computer program capable of running on the processor;
Wherein the processor 601 is arranged to execute the method steps of the previous embodiments when running a computer program.
Of course, in actual practice, the various components of the text classification apparatus are coupled together via a bus system 603 as shown in fig. 6. It is understood that the bus system 603 is used to enable connected communications between these components. The bus system 603 includes a power bus, a control bus, and a status signal bus in addition to the data bus. But for clarity of illustration the various buses are labeled as bus system 603 in fig. 6.
In practical applications, the processor may be at least one of an application specific integrated circuit (ASIC, application Specific Integrated Circuit), a digital signal processing device (DSPD, digital Signal Processing Device), a programmable logic device (PLD, programmable Logic Device), a Field-programmable gate array (Field-Programmable Gate Array, FPGA), a controller, a microcontroller, and a microprocessor. It will be appreciated that the electronic device for implementing the above-mentioned processor function may be other for different apparatuses, and embodiments of the present application are not specifically limited.
The Memory may be a volatile Memory (RAM) such as Random-Access Memory; or a nonvolatile Memory (non-volatile Memory), such as a Read-Only Memory (ROM), a flash Memory (flash Memory), a Hard Disk (HDD) or a Solid State Drive (SSD); or a combination of the above types of memories and provide instructions and data to the processor.
In an exemplary embodiment, the present application also provides a computer readable storage medium, e.g. a memory comprising a computer program executable by a processor of a text classification apparatus to perform the steps of the aforementioned method.
It is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items. The expressions "having," "including," and "containing," or "including" and "comprising" are used herein to indicate the presence of corresponding features (e.g., elements such as values, functions, operations, or components), but do not exclude the presence of additional features.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another and do not necessarily describe a particular order or sequence. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application.
The technical solutions described in the embodiments of the present application may be arbitrarily combined without any conflict.
In the several embodiments provided in the present application, it should be understood that the disclosed methods, apparatuses, and devices may be implemented in other manners. The above-described embodiments are merely illustrative; for example, the division of units is merely a logical function division, and other divisions may be used in practice, such as: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communicative connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communicative connection between devices or units may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may all be integrated in one processing unit, or each unit may serve as a separate unit, or two or more units may be integrated in one unit; the integrated units may be implemented in the form of hardware or in the form of hardware plus software functional units.
The foregoing is merely specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any person skilled in the art can readily conceive of changes or substitutions within the technical scope disclosed in the present application, and such changes and substitutions shall fall within the protection scope of the present application.

Claims (10)

1. A method of text classification, the method comprising:
acquiring text data;
determining feature vectors of document nodes, feature vectors of concept nodes and feature vectors of word nodes corresponding to the text data based on the text data;
constructing a text iso-graph based on the feature vector of the document node, the feature vector of the concept node and the feature vector of the word node;
determining the weight of edges between nodes in the text iso-graph;
obtaining a text feature vector corresponding to the text data based on the text iso-graph;
and classifying the text feature vector by using a classification function to determine a text category.
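For illustration only (not the claimed implementation), the sketch below applies a linear projection and a softmax as the classification function of claim 1 to an already computed text feature vector; the dimensions, weights, and number of categories are arbitrary.

import numpy as np

def classify(text_feature, W, b):
    logits = W @ text_feature + b
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(np.argmax(probs)), probs

rng = np.random.default_rng(0)
text_feature = rng.normal(size=64)              # text feature vector from the graph model
W, b = rng.normal(size=(4, 64)), np.zeros(4)    # 4 hypothetical text categories
label, probs = classify(text_feature, W, b)
print(label, probs)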
2. The method according to claim 1, wherein the obtaining, based on the text iso-graph, a text feature vector corresponding to the text data includes:
determining at least one type of attention weight for each node based on the text iso-graph; wherein the type attention weight is a document type attention weight, a concept type attention weight, or a word type attention weight;
determining an inter-node attention weight between each node and a neighboring node based on the at least one type of attention weight, the feature vector of each node, and the feature vector of the at least one type of neighboring node;
and determining the text feature vector based on the inter-node attention weights between all nodes and their neighboring nodes, and the feature vectors of all nodes.
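One possible reading of claim 2, sketched for illustration only: the attention score of each neighboring node is scaled by the attention weight of that neighbor's type and normalized over all neighbors; the scoring vector and the concatenation form are assumptions.

import numpy as np

def inter_node_attention(h_k, neighbours, type_weights, a):
    # h_k: feature of node k; neighbours: list of (type, feature); a: scoring vector
    scores = []
    for node_type, h_j in neighbours:
        scores.append(type_weights[node_type] * (np.concatenate([h_k, h_j]) @ a))
    scores = np.array(scores)
    e = np.exp(scores - scores.max())
    return e / e.sum()                      # attention weight per neighboring node

rng = np.random.default_rng(1)
h_k = rng.normal(size=8)
neighbours = [("word", rng.normal(size=8)), ("concept", rng.normal(size=8))]
type_weights = {"word": 0.6, "concept": 0.3, "document": 0.1}
print(inter_node_attention(h_k, neighbours, type_weights, rng.normal(size=16)))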
3. The method of claim 2, wherein the determining at least one type of attention weight for each node based on the text iso-graph comprises:
calculating the sum of the feature vectors of the i-th type adjacent nodes of the k-th node in the text iso-graph to obtain the i-th type feature vector of the k-th node; wherein the i-th type is any one of a document type, a concept type, or a word type;
and determining the i-th type attention weight of the k-th node based on the feature vector of the k-th node, the feature vectors of the i-th type adjacent nodes, and the i-th type feature vector.
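For illustration, a rough sketch of claim 3 under stated assumptions: the i-th type feature of the k-th node is the sum of its i-th type neighbors' feature vectors, and the type attention weight is obtained by scoring the node feature together with that type feature and normalizing over the types; the scoring vectors are hypothetical.

import numpy as np

def type_attention(h_k, neighbours_by_type, mu):
    # neighbours_by_type: {type_name: [feature, ...]}; mu: one scoring vector per type
    scores = {}
    for t, feats in neighbours_by_type.items():
        h_t = np.sum(feats, axis=0)                     # i-th type feature vector
        scores[t] = np.concatenate([h_k, h_t]) @ mu[t]
    m = max(scores.values())
    exp = {t: np.exp(s - m) for t, s in scores.items()}
    z = sum(exp.values())
    return {t: v / z for t, v in exp.items()}           # type attention weights of node k

rng = np.random.default_rng(2)
h_k = rng.normal(size=8)
neigh = {"document": [rng.normal(size=8)], "word": [rng.normal(size=8), rng.normal(size=8)]}
mu = {t: rng.normal(size=16) for t in neigh}
print(type_attention(h_k, neigh, mu))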
4. The method of claim 2, wherein the determining the text feature vector based on inter-node attention weights between all nodes and neighboring nodes, and feature vectors of all nodes, comprises:
inputting the inter-node attention weights between all nodes and their adjacent nodes, and the feature vectors of all nodes, into a heterogeneous graph convolution network to obtain the text feature vector corresponding to the text data;
wherein the heterogeneous graph convolution network is constructed based on a multi-head attention mechanism.
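A very small sketch, for illustration only, of the multi-head aggregation named in claim 4: each head performs one attention-weighted aggregation of node features and the head outputs are concatenated; the head count, activation, and projection sizes are arbitrary choices rather than the claimed network.

import numpy as np

def multi_head_layer(H, A, W_heads):
    # H: (n_nodes, d) node features; A: (n_nodes, n_nodes) inter-node attention weights;
    # W_heads: list of (d, d_head) projection matrices, one per attention head
    heads = [np.tanh(A @ H @ W) for W in W_heads]
    return np.concatenate(heads, axis=1)

rng = np.random.default_rng(3)
H = rng.normal(size=(5, 8))                                # 5 nodes, 8-dim features
A = rng.random((5, 5)); A /= A.sum(axis=1, keepdims=True)  # row-normalized attention
W_heads = [rng.normal(size=(8, 4)) for _ in range(2)]      # 2 heads
print(multi_head_layer(H, A, W_heads).shape)               # (5, 8) aggregated features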
5. The method of claim 1, wherein the determining, based on the text data, a feature vector of a document node, a feature vector of a concept node, and a feature vector of a word node corresponding to the text data comprises:
calculating a word frequency-inverse document frequency (TF-IDF) vector of the text data as the feature vector of the document node;
acquiring a concept set of the text data based on a concept graph;
mapping concepts in the concept set into feature vectors based on a word vector model to obtain feature vectors of the concept nodes;
and acquiring the one-hot encoding vector of the word node in the text data from a preset vocabulary as the feature vector of the word node.
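An illustrative sketch of the feature initialization in claim 5 with stand-in components: scikit-learn's TfidfVectorizer replaces the TF-IDF computation for document nodes, a random lookup stands in for the word-vector model used for concept nodes, and an identity matrix over a preset vocabulary yields the one-hot word-node vectors.

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

texts = ["graph neural text classification", "text classification with concepts"]
vocab = sorted({w for t in texts for w in t.split()})

doc_features = TfidfVectorizer(vocabulary=vocab).fit_transform(texts).toarray()  # document nodes

rng = np.random.default_rng(4)
concept_features = np.stack([rng.normal(size=16)])         # stand-in for word-vector concept embeddings

word_features = np.eye(len(vocab))                          # one-hot vectors for word nodes
print(doc_features.shape, concept_features.shape, word_features.shape)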
6. The method of claim 1, wherein determining weights for edges between nodes in the text iso-graph comprises:
acquiring, based on a concept graph, at least one concept node corresponding to a document node and a relevance value between the document node and the at least one concept node;
determining weights for edges between the document node and the at least one concept node based on the relevance values;
determining the weight of edges between the document nodes and at least one word node based on a word frequency-inverse document frequency TF-IDF algorithm;
and determining the weight of edges between word nodes based on the point-wise mutual information (PMI) between the words.
7. The method of claim 1, wherein the obtaining text data comprises:
acquiring original text data;
preprocessing the original text data to obtain the text data;
wherein the preprocessing comprises at least one of: noise information removal, word segmentation, and stop word removal.
8. A text classification device, the device comprising:
the acquisition module is used for acquiring text data;
the processing module is used for determining the feature vector of the document node, the feature vector of the concept node and the feature vector of the word node corresponding to the text data based on the text data;
the processing module is further used for constructing a text iso-graph based on the feature vector of the document node, the feature vector of the concept node and the feature vector of the word node;
the processing module is further used for determining the weight of the edges between the nodes in the text iso-graph;
the processing module is further used for obtaining text feature vectors corresponding to the text data based on the text iso-graph;
the processing module is further used for classifying the text feature vectors by using a classification function to determine text categories.
9. A text classification device, the device comprising: a processor and a memory configured to store a computer program capable of running on the processor,
Wherein the processor is configured to perform the steps of the method of any of claims 1-7 when the computer program is run.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any of claims 1-7.
CN202111506316.4A 2021-12-10 2021-12-10 Text classification method, device, equipment and storage medium Pending CN116263783A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111506316.4A CN116263783A (en) 2021-12-10 2021-12-10 Text classification method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111506316.4A CN116263783A (en) 2021-12-10 2021-12-10 Text classification method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116263783A true CN116263783A (en) 2023-06-16

Family

ID=86721661

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111506316.4A Pending CN116263783A (en) 2021-12-10 2021-12-10 Text classification method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116263783A (en)

Similar Documents

Publication Publication Date Title
CN108388651B (en) Text classification method based on graph kernel and convolutional neural network
CN112990280B (en) Class increment classification method, system, device and medium for image big data
WO2021089013A1 (en) Spatial graph convolutional network training method, electronic device and storage medium
JP7178513B2 (en) Chinese word segmentation method, device, storage medium and computer equipment based on deep learning
CN110968692B (en) Text classification method and system
Zhang et al. Quantifying the knowledge in a DNN to explain knowledge distillation for classification
CN114925205B (en) GCN-GRU text classification method based on contrast learning
CN115546525A (en) Multi-view clustering method and device, electronic equipment and storage medium
Wang et al. A band selection approach based on Lévy sine cosine algorithm and alternative distribution for hyperspectral image
CN115080749A (en) Weak supervision text classification method, system and device based on self-supervision training
CN114238746A (en) Cross-modal retrieval method, device, equipment and storage medium
CN112668633B (en) Adaptive graph migration learning method based on fine granularity field
CN117349494A (en) Graph classification method, system, medium and equipment for space graph convolution neural network
Liu et al. A weight-incorporated similarity-based clustering ensemble method
CN117009613A (en) Picture data classification method, system, device and medium
CN116701647A (en) Knowledge graph completion method and device based on fusion of embedded vector and transfer learning
CN116451081A (en) Data drift detection method, device, terminal and storage medium
CN111126443A (en) Network representation learning method based on random walk
CN116263783A (en) Text classification method, device, equipment and storage medium
CN112307914B (en) Open domain image content identification method based on text information guidance
CN114611668A (en) Vector representation learning method and system based on heterogeneous information network random walk
JP6993250B2 (en) Content feature extractor, method, and program
CN114626378A (en) Named entity recognition method and device, electronic equipment and computer readable storage medium
CN114662687B (en) Graph comparison learning method and system based on interlayer mutual information
CN116402554B (en) Advertisement click rate prediction method, system, computer and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination