CN114861629A - Automatic judgment method for text style

Automatic judgment method for text style

Info

Publication number
CN114861629A
CN114861629A (application CN202210475512.8A)
Authority
CN
China
Prior art keywords
model
text
style
data
label
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210475512.8A
Other languages
Chinese (zh)
Other versions
CN114861629B (en)
Inventor
陈峥
陈建树
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202210475512.8A
Publication of CN114861629A
Application granted
Publication of CN114861629B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/322 Trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to an automatic text style judgment method in the technical field of artificial intelligence. An automatic text style judgment model is obtained through a pipeline of style label extraction, deep learning model training and tuning, and the judgment model is deployed in a text judgment system, increasing the efficiency of text screening. Because new labels can be added without retraining the model, and the extracted labels accord well with human understanding, the method has good extensibility.

Description

Automatic judgment method for text style
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an automatic text style judging method.
Background
Some automated text generation systems need text that satisfies high-level stylistic constraints, for example lyrics that must convey a particular emotion. With the computing power of today's equipment, a large amount of candidate text can be generated automatically in a short time, but there are few automated methods for screening and evaluating it. The common practice is manual screening, which has two problems. On one hand, the sheer volume of text overwhelms human screeners, and the amount that can be screened in a given time is extremely limited, so working efficiency is low. On the other hand, physical fatigue, mental strain, and even mood greatly influence a screener's subjective judgment, and thereby the screening result. Manual screening therefore has two major disadvantages: 1. labor and time costs are high; 2. the evaluation result is strongly influenced by subjective factors.
In the prior art there are a small number of methods that use machine classification to screen and evaluate text automatically. These mainly set K labels in advance and abstract the task as a K-dimensional classification model: each sample input to the model yields a K-dimensional vector in which each dimension represents the probability that the corresponding label applies. This approach has two disadvantages: 1. when several labels should hold for one text, the mutually exclusive output (e.g., a softmax) lets the maximum-probability label suppress the probabilities of the others, and even when all labels are false a maximum-probability label is still returned; 2. when new labels must be added, the data set has to be rebuilt and the model retrained, so extensibility is limited and efficiency is low.
Disclosure of Invention
To solve these technical problems, the invention provides an automatic text style evaluation method that aims to reduce the heavy workload and the labor and time costs of manual evaluation, improve screening efficiency, reduce the misjudgment rate, and improve the extensibility of the classification model.
To achieve the above object, the present invention provides an automatic text style judgment method comprising the following steps:
step 1), syntactic analysis is performed on the existing text comment data with the syntactic analysis tool HanLP to obtain a parsed data result;
step 2), single-datum style label extraction:
a) custom Node objects are defined, and a multi-branch tree is constructed recursively, bottom-up, restoring the parsed data result to the tree structure of a phrase structure tree; the phrase type and word content of each node are recorded in a hash table A;
b) labels are extracted with a screening rule on the node type (VP, i.e. verb phrase) and the number of words the VP contains; the data nodes in hash table A are filtered by this rule, and the results that satisfy the condition are stored in another hash table B;
step 3), constructing the full style label set: the operation of step 2) is performed on every piece of data in the database to obtain the final hash table B; the phrases are sorted in descending order of frequency, and the first K phrases are taken as the style labels of the model training data. The candidate labels are checked with a longest-common-substring algorithm so that two labels with high textual similarity are not both used. The model is an ALBERT pre-trained model;
step 4), model training data construction: a binary classification data set is built by negative sampling, keeping the proportion of positive and negative samples balanced; a positive sample splices a style label that appears in the comment with the text and is labeled 1, then a style label that does not appear in the comment is randomly selected and a negative sample is built by the same splicing, labeled 0;
and step 5), deep learning model training and tuning: the ALBERT pre-trained model is fine-tuned with a deep learning training framework, and its performance is verified on a validation set.
Preferably, the step 5) specifically comprises the following steps:
a. the constructed data set is shuffled and fed sequentially into the ALBERT pre-trained model in small batches;
b. the ALBERT pre-trained model preprocesses the input: it is converted into one-hot vectors and embedded, and position information and segment information are then added, where the segment id of the label text is 0 and the segment id of the composition text is 1;
c. the preprocessed result is input to the neural network and multiplied by three weight matrices to obtain the matrices Q, K and V; a self-attention module then operates on Q, K and V to obtain the attention score matrix between each character and all other characters, computed as follows:
Z_i = softmax((Q·K^T)/√d_k + M)·V,
where Z_i is the encoded vector, T denotes the matrix transpose, M is the mask matrix, d_k is the hidden-layer vector dimension of a single attention head, and i is a positive integer from 1 to n;
d. multi-head attention concatenates Z_1~Z_n, and the concatenation is passed through a linear layer to obtain a final output Z with the same dimension as the multi-head attention input matrix X;
e. after the final output Z (of the same dimension as the input matrix X) is obtained, the output Z of the multi-head attention module is residually connected with X, and layer normalization is then applied, rescaling each layer's neuron inputs by their mean and variance toward a standard normal distribution;
f. the feed-forward module in the ALBERT pre-trained model processes the result with two fully connected layers so that the output dimension matches the input dimension; residual connection and layer normalization are applied once more, and the output serves as the input of the next of N stacked cycles;
g. the CLS vector of the ALBERT pre-trained model is sent into a linear layer and activated; the loss is computed with a binary cross-entropy loss function, and the model parameters are optimized by back-propagation;
h. steps a-g are repeated until model training is complete.
The invention has the beneficial effects that:
1) the style labels are extracted by syntactic analysis, so the obtained label text accords better with human understanding;
2) the method splices human-readable label text with the input text to form a binary classification task, and the model thereby learns an understanding of the label text; adding a new label therefore does not require retraining the model, the method has good extensibility, and text classification efficiency is greatly improved;
3) the invention greatly reduces labor and time costs, improves screening efficiency, and reduces the misjudgment rate.
Drawings
FIG. 1 is an exemplary diagram of a phrase structure tree for style keyword extraction according to the present invention;
FIG. 2 is a schematic diagram of the classification method of the ALBERT model of the present invention.
Detailed Description
Fig. 1 shows an example of the phrase structure tree obtained when the invention performs style keyword extraction. For reasons of space the illustration uses only one short sentence, "Xiao Ming goes to the mall to look at electronic products" (an English rendering of the Chinese example). A phrase structure tree always contains the words of the sentence as its leaf nodes, while the non-leaf nodes represent the constituents of the sentence, usually Verb Phrases (Verb Phrase, VP) and Noun Phrases (Noun Phrase, NP).
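As a plain-text stand-in for the drawing, a bracketed sketch of such a tree is given below in the nested-list form used throughout this description; the word segmentation and exact constituent labels are illustrative assumptions, not the precise tree of Fig. 1.

```python
# Hypothetical phrase structure tree for the example sentence
# (Xiao Ming / goes to / the mall / looks at / electronic products).
# Constituent labels and word segmentation are illustrative only.
tree = ['IP',
        ['NP', ['小明']],                       # noun phrase: the subject
        ['VP',
         ['VP', ['去'], ['NP', ['商场']]],       # verb phrase: go to the mall
         ['VP', ['看'], ['NP', ['电子产品']]]]]   # verb phrase: look at electronic products
```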
The present invention will be described in detail below with reference to a comment on a composition (as an example of existing text comment data). The comment reads roughly: "This prose appears plain yet rewards careful tasting and is rich in meaning; the author sings a song of praise to youth from different angles. The language of the article is deep and cadenced, rich in romantic color, and the piece can be called a successful exercise."
The invention provides an automatic text style judgment method, which comprises the following steps:
step 1), syntactic analysis is performed on the existing text comment data with the syntactic analysis tool HanLP, and the result is obtained as nested list-structure data in python: a deeply bracketed phrase structure in which constituent labels such as 'IP', 'NP', 'VP', 'CP' and 'QP' head nested sub-lists, and the innermost single-element lists hold the words of the comment as terminal leaf nodes (a parsing sketch follows below);
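A minimal sketch of such a parsing call is given below. It assumes a HanLP 2.x multi-task model that exposes a constituency ('con') task; the pretrained-model identifier is an assumption and should be checked against hanlp.pretrained for the installed release.

```python
import hanlp

# Assumption: a published HanLP 2.x multi-task model with a constituency
# ('con') task; the patent does not name a specific model.
nlp = hanlp.load(hanlp.pretrained.mtl.CLOSE_TOK_POS_NER_SRL_DEP_SDP_CON_ELECTRA_SMALL_ZH)

comment = '文章语言沉郁顿挫,富有浪漫色彩。'
doc = nlp(comment, tasks='con')
print(doc['con'])   # bracketed phrase structure tree over the segmented words
```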
step 2), single-datum style label extraction:
a) custom Node objects are defined, and a multi-branch tree is constructed recursively, bottom-up, restoring the parsed data result to the tree structure of a phrase structure tree; the phrase type and word content of each node are recorded in a hash table A;
b) labels are extracted with a screening rule on the node type (VP) and the number of words the VP contains; the data nodes in hash table A are filtered by this rule, and the results that satisfy the condition are stored in another hash table B;
through step 2), phrases meeting the conditions are obtained, such as "appears plain yet rewards careful tasting", "rich in meaning", "deep and cadenced", and "rich in romantic color"; a code sketch of this restoration and filtering follows;
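The tree restoration and VP filtering can be sketched as follows; this is a minimal illustration in which the helper names and the word-count form of the screening rule are assumptions, not part of the patent.

```python
from collections import Counter

class Node:
    """Custom node of the restored phrase structure tree."""
    def __init__(self, label, children=None, word=None):
        self.label = label             # phrase type, e.g. 'IP', 'NP', 'VP'
        self.children = children or []
        self.word = word               # set only on terminal leaf nodes

def build(parse):
    """Recursively restore a multi-branch tree, bottom-up, from the
    nested-list parse result."""
    if isinstance(parse, str):
        return Node('WORD', word=parse)
    if len(parse) == 1:                # innermost single-element list: a word
        return Node('WORD', word=parse[0])
    return Node(parse[0], [build(child) for child in parse[1:]])

def collect(node, table_a):
    """Record each node's phrase type and word content in hash table A;
    returns the list of words covered by this node."""
    if node.word is not None:
        return [node.word]
    words = [w for child in node.children for w in collect(child, table_a)]
    table_a.setdefault(node.label, []).append(words)
    return words

def extract_vp(table_a, table_b, min_words=3, max_words=5):
    """Screening rule of step 2) b): node type VP containing 3-5 words;
    matching phrases are accumulated in hash table B with frequencies."""
    for words in table_a.get('VP', []):
        if min_words <= len(words) <= max_words:
            table_b[''.join(words)] += 1

table_a, table_b = {}, Counter()
parse = ['IP', ['NP', ['意蕴']], ['VP', ['富有'], ['NP', ['浪漫'], ['色彩']]]]
collect(build(parse), table_a)
extract_vp(table_a, table_b)   # table_b now counts the VP '富有浪漫色彩'
```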
step 3), constructing the full style label set: the operation of step 2) is performed on every piece of data in the database to obtain the final hash table B; the phrases are sorted in descending order of frequency, and the first K phrases are taken as the style labels of the model training data (the value of K is determined by subsequent experiments). The candidate labels are checked with a longest-common-substring algorithm, and two labels with high textual similarity are not both used (the length threshold on the longest common substring can be user-defined). The model is an ALBERT pre-trained model; a deduplication sketch follows;
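A sketch of the frequency sort with longest-common-substring deduplication is given below; the helper names and the threshold value are illustrative, since the patent leaves the similarity threshold user-defined.

```python
def lcs_length(a, b):
    """Length of the longest common substring of a and b (classic DP)."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
                best = max(best, dp[i][j])
    return best

def select_labels(table_b, k, max_overlap=3):
    """Sort candidate phrases by descending frequency and keep the first K,
    skipping any phrase whose longest common substring with an already
    selected label reaches the user-defined threshold max_overlap."""
    labels = []
    for phrase, _ in sorted(table_b.items(), key=lambda kv: -kv[1]):
        if all(lcs_length(phrase, kept) < max_overlap for kept in labels):
            labels.append(phrase)
            if len(labels) == k:
                break
    return labels
```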
step 4), model training data construction: a binary classification data set is built by negative sampling, keeping the proportion of positive and negative samples balanced. A positive sample splices a label that appears in the comment with the composition, for example "[CLS] appears plain yet rewards careful tasting [SEP] At eighteen, the very flower of one's age, the fresh wind of youth often blows by; with this youthful breath our lives unfold ……", and is labeled 1. Then a style label that does not appear in the comment, such as "good at citing typical examples", is randomly selected, and a negative sample is built by the same splicing, labeled 0. Because the number of labels is relatively large and positive samples are sparse, the data set is constructed by negative sampling, as sketched below;
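Step 4) can be sketched as follows; the sample structure and the one-negative-per-positive ratio are illustrative assumptions.

```python
import random

def build_dataset(comments, all_labels, neg_per_pos=1, seed=0):
    """Build the binary classification set by negative sampling: every style
    label found in a comment yields a (label, composition, 1) positive pair;
    for balance, labels absent from the comment are sampled as negatives."""
    rng = random.Random(seed)
    dataset = []
    for composition, gold_labels in comments:
        for label in gold_labels:
            dataset.append((label, composition, 1))
            absent = [l for l in all_labels if l not in gold_labels]
            for neg in rng.sample(absent, k=min(neg_per_pos, len(absent))):
                dataset.append((neg, composition, 0))
    rng.shuffle(dataset)
    return dataset
```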
step 5), deep learning model training and tuning: the ALBERT pre-trained model is fine-tuned with a deep learning training framework, and its performance is verified on a validation set, specifically comprising the following steps:
a. the constructed data set is shuffled and fed sequentially into the ALBERT pre-trained model in small batches;
b. the ALBERT pre-trained model preprocesses the input: it is converted into one-hot vectors and an embedding operation is applied, then position information and segment information are embedded, where the segment id of the label text is 0 and the segment id of the composition text is 1, as sketched below;
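The splicing and segment ids of step b can be reproduced with a standard tokenizer. The checkpoint name below is an assumption (the patent names no specific ALBERT checkpoint), and Chinese ALBERT models are conventionally paired with BertTokenizer.

```python
from transformers import BertTokenizer

# Assumption: 'voidful/albert_chinese_base' is one publicly available
# Chinese ALBERT checkpoint, not one named by the patent.
tokenizer = BertTokenizer.from_pretrained('voidful/albert_chinese_base')

label_text = '意蕴丰富'
composition = '十八岁,正是如花的年纪……'
enc = tokenizer(label_text, composition, truncation=True, max_length=512)
# enc['input_ids']      -> [CLS] label [SEP] composition [SEP]
# enc['token_type_ids'] -> 0 over the label segment, 1 over the composition
```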
c. the preprocessed result is input to the neural network and multiplied by three weight matrices to obtain the matrices Q, K and V; a self-attention module then operates on Q, K and V to obtain the attention score matrix between each character and all other characters, computed as follows:
Z_i = softmax((Q·K^T)/√d_k + M)·V,
where Z_i is the encoded vector, T denotes the matrix transpose, M is the mask matrix, d_k is the hidden-layer vector dimension of a single attention head, and i is a positive integer from 1 to n;
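A PyTorch sketch of this masked scaled dot-product self-attention follows; the patent prescribes no framework, and the shapes and single head are illustrative.

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v, mask):
    """Single-head self-attention: the input is multiplied by three weight
    matrices to give Q, K, V; the mask matrix M (large negative values at
    masked positions) is added to the scores before the softmax."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5 + mask  # per-character attention scores
    return F.softmax(scores, dim=-1) @ v                  # the encoded vectors Z_i

# toy example: 4 tokens, hidden width 8, nothing masked
x = torch.randn(4, 8)
w_q, w_k, w_v = (torch.randn(8, 8) for _ in range(3))
z = self_attention(x, w_q, w_k, w_v, mask=torch.zeros(4, 4))  # shape (4, 8)
```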
d. Multi-Head Attention concatenates (concat) Z_1~Z_n, and the concatenation is passed into a Linear layer to obtain a final output Z with the same dimension as the Multi-Head Attention input matrix X;
e. after the final output Z (of the same dimension as the input matrix X) is obtained, a residual connection (Add) combines the final output Z of the Multi-Head Attention module with X, followed by a Layer Normalization operation that rescales each layer's neuron inputs by their mean and variance toward a standard normal distribution: LayerNorm(X + Z);
f. the Feed-Forward module in the ALBERT pre-trained model processes the result with two fully connected layers so that the output dimension matches the input dimension; residual connection and layer normalization are applied once again, and the output serves as the input of the next cycle, for N cycles in total (a block sketch follows).
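Steps d-f amount to a standard transformer encoder block; a PyTorch sketch is given below. The dimensions are illustrative, and the closing loop mimics ALBERT's reuse of the same block parameters across all N cycles.

```python
import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    """Steps d-f: multi-head attention with a linear layer over the
    concatenated heads, Add & LayerNorm, a two-layer feed-forward module,
    and a second Add & LayerNorm. Dimensions are illustrative."""
    def __init__(self, d_model=768, n_heads=12, d_ff=3072):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                                 nn.Linear(d_ff, d_model))
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        z, _ = self.attn(x, x, x)          # final output Z, same dimension as X
        x = self.norm1(x + z)              # LayerNorm(X + Z)
        return self.norm2(x + self.ffn(x))

block = EncoderBlock()
h = torch.randn(2, 16, 768)                # (batch, sequence, hidden)
for _ in range(12):                        # ALBERT-style: one shared block, N cycles
    h = block(h)
```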
g. The CLS vector of the ALBERT pre-trained model is sent into a linear layer and activated; the loss is computed with a binary cross-entropy loss function, and the model parameters are optimized by back-propagation, the loss being computed as:
loss = -[y_n·log(x_n) + (1 - y_n)·log(1 - x_n)],
where y_n is the true label with value range {0, 1}, and x_n is the model's output probability that the sample is positive, with value range (0, 1);
h. steps a-g are repeated until model training is complete.
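The training step of g and h can be sketched as follows; the checkpoint, optimizer and learning rate are illustrative choices, not prescribed by the patent. BCEWithLogitsLoss folds the sigmoid activation and the binary cross-entropy above into one numerically stable call.

```python
import torch
import torch.nn as nn
from transformers import AlbertModel

# Assumption: the same publicly available Chinese ALBERT checkpoint as above.
backbone = AlbertModel.from_pretrained('voidful/albert_chinese_base')
head = nn.Linear(backbone.config.hidden_size, 1)   # linear layer over the CLS vector
criterion = nn.BCEWithLogitsLoss()                 # sigmoid + binary cross entropy
optimizer = torch.optim.AdamW(
    list(backbone.parameters()) + list(head.parameters()), lr=2e-5)

def train_step(batch):
    out = backbone(input_ids=batch['input_ids'],
                   attention_mask=batch['attention_mask'],
                   token_type_ids=batch['token_type_ids'])
    cls_vec = out.last_hidden_state[:, 0]          # the [CLS] position
    logits = head(cls_vec).squeeze(-1)
    loss = criterion(logits, batch['label'].float())
    optimizer.zero_grad()
    loss.backward()                                # back-propagation
    optimizer.step()
    return loss.item()
```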
The invention and its embodiments have been described above; the description is not limiting, and the drawings show only one embodiment, to which the actual structure is not restricted. In summary, those skilled in the art can design or modify other structures for carrying out the same purposes on the basis of the disclosed conception without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (3)

1. An automatic text style judgment method, characterized by comprising the following steps:
step 1), syntactic analysis is performed on the existing text comment data with the syntactic analysis tool HanLP to obtain a data result with a list structure in python;
step 2), single-datum style label extraction:
a) custom Node objects are defined, and a multi-branch tree is constructed recursively, bottom-up, restoring the parsed data result to the tree structure of a phrase structure tree; the phrase type and word content of each node are recorded in a hash table A;
b) labels are extracted with a screening rule requiring the node type to be VP and the number of words to be 3-5; the data nodes of hash table A are filtered by this rule, and the results satisfying the condition are stored in another hash table B, where VP denotes a verb phrase;
step 3), constructing the full style label set: the operation of step 2) is performed on every piece of data in the database to obtain the final hash table B; the phrases are sorted in descending order of frequency, and the first K phrases are taken as the style labels of the model training data, the model being an ALBERT pre-trained model;
step 4), model training data construction: a binary classification data set is built by negative sampling, keeping the proportion of positive and negative samples balanced; a positive sample is identified with label 1, then a style label that does not appear in the comment is randomly selected and a negative sample is built by the same splicing, identified with label 0;
step 5), deep learning model training and tuning: the ALBERT pre-trained model is fine-tuned with a deep learning training framework, and performance is verified on a validation set.
2. The automatic text style judgment method according to claim 1, characterized in that in step 3) the style labels are checked for duplication with the longest common substring algorithm, and two labels with high textual similarity are not both used.
3. The automatic text style judgment method according to claim 2, characterized in that step 5) specifically comprises the following steps:
a. the constructed data set is shuffled and fed sequentially into the ALBERT pre-trained model in small batches;
b. the ALBERT pre-trained model preprocesses the input: it is converted into one-hot vectors and embedded, and position information and segment information are then added, where the segment id of the label text is 0 and the segment id of the composition text is 1;
c. the preprocessed result is input to the neural network and multiplied by three weight matrices to obtain the matrices Q, K and V; a self-attention module then operates on Q, K and V to obtain the attention score matrix between each character and all other characters, computed as follows:
Z_i = softmax((Q·K^T)/√d_k + M)·V,
where Z_i is the encoded vector, T denotes the matrix transpose, M is the mask matrix, d_k is the hidden-layer vector dimension of a single attention head, and i is a positive integer from 1 to n;
d. multi-head attention concatenates Z_1~Z_n, and the concatenation is passed through a linear layer to obtain a final output Z with the same dimension as the multi-head attention input matrix X;
e. after the final output Z (of the same dimension as the input matrix X) is obtained, the output Z of the multi-head attention module is residually connected with X, and layer normalization is then applied, rescaling each layer's neuron inputs by their mean and variance toward a standard normal distribution: LayerNorm(X + Z);
f. the feed-forward module in the ALBERT pre-trained model processes the result with two fully connected layers so that the output dimension matches the input dimension; residual connection and layer normalization are applied once more, and the output serves as the input of the next of N stacked cycles;
g. the CLS vector of the ALBERT pre-trained model is sent into a linear layer and activated; the loss is computed with a binary cross-entropy loss function, and the model parameters are optimized by back-propagation, the loss being computed as:
loss = -[y_n·log(x_n) + (1 - y_n)·log(1 - x_n)],
where y_n is the true label with value range {0, 1}, and x_n is the model's output probability that the sample is positive, with value range (0, 1);
h. steps a-g are repeated until model training is complete.
CN202210475512.8A 2022-04-29 2022-04-29 Automatic judgment method for text style Active CN114861629B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210475512.8A CN114861629B (en) 2022-04-29 2022-04-29 Automatic judgment method for text style

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210475512.8A CN114861629B (en) 2022-04-29 2022-04-29 Automatic judgment method for text style

Publications (2)

Publication Number Publication Date
CN114861629A 2022-08-05
CN114861629B CN114861629B (en) 2023-04-04

Family

ID=82635015

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210475512.8A Active CN114861629B (en) 2022-04-29 2022-04-29 Automatic judgment method for text style

Country Status (1)

Country Link
CN (1) CN114861629B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118410160A (en) * 2024-07-01 2024-07-30 腾讯科技(深圳)有限公司 Text processing method, device, equipment and medium


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107122416A (en) * 2017-03-31 2017-09-01 北京大学 A kind of Chinese event abstracting method
KR20210037934A (en) * 2019-09-30 2021-04-07 한국과학기술원 Method and system for trust level evaluationon personal data collector with privacy policy analysis
CN112101004A (en) * 2020-09-23 2020-12-18 电子科技大学 General webpage character information extraction method based on conditional random field and syntactic analysis
CN112214599A (en) * 2020-10-20 2021-01-12 电子科技大学 Multi-label text classification method based on statistics and pre-training language model
CN113158674A (en) * 2021-04-01 2021-07-23 华南理工大学 Method for extracting key information of document in field of artificial intelligence

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ALBERTO HOLTS et al.: "Automated Text Binary Classification Using Machine Learning Approach" *
景栋盛 (JING Dongsheng) et al.: "基于深度Q网络的垃圾邮件文本分类方法" (Spam text classification method based on deep Q-network) *


Also Published As

Publication number Publication date
CN114861629B (en) 2023-04-04

Similar Documents

Publication Publication Date Title
CN110298037B (en) Convolutional neural network matching text recognition method based on enhanced attention mechanism
US11631007B2 (en) Method and device for text-enhanced knowledge graph joint representation learning
CN110134757B (en) Event argument role extraction method based on multi-head attention mechanism
CN112069811B (en) Electronic text event extraction method with multi-task interaction enhancement
CN112732916B (en) BERT-based multi-feature fusion fuzzy text classification system
CN111027595B (en) Double-stage semantic word vector generation method
CN109308319B (en) Text classification method, text classification device and computer readable storage medium
CN110580292A (en) Text label generation method and device and computer readable storage medium
CN110222163A (en) A kind of intelligent answer method and system merging CNN and two-way LSTM
CN111274790B (en) Chapter-level event embedding method and device based on syntactic dependency graph
CN111506732B (en) Text multi-level label classification method
CN113516198B (en) Cultural resource text classification method based on memory network and graphic neural network
CN107688576B (en) Construction and tendency classification method of CNN-SVM model
CN109918647A (en) A kind of security fields name entity recognition method and neural network model
CN112487237B (en) Music classification method based on self-adaptive CNN and semi-supervised self-training model
CN116450796A (en) Intelligent question-answering model construction method and device
CN114461804B (en) Text classification method, classifier and system based on key information and dynamic routing
CN114004220A (en) Text emotion reason identification method based on CPC-ANN
CN114564563A (en) End-to-end entity relationship joint extraction method and system based on relationship decomposition
CN114648016A (en) Event argument extraction method based on event element interaction and tag semantic enhancement
CN114861629B (en) Automatic judgment method for text style
CN114841151A (en) Medical text entity relation joint extraction method based on decomposition-recombination strategy
CN115292490A (en) Analysis algorithm for policy interpretation semantics
CN113869054A (en) Deep learning-based electric power field project feature identification method
CN113486143A (en) User portrait generation method based on multi-level text representation and model fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant