CN112749557A - Text processing model construction method and text processing method - Google Patents

Text processing model construction method and text processing method

Info

Publication number
CN112749557A
CN112749557A (application CN202010784088.6A)
Authority
CN
China
Prior art keywords
text
vector
word
target
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010784088.6A
Other languages
Chinese (zh)
Inventor
黄剑辉
梁龙军
刘海波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202010784088.6A
Publication of CN112749557A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/279 - Recognition of textual entities
    • G06F40/289 - Phrasal analysis, e.g. finite state techniques or chunking
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/10 - Text processing
    • G06F40/12 - Use of codes for handling textual entities
    • G06F40/126 - Character encoding
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/205 - Parsing
    • G06F40/216 - Parsing using statistical methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/048 - Activation functions
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a method for constructing a text processing model and to a text processing method. The construction method comprises the following steps: acquiring a text training sample and performing word segmentation on it to obtain a plurality of target word segmentations and a classification result for each one; calling a classification model to encode the text training sample into a text vector and to encode each target word segmentation into a first word vector; processing the first word vector through a threshold function cached in the classification model to obtain a second word vector; and training the classification model through the second word vector, the first word vector, the text vector, and the classification result of the target word segmentation to obtain the text processing model. In this way the interaction between the text and the target word segmentations is strengthened, and gating the text vector through the word vector suppresses the feature dimensions of the vector that are irrelevant to the word, supporting high-precision classification of the words in the text.

Description

Text processing model construction method and text processing method
Technical Field
The present application relates to the field of computer technology, and in particular to a text processing model construction method and apparatus, a text processing method and apparatus, and corresponding computer devices and storage media.
Background
With the development and application of computer technology, text analysis and processing techniques are widely used. In application scenarios such as title understanding and sentence or passage understanding, the weights of different words need to be distinguished and the core words of a text determined. Taking video as an example, a video title is an important component of the video content; completing text parsing of the title with natural language processing technology, and thereby enhancing the understanding of the video's semantic information, is one of the core tasks of an entire video search system.
The word weight task is to understand the semantics of a sentence and assign a corresponding weight value to each word in it, so as to distinguish the primary and secondary components of the sentence. Because this task is significant for interpreting the semantic information of text, it needs to be handled well. The word weight task can be modeled with a classification model, but traditional classification models suffer from low classification accuracy.
Disclosure of Invention
In view of the above, there is a need to provide a text processing model construction method, apparatus, computer device, and storage medium, as well as a text processing method, apparatus, computer device, and storage medium, that can improve classification accuracy.
A method of constructing a text processing model, the method comprising:
acquiring a text training sample, and performing word segmentation processing on the text training sample to acquire a plurality of target word segmentations and a classification result of each target word segmentation;
calling a preset classification model to perform coding processing on the text training sample to obtain a text vector, and performing coding processing on the target word segmentation to obtain a first word vector of the target word segmentation;
calling a threshold function cached in the classification model to process the first word vector to obtain a second word vector;
and training the classification model through the second word vector, the first word vector, the text vector and the classification result of the target word segmentation to obtain a text processing model.
An apparatus for building a text processing model, the apparatus comprising:
the sample data acquisition module is used for acquiring a text training sample, performing word segmentation processing on the text training sample and acquiring a plurality of target word segmentations and classification results of all the target word segmentations;
the coding processing module is used for calling a preset classification model to carry out coding processing on the text training sample to obtain a text vector, and carrying out coding processing on the target word segmentation to obtain a first word vector of the target word segmentation;
the threshold processing module is used for calling a threshold function cached in the classification model to process the first word vector to obtain a second word vector;
and the model determining module is used for training the classification model through the second word vector, the first word vector, the text vector and the classification result of the target word segmentation to obtain a text processing model.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring a text training sample, and performing word segmentation processing on the text training sample to acquire a plurality of target word segmentations and a classification result of each target word segmentation;
calling a first coding unit of a preset classification model to perform coding processing on the text training sample to obtain a text vector, and performing coding processing on the target word segmentation to obtain a first word vector of the target word segmentation;
calling a threshold function unit cached in the classification model to process the first word vector to obtain a second word vector;
and training a classification model through the second word vector, the first word vector, the text vector and the classification result of the target word segmentation to obtain a text processing model.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring a text training sample, and performing word segmentation processing on the text training sample to acquire a plurality of target word segmentations and a classification result of each target word segmentation;
calling a preset classification model to perform coding processing on the text training sample to obtain a text vector, and performing coding processing on the target word segmentation to obtain a first word vector of the target word segmentation;
calling a threshold function cached in the classification model to process the first word vector to obtain a second word vector;
and training a classification model through the second word vector, the first word vector, the text vector and the classification result of the target word segmentation to obtain a text processing model.
According to the above construction method and apparatus of the text processing model, computer device, and storage medium, a text training sample is acquired and segmented to obtain a plurality of target word segmentations and the classification result of each one. The text training sample is then encoded through a preset classification model to obtain a text vector, and each target word segmentation is encoded to obtain its first word vector. The first word vector is processed through the threshold function of the classification model to obtain a second word vector, and the classification model is trained through the second word vector, the first word vector, the text vector, and the classification results to obtain the text processing model. Because the second word vector, obtained by applying the threshold function to the first word vector, is introduced during training, the interaction between the text and the target word segmentations is strengthened; gating the text vector with the word vector suppresses the feature dimensions of the vector that are irrelevant to the word, balancing the dimensional difference between the text vector and the first word vector, so that high-precision classification of the words in a text is supported when text is subsequently processed with the text processing model.
A method of text processing, the method comprising:
acquiring a text to be processed, and performing word segmentation processing on the text to be processed to obtain a plurality of target words;
calling a text processing model to perform coding processing on the text to be processed to obtain a target text vector, and performing coding processing on the target word to obtain a first word vector of the target word, wherein the text processing model is obtained by training a preset classification model through historical text sample data;
calling a threshold function cached in the text processing model to process the first word vector to obtain a second word vector of the target word;
and obtaining a classification result of the target words in the text to be processed based on the second word vector of the target words, the first word vector of the target words and the target text vector.
A text processing apparatus, the apparatus comprising:
the text splitting module is used for acquiring a text to be processed, and performing word segmentation processing on the text to be processed to obtain a plurality of target words;
the vector acquisition module is used for calling a text processing model to encode the text to be processed to obtain a target text vector, and encoding the target word to obtain a first word vector of the target word, wherein the text processing model is obtained by training a preset classification model through historical text sample data;
the word vector processing module is used for processing the first word vector through a threshold function cached in the text processing model to obtain a second word vector of the target word;
and the classification result determining module is used for obtaining a classification result of the target words in the text to be processed based on the second word vector of the target words, the first word vector of the target words and the target text vector.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring a text to be processed, and performing word segmentation processing on the text to be processed to obtain a plurality of target words;
calling a text processing model to perform coding processing on the text to be processed to obtain a target text vector, and performing coding processing on the target word to obtain a first word vector of the target word, wherein the text processing model is obtained by training a preset classification model through historical text sample data;
calling a threshold function cached in the text processing model to process the first word vector to obtain a second word vector of the target word;
and obtaining a classification result of the target words in the text to be processed based on the second word vector of the target words, the first word vector of the target words and the target text vector.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring a text to be processed, and performing word segmentation processing on the text to be processed to obtain a plurality of target words;
calling a text processing model to perform coding processing on the text to be processed to obtain a target text vector, and performing coding processing on the target word to obtain a first word vector of the target word, wherein the text processing model is obtained by training a preset classification model through historical text sample data;
calling a threshold function cached in the text processing model to process the first word vector to obtain a second word vector of the target word;
and obtaining a classification result of the target words in the text to be processed based on the second word vector of the target words, the first word vector of the target words and the target text vector.
According to the above text processing method, apparatus, computer device, and storage medium, the text to be processed is acquired and segmented to obtain a plurality of target words. The text to be processed is encoded through the text processing model to obtain a target text vector, and each target word is encoded to obtain its first word vector; the text processing model is obtained by training a preset classification model on historical text sample data. The threshold function cached in the text processing model is called to process the first word vector, giving the second word vector of the target word, and the classification result of each target word in the text to be processed is obtained based on its second word vector, its first word vector, and the target text vector. Introducing the second word vector, obtained by processing the first word vector through the threshold function of the text processing model, not only strengthens the interaction between the text to be processed and the target words, but also gates the text vector with the word vector to suppress the feature dimensions of the vector that are irrelevant to the word, thereby balancing the dimensional difference between the target text vector and the first word vector of the target word and enabling high-precision classification of the target words in the text to be processed.
Drawings
FIG. 1 is a diagram of an application environment of a method for constructing a text processing model in one embodiment;
FIG. 2 is a flowchart illustrating a method for constructing a text processing model according to an embodiment;
FIG. 3 is a flow diagram illustrating the weighting of entry words in one embodiment;
FIG. 4 is a flowchart illustrating a method of processing text in one embodiment;
FIG. 5 is a diagram illustrating text processing with a threshold function introduced in one embodiment;
FIG. 6 is a block diagram showing an example of the structure of a device for constructing a text processing model;
FIG. 7 is a block diagram showing a configuration of a text processing apparatus according to an embodiment;
FIG. 8 is a block diagram of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
To help those skilled in the art better understand the text processing model construction method and text processing method provided in the embodiments of the present application, the relevant content of natural language processing is described first.
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies various theories and methods that enable effective communication between humans and computers in natural language. Natural language processing is a science that integrates linguistics, computer science, and mathematics. Research in this field therefore involves natural language, i.e., the language people use every day, so it is closely related to the study of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, knowledge graphs, and the like.
The construction method of the text processing model belongs to the semantic understanding category in natural language processing, and can be applied to the application environment shown in fig. 1. The application scenario includes a mobile terminal 102 and a server 104, and the mobile terminal 102 is connected to the server 104 through a network. A user can upload a text training sample to the server 104 through the mobile terminal 102, the server 104 obtains the text training sample, performs word segmentation processing on the text training sample, and obtains a plurality of target word segmentations and classification results of the target word segmentations; calling a preset classification model to encode the text training sample to obtain a text vector, and encoding the target word segmentation to obtain a first word vector of the target word segmentation; calling a threshold function cached in the classification model to process the first word vector to obtain a second word vector; and training the classification model through the second word vector, the first word vector, the text vector and the classification result of the target word segmentation to obtain a text processing model. The mobile terminal 102 may be a mobile phone, a tablet computer, a notebook, a desktop computer, and the like, and the server 104 may be implemented by an independent server or a server cluster formed by a plurality of servers. The application scenario is exemplified by applying the method for constructing the text processing model to a system comprising a mobile terminal and a server, and is realized by interaction between the mobile terminal and the server.
In one embodiment, as shown in FIG. 2, a method of constructing a text processing model is provided. In this embodiment, the method is described by taking the server in fig. 1 as an example, and referring to fig. 2, the method specifically includes the following steps:
step 202, obtaining a text training sample, performing word segmentation processing on the text training sample, and obtaining a plurality of target word segmentations and a classification result of each target word segmentation.
The text training sample is text data used for model training. Word segmentation of the text training sample can be implemented by a word segmentation method; specific methods include rule-based segmentation and statistics-based segmentation, such as the forward maximum matching method, the reverse maximum matching method, and the bidirectional maximum matching method. As shown in fig. 3, take "this Luban is beyond saving, the economy is suppressed, go play on the mobile phone" as one text in the text training sample; after word segmentation, the obtained target word segmentations include "this", "Luban", "save", "economy", "suppress", "no-go", "mobile phone", and so on. The classification result of a target word segmentation indicates whether it is a core word of the text; for example, the result may be marked with 0 and 1, a core word labeled 1 and a non-core word labeled 0. Specifically, the encoding vector of a word may be feature-fused with the corresponding text encoding vector, the fused vector mapped through a fully connected layer, and the probability that the word is a core word then obtained through an activation function. For example, in fig. 3, "this", "Luban", "save", "economy", "suppress", "no-go", and "mobile phone" correspond to probabilities of 0.1, 0.91, 0.2, 0.81, 0.7, 0.2, and 0.3, respectively; if a probability greater than 0.6 marks a core word, the classification results of these target word segmentations are 0, 1, 0, 1, 1, 0, and 0 in that order.
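As a concrete illustration of this labeling step, the following minimal Python sketch reproduces the fig. 3 example; the token list, the per-word probabilities, and the 0.6 threshold come from the paragraph above, while the variable names are illustrative.

```python
# Fig. 3 example: target word segmentations of one training text together
# with the per-word core-word probabilities produced by the activation function.
target_words = ["this", "Luban", "save", "economy", "suppress", "no-go", "mobile phone"]
core_word_probs = [0.1, 0.91, 0.2, 0.81, 0.7, 0.2, 0.3]

THRESHOLD = 0.6  # probability above which a word is judged to be a core word
labels = [1 if p > THRESHOLD else 0 for p in core_word_probs]
print(list(zip(target_words, labels)))
# [('this', 0), ('Luban', 1), ('save', 0), ('economy', 1), ('suppress', 1),
#  ('no-go', 0), ('mobile phone', 0)]
```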
And 204, calling a preset classification model to encode the text training sample to obtain a text vector, and encoding the target word segmentation to obtain a first word vector of the target word segmentation.
The text may be encoded by the first encoding unit; specifically, BERT (Bidirectional Encoder Representations from Transformers) encoding, neural network model encoding, or the like may be used. The sentence text is encoded by the first encoding unit to obtain the text vector corresponding to the sentence.
The words may be encoded by the second encoding unit; specifically, a DNN (Deep Neural Network), one-hot encoding, or the like may be used. Each word is encoded by the second encoding unit to obtain the word vector corresponding to that word.
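A possible sketch of the two encoding units in PyTorch follows; it assumes a pre-trained Chinese BERT from the Hugging Face transformers library for the first unit and a small embedding-plus-MLP DNN for the second. The model name bert-base-chinese and the vocabulary handling are illustrative choices, and the 768/50 dimensions match the example given later in this description.

```python
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
bert = BertModel.from_pretrained("bert-base-chinese")

def encode_text(sentence: str) -> torch.Tensor:
    """First encoding unit: sentence -> 768-dim text vector (the [CLS] state)."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = bert(**inputs)
    return outputs.last_hidden_state[:, 0, :]  # shape (1, 768)

class WordEncoder(nn.Module):
    """Second encoding unit: word id -> 50-dim first word vector (a small DNN)."""
    def __init__(self, vocab_size: int, word_dim: int = 50):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, word_dim)
        self.mlp = nn.Sequential(nn.Linear(word_dim, word_dim), nn.ReLU())

    def forward(self, word_ids: torch.Tensor) -> torch.Tensor:
        return self.mlp(self.embed(word_ids))  # shape (batch, 50)
```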
Step 206, calling a threshold function cached in the classification model to process the first word vector, and obtaining a second word vector.
Each component of the threshold vector output by the threshold function lies in the range [0, 1]. The threshold vector controls the state of the corresponding neurons in the classification model: the closer a component is to 0, the more strongly the corresponding neuron is suppressed and its influence weakened. In the classification model, the first word vector of a given word is used as the input of the threshold function unit, indicating that the relevance between the sentence text and the word is determined by the word. Specifically, the threshold function may be expressed as σ(wᵀx + b), where x is the first word vector obtained by encoding the word, σ is an activation function (specifically a sigmoid activation function, though a tanh or other activation function may also be used), w is a vector mapping matrix, and b is a bias constant. The first word vector is mapped through the activation function to obtain a second word vector with the same dimension as the text vector.
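Under the same 768/50 dimensions as above, the threshold function unit might be sketched as follows; the linear layer holds the mapping matrix w and the bias b of σ(wᵀx + b), and sigmoid is used for σ.

```python
import torch
import torch.nn as nn

class ThresholdFunction(nn.Module):
    """Maps a first word vector x to a gate vector sigma(w^T x + b) in [0, 1]."""
    def __init__(self, word_dim: int = 50, text_dim: int = 768):
        super().__init__()
        self.linear = nn.Linear(word_dim, text_dim)  # holds w (mapping matrix) and b (bias)

    def forward(self, first_word_vec: torch.Tensor) -> torch.Tensor:
        # Sigmoid keeps every component in [0, 1]: components near 0 suppress
        # the matching text-vector dimensions, components near 1 pass them through.
        return torch.sigmoid(self.linear(first_word_vec))  # the second word vector
```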
And 208, training the classification model through the second word vector, the first word vector, the text vector and the classification result of the target word segmentation to obtain a text processing model.
From the second word vector and first word vector of a word together with the text vector of the sentence, the binary classification result of the word can be obtained, i.e., whether the word is a core word or a non-core word of the sentence; the classification model is then trained by adjusting its parameters according to the predicted classification result of the word and its true classification result, finally yielding the text processing model. Specifically, training the classification model through the second word vector, the first word vector, the text vector, and the classification result of the target word segmentation to obtain a text processing model includes: obtaining a classification prediction result of the target word segmentation based on the second word vector, the first word vector, and the text vector; and when the classification prediction result is inconsistent with the classification result, adjusting the parameters of the classification model until the classification prediction result is consistent with the classification result, thereby obtaining the text processing model. During model training, when the classification prediction result is inconsistent with the true classification result, the parameters of the first encoding unit, the second encoding unit, and the threshold function in the classification model can be adjusted until they agree; the classification model obtained at that point is the text processing model.
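A minimal training-loop sketch of this step is given below. It assumes a model that wires the encoding and threshold units together and returns the core-word probability for a (text, word) pair (one such head, GatedClassifier, is sketched after the next paragraph); the BCE loss matches the binary classification result, while the optimizer, learning rate, epoch count, and train_loader are assumptions.

```python
import torch

model = GatedClassifier()                       # sketched after the next paragraph
criterion = torch.nn.BCELoss()                  # core word (1) vs. non-core word (0)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # hyperparameters assumed

for epoch in range(3):                          # epoch count is illustrative
    for text_vec, first_word_vec, label in train_loader:   # train_loader assumed
        prob = model(text_vec, first_word_vec)  # classification prediction result
        loss = criterion(prob, label.float())   # compare with the true classification
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```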
In one embodiment, obtaining the classification prediction result of the target word segmentation based on the second word vector, the first word vector, and the text vector includes: performing a vector operation on the second word vector and the text vector to obtain a converted text vector; and performing feature fusion processing on the converted text vector and the first word vector to obtain the classification prediction result of the target word segmentation. The vector operation may be performed as a point multiplication of the second word vector and the text vector, which yields the converted text vector. The feature fusion processing includes: fusing the converted text vector with the first word vector to obtain a fused vector; and determining probability distribution data of the target word segmentation based on the fused vector, then obtaining the classification prediction result of the target word segmentation according to the probability distribution data. The fusion processing may adopt concatenation, or other fusion modes such as point multiplication or introducing a tensor. For example, if the converted text vector has 768 dimensions and the first word vector has 50 dimensions, the concatenated vector has 818 dimensions.
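Putting this paragraph together, one possible classification head is sketched below: the second word vector gates the text vector by point multiplication, the converted 768-dimensional text vector is concatenated with the 50-dimensional first word vector into an 818-dimensional vector, and a fully connected layer plus a sigmoid yields the core-word probability. Class and variable names are illustrative, not the patent's own.

```python
import torch
import torch.nn as nn

class GatedClassifier(nn.Module):
    def __init__(self, text_dim: int = 768, word_dim: int = 50):
        super().__init__()
        self.gate = nn.Linear(word_dim, text_dim)    # threshold function unit
        self.fc = nn.Linear(text_dim + word_dim, 1)  # 818 -> 1 mapping

    def forward(self, text_vec: torch.Tensor, first_word_vec: torch.Tensor) -> torch.Tensor:
        second_word_vec = torch.sigmoid(self.gate(first_word_vec))
        converted_text_vec = second_word_vec * text_vec  # point multiplication (gating)
        fused = torch.cat([converted_text_vec, first_word_vec], dim=-1)  # 818-dim
        return torch.sigmoid(self.fc(fused)).squeeze(-1)  # core-word probability
```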
In the above construction method of the text processing model, a text training sample is acquired and segmented to obtain a plurality of target word segmentations and the classification result of each one. The text training sample is then encoded through a preset classification model to obtain a text vector, and each target word segmentation is encoded to obtain its first word vector. The first word vector is processed through the threshold function of the classification model to obtain a second word vector, and the classification model is trained through the second word vector, the first word vector, the text vector, and the classification results to obtain the text processing model. Because the second word vector, obtained by applying the threshold function to the first word vector, is introduced during training, the interaction between the text and the target word segmentations is strengthened; gating the text vector with the word vector suppresses the feature dimensions of the vector that are irrelevant to the word, balancing the dimensional difference between the text vector and the first word vector, so that high-precision classification of words in a text is supported when text is subsequently processed with the text processing model.
As shown in fig. 4, in an embodiment, a text processing method is provided, which specifically includes the following steps:
step 402, obtaining a text to be processed, and performing word segmentation processing on the text to be processed to obtain a plurality of target words.
And step 404, calling a first coding unit of the text processing model to code the text to be processed to obtain a target text vector, and coding the target word to obtain a first word vector of the target word, wherein the text processing model is obtained by training a preset classification model through historical text sample data.
Step 406, a threshold function cached in the text processing model is called to process the first word vector, and a second word vector of the target word is obtained.
And step 408, obtaining a classification result of the target words in the text to be processed based on the second word vector of the target words, the first word vector of the target words and the target text vector.
According to the above text processing method, the text to be processed is acquired and segmented to obtain a plurality of target words. The text to be processed is encoded through the text processing model to obtain a target text vector, and each target word is encoded to obtain its first word vector; the text processing model is obtained by training a preset classification model on historical text sample data. The first word vector is processed through the threshold function of the text processing model to obtain the second word vector of the target word, and the classification result of each target word in the text to be processed is obtained based on its second word vector, its first word vector, and the target text vector. Introducing the second word vector not only strengthens the interaction between the text to be processed and the target words, but also gates the text vector with the word vector to suppress the feature dimensions of the vector that are irrelevant to the word, thereby balancing the dimensional difference between the target text vector and the first word vector of the target word and enabling high-precision classification of the target words in the text to be processed.
The application also provides an application scenario that applies the above text processing model construction method and text processing method. Specifically, they are applied in this scenario as follows. First, model training is performed by the text processing model construction method to obtain a text processing model that realizes binary classification of each word in a text. The construction method comprises: acquiring a text training sample and performing word segmentation on it to obtain a plurality of target word segmentations and the classification result of each one; encoding the text training sample through a classification model to obtain a text vector, and encoding each target word segmentation to obtain its first word vector; processing the first word vector through the threshold function of the classification model to obtain a second word vector; and training the classification model through the second word vector, the first word vector, the text vector, and the classification results to obtain the text processing model. Then the text processing model is applied to binary-classify the words in a text to be processed. The text processing method comprises: acquiring a text to be processed and segmenting it to obtain a plurality of target words; encoding the text to be processed through the text processing model to obtain a target text vector, and encoding each target word to obtain its first word vector; processing the first word vector through the threshold function of the text processing model to obtain the second word vector of the target word; and obtaining the classification result of each target word in the text to be processed based on its second word vector, its first word vector, and the target text vector.
Referring to fig. 5, a BERT encoder is used to encode the sentence, and a DNN encoder is used to encode the current word. The encoding vector of the current word serves as the input of the threshold function, and the output vector of the threshold function is point-multiplied with the sentence encoding vector to obtain the final sentence encoding vector. The final sentence encoding vector is then concatenated with the encoding vector of the current word (giving, for example, an 818-dimensional vector), mapped to one dimension through a fully connected layer, and passed through a sigmoid activation function to obtain the probability that the current word is a core word, from which the model's classification result for the current word is determined.
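An illustrative inference pass matching fig. 5 follows, reusing the encode_text, WordEncoder, GatedClassifier, and 0.6-threshold sketches above; the vocabulary size and word id are assumptions.

```python
import torch

word_encoder = WordEncoder(vocab_size=50000)  # vocabulary size assumed
classifier = GatedClassifier()

text_vec = encode_text("video title text to analyse")  # (1, 768) via BERT
first_word_vec = word_encoder(torch.tensor([42]))       # word id 42 is illustrative
prob = classifier(text_vec, first_word_vec)             # probability of core word
print(f"core word: {bool(prob.item() > 0.6)}")
```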
It should be understood that although the steps in the flowcharts of figs. 2-5 are shown in sequence as indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they may be performed in other orders. Moreover, at least some of the steps in figs. 2-5 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments, and whose execution order is not necessarily sequential; they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
Based on the same idea as the above method, fig. 6 is a schematic structural diagram of a text processing model construction apparatus according to an embodiment, which is described by taking the server 104 as an example.
As shown in fig. 6, the apparatus for constructing a text processing model in this embodiment includes: the sample data obtaining module 602 is configured to obtain a text training sample, perform word segmentation processing on the text training sample, and obtain a plurality of target word segmentations and a classification result of each target word segmentation. The encoding processing module 604 is configured to invoke a preset classification model to perform encoding processing on the text training sample to obtain a text vector, and perform encoding processing on the target word segmentation to obtain a first word vector of the target word segmentation. A threshold processing module 606, configured to invoke a threshold function cached in the classification model to process the first word vector, so as to obtain a second word vector. The model determining module 608 is configured to train the classification model according to the second word vector, the first word vector, the text vector, and the classification result of the target word segmentation, so as to obtain a text processing model.
In one embodiment, the model determining module is further configured to obtain a classification prediction result of the target word segmentation based on the second word vector, the first word vector and the text vector; and when the classification prediction result is inconsistent with the classification result, adjusting the parameters of the classification model until the classification prediction result is consistent with the classification result, and obtaining the text processing model.
In one embodiment, the model determining module is further configured to perform a vector operation based on the second word vector and the text vector to obtain a transformed text vector; and performing feature fusion processing on the basis of the converted text vector and the first word vector to obtain a classification prediction result of the target word segmentation.
In one embodiment, the model determining module is further configured to perform fusion processing on the transformed text vector and the first word vector to obtain a vector after the fusion processing; and determining probability distribution data of the target word segmentation based on the vector subjected to the fusion processing, and obtaining a classification prediction result of the target word segmentation according to the probability distribution data.
In one embodiment, the model determining module is further configured to perform vector calculation on the second word vector and the text vector in a point-by-point manner to obtain a transformed text vector.
Fig. 7 is a schematic structural diagram of a text processing apparatus according to an embodiment, the text processing apparatus including: the text splitting module 702 is configured to obtain a text to be processed, perform word segmentation processing on the text to be processed, and obtain a plurality of target words. The vector obtaining module 704 is configured to invoke a text processing model to perform coding processing on the text to be processed to obtain a target text vector, perform coding processing on the target word to obtain a first word vector of the target word, where the text processing model is obtained by training a preset classification model through historical text sample data. And a word vector processing module 706, configured to invoke a threshold function of the text processing model to process the first word vector, so as to obtain a second word vector of the target word. The classification result determining module 708 is configured to obtain a classification result of the target word in the text to be processed based on the second word vector of the target word, the first word vector of the target word, and the target text vector.
For the specific limitations of the text processing model construction apparatus/text processing apparatus, reference may be made to the limitations of the corresponding methods above, which are not repeated here. Each module in the above apparatuses may be implemented wholly or partially by software, hardware, or a combination thereof. The modules may be embedded in or independent of a processor in the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a terminal whose internal structure may be as shown in fig. 8. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory; the non-volatile storage medium stores an operating system and a computer program, and the internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The communication interface of the computer device is used for wired or wireless communication with an external terminal; the wireless communication may be realized through WIFI, an operator network, NFC (near field communication), or other technologies. The computer program, when executed by the processor, implements the text processing model construction method/text processing method. The display screen of the computer device may be a liquid crystal display or an electronic ink display, and the input device may be a touch layer covering the display screen, a key, trackball, or touchpad on the housing of the computer device, or an external keyboard, touchpad, or mouse.
Those skilled in the art will appreciate that the architecture shown in fig. 8 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer devices to which the solution applies; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, in which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
In one embodiment, a computer program product or computer program is provided that includes computer instructions stored in a computer-readable storage medium. The computer instructions are read by a processor of a computer device from a computer-readable storage medium, and the computer instructions are executed by the processor to cause the computer device to perform the steps in the above-mentioned method embodiments.
Those skilled in the art will understand that all or part of the processes of the methods in the above embodiments can be implemented by instructing relevant hardware through a computer program; the program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the above method embodiments. Any reference to memory, storage, a database, or other media used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, and the like. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not every possible combination of these technical features is described; however, as long as a combination contains no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and while their description is specific and detailed, they should not be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make several variations and modifications without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A method for constructing a text processing model, the method comprising:
acquiring a text training sample, and performing word segmentation processing on the text training sample to acquire a plurality of target word segmentations and a classification result of each target word segmentation;
calling a preset classification model to perform coding processing on the text training sample to obtain a text vector, and performing coding processing on the target word segmentation to obtain a first word vector of the target word segmentation;
calling a threshold function cached in the classification model to process the first word vector to obtain a second word vector;
and training the classification model through the second word vector, the first word vector, the text vector and the classification result of the target word segmentation to obtain a text processing model.
2. The method of claim 1, wherein training the classification model according to the second word vector, the first word vector, the text vector, and the classification result of the target word segmentation to obtain a text processing model comprises:
obtaining a classification prediction result of the target word segmentation based on the second word vector, the first word vector and the text vector;
and when the classification prediction result is inconsistent with the classification result, adjusting parameters of a classification model until the classification prediction result is consistent with the classification result, and obtaining a text processing model.
3. The method of claim 2, wherein obtaining the classification prediction result for the target participle based on the second word vector, the first word vector, and the text vector comprises:
performing vector operation based on the second word vector and the text vector to obtain a converted text vector;
and performing feature fusion processing on the basis of the converted text vector and the first word vector to obtain a classification prediction result of the target word segmentation.
4. The method according to claim 3, wherein the performing a feature fusion process based on the transformed text vector and the first word vector to obtain a classification prediction result of the target word segmentation comprises:
performing fusion processing on the converted text vector and the first word vector to obtain a vector subjected to fusion processing;
and determining probability distribution data of the target word segmentation based on the vector subjected to the fusion processing, and obtaining a classification prediction result of the target word segmentation according to the probability distribution data.
5. The method of claim 3, wherein performing a vector operation based on the second word vector and the text vector to obtain a transformed text vector comprises:
and performing vector calculation on the second word vector and the text vector in a point multiplication mode to obtain a converted text vector.
6. A method of text processing, the method comprising:
acquiring a text to be processed, and performing word segmentation processing on the text to be processed to obtain a plurality of target words;
calling a text processing model to perform coding processing on the text to be processed to obtain a target text vector, and performing coding processing on the target word to obtain a first word vector of the target word, wherein the text processing model is obtained by training a preset classification model through historical text sample data;
calling a threshold function cached in the text processing model to process the first word vector to obtain a second word vector of the target word;
and obtaining a classification result of the target words in the text to be processed based on the second word vector of the target words, the first word vector of the target words and the target text vector.
7. An apparatus for constructing a text processing model, the apparatus comprising:
the sample data acquisition module is used for acquiring a text training sample, performing word segmentation processing on the text training sample and acquiring a plurality of target word segmentations and classification results of all the target word segmentations;
the coding processing module is used for calling a preset classification model to carry out coding processing on the text training sample to obtain a text vector, and carrying out coding processing on the target word segmentation to obtain a first word vector of the target word segmentation;
the threshold processing module is used for calling a threshold function cached in the classification model to process the first word vector to obtain a second word vector;
and the model determining module is used for training a classification model through the second word vector, the first word vector, the text vector and the classification result of the target word segmentation to obtain a text processing model.
8. A text processing apparatus, characterized in that the apparatus comprises:
the text splitting module is used for acquiring a text to be processed, and performing word segmentation processing on the text to be processed to obtain a plurality of target words;
the vector acquisition module is used for calling a text processing model to encode the text to be processed to obtain a target text vector, and encoding the target word to obtain a first word vector of the target word, wherein the text processing model is obtained by training a preset classification model through historical text sample data;
the word vector processing module is used for calling a threshold function cached in the text processing model to process the first word vector to obtain a second word vector of the target word;
and the classification result determining module is used for obtaining a classification result of the target words in the text to be processed based on the second word vector of the target words, the first word vector of the target words and the target text vector.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 6 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
CN202010784088.6A 2020-08-06 2020-08-06 Text processing model construction method and text processing method Pending CN112749557A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010784088.6A CN112749557A (en) 2020-08-06 2020-08-06 Text processing model construction method and text processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010784088.6A CN112749557A (en) 2020-08-06 2020-08-06 Text processing model construction method and text processing method

Publications (1)

Publication Number Publication Date
CN112749557A true CN112749557A (en) 2021-05-04

Family

ID=75645751

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010784088.6A Pending CN112749557A (en) 2020-08-06 2020-08-06 Text processing model construction method and text processing method

Country Status (1)

Country Link
CN (1) CN112749557A (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107491541A (en) * 2017-08-24 2017-12-19 北京丁牛科技有限公司 File classification method and device
CN108920466A (en) * 2018-07-27 2018-11-30 杭州电子科技大学 A kind of scientific text keyword extracting method based on word2vec and TextRank
CN110895559A (en) * 2018-09-12 2020-03-20 阿里巴巴集团控股有限公司 Model training method, text processing method, device and equipment
CN109388712A (en) * 2018-09-21 2019-02-26 平安科技(深圳)有限公司 A kind of trade classification method and terminal device based on machine learning
CN109960804A (en) * 2019-03-21 2019-07-02 江西风向标教育科技有限公司 A kind of topic text sentence vector generation method and device
CN110297888A (en) * 2019-06-27 2019-10-01 四川长虹电器股份有限公司 A kind of domain classification method based on prefix trees and Recognition with Recurrent Neural Network
CN110717039A (en) * 2019-09-17 2020-01-21 平安科技(深圳)有限公司 Text classification method and device, electronic equipment and computer-readable storage medium
CN110704576A (en) * 2019-09-30 2020-01-17 北京邮电大学 Text-based entity relationship extraction method and device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113204961A (en) * 2021-05-31 2021-08-03 平安科技(深圳)有限公司 Language model construction method, device, equipment and medium for NLP task
CN113204961B (en) * 2021-05-31 2023-12-19 平安科技(深圳)有限公司 Language model construction method, device, equipment and medium for NLP task
CN113988456A (en) * 2021-11-10 2022-01-28 中国工商银行股份有限公司 Emotion classification model training method, emotion prediction method and emotion prediction device

Similar Documents

Publication Publication Date Title
CN111368993B (en) Data processing method and related equipment
CN109902301B (en) Deep neural network-based relationship reasoning method, device and equipment
CN112257858A (en) Model compression method and device
CN112131883B (en) Language model training method, device, computer equipment and storage medium
CN111783903B (en) Text processing method, text model processing method and device and computer equipment
CN113688631A (en) Nested named entity recognition method, system, computer and storage medium
CN112749557A (en) Text processing model construction method and text processing method
CN115994317A (en) Incomplete multi-view multi-label classification method and system based on depth contrast learning
CN116821339A (en) Misuse language detection method, device and storage medium
CN114792097B (en) Method and device for determining prompt vector of pre-training model and electronic equipment
CN116957006A (en) Training method, device, equipment, medium and program product of prediction model
CN113947185B (en) Task processing network generation method, task processing device, electronic equipment and storage medium
CN115952266A (en) Question generation method and device, computer equipment and storage medium
CN115050371A (en) Speech recognition method, speech recognition device, computer equipment and storage medium
CN110852066A (en) Multi-language entity relation extraction method and system based on confrontation training mechanism
CN112818658B (en) Training method, classifying method, device and storage medium for text classification model
CN114648021A (en) Question-answering model training method, question-answering method and device, equipment and storage medium
CN115840817A (en) Information clustering processing method and device based on contrast learning and computer equipment
CN112818688A (en) Text processing method, device, equipment and storage medium
CN114492661B (en) Text data classification method and device, computer equipment and storage medium
CN113033212B (en) Text data processing method and device
CN117521674B (en) Method, device, computer equipment and storage medium for generating countermeasure information
Li et al. [Retracted] Research on Oral English Dialogue Understanding Based on Deep Learning
US20240137042A1 (en) Coding apparatuses, and data processing methods and apparatueses
CN113836940A (en) Knowledge fusion method and device in electric power metering field and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code (ref country code: HK; ref legal event code: DE; ref document number: 40048365)
SE01 Entry into force of request for substantive examination